To protect the user experience of production apps, we usually monitor their crash rate in real time. When a spike is detected we can investigate the cause immediately, but all of this assumes that crash logs are reported reliably.

There are two difficulties in reporting crash logs:

  • Code that runs before the crash handler is installed must be absolutely stable

    If the log collector crashes before it has started successfully, no logs can be collected. There is no real trick here other than strictly limiting the code that is allowed to run before the handler is installed.

  • Reporting when the app is stuck in a crash loop

    Reporting a crash log means sending a network request. What happens if the app crashes again before the request succeeds? The user may even be stuck in an endless crash loop.

This article describes how to report crash logs reliably in the second case.

First of all, we need a reliable way to determine, when the app starts, whether the previous run crashed during launch. Here is one feasible approach.

How to detect consecutive launch crashes

A run of consecutive launch crashes has two elements: the launch crash itself (the app quits right after starting, sometimes called a flashback) and the fact that it repeats. Only when both are present does it interfere with our log upload. A launch crash can be defined simply as

App crash time - app startup time <= 5s (or other threshold)

Consecutive means at least two occurrences in a row. Two is generally enough: in many cases users give up after the app crashes on launch twice in a row.

We can reconstruct the app's lifecycle around a crash by recording timestamps at a few special points.

  • App launch timestamp, defined as launchTs

    Each time the app starts, it records the current time and appends it to the timestamp array.

  • App crash timestamp, defined as crashTs

    Each time the app starts, the timestamp of the most recent crash report is obtained from the crash collection library and appended to the timestamp array.

  • Normal app exit timestamp, defined as terminateTs

    When the app receives UIApplicationWillTerminateNotification, it records the current timestamp and appends it to the timestamp array. Note that there are many other kinds of app exit whose timestamps cannot be recorded accurately.

terminateTs is recorded to rule out one special case: the user manually killing the app right after starting it. If the three timestamps above are recorded correctly, we get a timeline of the app's crash-related behavior, for example:

launchTs => crashTs => launchTs => terminateTs

or

launchTs => launchTs => launchTs

or

launchTs => crashTs => launchTs => crashTs => launchTs

Consider the behavior behind each of the three timelines above. The third one clearly looks like two crashes in a row. We only need to add the interval check to tell whether these are two consecutive launch crashes. Note that if a terminateTs sits between two crashTs entries, they do not count as consecutive launch crashes. The detection code is fairly simple, so I will not paste it here.
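The author's detection code is not shown. Purely as an illustration, the check described above might be sketched in plain C (which also compiles inside an Objective-C file). The names TimelineEvent, count_trailing_launch_crashes and has_consecutive_launch_crashes are hypothetical, and the sketch assumes the timeline is stored oldest-first with the launch currently in progress as its last entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds mirroring launchTs, crashTs and terminateTs. */
typedef enum { EVENT_LAUNCH, EVENT_CRASH, EVENT_TERMINATE } EventKind;

typedef struct {
    EventKind kind;
    double    ts;   /* seconds since some fixed epoch */
} TimelineEvent;

#define LAUNCH_CRASH_THRESHOLD_S 5.0  /* crash time - launch time <= 5 s */
#define REQUIRED_CONSECUTIVE     2    /* two in a row is generally enough */

/* Counts the launch crashes that sit back to back at the end of the timeline.
   Assumption: events are ordered oldest first. */
size_t count_trailing_launch_crashes(const TimelineEvent *events, size_t count) {
    assert(events != NULL || count == 0);
    size_t end = count;

    /* The newest entry is normally the launch in progress; skip it. */
    if (end > 0 && events[end - 1].kind == EVENT_LAUNCH) {
        end--;
    }

    size_t crashes = 0;
    while (end >= 2) {
        const TimelineEvent *crash  = &events[end - 1];
        const TimelineEvent *launch = &events[end - 2];

        /* Anything other than launch => crash (for example a terminateTs)
           breaks the run. */
        if (crash->kind != EVENT_CRASH || launch->kind != EVENT_LAUNCH) {
            break;
        }
        double alive = crash->ts - launch->ts;
        if (alive < 0.0 || alive > LAUNCH_CRASH_THRESHOLD_S) {
            break;  /* crashed too long after launch to count as a launch crash */
        }
        crashes++;
        end -= 2;
    }
    return crashes;
}

bool has_consecutive_launch_crashes(const TimelineEvent *events, size_t count) {
    return count_trailing_launch_crashes(events, count) >= REQUIRED_CONSECUTIVE;
}
```

Applied to the third timeline above, with each crash within 5 s of its launch, has_consecutive_launch_crashes returns true; for the first two timelines it returns false.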

This timeline only records crash-related app launches and exits. Many other special events are not recorded: the app running out of memory in the foreground (FOOM), the main thread hanging in the foreground until the watchdog kills the app, the app being killed during an iOS system upgrade, the app being killed while it is updated from the App Store, and so on. None of these affect our detection of consecutive launch crashes, so we can ignore them.

Note that the timeline is read from disk during startup, and this disk I/O adds to the app's startup time. One optimization is to drop old timestamps on every write, for example keeping only the last 5 timestamps. Another is to skip the whole detection process when no crash log is found.
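The "keep only the last 5 timestamps" optimization can be sketched as a small fixed-capacity log. This is an illustrative C sketch only: TimestampLog, timestamp_log_append and MAX_TIMESTAMPS are hypothetical names, and a real timeline would also store the kind of each event and persist the array to disk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TIMESTAMPS 5  /* assumption: keep only the 5 most recent entries */

typedef struct {
    double entries[MAX_TIMESTAMPS];  /* oldest first */
    size_t count;                    /* number of valid entries */
} TimestampLog;

/* Appends ts, dropping the oldest entry first when the log is already full. */
void timestamp_log_append(TimestampLog *log, double ts) {
    assert(log != NULL && log->count <= MAX_TIMESTAMPS);
    if (log->count == MAX_TIMESTAMPS) {
        /* Shift entries 1..4 down to 0..3, discarding the oldest one. */
        memmove(&log->entries[0], &log->entries[1],
                (MAX_TIMESTAMPS - 1) * sizeof log->entries[0]);
        log->count--;
    }
    log->entries[log->count++] = ts;
}
```

Because the log never grows beyond a handful of entries, the amount of data read and written at startup stays constant no matter how long the app has been installed.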

Next, assuming consecutive launch crashes have been detected, let's look at how to still get the logs uploaded.

Wait synchronously for the crash log upload

The most straightforward way is to wait until the log has been uploaded successfully before letting the rest of the app code run.

Should we make the network request synchronous? That would block the UI thread, and on a poor network the app would be killed by the system watchdog, which is clearly not acceptable.

We can keep the network request asynchronous but temporarily suspend the normal code flow of the UI thread, letting the whole app wait inside the UI thread's run loop. Once the network request succeeds, we return to the original code flow of the UI thread.

The implementation looks simple, but a few details need attention. First, a new piece of UI is needed: once we start waiting in the run loop, a loading screen should be shown asking the user to wait patiently. Second, the wait should not be too long. I personally suggest no more than 5 s; once it exceeds 5 s, the app's original code flow should resume whether or not the crash log upload has succeeded. Uploads that fail to finish within 5 s should be relatively rare, unless the log file is very large.

The drawbacks of this approach are obvious: (1) it is a major change (modifying the original code flow), (2) it requires new UI interactions, and (3) it increases the user wait time.

Let’s look at another trick.

Upload crash logs from a background process

In fact, the ideal approach is to hand the upload request to a different process, so that even if the app crashes again, the code running in the other process is unaffected.

The problem is that iOS apps are in a sandbox environment and the system does not allow code to fork a new process.

Fortunately, NSURLSession offers a background session feature (the backgroundSessionConfigurationWithIdentifier: API used below is available from iOS 8). This feature lets NSURLSession execute network requests in a separate process. Personally, I suspect it was originally designed to improve background downloading of audio and video resources. When I actually tested it, I found that a network request can be handed to another process whether it is a download or an upload. The code is also very simple; for example, I wrote the following test code:

// Background configuration: the transfer is carried out in a separate system process.
NSURLSessionConfiguration *config = [NSURLSessionConfiguration backgroundSessionConfigurationWithIdentifier:@"com.mrpeak.background.crashupload"];
NSURLSession *session = [NSURLSession sessionWithConfiguration:config delegate:self delegateQueue:[NSOperationQueue new]];
NSURL *url = [NSURL URLWithString:@"https://images.unsplash.com/photo-1515816949419-7caf0a210607?ixlib=rb-0.3.5&ixid=eyJhcHBfaWQiOjEyMDd9&s=f46b60857b4826e733da34993ec26a2f&auto=format&fit=crop&w=1534&q=80"];
NSURLSessionDownloadTask *task = [session downloadTaskWithURL:url];
[task resume];
// Quit immediately so the app is gone before the request finishes.
exit(0);

After execution, we can see the following log on the console:

The log clearly shows the nsurlsessiond process completing the network request on our behalf and trying to wake up the app that quit unexpectedly.

Of course, this approach still has details to handle, for example how to let the app know that a crash log was uploaded successfully so that it can be removed locally. Because an app that keeps crashing on launch is in an extremely unstable state, no code path can be guaranteed to run to completion.

Personally, I think an ideal way is to add a special flag to the logs reported through the background process, and then have the backend use the client request ID together with this flag to deduplicate and sort them.

Consecutive launch crashes in a production app are an extremely serious failure. The scary part is when they happen at scale and go unmonitored: you are humming a tune and typing away at your code when the boss suddenly finds that the app will not start on his phone, opens the App Store, and sees one-star reviews flooding in. For a mainstream app this may even make the tech news, and it is not hard to imagine the blame landing on someone. "Fire Peter" will surely appear in the next app update.

End of article.