Today's RunLoop use case is both interesting and useful for long-lived projects that need to track down and fix user-reported problems: use RunLoop to detect when the main thread is stuck, and save the thread stack information so it can be uploaded to the server later.

Resources

Here’s how to use RunLoop to detect main thread lag:

  • WeChat iOS lag monitoring system (read this article first to understand which situations cause the main thread to stall and how to handle a stall once it is detected)
  • A method for monitoring lag (a code snippet that uses RunLoop to monitor lag)
  • Simple lag monitoring on iOS, with demo (an example that uses RunLoop to monitor lag)

Principle

The official documentation explains the order in which runloops are executed:

1. Notify observers that the run loop has been entered.
2. Notify observers that any ready timers are about to fire.
3. Notify observers that any input sources that are not port based are about to fire.
4. Fire any non-port-based input sources that are ready to fire.
5. If a port-based input source is ready and waiting to fire, process the event immediately. Go to step 9.
6. Notify observers that the thread is about to sleep.
7. Put the thread to sleep until one of the following events occurs:
 * An event arrives for a port-based input source.
 * A timer fires.
 * The timeout value set for the run loop expires.
 * The run loop is explicitly woken up.
8. Notify observers that the thread just woke up.
9. Process the pending event.
 * If a user-defined timer fired, process the timer event and restart the loop. Go to step 2.
 * If an input source fired, deliver the event.
 * If the run loop was explicitly woken up but has not yet timed out, restart the loop. Go to step 2.
10. Notify observers that the run loop has exited.

The pseudo-code implementation looks like this:

{
    /// 1. Notify observers that the run loop is about to be entered.
    /// An observer creates the AutoreleasePool here: _objc_autoreleasePoolPush();
    __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopEntry);
    do {
        /// 2. Notify observers that a timer callback is about to occur.
        __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopBeforeTimers);
        /// 3. Notify observers that a source (non-port-based, Source0) callback is about to occur.
        __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopBeforeSources);
        __CFRUNLOOP_IS_CALLING_OUT_TO_A_BLOCK__(block);

        /// 4. Trigger the Source0 (non-port-based) callback.
        __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__(source0);
        __CFRUNLOOP_IS_CALLING_OUT_TO_A_BLOCK__(block);

        /// 6. Notify observers that the thread is about to sleep.
        /// An observer releases and recreates the AutoreleasePool here: _objc_autoreleasePoolPop(); _objc_autoreleasePoolPush();
        __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopBeforeWaiting);

        /// 7. Sleep to wait for a message.
        mach_msg() -> mach_msg_trap();

        /// 8. Notify observers that the thread just woke up.
        __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopAfterWaiting);

        /// 9. If a timer woke the loop, call the timer callback.
        __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__(timer);

        /// 9. If dispatch woke the loop, run the blocks queued on the main dispatch queue.
        __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__(dispatched_block);

        /// 9. If a Source1 (port-based) event woke the loop, handle that event.
        __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__(source1);

    } while (...);

    /// 10. Notify observers that the run loop is about to exit.
    /// An observer releases the AutoreleasePool here: _objc_autoreleasePoolPop();
    __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopExit);
}

The main thread's RunLoop is started automatically when the app launches and has no timeout, so under normal conditions it loops indefinitely between steps 2 and 9.

So we only need to add an observer to the main thread's RunLoop and check whether the time from kCFRunLoopBeforeSources to kCFRunLoopBeforeWaiting is too long.

If it exceeds a certain threshold, we treat the main thread as stalled, dump the current thread stacks to a file, and upload that stall report to the server at a suitable later time.

Implementation steps

After studying the two lag-monitoring demos listed above, I wrote a demo along the lines just described, which should be easier to understand. The first step is to create a child thread and start its RunLoop when the thread starts.

+ (instancetype)shareMonitor
{
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        instance = [[[self class] alloc] init];
        instance.monitorThread = [[NSThread alloc] initWithTarget:self selector:@selector(monitorThreadEntryPoint) object:nil];
        [instance.monitorThread start];
    });
    
    return instance;
}

+ (void)monitorThreadEntryPoint
{
    @autoreleasepool {
        [[NSThread currentThread] setName:@"FluencyMonitor"];
        NSRunLoop *runLoop = [NSRunLoop currentRunLoop];
        // A run loop with no sources or timers exits immediately, so add a port to keep it alive
        [runLoop addPort:[NSMachPort port] forMode:NSDefaultRunLoopMode];
        [runLoop run];
    }
}

The second step is to start monitoring by adding an observer to the main thread's RunLoop and a timer to the child thread's RunLoop that periodically checks the elapsed time (every 0.01 seconds in this demo).

- (void)start
{
    if (_observer) {
        return;
    }

    // 1. Create the observer
    CFRunLoopObserverContext context = {0, (__bridge void *)self, NULL, NULL, NULL};
    _observer = CFRunLoopObserverCreate(kCFAllocatorDefault, kCFRunLoopAllActivities, YES, 0, &runLoopObserverCallBack, &context);

    // 2. Add the observer to the main thread's RunLoop
    CFRunLoopAddObserver(CFRunLoopGetMain(), _observer, kCFRunLoopCommonModes);

    // 3. Create a timer and add it to the child thread's RunLoop
    [self performSelector:@selector(addTimerToMonitorThread) onThread:self.monitorThread withObject:nil waitUntilDone:NO modes:@[NSRunLoopCommonModes]];
}

- (void)addTimerToMonitorThread
{
    if (_timer) {
        return;
    }

    // Create the timer. fireDate is an absolute time, so offset it from now;
    // the timer then repeats every 0.01 seconds.
    CFRunLoopRef currentRunLoop = CFRunLoopGetCurrent();
    CFRunLoopTimerContext context = {0, (__bridge void *)self, NULL, NULL, NULL};
    _timer = CFRunLoopTimerCreate(kCFAllocatorDefault, CFAbsoluteTimeGetCurrent() + 0.1, 0.01, 0, 0, &runLoopTimerCallBack, &context);

    // Add it to the child thread's RunLoop
    CFRunLoopAddTimer(currentRunLoop, _timer, kCFRunLoopCommonModes);
}

The third step is to implement the observer callback.

static void runLoopObserverCallBack(CFRunLoopObserverRef observer, CFRunLoopActivity activity, void *info){
    FluencyMonitor *monitor = (__bridge FluencyMonitor*)info;
    NSLog(@"MainRunLoop---%@",[NSThread currentThread]);
    switch (activity) {
        case kCFRunLoopEntry:
            NSLog(@"kCFRunLoopEntry");
            break;
        case kCFRunLoopBeforeTimers:
            NSLog(@"kCFRunLoopBeforeTimers");
            break;
        case kCFRunLoopBeforeSources:
            NSLog(@"kCFRunLoopBeforeSources");
            monitor.startDate = [NSDate date];
            monitor.excuting = YES;
            break;
        case kCFRunLoopBeforeWaiting:
            NSLog(@"kCFRunLoopBeforeWaiting");
            monitor.excuting = NO;
            break;
        case kCFRunLoopAfterWaiting:
            NSLog(@"kCFRunLoopAfterWaiting");
            break;
        case kCFRunLoopExit:
            NSLog(@"kCFRunLoopExit");
            break;
        default:
            break;
    }
}

According to the log output, the RunLoop may sleep only very briefly, sometimes for a millisecond or even less, while an idle app lets it sleep for a long time.

Since blocks, interaction events, and other main-thread tasks are executed between kCFRunLoopBeforeSources and kCFRunLoopBeforeWaiting, I record the start time at kCFRunLoopBeforeSources and set the executing flag to YES, then set the flag to NO when the loop is about to go to sleep.

The fourth step is to implement the timer callback.

static void runLoopTimerCallBack(CFRunLoopTimerRef timer, void *info)
{
    FluencyMonitor *monitor = (__bridge FluencyMonitor*)info;
    if (!monitor.excuting) {
        return;
    }

    // The main thread is executing a task and this pass of the loop has not
    // finished, so compute how long it has been running
    NSTimeInterval excuteTime = [[NSDate date] timeIntervalSinceDate:monitor.startDate];
    NSLog(@"Timer --%@", [NSThread currentThread]);
    NSLog(@"Main thread executed --%f seconds", excuteTime);

    // Demo threshold: 0.01 seconds
    if (excuteTime >= 0.01) {
        NSLog(@"Thread stuck %f seconds", excuteTime);
        [monitor handleStackInfo];
    }
}

The timer fires every 0.01 seconds. If the executing flag is YES and the time elapsed since the start is at or above the threshold, the stack information is saved for later processing.

To make it easy to capture stack information in the demo, I set both the timer interval (0.01 s) and the stall threshold (0.01 s) very low. In real use both values should be larger, for example a timer interval of 1 s and a stall threshold of 2 s.
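The choice of timer interval and threshold also determines how late a stall is reported: the timer can only notice a stall on one of its ticks, so the report arrives somewhere between the threshold and the threshold plus one interval after the stall begins. A rough sketch of that arithmetic in plain C follows. The function name first_report_ms is hypothetical, and it assumes ticks at 0, interval, 2 × interval and so on, with all values as non-negative milliseconds and a positive interval.

```c
#include <assert.h>

/* Hypothetical helper: time (ms) of the first timer tick at which a
   stall that began at `stall_start_ms` has lasted at least
   `threshold_ms`. Ticks are assumed at 0, interval, 2*interval, ... */
static long first_report_ms(long stall_start_ms, long interval_ms, long threshold_ms) {
    assert(interval_ms > 0);
    assert(stall_start_ms >= 0 && threshold_ms >= 0);
    long earliest = stall_start_ms + threshold_ms;
    /* Round up to the next tick (ceiling division). */
    long ticks = (earliest + interval_ms - 1) / interval_ms;
    return ticks * interval_ms;
}
```

With a 1 s interval and a 2 s threshold, a stall that begins 250 ms after a tick is first reported at the 3000 ms tick, that is, 2750 ms after it began.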

The captured stack information looks like this:

Incident Identifier: 68BAB24C-3224-46C8-89BF-F9AABA2E3530
CrashReporter Key:   TODO
Hardware Model:      x86_64
Process:             RunLoopDemo03 [957]
Path:                /Users/harvey/Library/Developer/CoreSimulator/Devices/6ED39DBB-9F69-4ACB-9CE3-E6EB56BBFECE/data/Containers/Bundle/Application/5A94DEFE-4E2E-4D23-9F69-7B1954B2C960/RunLoopDemo03.app/RunLoopDemo03
Identifier:          com.Haley.RunLoopDemo03
Version:             1.0 (1)
Code Type:           x86-64
Parent Process:      DebugServer [958]

Date/Time:           2016-12-15 00:56:38 +0000
OS Version:          Mac OS X 10.1 (16A323)
Report Version:      104

Exception Type:      SIGTRAP
Exception Codes:     TRAP_TRACE at 0x1063da728
Crashed Thread:      4

Thread 0:
0   libsystem_kernel.dylib   0x000000010a14341a mach_msg_trap + 10
1   CoreFoundation           0x0000000106f1e7b4 __CFRunLoopServiceMachPort + 212
2   CoreFoundation           0x0000000106f1dc31 __CFRunLoopRun + 1345
3   CoreFoundation           0x0000000106f1d494 CFRunLoopRunSpecific + 420
4   GraphicsServices         0x000000010ad8aa6f GSEventRunModal + 161
5   UIKit                    0x00000001073b7964 UIApplicationMain + 159
6   RunLoopDemo03            0x00000001063dbf8f main + 111
7   libdyld.dylib            0x0000000109d7468d start + 1

Thread 1:
0   libsystem_kernel.dylib   0x000000010a14be5e kevent_qos + 10
1   libdispatch.dylib        0x0000000109d13074 _dispatch_mgr_invoke + 248
2   libdispatch.dylib        0x0000000109d12e76 _dispatch_mgr_init + 0

Thread 2:
0   libsystem_kernel.dylib   0x000000010a14b4e6 __workq_kernreturn + 10
1   libsystem_pthread.dylib  0x000000010a16e221 start_wqthread + 13

Thread 3:
0   libsystem_kernel.dylib   0x000000010a14341a mach_msg_trap + 10
1   CoreFoundation           0x0000000106f1e7b4 __CFRunLoopServiceMachPort + 212
2   CoreFoundation           0x0000000106f1dc31 __CFRunLoopRun + 1345
3   CoreFoundation           0x0000000106f1d494 CFRunLoopRunSpecific + 420
4   Foundation               0x00000001064d7ff0 -[NSRunLoop runMode:beforeDate:] + 274
5   Foundation               0x000000010655f991 -[NSRunLoop runUntilDate:] + 78
6   UIKit                    0x0000000107e3d539 -[UIEventFetcher threadMain] + 118
7   Foundation               0x00000001064e7ee4 __NSThread__start__ + 1243
8   libsystem_pthread.dylib  0x000000010a16eabb _pthread_body + 180
9   libsystem_pthread.dylib  0x000000010a16ea07 _pthread_body + 0
10  libsystem_pthread.dylib  0x000000010a16e231 thread_start + 13

Thread 4 Crashed:
0   RunLoopDemo03            0x00000001063dfae5 -[PLCrashReporter generateLiveReportWithThread:error:] + 632
1   RunLoopDemo03            0x00000001063da728 -[FluencyMonitor handleStackInfo] + 152
2   RunLoopDemo03            0x00000001063da2cf runLoopTimerCallBack + 351
3   CoreFoundation           0x0000000106f26964 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20
4   CoreFoundation           0x0000000106f265f3 __CFRunLoopDoTimer + 1075
5   CoreFoundation           0x0000000106f2617a __CFRunLoopDoTimers + 250
6   CoreFoundation           0x0000000106f1df01 __CFRunLoopRun + 2065
7   CoreFoundation           0x0000000106f1d494 CFRunLoopRunSpecific + 420
8   Foundation               0x00000001064d7ff0 -[NSRunLoop runMode:beforeDate:] + 274
9   Foundation               0x00000001064d7ecb -[NSRunLoop run] + 76
10  RunLoopDemo03            0x00000001063d9cbd +[FluencyMonitor monitorThreadEntryPoint] + 253
11  Foundation               0x00000001064e7ee4 __NSThread__start__ + 1243
12  libsystem_pthread.dylib  0x000000010a16eabb _pthread_body + 180
13  libsystem_pthread.dylib  0x000000010a16ea07 _pthread_body + 0
14  libsystem_pthread.dylib  0x000000010a16e231 thread_start + 13

Thread 4 crashed with X86-64 Thread State:
   rip: 0x00000001063dfae5    rbp: 0x000070000f53fc50    rsp: 0x000070000f53f9c0    rax: 0x000070000f53fa20
   rbx: 0x000070000f53fb60    rcx: 0x0000000000005e0b    rdx: 0x0000000000000000    rdi: 0x00000001063dfc6a
   rsi: 0x000070000f53f9f0     r8: 0x0000000000000014     r9: 0xffffffffffffffec    r10: 0x000000010a1433f6
   r11: 0x0000000000000246    r12: 0x000060800016b580    r13: 0x0000000000000000    r14: 0x0000000000000006
   r15: 0x000070000f53fa40 rflags: 0x0000000000000206     cs: 0x000000000000002b     fs: 0x0000000000000000
    gs: 0x0000000000000000


All that remains is to save the string to a file and upload it to the server.

We should not set the stall threshold too low, nor should we upload every stall report, for two reasons. First, it wastes the user's mobile data. Second, too many files take up storage space in the app and, after upload, on the server.

We can follow WeChat's practice: delete files older than 7 days, upload only a random sample of files, and compress files before uploading.
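The retention and sampling rules can be written as two small predicates. This is only a sketch in plain C under stated assumptions: the 7-day window comes from the text, while the function names report_is_expired and report_should_upload, the percentage-based sampling, and the idea of passing in a caller-supplied random draw are all hypothetical and are not WeChat's or the demo's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Retention window from the text: 7 days, in seconds. */
#define REPORT_RETENTION_SECONDS (7L * 24L * 60L * 60L)

/* Hypothetical helper: true when a report file last modified at
   `file_mtime` is more than 7 days old at time `now`. */
static bool report_is_expired(time_t file_mtime, time_t now) {
    return difftime(now, file_mtime) > (double)REPORT_RETENTION_SECONDS;
}

/* Hypothetical helper: decide whether to upload a report, given a
   random draw from the caller and a sampling rate in percent (0-100). */
static bool report_should_upload(unsigned draw, unsigned sample_percent) {
    assert(sample_percent <= 100u);
    return (draw % 100u) < sample_percent;
}
```

Passing the random draw in as an argument keeps both predicates deterministic, so the policy can be checked without touching the file system or a random number generator.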

The sample code in this article comes from RunLoopDemo03 in RunLoopDemos