Originally written by Matt Galloway

The original address: www.galloway.me.uk/2012/10/a-l…


Today I took a look at how blocks work from a compiler's perspective. By blocks I mean the closures Apple added to C, which are now a well-established part of the language as far as Clang/LLVM is concerned. I had always wondered how blocks work and how they magically appear as Objective-C objects (for example, you can copy, retain, and release them).


The basics

A block looks like this:

void(^block)(void) = ^{
    NSLog(@"I'm a block!");
};

The code above creates a variable called block and assigns a simple block to it. That's easy, but is that all there is to it? No. I wanted to understand exactly what the compiler does with that code.

You can also pass parameters to a block:

void(^block)(int a) = ^(int a) {
    NSLog(@"I'm a block! a = %i", a);
};

Or return a value from the block:

int(^block)(void) = ^{
    NSLog(@"I'm a block!");
    return 1;
};

Being a closure, a block captures the context in which it is defined:

int a = 1;
void(^block)(void) = ^{
    NSLog(@"I'm a block! a = %i", a);
};

How the compiler sorts all of this out is what I was interested in.


Explore a simple example

My first idea was to look at how the compiler handles a very simple block. Consider the following code:

#import <dispatch/dispatch.h>

typedef void(^BlockA)(void);

__attribute__((noinline))
void runBlockA(BlockA block) {
    block();
}

void doBlockA() {
    BlockA block = ^{
        // Empty block
    };
    runBlockA(block);
}

I used two functions because I wanted to see both how a block is set up and how it is called. If both happened in a single function, the optimizer might be clever enough to remove the details we want to see. I marked runBlockA as noinline so that the compiler would not simply inline it into doBlockA, which would collapse the two functions back into one.

The relevant bits of this code compile down to the following (armv7, O3):

    .globl  _runBlockA
    .align  2
    .code   16                      @ @runBlockA
    .thumb_func     _runBlockA
_runBlockA:
@ BB#0:
    ldr     r1, [r0, #12]
    bx      r1

These are the Thumb instructions for the runBlockA function, and they are pretty simple. Looking back at the source, the function just calls the block. In ARM's EABI, register r0 holds the first argument to a function. The first instruction therefore loads into r1 the value stored at the address r0 + 12. Think of this as dereferencing a pointer and reading the value 12 bytes into it. We then branch to the address now held in r1. Note that r1 is used for the target, which means r0 still holds the block itself. So it is likely that the function being called takes the block as its first argument.

From this I can conclude that a block is probably some sort of structure, with the function it should execute stored 12 bytes into that structure. When a block is passed around, what is passed is a pointer to one of these structures.
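The inference above can be mimicked in plain C. This is only a sketch of the idea, not what the compiler emits: FakeBlock, count_call, and run_fake_block are hypothetical names invented for illustration, and the three leading fields stand in for the 12 bytes whose meaning is still unknown at this point in the investigation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a block: a few header fields of unknown
   meaning, followed by the function the block should run. */
struct FakeBlock {
    void *unknown0;
    int unknown1;
    int unknown2;
    void (*invoke)(struct FakeBlock *self);
    int calls; /* illustration only: counts how often invoke ran */
};

/* Plays the role of the block body; it receives the block itself. */
static void count_call(struct FakeBlock *self) {
    self->calls += 1;
}

/* Mirrors `ldr r1, [r0, #12]` followed by `bx r1`: fetch the function
   pointer stored inside the structure and call it, passing the same
   structure pointer along as the first argument. */
static void run_fake_block(struct FakeBlock *block) {
    assert(block != NULL && block->invoke != NULL);
    block->invoke(block);
}
```

Calling run_fake_block on a FakeBlock whose invoke field is count_call increments its calls field by one. On a 32-bit ARM target the invoke field of this struct would sit at byte offset 12; on a 64-bit host the offset differs because pointers are wider.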

Now, look at the doBlockA method:

    .globl  _doBlockA
    .align  2
    .code   16                      @ @doBlockA
    .thumb_func     _doBlockA
_doBlockA:
    movw    r0, :lower16:(___block_literal_global-(LPC1_0+4))
    movt    r0, :upper16:(___block_literal_global-(LPC1_0+4))
LPC1_0:
    add     r0, pc
    b.w     _runBlockA

Well, that's pretty simple too. This is a program-counter-relative load: you can think of it as loading the address of the variable ___block_literal_global into r0. Then runBlockA is called. So the block object passed to runBlockA is this ___block_literal_global thing.

Now we're getting somewhere. But what exactly is ___block_literal_global? Looking through the assembly, we find this:

    .align  2                       @ @__block_literal_global
___block_literal_global:
    .long   __NSConcreteGlobalBlock
    .long   1342177280              @ 0x50000000
    .long   0                       @ 0x0
    .long   ___doBlockA_block_invoke_0
    .long   ___block_descriptor_tmp

Aha, this looks a lot like a structure. It contains five values, each 4 bytes (a long) wide. This must be the block object that runBlockA operates on. And look: the value 12 bytes into the structure, ___doBlockA_block_invoke_0, looks suspiciously like a function pointer. Remember, that is the address runBlockA jumps to.

But what is __NSConcreteGlobalBlock? We'll come back to that. ___doBlockA_block_invoke_0 and ___block_descriptor_tmp are the interesting ones for now, because they also appear in the assembly:

    .align  2
    .code   16                      @ @__doBlockA_block_invoke_0
    .thumb_func     ___doBlockA_block_invoke_0
___doBlockA_block_invoke_0:
    bx      lr

    .section        __DATA,__const
    .align  2                       @ @__block_descriptor_tmp
___block_descriptor_tmp:
    .long   0                       @ 0x0
    .long   20                      @ 0x14
    .long   L_.str
    .long   L_OBJC_CLASS_NAME_

    .section        __TEXT,__cstring,cstring_literals
L_.str:                             @ @.str
    .asciz  "v4@?0"

    .section        __TEXT,__objc_classname,cstring_literals
L_OBJC_CLASS_NAME_:                 @ @"\01L_OBJC_CLASS_NAME_"
    .asciz  "\001"

That ___doBlockA_block_invoke_0 looks suspiciously like the implementation of the block itself, since the block we used was empty. The function returns straight away, which is exactly how we would expect an empty function to compile.

Now look at ___block_descriptor_tmp. This appears to be another structure, this time with four values in it. The second value is 20, which is exactly the size of ___block_literal_global (five 4-byte values), so it is probably a size field. There is also a C string, labelled L_.str, with the value "v4@?0". That looks like some kind of encoded type signature, perhaps describing the type of the block (returns void and takes no parameters). As for the other values, I have no idea.


Can the source code confirm this?

Yes, it can. The code for all of this lives in an LLVM project called compiler-rt. Reading through its source, I found the following definitions in Block_private.h:

struct Block_descriptor {
    unsigned long int reserved;
    unsigned long int size;
    void (*copy)(void *dst, void *src);
    void (*dispose)(void *);
};

struct Block_layout {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void *, ...);
    struct Block_descriptor *descriptor;
    /* Imported variables. */
};

Those look awfully familiar! The Block_layout structure is what our ___block_literal_global is, and the Block_descriptor structure is what our ___block_descriptor_tmp is. My guess that the second value of the descriptor is the size was correct. The slightly odd part is the third and fourth values of Block_descriptor. They look like they should be function pointers, but in our compiled output they are two strings. I'll set that point aside for now.
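The match between these definitions and the assembly can be checked with a little arithmetic. The sketch below is an assumption-laden model rather than the real headers: Block_layout32 and Block_descriptor32 are hypothetical mirror types in which every pointer is replaced by a uint32_t, because pointers (and unsigned long) are 4 bytes wide on armv7. This lets the offsets be computed on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit mirror of Block_descriptor (illustration only). */
struct Block_descriptor32 {
    uint32_t reserved;
    uint32_t size;
    uint32_t copy;     /* a function pointer on the real target */
    uint32_t dispose;  /* a function pointer on the real target */
};

/* Hypothetical 32-bit mirror of Block_layout (illustration only). */
struct Block_layout32 {
    uint32_t isa;         /* a pointer on the real target */
    int32_t  flags;
    int32_t  reserved;
    uint32_t invoke;      /* a function pointer on the real target */
    uint32_t descriptor;  /* a pointer on the real target */
};

/* Checks the two numbers seen in the assembly against the model:
   the `#12` in `ldr r1, [r0, #12]` and the `.long 20` in the descriptor. */
static void check_layout_against_listing(void) {
    assert(offsetof(struct Block_layout32, invoke) == 12);
    assert(sizeof(struct Block_layout32) == 20);
}
```

Under these assumptions, invoke lands 12 bytes into the structure and the whole structure is 20 bytes, which agrees with both the load in runBlockA and the size value stored in ___block_descriptor_tmp.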

The isa field of Block_layout is interesting, because it must be what _NSConcreteGlobalBlock is, and it must also be how a block manages to behave like an Objective-C object. If _NSConcreteGlobalBlock is a class, then the Objective-C message dispatch system will happily treat a block object as a normal object. This is similar to how toll-free bridging works. For more on that topic, read Mike Ash's excellent blog post about it.

Piecing all of that together, the compiler appears to treat the code as something like this:

#import <dispatch/dispatch.h>

// Conceptual sketch: assumes the Block_layout and Block_descriptor
// definitions from Block_private.h are visible.

__attribute__((noinline))
void runBlockA(struct Block_layout *block) {
    // The block itself is passed as the first argument (r0).
    block->invoke(block);
}

void block_invoke(struct Block_layout *block) {
    // Empty block function
}

void doBlockA() {
    struct Block_descriptor descriptor;
    descriptor.reserved = 0;
    descriptor.size = 20;
    descriptor.copy = NULL;
    descriptor.dispose = NULL;

    struct Block_layout block;
    block.isa = _NSConcreteGlobalBlock;
    block.flags = 1342177280; // 0x50000000
    block.reserved = 0;
    block.invoke = (void (*)(void *, ...))block_invoke;
    block.descriptor = &descriptor;

    runBlockA(&block);
}

That's good to know. What goes on under the hood of a block now makes a lot more sense.
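One value in the reconstruction that can be decoded with simple arithmetic is flags: 1342177280 is 0x50000000, which has exactly two bits set, 28 and 30. The helper names below are hypothetical and the sketch only does the bit arithmetic. As far as I can tell, Clang's Block ABI documentation names those two bits BLOCK_IS_GLOBAL and BLOCK_HAS_SIGNATURE, but treat that mapping as coming from that documentation rather than from the article.

```c
#include <assert.h>
#include <stdint.h>

/* The flags value taken from the ___block_literal_global listing. */
#define FLAGS_FROM_LISTING 1342177280u

/* Returns 1 if the given bit of `flags` is set, 0 otherwise. */
static int bit_is_set(uint32_t flags, unsigned bit) {
    assert(bit < 32);
    return (int)((flags >> bit) & 1u);
}

/* Counts how many bits are set in `flags`. */
static int count_set_bits(uint32_t flags) {
    int count = 0;
    while (flags != 0) {
        count += (int)(flags & 1u);
        flags >>= 1;
    }
    return count;
}
```

With the value from the listing, bit_is_set reports bits 28 and 30 as set and count_set_bits returns 2, so no other flags are present on this block.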


What's next?

Next, I'll look at a block that takes a parameter and a block that captures variables from the enclosing scope. Those will surely change things a bit, so stay tuned!


Related series

  • A Look Inside Blocks: Episode 1
  • A Look Inside Blocks: Episode 2
  • A Look Inside Blocks Episode 3 (Block_copy)