One, studying Clang

1. Understanding Clang

  • Clang is a lightweight compiler for C, C++, and Objective-C. Its source code is distributed under a BSD-style license.
  • Clang's development was led by Apple. It is a C/C++/Objective-C compiler built on LLVM.
  • As of April 2013, Clang fully supported the C++11 standard and had begun implementing C++1y features (later known as C++14, the next minor update to C++), including generic lambda expressions, simplified handling of return types, and better handling of the constexpr keyword.
  • Clang is written in C++, is based on LLVM, and is released under the LLVM BSD-style license as a C/C++/Objective-C/Objective-C++ compiler. It is almost completely compatible with the GNU C language specification (with some incompatibilities, including slightly different compiler options), and it adds syntactic features of its own, such as C function overloading (functions are marked with __attribute__((overloadable))). One of its goals is to surpass GCC.

2. Clang features

  • End-user features: fast compilation and low memory use; expressive diagnostics; compatibility with GCC.
  • Utility and applications: a modular, library-based architecture; support for diverse clients (code refactoring, static analysis, code generation, etc.); integration with a variety of IDEs; released under the LLVM BSD-style license.
  • Internal design and implementation: a simple, hackable code base; a single unified parser for C, Objective-C, C++, and Objective-C++; conformance with C/C++/Objective-C and their variants.

3. Topics related to the C language

  • Libraries: C standard library, glibc, dietlibc, uClibc, Newlib, EGLIBC, Bionic
  • Features: strings, syntax, preprocessor, variable types and declarations, functions
  • Descendant and related languages: C++, Objective-C, D, C#
  • C and other programming languages: compatibility, operators, comparison of Pascal and C, C to Java bytecode compiler
  • Compilers: Borland Turbo C, Clang, GCC, Visual C++ (C++/CLI, C++/CX), Watcom C/C++ compiler

4. Using Clang

Create a new project and cd into the directory that contains main.m

// By default, main.m is rewritten to main.cpp
clang -rewrite-objc main.m
// main.cpp now appears in the current directory

// We can also write separately named debug and release .cpp files for comparison
// This is useful for projects with multiple build variants
clang -rewrite-objc main.m -o mainDebug.cpp
clang -rewrite-objc main.m -o mainRelease.cpp

  • If you encounter fatal error: 'UIKit/UIKit.h' file not found:
  • Method 1
// Default output name
clang -x objective-c -rewrite-objc -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk main.m

// With an explicit output name
clang -x objective-c -rewrite-objc -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk main.m -o mainclang.cpp

  • Method 2
xcrun -sdk iphonesimulator clang -rewrite-objc main.m
// With an explicit output name
xcrun -sdk iphonesimulator clang -rewrite-objc main.m -o maixcrun.cpp
  • Change the SDK path to match the target platform
// Apple provides seven different SDKs for developers
// 1. Apple TV device
AppleTVOS.platform
// 2. Apple TV simulator
AppleTVSimulator.platform
// 3. iPhone device
iPhoneOS.platform
// 4. iPhone simulator
iPhoneSimulator.platform
// 5. Mac
MacOSX.platform
// 6. Apple Watch device
WatchOS.platform
// 7. Apple Watch simulator
WatchSimulator.platform

If you encounter fatal error: 'libkern/machine/OSByteOrder.h' file not found:

Replace -rewrite-objc with -arch arm64 -rewrite-objc

That is all on Clang for now; more can be added later if needed.

Two, unions and bit fields

1. Union

  • Define a union
union unionA {
    int a;    // 4 bytes
    short b;  // 2 bytes
    char c;   // 1 byte
};
  • Run the code
    union unionA  person;
    person.a = 97;
    NSLog(@"a=%d---b=%d---c=%c",person.a,person.b,person.c);
    person.b = 98;
    NSLog(@"a=%d---b=%d---c=%c",person.a,person.b,person.c);
    person.c = 'c';
    NSLog(@"a=%d---b=%d---c=%c",person.a,person.b,person.c);
    NSLog(@"%lu---%lu", sizeof(person), sizeof(union unionA));
  • The output
2021-07-22 19:12:59.705962+0800 Relationship between objects and ISA[9115:282714] a=97---b=97---c=a
2021-07-22 19:12:59.706048+0800 Relationship between objects and ISA[9115:282714] a=98---b=98---c=b
2021-07-22 19:12:59.706077+0800 Relationship between objects and ISA[9115:282714] a=99---b=99---c=c
2021-07-22 19:12:59.706097+0800 Relationship between objects and ISA[9115:282714] 4---4

// 97-99 are the ASCII codes for a-c
// In each of the three logs, all three members of the union hold the same value
// The union occupies 4 bytes, the same size as member a


Conclusion

  • All members of a union share the same memory, so assigning to any one member overwrites that memory.
  • A union can define multiple members of different types; the memory size of the union is determined by the size of its largest member (rounded up for alignment where necessary).

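Advantages and disadvantages of unions

  • Advantages: memory is used more flexibly, which saves memory.
  • Disadvantages: members cannot coexist; only one member holds a meaningful value at a time.

ASCII code table

ASCII codes are usually tabulated from 0 to 255 in decimal, or 00 to FF in hexadecimal (standard ASCII covers 0 to 127; the rest is the extended range).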

2. Bit field

  • Before looking at bit fields, it helps to understand the LLDB command x/nuf addr
n  the number of memory units to display
---------------------------
u  the size of each unit
   b  one byte
   h  two bytes
   w  four bytes
   g  eight bytes
---------------------------
f  the display format; possible values:
   x  hexadecimal
   d  decimal
   u  unsigned decimal
   o  octal
   t  binary
   a  address (hexadecimal)
   i  instruction
   c  character
   f  floating point

A solid foundation matters.

  • On to the exploration

Define two new structs


// Plain struct
struct struct1 {
    BOOL A;
    BOOL B;
    BOOL C;
    BOOL D;
    BOOL E;
};

// Bit field
struct struct2 {
    BOOL A:1;
    BOOL B:1;
    BOOL C:1;
    BOOL D:1;
    BOOL E:1;
};

Log the sizes, then inspect both variables at a breakpoint

    struct struct1 truct1;
    struct struct2 truct2;
    NSLog(@"----%lu----%lu", sizeof(truct1), sizeof(truct2));
    
    truct1.A = YES;
    truct1.B = YES;
    
    truct2.A = YES;
    truct2.B = YES;


// truct1 occupies 5 bytes; truct2 occupies 1 byte
2021-07-23 10:43:14.583061+0800 Relationship between objects and ISA[12570:373355] ----5----1

// Inspect truct1: display 8 bytes in hexadecimal
// A B C D E = 0x01 0x01 0x00 0x00 0x00 (i.e. 1 1 0 0 0; A is the lowest byte, so the value reads right to left)
(lldb) p &truct1
(struct1 *) $0 = 0x00007ffeef028308
(lldb) x/gx 0x00007ffeef028308
0x7ffeef028308: 0x0000000000000101

// Inspect truct2: display 1 byte in binary
// A B C D E = 0b1 0b1 0b0 0b0 0b0 (i.e. 1 1 0 0 0; A is the lowest bit)
(lldb) p &truct2
(struct2 *) $1 = 0x00007ffeef028300
(lldb) x/bt 0x00007ffeef028300
0x7ffeef028300: 0b00000011

  • Struct: 0x0000000000000101
  • Bit field: 0b00000011
  • Bit fields clearly make full use of every bit, which is an advantage for storing and transmitting data.

Three, exploring the object

  • Define a class in main.m
@interface NBPerson : NSObject{
    NSString * height;
}
@property(nonatomic, copy)NSString   *name;
@property(nonatomic, assign)NSInteger  age;

@end

@implementation NBPerson

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        NSLog(@"Hello, World!");
    }
    return 0;
}

Generate main.cpp and search it for NBPerson

#ifndef _REWRITER_typedef_NBPerson
#define _REWRITER_typedef_NBPerson


// Classes and objects exist as structs in the .cpp file
// NBPerson is a typedef of objc_object
// objc_object is the underlying implementation of NSObject
typedef struct objc_object NBPerson;
typedef struct {} _objc_exc_NBPerson;
#endif

extern "C" unsigned long OBJC_IVAR_$_NBPerson$_name;
extern "C" unsigned long OBJC_IVAR_$_NBPerson$_age;
struct NBPerson_IMPL {


	struct NSObject_IMPL NSObject_IVARS; // nested struct: the member variables inherited from NSObject
	NSString *height; // NBPerson member variable
	NSString *_name; // backing ivar of the name property
	NSInteger _age; // backing ivar of the age property
};

// @property(nonatomic, copy)NSString *name;
// @property(nonatomic,assign)NSInteger age;

/* @end */

// The nested struct referenced above
struct NSObject_IMPL {
	Class isa;
};

  • Find the methods
// @implementation NBPerson

// name getter
static NSString * _I_NBPerson_name(NBPerson * self, SEL _cmd) { return (*(NSString **)((char *)self + OBJC_IVAR_$_NBPerson$_name)); }
extern "C" __declspec(dllimport) void objc_setProperty (id, SEL, long, id, bool, bool);

// name setter
static void _I_NBPerson_setName_(NBPerson * self, SEL _cmd, NSString *name) { objc_setProperty (self, _cmd, __OFFSETOFIVAR__(struct NBPerson, _name), (id)name, 0, 1); }

// age getter
static NSInteger _I_NBPerson_age(NBPerson * self, SEL _cmd) { return (*(NSInteger *)((char *)self + OBJC_IVAR_$_NBPerson$_age)); }
// age setter
static void _I_NBPerson_setAge_(NBPerson * self, SEL _cmd, NSInteger age) { (*(NSInteger *)((char *)self + OBJC_IVAR_$_NBPerson$_age)) = age; }
// @end

This is the generated code for the getters and setters of name and age. The age accessors read and write the ivar directly at self plus the ivar's offset, while the name setter goes through objc_setProperty.

  • Commonly used types

// Class is a pointer to struct objc_class
typedef struct objc_class *Class;

// objc_object has only one member variable
struct objc_object {
    Class _Nonnull isa __attribute__((deprecated));
};

// Similarly, id is a pointer to struct objc_object
typedef struct objc_object *id;

// SEL is a pointer to struct objc_selector
typedef struct objc_selector *SEL;


Four, associating isa

  • Set a breakpoint in this code
NBPerson *nb = [[NBPerson alloc] init];
nb.name = @"nb";
NSLog(@"%@", nb.name);
  • Trace the call chain in the source
// 1
+ (id)alloc
// 2
_objc_rootAlloc(Class cls)
// 3
callAlloc(cls, false/*checkNil*/, true/*allocWithZone*/);
// 4
_objc_rootAllocWithZone(cls, nil);
// 5
_class_createInstanceFromZone(cls, 0, nil, OBJECT_CONSTRUCT_CALL_BADALLOC);
// 6
initInstanceIsa(cls, hasCxxDtor)
inline void 
objc_object::initInstanceIsa(Class cls, bool hasCxxDtor)
{
    ASSERT(!cls->instancesRequireRawIsa());
    ASSERT(hasCxxDtor == cls->hasCxxDtor());

    initIsa(cls, true, hasCxxDtor);
}
inline void 
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{ 
    ASSERT(!isTaggedPointer()); 
    
    isa_t newisa(0);

    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        ASSERT(!DisableNonpointerIsa);
        ASSERT(!cls->instancesRequireRawIsa());

#if SUPPORT_INDEXED_ISA
        ASSERT(cls->classArrayIndex() > 0);
        newisa.bits = ISA_INDEX_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.indexcls = (uintptr_t)cls->classArrayIndex();
#else
        newisa.bits = ISA_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
#   if ISA_HAS_CXX_DTOR_BIT
        newisa.has_cxx_dtor = hasCxxDtor;
#   endif
        newisa.setClass(cls, this);
#endif
        newisa.extra_rc = 1;
    }

    // This write must be performed in a single store in some cases
    // (for example when realizing a class because other threads
    // may simultaneously try to use the class).
    // fixme use atomics here to guarantee single-store and to
    // guarantee memory order w.r.t. the class index table
    // ...but not too atomic because we don't want to hurt instantiation
    isa = newisa;
}
  • The key step: associating the class
inline void
isa_t::setClass(Class newCls, UNUSED_WITHOUT_PTRAUTH objc_object *obj)
{
    // Match the conditional in isa.h.
#if __has_feature(ptrauth_calls) || TARGET_OS_SIMULATOR
#   if ISA_SIGNING_SIGN_MODE == ISA_SIGNING_SIGN_NONE
    // No signing, just use the raw pointer.
    uintptr_t signedCls = (uintptr_t)newCls;

#   elif ISA_SIGNING_SIGN_MODE == ISA_SIGNING_SIGN_ONLY_SWIFT
    // We're only signing Swift classes. Non-Swift classes just use
    // the raw pointer
    uintptr_t signedCls = (uintptr_t)newCls;
    if (newCls->isSwiftStable())
        signedCls = (uintptr_t)ptrauth_sign_unauthenticated((void *)newCls, ISA_SIGNING_KEY, ptrauth_blend_discriminator(obj, ISA_SIGNING_DISCRIMINATOR));

#   elif ISA_SIGNING_SIGN_MODE == ISA_SIGNING_SIGN_ALL
    // We're signing everything
    uintptr_t signedCls = (uintptr_t)ptrauth_sign_unauthenticated((void *)newCls, ISA_SIGNING_KEY, ptrauth_blend_discriminator(obj, ISA_SIGNING_DISCRIMINATOR));

#   else
#       error Unknown isa signing mode.
#   endif

    // Store the (possibly signed) class pointer, shifted right by 3, in shiftcls_and_sig
    shiftcls_and_sig = signedCls >> 3;

#elif SUPPORT_INDEXED_ISA
    // Indexed isa only uses this method to set a raw pointer class.
    // Setting an indexed class is handled separately.
    cls = newCls;

#else // Nonpointer isa, no ptrauth
    // Store the class pointer, shifted right by 3, in shiftcls
    shiftcls = (uintptr_t)newCls >> 3;
#endif
}
  • It turns out that isa is actually a union called isa_t
union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};
  • isa_t contains bits, cls (private), and an anonymous struct. Because they are members of one union, all three share the same memory, which occupies 8 bytes (64 bits).

  • ISA_BITFIELD source analysis


// Here is arm64
# if __arm64__
// ARM64 simulators have a larger address space, so use the ARM64e
// scheme even when simulators build for ARM64-not-e.
// Simulator (or a ptrauth-enabled target) ------------------------------------
#   if __has_feature(ptrauth_calls) || TARGET_OS_SIMULATOR
#     define ISA_MASK        0x007ffffffffffff8ULL
#     define ISA_MAGIC_MASK  0x0000000000000001ULL
#     define ISA_MAGIC_VALUE 0x0000000000000001ULL
#     define ISA_HAS_CXX_DTOR_BIT 0
#     define ISA_BITFIELD                                                      \
        uintptr_t nonpointer        : 1;                                       \
        uintptr_t has_assoc         : 1;                                       \
        uintptr_t weakly_referenced : 1;                                       \
        uintptr_t shiftcls_and_sig  : 52;                                      \
        uintptr_t has_sidetable_rc  : 1;                                       \
        uintptr_t extra_rc          : 8
#     define RC_ONE   (1ULL<<56)
#     define RC_HALF  (1ULL<<7)
#   else
// Real device (arm64 without ptrauth) ------------------------------------
#     define ISA_MASK        0x0000000ffffffff8ULL
#     define ISA_MAGIC_MASK  0x000003f000000001ULL
#     define ISA_MAGIC_VALUE 0x000001a000000001ULL
#     define ISA_HAS_CXX_DTOR_BIT 1
#     define ISA_BITFIELD                                                      \
        uintptr_t nonpointer        : 1;                                       \
        uintptr_t has_assoc         : 1;                                       \
        uintptr_t has_cxx_dtor      : 1;                                       \
        uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ \
        uintptr_t magic             : 6;                                       \
        uintptr_t weakly_referenced : 1;                                       \
        uintptr_t unused            : 1;                                       \
        uintptr_t has_sidetable_rc  : 1;                                       \
        uintptr_t extra_rc          : 19
#     define RC_ONE   (1ULL<<45)
#     define RC_HALF  (1ULL<<18)
#   endif

// This article focuses on the macOS (x86_64) version
// macOS ------------------------------------
# elif __x86_64__

// ANDing isa with this mask extracts the class pointer stored in shiftcls
// In binary: 00000000 00000000 01111111 11111111 11111111 11111111 11111111 11111000
// From low to high, the 64 bits are 3 zeros + 44 ones + 17 zeros
#   define ISA_MASK        0x00007ffffffffff8ULL


#   define ISA_MAGIC_MASK  0x001f800000000001ULL
#   define ISA_MAGIC_VALUE 0x001d800000000001ULL
#   define ISA_HAS_CXX_DTOR_BIT 1
#   define ISA_BITFIELD                                                        \
      /* Whether isa is an optimized (non-pointer) isa: 0 means a plain class pointer, 1 means isa holds more than the class address */ \
      uintptr_t nonpointer        : 1;                                         \
      /* Associated-object flag: 0 means no associated objects, 1 means there are some */ \
      uintptr_t has_assoc         : 1;                                         \
      /* Whether the object has a C++ or Objective-C destructor. If so, the destructor logic must run; if not, the object can be freed faster */ \
      uintptr_t has_cxx_dtor      : 1;                                         \
      /* The class pointer value. With pointer optimization enabled, 33 bits store the class pointer on arm64 and 44 bits on x86_64 */ \
      uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ \
      /* Used by the debugger to tell whether the object is real or uninitialized memory */ \
      uintptr_t magic             : 6;                                         \
      /* Whether the object is, or has been, referenced by an ARC weak variable. Objects without weak references can be freed faster */ \
      uintptr_t weakly_referenced : 1;                                         \
      /* Unused in this version; whether the object is being released is derived from extra_rc and has_sidetable_rc (see isDeallocating() in isa_t) */ \
      uintptr_t unused            : 1;                                         \
      /* Set when the reference count no longer fits in extra_rc and the overflow is kept in a side table */ \
      uintptr_t has_sidetable_rc  : 1;                                         \
      /* The reference count stored inline (initIsa sets it to 1 for a new object); when it overflows, has_sidetable_rc above is used */ \
      uintptr_t extra_rc          : 8
#   define RC_ONE   (1ULL<<56)
#   define RC_HALF  (1ULL<<7)

# else
#   error unknown architecture for packed isa
# endif

  • Inspect isa at a breakpoint, before and after the association
// Before
(lldb) p/t newisa
(isa_t) $28 = {
  bits = 0b0000000000000000000000000000000000000000000000000000000000000000
  cls = nil
   = {
    nonpointer = 0b0
    has_assoc = 0b0
    has_cxx_dtor = 0b0
    shiftcls = 0b00000000000000000000000000000000000000000000
    magic = 0b000000
    weakly_referenced = 0b0
    unused = 0b0
    has_sidetable_rc = 0b0
    extra_rc = 0b00000000
  }
}
// After

(lldb) p/x newisa
(isa_t) $5 = {
  bits = 0x011d80010000823d
  cls = 0x011d80010000823d NBPerson
   = {
    nonpointer = 0x0000000000000001
    has_assoc = 0x0000000000000000
    has_cxx_dtor = 0x0000000000000001
    shiftcls = 0x0000000020001047
    magic = 0x000000000000003b
    weakly_referenced = 0x0000000000000000
    unused = 0x0000000000000000
    has_sidetable_rc = 0x0000000000000000
    extra_rc = 0x0000000000000001
  }
}

Verify the isa

  • Verifying by shifting the isa pointer
(lldb) x/4gx nb
0x101237d20: 0x011d80010000823d 0x0000000000000000
0x101237d30: 0x0000000100004018 0x0000000000000000
(lldb) p/x 0x011d80010000823d >> 3
(long) $11 = 0x0023b00020001047
(lldb) p/x 0x0023b00020001047 << 20
(long) $12 = 0x0002000104700000
(lldb) p/x 0x0002000104700000 >> 17
(long) $13 = 0x0000000100008238
(lldb) po 0x0000000100008238
NBPerson
  • Verifying with ISA_MASK
// The isa is 64 bits wide. ISA_MASK is also 64 bits wide, and exactly bits [3-46] (44 bits) are 1
// ANDing the isa with the mask yields the class address
(lldb) p/x 0x00007ffffffffff8 & 0x011d80010000823d
(long) $16 = 0x0000000100008238
(lldb) po 0x0000000100008238
NBPerson
  • Verifying with shiftcls
(lldb) p/x (long) 0x0000000020001047 << 3
(long) $24 = 0x0000000100008238
(lldb) po 0x0000000100008238
NBPerson

  • In all three ways, shiftcls is the key to associating isa with the class.
  • On __x86_64__, MACH_VM_MAX_ADDRESS = 0x7fffffe00000, so a virtual address needs at most 47 bits.
  • Objects are 8-byte aligned, so a class address is always a multiple of 8 and its lowest three bits are always 0 (for example 0x8, 0x18, 0x30). That is why 47 - 3 = 44 bits are enough for shiftcls.
  • isa association uses not only a bit field but also a union, packing the isa into 8 bytes without wasting a single bit.