6. iOS internals - message sending (objc_msgSend) analysis

Posted by MissiCoola on Mon, 20 Jan 2020 15:13:10 +0100

Debugging analysis

How are methods called in OC? From earlier reading we already know that an OC method call takes the form of message sending, so let's explore the source code to see how that works.

     LGPerson *person = [[LGPerson alloc] init];
     Class pClass = [LGPerson class];
     [person sayNB]; 

Viewing the assembly (the author recommends debugging on a real device; the listing below is x86_64, so it comes from a Mac or simulator build)

In Xcode: Debug > Debug Workflow > Always Show Disassembly

    0x100000b4b <+27>:  movq   0x746(%rip), %rsi         ; (void *)0x00000001000012d8: LGPerson
    0x100000b52 <+34>:  movq   0x71f(%rip), %rcx         ; "alloc"
    0x100000b59 <+41>:  movq   %rsi, %rdi
    0x100000b5c <+44>:  movq   %rcx, %rsi
    0x100000b5f <+47>:  movq   %rax, -0x48(%rbp)
    0x100000b63 <+51>:  callq  *0x4a7(%rip)              ; (void *)0x0000000100344640: objc_msgSend
    0x100000b69 <+57>:  movq   0x710(%rip), %rsi         ; "init"
    0x100000b70 <+64>:  movq   %rax, %rdi
    0x100000b73 <+67>:  callq  *0x497(%rip)              ; (void *)0x0000000100344640: objc_msgSend
    0x100000b79 <+73>:  movq   %rax, -0x18(%rbp)
    0x100000b7d <+77>:  movq   0x714(%rip), %rax         ; (void *)0x00000001000012d8: LGPerson
    0x100000b84 <+84>:  movq   0x6fd(%rip), %rsi         ; "class"
    0x100000b8b <+91>:  movq   %rax, %rdi
    0x100000b8e <+94>:  callq  *0x47c(%rip)              ; (void *)0x0000000100344640: objc_msgSend
    0x100000b94 <+100>: movq   %rax, -0x20(%rbp)
    0x100000b98 <+104>: movq   -0x18(%rbp), %rax
    0x100000b9c <+108>: movq   0x6ed(%rip), %rsi         ; "sayNB"

Stepping through the disassembly shows the four messages sent in this code: alloc, init, class and sayNB.

In each case the selector is loaded into %rsi and the receiver into %rdi, and then objc_msgSend is called (the listing is cut off just before the call for sayNB). Let's continue the analysis.

In the terminal, cd to the directory that contains main.m and run clang -rewrite-objc main.m. This generates main.cpp. Now compare the main functions of the two files:

// oc
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        LGPerson *person = [[LGPerson alloc] init];;
        [person sayNB];
    }
    return 0;
}

// c++
int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool; 
        LGPerson *person = ((LGPerson *(*)(id, SEL))(void *)objc_msgSend)((id)((LGPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("LGPerson"), sel_registerName("alloc")), sel_registerName("init"));;
        ((void (*)(id, SEL))(void *)objc_msgSend)((id)person, sel_registerName("sayNB"));
    }
    return 0;
}

  1. Comparing the two shows that the [person sayNB] call is compiled into objc_msgSend((id)person, sel_registerName("sayNB")) in C++.
  2. Both the disassembly and the rewritten source show that a method call is, underneath, a call to objc_msgSend: calling a method means sending a specific message to a specific object.
  3. objc_msgSend(id, SEL): id is the message receiver (the object) and SEL is the selector (the method's identifier).
  4. The lookup computes key & mask to get an index into the cache and finds the corresponding imp there.
  • Sending a message is a process of finding a function. Beneath OC is C: message sending locates the imp, the real function that implements the method.
  • If a C function is called directly, there is no message sending process at all. It is simply called.
  • // Send a message: objc_msgSend
    // Instance method: receiver is the object (person) + sel
    // Class method: receiver is the class + sel
    // To the parent class: objc_msgSendSuper
Hold Control and click Step Into to step into objc_msgSend; the debugger shows that it lives in libobjc.A.dylib. We can then explore it by searching the objc source code.

 

objc_msgSend analysis

Search globally for objc_msgSend in the objc source code. The entry point, ENTRY _objc_msgSend, is in the file objc-msg-arm64.s

	ENTRY _objc_msgSend
	UNWIND _objc_msgSend, NoFrame

	cmp	p0, #0			// nil check and tagged pointer check
#if SUPPORT_TAGGED_POINTERS
	b.le	LNilOrTagged		//  (MSB tagged pointer looks negative)
#else
	b.eq	LReturnZero
#endif
    // person - isa - class
	ldr	p13, [x0]		// p13 = isa
	GetClassFromIsa_p16 p13		// p16 = class
LGetIsaDone:
	CacheLookup NORMAL		// calls imp or objc_msgSend_uncached

This part is written in assembly and is not easy to read. A brief analysis:

  1. cmp p0, #0: the nil check and tagged pointer check, i.e. test whether the receiver is nil.
  2. Check SUPPORT_TAGGED_POINTERS (whether tagged pointers are supported).
  3. p13 holds isa.
  4. GetClassFromIsa_p16 then gets the current class from isa.
  5. CacheLookup searches the cache first.
  6. The layout is the same as the class structure, just expressed in assembly: offset the class's base address by 16 bytes (isa and superclass occupy 8 bytes each) to reach the cache, then search its buckets for the cached method.
  7. If nothing is found, CheckMiss runs:
.macro CheckMiss
    // miss if bucket->sel == 0
.if $0 == GETIMP
    cbz p9, LGetImpMiss
.elseif $0 == NORMAL
    cbz p9, __objc_msgSend_uncached
.elseif $0 == LOOKUP
    cbz p9, __objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

CheckMiss

Depending on the argument passed in ($0), CheckMiss branches to a different target on a miss.
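
The cache probe and the three miss targets can be modelled in C as follows. This is a simplified sketch under stated assumptions, not the runtime's implementation: ToyClass, Bucket, Cache, cache_fill, cache_lookup and the enum names are all hypothetical, keys are plain integers, and the probe walks forward linearly, which may differ from the order the real assembly uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*Imp)(void);

typedef struct Bucket { uintptr_t key; Imp imp; } Bucket;   /* key 0 = empty slot */
typedef struct Cache  { Bucket *buckets; uint32_t mask; uint32_t occupied; } Cache;

typedef struct ToyClass {
    struct ToyClass *isa;          /* first pointer-sized field  */
    struct ToyClass *superclass;   /* second pointer-sized field */
    Cache cache;                   /* starts two pointers in: 16 bytes on 64-bit */
} ToyClass;

/* Mirrors the $0 argument of CheckMiss and the label each mode branches to. */
typedef enum { MODE_GETIMP, MODE_NORMAL, MODE_LOOKUP } LookupMode;
typedef enum { HIT, MISS_GETIMP, MISS_SEND_UNCACHED, MISS_LOOKUP_UNCACHED } Outcome;

void demo_imp(void) {}             /* placeholder implementation for demos */

/* Store key -> imp; assumes a power-of-two capacity with a free slot left. */
void cache_fill(ToyClass *cls, uintptr_t key, Imp imp) {
    Cache *c = &cls->cache;
    assert(key != 0 && c->occupied <= c->mask);
    uint32_t i = (uint32_t)(key & c->mask);
    while (c->buckets[i].key != 0 && c->buckets[i].key != key) i = (i + 1) & c->mask;
    if (c->buckets[i].key == 0) c->occupied++;
    c->buckets[i].key = key;
    c->buckets[i].imp = imp;
}

/* What a miss turns into, depending on who asked. */
Outcome check_miss(LookupMode mode) {
    switch (mode) {
    case MODE_GETIMP: return MISS_GETIMP;           /* LGetImpMiss */
    case MODE_NORMAL: return MISS_SEND_UNCACHED;    /* __objc_msgSend_uncached */
    case MODE_LOOKUP: return MISS_LOOKUP_UNCACHED;  /* __objc_msgLookup_uncached */
    }
    assert(0 && "unknown mode");                    /* the macro aborts here too */
    return MISS_GETIMP;
}

/* index = key & mask, then step until the key or an empty bucket is found. */
Outcome cache_lookup(const ToyClass *cls, uintptr_t key, LookupMode mode, Imp *out) {
    assert(cls != NULL && key != 0 && out != NULL);
    const Cache *c = &cls->cache;
    uint32_t start = (uint32_t)(key & c->mask);
    uint32_t i = start;
    do {
        const Bucket *b = &c->buckets[i];
        if (b->key == key) { *out = b->imp; return HIT; }
        if (b->key == 0) break;                     /* empty bucket: miss */
        i = (i + 1) & c->mask;
    } while (i != start);
    return check_miss(mode);
}
```

With four buckets the mask is 3, so key 5 lands in slot 1. A lookup for key 9 also starts at slot 1, finds a different key there, moves on to the empty slot 2 and reports a miss.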

objc_msgSend passes NORMAL, so on a miss the next step is __objc_msgSend_uncached. Internally it only invokes MethodTableLookup and then tail-calls a function pointer (x17).

    STATIC_ENTRY __objc_msgSend_uncached
    UNWIND __objc_msgSend_uncached, FrameWithNoSaves

    // THIS IS NOT A CALLABLE C FUNCTION
    // Out-of-band p16 is the class to search
    
    MethodTableLookup
    TailCallFunctionPointer x17

    END_ENTRY __objc_msgSend_uncached

  • Recall the class structure. First find the class through isa, then search the buckets in the class's cache. If nothing is found in the cache, search the class's original storage, i.e. class -> bits -> rw -> ro -> methodList, for the method's definition.
  • The message-sending assembly follows the same route. MethodTableLookup first does a series of preparations and then calls __class_lookupMethodAndLoadCache3 to look up the method and load the cache.
  • Where the assembly ends, C/C++ takes over.
.macro MethodTableLookup
    
    // push frame
    SignLR
    stp fp, lr, [sp, #-16]!
    mov fp, sp
        // A series of preparations
    // save parameter registers: x0..x8, q0..q7
    sub sp, sp, #(10*8 + 8*16)
    stp q0, q1, [sp, #(0*16)]
    stp q2, q3, [sp, #(2*16)]
    stp q4, q5, [sp, #(4*16)]
    stp q6, q7, [sp, #(6*16)]
    stp x0, x1, [sp, #(8*16+0*8)]
    stp x2, x3, [sp, #(8*16+2*8)]
    stp x4, x5, [sp, #(8*16+4*8)]
    stp x6, x7, [sp, #(8*16+6*8)]
    str x8,     [sp, #(8*16+8*8)]

    // receiver and selector already in x0 and x1
    mov x2, x16
    // Everything above is preparation; now look up the method and load the cache
    bl  __class_lookupMethodAndLoadCache3

IMP _class_lookupMethodAndLoadCache3(id obj, SEL sel, Class cls)
{
    return lookUpImpOrForward(cls, sel, obj, 
                              YES/*initialize*/, NO/*cache*/, YES/*resolver*/);
}
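
The overall shape of this slow path (search the class's method list, then its superclasses, and cache what was found) can be sketched in C. Everything here is a hypothetical model rather than lookUpImpOrForward itself: ToyClass, find_imp, lookup_imp_or_forward and slot_for are invented names, selectors are plain strings, the cache is a four-slot direct-mapped table, and the method resolver and initialization steps are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*Imp)(void);
typedef struct Method { const char *sel; Imp imp; } Method;

#define CACHE_SLOTS 4              /* toy direct-mapped cache */

typedef struct ToyClass {
    struct ToyClass *superclass;
    const Method *methods;         /* stands in for bits -> rw -> ro -> methodList */
    size_t method_count;
    Method cache[CACHE_SLOTS];     /* empty slot: sel == NULL */
    int slow_lookups;              /* counts slow-path runs, for the demo only */
} ToyClass;

/* Marker implementations so that returned pointers can be compared. */
void sayCode_impl(void)  {}
void sayHello_impl(void) {}
void forward_impl(void)  {}        /* returned when no class implements sel */

/* Toy hash: sum of the selector's characters, reduced to a slot index. */
size_t slot_for(const char *sel) {
    size_t h = 0;
    while (*sel) h += (unsigned char)*sel++;
    return h % CACHE_SLOTS;
}

/* Slow path: search cls, then each superclass; cache a hit in cls. */
Imp lookup_imp_or_forward(ToyClass *cls, const char *sel) {
    cls->slow_lookups++;
    for (ToyClass *c = cls; c != NULL; c = c->superclass) {
        for (size_t i = 0; i < c->method_count; i++) {
            if (strcmp(c->methods[i].sel, sel) == 0) {
                Method *slot = &cls->cache[slot_for(sel)];
                slot->sel = sel;
                slot->imp = c->methods[i].imp;
                return slot->imp;
            }
        }
    }
    return forward_impl;           /* nothing found anywhere in the chain */
}

/* Fast path first (cache), slow path only on a miss. */
Imp find_imp(ToyClass *cls, const char *sel) {
    assert(cls != NULL && sel != NULL);
    const Method *slot = &cls->cache[slot_for(sel)];
    if (slot->sel != NULL && strcmp(slot->sel, sel) == 0) return slot->imp;
    return lookup_imp_or_forward(cls, sel);
}
```

In this model a second find_imp call for the same selector is served from the cache and leaves slow_lookups unchanged, which is the reason the fast path exists.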

Note: when moving from assembly to C/C++, remove one leading underscore from the assembly symbol to find the function in the source: __class_lookupMethodAndLoadCache3 in assembly is _class_lookupMethodAndLoadCache3 in C.

  • After __class_lookupMethodAndLoadCache3 it was not obvious where to go next, so I looked up some references and checked in the debugger.
  • After stepping into objc_msgSend (Control + Step Into), keep stepping and you reach _objc_msgSend_uncached.
  • Step into _objc_msgSend_uncached. With a breakpoint there, execution jumps to a C++ function located in objc-runtime-new.mm.

Conclusion:

  • objc_msgSend is written in assembly mainly for speed and flexibility (C cannot express a function that preserves unknown arguments and then jumps to an arbitrary function pointer).
  • Assembly can deal with arguments whose number and types are only known at run time, and because it is closer to the machine it gives better performance.
  • objc_msgSend first searches the cache on the fast path (CacheLookup). On a miss it goes to MethodTableLookup, which calls _class_lookupMethodAndLoadCache3 to search the method lists and fill the cache.


Extension

Sending messages
        // Needs #import <objc/message.h>. objc_msgSend is cast to the concrete
        // signature, which strict objc_msgSend checking requires.
        LGStudent *s = [LGStudent new];
        [s sayCode];
        // What the call above compiles to.
        // A message = receiver + selector (+ arguments, the message body)
        ((void (*)(id, SEL))objc_msgSend)(s, sel_registerName("sayCode"));

        // What a class method call compiles to: the receiver is the class
//        id cls = [LGStudent class];
//        void *pointA = &cls;
//        [(__bridge id)pointA sayNB];
        ((void (*)(id, SEL))objc_msgSend)((id)objc_getClass("LGStudent"), sel_registerName("sayNB"));

        // Send a message to the parent class (instance method)
        struct objc_super lgSuper;
        lgSuper.receiver = s;
        lgSuper.super_class = [LGPerson class];
        ((void (*)(struct objc_super *, SEL))objc_msgSendSuper)(&lgSuper, @selector(sayHello));

        // Send a message to the parent class (class method)
        struct objc_super myClassSuper;
        myClassSuper.receiver = [s class];
        myClassSuper.super_class = class_getSuperclass(object_getClass([s class])); // the parent's metaclass
        ((void (*)(struct objc_super *, SEL))objc_msgSendSuper)(&myClassSuper, sel_registerName("sayNB"));

About the last case above, sending a message to the parent class (class method):

The code first takes the metaclass and then its superclass because

class methods are looked up in the metaclass; if the metaclass does not have the method, the search continues in the metaclass's superclass.

Class of s -> metaclass of that class -> superclass of the metaclass (the parent's metaclass)

Starting from the parent class directly gives the same result:

Superclass of s's class -> metaclass of that superclass
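
The two routes can be checked against a small C model of the class graph. This is only an illustration: ToyClass and the four objects are hypothetical stand-ins for LGStudent, LGPerson and their metaclasses, and the root class and root metaclass links are left out (shown as NULL).

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyClass {
    struct ToyClass *isa;          /* a class's isa points to its metaclass */
    struct ToyClass *superclass;
    const char *name;
} ToyClass;

/* Hypothetical hierarchy mirroring LGStudent : LGPerson. */
ToyClass person_meta  = { NULL, NULL, "LGPerson (meta)" };
ToyClass person_cls   = { &person_meta, NULL, "LGPerson" };
ToyClass student_meta = { NULL, &person_meta, "LGStudent (meta)" };
ToyClass student_cls  = { &student_meta, &person_cls, "LGStudent" };

/* Route in the snippet: class_getSuperclass(object_getClass([s class])). */
ToyClass *meta_then_super(ToyClass *cls) {
    assert(cls != NULL && cls->isa != NULL);
    return cls->isa->superclass;
}

/* Other route: object_getClass(class_getSuperclass([s class])). */
ToyClass *super_then_meta(ToyClass *cls) {
    assert(cls != NULL && cls->superclass != NULL);
    return cls->superclass->isa;
}
```

Both functions return the parent's metaclass when given the student class, which is where the lookup for the parent's class method sayNB has to start.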


Topics: Assembly Language C