Exploring the underlying principles of iOS - the essence of Category

Posted by stravanato on Thu, 06 Jan 2022 00:02:35 +0100

Preface

First, here are some common interview questions about Category:
1. How to use Category?
2. What is the principle of Category?
3. What is the difference between Category and class extension?
4. When is the load method called in Category? Can the load method be inherited?
5. What is the difference between load and initialize? What is their calling order in category? What is the calling process between them when inheritance occurs?
6. Can Category add member variables? If yes, how?

How many of these can you answer? If any of them are unclear, let's work through them together.

Category

What a Category does: it lets you add methods, protocols and properties to a class without modifying the original class.

First, we create a class YZPerson with one instance method, - (void)run;. Then we create two categories, YZPerson+Eat and YZPerson+Drink. YZPerson+Eat implements four methods:

- (void)eat1
{
    NSLog(@"YZPerson+Eat-eat1");
}

- (void)eat2
{
    NSLog(@"YZPerson+Eat-eat2");
}

+ (void)eat3
{
    NSLog(@"YZPerson+Eat-eat3");
}

+ (void)eat4
{
    NSLog(@"YZPerson+Eat-eat4");
}

Run the command xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc YZPerson+Eat.m to convert YZPerson+Eat.m into the C++ source file YZPerson+Eat.cpp.
After compilation, every category file is converted into a structure of type _category_t.

struct _category_t {
    const char *name;                               // Name of the class the category belongs to
    struct _class_t *cls;
    const struct _method_list_t *instance_methods;  // Instance method list
    const struct _method_list_t *class_methods;     // Class method list
    const struct _protocol_list_t *protocols;       // Protocol list
    const struct _prop_list_t *properties;          // Property list
};

In the generated source code you can see how this structure is populated.

The assignments of the third and fourth members (instance_methods and class_methods) are shown in the following two figures.

The generated source shows that, after compilation, the contents of a category (instance methods, class methods, protocols and properties) are stored in a structure variable of type _category_t.

Source code analysis of Category loading:

1. The runtime is initialized:

2. _dyld_objc_notify_register is called, and the address of the map_images function is passed in:

3. map_images calls map_images_nolock, which in turn calls _read_images (an image here is a loaded module):

4. The category information is loaded (it is a two-dimensional array):

5. The core function remethodizeClass(cls) reorganizes the methods of the class object and the metaclass object. It calls attachCategories:

// cls:  the class object (or metaclass object)
// cats: the list of categories to attach
static void 
attachCategories(Class cls, category_list *cats, bool flush_caches) 
{
    if (!cats) return;
    if (PrintReplacedMethods) printReplacements(cls, cats);

    bool isMeta = cls->isMetaClass();

    // fixme rearrange to remove these intermediate allocations
    // Allocate storage space
    // Method list
    method_list_t **mlists = (method_list_t **)
        malloc(cats->count * sizeof(*mlists));
    // Attribute array
    property_list_t **proplists = (property_list_t **)
        malloc(cats->count * sizeof(*proplists));
    // Protocol array
    protocol_list_t **protolists = (protocol_list_t **)
        malloc(cats->count * sizeof(*protolists));

    // Count backwards through cats to get newest categories first
    int mcount = 0;
    int propcount = 0;
    int protocount = 0;
    int i = cats->count;
    bool fromBundle = NO;
    while (i--) {
        // Walk the list backwards, so the category compiled last is taken first
        auto& entry = cats->list[i];
        // Method list
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            mlists[mcount++] = mlist; // the list taken first lands at index 0
            fromBundle |= entry.hi->isBundle();
        }
        // Property list
        property_list_t *proplist = 
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) {
            proplists[propcount++] = proplist;
        }
        // Protocol list
        protocol_list_t *protolist = entry.cat->protocols;
        if (protolist) {
            protolists[protocount++] = protolist;
        }
    }
		
    // Read-write data (class_rw_t) of the class object
    auto rw = cls->data();

    prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
    // Attach the categories' instance (or class) method lists to the method lists of the original class (or metaclass)
    rw->methods.attachLists(mlists, mcount); // mcount lists are attached
    free(mlists);
    if (flush_caches  &&  mcount > 0) flushCaches(cls);

    rw->properties.attachLists(proplists, propcount);
    free(proplists);

    rw->protocols.attachLists(protolists, protocount);
    free(protolists);
}

The attachLists method is implemented as follows:

void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            // Reallocate memory
            setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
            array()->count = newCount;
            
            // Shift the existing lists back by addedCount slots:
            // array()->lists + addedCount <- array()->lists
            memmove(array()->lists + addedCount, array()->lists, 
                    oldCount * sizeof(array()->lists[0]));

            // addedLists holds the category lists; copy them into the
            // front slots that were just vacated
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
        } 
        else {
            // 1 list -> many lists
            List* oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
    }

From the source code above we can conclude:

At runtime, the method lists (instance methods and class methods), protocol lists and property lists of all categories of a class are collected into arrays, and these arrays are inserted in front of the existing lists: instance methods, protocols and properties go into the class object, class methods go into the metaclass object. In other words, the contents of a category are added dynamically to the class object and the metaclass object.
Because the category data is placed at the front, when the same method (for example - (void)run;) exists in a category, in the original class and in the parent class, the category's implementation is found first. If no category implements it, the original class's implementation is used, and only then the parent class's. Note that the category's method is simply found first; it does not overwrite the original class's method, which is still in the list.
When several categories implement the same method, the category compiled last wins: the loop walks the categories backwards (i--) while filling the array forwards (mcount++), so the last compiled category ends up at the front.
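The ordering described above can be reproduced with a small model in plain C (which also compiles as Objective-C). This is only a sketch: ListArray, attach_lists and attach_categories are hypothetical names, each method list is represented by a string, and only the "many lists -> many lists" branch of attachLists is modelled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the runtime's list array: each entry names one method list. */
typedef struct {
    const char **lists;
    size_t count;
} ListArray;

/* Grow the array, shift the old entries back, then copy the new ones to the front. */
static void attach_lists(ListArray *arr, const char *const *added, size_t added_count) {
    if (added_count == 0) return;
    size_t old_count = arr->count;
    size_t new_count = old_count + added_count;
    const char **grown = realloc(arr->lists, new_count * sizeof(*grown));
    assert(grown != NULL);
    arr->lists = grown;
    arr->count = new_count;
    memmove(arr->lists + added_count, arr->lists, old_count * sizeof(*arr->lists));
    memcpy(arr->lists, added, added_count * sizeof(*arr->lists));
}

/* Like the while (i--) loop in attachCategories: walk the compile order backwards. */
static void attach_categories(ListArray *cls, const char *const *cats, size_t n) {
    if (n == 0) return;
    const char **mlists = malloc(n * sizeof(*mlists));
    assert(mlists != NULL);
    size_t mcount = 0;
    size_t i = n;
    while (i--) mlists[mcount++] = cats[i];  /* last compiled category goes to index 0 */
    attach_lists(cls, mlists, mcount);
    free(mlists);
}
```

Starting from a class that holds only its own list "YZPerson" and attaching the categories in compile order { "YZPerson+Eat", "YZPerson+Drink" }, the array becomes YZPerson+Drink, YZPerson+Eat, YZPerson: the category compiled last is first, and the original list is still present at the end.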

Q: How do you tell which category file is compiled last?

Files are compiled in the order in which they are listed under Build Phases > Compile Sources in Xcode; the file at the bottom of that list is compiled last.

Summary: loading process of Category

Load all the Category data of a class through the Runtime
Merge the method, property and protocol data of all categories into one large array (the data of the Category compiled last is at the front of the array)
Insert the merged category data (methods, properties and protocols) in front of the class's original data
This is the loading process of a Category, and also the principle behind it.

Method lookup order: category, original class, parent class
The category comes before the original class (category data is inserted in front of the original class's data)
The original class comes before the parent class (the message sending mechanism)
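To make the lookup order concrete, here is a minimal model in plain C. It is a sketch under simplifying assumptions: Method, ClassModel and lookup_impl are hypothetical names, selectors and implementations are plain strings, and the method cache is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: a method is a selector name plus a label for its implementation. */
typedef struct {
    const char *selector;
    const char *impl;
} Method;

typedef struct ClassModel {
    const struct ClassModel *superclass;
    const Method *methods;   /* category entries are stored in front of the class's own */
    size_t method_count;
} ClassModel;

/* First match wins: scan the class's list front to back, then move to the superclass. */
static const char *lookup_impl(const ClassModel *cls, const char *selector) {
    assert(selector != NULL);
    for (; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < cls->method_count; i++) {
            if (strcmp(cls->methods[i].selector, selector) == 0)
                return cls->methods[i].impl;
        }
    }
    return NULL;  /* not found anywhere in the chain */
}
```

If YZPerson's list is { run from the category, run from the class itself } and YZStudent inherits from YZPerson without implementing run, then looking up "run" on YZStudent returns the category's implementation, even though the class's own run is still in the list.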

The +(void)load method

Let's go over what you need to know about load.

When the program starts, the runtime loads all classes and categories and calls the +load method of every class and every category. Whether or not the program ever uses a class, its +load method is called exactly once during program startup.
The parent class's +load is called before the child class's
The original class's +load is called before the category's
+load call order: parent class before child class, original class before category

Note that +(void)load behaves differently from an ordinary method defined in a category. For an ordinary method that exists in both the original class and a category, only the category's implementation is called. For +(void)load, the implementations in the original class and in every category are all called. It is the same situation, a method with the same name in the original class and in a category, so why is the result different?

Let's continue to look at the source code

A call to an ordinary method such as [YZPerson test]; goes through the message sending mechanism: the class method is looked up in the metaclass through the isa pointer, and if a category implements +test, the category's +test is found and called first.
The +load method is not sent as a message. The runtime takes the address of each +load implementation and calls it directly through the load_method function pointer.

The original class's load method is called first, then the categories' load methods;
The parent class's load method is called first, then the child class's load method;
Original classes with no inheritance relationship are called in compilation order (compiled first, called first);
Categories are called purely in compilation order (compiled first, called first);
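The four rules above can be sketched as a small scheduling model in plain C. The function schedule_class_load borrows its name from the runtime, but everything here is a simplification: LoadClass and load_call_order are hypothetical, and the model assumes every class and category implements +load.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct LoadClass {
    const char *name;
    struct LoadClass *superclass;
    int scheduled;   /* set once the class has been queued, so it is queued only once */
} LoadClass;

/* Queue the superclass before the class itself. Returns the new queue length. */
static size_t schedule_class_load(LoadClass *cls, const char **order, size_t n) {
    if (cls == NULL || cls->scheduled) return n;
    n = schedule_class_load(cls->superclass, order, n);
    cls->scheduled = 1;
    order[n++] = cls->name;
    return n;
}

/* classes and categories are given in compile order.
   order must have room for class_count + category_count entries. */
static size_t load_call_order(LoadClass **classes, size_t class_count,
                              const char *const *categories, size_t category_count,
                              const char **order) {
    assert(order != NULL);
    size_t n = 0;
    for (size_t i = 0; i < class_count; i++)      /* all classes first */
        n = schedule_class_load(classes[i], order, n);
    for (size_t i = 0; i < category_count; i++)   /* then all categories */
        order[n++] = categories[i];
    return n;
}
```

Even if YZStudent is compiled before its parent YZPerson, the resulting order is YZPerson, YZStudent, followed by the categories in compile order.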

The +initialize method

Let's go over what you need to know about initialize.
The +initialize method is called the first time a class receives a message.

When a class is used for the first time (for example, when an object is created), its +initialize method is called once
A class is only initialized once
Call order: the parent class's +initialize is called first, then the child class's
+initialize call order: parent class before child class, category before original class
Let's check the relevant source code:

The source code confirms the points above.
Because +initialize is sent through the message mechanism (the isa pointer and the superclass chain), it has the following characteristic:

If a category implements +initialize, the category's +initialize is called and the class's own +initialize is not. (This is often described as the category overwriting the original class's +initialize. It is not a real overwrite; the original class's +initialize is simply never reached.)

Parent class
@implementation YZPerson
+ (void)initialize
{
    NSLog(@"YZPerson-initialize");
}
@end

Category of the parent class
@implementation YZPerson (Eat)
+ (void)initialize
{
    NSLog(@"YZPerson(Eat)-initialize");
}
@end

Subclass
@implementation YZStudent
//+(void)load
//{
//    NSLog(@"YZStudent-load");
//}
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        [YZStudent alloc];
    }
    return 0;
}

Something surprising happens:

2020-02-26 17:02:11.224559+0800 Category[75206:2732274] YZPerson(Eat)-initialize
2020-02-26 17:02:11.224800+0800 Category[75206:2732274] YZPerson(Eat)-initialize

Wasn't initialize supposed to be called only once? Why is it called twice?

First, it is the category's implementation that is printed. That is expected, because the category's methods sit in front of the original class's methods and are found first.
[YZStudent alloc]; initializes the parent class first. The parent class YZPerson has not been initialized yet, so the first line printed is the parent class's initialize (the category's version).
After the parent class has been initialized, the runtime sends initialize to YZStudent itself. YZStudent does not implement initialize, so the lookup follows the inheritance chain to the parent class and calls the parent's initialize again.
Pseudo code:

if (YZStudent is not initialized)
{
    if (YZPerson, the parent class, is not initialized)
    {
        objc_msgSend([YZPerson class], @selector(initialize));
    }
    objc_msgSend([YZStudent class], @selector(initialize));
}

This is why two calls occur. Each class is still initialized only once: the first call initializes the parent class YZPerson, the second initializes the child class YZStudent. Because the child class does not implement +initialize, the parent's +initialize runs in its place. In other words, if a child class does not implement +initialize, the parent's +initialize is called for it, so the parent's +initialize may run several times.
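The pseudo code can be turned into a small runnable model in plain C. This is a sketch, not the runtime's implementation: InitClass, resolve_initialize and initialize_if_needed are hypothetical names, and each +initialize implementation is represented by the string it would log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct InitClass {
    const char *name;
    struct InitClass *superclass;
    const char *initialize_impl;  /* what +initialize would log; NULL if not implemented */
    int initialized;
} InitClass;

/* Message lookup: the nearest class in the chain that implements +initialize. */
static const char *resolve_initialize(const InitClass *cls) {
    for (; cls != NULL; cls = cls->superclass) {
        if (cls->initialize_impl != NULL) return cls->initialize_impl;
    }
    return NULL;
}

/* Initialize the superclass first, then send +initialize to cls itself.
   Appends what would be logged to log and returns the new length. */
static size_t initialize_if_needed(InitClass *cls, const char **log, size_t n) {
    assert(log != NULL);
    if (cls == NULL || cls->initialized) return n;
    n = initialize_if_needed(cls->superclass, log, n);
    cls->initialized = 1;
    const char *impl = resolve_initialize(cls);
    if (impl != NULL) log[n++] = impl;
    return n;
}
```

With YZPerson's implementation set to "YZPerson(Eat)-initialize" (the category's version is found first) and YZStudent's set to NULL, initializing YZStudent logs that string twice, matching the output above. A second use of YZStudent logs nothing, because both classes are already marked as initialized.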

Q: Can you add properties to a category?

It is often said that a category can only add methods, not properties. That statement is not rigorous. It should be:
A category can only add methods directly; it cannot add properties directly, but it can add them indirectly.

In an ordinary class, @property (assign, nonatomic) int age;
does three things:

It generates the member variable _age
It generates the declarations of the getter and setter for age
It generates the implementations of the getter and setter for age
In a category, @property (assign, nonatomic) int weight; is allowed, but it does only one thing:
It generates the declarations of the getter and setter for weight

How can a category add a property indirectly?

We can implement the property's getter and setter in the category with the runtime's associated object functions (objc_setAssociatedObject and objc_getAssociatedObject). The implementation looks like this:

@interface YZPerson : NSObject
@property (assign, nonatomic) int age;
@end

@interface YZPerson (Eat)
@property (copy, nonatomic) NSString *name;
@end

#import <objc/runtime.h>
@implementation YZPerson (Eat)
- (void)setName:(NSString *)name
{
    objc_setAssociatedObject(self, @selector(setName:), name, OBJC_ASSOCIATION_COPY_NONATOMIC);
}

- (NSString *)name
{
    return objc_getAssociatedObject(self, @selector(setName:));
}
@end

YZPerson *person1 = [[YZPerson alloc] init];
person1.age = 10;
person1.name = @"zhangSan";
        
YZPerson *person2 = [[YZPerson alloc] init];
person2.age = 20;
person2.name = @"liSi";
        
NSLog(@"person1.age = %d, person2.age = %d", person1.age, person2.age);
NSLog(@"person1.name = %@, person2.name = %@", person1.name, person2.name);

Result:
2020-02-27 16:26:56.015710+0800 Category[6423:189583] person1.age = 10, person2.age = 20
2020-02-27 16:26:56.015980+0800 Category[6423:189583] person1.name = zhangSan, person2.name = liSi
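The reason each person keeps its own name is that an associated value is not stored inside the object; it lives in a side table keyed by the object and the key. The following plain C model illustrates that idea only. It is a sketch with hypothetical names (Association, set_associated, get_associated) and a small fixed-size table, and it ignores memory management policies such as OBJC_ASSOCIATION_COPY_NONATOMIC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ASSOCIATIONS 64

/* One entry of the side table: (object, key) -> value. */
typedef struct {
    const void *object;
    const void *key;
    const char *value;
} Association;

static Association table[MAX_ASSOCIATIONS];
static size_t table_count = 0;

/* Store or replace the value associated with (object, key). */
static void set_associated(const void *object, const void *key, const char *value) {
    for (size_t i = 0; i < table_count; i++) {
        if (table[i].object == object && table[i].key == key) {
            table[i].value = value;
            return;
        }
    }
    assert(table_count < MAX_ASSOCIATIONS);
    table[table_count].object = object;
    table[table_count].key = key;
    table[table_count].value = value;
    table_count++;
}

/* Return the value associated with (object, key), or NULL if there is none. */
static const char *get_associated(const void *object, const void *key) {
    for (size_t i = 0; i < table_count; i++) {
        if (table[i].object == object && table[i].key == key)
            return table[i].value;
    }
    return NULL;
}
```

Two different objects can use the same key without interfering with each other, which is why using one selector as the key for every YZPerson instance works in the category above.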

Answers to the interview questions:

Call order
Category methods: category, original class, parent class
load: parent class before child class, original class before category
initialize: parent class before child class, category before original class

Category methods follow the message sending mechanism completely, so the order is category, original class, parent class
For both load and initialize, the runtime code explicitly handles the parent class first (recursively), so the parent comes before the child
The load code explicitly calls the original classes first and the categories afterwards, so the original class comes before the category
The initialize code says nothing about original class versus category, so it simply follows the message sending mechanism and the category's implementation is found first
1. How to use Category?
A Category extends a class with new methods without modifying the original class.
2. What is the principle of Category?
At compile time, each Category is converted into a structure of type _category_t.
At runtime, the method, property and protocol data of all categories are merged into one large array (the data of the Category compiled last is at the front of the array), and the merged category data is inserted in front of the class's original data.
3. What is the difference between Category and class extension?
A Category extends a class with new methods without modifying the original class. Compared with inheritance:
• a Category can only add methods, not member variables;
• inheritance can add both methods and member variables, but it produces a new class.
Compared with a class extension:
• a Category has a name, a class extension has none;
• a Category can only add methods, not member variables; a class extension can add both methods and member variables;
• a class extension is usually written in the .m file and used to add private methods and member variables (properties);
• Category data is merged into the class information at runtime, while class extension data is already part of the class information at compile time.
4. When is the load method called in Category? Can the load method be inherited?
When the program starts, the runtime loads all classes and categories and calls the +load method of every class and every category. Whether or not the program ever uses a class, its +load method is called exactly once during program startup.

@implementation YZPerson
+(void)load
{
    NSLog(@"YZPerson-load");
}
@end

@implementation YZStudent

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSLog(@"----");
        [YZStudent load];
        NSLog(@"----");
    }
    return 0;
}

Result:
2020-02-26 15:23:20.747580+0800 Category[74061:2678272] YZPerson-load
2020-02-26 15:23:20.747839+0800 Category[74061:2678272] ----
2020-02-26 15:23:20.747862+0800 Category[74061:2678272] YZPerson-load
2020-02-26 15:23:20.747871+0800 Category[74061:2678272] ----

So the load method can be inherited.
However, [YZStudent load]; is an ordinary message send: it follows the isa pointer and the superclass chain, unlike the direct call the system makes to +load at startup.

5. What is the difference between load and initialize? What is their calling order in category? What is the calling process between them when inheritance occurs?

1. Different calling mechanisms:
load is called directly through its function address;
initialize is called through the message mechanism, objc_msgSend;

2. Different call timing:
load is called when the runtime loads classes and categories at program startup (each implementation is called only once)
initialize is called when the class is used for the first time (if a subclass has no +initialize method, the parent class's +initialize may be called multiple times)

3. With categories:
every load implementation is called: original classes first, then categories in compilation order
for initialize, only the implementation found first by message lookup is called, so a category's initialize runs instead of the original class's

4. With inheritance:
load: the runtime explicitly calls the parent class's load before the child class's
initialize: the parent class is initialized before the child class, and the call goes through the isa pointer and the superclass chain, so a subclass without initialize ends up calling the parent's

6. Can Category add member variables? If yes, how?
A Category cannot add member variables directly, but it can add properties indirectly through associated objects in the runtime.
