Research on Swift Object Memory Model (I)

Posted by esas_nefret on Sun, 30 Jun 2019 20:38:44 +0200

This article is from the Tencent Bugly official WeChat account (weixin Bugly). Please do not reproduce it without the author's consent. The original address is: https://mp.weixin.qq.com/s/zIkB9KnAt1YPWGOOwyqY3Q

Author: Wang Zhenyu

> HandyJSON is an open source Swift library for processing JSON data. Like JSONModel, it can convert JSON data directly into class instances for use in code.

> Because Swift is a static language, it has no flexible Runtime mechanism like Objective-C's. To achieve an effect similar to JSONModel, HandyJSON takes a different route: it bypasses the dependence on the Runtime and writes directly into the instance's memory to assign its properties, thus obtaining a fully initialized instance.

> This article introduces the principle behind HandyJSON by exploring the Swift object memory model.

Memory allocation

  • Stack: stores temporary variables of value types, the function call stack, and the pointers held by temporary variables of reference types
  • Heap: stores instances of reference types
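Where a variable is stored cannot be observed directly, but the difference between value types and reference types shows up when copying. The following is a minimal sketch; the names ValuePoint and ReferencePoint are hypothetical and exist only for this illustration.

```swift
// Hypothetical types, used only for this illustration.
struct ValuePoint {
    var x = 0
}

final class ReferencePoint {
    var x = 0
}

// Copying a struct copies the value itself.
let firstValue = ValuePoint()
var secondValue = firstValue
secondValue.x = 10

// Copying a class variable copies only the pointer to the heap instance.
let firstReference = ReferencePoint()
let secondReference = firstReference
secondReference.x = 10

print(firstValue.x)      // 0
print(firstReference.x)  // 10
```

The struct copy is independent of the original, while both class variables refer to the same instance on the heap.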

MemoryLayout

Basic usage method

MemoryLayout is a utility type introduced in Swift 3.0 for calculating how much memory a piece of data occupies. The basic usage is as follows:

MemoryLayout<Int>.size   //8

let a: Int = 10
MemoryLayout.size(ofValue: a)   //8

Introduction to MemoryLayout properties

MemoryLayout has three very useful properties, all of type Int:

alignment & alignment(ofValue: T)

This property relates to memory alignment. Many computer systems restrict the addresses that are legal for basic data types, requiring the address of an object of a given type to be a multiple of some value K (usually 2, 4 or 8). This restriction simplifies the design of the hardware that forms the interface between the processor and the memory system. The alignment rule is that the address of any K-byte basic object must be a multiple of K.

MemoryLayout<T>.alignment gives the alignment requirement of type T. On a 64-bit system, the largest alignment among the basic data types is 8 bytes.

size & size(ofValue: T)

The number of contiguous bytes of memory occupied by an instance of type T.

stride & stride(ofValue: T)

In an array of T, stride is the number of bytes from the start address of one element to the start address of the next. As shown in the figure:

Note: There are four elements of type T in the array. Although each element is only size bytes in size, it actually consumes stride bytes because of memory alignment; the remaining stride - size bytes are the space each element wastes on alignment.
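The relationship between the three properties can be checked with a small example. This is a sketch that assumes a 64-bit platform; the Sample type is hypothetical and was chosen so that size and stride differ.

```swift
// Hypothetical type: an 8-byte field followed by a 1-byte field.
struct Sample {
    var number: Int64 = 0
    var flag: Int8 = 0
}

print(MemoryLayout<Sample>.size)       // 9
print(MemoryLayout<Sample>.alignment)  // 8
print(MemoryLayout<Sample>.stride)     // 16

// Bytes lost to padding for each element of a [Sample] array.
let wastedBytes = MemoryLayout<Sample>.stride - MemoryLayout<Sample>.size
print(wastedBytes)                     // 7
```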

MemoryLayout for Basic Data Types

//Value type
MemoryLayout<Int>.size           //8
MemoryLayout<Int>.alignment      //8
MemoryLayout<Int>.stride         //8

MemoryLayout<String>.size           //24 when this article was written; 16 with Swift 5 on 64-bit
MemoryLayout<String>.alignment      //8
MemoryLayout<String>.stride         //24 when this article was written; 16 with Swift 5 on 64-bit

//Reference type T
MemoryLayout<T>.size           //8
MemoryLayout<T>.alignment      //8
MemoryLayout<T>.stride         //8

//Pointer type
MemoryLayout<UnsafeMutablePointer<T>>.size           //8
MemoryLayout<UnsafeMutablePointer<T>>.alignment      //8
MemoryLayout<UnsafeMutablePointer<T>>.stride         //8

MemoryLayout<UnsafeMutableBufferPointer<T>>.size           //16
MemoryLayout<UnsafeMutableBufferPointer<T>>.alignment      //8
MemoryLayout<UnsafeMutableBufferPointer<T>>.stride         //16

Swift pointer

Common Swift pointer types

This article mainly uses the following pointer types, introduced here by analogy with C:

  • UnsafePointer<T> is equivalent to const T*
  • UnsafeMutablePointer<T> is equivalent to T*
  • UnsafeRawPointer is equivalent to const void*
  • UnsafeMutableRawPointer is equivalent to void*
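To make the analogy concrete, here is a small sketch that views one allocated Int through all four pointer types. The variable names are made up for illustration.

```swift
// Allocate room for one Int; the memory starts out uninitialized.
let mutableTyped = UnsafeMutablePointer<Int>.allocate(capacity: 1)
mutableTyped.initialize(to: 42)

// UnsafePointer<Int> is the read-only typed view, like `const T *`.
let readOnlyTyped = UnsafePointer<Int>(mutableTyped)

// Raw pointers drop the element type, like `void *` and `const void *`.
let mutableRaw = UnsafeMutableRawPointer(mutableTyped)
let readOnlyRaw = UnsafeRawPointer(mutableRaw)

// Write through the mutable typed pointer, then read through the others.
mutableTyped.pointee += 1
let viaTyped = readOnlyTyped.pointee
let viaRaw = readOnlyRaw.load(as: Int.self)
print(viaTyped, viaRaw)  // 43 43

// Manually allocated memory must be torn down by hand.
mutableTyped.deinitialize(count: 1)
mutableTyped.deallocate()
```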

Getting pointers to objects in Swift

final func withUnsafeMutablePointers<R>(_ body: (UnsafeMutablePointer<Header>, UnsafeMutablePointer<Element>) throws -> R) rethrows -> R

//Basic data types
var a: T = T()
//The pointer is only guaranteed to be valid inside the closure
var aPointer = withUnsafeMutablePointer(to: &a) { return $0 }

//Get a pointer to an instance of struct type, From HandyJSON
mutating func headPointerOfStruct() -> UnsafeMutablePointer<Int8> {
    return withUnsafeMutablePointer(to: &self) {
        return UnsafeMutableRawPointer($0).bindMemory(to: Int8.self, capacity: MemoryLayout<Self>.stride)
    }
}

//Get a pointer to the class type instance, From HandyJSON
func headPointerOfClass() -> UnsafeMutablePointer<Int8> {
    let opaquePointer = Unmanaged.passUnretained(self as AnyObject).toOpaque()
    let mutableTypedPointer = opaquePointer.bindMemory(to: Int8.self, capacity: MemoryLayout<Self>.stride)
    return UnsafeMutablePointer<Int8>(mutableTypedPointer)
}
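The Swift documentation states that the pointer passed to the closure of withUnsafeMutablePointer(to:_:) is valid only while the closure runs. The helpers above return it anyway, which worked in the author's experiments but is not guaranteed. A more conservative sketch keeps all the work inside the closure:

```swift
var number: Int = 10

// Mutate through a typed pointer that is used only inside the closure.
withUnsafeMutablePointer(to: &number) { pointer in
    pointer.pointee += 1
}

// Copy the raw bytes out instead of letting the pointer escape.
let bytes: [UInt8] = withUnsafeBytes(of: &number) { Array($0) }

print(number)       // 11
print(bytes.count)  // 8 on a 64-bit platform
```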

Struct memory model

In Swift, struct is a value type. A temporary variable of a struct type that contains no reference types is stored on the stack:

struct Point {
    var a: Double
    var b: Double
}

MemoryLayout<Point>.size     //16

The memory model is shown as follows:

Look at another situation:

struct Point {
    var a: Double?
    var b: Double
}

MemoryLayout<Point>.size    //24

As you can see, when property a becomes an optional type, the whole Point type grows by 8 bytes. In fact, though, making the type optional adds only one byte:

MemoryLayout<Double>.size               //8
MemoryLayout<Optional<Double>>.size     //9

The reason property a adds 8 bytes to the Point type once it becomes optional is the memory alignment restriction:

Optional<Double> occupies the first nine bytes, which leaves seven bytes in the second 8-byte slot. Property b is a Double, whose alignment is 8, so b can only start at byte 16. The Point type therefore takes 24 bytes of storage, seven of which are wasted.

So, from the examples above, we can conclude that Swift's optional types can waste memory space.
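The padding can be confirmed with MemoryLayout.offset(of:), which reports where each stored property begins. This is a sketch assuming a 64-bit platform; OptionalPoint is a hypothetical copy of the second Point, renamed to avoid a clash.

```swift
// Hypothetical copy of the article's second Point struct.
struct OptionalPoint {
    var a: Double?
    var b: Double
}

let optionalSize = MemoryLayout<Double?>.size
let offsetOfA = MemoryLayout<OptionalPoint>.offset(of: \OptionalPoint.a) ?? -1
let offsetOfB = MemoryLayout<OptionalPoint>.offset(of: \OptionalPoint.b) ?? -1

print(optionalSize)                      // 9
print(offsetOfA)                         // 0
print(offsetOfB)                         // 16, so bytes 9 through 15 are padding
print(MemoryLayout<OptionalPoint>.size)  // 24
```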

Modifying a property of a struct instance through memory

struct Demo

The following shows a simple structure that we will use to complete an example operation:

enum Kind {
    case wolf
    case fox
    case dog
    case sheep
}

struct Animal {
    private var a: Int = 1       //8 byte
    var b: String = "animal"     //24 byte
    var c: Kind = .wolf          //1 byte
    var d: String?               //25 byte
    var e: Int8 = 8              //1 byte

    //Returns a pointer to the head of the Animal instance
    mutating func headPointerOfStruct() -> UnsafeMutablePointer<Int8> {
        return withUnsafeMutablePointer(to: &self) {
            return UnsafeMutableRawPointer($0).bindMemory(to: Int8.self, capacity: MemoryLayout<Self>.stride)
        }
    }

    func printA() {
        print("Animal a:\(a)")
    }
}

Operation

First, we need to initialize an Animal instance:

var animal = Animal()     // a: 1, b: "animal", c: .wolf, d: nil, e: 8

Get the pointer to animal:

let animalPtr: UnsafeMutablePointer<Int8> = animal.headPointerOfStruct()

Now the situation in memory is as follows:

PS: As can be seen from the figure, the size of the Animal type is 8 + 24 + 8 + 25 + 1 = 66, the alignment is 8, and the stride is 8 + 24 + 8 + 32 = 72.
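These figures depend on the toolchain. With Swift 5 on a 64-bit platform, String is 16 bytes rather than 24, so the totals above will not be reproduced exactly. Rather than hard-coding them, the layout can be queried. The sketch below uses AnimalLayout, a hypothetical copy of Animal with property a made non-private so that key paths can reach it.

```swift
enum Kind { case wolf, fox, dog, sheep }

// Hypothetical copy of Animal; `a` is not private here so key paths can see it.
struct AnimalLayout {
    var a: Int = 1
    var b: String = "animal"
    var c: Kind = .wolf
    var d: String?
    var e: Int8 = 8
}

// offset(of:) returns nil only for properties without in-line storage.
func offset(_ keyPath: PartialKeyPath<AnimalLayout>) -> Int {
    return MemoryLayout<AnimalLayout>.offset(of: keyPath) ?? -1
}

print("a:", offset(\AnimalLayout.a))
print("b:", offset(\AnimalLayout.b))
print("c:", offset(\AnimalLayout.c))
print("d:", offset(\AnimalLayout.d))
print("e:", offset(\AnimalLayout.e))
print("size:", MemoryLayout<AnimalLayout>.size)
print("alignment:", MemoryLayout<AnimalLayout>.alignment)
print("stride:", MemoryLayout<AnimalLayout>.stride)
```

The printed values are the ones to use for pointer offsets on the machine at hand.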

To modify a property of the animal instance through memory, we need to find the memory area that holds the property's value and then change the value stored there:

//Converting the previously obtained pointer to the animal instance into a raw pointer makes pointer offset operations easier
let animalRawPtr = UnsafeMutableRawPointer(animalPtr)
let intValueFromJson = 100

let aPtr = animalRawPtr.advanced(by: 0).assumingMemoryBound(to: Int.self)
aPtr.pointee          // 1
animal.printA()       //Animal a:1
aPtr.initialize(to: intValueFromJson)
aPtr.pointee          // 100
animal.printA()       //Animal a:100

By doing this, we have changed the value of an Int property of animal from 1 to 100, even though the property is private.

Code analysis

First, animalPtr is an Int8 pointer, that is, a byte pointer, which points to the first byte of the memory in which the animal instance resides. To reach property a of the animal instance we need an Int pointer, so as an Int8 pointer animalPtr clearly does not meet the requirement.

So we first convert animalPtr to the UnsafeMutableRawPointer type (equivalent to void * in C). Property a is offset by 0 bytes in memory, so no real offset is needed. Then, through the assumingMemoryBound(to: Type) method, we get a pointer that points to the same address but has the specified type Type (Int in this case). The result is a pointer to the first address of the animal instance, but of type Int.

The assumingMemoryBound(to:) method is described in the documentation as follows: >Returns a typed pointer to the memory referenced by this pointer, assuming that the memory is already bound to the specified type

In other words, the method assumes that a block of memory is already bound to some data type (in this case the green memory area holds an Int, so we can treat that area as bound to Int) and returns a pointer of that type to the memory area (in this case we pass Int.self as the type parameter and get back an Int pointer to the green memory area).

So, through assumingMemoryBound(to: Int.self), we get the Int pointer aPtr that points to property a.

In Swift, a pointer has a property called pointee, through which we can get the value in the memory that the pointer points to, similar to *pointer in C.

Because the default value of a is 1 when the animal instance is initialized, the value of aPtr.pointee is also 1.

Then we use the initialize(to:) method to re-initialize the memory area pointed to by aPtr, that is, the green area in the figure, and change its value to 100. This completes the operation of modifying property a through memory.

The idea for modifying the later properties is the same. First, obtain a pointer to the starting address of a property by offsetting animalRawPtr. Next, convert the pointer type for that memory area with assumingMemoryBound(to:). Finally, use the converted pointer to rewrite the value in that memory area by re-initializing it, which completes the modification.
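The same steps can be sketched so that the pointer never leaves the closure and the offset of a later property is looked up instead of counted by hand. Pet and rewritePet are hypothetical names used only here; this is an illustration of the idea, not HandyJSON's actual code.

```swift
// Hypothetical two-property struct; `a` is private, as in the Animal example.
struct Pet {
    private var a: Int = 1
    var e: Int8 = 8

    func currentA() -> Int { return a }
}

func rewritePet() -> Pet {
    var pet = Pet()
    withUnsafeMutablePointer(to: &pet) { head in
        let raw = UnsafeMutableRawPointer(head)

        // `a` is the first stored property, so it sits at offset 0.
        let aPtr = raw.advanced(by: 0).assumingMemoryBound(to: Int.self)
        aPtr.pointee = 100

        // For later properties, ask MemoryLayout for the offset.
        if let eOffset = MemoryLayout<Pet>.offset(of: \Pet.e) {
            let ePtr = raw.advanced(by: eOffset).assumingMemoryBound(to: Int8.self)
            ePtr.pointee = 9
        }
    }
    return pet
}

let pet = rewritePet()
print(pet.currentA(), pet.e)  // 100 9
```

Plain assignment through pointee is enough here because Int and Int8 are trivial types with nothing to release.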

Class memory model

Class is a reference type. Its instances are allocated in the heap, and the stack stores only a pointer to the instance on the heap. Because of the dynamic nature of reference types and ARC, a class instance needs a separate area to store type information and the reference count.

class Human {
    var age: Int?
    var name: String?
    var nicknames: [String] = [String]()

    //Returns a pointer to the head of the Human instance
    func headPointerOfClass() -> UnsafeMutablePointer<Int8> {
        let opaquePointer = Unmanaged.passUnretained(self as AnyObject).toOpaque()
        let mutableTypedPointer = opaquePointer.bindMemory(to: Int8.self, capacity: MemoryLayout<Human>.stride)
        return UnsafeMutablePointer<Int8>(mutableTypedPointer)
    }
}

MemoryLayout<Human>.size       //8

The memory distribution of the Human class is as follows:

The type information area is 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine. The reference count takes up 8 bytes. So, on the heap of a 64-bit machine, the properties of a class instance start at byte 16.
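The 16-byte header can be checked with a class that has a single stored property. This is a sketch that assumes a 64-bit platform; Box is a hypothetical class, and the header size is hard-coded on that assumption.

```swift
// Hypothetical class with one stored property.
final class Box {
    var value: Int = 1
}

let box = Box()

// Address of the instance on the heap, i.e. the start of the type-information word.
let head = Unmanaged.passUnretained(box).toOpaque()

// Assumption: 64-bit platform, so type pointer (8) + reference count (8) = 16.
let headerSize = 16
let valuePtr = head.advanced(by: headerSize).assumingMemoryBound(to: Int.self)

let before = valuePtr.pointee
valuePtr.pointee = 100

// Keep the instance alive while reading the property back.
let after = withExtendedLifetime(box) { box.value }
print(before, after)  // 1 100
```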

Modifying a property of a class instance through memory

This works the same way as modifying a property of a struct. The only difference is that, after obtaining the first address of the class instance on the heap, we must offset by 16 bytes to reach the start address of the first property, because of the type field and the reference count field. The following example modifies the nicknames property:

let human = Human()
let arrFromJson = ["goudan", "zhaosi", "wangwu"]

//Get the void * pointer to human heap memory
let humanRawPtr = UnsafeMutableRawPointer(human.headPointerOfClass())

//nicknames array offsets 64 byte in memory (16 + 16 + 32)
let humanNickNamesPtr = humanRawPtr.advanced(by: 64).assumingMemoryBound(to: Array<String>.self)
human.nicknames           //[]
humanNickNamesPtr.initialize(to: arrFromJson)
human.nicknames           //["goudan", "zhaosi", "wangwu"]
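The offset of 64 depends on the sizes of Int? and String? in the toolchain the author used. With Swift 5 on a 64-bit platform String? is 16 bytes, so the same property would sit at a different offset. A hedged sketch that computes the offset from MemoryLayout follows. It assumes a 16-byte object header and stored properties laid out in declaration order at their natural alignment; Person and alignUp are hypothetical names.

```swift
// Hypothetical class with the same stored properties as Human.
final class Person {
    var age: Int?
    var name: String?
    var nicknames: [String] = []
}

// Round `offset` up to the next multiple of `alignment`.
func alignUp(_ offset: Int, to alignment: Int) -> Int {
    return (offset + alignment - 1) / alignment * alignment
}

// Assumptions: 64-bit platform (16-byte header), declaration-order layout.
var cursor = 16
cursor = alignUp(cursor, to: MemoryLayout<Int?>.alignment) + MemoryLayout<Int?>.size
cursor = alignUp(cursor, to: MemoryLayout<String?>.alignment) + MemoryLayout<String?>.size
let nicknamesOffset = alignUp(cursor, to: MemoryLayout<[String]>.alignment)

let person = Person()
let head = Unmanaged.passUnretained(person).toOpaque()
let nicknamesPtr = head.advanced(by: nicknamesOffset).assumingMemoryBound(to: [String].self)

// Plain assignment releases the old array before storing the new one.
nicknamesPtr.pointee = ["goudan", "zhaosi", "wangwu"]

let result = withExtendedLifetime(person) { person.nicknames }
print(nicknamesOffset, result)
```

Assignment through pointee is used instead of initialize(to:) because the memory already holds a valid array that should be released.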

Play around with array properties in Class types

As shown in the Human type memory schematic, a human instance holding a nicknames array actually holds only an Array<String> type pointer, which is the nicknames region in the figure. The real array lives in another contiguous piece of memory on the heap. Here is how to get to that contiguous memory area, which actually stores the array data.

In C, a pointer to an array actually points to the first element of the array. For example, if arrPointer is a pointer to an array in C, we can get the first element of the array with *arrPointer. That is to say, arrPointer points to the first element of the array, and the type of the pointer is the same as the element type of the array.

The same applies in Swift. In this case, the nicknames memory area contains a pointer to an array of String, which means that the pointer points to the first element of that array. So the type of the pointer should be UnsafeMutablePointer<String>, and we can get the pointer to the array in the following way:

let firstElementPtr = humanRawPtr.advanced(by: 64).assumingMemoryBound(to: UnsafeMutablePointer<String>.self).pointee

As shown in the figure:

So, in theory, we can use the pointee property of firstElementPtr to get the first element of the array, "goudan". Look at the code:

After running it in a Playground, it doesn't show "goudan" as we expected. Is our theory wrong? That doesn't make sense! Determined to get to the bottom of it, and unable to sleep with the problem unsolved, we finally found a pattern.

By comparing firstElementPtr with the real address of the first element of the original array arrFromJson, we found that the address obtained by our method is always 32 bytes lower than the real one (the author tested this many times, and the two addresses always differed by 32 bytes).

As you can see, the difference between 0x6080000CE870 and 0x6080000CE850 is 0x20 bytes, which is 32 bytes in decimal.

So the firstElementPtr pointer obtained by our method actually points to the following address, as shown in Figure 1:

PS: Although the cause is clear, the author has not yet figured out what the 32 bytes at the beginning of the array are for. If you know, please tell the author.

So, all we need to do is offset firstElementPtr by 32 bytes and then read the value to get the elements of the array.
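The 32-byte gap can be measured by comparing the array's storage pointer with the element address reported by withUnsafeBufferPointer. This is a sketch under two assumptions: a 64-bit platform, and a native [String] value being a single pointer to its heap storage object. One plausible reading, based on the open-source standard library, is that the storage object begins with the usual 16-byte object header followed by count and capacity fields; treat that as an implementation detail rather than a guarantee. inspectArrayStorage is a hypothetical helper.

```swift
func inspectArrayStorage() -> (distance: Int, first: String) {
    let names = ["goudan", "zhaosi", "wangwu"]

    // Assumption: the array value is one pointer wide, so this cast is size-compatible.
    let storage = unsafeBitCast(names, to: UnsafeRawPointer.self)

    return names.withUnsafeBufferPointer { buffer -> (distance: Int, first: String) in
        // baseAddress is the supported way to reach the first element.
        let elements = UnsafeRawPointer(buffer.baseAddress!)
        let distance = storage.distance(to: elements)

        // Read the first element by offsetting the storage pointer, as the article does.
        let first = storage.advanced(by: distance)
            .assumingMemoryBound(to: String.self).pointee
        return (distance: distance, first: first)
    }
}

let info = inspectArrayStorage()
print(info.distance, info.first)
```

In everyday code, withUnsafeBufferPointer alone is the safer choice, since it does not rely on the storage layout.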

Class Type Hanging Sheep's Head to Sell Dog Meat

The Role of Type

Let's assume the following code:

class Drawable {
    func draw() {

    }
}

class Point: Drawable {
    var x: Double = 1
    var y: Double = 1

    override func draw() {
        print("Point")
    }
}

class Line: Drawable {
    var x1: Double = 1
    var y1: Double = 1
    var x2: Double = 2
    var y2: Double = 2

    override func draw() {
        print("Line")
    }
}

var arr: [Drawable] = [Point(), Line()]
for d in arr {
    d.draw()     //The question is, how does Swift determine which method to call?
}

In Swift, method dispatch for class types is implemented dynamically through a V-Table. Swift generates type information for each type and places it in the static memory area, and the type pointer of each class instance points to that type's information in the static memory area. When a class instance invokes a method, it first finds the type information through the instance's type pointer, then looks up the method's address in the V-Table stored there, and jumps to that address to execute the method.
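The effect of this lookup can be seen in a small runnable variation of the example above. Here draw() returns a String instead of printing, purely so that the result is easy to inspect; that change is an assumption of this sketch, not part of the original code.

```swift
class Drawable {
    func draw() -> String { return "Drawable" }
}

final class Point: Drawable {
    override func draw() -> String { return "Point" }
}

final class Line: Drawable {
    override func draw() -> String { return "Line" }
}

// The static type of every element is Drawable; the override that runs is
// chosen at run time from each instance's own type information.
let shapes: [Drawable] = [Point(), Line(), Drawable()]
let output = shapes.map { $0.draw() }
print(output)  // ["Point", "Line", "Drawable"]
```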

What about replacing Type?

From the above analysis, we know that method dispatch for a class type is determined by the type pointer at the head of the instance. Would something interesting happen if we pointed the type pointer of one class instance at another type? Haha, let's try it together.

class Wolf {
    var name: String = "wolf"

    func soul() {
        print("my soul is wolf")
    }

    func headPointerOfClass() -> UnsafeMutablePointer<Int8> {
        let opaquePointer = Unmanaged.passUnretained(self as AnyObject).toOpaque()
        let mutableTypedPointer = opaquePointer.bindMemory(to: Int8.self, capacity: MemoryLayout<Wolf>.stride)
        return UnsafeMutablePointer<Int8>(mutableTypedPointer)
    }
}

class Fox {
    var name: String = "fox"

    func soul() {
        print("my soul is fox")
    }

    func headPointerOfClass() -> UnsafeMutablePointer<Int8> {
        let opaquePointer = Unmanaged.passUnretained(self as AnyObject).toOpaque()
        let mutableTypedPointer = opaquePointer.bindMemory(to: Int8.self, capacity: MemoryLayout<Fox>.stride)
        return UnsafeMutablePointer<Int8>(mutableTypedPointer)
    }
}

You can see that the memory structures of the above two classes, Wolf and Fox, are identical except for their different types. Then we can use these two classes to test:

let wolf = Wolf()
var wolfPtr = UnsafeMutableRawPointer(wolf.headPointerOfClass())

let fox = Fox()
var foxPtr = UnsafeMutableRawPointer(fox.headPointerOfClass())
foxPtr.advanced(by: 0).bindMemory(to: UnsafeMutablePointer<Wolf.Type>.self, capacity: 1).initialize(to: wolfPtr.advanced(by: 0).assumingMemoryBound(to: UnsafeMutablePointer<Wolf.Type>.self).pointee)

print(type(of: fox))        //Wolf
fox.name                    //"fox"
fox.soul()                  //my soul is wolf

Something amazing happened: a Fox instance called Wolf's method. Ha ha ha!

