HandyJSON
HandyJSON is a framework developed by Alibaba for converting JSON data into the corresponding models in Swift. Compared with other popular Swift JSON libraries, HandyJSON stands out for its support for pure Swift classes and its ease of use. When deserializing (converting JSON to a model), it does not require the model to inherit from NSObject, because it is not based on the KVC mechanism, nor does it require you to define a mapping function for the model. As long as you define the model class and declare that it conforms to the HandyJSON protocol, HandyJSON can look up each property's value in the JSON string on its own, using the property name as the key. However, since HandyJSON depends on the layout of Swift metadata, a change to that layout can break HandyJSON until the framework is updated. Alibaba has been maintaining the framework, so it can be expected to follow such changes in the Swift source.
HandyJSON on GitHub
Parsing a Struct from the Source Code
Get TargetStructMetadata
Since HandyJSON is based on Swift metadata, understanding how it parses structs requires understanding that metadata first. Next, we'll look at the metadata from the perspective of the Swift source code.
First, let's start with the metadata definitions in the source. Searching Metadata.h for StructMetadata reveals that its real type is TargetStructMetadata.
using StructMetadata = TargetStructMetadata<InProcess>;
Next, when we look at the structure of TargetStructMetadata, we see that TargetStructMetadata inherits from TargetValueMetadata and TargetValueMetadata inherits from TargetMetadata.
struct TargetStructMetadata : public TargetValueMetadata<Runtime> {
struct TargetValueMetadata : public TargetMetadata<Runtime> {
This inheritance chain allows us to restore the structure of TargetStructMetadata.
As you can see from the code, the first property of TargetStructMetadata is Kind, inherited from TargetMetadata. In addition, TargetValueMetadata contributes a Description pointer that refers to the type's descriptor.
struct TargetMetadata {
  ......
private:
  /// The kind. Only valid for non-class metadata; getKind() must be used to get
  /// the kind value.
  StoredPointer Kind;
  ......
}

struct TargetValueMetadata : public TargetMetadata<Runtime> {
  using StoredPointer = typename Runtime::StoredPointer;
  TargetValueMetadata(MetadataKind Kind,
                      const TargetTypeContextDescriptor<Runtime> *description)
      : TargetMetadata<Runtime>(Kind), Description(description) {}

  // Description points to the descriptor of the type
  /// An out-of-line description of the type.
  TargetSignedPointer<Runtime, const TargetValueTypeDescriptor<Runtime> * __ptrauth_swift_type_descriptor> Description;
  ......
}
This gives us the structure of TargetStructMetadata as
struct TargetStructMetadata<T> {
    // StoredPointer Kind; on a 64-bit system StoredPointer is uint64_t, which maps to Int
    var kind: Int
    // Declared as UnsafeMutablePointer<T> for now; T stays generic until the
    // structure of the type descriptor has been analyzed
    var typeDescriptor: UnsafeMutablePointer<T>
}
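As a quick sanity check of the first field, the sketch below reads the leading word of a struct's metadata record. The type `Sample` and the helper `metadataKind(of:)` are made up for this illustration, and the value 0x200 for struct metadata is taken from the runtime's metadata kind definitions, so treat it as an assumption for your toolchain.

```swift
// Hypothetical sample type used only for this illustration.
struct Sample {
    var id: Int = 0
}

/// Reads the first pointer-sized word of a type's metadata record.
/// For non-class types this word holds the metadata kind.
func metadataKind(of type: Any.Type) -> Int {
    // A metatype value is a pointer to the metadata record.
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    return metadata.load(as: Int.self)
}

let sampleKind = metadataKind(of: Sample.self)
print("kind: 0x" + String(sampleKind, radix: 16))
```

Reading through a raw pointer keeps the example independent of any hand-written metadata struct.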
Get TargetStructDescriptor
Next, we'll analyze the Description. From getDescription() in the source code, we can infer that its concrete type is TargetStructDescriptor.
const TargetStructDescriptor<Runtime> *getDescription() const {
  return llvm::cast<TargetStructDescriptor<Runtime>>(this->Description);
}
We then find TargetStructDescriptor, which inherits from TargetValueTypeDescriptor and contains two properties: NumFields (the number of stored properties) and FieldOffsetVectorOffset (the offset of the field offset vector within the metadata).
class TargetStructDescriptor final
    : public TargetValueTypeDescriptor<Runtime>,
      public TrailingGenericContextObjects<TargetStructDescriptor<Runtime>,
                            TargetTypeGenericContextDescriptorHeader,
                            /*additional trailing objects*/
                            TargetForeignMetadataInitialization<Runtime>,
                            TargetSingletonMetadataInitialization<Runtime>,
                            TargetCanonicalSpecializedMetadatasListCount<Runtime>,
                            TargetCanonicalSpecializedMetadatasListEntry<Runtime>,
                            TargetCanonicalSpecializedMetadatasCachingOnceToken<Runtime>> {
  ......
  /// The number of stored properties in the struct.
  /// If there is a field offset vector, this is its length.
  uint32_t NumFields; // number of stored properties

  /// The offset of the field offset vector for this struct's stored
  /// properties in its metadata, if any. 0 means there is no field offset
  /// vector.
  uint32_t FieldOffsetVectorOffset; // offset of the field offset vector in the metadata
  ......
};
TargetValueTypeDescriptor inherits from TargetTypeContextDescriptor, which contains three properties: Name (the name of the type), AccessFunctionPtr (a pointer to the metadata access function for this type), and Fields (a pointer to the type's field descriptor).
class TargetValueTypeDescriptor
    : public TargetTypeContextDescriptor<Runtime> {
public:
  static bool classof(const TargetContextDescriptor<Runtime> *cd) {
    return cd->getKind() == ContextDescriptorKind::Struct ||
           cd->getKind() == ContextDescriptorKind::Enum;
  }
};
class TargetTypeContextDescriptor
    : public TargetContextDescriptor<Runtime> {
public:
  /// The name of the type.
  TargetRelativeDirectPointer<Runtime, const char, /*nullable*/ false> Name;

  /// A pointer to the metadata access function for this type.
  ///
  /// The function type here is a stand-in. You should use getAccessFunction()
  /// to wrap the function pointer in an accessor that uses the proper calling
  /// convention for a given number of arguments.
  TargetRelativeDirectPointer<Runtime, MetadataResponse(...),
                              /*Nullable*/ true> AccessFunctionPtr;

  /// A pointer to the field descriptor for the type, if any.
  TargetRelativeDirectPointer<Runtime, const reflection::FieldDescriptor,
                              /*nullable*/ true> Fields;
  ......
}
TargetTypeContextDescriptor in turn inherits from the base class TargetContextDescriptor, which contains two properties: Flags (flags describing the context, including its kind and format version) and Parent (the parent context, or NULL if this is a top-level context).
/// Base class for all context descriptors.
template<typename Runtime>
struct TargetContextDescriptor {
  /// Flags describing the context, including its kind and format version.
  ContextDescriptorFlags Flags;

  /// The parent context, or null if this is a top-level context.
  TargetRelativeContextPointer<Runtime> Parent;
  ......
}
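To make the Flags field concrete, here is a minimal sketch that reads the first 32-bit word of a struct's context descriptor and extracts the kind from its low five bits. The type `Gadget` and the helper `contextDescriptorKind(of:)` are hypothetical names. The sketch assumes a platform without pointer authentication, so that the descriptor pointer stored in the metadata can be loaded as a plain pointer, and it assumes the kind value 17 for structs from the runtime's ContextDescriptorKind enumeration.

```swift
// Hypothetical sample type used only for this illustration.
struct Gadget {
    var serial: Int = 0
}

/// Returns the context descriptor kind stored in the low five bits of Flags.
func contextDescriptorKind(of type: Any.Type) -> UInt32 {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    // Word 0 is the kind, word 1 is the pointer to the type descriptor.
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)
    // Flags is the first 32-bit word of every context descriptor.
    let flags = descriptor.load(as: UInt32.self)
    return flags & 0x1F
}

let gadgetKind = contextDescriptorKind(of: Gadget.self)
print("context descriptor kind: \(gadgetKind)")
```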
With that, the layout of TargetStructDescriptor is clear, so we can write it out in Swift and replace the generic T in TargetStructMetadata.
struct TargetStructMetadata {
    var kind: Int
    var typeDescriptor: UnsafeMutablePointer<TargetStructDescriptor>
}

struct TargetStructDescriptor {
    // Flags describing the context, including its kind and format version
    var flags: Int32 // ContextDescriptorFlags wraps a 32-bit value
    // The parent context, or NULL if this is a top-level context
    var parent: TargetRelativeContextPointer<UnsafeRawPointer> // relative address
    // The name of the type
    var name: TargetRelativeDirectPointer<CChar> // relative address
    // A pointer to the metadata access function for this type
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer> // relative address
    // A pointer to the field descriptor of the type
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor> // relative address
    // The number of stored properties
    var numFields: Int32
    // The offset of the field offset vector in the metadata
    var fieldOffsetVectorOffset: Int32
}

// How the type of flags is resolved:
/// Common flags stored in the first 32-bit word of any context descriptor.
// The stored value is 32 bits wide, so flags is declared as Int32
struct ContextDescriptorFlags {
private:
  uint32_t Value;
}
Implement TargetRelativeDirectPointer
To resolve the relative address type, we search the source code for TargetRelativeDirectPointer and find that it is an alias for RelativeDirectPointer.
template <typename Runtime, typename Pointee, bool Nullable = true>
using TargetRelativeDirectPointer
  = typename Runtime::template RelativeDirectPointer<Pointee, Nullable>;
Then, in RelativePointer.h, we find RelativeDirectPointer and see that it inherits from the base class RelativeDirectPointerImpl, which contains a single property, RelativeOffset, together with a method that turns the offset into a real memory address.
template <typename T, bool Nullable = true, typename Offset = int32_t,
          typename = void>
class RelativeDirectPointer;

/// A direct relative reference to an object that is not a function pointer.
// Offset defaults to int32_t
template <typename T, bool Nullable, typename Offset>
class RelativeDirectPointer<T, Nullable, Offset,
    typename std::enable_if<!std::is_function<T>::value>::type>
    : private RelativeDirectPointerImpl<T, Nullable, Offset> {
  ......
}

/// A relative reference to a function, intended to reference private metadata
/// functions for the current executable or dynamic library image from
/// position-independent constant data.
template<typename T, bool Nullable, typename Offset>
class RelativeDirectPointerImpl {
private:
  /// The relative offset of the function's entry point from *this.
  Offset RelativeOffset;
  ......
  // Computes the target address from the offset and returns it as a pointer to T
  PointerTy get() const & {
    // Check for null.
    if (Nullable && RelativeOffset == 0)
      return nullptr;

    // The value is addressed relative to `this`.
    uintptr_t absolute = detail::applyRelativeOffset(this, RelativeOffset);
    return reinterpret_cast<PointerTy>(absolute);
  }
  ......
}

/// Apply a relative offset to a base pointer. The offset is applied to the base
/// pointer using sign-extended, wrapping arithmetic.
template<typename BasePtrTy, typename Offset>
static inline uintptr_t applyRelativeOffset(BasePtrTy *basePtr, Offset offset) {
  static_assert(std::is_integral<Offset>::value &&
                std::is_signed<Offset>::value,
                "offset type should be signed integer");

  auto base = reinterpret_cast<uintptr_t>(basePtr);
  // We want to do wrapping arithmetic, but with a sign-extended
  // offset. To do this in C, we need to do signed promotion to get
  // the sign extension, but we need to perform arithmetic on unsigned values,
  // since signed overflow is undefined behavior.
  auto extendOffset = (uintptr_t)(intptr_t)offset;
  // Address of the offset field + stored offset = address of the target
  return base + extendOffset;
}
We can then model TargetRelativeDirectPointer in Swift:
// Pointee is the type of the value the pointer refers to
struct TargetRelativeDirectPointer<Pointee> {
    var offset: Int32

    // Compute the target address from the stored offset
    mutating func getmeasureRelativeOffset() -> UnsafeMutablePointer<Pointee> {
        let offset = self.offset
        return withUnsafePointer(to: &self) { p in
            // Advance by the offset in bytes and treat the result as Pointee
            return UnsafeMutablePointer(mutating: UnsafeRawPointer(p).advanced(by: numericCast(offset)).assumingMemoryBound(to: Pointee.self))
        }
    }
}
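The arithmetic behind a relative pointer can be tried out without touching real metadata. The sketch below builds a small scratch buffer, stores two 32-bit offsets in it (one pointing forward, one pointing backward) and resolves both to the same payload. The function names and the buffer layout are invented for this illustration.

```swift
/// Resolves a 32-bit relative pointer stored at `fieldAddress`:
/// target = address of the offset field + sign-extended offset.
func resolveRelativePointer(at fieldAddress: UnsafeRawPointer) -> UnsafeRawPointer? {
    let offset = fieldAddress.load(as: Int32.self)
    // Mirrors the nullable case in the runtime: zero means "no target".
    if offset == 0 { return nil }
    return fieldAddress + Int(offset)
}

func relativePointerDemo() -> (forward: Int32, backward: Int32) {
    // A 32-byte scratch buffer standing in for a block of constant metadata.
    let buffer = UnsafeMutableRawPointer.allocate(byteCount: 32, alignment: 8)
    defer { buffer.deallocate() }
    buffer.initializeMemory(as: UInt8.self, repeating: 0, count: 32)

    // Payload at byte 16.
    buffer.storeBytes(of: Int32(1234), toByteOffset: 16, as: Int32.self)
    // Offset field at byte 4 pointing forward to byte 16 (+12).
    buffer.storeBytes(of: Int32(12), toByteOffset: 4, as: Int32.self)
    // Offset field at byte 24 pointing backward to byte 16 (-8).
    buffer.storeBytes(of: Int32(-8), toByteOffset: 24, as: Int32.self)

    let base = UnsafeRawPointer(buffer)
    let forward = resolveRelativePointer(at: base + 4)!.load(as: Int32.self)
    let backward = resolveRelativePointer(at: base + 24)!.load(as: Int32.self)
    return (forward, backward)
}

let relativeDemo = relativePointerDemo()
print(relativeDemo)
```

Because the offset is relative to the address of the field that stores it, the same bytes stay valid wherever the image is loaded, which is why the runtime uses this encoding for constant metadata.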
At the same time, we can modify the TargetStructDescriptor to be:
struct TargetStructDescriptor {
    // Flags describing the context, including its kind and format version
    var flags: Int32
    // The parent context, or NULL if this is a top-level context
    var parent: Int32 // declared as Int32 for now, because it is not resolved here
    // The name of the type
    var name: TargetRelativeDirectPointer<CChar>
    // A pointer to the metadata access function for this type
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer>
    // A pointer to the field descriptor of the type
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor>
    // The number of stored properties
    var numFields: Int32
    // The offset of the field offset vector in the metadata
    var fieldOffsetVectorOffset: Int32
}

// TargetRelativeContextPointer is not resolved here; the source shows that it
// stores an int32_t offset, so it can be treated as Int32 for now
template<typename Runtime,
         template<typename _Runtime> class Context = TargetContextDescriptor>
using TargetRelativeContextPointer =
  RelativeIndirectablePointer<const Context<Runtime>,
                              /*nullable*/ true, int32_t,
                              TargetSignedContextPointer<Runtime, Context>>;
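The descriptor layout above can be checked with raw loads at fixed byte offsets. The following is a sketch only: `Point` and `describeStruct(_:)` are hypothetical names, the byte offsets (name at 8, numFields at 20, fieldOffsetVectorOffset at 24) follow from seven consecutive 32-bit fields, and the code assumes a platform without pointer authentication.

```swift
// Hypothetical sample type used only for this illustration.
struct Point {
    var x: Double = 0
    var y: Double = 0
    var label: String = ""
}

/// Reads the name and field count from a struct's type descriptor.
func describeStruct(_ type: Any.Type) -> (name: String, numFields: Int, vectorOffset: Int) {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    // Word 1 of the metadata is the pointer to the type descriptor.
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)

    // Byte offsets: flags 0, parent 4, name 8, accessFunction 12,
    // fieldDescriptor 16, numFields 20, fieldOffsetVectorOffset 24.
    let nameField = descriptor + 8
    let namePointer = nameField + Int(nameField.load(as: Int32.self))
    let name = String(cString: namePointer.assumingMemoryBound(to: CChar.self))

    let numFields = descriptor.load(fromByteOffset: 20, as: UInt32.self)
    let vectorOffset = descriptor.load(fromByteOffset: 24, as: UInt32.self)
    return (name, Int(numFields), Int(vectorOffset))
}

let pointInfo = describeStruct(Point.self)
print(pointInfo)
```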
FieldDescriptor and FieldRecord
Next, we start parsing the FieldDescriptor, which is in the source code as follows:
// Field descriptors contain a collection of field records for a single
// class, struct or enum declaration.
class FieldDescriptor {
  const FieldRecord *getFieldRecordBuffer() const {
    return reinterpret_cast<const FieldRecord *>(this + 1);
  }

public:
  const RelativeDirectPointer<const char> MangledTypeName;
  const RelativeDirectPointer<const char> Superclass;

  FieldDescriptor() = delete;

  const FieldDescriptorKind Kind;
  const uint16_t FieldRecordSize;
  const uint32_t NumFields;
  ......
  // Returns all fields; each one is described by a FieldRecord
  llvm::ArrayRef<FieldRecord> getFields() const {
    return {getFieldRecordBuffer(), NumFields};
  }
  ......
}

// FieldDescriptorKind is backed by uint16_t
enum class FieldDescriptorKind : uint16_t {
  ......
}
The structure of FieldRecord in the source code is:
// Field records describe the type of a single stored property or case member
// of a class, struct or enum.
class FieldRecord {
  const FieldRecordFlags Flags;

public:
  const RelativeDirectPointer<const char> MangledTypeName;
  const RelativeDirectPointer<const char> FieldName;
  ......
}

// FieldRecordFlags wraps a uint32_t
class FieldRecordFlags {
  using int_type = uint32_t;
  ......
}
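Putting the two layouts together, the sketch below walks the field records of a struct and collects the property names. `Account`, `follow(_:)` and `fieldNames(of:)` are hypothetical names. The byte offsets are derived from the declarations above (a 16-byte FieldDescriptor header followed by records whose FieldName sits at byte 8), and the code assumes reflection metadata is present and that the platform does not use pointer authentication.

```swift
// Hypothetical sample type used only for this illustration.
struct Account {
    var id: Int = 0
    var owner: String = ""
    var active: Bool = false
}

/// Follows a 32-bit relative pointer stored at `field`.
func follow(_ field: UnsafeRawPointer) -> UnsafeRawPointer {
    return field + Int(field.load(as: Int32.self))
}

/// Collects the stored property names of a struct from its field descriptor.
func fieldNames(of type: Any.Type) -> [String] {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)

    // The relative pointer to the field descriptor sits at byte 16.
    let fieldsField = descriptor + 16
    guard fieldsField.load(as: Int32.self) != 0 else { return [] }
    let fieldDescriptor = follow(fieldsField)

    // Header: MangledTypeName (4), Superclass (4), Kind (2),
    // FieldRecordSize (2), NumFields (4); records start at byte 16.
    let recordSize = Int(fieldDescriptor.load(fromByteOffset: 10, as: UInt16.self))
    let count = Int(fieldDescriptor.load(fromByteOffset: 12, as: UInt32.self))
    let records = fieldDescriptor + 16

    return (0..<count).map { index in
        // Record: Flags (4), MangledTypeName (4), FieldName (4).
        let record = records + index * recordSize
        let namePointer = follow(record + 8)
        return String(cString: namePointer.assumingMemoryBound(to: CChar.self))
    }
}

let accountFields = fieldNames(of: Account.self)
print(accountFields)
```

Stepping by the stored FieldRecordSize rather than a hard-coded 12 keeps the walk valid if the record ever grows.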
Computing field offsets with fieldOffsetVectorOffset
Finally, we need fieldOffsetVectorOffset, which records where the field offset vector sits in the metadata, in order to obtain the offset of each property within an instance. The relevant source code is:
// offset is counted in pointer-sized words from the start of the metadata (asWords + offset)
/// Get a pointer to the field offset vector, if present, or null.
const StoredPointer *getFieldOffsets() const {
  assert(isTypeMetadata());
  auto offset = getDescription()->getFieldOffsetVectorOffset();
  if (offset == 0)
    return nullptr;
  auto asWords = reinterpret_cast<const void * const*>(this);
  return reinterpret_cast<const StoredPointer *>(asWords + offset);
}
Applying this logic directly to an Int32 pointer yields wrong data, because the offset is counted in pointer-sized words while the Swift code indexes 32-bit values. The HandyJSON source handles it like this:
// On a 64-bit platform one word holds two Int32 values, so the offset is multiplied by 2
return Int(UnsafePointer<Int32>(pointer)[vectorOffset * (is64BitPlatform ? 2 : 1) + $0])
At this point, we have a fairly clear picture of the structures involved, as follows:
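The word-versus-Int32 distinction is easier to see when the offset is converted to bytes explicitly. The sketch below reads the field offset vector of a struct that way and can be compared against `MemoryLayout.offset(of:)`. `Measurement` and `fieldOffsets(of:)` are hypothetical names, and the code assumes a platform without pointer authentication.

```swift
// Hypothetical sample type used only for this illustration.
struct Measurement {
    var flag: Bool = false
    var count: Int = 0
    var label: String = ""
}

/// Reads the field offset vector of a struct type from its metadata.
func fieldOffsets(of type: Any.Type) -> [Int] {
    let wordSize = MemoryLayout<UnsafeRawPointer>.size
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    let descriptor = metadata.load(fromByteOffset: wordSize, as: UnsafeRawPointer.self)

    let numFields = Int(descriptor.load(fromByteOffset: 20, as: UInt32.self))
    // Counted in pointer-sized words from the start of the metadata.
    let vectorOffsetInWords = Int(descriptor.load(fromByteOffset: 24, as: UInt32.self))
    guard vectorOffsetInWords != 0 else { return [] }

    // Convert words to bytes once, then step through 32-bit entries.
    let vector = metadata + vectorOffsetInWords * wordSize
    return (0..<numFields).map { index in
        Int(vector.load(fromByteOffset: index * MemoryLayout<UInt32>.size, as: UInt32.self))
    }
}

let measurementOffsets = fieldOffsets(of: Measurement.self)
print(measurementOffsets)
```

Multiplying by the word size is equivalent to HandyJSON's factor of 2 on 64-bit platforms, since one 8-byte word spans two Int32 slots.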
// Resolves a relative offset to a memory address; Pointee is the pointed-to type
struct TargetRelativeDirectPointer<Pointee> {
    var offset: Int32

    // Compute the target address from the stored offset
    mutating func getmeasureRelativeOffset() -> UnsafeMutablePointer<Pointee> {
        let offset = self.offset
        return withUnsafePointer(to: &self) { p in
            // Advance by the offset in bytes and treat the result as Pointee
            return UnsafeMutablePointer(mutating: UnsafeRawPointer(p).advanced(by: numericCast(offset)).assumingMemoryBound(to: Pointee.self))
        }
    }
}

struct TargetStructMetadata {
    var kind: Int
    var typeDescriptor: UnsafeMutablePointer<TargetStructDescriptor>
}

struct TargetStructDescriptor {
    var flags: Int32
    var parent: Int32
    var name: TargetRelativeDirectPointer<CChar>
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer>
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor>
    var numFields: Int32
    var fieldOffsetVectorOffset: Int32

    // The vector offset is counted in 8-byte words, hence the factor of 2 for Int32 elements (64-bit only)
    func getFieldOffsets(_ metadata: UnsafeRawPointer) -> UnsafePointer<Int32> {
        return metadata.assumingMemoryBound(to: Int32.self).advanced(by: numericCast(self.fieldOffsetVectorOffset) * 2)
    }

    // Used when resolving the metatype of a field
    var genericArgumentOffset: Int {
        return 2
    }
}

struct FieldDescriptor {
    var MangledTypeName: TargetRelativeDirectPointer<CChar>
    var Superclass: TargetRelativeDirectPointer<CChar>
    var kind: UInt16
    var fieldRecordSize: UInt16 // uint16_t in the runtime source
    var numFields: Int32
    var fields: FieldRecordBuffer<FieldRecord>
}

struct FieldRecord {
    var fieldRecordFlags: Int32
    var mangledTypeName: TargetRelativeDirectPointer<CChar>
    var fieldName: TargetRelativeDirectPointer<UInt8>
}

// Gives access to the FieldRecord entries stored one after another in memory
struct FieldRecordBuffer<Element> {
    var element: Element

    mutating func buffer(n: Int) -> UnsafeBufferPointer<Element> {
        return withUnsafePointer(to: &self) {
            let ptr = $0.withMemoryRebound(to: Element.self, capacity: 1) { start in
                return start
            }
            return UnsafeBufferPointer(start: ptr, count: n)
        }
    }

    mutating func index(of i: Int) -> UnsafeMutablePointer<Element> {
        return withUnsafePointer(to: &self) {
            return UnsafeMutablePointer(mutating: UnsafeRawPointer($0).assumingMemoryBound(to: Element.self).advanced(by: i))
        }
    }
}
Code validation
Here's the code to verify this structure.
protocol BrigeProtocol {}

extension BrigeProtocol {
    // Reads the value behind the pointer as Self and returns it as Any
    static func get(from pointor: UnsafeRawPointer) -> Any {
        // Self is the real (dynamic) type
        pointor.assumingMemoryBound(to: Self.self).pointee
    }
}

struct BrigeMetadataStruct {
    let type: Any.Type
    let witness: Int
}

func custom(type: Any.Type) -> BrigeProtocol.Type {
    let container = BrigeMetadataStruct(type: type, witness: 0)
    let cast = unsafeBitCast(container, to: BrigeProtocol.Type.self)
    return cast
}
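The bridging trick is easier to follow in isolation. The sketch below rebuilds it under different, hypothetical names (`RawReadable`, `MetatypeContainer`, `reader(for:)`) and uses it to read an Int and a String from raw memory when only the dynamic type is known. It relies on an existential metatype being laid out as a metadata pointer followed by a witness table pointer, which is an implementation detail rather than a documented guarantee.

```swift
protocol RawReadable {}

extension RawReadable {
    /// Reads the value stored at `pointer`, interpreting it as `Self`.
    static func read(from pointer: UnsafeRawPointer) -> Any {
        return pointer.assumingMemoryBound(to: Self.self).pointee
    }
}

/// Same size and field order as an existential metatype:
/// a metadata pointer followed by a (here unused) witness table slot.
struct MetatypeContainer {
    let type: Any.Type
    let witnessTable: Int
}

/// Pretends that `type` conforms to RawReadable so the extension method can run.
func reader(for type: Any.Type) -> RawReadable.Type {
    let container = MetatypeContainer(type: type, witnessTable: 0)
    return unsafeBitCast(container, to: RawReadable.Type.self)
}

func readDemo() -> (Any, Any) {
    var number = 42
    var text = "hello"
    let first = withUnsafePointer(to: &number) {
        reader(for: Int.self).read(from: UnsafeRawPointer($0))
    }
    let second = withUnsafePointer(to: &text) {
        reader(for: String.self).read(from: UnsafeRawPointer($0))
    }
    return (first, second)
}

let bridged = readDemo()
print(bridged)
```

The empty witness table is tolerable only because the protocol has no requirements: the extension method needs nothing but the type metadata for `Self`.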
// The article does not show a declaration for this runtime entry point;
// a declaration along these lines is assumed
@_silgen_name("swift_getTypeByMangledNameInContext")
func swift_getTypeByMangledNameInContext(
    _ name: UnsafePointer<CChar>,
    _ nameLength: Int,
    _ genericContext: UnsafeRawPointer?,
    _ genericArguments: UnsafeRawPointer?) -> UnsafeRawPointer?

// The LLPerson struct
struct LLPerson {
    var age: Int = 18
    var name: String = "LL"
    var nameTwo: String = "LLLL"
}

// Create an instance
var p = LLPerson()

// Bit-cast LLPerson's metatype to a pointer to TargetStructMetadata
let ptr = unsafeBitCast(LLPerson.self as Any.Type, to: UnsafeMutablePointer<TargetStructMetadata>.self)

// Get the struct name
let namePtr = ptr.pointee.typeDescriptor.pointee.name.getmeasureRelativeOffset()
print("current struct name: \(String(cString: namePtr))")

// Get the number of properties
let numFields = ptr.pointee.typeDescriptor.pointee.numFields
print("Current number of struct properties: \(numFields)")

// Get the field offset vector from the metadata
let offsets = ptr.pointee.typeDescriptor.pointee.getFieldOffsets(UnsafeRawPointer(ptr).assumingMemoryBound(to: Int.self))

print("----------- start fetch field -------------")
for i in 0..<numFields {
    // Get the property name
    let fieldName = ptr.pointee.typeDescriptor.pointee.fieldDescriptor.getmeasureRelativeOffset().pointee.fields.index(of: Int(i)).pointee.fieldName.getmeasureRelativeOffset()
    print("----- field \(String(cString: fieldName)) -----")

    // Get the offset of the property in bytes
    let fieldOffset = offsets[Int(i)]
    print("The offset of \(String(cString: fieldName)) is: \(fieldOffset) bytes")

    // This is the mangled Swift type name, which has to be converted to a real type
    let typeMangleName = ptr.pointee.typeDescriptor.pointee.fieldDescriptor.getmeasureRelativeOffset().pointee.fields.index(of: Int(i)).pointee.mangledTypeName.getmeasureRelativeOffset()
    // print("\(String(cString: typeMangleName))")

    let genericVector = UnsafeRawPointer(ptr).advanced(by: ptr.pointee.typeDescriptor.pointee.genericArgumentOffset * MemoryLayout<UnsafeRawPointer>.size).assumingMemoryBound(to: Any.Type.self)

    // The runtime function swift_getTypeByMangledNameInContext takes four parameters
    let fieldType = swift_getTypeByMangledNameInContext(
        typeMangleName, // the mangled name
        256, // length of the mangled name; it should be computed, but HandyJSON passes 256 directly
        UnsafeRawPointer(ptr.pointee.typeDescriptor), // the context: the type descriptor
        UnsafeRawPointer(genericVector).assumingMemoryBound(to: Optional<UnsafeRawPointer>.self)) // the generic arguments used to resolve the symbol

    // Bit-cast fieldType to Any.Type
    let type = unsafeBitCast(fieldType, to: Any.Type.self)

    // Get the real type through protocol bridging
    let value = custom(type: type)

    // The pointer to the instance p is converted to UnsafeRawPointer and bound to Int8 (1 byte).
    // The offset is applied in bytes; without this conversion, advancing would step by the size of the struct
    let instanceAddress = withUnsafePointer(to: &p) { return UnsafeRawPointer($0).assumingMemoryBound(to: Int8.self) }
    print("fieldType: \(type) \nfieldValue: \(value.get(from: instanceAddress.advanced(by: Int(fieldOffset))))")
}
print("----------- end fetch field -------------")
Print information:
From the memory address, we can also see the layout information of the attributes.
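The same layout information can be cross-checked with the standard library alone. The sketch below uses `MemoryLayout.offset(of:)` on the LLPerson struct from this article and then loads the `name` property from the instance bytes at that offset. The concrete numbers in the comments assume a 64-bit platform, where Int is 8 bytes and String is 16 bytes.

```swift
struct LLPerson {
    var age: Int = 18
    var name: String = "LL"
    var nameTwo: String = "LLLL"
}

// Offsets of the stored properties as reported by the standard library.
let ageOffset = MemoryLayout<LLPerson>.offset(of: \LLPerson.age)
let nameOffset = MemoryLayout<LLPerson>.offset(of: \LLPerson.name)
let nameTwoOffset = MemoryLayout<LLPerson>.offset(of: \LLPerson.nameTwo)
let personSize = MemoryLayout<LLPerson>.size

// On a 64-bit platform: age at 0, name at 8, nameTwo at 24, size 40.
print(ageOffset as Any, nameOffset as Any, nameTwoOffset as Any, personSize)

// Read the `name` property straight from the instance bytes at its offset.
let person = LLPerson()
let loadedName = withUnsafeBytes(of: person) { bytes in
    bytes.load(fromByteOffset: nameOffset!, as: String.self)
}
print(loadedName)
```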