InstanceKlass in the HotSpot class model

Posted by Skyphoxx on Wed, 16 Feb 2022 01:42:52 +0100

Hello, I'm Architect Jun, an architect who can write code and recite poetry. Today, let's talk about InstanceKlass in the HotSpot class model. I hope it helps you make progress!

The previous article in this series introduced the important properties and methods of Klass, the base class of the class model. This article introduces InstanceKlass and its subclasses.

1. InstanceKlass class

Each InstanceKlass object represents a specific Java class (Java arrays are not included here). The InstanceKlass class and its important attributes are defined as follows:

class InstanceKlass: public Klass {
 ...

 protected:
  // Annotations for this class
  Annotations*       _annotations;
  // Array classes holding elements of this class.
  Klass*             _array_klasses;
  // Constant pool for this class.
  ConstantPool*     _constants;
  // The InnerClasses attribute and EnclosingMethod attribute. The
  // _inner_classes is an array of shorts. If the class has InnerClasses
  // attribute, then the _inner_classes array begins with 4-tuples of shorts
  // [inner_class_info_index, outer_class_info_index,
  // inner_name_index, inner_class_access_flags] for the InnerClasses
  // attribute. If the EnclosingMethod attribute exists, it occupies the
  // last two shorts [class_index, method_index] of the array. If only
  // the InnerClasses attribute exists, the _inner_classes array length is
  // number_of_inner_classes * 4. If the class has both InnerClasses
  // and EnclosingMethod attributes the _inner_classes array length is
  // number_of_inner_classes * 4 + enclosing_method_attribute_size.
  Array<jushort>*   _inner_classes;
 
  // Array name derived from this class which needs unreferencing
  // if this class is unloaded.
  Symbol*           _array_name;
 
  // Number of heapOopSize words used by non-static fields in this klass
  // (including inherited fields but after header_size()).
  int               _nonstatic_field_size;
  int               _static_field_size;    // number words used by static fields (oop and non-oop) in this klass
  // Constant pool index to the utf8 entry of the Generic signature,
  // or 0 if none.
  u2                _generic_signature_index;
  // Constant pool index to the utf8 entry for the name of source file
  // containing this klass, 0 if not specified.
  u2                _source_file_name_index;
  u2                _static_oop_field_count;// number of static oop fields in this klass
  u2                _java_fields_count;    // The number of declared Java fields
  int               _nonstatic_oop_map_size;// size in words of nonstatic oop map blocks
 

  u2                _minor_version;  // minor version number of class file
  u2                _major_version;  // major version number of class file
  Thread*           _init_thread;    // Pointer to current thread doing initialization (to handle recursive initialization)
  int               _vtable_len;     // length of Java vtable (in words)
  int               _itable_len;     // length of Java itable (in words)
  OopMapCache*      volatile _oop_map_cache;   // OopMapCache for all methods in the klass (allocated lazily)
  JNIid*            _jni_ids;              // First JNI identifier for static fields in this class
  jmethodID*        _methods_jmethod_ids;  // jmethodIDs corresponding to method_idnum, or NULL if none
  nmethodBucket*    _dependencies;         // list of dependent nmethods
  nmethod*          _osr_nmethods_head;    // Head of list of on-stack replacement nmethods for this class

 
  // Class states are defined as ClassState (see above).
  // Place the _init_state here to utilize the unused 2-byte after
  // _idnum_allocated_count.
  u1                _init_state;                    // state of class
  u1                _reference_type;                // reference type
 

  // Method array.
  Array<Method*>*   _methods;
  // Default Method Array, concrete methods inherited from interfaces
  Array<Method*>*   _default_methods;
  // Interface (Klass*s) this class declares locally to implement.
  Array<Klass*>*    _local_interfaces;
  // Interface (Klass*s) this class implements transitively.
  Array<Klass*>*    _transitive_interfaces;

  // Int array containing the vtable_indices for default_methods
  // offset matches _default_methods offset
  Array<int>*       _default_vtable_indices;
 
  // Instance and static variable information, starts with 6-tuples of shorts
  // [access, name index, sig index, initval index, low_offset, high_offset]
  // for all fields, followed by the generic signature data at the end of
  // the array. Only fields with generic signature attributes have the generic
  // signature data set in the array. The fields array looks like following:
  //
  // f1: [access, name index, sig index, initial value index, low_offset, high_offset]
  // f2: [access, name index, sig index, initial value index, low_offset, high_offset]
  //      ...
  // fn: [access, name index, sig index, initial value index, low_offset, high_offset]
  //     [generic signature index]
  //     [generic signature index]
  //     ...
  Array<u2>*        _fields;
 
  // embedded Java vtable follows here
  // embedded Java itables follows here
  // embedded static fields follows here
  // embedded nonstatic oop-map blocks follows here
  // embedded implementor of this interface follows here
  //   The embedded implementor only exists if the current klass is an
  //   interface. The possible values of the implementor fall into following
  //   three cases:
  //     NULL: no implementor.
  //     A Klass* that's not itself: one implementor.
  //     Itself: more than one implementors.
  // embedded host klass follows here
  //   The embedded host klass only exists in an anonymous class for
  //   dynamic language support (JSR 292 enabled). The host class grants
  //   its access privileges to this class also. The host class is either
  //   named, or a previously loaded anonymous class. A non-anonymous class
  //   or an anonymous class loaded through normal classloading does not
  //   have this embedded field.
  
  ...
}


The important attributes are described in the following list, each field name followed by its effect.

  • _annotations: A pointer of type Annotations that holds all annotations used by this class.
  • _array_klasses: Pointer to the array Klass whose element type is this class. For example, for the InstanceKlass object representing the Object class, _array_klasses points to the ObjArrayKlass object that represents the Object array.
  • _array_name: The name of the array type that has this class as its element type. If the current InstanceKlass object represents the Object class, the name is "[Ljava/lang/Object;".
  • _constants: A pointer of type ConstantPool, pointing to the ConstantPool object that holds the constant pool information of the Java class.
  • _inner_classes: A jushort array that saves the InnerClasses attribute and the EnclosingMethod attribute of the current class.
  • _nonstatic_field_size: The amount of memory occupied by non-static fields, measured in heapOopSize words. When memory is allocated for an object (represented by an oop) created from the Java class that the current class represents, this value is used. It is calculated during class file parsing.
  • _static_field_size: The amount of memory occupied by static fields, in words. When memory is allocated for the java.lang.Class object (represented by an oop) created for the Java class that the current class represents, this value is used. It is calculated during class file parsing.
  • _generic_signature_index: Constant pool index of the generic signature of this class, or 0 if there is none.
  • _source_file_name_index: Constant pool index of the source file name of this class, or 0 if not specified.
  • _static_oop_field_count: The number of static reference-type (oop) fields in this class.
  • _java_fields_count: The number of Java fields declared in this class.
  • _nonstatic_oop_map_size: The amount of memory occupied by the non-static oop map blocks, in words.
  • _minor_version: The minor version number of the class file.
  • _major_version: The major version number of the class file.
  • _init_thread: Pointer to the thread that is currently initializing this class.
  • _vtable_len: The amount of memory occupied by the Java virtual function table (vtable), in words.
  • _itable_len: The amount of memory occupied by the Java interface function table (itable), in words.
  • _oop_map_cache: OopMapCache pointer, the OopMapCache for all methods of this class (allocated lazily).
  • _jni_ids / _methods_jmethod_ids: A JNIid pointer and a jmethodID pointer. Both are very important for JNI operations on fields and methods, and will be described in detail when JNI is introduced.
  • _dependencies: nmethodBucket pointer, the head of the list of dependent nmethods (compiled methods). The next entry is obtained through its _next property.
  • _osr_nmethods_head: The head of the linked list of on-stack replacement (OSR) nmethods for this class.
  • _init_state: The state of the class. It holds a value of the enumeration type ClassState, which defines the following constants: allocated (memory allocated), loaded (read from the class file and loaded into memory), linked (successfully linked and verified), being_initialized (initialization in progress), fully_initialized (initialization completed), initialization_error (initialization failed with an exception).
  • _reference_type: Reference type, which may be strong reference, soft reference, weak reference, etc.
  • _methods: Array of Method pointers holding the methods of this class.
  • _default_methods: Array of Method pointers holding the default methods, that is, the concrete methods inherited from interfaces.
  • _local_interfaces: Array of Klass pointers holding the interfaces that this class directly declares it implements.
  • _transitive_interfaces: Array of Klass pointers holding all interfaces this class implements, including those in _local_interfaces and the indirectly implemented ones.
  • _default_vtable_indices: Indices of the default methods in the virtual function table. Positions match those in _default_methods.
  • _fields: Field information of the class. Six attributes of each field (access, name index, sig index, initial value index, low_offset and high_offset) form a tuple. access is the access control attribute, the field name is obtained from name index, the initial value from initial value index, and the offset of the field in memory from low_offset and high_offset. Generic signature information may also be saved after all the field tuples.
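To make the tuple layout of _fields more concrete, here is a minimal sketch that walks a flat array of 16-bit values six at a time and recombines low_offset and high_offset into one 32-bit number. The names FieldTuple and decode_fields are hypothetical and do not come from HotSpot, and HotSpot's own accessor code is more involved than this; treat it as an illustration of the array shape only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view of one 6-tuple from the _fields array.
struct FieldTuple {
    uint16_t access;
    uint16_t name_index;
    uint16_t sig_index;
    uint16_t initval_index;
    uint32_t offset;  // low_offset and high_offset recombined (simplified)
};

// Decode the first `field_count` 6-tuples of a flat array of shorts.
// Anything after them (the generic signature indices) is ignored here.
std::vector<FieldTuple> decode_fields(const std::vector<uint16_t>& raw,
                                      std::size_t field_count) {
    const std::size_t kSlots = 6;  // shorts per field tuple
    assert(raw.size() >= field_count * kSlots);
    std::vector<FieldTuple> out;
    out.reserve(field_count);
    for (std::size_t i = 0; i < field_count; ++i) {
        const uint16_t* t = &raw[i * kSlots];
        FieldTuple f;
        f.access = t[0];
        f.name_index = t[1];
        f.sig_index = t[2];
        f.initval_index = t[3];
        // Assumption: low_offset holds the low 16 bits, high_offset the high 16.
        f.offset = (static_cast<uint32_t>(t[5]) << 16) | t[4];
        out.push_back(f);
    }
    return out;
}
```

Because every tuple has the same width, field i always starts at slot i * 6, which is what lets the generic signature data simply be appended after the last tuple.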


With these attributes defined in InstanceKlass and Klass, there is enough to save the meta information of a Java class. The later articles on class file parsing show how these attributes are filled in. Besides saving class meta information, this class has another important function: it supports method dispatch, which is done mainly through the Java virtual function table (vtable) and the Java interface function table (itable). However, unlike Java, C++ does not need a declared attribute in the class for every piece of information it stores. It only has to reserve the right amount of memory for that information at allocation time, and can then operate on it directly through memory offsets.

The following attributes have no corresponding attribute names and can only be accessed through pointers and offsets:

  • Java vtable: the Java virtual function table, with size equal to _vtable_len;
  • Java itables: the Java interface function table, with size equal to _itable_len;
  • Non-static oop map blocks, with size equal to _nonstatic_oop_map_size. During garbage collection, when the GC traverses the other objects referenced by an object, it uses this information to find them;
  • Implementor of the interface, which exists only when the current class represents an interface. It is NULL if the interface has no implementing class; if there is only one implementing class, it is the Klass pointer of that class; if there are multiple implementing classes, it is the current class itself;
  • Host klass, which exists only in anonymous classes. To support the dynamic language feature of JSR 292, an anonymous class records a host klass, and the host class grants its access privileges to the anonymous class.
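The three-way encoding of the implementor slot (NULL, another Klass, or the interface itself) can be sketched as follows. This is only an illustration under assumptions: the Klass struct here is an empty stand-in, and InterfaceSlot, add_implementor and nof_implementors are hypothetical names for this example, not HotSpot's API.

```cpp
#include <cassert>

struct Klass {};  // empty stand-in for HotSpot's Klass (assumption)

// Hypothetical model of the embedded implementor slot of an interface.
struct InterfaceSlot {
    const Klass* self;         // the interface klass itself
    const Klass* implementor;  // nullptr, the single implementor, or self

    explicit InterfaceSlot(const Klass* iface)
        : self(iface), implementor(nullptr) {}

    // Record that klass k implements this interface.
    void add_implementor(const Klass* k) {
        assert(k != nullptr && k != self);
        if (implementor == nullptr) {
            implementor = k;     // first implementor seen
        } else if (implementor != k) {
            implementor = self;  // a second, different one: "more than one"
        }
    }

    // Returns 0, 1, or 2, where 2 means "more than one implementor".
    int nof_implementors() const {
        if (implementor == nullptr) return 0;
        return implementor == self ? 2 : 1;
    }
};
```

Using the interface's own pointer as the "many" marker means a single pointer-sized slot is enough to tell the three cases apart, which fits the one extra word reserved for interfaces.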

When parsing a class, HotSpot calls the InstanceKlass::allocate_instance_klass() method to allocate memory. The amount of memory to allocate is calculated by calling InstanceKlass::size(). The call statement is as follows:

int size = InstanceKlass::size(vtable_len,itable_len,nonstatic_oop_map_size,isinterf,is_anonymous);

The implementation of the called size() method is as follows:

static int size(
  int    vtable_length,
  int    itable_length,
  int    nonstatic_oop_map_size,
  bool   is_interface,
  bool   is_anonymous
){
return     align_object_size(header_size()    +  // The amount of memory occupied by the InstanceKlass class itself
	   align_object_offset(vtable_length) +
	   align_object_offset(itable_length) +
	   //    [EMBEDDED nonstatic oop-map blocks] size in words = nonstatic_oop_map_size
	   //      The embedded nonstatic oop-map blocks are short pairs (offset, length)
	   //      indicating where oops are located in instances of this klass.
	   (
			  (is_interface || is_anonymous) ?
			  align_object_offset(nonstatic_oop_map_size) :
			  nonstatic_oop_map_size
	   ) +
	   //    [EMBEDDED implementor of the interface] only exist for interface
	   (
			   is_interface ? (int)sizeof(Klass*)/HeapWordSize : 0
	   ) +
	   //    [EMBEDDED host klass        ] only exist for an anonymous class (JSR 292 enabled)
	   (
			   is_anonymous ? (int)sizeof(Klass*)/HeapWordSize : 0)
	   );
}

The return value of the method is the amount of memory, in words, needed to create this Klass object. The calculation logic of this method also reveals the memory layout of the Klass object.

The gray shaded part in the figure is optional. The values of vtable_length, itable_length and nonstatic_oop_map_size are calculated during class file parsing, which will be described in detail later.
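The size calculation can be tried out with a small stand-alone sketch. Everything here is an assumption made for illustration: klass_size_in_words and round_up are hypothetical names, header_words stands in for header_size(), and the two alignment parameters stand in for whatever align_object_offset() and align_object_size() round to on a given platform.

```cpp
#include <cassert>

// Round `value` up to the next multiple of `alignment` (alignment > 0).
int round_up(int value, int alignment) {
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical mirror of InstanceKlass::size(). All quantities are in words.
int klass_size_in_words(int header_words, int vtable_length, int itable_length,
                        int nonstatic_oop_map_size, bool is_interface,
                        bool is_anonymous, int offset_alignment,
                        int object_alignment) {
    const int pointer_words = 1;  // assumes sizeof(Klass*) == HeapWordSize
    int size = header_words;
    size += round_up(vtable_length, offset_alignment);
    size += round_up(itable_length, offset_alignment);
    // The oop-map blocks are only padded when an embedded pointer follows.
    size += (is_interface || is_anonymous)
                ? round_up(nonstatic_oop_map_size, offset_alignment)
                : nonstatic_oop_map_size;
    size += is_interface ? pointer_words : 0;  // embedded implementor
    size += is_anonymous ? pointer_words : 0;  // embedded host klass
    return round_up(size, object_alignment);
}
```

For example, with a made-up 50-word header, a vtable of 5 words, an itable of 3 words, 1 word of oop-map blocks and both alignments set to 1, an ordinary class needs 59 words and an interface needs 60, the extra word being the embedded implementor.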

The header_size() method it calls calculates the amount of memory occupied by the InstanceKlass object itself. The implementation is as follows:

// Sizing (in words) 
static int header_size(){ 
  return align_object_offset(sizeof(InstanceKlass)/HeapWordSize); // Measured in HeapWordSize units; on a 64-bit VM a word is 8 bytes, so HeapWordSize is 8
}

The align_object_offset() method it calls performs memory alignment, an important C++ topic that will be explained later.
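As a preview of the idea, rounding a value up to a power-of-two boundary is commonly done with a bit mask. The sketch below shows that general technique. The name align_up is hypothetical, and this is not a claim about how HotSpot's own alignment helpers are written.

```cpp
#include <cassert>
#include <cstddef>

// True if x is a power of two (1, 2, 4, 8, ...).
bool is_power_of_two(std::size_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// Round `value` up to the next multiple of `alignment`.
// Only valid when `alignment` is a power of two.
std::size_t align_up(std::size_t value, std::size_t alignment) {
    assert(is_power_of_two(alignment));
    // Adding (alignment - 1) pushes any non-aligned value past the next
    // boundary; clearing the low bits then snaps it back onto the boundary.
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With an alignment of 8, the value 13 becomes 16, while 16 stays 16 because it is already on a boundary.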

2. Subclasses of the InstanceKlass class

InstanceKlass has three direct subclasses, which are used to represent some special classes. The following is a brief introduction to these three subclasses:

(1) InstanceRefKlass class

Subclasses of java/lang/ref/Reference need to be represented by the InstanceRefKlass class. When an instance of this class is created, the value of the _reference_type field indicates which reference type the current class represents. The possible values are defined in the ReferenceType enumeration, starting with REF_NONE, as follows:

// ReferenceType is used to distinguish between java/lang/ref/Reference subclasses

enum ReferenceType {
  REF_NONE,      // Regular class
  REF_OTHER,     // Subclass of java/lang/ref/Reference, but not subclass of one of the classes below
  REF_SOFT,      // Subclass of java/lang/ref/SoftReference
  REF_WEAK,      // Subclass of java/lang/ref/WeakReference
  REF_FINAL,     // Subclass of java/lang/ref/FinalReference
  REF_PHANTOM    // Subclass of java/lang/ref/PhantomReference
};

You can see that the Java class Reference and all its subclasses are represented by objects of the C++ class InstanceRefKlass. When the class is not a subclass of one of the specific reference classes (soft, weak, final or phantom), the value of _reference_type is set to REF_OTHER.

Because these classes need special treatment by the garbage collector, they will be introduced in detail later, in the explanation of strong, weak and phantom references.

(2) InstanceMirrorKlass class

An InstanceMirrorKlass object is used to represent the special java.lang.Class class. It adds a static attribute, _offset_of_static_fields, which describes the starting offset of static fields. It is defined as follows:

static int _offset_of_static_fields;

This attribute is added only because the java.lang.Class class is special. Under normal circumstances, HotSpot uses Klass to represent Java classes and oop to represent Java objects. A Java class may define static and non-static fields. The non-static field values are stored in the oop of each object, while the static field values are stored in the java.lang.Class object that represents the current Java class. The java.lang.Class class is represented by an InstanceMirrorKlass object, and java.lang.Class objects are represented by oops, so the non-static field values of a Class object are stored in its oop. The Class class itself also defines static fields, and these values are stored in the Class object as well, that is, in the oop representing the Class object. As a result, static and non-static fields are stored in the same oop, and the _offset_of_static_fields attribute gives the offset at which the static fields start.

This attribute is initialized by the init_offset_of_static_fields() method. The initialization process is as follows:

  static void init_offset_of_static_fields() {
    // Cache the offset of the static fields in the Class instance
    assert(_offset_of_static_fields == 0, "once");
    // The java.lang.Class class is represented by an InstanceMirrorKlass object, while java.lang.Class
    // objects are represented by oops, so imk->size_helper() returns the size of the oop in words.
    // Shifting left by 3 bits converts words into bytes. The static field values are stored immediately after the oop's own fields.
    InstanceMirrorKlass* imk = InstanceMirrorKlass::cast(SystemDictionary::Class_klass());
    _offset_of_static_fields = imk->size_helper() << LogHeapWordSize; // LogHeapWordSize=3
  }
  
int size_helper() const {
    return layout_helper_to_size_helper(layout_helper());
  }
 
static int layout_helper_to_size_helper(jint lh) {
    assert(lh > (jint)_lh_neutral_value, "must be instance");
    return lh >> LogHeapWordSize;
}
 
int layout_helper() const{ return _layout_helper; }

Calling the size_helper() method of the java.lang.Class class (represented by an InstanceMirrorKlass object) returns the size of a java.lang.Class object (represented by an oop). This size covers the fields declared in the java.lang.Class class, and the static storage area follows it.

The output printed with the -XX:+PrintFieldLayout option enabled is as follows:

The non static layout is as follows:

java.lang.Class: field layout
  @ 12 --- instance fields start ---
  @ 12 "cachedConstructor" Ljava.lang.reflect.Constructor;
  @ 16 "newInstanceCallerCache" Ljava.lang.Class;
  @ 20 "name" Ljava.lang.String;
  @ 24 "reflectionData" Ljava.lang.ref.SoftReference;  
  @ 28 "genericInfo" Lsun.reflect.generics.repository.ClassRepository;
  @ 32 "enumConstants" [Ljava.lang.Object;
  @ 36 "enumConstantDirectory" Ljava.util.Map;
  @ 40 "annotationData" Ljava.lang.Class$AnnotationData;
  @ 44 "annotationType" Lsun.reflect.annotation.AnnotationType;
  @ 48 "classValueMap" Ljava.lang.ClassValue$ClassValueMap;
  @ 52 "protection_domain" Ljava.lang.Object;
  @ 56 "init_lock" Ljava.lang.Object;
  @ 60 "signers_name" Ljava.lang.Object;
  @ 64 "klass" J
  @ 72 "array_klass" J 
  @ 80 "classRedefinedCount" I
  @ 84 "oop_size" I
  @ 88 "static_oop_field_count" I
  @ 92 --- instance fields end ---
  @ 96 --- instance ends ---

This is the layout of the non-static fields of java.lang.Class. The offset of each field is calculated during class file parsing. After the non-static fields are laid out, the static fields are laid out immediately after them. At this point, the _offset_of_static_fields attribute has a value of 96.
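The arithmetic behind the value 96 can be replayed with a tiny sketch. It assumes 8-byte heap words (so a shift of 3) and assumes that the layout helper of java.lang.Class is simply the instance size in bytes, which matches the "instance ends" offset of 96 in the output above. kLogHeapWordSize and offset_of_static_fields are names invented for this example.

```cpp
#include <cassert>

const int kLogHeapWordSize = 3;  // assumption: 8-byte heap words (64-bit VM)

// Simplified stand-in: converts a byte-sized layout helper into words.
int layout_helper_to_size_helper(int lh) {
    assert(lh > 0);  // an instance klass is expected to have a positive value
    return lh >> kLogHeapWordSize;
}

// Words are converted back to bytes to get the start of the static fields.
int offset_of_static_fields(int layout_helper) {
    return layout_helper_to_size_helper(layout_helper) << kLogHeapWordSize;
}
```

With a layout helper of 96, size_helper is 12 words, and shifting back gives a byte offset of 96, so the static fields begin right where the instance fields of the Class object end.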

We need to distinguish the representation methods of related classes, as shown in the figure below.

A java.lang.Class object saves the static fields of the class it represents in its own oop. Therefore, the instance sizes of Class objects differ from one another, and special methods are needed to calculate their size and to traverse their fields.

The _java_mirror property of Klass points to the oop that holds the static fields of this class, and the static fields can be accessed through it. oop is HotSpot's object representation model, which will be described in detail later.

(3) InstanceClassLoaderKlass class

This class adds no new fields. It only adds new oop traversal methods, which are mainly used for traversing class loader dependencies.

3. Create an instance of a class

Creating an InstanceKlass instance calls the InstanceKlass::allocate_instance_klass() method. This involves C++ overloading of the new operator: the overloaded new operator allocates the memory for the object, and then the constructor of the class is called to initialize the corresponding properties. The implementation of the method is as follows:

InstanceKlass* InstanceKlass::allocate_instance_klass(
	ClassLoaderData*  loader_data,
	int               vtable_len,
	int               itable_len,
	int               static_field_size,
	int               nonstatic_oop_map_size,
	ReferenceType     rt,
	AccessFlags       access_flags,
	Symbol*           name,
	Klass*            super_klass,
	bool              is_anonymous,
	TRAPS
){
  bool  isinterf = access_flags.is_interface();
  int   size = InstanceKlass::size(
				 vtable_len,
				 itable_len,
				 nonstatic_oop_map_size,
				 isinterf,
				 is_anonymous
			   );

  // Allocation
  InstanceKlass* ik;
  ///////////////////////////////////////////////////////////////////////
  if (rt == REF_NONE) {
    if (name == vmSymbols::java_lang_Class()) { // The java.lang.Class class is represented by an InstanceMirrorKlass object
      ik = new (loader_data, size, THREAD) InstanceMirrorKlass(
                                           vtable_len,
					   itable_len,
					   static_field_size,
					   nonstatic_oop_map_size,
					   rt,
                                           access_flags,
					   is_anonymous);
    } else if (
    	  name == vmSymbols::java_lang_ClassLoader() ||
          (
             SystemDictionary::ClassLoader_klass_loaded() &&
             super_klass != NULL &&	 // ClassLoader_klass is java_lang_ClassLoader
             super_klass->is_subtype_of(SystemDictionary::ClassLoader_klass())
          )
    ){ // java.lang.ClassLoader and its subclasses are represented by InstanceClassLoaderKlass objects
      ik = new (loader_data, size, THREAD) InstanceClassLoaderKlass(
                                           vtable_len,
					   itable_len,
					   static_field_size,
					   nonstatic_oop_map_size,
					   rt,
                                           access_flags,
					   is_anonymous);
    } else { // Ordinary classes are represented by InstanceKlass objects
      // normal class
      ik = new (loader_data, size, THREAD) InstanceKlass( 
				vtable_len, itable_len,
				static_field_size,
				nonstatic_oop_map_size,
				rt,
				access_flags,
				is_anonymous);
    }
  }
  ///////////////////////////////////////////////////////////////////////
  else { // References are represented by InstanceRefKlass objects
    // reference klass
    ik = new (loader_data, size, THREAD) InstanceRefKlass(
				vtable_len, itable_len,
				static_field_size,
				nonstatic_oop_map_size,
				rt,
				access_flags,
				is_anonymous);
  }
  ///////////////////////////////////////////////////////////////////////

  // Add all classes to our internal class loader list here,
  // including classes in the bootstrap (NULL) class loader.
  // loader_data is of type ClassLoaderData*. Its _klasses property holds the head of a
  // list of classes linked through the _next_link property of InstanceKlass.
  loader_data->add_class(ik);
  Atomic::inc(&_total_instanceKlass_count);
  return ik;
}

The implementation of the method is relatively simple. When rt is equal to REF_NONE, that is, when the class is not a Reference type, the object of the matching C++ class is created according to the class name: the Class class is represented by InstanceMirrorKlass, the ClassLoader class and its subclasses by InstanceClassLoaderKlass, and ordinary classes by InstanceKlass. When rt is not REF_NONE, an InstanceRefKlass object is created.
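The selection logic can be condensed into a small decision function. This is a simplified sketch, not HotSpot code: KlassKind and choose_klass_kind are hypothetical names, class names are compared as plain strings instead of Symbol pointers, and the "is the super class a ClassLoader" check is reduced to a boolean that the caller supplies.

```cpp
#include <cassert>
#include <string>

enum ReferenceType { REF_NONE, REF_OTHER, REF_SOFT, REF_WEAK, REF_FINAL, REF_PHANTOM };

// Hypothetical tag for the four C++ classes that may be instantiated.
enum class KlassKind { Instance, Mirror, ClassLoader, Ref };

KlassKind choose_klass_kind(ReferenceType rt, const std::string& name,
                            bool super_is_class_loader) {
    if (rt != REF_NONE) {
        return KlassKind::Ref;          // Reference and its subclasses
    }
    if (name == "java/lang/Class") {
        return KlassKind::Mirror;       // the special java.lang.Class class
    }
    if (name == "java/lang/ClassLoader" || super_is_class_loader) {
        return KlassKind::ClassLoader;  // ClassLoader or one of its subclasses
    }
    return KlassKind::Instance;         // ordinary class
}
```

The order of the checks matters in the same way as in the original method: the reference type is examined first, and the name-based checks only apply to classes whose rt is REF_NONE.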

The size() function called here was already described in the introduction of the InstanceKlass class, so it is not repeated. Once the size is known, the overloaded new operator function is called to allocate the memory, as follows:

void* Klass::operator new(size_t size, ClassLoaderData* loader_data, size_t word_size, TRAPS) throw() {
  void* x = Metaspace::allocate( // Allocate memory space in metadata area
				 loader_data,
				 word_size,
				 false,   /*read_only*/
				 MetaspaceObj::ClassType,
				 CHECK_NULL
			 );
  return x;
}

As you can see, in JDK 1.8 the Klass object is allocated in the metadata area (Metaspace). Since C++ does not have a garbage collection mechanism like Java's, HotSpot has to manage and release the memory of Metaspace itself. This will be introduced in detail later.
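For readers less familiar with this C++ feature, the following sketch shows a class-level operator new that takes extra arguments and hands out more memory than sizeof(T), in the spirit of the code above. The Arena type is a hypothetical bump allocator used only as a stand-in, Meta and make_meta are invented names, and nothing here reflects how Metaspace works internally.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical bump allocator, used here only as a stand-in for Metaspace.
struct Arena {
    alignas(std::max_align_t) unsigned char buffer[1024];
    std::size_t used = 0;

    void* allocate(std::size_t bytes) {
        if (bytes > sizeof(buffer) - used) return nullptr;  // out of space
        void* p = buffer + used;
        used += bytes;
        return p;
    }
};

const std::size_t kWordSize = sizeof(void*);

struct Meta {
    int vtable_len;
    explicit Meta(int vlen) : vtable_len(vlen) {}

    // The compiler passes sizeof(Meta) as the first argument; the others come
    // from the parentheses after `new`. word_size may exceed sizeof(Meta) so
    // that embedded tables can follow the object in the same allocation.
    static void* operator new(std::size_t size, Arena& arena,
                              std::size_t word_size) noexcept {
        assert(word_size * kWordSize >= size);
        return arena.allocate(word_size * kWordSize);
    }
    // Matching placement delete, only called if the constructor throws.
    static void operator delete(void*, Arena&, std::size_t) noexcept {}
};

// Allocate word_size words from the arena, then run the Meta constructor.
Meta* make_meta(Arena& arena, int vtable_len, std::size_t word_size) {
    return new (arena, word_size) Meta(vtable_len);
}
```

Because this operator new is declared noexcept, a nullptr result makes the whole new expression evaluate to nullptr without running the constructor, so callers can check the returned pointer.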

Other reference articles:

1. Compile the source code of OpenJDK8 on Ubuntu 16.04 (with video)

2. Debug HotSpot source code (with video)

3. HotSpot project structure

4. Startup process of HotSpot (with video for source code analysis)

5. Memory layout of C++ objects in HotSpot source code analysis

6. HotSpot source code analysis and other models

Videos related to HotSpot source code analysis are available on Bilibili: https://space.bilibili.com/27533329