An in-depth look at the Java class loading mechanism

Posted by jibster on Sat, 22 Jan 2022 16:59:55 +0100

Initial class loading process

When a class is first used and its class file has not yet been loaded into memory, the JVM makes the class ready for use in the following three steps:

1. Class loading → 2. Class linking → 3. Class initialization

  • Class loading: read the class file into memory and create a java.lang.Class object for it. This step is carried out by a ClassLoader.
  • Class linking: merge the binary data of the class into the runtime state of the JVM (verification, preparation, and resolution).
  • Class initialization: the JVM initializes the class, giving its static variables the values specified in the code.
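
The difference between loading and initialization can be observed from ordinary code. The following is a minimal sketch, not part of the original article: the class names LoadVsInit and Lazy are hypothetical, and it assumes that ClassLoader.loadClass() loads a class without initializing it, while Class.forName(name, true, loader) also initializes it.

```java
import java.util.ArrayList;
import java.util.List;

public class LoadVsInit {
    // Records what happened, in order.
    static final List<String> events = new ArrayList<>();

    // Hypothetical class whose static block announces its own initialization.
    static class Lazy {
        static {
            events.add("Lazy initialized");
        }
    }

    static List<String> run() {
        try {
            ClassLoader loader = LoadVsInit.class.getClassLoader();
            String name = LoadVsInit.class.getName() + "$Lazy";

            // loadClass() loads the class but does not initialize it.
            Class<?> loaded = loader.loadClass(name);
            events.add("loaded " + loaded.getSimpleName());

            // Passing initialize = true asks for initialization as well,
            // which runs the static block (only the first time).
            Class.forName(name, true, loader);
            events.add("after forName");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first run, "loaded Lazy" is recorded before "Lazy initialized", which shows that the Class object exists before any static code of the class has run.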

The class loading process in depth

The complete life cycle of a class: loading, linking (verification, preparation, resolution), initialization, using, and unloading

  1. Loading

    1. Obtain the binary byte stream that defines the class, using its fully qualified name

    2. Convert the static storage structure represented by this byte stream into the runtime data structures of the method area

    3. Create a java.lang.Class object for the class in the heap, which serves as the access point to this data in the method area

    4. Note: compared with the other stages of class loading, the loading stage is the one a programmer can control most, because the class can be loaded either by a class loader supplied by the system or by a custom one. For now it is enough to know that the class loader's job is to carry out the three steps above on behalf of the virtual machine

  2. Linking

    1. Verification

      1. File format verification: checks whether the byte stream conforms to the class file format specification and can be processed by the current version of the virtual machine.

      2. Metadata verification: performs semantic analysis on the information described by the bytecode to ensure that it meets the requirements of the Java language specification, for example whether the class has a parent class and whether its fields and methods conflict with those of the parent class.

      3. Bytecode verification: the most complex stage of the whole verification process. It uses data-flow and control-flow analysis to determine that the program semantics are legal and logical. After the data types have been checked during metadata verification, this stage analyzes the method bodies of the class to ensure that they will not do anything at run time that threatens the security of the virtual machine.

      4. Symbolic reference verification: the last stage of verification, which takes place when the virtual machine converts symbolic references into direct references. It checks information outside the class itself (the symbolic references in the constant pool) to ensure that resolution can complete.

      5. Note: within the class loading mechanism, verification is a very important but not strictly required stage. If the code being run is known to be trustworthy, verification can be skipped to save the time it takes: the -Xverify:none option turns off most verification. (This option has been deprecated since JDK 13.)

    2. Preparation - important

      The preparation phase allocates memory for class variables (static fields) and sets their initial values. This memory is allocated in the method area. Two terms matter at this stage: class variable and initial value.

      1. Class variable (static): memory is allocated for class variables only, and the value written in the source code is not assigned yet. Instance variables get no space at this stage; they are allocated on the Java heap together with the object when it is instantiated

      2. Initial value: the initial value here is the default (zero) value of the data type, not the value explicitly assigned in the code

      Example 1: public static int value = 1;

      After the preparation phase, value is 0 rather than 1. The assignment of 1 happens in the initialization phase

      Example 2: public static final int value = 1;

      Because the field is both static and final and is initialized with a compile-time constant, the compiler records the value in a ConstantValue attribute of the field, so value is already 1 after the preparation phase
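
The two examples can be made visible with a small experiment. This is only an illustrative sketch with hypothetical names (PreparationDemo, peekValue, peekConstant): it reads both fields from a static initializer that runs before their declarations are reached, so it sees the state that the preparation phase left behind.

```java
public class PreparationDemo {
    // Static initializers run top to bottom, so these two calls execute
    // before the assignment "value = 1" below has happened.
    static int seenValue = peekValue();
    static int seenConstant = peekConstant();

    static int value = 1;           // 0 after preparation, 1 after initialization
    static final int CONSTANT = 1;  // compile-time constant, never observed as 0

    static int peekValue() {
        return value;
    }

    static int peekConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println("value seen early:    " + seenValue);     // 0
        System.out.println("constant seen early: " + seenConstant);  // 1
        System.out.println("value now:           " + value);         // 1
    }
}
```

Reading the fields through a method is what keeps this legal; reading value directly in the first initializer would be rejected as an illegal forward reference.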

    3. Resolution

      The resolution phase is the process in which the virtual machine replaces the symbolic references in the constant pool with direct references

      1. Symbolic reference: a set of symbols that describes the referenced target. It can be a literal in any form, as long as it identifies the target unambiguously. As an analogy, a teacher in class can refer to you by your name (say, Zhang San) or by your student number; either way it is only a label (a symbol) that points to you (a symbolic reference)
      2. Direct reference: a pointer directly to the target, a relative offset, or a handle that locates the target indirectly. Direct references depend on the memory layout of the particular virtual machine implementation, so the direct reference for the same symbolic reference generally differs between virtual machines
      3. Supplement: resolution mainly applies to classes or interfaces, fields, class methods, interface methods, method types, method handles, and call site specifiers
    4. Initialization (strictly a separate phase that follows linking)

      This is the last step of the class loading mechanism, and the point at which the Java code defined in the class starts to execute. Class variables were already given their default values once in the preparation phase; in the initialization phase they are assigned the values the programmer specified. In short, this stage is the execution of the class initializer method <clinit>().

      The initialization phase gives the static variables of the class their intended initial values. There are two ways to set the initial value of a class variable in Java:

      1. Specify the initial value in the declaration of the class variable
      2. Assign the initial value in a static block

      Supplement: the <clinit>() method has the following characteristics:

      1. The compiler generates it by collecting the assignments to all class variables (static) and the statements in the static blocks (static {}) of the class and merging them. The order in which the compiler collects them is the order in which they appear in the source file. In particular, a static block can read only class variables declared before it; a class variable declared after it can be assigned in the block but not read. For example:

        class Test {
        	static {
        		i = 0;                // Assigning values to variables can be compiled normally
        		System.out.print(i);  // The compiler will report an error and prompt "illegal forward reference"
        	}
        	static int i = 1;
        }
        
      2. Unlike an instance constructor (the <init>() method), <clinit>() does not need to call the parent class's counterpart explicitly. The virtual machine guarantees that the <clinit>() method of the parent class has finished before the <clinit>() method of the child class runs. Therefore the first class whose <clinit>() method is executed in the virtual machine is always java.lang.Object. Because the parent's <clinit>() runs first, static blocks defined in the parent class execute before the variable assignments of the child class. For example:

        public class Test {
            public static void main(String[] args) {
                // Prints "a" (from Father's static block) and then 2,
                // the value of static variable A in the parent class
                System.out.println(Sub.B);
            }
        }
        class Father {
            public static int A = 1;
            static {
                System.out.println("a");
                A = 2;
            }
        }
        class Sub extends Father {
            public static int B = A;
        }
        
      3. The <clinit>() method is not mandatory for a class or interface. If a class has no static blocks and no assignments to class variables, the compiler does not need to generate a <clinit>() method for it.

      4. Static blocks cannot be used in an interface, but an interface can still have assignments that initialize its variables, so an interface gets a <clinit>() method just as a class does. The difference is that executing the <clinit>() method of an interface does not require the <clinit>() method of its parent interface to run first. The parent interface is initialized only when a variable defined in it is used. Likewise, initializing a class that implements the interface does not execute the interface's <clinit>() method.

      5. The virtual machine ensures that the <clinit>() method of a class is correctly locked and synchronized in a multi-threaded environment. If several threads try to initialize a class at the same time, only one of them executes the <clinit>() method, and the others block until it has finished. If the <clinit>() method of a class contains a long-running operation, many threads may be blocked, and in practice this kind of blocking is hard to notice.

      6. JVM initialization steps:

        1. If the class has not been loaded and linked, load and link it first
        2. If the direct parent class has not been initialized, initialize the direct parent class first
        3. If the class has initialization statements, execute them in order
      7. When a class is initialized:

        A class is initialized only when it is actively used. There are six kinds of active use:

        1. Creating an instance of the class, that is, new

        2. Reading or assigning a static variable of the class or interface

        3. Calling a static method of the class

        4. Reflection, for example a Class.forName() call

        5. Initializing a subclass of the class, which initializes the parent class as well

        6. Being designated as the startup class when the Java virtual machine starts, that is, being run directly as the main class with the java command

          For example: the Test class
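
The contrast between active use and uses that do not trigger initialization can be traced with a short program. This is a sketch under stated assumptions, not code from the original article: the class names are hypothetical, and it relies on the rules that creating an array of a type and reading a compile-time constant do not initialize the class, and that reading a static field through a subclass name initializes only the class that declares the field.

```java
import java.util.ArrayList;
import java.util.List;

public class ActiveUseDemo {
    // Initialization order as observed at run time.
    static final List<String> log = new ArrayList<>();

    static class Parent {
        static int counter = 7;
        static { log.add("Parent"); }
    }

    static class Child extends Parent {
        static final String NAME = "child"; // compile-time constant
        static { log.add("Child"); }
        static void touch() { }
    }

    static List<String> run() {
        Child[] array = new Child[1];   // not an active use: no initialization
        String name = Child.NAME;       // not an active use: the constant is inlined
        log.add("passive done");

        int c = Child.counter;          // field is declared in Parent: only Parent initializes
        log.add("read parent field");

        Child.touch();                  // static method call: Child initializes now
        return log;
    }

    public static void main(String[] args) {
        // First run prints: passive done, Parent, read parent field, Child
        run().forEach(System.out::println);
    }
}
```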

  3. Using: once initialization is complete, the JVM starts executing the user's program code from the entry method

  4. Unloading: after the user's program code has finished executing, the JVM destroys the Class objects it created, and finally the JVM process exits and releases its memory
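
The locking guarantee described for <clinit>() in the initialization stage is what makes the common "holder" idiom for lazy singletons safe. The sketch below is an illustration only (HolderDemo, Holder and raceAndCount are hypothetical names): several threads race to trigger initialization of the same class, and a counter records how many times the initializer actually ran.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class HolderDemo {
    // Counts how many times the static initializer of Holder executed.
    static final AtomicInteger initRuns = new AtomicInteger();

    static class Holder {
        // Not a compile-time constant, so reading it triggers initialization.
        static final Object INSTANCE = create();

        private static Object create() {
            initRuns.incrementAndGet();
            return new Object();
        }
    }

    static Object get() {
        return Holder.INSTANCE;
    }

    // Starts the given number of threads, releases them together, and
    // returns the number of initializer runs once all have finished.
    static int raceAndCount(int threads) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                get();
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return initRuns.get();
    }

    public static void main(String[] args) {
        System.out.println("initializer runs: " + raceAndCount(8)); // 1
    }
}
```

The same guarantee is the reason a slow static block is risky: every thread that touches the class waits for the one thread that is running the initializer.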

Using the class loading process to understand an interview question

public class Test {
	public static void main(String[] args) {
		A a = A.getInstance();
		System.out.println("A value1:" + a.value1); // 1
		System.out.println("A value2:" + a.value2); // 0

		B b = B.getInstance();
		System.out.println("B value1:" + b.value1); // 1
		System.out.println("B value2:" + b.value2); // 1
	}
}
class A{
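    // Static initializers run in source order: new A() runs first and
    // raises both counters from their default 0 to 1
    private static A a = new A();
    // value1 has no initializer, so it keeps the value 1
    public static int value1;
    // value2 is then reassigned to 0, overwriting the 1
    public static int value2 = 0;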

    private A(){
        value1++;
        value2++;
    }

    public static A getInstance(){
        return a;
    }
}
class B{
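    // Here both fields already have their declared values (0 and 0)...
    public static int value1;
    public static int value2 = 0;
    // ...before new B() runs, so the constructor leaves both at 1
    private static B b = new B();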

    private B(){
        value1++;
        value2++;
    }
    public static B getInstance(){
        return b;
    }

}

Class loader

A class loader implements the first action of the loading stage: obtaining the binary byte stream of a class from its fully qualified name. This action is implemented outside the Java virtual machine so that the application itself can decide how to obtain the classes it needs.

Class loader classification

From the perspective of Java virtual machine, there are only two different class loaders:

  1. The Bootstrap ClassLoader: implemented in C++ and part of the virtual machine itself
  2. All other class loaders: implemented in Java, independent of the virtual machine, and all inheriting from the abstract class java.lang.ClassLoader

From the perspective of a Java developer, class loaders can be divided further (the layout below is that of JDK 8 and earlier; JDK 9 replaced the extension class loader with the platform class loader):

  1. Bootstrap ClassLoader: the topmost class loader. It loads the class libraries that are in the <JAVA_HOME>\lib directory, or in the path specified by the -Xbootclasspath option, and that the virtual machine recognizes (recognition is by file name, such as rt.jar).
  2. Extension ClassLoader: implemented by ExtClassLoader (sun.misc.Launcher$ExtClassLoader). It loads the class libraries in the <JAVA_HOME>\lib\ext directory or in the path specified by the java.ext.dirs system property.
  3. Application ClassLoader: implemented by AppClassLoader (sun.misc.Launcher$AppClassLoader). It is also called the system class loader, because it is the loader returned by ClassLoader.getSystemClassLoader(). It loads the class libraries on the user's classpath. Unless the application defines its own class loader, this is normally the default class loader.
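
A quick way to see which of these loaders is responsible for a given class is to ask the class itself. The following is a small sketch with hypothetical names (WhoLoadedIt, loaderOf); the only assumption beyond the text is that Class.getClassLoader() returns null for classes loaded by the bootstrap loader.

```java
public class WhoLoadedIt {
    // Returns "bootstrap" for core classes, otherwise the loader's class name.
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("java.lang.String -> " + loaderOf(String.class));
        System.out.println("WhoLoadedIt      -> " + loaderOf(WhoLoadedIt.class));
        System.out.println("system loader    -> "
                + ClassLoader.getSystemClassLoader().getClass().getName());
    }
}
```

The loader class names that are printed depend on the JDK version: on JDK 8 they are the sun.misc.Launcher inner classes named above, while later versions use different internal classes.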

Hierarchical relationships between class loaders

Bootstrap class loader > Extension class loader > Application class loader > Custom class loader

public class ClassLoaderTest {
	
	public static void main(String[] args) {
		
		Thread thread = new Thread();
		
		ClassLoader appClassLoader = thread.getContextClassLoader();
		ClassLoader extClassLoader = appClassLoader.getParent();
		ClassLoader booClassLoader = extClassLoader.getParent();
		
		// e.g. sun.misc.Launcher$AppClassLoader@73d16e93
		System.out.println("Application class loader:" + appClassLoader);
		// e.g. sun.misc.Launcher$ExtClassLoader@15db9742
		System.out.println("Extension class loader:" + extClassLoader);
		// null
		System.out.println("Bootstrap class loader:" + booClassLoader);
		/*
			The parent of ExtClassLoader is printed as null because the bootstrap
			loader is implemented in native code inside the JVM. There is no Java
			object that represents it, so getParent() returns null.
		*/
	}
}

Parent delegation model - concept

This hierarchical relationship between class loaders is called the parent delegation model.
The parent delegation model requires every class loader except the top-level Bootstrap ClassLoader to have a parent class loader. The parent-child relationship between class loaders is generally implemented by composition rather than by inheritance.

Parent delegation model - work process

When a class loader receives a request to load a class, it does not try to load the class itself first. Instead it delegates the request to its parent class loader, and so on up each level, so every request eventually reaches the Bootstrap ClassLoader.
Only when the parent loader reports that it cannot load the class does the child loader try to load it itself.
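
The order of attempts can be traced with two loaders that do nothing except record when their own findClass() is reached. This is a minimal sketch, not the article's code: DelegationTrace and TracingLoader are hypothetical names, and the class name no.such.Type is assumed not to exist on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationTrace {

    // A loader that cannot load anything itself; it only records that it was asked.
    static class TracingLoader extends ClassLoader {
        private final String label;
        private final List<String> order;

        TracingLoader(String label, ClassLoader parent, List<String> order) {
            super(parent);
            this.label = label;
            this.order = order;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            order.add(label); // reached only after the parent chain has failed
            throw new ClassNotFoundException(name);
        }
    }

    static List<String> trace(String className) {
        List<String> order = new ArrayList<>();
        ClassLoader parent = new TracingLoader("parent", ClassLoader.getSystemClassLoader(), order);
        ClassLoader child = new TracingLoader("child", parent, order);
        try {
            child.loadClass(className);
            order.add("loaded");
        } catch (ClassNotFoundException e) {
            order.add("not found");
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(trace("no.such.Type"));     // [parent, child, not found]
        System.out.println(trace("java.lang.String")); // [loaded]
    }
}
```

Although the request is made on the child, the parent's findClass() is recorded first, and for java.lang.String neither custom loader is consulted because a loader higher up already supplies the class.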

Parent delegation model - benefits

Java classes acquire a priority hierarchy along with their class loaders, which keeps the basic classes unified.

The parent delegation model solves the problem of keeping the basic classes consistent across class loaders: the more basic a class is, the higher up the loader that loads it.

For example, java.lang.Object is stored in rt.jar. Suppose you write another java.lang.Object and put it on the classpath. The program compiles, but because of the parent delegation model the Object in rt.jar takes priority over the one on the classpath: the Object in rt.jar is loaded by the bootstrap class loader, while the one on the classpath would be loaded by the application class loader. Because the Object in rt.jar has the higher priority, every Object in the program is that same class.
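
The unity of the basic classes can be checked directly: two unrelated loaders that follow the delegation model hand back the very same Class object for a core class. This is a sketch with hypothetical names (CoreClassUnity, PlainLoader), assuming both loaders simply inherit the default loadClass() behavior.

```java
public class CoreClassUnity {

    // A loader with no behavior of its own; it only delegates to its parent.
    static class PlainLoader extends ClassLoader {
        PlainLoader(ClassLoader parent) {
            super(parent);
        }
    }

    // True if two separate loaders resolve the name to the same Class object
    // and that class comes from the bootstrap loader.
    static boolean sameCoreClass(String name) {
        try {
            ClassLoader a = new PlainLoader(ClassLoader.getSystemClassLoader());
            ClassLoader b = new PlainLoader(ClassLoader.getSystemClassLoader());
            Class<?> viaA = a.loadClass(name);
            Class<?> viaB = b.loadClass(name);
            return viaA == viaB && viaA.getClassLoader() == null;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameCoreClass("java.lang.Object")); // true
    }
}
```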

Parent delegation principle

  1. It avoids loading a class twice. If the parent loader has already loaded a class, the child loader does not need to load it again
  2. It is more secure, because it keeps the basic classes the same for every class loader. Without it, a user-defined class loader could load its own versions of core API classes at will, which would be a security risk.

Code implementation of the parent delegation model

The code that implements the parent delegation model is concentrated in the loadClass() method of java.lang.ClassLoader

1. Check whether the class has already been loaded. If not, call the loadClass() method of the parent class loader;
2. If the parent class loader is null, use the bootstrap class loader as the parent by default;
3. If the parent fails to load the class and throws ClassNotFoundException, call the loader's own findClass() method.
// loadClass() source code (as found in older JDK versions)
protected synchronized Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    // First, check whether the class has already been loaded
    Class<?> c = findLoadedClass(name);
    if (c == null) { // Not loaded yet
        try {
            if (parent != null) { // A parent loader exists
                // Delegate to the parent loader's loadClass() method
                c = parent.loadClass(name, false);
            } else { // No parent loader
                // Use the bootstrap class loader as the parent by default
                c = findBootstrapClass0(name);
            }
        } catch (ClassNotFoundException e) {
            // The parent could not load the class and threw ClassNotFoundException,
            // so call this loader's own findClass(); custom loaders usually override it
            c = findClass(name);
        }
    }
    if (resolve) { // Link the class if requested (this does not initialize it)
        resolveClass(c);
    }
    return c;
}

Custom class loader

First, we define an ordinary Java class to be loaded, Person.java, and put it in the com.dream.test package:

package com.dream.test;

public class Person {
    
	private String name;

	public Person() {}

	public Person(String name) {
		this.name = name;
	}

	public String getName() {
		return name;
	}

	public void setName(String name) {
		this.name = name;
	}

	@Override
	public String toString() {
		return "Person [name=" + name + "]";
	}
}

Note:

If you create the class directly in the current project, then after Person.java has been compiled, copy Person.class somewhere else and delete Person.java from the project. If Person.class stays in the current project, the parent delegation model means it will be loaded by the sun.misc.Launcher$AppClassLoader class loader. For our custom class loader to load it, Person.class has to live in a different directory.

In this example, Person.class is stored at: C:\code\com\dream\test\Person.class

The next step is to customize our class loader:

package com.dream.test;

import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;

public class ClassLoadTest {

	public static void main(String[] args) throws Exception {
		
		// Create a custom class loader object
		MyClassLoader classLoader = new MyClassLoader("C:/code");
		// Get the Class object of the class from its fully qualified name
		Class<?> c = classLoader.loadClass("com.dream.test.Person");
		// Create an instance (Class.newInstance() is deprecated since Java 9)
		Object obj = c.getDeclaredConstructor().newInstance();
		
		System.out.println(obj);
		System.out.println(obj.getClass().getClassLoader()); // e.g. com.dream.test.MyClassLoader@6d06d69c
	}
}

class MyClassLoader extends ClassLoader{
	
	private String classPath;

	public MyClassLoader(){}

	public MyClassLoader(String classPath) {
        this.classPath = classPath;
    }

	// Convert a name such as "com.dream.test.Person" into a Class object
	@Override
	protected Class<?> findClass(String name) throws ClassNotFoundException {
		
		String path = name.replace('.', '/'); // com/dream/test/Person
		File file = new File(classPath, path + ".class"); // C:\code\com\dream\test\Person.class
		
		try {
			// Read the class file into a byte array
			byte[] bytes = getClassBytes(file);
			
			// Turn the byte array into a Class object
			return this.defineClass(name, bytes, 0, bytes.length);
		} catch (Exception e) {
			// Report the failure to the caller instead of only printing it
			throw new ClassNotFoundException(name, e);
		}
	}

	// Read a .class file and return its contents as a byte array
	public byte[] getClassBytes(File file) throws Exception {
		
		// try-with-resources closes the stream even if reading fails
		try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file))) {
			ByteArrayOutputStream baos = new ByteArrayOutputStream();
			byte[] buf = new byte[1024];
			int len;
			while ((len = bis.read(buf)) != -1) {
				baos.write(buf, 0, len);
			}
			return baos.toByteArray();
		}
	}
}

Application scenarios for a custom class loader

Example: the Tomcat container. Each web application has its own ClassLoader that loads the classes on that application's classpath. When a class from one of the JAR files shipped with Tomcat is needed, the request is delegated to the CommonClassLoader.

  1. Encryption: Java bytecode is easy to decompile. To protect your code against decompilation, you can encrypt the compiled class files with an encryption algorithm. Once a class file is encrypted, the standard Java class loaders can no longer load it, so you need a custom ClassLoader that decrypts the class before loading it.
  2. Loading code from non-standard sources: if your bytecode is stored in a database or even in the cloud, a custom class loader can load classes from that source.

In practice the two cases often occur together. For example, an application may need to transfer Java class bytecode over the network, and for security the bytecode is encrypted. A custom class loader then reads the encrypted bytecode from a network address, decrypts and verifies it, and finally defines the class so that it can run in the Java virtual machine.
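
A rough sketch of such a loader follows. Everything in it is an assumption made for illustration: the names EncryptedLoaderDemo, Payload and DecryptingLoader are hypothetical, the XOR step is only a placeholder for a real cipher, and an in-memory map stands in for the network or database. To obtain bytes to "encrypt", the demo reads the compiled class file of Payload from the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

public class EncryptedLoaderDemo {

    // Hypothetical class standing in for the protected code.
    public static class Payload {
        public static String greet() {
            return "hello from Payload";
        }
    }

    // Placeholder "cipher": XOR with a fixed key. Not real encryption.
    static byte[] xor(byte[] data, byte key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    // Decrypts bytes from a non-standard source (here a map) before defining the class.
    static class DecryptingLoader extends ClassLoader {
        private final Map<String, byte[]> store;
        private final byte key;

        DecryptingLoader(Map<String, byte[]> store, byte key) {
            super(null); // bootstrap-only parent, so the classpath copy is not found first
            this.store = store;
            this.key = key;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            byte[] encrypted = store.get(name);
            if (encrypted == null) {
                throw new ClassNotFoundException(name);
            }
            byte[] plain = xor(encrypted, key);
            return defineClass(name, plain, 0, plain.length);
        }
    }

    // Reads the compiled class file of the given type from the classpath.
    static byte[] readClassFile(Class<?> type) throws IOException {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    static Class<?> loadEncryptedPayload() {
        try {
            byte key = 0x5A;
            Map<String, byte[]> store = new HashMap<>();
            store.put(Payload.class.getName(), xor(readClassFile(Payload.class), key));
            return new DecryptingLoader(store, key).loadClass(Payload.class.getName());
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> c = loadEncryptedPayload();
        System.out.println(c.getMethod("greet").invoke(null));
        System.out.println(c.getClassLoader().getClass().getName());
        System.out.println(c == Payload.class); // false: same name, different loader
    }
}
```

Only findClass() is overridden, so the delegation logic in loadClass() stays intact; the parent is set to the bootstrap loader purely so that this demo does not pick up the unencrypted copy from the classpath.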
