1. General
Reprinted from: "The parent delegation model and Flink's class loading strategies (child-first and parent-first)". I recall writing a similar article before but cannot find it, so I am reprinting this one with some additions.
2. Class loading
As we know, in the JVM the process of loading a class can be roughly divided into five stages: loading, linking (verification, preparation, and resolution), and initialization. When we speak of "class loading", we usually mean that a class loader uses the fully qualified name of a class to obtain the binary bytecode stream that defines it, and then constructs the class definition from that stream. As a JVM-based framework, Flink provides the classloader.resolve-order parameter in flink-conf.yaml to control the class loading strategy. The options are child-first (the default) and parent-first. This article briefly analyzes what lies behind this parameter.
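To make the difference between the loading and initialization stages concrete, here is a minimal sketch. It is not Flink code: the class names LoadVsInitSketch and Lazy are made up for the illustration. It relies on the standard Class.forName(name, initialize, loader) overload, where initialize=false asks the JVM to load the class without running its static initializers.

```java
class LoadVsInitSketch {
    // Set by Lazy's static initializer, i.e. during the initialization stage.
    static volatile boolean lazyInitialized = false;

    // Hypothetical class used only to observe when initialization happens.
    static class Lazy {
        static {
            lazyInitialized = true;
        }
    }

    // Cached so that repeated calls report the first observation.
    private static boolean[] observed;

    /** Returns {initialized after loading only, initialized after forName(..., true, ...)}. */
    static synchronized boolean[] observe() {
        if (observed == null) {
            String name = LoadVsInitSketch.class.getName() + "$Lazy";
            ClassLoader loader = LoadVsInitSketch.class.getClassLoader();
            try {
                // Load the class by its fully qualified name, without initializing it.
                Class.forName(name, false, loader);
                boolean afterLoad = lazyInitialized;
                // Request initialization: the static initializer runs now.
                Class.forName(name, true, loader);
                observed = new boolean[] {afterLoad, lazyInitialized};
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
        return observed.clone();
    }

    public static void main(String[] args) {
        boolean[] result = observe();
        System.out.println("initialized after loading only: " + result[0]); // false
        System.out.println("initialized after initialization: " + result[1]); // true
    }
}
```

The class loaders discussed in the rest of this article are concerned with the first stage only: finding the bytes for a name and defining the class.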
3. Parent first class loading strategy
ParentFirstClassLoader and ChildFirstClassLoader both extend the abstract class FlinkUserCodeClassLoader. Let's look at this abstract class first; the code is very short.
    import java.net.URL;
    import java.net.URLClassLoader;
    import java.util.function.Consumer;

    public abstract class FlinkUserCodeClassLoader extends URLClassLoader {
        public static final Consumer<Throwable> NOOP_EXCEPTION_HANDLER = classLoadingException -> {};

        private final Consumer<Throwable> classLoadingExceptionHandler;

        protected FlinkUserCodeClassLoader(URL[] urls, ClassLoader parent) {
            this(urls, parent, NOOP_EXCEPTION_HANDLER);
        }

        protected FlinkUserCodeClassLoader(
                URL[] urls,
                ClassLoader parent,
                Consumer<Throwable> classLoadingExceptionHandler) {
            super(urls, parent);
            this.classLoadingExceptionHandler = classLoadingExceptionHandler;
        }

        // Final entry point: delegates to the overridable method below and
        // reports any failure to the handler before rethrowing it.
        @Override
        protected final Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            try {
                return loadClassWithoutExceptionHandling(name, resolve);
            } catch (Throwable classLoadingException) {
                classLoadingExceptionHandler.accept(classLoadingException);
                throw classLoadingException;
            }
        }

        // Default behavior: the standard (parent-first) loadClass() of the superclass.
        protected Class<?> loadClassWithoutExceptionHandling(String name, boolean resolve)
                throws ClassNotFoundException {
            return super.loadClass(name, resolve);
        }
    }
FlinkUserCodeClassLoader extends URLClassLoader. Because the user code of a Flink application is only known at runtime, it makes sense to locate the class for a fully qualified name in the JAR files through their URLs. ParentFirstClassLoader is just an empty class that extends FlinkUserCodeClassLoader.
    public static class ParentFirstClassLoader extends FlinkUserCodeClassLoader {

        ParentFirstClassLoader(
                URL[] urls,
                ClassLoader parent,
                Consumer<Throwable> classLoadingExceptionHandler) {
            super(urls, parent, classLoadingExceptionHandler);
        }

        static {
            ClassLoader.registerAsParallelCapable();
        }
    }
In effect, ParentFirstClassLoader simply uses the loadClass() logic inherited from its superclass. The hierarchy of class loaders in the JVM and the logic of the default loadClass() method together embody the parent delegation model. To recap what that means:
When a class loader is asked to load a class, it does not try to load the class itself first; instead it delegates the request to its parent loader. Every loading request is therefore eventually passed up to the top-level bootstrap class loader. Only when the parent loader cannot load the class does the child loader try to load it itself.
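The hierarchy described above can be inspected directly. The following is a small sketch (the class name DelegationChainSketch is made up) that walks from the loader that defined a class up through getParent(). The JDK represents the bootstrap loader as null, so the sketch prints the word "bootstrap" in its place. The names of the intermediate loaders depend on the JDK version and on how the program is launched, so no output is claimed for them.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationChainSketch {

    /** Lists the defining loader of a class followed by all of its ancestors. */
    static List<String> chainOf(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // A null loader means the bootstrap class loader.
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        // Core classes such as String are defined by the bootstrap loader.
        System.out.println(chainOf(String.class)); // [bootstrap]
        // An application class shows the whole chain it delegates through.
        System.out.println(chainOf(DelegationChainSketch.class));
    }
}
```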
It can be seen that Flink's parent-first strategy simply follows the parent delegation model. In other words, user code is loaded by a custom class loader whose parent is the application class loader that loads the Flink framework itself. A class requested by user code is first looked up through the Flink framework's class loader, and only if that fails is it loaded by the user-code class loader. By default, however, Flink does not use the parent-first strategy but the child-first strategy described next.
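Parent-first behavior can be observed with a plain URLClassLoader, the class that FlinkUserCodeClassLoader builds on. The sketch below is not Flink code: UrlLookupSketch, the temporary directory standing in for a user JAR, and the class name com.example.MissingJob are all assumptions made for the illustration. It creates a loader that searches a single location and reports whether a class was found there, found through the parent, or not found at all.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

class UrlLookupSketch {

    /** Loads className through a URLClassLoader that searches only jarOrDir. */
    static String probe(Path jarOrDir, String className) {
        try (URLClassLoader loader =
                new URLClassLoader(
                        new URL[] {jarOrDir.toUri().toURL()},
                        ClassLoader.getSystemClassLoader())) {
            Class<?> loaded = loader.loadClass(className);
            // The defining loader tells us who actually supplied the class.
            return loaded.getClassLoader() == loader ? "found in URL" : "found via parent";
        } catch (ClassNotFoundException e) {
            return "not found";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates an empty directory that stands in for a user JAR. */
    static Path emptyUserCodeDir() {
        try {
            return Files.createTempDirectory("user-code");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = emptyUserCodeDir();
        // The parent chain is asked first and already knows this class.
        System.out.println(probe(dir, "java.util.ArrayList"));
        // Neither the parent chain nor the empty directory has this class.
        System.out.println(probe(dir, "com.example.MissingJob"));
    }
}
```

With a real JAR in place of the empty directory, a class that exists both in the JAR and on the parent's classpath would still be reported as "found via parent", which is exactly the situation the child-first strategy is meant to change.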
4. Child first class loading strategy
We have seen that the advantage of the parent delegation model is that the hierarchy of class loaders gives the loaded classes a matching hierarchy, which protects the Java runtime environment. However, in the complex environment of a Flink application, the parent delegation model is not always suitable.
For example, the Flink Cassandra connector used in a program depends on one fixed Cassandra version. To be compatible with the Cassandra version actually deployed, the user code may bring in a lower or higher version of that dependency. Different versions of the same component may define a class differently even though its fully qualified name is the same. If the parent delegation model is still used, the version specified by the Flink framework is loaded first, which can cause hard-to-diagnose compatibility problems such as NoSuchMethodError or IllegalAccessError.
For this reason, Flink implements ChildFirstClassLoader and uses it as the default strategy. It breaks the parent delegation model so that classes in user code are loaded first. The official documentation calls this "Inverted Class Loading". The code is again not long and is reproduced below.
    import java.io.IOException;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.Enumeration;
    import java.util.Iterator;
    import java.util.List;
    import java.util.function.Consumer;

    public final class ChildFirstClassLoader extends FlinkUserCodeClassLoader {

        /**
         * The classes that should always go through the parent ClassLoader. This is relevant for Flink
         * classes, for example, to avoid loading Flink classes that cross the user-code/system-code
         * barrier in the user-code ClassLoader.
         */
        private final String[] alwaysParentFirstPatterns;

        public ChildFirstClassLoader(
                URL[] urls,
                ClassLoader parent,
                String[] alwaysParentFirstPatterns,
                Consumer<Throwable> classLoadingExceptionHandler) {
            super(urls, parent, classLoadingExceptionHandler);
            this.alwaysParentFirstPatterns = alwaysParentFirstPatterns;
        }

        @Override
        protected Class<?> loadClassWithoutExceptionHandling(String name, boolean resolve)
                throws ClassNotFoundException {

            // First, check if the class has already been loaded
            Class<?> c = findLoadedClass(name);

            if (c == null) {
                // check whether the class should go parent-first
                for (String alwaysParentFirstPattern : alwaysParentFirstPatterns) {
                    if (name.startsWith(alwaysParentFirstPattern)) {
                        return super.loadClassWithoutExceptionHandling(name, resolve);
                    }
                }

                try {
                    // check the URLs
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // let URLClassLoader do it, which will eventually call the parent
                    c = super.loadClassWithoutExceptionHandling(name, resolve);
                }
            } else if (resolve) {
                resolveClass(c);
            }

            return c;
        }

        @Override
        public URL getResource(String name) {
            // first, try and find it via the URLClassloader
            URL urlClassLoaderResource = findResource(name);

            if (urlClassLoaderResource != null) {
                return urlClassLoaderResource;
            }

            // delegate to super
            return super.getResource(name);
        }

        @Override
        public Enumeration<URL> getResources(String name) throws IOException {
            // first get resources from URLClassloader
            Enumeration<URL> urlClassLoaderResources = findResources(name);

            final List<URL> result = new ArrayList<>();

            while (urlClassLoaderResources.hasMoreElements()) {
                result.add(urlClassLoaderResources.nextElement());
            }

            // get parent urls
            Enumeration<URL> parentResources = getParent().getResources(name);

            while (parentResources.hasMoreElements()) {
                result.add(parentResources.nextElement());
            }

            return new Enumeration<URL>() {
                Iterator<URL> iter = result.iterator();

                public boolean hasMoreElements() {
                    return iter.hasNext();
                }

                public URL nextElement() {
                    return iter.next();
                }
            };
        }

        static {
            ClassLoader.registerAsParallelCapable();
        }
    }
The core logic is in the loadClassWithoutExceptionHandling() method, which works as follows:
- Call findLoadedClass() to check whether the class with the given fully qualified name has already been loaded. If it has not, continue.
- Check whether the name of the class starts with one of the prefixes in alwaysParentFirstPatterns. If so, call the corresponding superclass method to load it in the parent-first manner.
- Otherwise, call findClass() to look for the class definition in the user code (URLClassLoader provides the implementation of this method). If it is not found, fall back to the superclass method, which delegates to the parent loader.
- Return the resulting Class object. Note that in the code above resolveClass() is called explicitly only when the class had already been loaded and the resolve parameter is true; on the other paths the resolve flag is passed on to the superclass method.
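The steps above can be tried out with a simplified, standalone loader. This is a sketch and not Flink's implementation: ChildFirstSketch, its lastRoute field, and the routeFor() helper are invented for the illustration, it extends URLClassLoader directly, and unlike the Flink code it resolves on every path. It is created with an empty URL list, so findClass() always misses and the fallback branch can be observed without any user JAR.

```java
import java.net.URL;
import java.net.URLClassLoader;

class ChildFirstSketch extends URLClassLoader {
    private final String[] parentFirstPrefixes;

    /** Records which branch served the most recent request (illustration only). */
    String lastRoute = "none";

    ChildFirstSketch(URL[] urls, ClassLoader parent, String... parentFirstPrefixes) {
        super(urls, parent);
        this.parentFirstPrefixes = parentFirstPrefixes;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Step 1: reuse the class if this loader has already loaded it.
            Class<?> c = findLoadedClass(name);
            if (c != null) {
                lastRoute = "already loaded";
            } else {
                // Step 2: names matching a prefix use the standard parent-first path.
                for (String prefix : parentFirstPrefixes) {
                    if (name.startsWith(prefix)) {
                        lastRoute = "parent-first pattern";
                        return super.loadClass(name, resolve);
                    }
                }
                try {
                    // Step 3: look in this loader's own URLs first.
                    c = findClass(name);
                    lastRoute = "child";
                } catch (ClassNotFoundException e) {
                    // Step 3 (fallback): let the parent chain try.
                    lastRoute = "parent fallback";
                    c = super.loadClass(name, resolve);
                }
            }
            // Step 4: link the class if the caller asked for it.
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Loads className with an empty URL list and reports the branch that was taken. */
    static String routeFor(String className, String... prefixes) {
        try (ChildFirstSketch loader =
                new ChildFirstSketch(new URL[0], ClassLoader.getSystemClassLoader(), prefixes)) {
            loader.loadClass(className);
            return loader.lastRoute;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(routeFor("java.lang.String", "java.")); // parent-first pattern
        System.out.println(routeFor("java.util.ArrayList"));       // parent fallback
    }
}
```

With a non-empty URL list containing a user JAR, a class present in that JAR and not matched by any prefix would take the "child" route even if the parent could also supply it.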
It can be seen that the child-first strategy skips the step of "delegating the loading request to the parent loader first". Only certain classes must still follow the old rule: the classes matched by alwaysParentFirstPatterns are the foundations of Java, Flink, and other core components and must not be overridden by user code. The patterns are specified by the following two parameters:
classloader.parent-first-patterns.default: modifying it is not recommended. By default it has the following value:
java.; scala.; org.apache.flink.; com.esotericsoftware.kryo; org.apache.hadoop.; javax.annotation.; org.slf4j; org.apache.log4j; org.apache.logging; org.apache.commons.logging; ch.qos.logback; org.xml; javax.xml; org.apache.xerces; org.w3c
classloader.parent-first-patterns.additional: if, beyond the classes matched by the previous parameter, the user wants further classes to be loaded through parent delegation instead of child-first, their prefixes can be listed here (separated by semicolons).
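Because the check in ChildFirstClassLoader is a plain name.startsWith(pattern), each entry is a prefix and not a package name. The sketch below illustrates how two semicolon-separated settings could be merged and matched. It is only an approximation for illustration: ParentFirstPatternsSketch and its methods are made up, the default list is shortened, and com.example.shared. is a hypothetical additional entry.

```java
import java.util.ArrayList;
import java.util.List;

class ParentFirstPatternsSketch {

    /** Splits semicolon-separated settings into one list of non-empty prefixes. */
    static List<String> parse(String... settings) {
        List<String> patterns = new ArrayList<>();
        for (String setting : settings) {
            for (String part : setting.split(";")) {
                String prefix = part.trim();
                // Skip empty entries: an empty prefix would match every class name.
                if (!prefix.isEmpty()) {
                    patterns.add(prefix);
                }
            }
        }
        return patterns;
    }

    /** Same test as in ChildFirstClassLoader: a simple prefix match on the class name. */
    static boolean isParentFirst(String className, List<String> patterns) {
        for (String prefix : patterns) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String defaults = "java.; scala.; org.apache.flink.; org.slf4j"; // shortened
        String additional = "com.example.shared.";                       // hypothetical
        List<String> patterns = parse(defaults, additional);

        System.out.println(isParentFirst("org.apache.flink.api.common.JobID", patterns)); // true
        System.out.println(isParentFirst("com.example.shared.Model", patterns));          // true
        System.out.println(isParentFirst("com.datastax.driver.core.Cluster", patterns));  // false
        // A prefix without a trailing dot also matches longer package names.
        System.out.println(isParentFirst("org.slf4jx.Foo", patterns));                    // true
    }
}
```

The last line shows why most entries end with a dot: an entry such as org.slf4j matches every class name that begins with that string.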
5. Cases
A related case: [Flink] After upgrading from Flink 1.9 to 1.12.4, a job runs locally, but once it is packaged and submitted to the cluster it fails with ClassNotFoundException because a class cannot be found.