From Troubleshooting an Online Fastjson Problem with Arthas to Java Dynamic Bytecode Technology (Part 2)

Posted by zuzupus on Wed, 02 Feb 2022 02:55:37 +0100

In the previous article, a post-mortem of a production incident introduced Arthas, a Java diagnostic tool from the "Fubao factory" (Alibaba) that is built on Java dynamic bytecode technology. There is no need to elaborate on how to use Arthas: the official documentation gets you started quickly, and there are many ways to use it. The previous article covered only one usage scenario: inspecting class information inside a running production JVM, using the watch command to view a method's inputs and outputs, decompiling a class online, and hot-deploying it directly after editing the Java source (doge).

What, how, why: that is the most basic path into any technology. Once you know what Arthas is and how to use it, the natural next step is to understand how it works and the principle underneath (the why). The most direct way to do that is to read the source code. Code does not lie, whereas any technical article or book carries some "noise". So this article starts from the Arthas source code and goes on to Java Agent, Instrument, and dynamic bytecode technology.

I have always avoided writing source-code walkthroughs. Source code is easy to read, but writing about it means pasting a lot of code, which takes up a lot of space and easily turns into a running log. Still, excellent domestic projects such as Arthas and Dubbo are worth the effort.
Given limited time and energy, this article focuses on the Java Agent; Instrument and the other dynamic bytecode techniques are left for the next article.

Arthas source code

I remember that when I first got the Arthas source code, what impressed me most was the lovely TODO.md. Ha ha, it reads like a "Fuwa" colleague writing handover notes:

* The code is still messy and needs further refactoring
* Dependencies need to be cleaned up. There are several problems:
    * None of the Apache commons libraries should be needed
    * There are several copies of the JSON library
    * See whether `jopt-simple` can be replaced by `cli`
    * The artifactId and version of `cli` and `termd` need some thought. Should they simply be brought into the project? Their dependencies also need a closer look
* termd depends on netty, which feels a little heavy, and the first attach is slow. It is not clear whether that is a netty problem or an attach problem
* At present the web console depends on the term.js and css bundled with termd. It needs to be beautified, and we need to think about how to integrate it into the R&D portal
* Because there is no Java client any more, batch mode is gone as well
* The capabilities of `com.taobao.arthas.core.shell.session.Session` need to be benchmarked against the previous session implementation. In particular:
    * Is textmode really needed? I think this should be an option
    * Is encoding really needed? I think it should still be defined as an option even if it is, because I think it should simply be UTF-8
    * duration should be displayed, and perhaps the list of sessions as well
    * Check carefully whether session expiration behaves as expected
    * When several people work together, is the session shared among them?
* All commands now implement AnnotatedCommand. What still needs to be improved:
    * The formatted output in Help was deleted. A unified format needs to be defined for `@Description`
    * The log of command input and output (record logger) was deleted and needs to be implemented again. Output now goes through `CommandProcess`, so the logging belongs in the `CommandProcess` implementation
* `com.taobao.arthas.core.GlobalOptions` looks rather odd. It feels like something OptionCommand should do
* `com.taobao.arthas.core.config.Configure` needs to be cleaned up, especially the http-related parts
* Later fixes on the develop branch need to be merged
* The TODO/FIXME items in the code

Back to the topic. Go straight to the arthas-core module first, because it is the entry point of the whole of Arthas, that is, what runs when you execute java -jar arthas-core.jar. The most important task of its main function is to attach to the target JVM by the pid of the Java process and load the agent:

private void attachAgent(Configure configure) throws Exception {
    VirtualMachineDescriptor virtualMachineDescriptor = null;
    for (VirtualMachineDescriptor descriptor : VirtualMachine.list()) {
        String pid = descriptor.id();
        if (pid.equals(Long.toString(configure.getJavaPid()))) {
            virtualMachineDescriptor = descriptor;
            break;
        }
    }
    VirtualMachine virtualMachine = null;
    try {
        if (null == virtualMachineDescriptor) { // no descriptor found: use the attach(String pid) overload
            virtualMachine = VirtualMachine.attach("" + configure.getJavaPid());
        } else {
            virtualMachine = VirtualMachine.attach(virtualMachineDescriptor);
        }
        // ... (omitted)
        virtualMachine.loadAgent(arthasAgentPath, configure.getArthasCore() + ";" + configure.toString());
    } finally {
        if (null != virtualMachine) {
            virtualMachine.detach();
        }
    }
}

This is where Java Agent technology comes in.

Java Agent

The Java Agent can be called a back door into the JVM. The dynamic compilation, hot deployment, APM monitoring, and trace (call-chain) analysis tools we use every day are all implemented through the Java Agent or built on Java Instrument. Let's take Arthas as an example and see how it does this.

Open the arthas-agent module and you can find some clues in its pom. The following four manifest entries are the key:

<manifestEntries>
    <Premain-Class>com.taobao.arthas.agent334.AgentBootstrap</Premain-Class>
    <Agent-Class>com.taobao.arthas.agent334.AgentBootstrap</Agent-Class>
    <Can-Redefine-Classes>true</Can-Redefine-Classes>
    <Can-Retransform-Classes>true</Can-Retransform-Classes>
</manifestEntries>

You can see that the manifest refers to AgentBootstrap. Open AgentBootstrap and you will find a private main function, which is what the two public static methods actually call:

public static void premain(String args, Instrumentation inst) {
    main(args, inst);
}
public static void agentmain(String args, Instrumentation inst) {
    main(args, inst);
}
private static synchronized void main(String args, final Instrumentation inst) {
    // ... (omitted)
}

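These two methods, premain and agentmain, correspond to the Premain-Class and Agent-Class entries defined in the manifest.

They are the two forms a Java Agent can take: a static agent (premain, loaded at JVM startup with the -javaagent option) and a dynamic agent (agentmain, loaded into an already running JVM through the Attach API, as attachAgent does above).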
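As a rough illustration of the two entry points, here is a minimal sketch of an agent class. `DemoAgent`, its `CALLS` list and the `start` helper are hypothetical names invented for this example and are not part of Arthas; the sketch only mirrors the shape of AgentBootstrap, where both entry points delegate to one private synchronized method.

```java
import java.lang.instrument.Instrumentation;
import java.util.ArrayList;
import java.util.List;

// Hypothetical agent class for illustration; it is not part of Arthas.
// It is package-private only to keep the sketch in one file; a real agent
// class is normally declared public.
final class DemoAgent {
    // Records which entry point ran, so the behaviour can be observed.
    static final List<String> CALLS = new ArrayList<>();

    // Static agent: called by the JVM before the application's main() when the
    // process is started with -javaagent:<jar>=<args> and the jar's manifest
    // names this class in Premain-Class.
    public static void premain(String args, Instrumentation inst) {
        start("premain", args, inst);
    }

    // Dynamic agent: called by the JVM when the jar is loaded into a running
    // process with VirtualMachine.loadAgent(jar, args) and the manifest names
    // this class in Agent-Class.
    public static void agentmain(String args, Instrumentation inst) {
        start("agentmain", args, inst);
    }

    // Shared start-up logic; synchronized so two concurrent loads cannot race.
    private static synchronized void start(String entry, String args, Instrumentation inst) {
        CALLS.add(entry + ":" + args);
    }

    public static void main(String[] ignored) {
        // Outside a real agent there is no Instrumentation instance, so pass null.
        premain("mode=static", null);
        agentmain("mode=dynamic", null);
        System.out.println(CALLS);
    }
}
```

In a real deployment neither method is called by hand: the JVM picks one of them depending on how the agent jar was loaded, which is why a single jar can declare both manifest entries.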

Besides args, the parameters include an Instrumentation instance, which is Java's entry point for bytecode enhancement. Whatever feature an agent provides, underneath it is dynamically woven in by enhancing the original bytecode. I won't go deeper into Instrument here; the next article covers it.

Back to the main function: the most important thing it does is perform the bind action.

private static synchronized void main(String args, final Instrumentation inst) {
    // ... (omitted)
    final ClassLoader agentLoader = getClassLoader(inst, arthasCoreJarFile);
    Thread bindingThread = new Thread() {
        @Override
        public void run() {
            try {
                bind(inst, agentLoader, agentArgs);
            } catch (Throwable throwable) {
                throwable.printStackTrace(ps);
            }
        }
    };
    // ... (omitted)
}
private static void bind(Instrumentation inst, ClassLoader agentLoader, String args) throws Throwable {
    //ARTHAS_BOOTSTRAP => com.taobao.arthas.core.server.ArthasBootstrap
    Class<?> bootstrapClass = agentLoader.loadClass(ARTHAS_BOOTSTRAP);
    //GET_INSTANCE => getInstance()
    Object bootstrap = bootstrapClass.getMethod(GET_INSTANCE, Instrumentation.class, String.class).invoke(null, inst, args);
    //IS_BIND => boolean isBind()
    boolean isBind = (Boolean) bootstrapClass.getMethod(IS_BIND).invoke(bootstrap);
    if (!isBind) {
        String errorMsg = "Arthas server port binding failed! Please check $HOME/logs/arthas/arthas.log for more details.";
        ps.println(errorMsg);
        throw new RuntimeException(errorMsg);
    }
    ps.println("Arthas server already bind.");
}

This bind method loads ArthasBootstrap, the key class in the server package of the arthas-core module, through a custom ClassLoader (the code structure really is messy...). It then invokes two methods through reflection: getInstance, and isBind, which checks whether the bind succeeded.
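The load-then-reflect pattern can be sketched in isolation. The following is a minimal example under stated assumptions: `DemoBootstrap` and `ReflectiveBinder` are hypothetical stand-ins, the class loader is simply the one that loaded the demo class rather than a dedicated loader over a separate jar, and checked reflection errors are wrapped in an unchecked exception for brevity.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a bootstrap class that would live in another jar.
final class DemoBootstrap {
    private static DemoBootstrap instance;
    private final String args;

    private DemoBootstrap(String args) {
        this.args = args;
    }

    // Lazily created singleton, looked up by name through reflection below.
    public static synchronized DemoBootstrap getInstance(String args) {
        if (instance == null) {
            instance = new DemoBootstrap(args);
        }
        return instance;
    }

    // Pretend "server is bound" check; here it only reports that args were given.
    public boolean isBind() {
        return args != null;
    }
}

final class ReflectiveBinder {
    // Loads the class by name and talks to it only through reflection, so the
    // caller needs no compile-time dependency on the bootstrap class.
    static boolean bind(ClassLoader loader, String className, String args) {
        try {
            Class<?> bootstrapClass = loader.loadClass(className);
            Method getInstance = bootstrapClass.getMethod("getInstance", String.class);
            Object bootstrap = getInstance.invoke(null, args); // static method: null receiver
            return (Boolean) bootstrapClass.getMethod("isBind").invoke(bootstrap);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("bootstrap failed: " + className, e);
        }
    }

    public static void main(String[] ignored) {
        ClassLoader loader = DemoBootstrap.class.getClassLoader();
        System.out.println(bind(loader, DemoBootstrap.class.getName(), "telnetPort=3658"));
    }
}
```

Going through reflection is what lets the small agent class stay free of compile-time references to the core classes, which are only reachable through the separate loader.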

Look at ArthasBootstrap.getInstance and you can see that it is a lazily initialized singleton:

public synchronized static ArthasBootstrap getInstance(Instrumentation instrumentation, String args) throws Throwable {
    if (arthasBootstrap != null) {
        return arthasBootstrap;
    }
    Map<String, String> argsMap = FeatureCodec.DEFAULT_COMMANDLINE_CODEC.toMap(args);
    // ... (omitted)
    return getInstance(instrumentation, mapWithPrefix);
}
public synchronized static ArthasBootstrap getInstance(Instrumentation instrumentation, Map<String, String> args) throws Throwable {
    if (arthasBootstrap == null) {
        arthasBootstrap = new ArthasBootstrap(instrumentation, args);
    }
    return arthasBootstrap;
}

The fifth line parses the configuration arguments into a map, using custom separators.
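To make the idea of separator-based argument parsing concrete, here is a small sketch. `SimpleArgsCodec` is a hypothetical class written for this article, not the real FeatureCodec: it assumes one character between pairs and one between key and value, and it does not handle escaping. The keys and values in the usage example are only illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical, simplified codec; no escaping of separator characters.
final class SimpleArgsCodec {
    private final char pairSeparator;
    private final char kvSeparator;

    SimpleArgsCodec(char pairSeparator, char kvSeparator) {
        this.pairSeparator = pairSeparator;
        this.kvSeparator = kvSeparator;
    }

    // Turns "k1=v1;k2=v2" into an ordered map; empty or malformed pairs are skipped.
    Map<String, String> toMap(String text) {
        Map<String, String> map = new LinkedHashMap<>();
        if (text == null || text.isEmpty()) {
            return map;
        }
        String pairRegex = Pattern.quote(String.valueOf(pairSeparator));
        for (String pair : text.split(pairRegex)) {
            int idx = pair.indexOf(kvSeparator);
            if (idx <= 0) {
                continue; // no separator, or an empty key
            }
            map.put(pair.substring(0, idx), pair.substring(idx + 1));
        }
        return map;
    }

    public static void main(String[] ignored) {
        SimpleArgsCodec codec = new SimpleArgsCodec(';', '=');
        System.out.println(codec.toMap(";telnetPort=3658;httpPort=8563;"));
    }
}
```

Passing the whole configuration as one string is a natural fit for agents, because the JVM hands premain and agentmain a single String argument.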

Finally we come to the highlight: the ArthasBootstrap constructor.

private ArthasBootstrap(Instrumentation instrumentation, Map<String, String> args){
    initFastjson();    // omitted

    initSpy();    // 1. initSpy()

    initArthasEnvironment(args);    // omitted

    transformerManager = new TransformerManager(instrumentation); // 2. enhancement
    enhanceClassLoader();    // 2. enhancement

    initBeans();    // omitted

    bind(configure);    // 3. start the server

    shutdown = new Thread("as-shutdown-hooker") {
        @Override
        public void run() {
            ArthasBootstrap.this.destroy();
        }
    };

    Runtime.getRuntime().addShutdownHook(shutdown);
}

The first thing worth discussing is SpyAPI. Spy in Arthas is similar to AOP: it provides hooks that can be woven into methods at various entry points. For example:

public class SpyAPI {
    public static void atEnter(Class<?> clazz, String methodInfo, Object target, Object[] args) {
        spyInstance.atEnter(clazz, methodInfo, target, args);
    }
}
public class SpyImpl extends AbstractSpy {
    @Override
    public void atEnter(Class<?> clazz, String methodInfo, Object target, Object[] args) {
        ClassLoader classLoader = clazz.getClassLoader();

        String[] info = splitMethodInfo(methodInfo);
        String methodName = info[0];
        String methodDesc = info[1];

        List<AdviceListener> listeners = AdviceListenerManager.queryAdviceListeners(classLoader, clazz.getName(),
                methodName, methodDesc);
        if (listeners != null) {
            for (AdviceListener adviceListener : listeners) {
                try {
                    if (skipAdviceListener(adviceListener)) {
                        continue;
                    }
                    adviceListener.before(clazz, methodName, methodDesc, target, args);
                } catch (Throwable e) {
                    logger.error("class: {}, methodInfo: {}", clazz.getName(), methodInfo, e);
                }
            }
        }
    }
}

As you can see, the observer pattern is used here as well: all the AdviceListeners registered for the class and method are looked up first, and then the before method is called on each of them.
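The dispatch logic can be reduced to a small observer-pattern sketch. `EnterListener` and `ListenerRegistry` are hypothetical names used only here; the real AdviceListenerManager also keys listeners by class loader and method descriptor, which this example leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical observer interface: notified before a watched method runs.
interface EnterListener {
    void before(String className, String methodName, Object[] args) throws Throwable;
}

// Hypothetical registry keyed by "class#method".
final class ListenerRegistry {
    private static final Map<String, List<EnterListener>> LISTENERS = new ConcurrentHashMap<>();

    static void register(String className, String methodName, EnterListener listener) {
        LISTENERS.computeIfAbsent(key(className, methodName), k -> new CopyOnWriteArrayList<>())
                 .add(listener);
    }

    // The hook that woven code would call at method entry.
    static void atEnter(String className, String methodName, Object[] args) {
        List<EnterListener> listeners = LISTENERS.get(key(className, methodName));
        if (listeners == null) {
            return;
        }
        for (EnterListener listener : listeners) {
            try {
                listener.before(className, methodName, args);
            } catch (Throwable t) {
                // A failing listener must never break the observed application.
                System.err.println("listener failed for " + className + "#" + methodName + ": " + t);
            }
        }
    }

    static void clear() {
        LISTENERS.clear();
    }

    private static String key(String className, String methodName) {
        return className + "#" + methodName;
    }

    public static void main(String[] ignored) {
        List<String> seen = new ArrayList<>();
        register("demo.Service", "hello", (c, m, a) -> seen.add(c + "#" + m));
        atEnter("demo.Service", "hello", new Object[] {"world"});
        System.out.println(seen);
    }
}
```

Catching Throwable around each listener reflects the same design choice visible in SpyImpl: diagnostics code runs inside the business thread, so its failures are logged and swallowed rather than propagated.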

The next step is enhanceClassLoader(). This is where the transformer from dynamic bytecode technology comes in. I will set it aside for now; for the moment you only need to know that it holds the logic that actually does the dynamic weaving.

Last is the bind method. Underneath it is a server built on Netty, and the final isBind() call checks whether the server started successfully. Once the server is up, you can send commands to it from the client command line (CLI).
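The bind-then-check idea can be shown without Netty. The sketch below swaps Netty for a plain `java.net.ServerSocket` to stay dependency-free, so it is not how Arthas implements its server; `DemoServer` and its method names are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical server wrapper using java.net instead of Netty.
final class DemoServer {
    private ServerSocket socket;

    // Binds to the loopback address; port 0 asks the OS for any free port.
    synchronized void bind(int port) {
        if (socket != null) {
            throw new IllegalStateException("already bind");
        }
        try {
            socket = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // What a caller polls after bind() to learn whether start-up succeeded.
    synchronized boolean isBind() {
        return socket != null && socket.isBound() && !socket.isClosed();
    }

    synchronized int port() {
        return socket == null ? -1 : socket.getLocalPort();
    }

    // Counterpart of the shutdown hook: releases the port.
    synchronized void destroy() {
        if (socket == null) {
            return;
        }
        try {
            socket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            socket = null;
        }
    }

    public static void main(String[] ignored) {
        DemoServer server = new DemoServer();
        server.bind(0);
        System.out.println("bound=" + server.isBind());
        server.destroy();
    }
}
```

Exposing the state as a simple boolean method is convenient for the agent side, which can only reach the server object through reflection.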

To sum up: whether the Arthas jar is packaged with the project or Arthas is started after the project has been deployed, the functions provided by the Arthas server are woven into the original application through premain and agentmain respectively.

The next article will show how Arthas weaves in functionality at the bytecode level through Instrument.

Topics: source code arthas