A high-performance, small and beautiful serialization tool!

Posted by xX_SuperCrazy_Xx on Wed, 26 Jan 2022 03:55:18 +0100

Author: fredalxin
Address: https://fredal.xin/kryo-quickstart

Kryo is a high-performance serialization/deserialization library for Java. Thanks to variable-length encoding and bytecode generation, it is fast and produces compact output. In some scenarios it has become an alternative to JSON and Protobuf.

Dependencies

First, add the Maven dependency:

<dependency>
    <groupId>com.esotericsoftware</groupId>
    <artifactId>kryo</artifactId>
    <version>4.0.2</version>
</dependency>

Note that Kryo depends on a relatively recent version of ASM, which may conflict with the ASM version your application already uses. This is a common problem. If you run into it, switch to the shaded artifact:

<dependency>
    <groupId>com.esotericsoftware</groupId>
    <artifactId>kryo-shaded</artifactId>
    <version>4.0.2</version>
</dependency>

Recording type information

This is a distinctive feature of Kryo: the class information can be written directly into the serialized data. During deserialization the original class is recovered exactly, so there is no need to pass a Class or Type to the read method.

Accordingly, Kryo provides two read/write modes: writeClassAndObject/readClassAndObject, which record type information, and the traditional writeObject/readObject, which do not.
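
The two modes can be sketched as follows. This is a minimal example that assumes the kryo 4.0.2 dependency above is on the classpath; the class name TypeInfoSketch and its method names are made up for illustration.

```java
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

// Hypothetical demo class; requires the kryo 4.0.2 dependency.
final class TypeInfoSketch {

    // Mode 1: the class is recorded in the data, so reading needs no Class argument.
    static Object roundTripWithClass(Kryo kryo, Object value) {
        Output output = new Output(64, -1);   // 64-byte buffer that may grow without limit
        kryo.writeClassAndObject(output, value);
        Input input = new Input(output.toBytes());
        return kryo.readClassAndObject(input);
    }

    // Mode 2: only the object is written, so the reader must name the class.
    static <T> T roundTripWithoutClass(Kryo kryo, T value, Class<T> type) {
        Output output = new Output(64, -1);
        kryo.writeObject(output, value);
        Input input = new Input(output.toBytes());
        return kryo.readObject(input, type);
    }
}
```

The first method suits data whose concrete type is not known to the reader; the second produces slightly smaller output when both sides already agree on the type.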

Thread safety

A Kryo instance is not thread safe, so there are two options for using it from multiple threads.

Use ThreadLocal to give each thread its own instance:

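// Needs com.esotericsoftware.kryo.Kryo and org.objenesis.strategy.StdInstantiatorStrategy
private static final ThreadLocal<Kryo> kryoLocal = new ThreadLocal<Kryo>() {
    @Override
    protected Kryo initialValue() {
        Kryo kryo = new Kryo();
        // Try the no-argument constructor first, fall back to StdInstantiatorStrategy
        kryo.setInstantiatorStrategy(new Kryo.DefaultInstantiatorStrategy(
                new StdInstantiatorStrategy()));
        return kryo;
    }
};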

Or use the pool provided by kryo:

public KryoPool newKryoPool() {
    return new KryoPool.Builder(() -> {
        final Kryo kryo = new Kryo();
        kryo.setInstantiatorStrategy(new Kryo.DefaultInstantiatorStrategy(
            new StdInstantiatorStrategy()));
        return kryo;
    }).softReferences().build();
}
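
A pool only helps if every borrowed instance is returned. The following sketch shows the two usage styles, borrow/release and run with a callback. It assumes the kryo 4.0.2 dependency; the class PoolUsageSketch is hypothetical, and the factory is simplified to Kryo::new where a real setup would configure the instance as in the method above.

```java
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.esotericsoftware.kryo.pool.KryoPool;

// Hypothetical helper; requires the kryo 4.0.2 dependency.
final class PoolUsageSketch {
    // Simplified factory for the sketch: a plain Kryo with default settings.
    private static final KryoPool POOL =
            new KryoPool.Builder(Kryo::new).softReferences().build();

    static byte[] serialize(Object value) {
        Kryo kryo = POOL.borrow();
        try {
            Output output = new Output(64, -1);
            kryo.writeClassAndObject(output, value);
            return output.toBytes();
        } finally {
            POOL.release(kryo);   // always hand the instance back
        }
    }

    static Object deserialize(byte[] bytes) {
        // run(...) borrows an instance, invokes the callback and releases it afterwards.
        return POOL.run(kryo -> kryo.readClassAndObject(new Input(bytes)));
    }
}
```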

Instantiator

Notice the call kryo.setInstantiatorStrategy(new Kryo.DefaultInstantiatorStrategy(new StdInstantiatorStrategy())); in the code above. This statement specifies the instantiator explicitly.

In some open source software that relies on Kryo, a NullPointerException may be thrown because of how the instantiator is configured. For example, some versions of Hive use StdInstantiatorStrategy directly:

public static ThreadLocal<Kryo> runtimeSerializationKryo = new ThreadLocal<Kryo>() {
    @Override
    protected synchronized Kryo initialValue() {
        Kryo kryo = new Kryo();
        kryo.setClassLoader(Thread.currentThread().getContextClassLoader());
        kryo.register(java.sql.Date.class, new SqlDateSerializer());
        kryo.register(java.sql.Timestamp.class, new TimestampSerializer());
        kryo.register(Path.class, new PathSerializer());
        kryo.setInstantiatorStrategy(new StdInstantiatorStrategy());
        ......
        return kryo;
    }
};

StdInstantiatorStrategy chooses how to create objects based on the JVM version and vendor, and it can create an object without calling any of its constructors.

This causes problems with classes such as ArrayList. Take a look at the ArrayList constructor:

public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

Since the constructor is never called, elementData is null, and a NullPointerException is thrown as soon as a method such as ensureCapacity dereferences it:

 public void ensureCapacity(int minCapacity) {
     if (minCapacity > elementData.length
         && !(elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
              && minCapacity <= DEFAULT_CAPACITY)) {
         modCount++;
         grow(minCapacity);
     }
 }

The solution is simple: specify the instantiator explicitly, as in the ThreadLocal and pool code earlier. Kryo.DefaultInstantiatorStrategy first tries the no-argument constructor, and falls back to StdInstantiatorStrategy only when the class has no usable no-argument constructor.

Class registration

When Kryo writes an object together with its class, it writes the fully qualified class name by default. Writing class names into the serialized data is inefficient, so Kryo supports class registration as an optimization.

kryo.register(SomeClassA.class);
kryo.register(SomeClassB.class);
kryo.register(SomeClassC.class);

Registration associates each class with an int id, which is far more compact than the class name. In return, the id used during deserialization must match the one used during serialization, which means the order of registration matters.

In practice, however, it is hard to guarantee that the same class gets the same registration id on every machine, even when they all run the same code, so deserialization may fail in multi-machine deployments.

For this reason Kryo does not require class registration by default (in the 4.x line used here). If you want to enforce it, call kryo.setRegistrationRequired(true);
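
To make the ordering issue concrete, here is a toy model in plain Java. It is not Kryo's implementation; the class RegistrationOrderSketch and its methods are hypothetical, and the only assumption carried over is that ids are handed out in registration order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model, not Kryo's implementation: ids are handed out in registration order.
final class RegistrationOrderSketch {
    private final Map<Class<?>, Integer> ids = new LinkedHashMap<>();
    private int nextId = 0;

    // Returns the id of the class, assigning the next free id on first registration.
    int register(Class<?> type) {
        Integer existing = ids.get(type);
        if (existing != null) {
            return existing;
        }
        int id = nextId++;
        ids.put(type, id);
        return id;
    }

    // Returns the class this registry associates with the id, or null if unknown.
    Class<?> typeOf(int id) {
        for (Map.Entry<Class<?>, Integer> entry : ids.entrySet()) {
            if (entry.getValue() == id) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

If a writer registers String and then Integer while a reader registers them the other way round, id 0 means String to the writer and Integer to the reader. Kryo also offers kryo.register(Class, int), which pins the id explicitly instead of relying on call order.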

Circular reference

Kryo supports circular references, which prevents a stack overflow when an object graph contains cycles. This feature is enabled by default. If you are sure that no circular references will occur, you can call kryo.setReferences(false); to turn off reference tracking and gain some performance.
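
The idea behind reference tracking can be illustrated with a toy model in plain Java. This is only a sketch of the principle, not Kryo's wire format; the Node class and the "obj:"/"ref:" tokens are invented for the example.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy model of reference tracking, not Kryo's wire format.
final class ReferenceTrackingSketch {

    // Hypothetical linked node; next may point back to an earlier node.
    static final class Node {
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    // Emits "obj:<name>" the first time an instance is seen and "ref:<id>"
    // when the same instance shows up again.
    static List<String> write(Node start) {
        Map<Node, Integer> seen = new IdentityHashMap<>();
        List<String> out = new ArrayList<>();
        writeNode(start, seen, out);
        return out;
    }

    private static void writeNode(Node node, Map<Node, Integer> seen, List<String> out) {
        if (node == null) {
            out.add("null");
            return;
        }
        Integer id = seen.get(node);
        if (id != null) {
            out.add("ref:" + id);   // already written: stop instead of recursing forever
            return;
        }
        seen.put(node, seen.size());
        out.add("obj:" + node.name);
        writeNode(node.next, seen, out);
    }
}
```

Without the seen map, a cycle such as a -> b -> a would recurse until the stack overflows. The map lookup on every object is also the bookkeeping cost that is saved when reference tracking is switched off.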

Variable length storage

Kryo uses variable-length encoding for both int and long. Take int as an example: it normally occupies 4 bytes, but Kryo stores it in 1 to 5 bytes, so small values do not waste space on high-order zero bits.

Up to 5 bytes may be needed because only 7 of the 8 bits in each byte carry data; the highest bit marks whether another byte follows (1 means yes, 0 means no). A full 32-bit value therefore needs five 7-bit groups.
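
The scheme can be sketched in plain Java as follows. This is an illustration of the 7-bits-per-byte idea rather than Kryo's own code; the class VarIntSketch is hypothetical, and writing the least significant group first is an assumption of this sketch.

```java
import java.io.ByteArrayOutputStream;

// Sketch of a 7-bits-per-byte variable-length int; not taken from Kryo's source.
final class VarIntSketch {

    // Encodes value as 1 to 5 bytes, lowest 7 bits first (an assumption of this sketch).
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((value & ~0x7F) != 0) {          // more than 7 significant bits remain
            out.write((value & 0x7F) | 0x80);   // high bit 1: another byte follows
            value >>>= 7;                       // unsigned shift, so negative values terminate
        }
        out.write(value);                       // high bit 0: this is the last byte
        return out.toByteArray();
    }

    // Reassembles the 7-bit groups until a byte with a clear high bit is found.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return result;
    }
}
```

With this encoding, values up to 127 take one byte and values up to 16383 take two, while negative numbers always take five bytes because their high bits are set.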

Variable-length encoding is also used for strings. A serialized string is laid out as length + content, and the length is written as a variable-length int.
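
A simplified model of the length + content layout, in plain Java. The class StringLayoutSketch is hypothetical. The sketch prefixes the UTF-8 byte count, which is an assumption made to keep the reader simple, and Kryo's real string format may differ in its details.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Simplified model of "length + content"; not Kryo's actual string format.
final class StringLayoutSketch {

    static byte[] write(String text) {
        byte[] content = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int length = content.length;             // assumption: byte count, not char count
        while ((length & ~0x7F) != 0) {
            out.write((length & 0x7F) | 0x80);   // 7 data bits plus a continuation flag
            length >>>= 7;
        }
        out.write(length);
        out.write(content, 0, content.length);
        return out.toByteArray();
    }

    static String read(byte[] data) {
        int length = 0;
        int shift = 0;
        int pos = 0;
        while (true) {
            byte b = data[pos++];
            length |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return new String(data, pos, length, StandardCharsets.UTF_8);
    }
}
```

A short string costs a single extra byte for its length, compared with the 4 bytes a fixed-width int prefix would need.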

Using Kryo with a cache

In real projects, adding and removing fields in a class is routine, but Kryo does not support it by default. If you also use a cache, the problem gets worse.

For example, suppose an object is serialized with Kryo and the bytes are put into the cache. If a field is then added to or removed from the class, deserializing the cached data fails. Therefore, in applications that rely heavily on caching, it may be better to avoid Kryo.

However, Kryo now provides compatibility support through the CompatibleFieldSerializer class. With it, kryo.writeClassAndObject writes the following information:

class name|field count|field1 name|field2 name|field1 value|field2 value

When kryo.readClassAndObject reads the data back, it reads the field names first, matches them against the fields of the current class, and then builds the result.
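
The matching step can be modeled in plain Java. This is a toy illustration of name-based matching and not CompatibleFieldSerializer itself; the class FieldMatchingSketch is hypothetical, and a list of objects stands in for the byte stream.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of name-based field matching, not CompatibleFieldSerializer itself.
final class FieldMatchingSketch {

    // Writes the field count, then all names, then all values.
    static List<Object> write(Map<String, Object> fields) {
        List<Object> out = new ArrayList<>();
        out.add(fields.size());
        out.addAll(fields.keySet());
        out.addAll(fields.values());
        return out;
    }

    // Reads the names first, then keeps only the values whose field still exists.
    static Map<String, Object> read(List<Object> data, List<String> currentFields) {
        int count = (Integer) data.get(0);
        Map<String, Object> result = new LinkedHashMap<>();
        for (String name : currentFields) {
            result.put(name, null);            // fields added later keep their default
        }
        for (int i = 0; i < count; i++) {
            String name = (String) data.get(1 + i);
            Object value = data.get(1 + count + i);
            if (result.containsKey(name)) {
                result.put(name, value);       // known field: take the stored value
            }                                  // removed field: the value is skipped
        }
        return result;
    }
}
```

Data written with the fields id and name and read by a class that now has id and email yields id with its stored value and email left at its default. In Kryo itself, one way to enable the serializer is kryo.setDefaultSerializer(CompatibleFieldSerializer.class); the extra field names make the output larger than with the default serializer.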

Of course, if you do a good job in cache isolation, you don't have to care about all this.
