String constant pool and wrapper class

Posted by bhagwat on Sat, 23 Oct 2021 11:08:08 +0200

String constant pool

Design idea

To improve performance, reduce memory overhead, and avoid creating the same string repeatedly, the JVM maintains a special memory area called the string constant pool. When a string literal is used, the JVM first checks whether that string already exists in the pool. If it does, a reference to the existing string is returned; if it does not, a string object is created in the string constant pool and a reference to the new object is returned.

String a = "abc";	// Created in the constant pool
String b = "abc";	// Fetched from the constant pool
System.out.println(a == b);	// true

Note: before JDK 7, the string constant pool was located in the permanent generation (the method area), and the pool stored the string objects together with their references. In JDK 7, the string constant pool was moved to the heap; the pool now stores only references, and the string objects themselves live in the heap.


Memory area

  • Before JDK 1.7, the runtime constant pool (including the string constant pool) was stored in the method area, which the HotSpot virtual machine implemented as the permanent generation.

  • In JDK 1.7, the string constant pool was moved from the method area to the Java heap. Note that only the string constant pool was moved, not the whole runtime constant pool. The rest of the runtime constant pool remained in the method area, that is, in HotSpot's permanent generation.

  • In JDK 1.8, the permanent generation (HotSpot's implementation of the method area) was removed completely (the work had already started in JDK 7) and replaced with the metaspace, which is allocated in native memory. The string constant pool is still in the heap, but the implementation of the method area changed from the permanent generation to the metaspace, and everything that remained in the permanent generation in JDK 1.7 (the runtime constant pool and type information) moved to the metaspace.


What happens when the "+" operation is performed on variables and constants of type String?

String str1 = "str";
String str2 = "ing";
String str3 = "str" + "ing";	//Objects in constant pool
String str4 = str1 + str2; 		//A new object created on the heap
String str5 = "string";			//Objects in constant pool
System.out.println(str3 == str4);//false
System.out.println(str3 == str5);//true
System.out.println(str4 == str5);//false

For a string whose value can be determined at compile time, that is, a constant string, the JVM stores it in the string constant pool.

Moreover, a string obtained by concatenating string constants has already been determined at compile time and is stored in the string constant pool as a single constant, thanks to a compiler optimization.

During compilation, the javac compiler (hereinafter referred to as the compiler) performs an optimization called constant folding.

Constant folding evaluates a constant expression at compile time and embeds the result in the generated code as a constant. This is one of the few optimizations that javac applies to source code (almost all code optimization is carried out by the just-in-time compiler).

For String str3 = "str" + "ing"; the compiler generates the same code as for String str3 = "string";.

Not every expression is folded. Only values that the compiler can determine at compile time qualify:

  1. Literals of the primitive types (byte, boolean, short, char, int, float, long, double) and string literals
  2. Primitive-type and String variables that are declared final and initialized with a constant
  3. Strings obtained by "+" concatenation of such constants, arithmetic on primitive types (addition, subtraction, multiplication, division), and bit operations on primitive types (<<, >>, >>>)

If the referenced value cannot be determined at the compilation time of the program, the compiler cannot optimize it.

Therefore, str1, str2, and str3 are all objects in the string constant pool.
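The folding rules above can be checked with a small sketch. The class, method, and variable names here (ConstantFoldingDemo, PREFIX, n) are illustrative and do not come from the original example; the sketch assumes a standard javac, which interns every compile-time constant string.

```java
class ConstantFoldingDemo {
    // A static final String initialized with a literal is a constant variable.
    static final String PREFIX = "str";

    // A final String, a literal, and a final int are folded into "string3".
    static boolean finalVariablesAreFolded() {
        final int n = 1 + 2;
        String folded = PREFIX + "ing" + n;
        return folded == "string3";
    }

    // Shift operations on int literals are constant expressions as well.
    static boolean shiftIsFolded() {
        String folded = "v" + (1 << 4);
        return folded == "v16";
    }

    public static void main(String[] args) {
        System.out.println(finalVariablesAreFolded()); // true
        System.out.println(shiftIsFolded());           // true
    }
}
```

Both comparisons use == on purpose: they are true only because the folded result and the literal are the same pooled object.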

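When an object reference takes part in "+" string concatenation, the operation is implemented with a StringBuilder that calls append() for each operand (this is the code javac generates up to JDK 8). After the concatenation is complete, toString() is called to obtain a String object.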

String str4 = new StringBuilder().append(str1).append(str2).toString();

Therefore, str4 is not an object existing in the string constant pool, but a new object on the heap.

However, after the string is declared with the final keyword, it can be treated by the compiler as a constant.

final String str1 = "str";
final String str2 = "ing";
// The following two expressions are actually equivalent
String c = "str" + "ing";	// Objects in constant pool
String d = str1 + str2; 	// Objects in constant pool
System.out.println(c == d);	// true

A String declared with the final keyword and initialized with a constant is treated as a constant by the compiler. Because its value can be determined at compile time, using the variable is equivalent to using the constant itself.

You can verify this by inspecting the compiled bytecode, for example with javap -c.

However, if a value is known only at run time, the compiler cannot fold it. As long as one of the operands is not a compile-time constant, the result is a new object in the heap.
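A minimal sketch of this case follows. The helper getIng() is hypothetical and stands in for any value the compiler cannot know in advance; marking the variable final does not help, because its initializer is not a constant expression.

```java
class RuntimeFinalDemo {
    // Hypothetical helper: its return value is only known at run time.
    static String getIng() {
        return "ing";
    }

    static boolean sameObjectAsLiteral() {
        final String str1 = "str";
        final String str2 = getIng(); // final, but not a compile-time constant
        String c = "str" + "ing";     // folded to the pooled "string"
        String d = str1 + str2;       // concatenated at run time, new heap object
        return c == d;
    }

    static boolean sameContentAsLiteral() {
        final String str2 = getIng();
        return ("str" + str2).equals("string");
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAsLiteral());  // false
        System.out.println(sameContentAsLiteral()); // true
    }
}
```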


new String() creates several objects

  • Let's start with an example:
String str1 = "abc";
String str2 = new String("abc");
String str3 = new String("abc");
System.out.println(str1 == str2);	//false
System.out.println(str2 == str3);	//false

After the code runs, both comparisons print false because:

// Get object from string constant pool
String str1 = "abc";

Here the JVM first checks whether the string constant pool already contains "abc". If it does, str1 points directly to "abc" in the constant pool; if it does not, the JVM creates the string in the constant pool, and str1 then points to that pooled object.

// Create a new object directly in heap memory
String str2 = new String("abc");
String str3 = new String("abc");

new String("abc") always creates a new string object in the heap. In addition, the JVM checks whether the string constant pool already contains a string with the same value as the literal; if not, a string object with that value is also created in the string constant pool. The expression finally returns the address of the string object in the heap, not the pooled one.

Therefore, using new String() will create 1 or 2 objects.
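The relationship between the heap copy and the pooled string can be sketched as follows. The class and method names are illustrative only; intern() is used here just to obtain the pooled object for comparison.

```java
class NewStringDemo {
    // A literal and a heap copy are different objects.
    static boolean sameReference() {
        String pooled = "abc";
        String heapCopy = new String("abc");
        return pooled == heapCopy;
    }

    // Their contents are nevertheless equal.
    static boolean sameContent() {
        String pooled = "abc";
        String heapCopy = new String("abc");
        return pooled.equals(heapCopy);
    }

    // intern() on the heap copy returns the pooled object created for the literal.
    static boolean internReturnsPooled() {
        String pooled = "abc";
        String heapCopy = new String("abc");
        return heapCopy.intern() == pooled;
    }

    public static void main(String[] args) {
        System.out.println(sameReference());       // false
        System.out.println(sameContent());         // true
        System.out.println(internReturnsPooled()); // true
    }
}
```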

  • Let's take another example:
String str = new String("a") + new String("b");

Six objects are created:

1. new StringBuilder (because a concatenation takes place and its operands are not compile-time constants)

2. "a" in heap

3. "a" in constant pool

4. "b" in heap

5. "b" in constant pool

6. "ab" in the heap (created in the heap by the toString() method, but not in the string constant pool)


About intern()

String s = new String("1");
s.intern();		// In essence, this line of code is useless because "1" already exists in the string constant pool (due to new String)
String s2 = "1";
System.out.println(s == s2);	// jdk6:false  jdk7/8:false

String s3 = new String("1") + new String("1");
s3.intern();
String s4 = "11";
System.out.println(s3 == s4);	// jdk6:false  jdk7/8:true

The intern() method behaves differently in JDK 1.6 and JDK 1.7:

In JDK 1.6, intern() first checks whether an equal string is already in the string constant pool. If it is, the address of that pooled string is returned. If it is not, a copy of the string is created in the string constant pool and the address of the copy is returned.

In JDK 1.7, intern() first checks whether an equal string is already in the string constant pool. If it is, the address of that pooled string is returned. If it is not, the string exists only in the heap, so a reference to the heap object is added to the string constant pool and that reference is returned; the actual object stays in the heap.

To summarize the case where the constant pool does not yet contain the string:

Before JDK 1.7 (excluding 1.7), intern() creates a new object in the constant pool and returns a reference to it. In JDK 1.7 and later, the string constant pool has been moved from the method area to the heap, so intern() does not create a new object in the pool; it simply stores a reference to the existing heap object in the pool, which avoids unnecessary memory overhead.
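The JDK 1.7+ behavior can be observed with a string that cannot already be in the pool. This is only a sketch: the class name and the "intern-demo-" prefix are made up for illustration, System.nanoTime() is used merely to make the value unique, and the first result assumes JDK 7 or later.

```java
class InternDemo {
    // The string is built at run time and is unique, so the pool holds no equal
    // string yet. On JDK 7+ intern() stores a reference to this very object.
    static boolean internKeepsHeapObject() {
        String s = new StringBuilder("intern-demo-").append(System.nanoTime()).toString();
        return s.intern() == s;
    }

    // The literal "1" is already pooled, so intern() on a heap copy returns
    // the pooled object, which is not the copy itself.
    static boolean internReturnsExistingLiteral() {
        String literal = "1";
        String copy = new String("1");
        return copy.intern() == literal && copy != literal;
    }

    public static void main(String[] args) {
        System.out.println(internKeepsHeapObject());        // true on JDK 7+
        System.out.println(internReturnsExistingLiteral()); // true
    }
}
```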


Wrapper class and corresponding constant pool

What is a wrapper type? What is the difference between primitive types and wrapper types?

Java provides a corresponding wrapper class for each primitive data type; for example, the wrapper class of int is Integer. Autoboxing and unboxing were introduced in Java 5. Converting a primitive type to its wrapper type is called boxing; conversely, converting a wrapper type to its primitive type is called unboxing. Together they allow the two kinds of types to be converted to each other automatically.

Java provides wrapper types for each primitive type:

  • Primitive types: boolean, char, byte, short, int, long, float, double

  • Wrapper types: Boolean, Character, Byte, Short, Integer, Long, Float, Double

The main differences between primitive types and wrapper types are as follows:

1. A wrapper type can be null, while a primitive type cannot. This makes wrapper types suitable for POJO fields, whereas primitive types are not. Why should POJO fields use wrapper types? The Alibaba Java development manual explains this in detail: the result of a database query may be null, and if a primitive type is used to receive it, the null must be auto-unboxed (the wrapper type is converted to the primitive type, for example an Integer object to an int value), which throws a NullPointerException.

2. Wrapper types can be used as generic type arguments, but primitive types cannot; using a primitive type causes a compilation error.

List<int> list = new ArrayList<>(); // Compile error: Syntax error, insert "Dimensions" to complete ReferenceType
List<Integer> list = new ArrayList<>();

This is because generic type information is erased at compile time and only the raw type is retained. After erasure, a type parameter can only be Object or one of its subclasses, and primitive types are not subclasses of Object.

3. Primitive types are more efficient than wrapper types. A primitive variable holds its value directly (on the stack, for a local variable), while a wrapper variable holds a reference to an object in the heap. Wrapper types therefore take up more memory than primitive types.
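The first two differences can be seen together in a short sketch. The names below are illustrative, and the null element simply stands in for a database column that returned NULL, which is an assumed scenario.

```java
import java.util.ArrayList;
import java.util.List;

class NullUnboxingDemo {
    static boolean unboxingNullThrows() {
        // Generics require the wrapper type; List<int> would not compile.
        List<Integer> results = new ArrayList<>();
        results.add(null); // stands in for a NULL query result

        try {
            int value = results.get(0); // auto-unboxing calls intValue() on null
            System.out.println(value);  // never reached
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(unboxingNullThrows()); // true
    }
}
```

Keeping the value as an Integer and checking it for null before unboxing avoids the exception.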


What are autoboxing and unboxing?

Autoboxing: converts primitive values to wrapper objects

Integer i = 9;	// compiled to: Integer i = Integer.valueOf(9);

9 is a primitive value, so in principle it cannot be assigned directly to an Integer reference. Autoboxing makes this declaration legal: the compiler converts the primitive value into the corresponding wrapper object, after which all methods declared by the wrapper class can be called on it.

Auto-unboxing: converts wrapper objects to primitive values

Integer i = 9;
int j = i;	// compiled to: int j = i.intValue();

// ===================

Integer i = 9;
System.out.print(i++);

Objects cannot take part in arithmetic directly; they must first be unboxed to primitive values before they can be added, subtracted, multiplied, or divided. That is why i++ above unboxes i, increments the int value, and boxes the result again.
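As a rough illustration, the increment of a boxed value can be written out by hand. The explicit version below is an approximation of what the compiler generates, not its exact output, and the class and method names are made up for this sketch.

```java
class UnboxingArithmeticDemo {
    // Increment written with autoboxing and auto-unboxing.
    static int incrementBoxed() {
        Integer i = 9;
        i++;
        return i;
    }

    // Roughly the same steps spelled out: unbox, add, box again.
    static int incrementExplicit() {
        Integer i = Integer.valueOf(9);
        i = Integer.valueOf(i.intValue() + 1);
        return i.intValue();
    }

    public static void main(String[] args) {
        System.out.println(incrementBoxed());    // 10
        System.out.println(incrementExplicit()); // 10
    }
}
```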

Integer i1 = 40;
Integer i2 = 40;
Integer i3 = 0;
Integer i4 = new Integer(40);
Integer i5 = new Integer(40);
Integer i6 = new Integer(0);

System.out.println(i1 == i2);// true
System.out.println(i1 == i2 + i3);//true
System.out.println(i1 == i4);// false
System.out.println(i4 == i5);// false
System.out.println(i4 == i5 + i6);// true
System.out.println(40 == i5 + i6);// true

i1, i2 and i3 refer to objects in the Integer cache (the wrapper constant pool), and i4, i5 and i6 are separate objects in the heap.

Why is i4 == i5 + i6 true? The + operator cannot be applied to Integer objects, so i5 and i6 are automatically unboxed and their values are added, giving i4 == 40. An Integer object cannot be compared directly with an int value, so i4 is also unboxed to the int value 40. The statement therefore becomes the numeric comparison 40 == 40.


Wrapper constant pool

Byte, Short, Integer and Long cache wrapper objects for values in the range [-128, 127] by default, Character caches objects for values in the range [0, 127], and Boolean directly returns Boolean.TRUE or Boolean.FALSE.

The two floating-point wrapper classes, Float and Double, do not implement this caching technique.
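A quick way to see these caches is to compare the references returned by valueOf(). This is a sketch with illustrative names; the Double result is printed without a stated expectation, because the absence of a cache is how current OpenJDK implements it rather than something the API documentation guarantees.

```java
class WrapperCacheDemo {
    // Long caches [-128, 127], so both calls return the same object.
    static boolean longIsCached() {
        return Long.valueOf(127L) == Long.valueOf(127L);
    }

    // Character caches [0, 127]; 'a' is 97.
    static boolean characterIsCached() {
        return Character.valueOf('a') == Character.valueOf('a');
    }

    // Boolean.valueOf returns one of the two shared constants.
    static boolean booleanIsShared() {
        return Boolean.valueOf(true) == Boolean.TRUE;
    }

    // Double has no such cache in OpenJDK; shown for comparison only.
    static boolean doubleIsCached() {
        return Double.valueOf(1.0) == Double.valueOf(1.0);
    }

    public static void main(String[] args) {
        System.out.println(longIsCached());      // true
        System.out.println(characterIsCached()); // true
        System.out.println(booleanIsShared());   // true
        System.out.println(doubleIsCached());
    }
}
```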

Integer

  • Integer cache range: [-128, 127]

The equals method of Integer is overridden to compare the internal value field.

If == is used, two boxed values in [-128, 127] refer to the same cached object, so the comparison is true. Outside this range, == compares whether the two references point to the same object.

Integer source code:

public static Integer valueOf(int i) {
    if (i >= IntegerCache.low && i <= IntegerCache.high)
        return IntegerCache.cache[i + (-IntegerCache.low)];
    return new Integer(i);
}

private static class IntegerCache {
    static final int low = -128;
    static final int high;
    static final Integer cache[];	// Holds the Integer objects for [-128, 127]
    
    static {
        int h = 127;
        // ......
        high = h;
        cache = new Integer[(high - low) + 1];
        int j = low;
        for(int k = 0; k < cache.length; k++)
            cache[k] = new Integer(j++);	// Create the Integer object for each value in [-128, 127]
        // ......
    }
}

When checking whether two Integer values are equal, use the equals method or auto-unboxing:

1. Using the equals method

Integer x = 128;
Integer y = 128;
System.out.println(x == y);	// false

Because 128 is outside the cache range, x and y are two distinct Integer objects with the value 128, so their addresses differ.

Integer x = 128;
Integer y = 128;
System.out.println(x.equals(y));	// true

The equals method compares the value fields of the two Integer objects directly.

2. Auto-unboxing

Integer x = 128;
Integer y = 128;
int z = y;
System.out.println(x == z);	// true

Here one of the two operands is first converted to an int. When == compares an Integer with an int, x is automatically unboxed to an int, so the values are compared.
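The two recommendations can be combined into a small helper. This is only a sketch: the class and method names are made up, Objects.equals from java.util is used as a null-safe form of equals, and the result for 128 assumes the default cache range of [-128, 127].

```java
import java.util.Objects;

class IntegerCompareDemo {
    // Boxes the same int twice and compares the references.
    static boolean sameReference(int v) {
        Integer x = v;
        Integer y = v;
        return x == y;
    }

    // Compares by value and also tolerates null arguments.
    static boolean sameValue(Integer x, Integer y) {
        return Objects.equals(x, y);
    }

    public static void main(String[] args) {
        System.out.println(sameReference(127));   // true: inside the cache
        System.out.println(sameReference(128));   // false with the default cache range
        System.out.println(sameValue(128, 128));  // true
        System.out.println(sameValue(128, null)); // false, no NullPointerException
    }
}
```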


Supplement: constant pool in Java

Constant pool classification

There are three kinds of constant pools in Java: the global string constant pool, the class file constant pool, and the runtime constant pool. The string constant pool discussed above is the global string constant pool.

For a detailed explanation of these constant pools, refer to: Differentiation of several constant pools in Java


Reference material

JavaGuide/Java memory area.md at master · Snailclimb/JavaGuide (github.com)
