String constant pool (String Table)

Posted by mastermike707 on Wed, 02 Feb 2022 21:21:40 +0100

10. String constant pool (String Table)

Basic characteristics of String

String: a string, written as a literal enclosed in a pair of double quotes ("").

String s1 = "hguo"; // defined as a literal
String s2 = new String("hello");

The String class is declared as final and cannot be inherited.

String implements the Serializable interface, so strings support serialization.

String implements the Comparable interface, so strings can be compared and ordered.
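
As a small illustration of the last two points, here is a minimal sketch. The class name StringBasics and its helper methods are made up for this example; only the JDK calls are real.

```java
import java.io.Serializable;
import java.util.Arrays;

public class StringBasics {
    // String implements Serializable, so this check succeeds for any String.
    public static boolean isSerializable(String s) {
        Object o = s;
        return o instanceof Serializable;
    }

    // Arrays.sort uses String's Comparable implementation (lexicographic order).
    public static String[] sorted(String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(isSerializable("hguo"));
        System.out.println("apple".compareTo("banana") < 0);
        System.out.println(Arrays.toString(sorted("pool", "heap", "intern")));
    }
}
```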

Storage structure change

In JDK 8 and earlier, String stores its data in a final char[] value. JDK 9 changed this to a byte[] plus an encoding flag, which saves space for strings that contain only Latin-1 characters.

public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
    @Stable
    private final byte[] value;
}

StringBuffer and StringBuilder changed their storage structure in the same way.

String represents an immutable character sequence; this property is called immutability.

When a string variable is reassigned, the new value is written to a new memory area; the original value is not modified in place.

When an existing string is concatenated with another, the result is also written to a new memory area; the original value is not modified in place.

When you call String's replace() method to change a character or substring, the result is again written to a new memory area; the original value is not modified in place.
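
The three cases above can be sketched as follows. The class name ImmutabilityDemo and the method demo() are hypothetical names chosen for this illustration.

```java
public class ImmutabilityDemo {
    // Returns {s1, s2, s3, s4} so the caller can see that "abc" itself never changes.
    public static String[] demo() {
        String s1 = "abc";
        String s2 = s1;                    // s1 and s2 refer to the same "abc" object
        s1 = "hello";                      // reassignment: s1 now refers to a different object
        String s3 = s2 + "def";            // concatenation produces a new string
        String s4 = s2.replace('a', 'm');  // replace() returns a new string
        return new String[] {s1, s2, s3, s4};
    }

    public static void main(String[] args) {
        // s2 still prints as abc after all three operations
        System.out.println(String.join(", ", demo()));
    }
}
```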

When a string is assigned from a literal (as opposed to new), the string value lives in the string constant pool.

The string pool is a fixed-size hash table (the StringTable). In JDK 6 its default length is 1009. If the pool holds too many strings, hash collisions become frequent and the bucket linked lists grow long. The direct effect of long linked lists is that String.intern() becomes much slower.

Use -XX:StringTableSize to set the length of the StringTable.

In JDK 6, the StringTable length defaults to 1009, so efficiency declines quickly when the constant pool holds too many strings. There is no restriction on the StringTableSize value.

In JDK 7, the default StringTable length is 60013, and there is still no restriction on the StringTableSize value.

Starting from JDK 8, 1009 is the minimum value that can be set for the StringTable length. A lower setting is rejected with an error:

Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
StringTable size of 1000 is invalid; must be between 1009 and 2305843009213693951
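
To see which StringTable length a running JVM actually uses, one option is to read the flag through the HotSpot diagnostic MXBean. This is a sketch that assumes a HotSpot-based JVM; the class name StringTableSizeProbe is made up for the example, and the reported default and allowed range depend on the JDK version.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class StringTableSizeProbe {
    // Reads the effective value of the HotSpot flag -XX:StringTableSize.
    public static String stringTableSize() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("StringTableSize").getValue();
    }

    public static void main(String[] args) {
        System.out.println("StringTableSize = " + stringTableSize());
    }
}
```

Starting the program with a different -XX:StringTableSize value should change the number it prints.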

Memory allocation of String

The Java language has eight primitive data types plus one special type, String. To make these types faster and more memory-efficient at run time, Java provides the concept of a constant pool for them.

The constant pool is similar to a cache provided at the Java system level. The constant pools of the eight primitive types are coordinated by the system. The constant pool of the String type is special. There are two main ways to use it:

String objects declared directly in double quotes are stored directly in the constant pool. For example: String info = "hello";

A String object that was not declared with double quotes can be put into the pool with the intern() method provided by String.

In Java 6 and before, the string constant pool was stored in the permanent generation.

In Java 7, the string constant pool was moved to the Java heap.

All strings are stored in the heap like any other ordinary object, so tuning the application only requires adjusting the heap size.

In Java 8, the string constant pool is in the heap.

String basic operation

The string constant pool never stores two strings with the same content.
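
A minimal sketch of what this means in practice: two literals with the same content resolve to one pooled object, while new always creates a separate heap object. The class and method names are made up for this example.

```java
public class PoolSharingDemo {
    // Two literals with the same content refer to one object in the pool.
    public static boolean literalsShareOneObject() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // new String(...) creates a separate heap object whose content is equal.
    public static boolean newCreatesSeparateObject() {
        String a = "hello";
        String b = new String("hello");
        return a != b && a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());   // true
        System.out.println(newCreatesSeparateObject()); // true
    }
}
```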

String concatenation (splicing)

1. The result of concatenating constants with constants is in the constant pool. This works through compile-time optimization.

Source code:

@Test
public void test1() {
    String s1 = "a" + "b" + "c"; // Compile-time optimization: equivalent to "abc"
    String s2 = "abc"; // "abc" is placed in the string constant pool and its address is assigned to s2
    System.out.println(s1 == s2); // true
    System.out.println(s1.equals(s2)); // true
}

Decompiled bytecode:

@Test
public void test1() {
    String s1 = "abc";
    String s2 = "abc";
    System.out.println(s1 == s2);
    System.out.println(s1.equals(s2));
}

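2. The constant pool never holds two constants with the same content.

3. As long as one of the operands is a variable, the result is in the heap. Concatenation with variables is implemented through StringBuilder. (From JDK 9 on, javac compiles such concatenation to an invokedynamic call instead, but the result is still a new heap object.)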

Source code:

@Test
public void test2() {
    String s1 = "javaEE";
    String s2 = "hadoop";

    String s3 = "javaEEhadoop";
    String s4 = "javaEE" + "hadoop"; // Compile-time optimization
    // If either side of the concatenation operator is a variable, the result is equivalent to new String() in heap space, whose content is the concatenation result: javaEEhadoop
    String s5 = s1 + "hadoop";
    String s6 = "javaEE" + s2;
    String s7 = s1 + s2;

    System.out.println(s3 == s4);//true
    System.out.println(s3 == s5);//false
    System.out.println(s3 == s6);//false
    System.out.println(s3 == s7);//false
    System.out.println(s5 == s6);//false
    System.out.println(s5 == s7);//false
    System.out.println(s6 == s7);//false
    // intern(): checks whether the value javaEEhadoop exists in the string constant pool. If it does, returns the address of javaEEhadoop in the pool;
    // if it does not, puts javaEEhadoop into the pool and returns the address of that pooled object.
    String s8 = s6.intern();
    System.out.println(s3 == s8);//true
}

Decompiled bytecode:

@Test
public void test2() {
    String s1 = "javaEE";
    String s2 = "hadoop";
    String s3 = "javaEEhadoop";
    String s4 = "javaEEhadoop";
    String s5 = s1 + "hadoop";
    String s6 = "javaEE" + s2;
    String s7 = s1 + s2;
    System.out.println(s3 == s4);
    System.out.println(s3 == s5);
    System.out.println(s3 == s6);
    System.out.println(s3 == s7);
    System.out.println(s5 == s6);
    System.out.println(s5 == s7);
    System.out.println(s6 == s7);
    String s8 = s6.intern();
    System.out.println(s3 == s8);
}
@Test
public void test3() {
    String s1 = "a";
    String s2 = "b";
    String s3 = "ab";
    /*
        The execution details of s1 + s2 are as follows (the variable s is a temporary name used here):
        ① StringBuilder s = new StringBuilder();
        ② s.append("a")
        ③ s.append("b")
        ④ s.toString()  --> roughly equivalent to new String("ab")

        Note: StringBuilder is used from JDK 5.0 on; before JDK 5.0, StringBuffer was used.
         */
    String s4 = s1 + s2;
    System.out.println(s3 == s4);//false
}
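
The steps listed in the comment above can be written out by hand. This is a sketch of the equivalent source code, not the exact bytecode javac emits; the class name ConcatExpansionDemo and the method concat() are hypothetical.

```java
public class ConcatExpansionDemo {
    // Hand-written equivalent of "s1 + s2" following the four steps described above.
    public static String concat(String s1, String s2) {
        StringBuilder sb = new StringBuilder();
        sb.append(s1);
        sb.append(s2);
        return sb.toString(); // creates a new String object on the heap
    }

    public static void main(String[] args) {
        String pooled = "ab";
        String built = concat("a", "b");
        System.out.println(pooled.equals(built)); // true: same content
        System.out.println(pooled == built);      // false: different objects
    }
}
```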

Source code:

/*
1. String concatenation does not necessarily use StringBuilder.
   If both sides of the concatenation operator are string constants or constant references, compile-time optimization is used instead of StringBuilder.
2. final can modify classes, methods, primitive-type variables and reference-type variables; in development, use final wherever it can be used.
*/
@Test
public void test4() {
    final String s1 = "a";
    final String s2 = "b";
    String s3 = "ab";
    String s4 = s1 + s2;
    System.out.println(s3 == s4);//true
}

Decompiled code:

@Test
public void test4() {
    String s1 = "a";
    String s2 = "b";
    String s3 = "ab";
    String s4 = "ab";  // You can see that s4 has been assigned "ab" at compile time
    System.out.println(s3 == s4);
}
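
Note that final alone is not enough: the variable must also be initialized with a compile-time constant expression. The sketch below contrasts the two cases; runtimeB() is a hypothetical helper whose result the compiler cannot fold, and the class name is made up as well.

```java
public class FinalConstantDemo {
    // Hypothetical helper: the compiler cannot treat its result as a constant.
    static String runtimeB() {
        return "b";
    }

    // Both operands are constant variables, so s1 + s2 is folded to "ab" at compile time.
    public static boolean constantFinalsAreFolded() {
        final String s1 = "a";
        final String s2 = "b";
        return (s1 + s2) == "ab";
    }

    // s2 is final but initialized at run time, so s1 + s2 builds a new heap object.
    public static boolean runtimeFinalIsFolded() {
        final String s1 = "a";
        final String s2 = runtimeB();
        return (s1 + s2) == "ab";
    }

    public static void main(String[] args) {
        System.out.println(constantFinalsAreFolded()); // true
        System.out.println(runtimeFinalIsFolded());    // false
    }
}
```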

Efficiency comparison between string concatenation with "+" and StringBuilder:

/*
    Execution efficiency: appending strings through StringBuilder's append() method is far more efficient than concatenating with String "+".
    Details: ① With StringBuilder's append(), only one StringBuilder object is created from beginning to end.
               With String concatenation, a new StringBuilder and a new String object are created on every iteration.
             ② With String concatenation, the many StringBuilder and String objects occupy more memory, and any GC they trigger costs additional time.

    Room for improvement: in actual development, if you know that the total length of the appended strings will not exceed some limit highLevel, create the builder with the capacity constructor:
           StringBuilder s = new StringBuilder(highLevel); // new char[highLevel]
*/
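@Test
public void test6() {

    // Time each method separately so the two results can be compared.
    long start = System.currentTimeMillis();
    method1(100000); //4014ms
    long middle = System.currentTimeMillis();
    method2(100000); //7ms
    long end = System.currentTimeMillis();

    System.out.println("method1 time spent:" + (middle - start));
    System.out.println("method2 time spent:" + (end - middle));
}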

public void method1(int highLevel) {
    String src = "";
    for (int i = 0; i < highLevel; i++) {
        src = src + "a"; // A new StringBuilder and a new String are created on every iteration
    }
}

public void method2(int highLevel) {
    //Just create a StringBuilder
    StringBuilder src = new StringBuilder();
    for (int i = 0; i < highLevel; i++) {
        src.append("a");
    }
}
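
The "room for improvement" mentioned in the comment above, sizing the builder up front, might look like the following sketch. The class name PresizedBuilderDemo and the method repeatA() are made up for this example.

```java
public class PresizedBuilderDemo {
    // Builds a string of `count` 'a' characters with a builder sized up front,
    // so the internal array does not have to grow during the loop.
    public static String repeatA(int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeatA(100000).length()); // 100000
    }
}
```

The gain over the default constructor is that the builder avoids the repeated array copies it would otherwise perform each time its capacity is exceeded.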

4. If intern() is called on the concatenation result, a string that is not yet in the constant pool is put into the pool, and the address of the pooled object is returned.

Use of intern()

For a String object that was not declared with double quotes, you can use the intern() method provided by String. intern() checks whether the current string already exists in the string constant pool; if it does not, the current string is put into the pool. For example:

String myInfo = new String("hello word").intern();

That is, if you call String.intern() on any string, the instance that the returned reference points to must be exactly the same instance as the string written directly as a constant. Therefore, the following expression must evaluate to true.

("a" + "b" + "c").intern() == "abc"

Put simply, an interned string guarantees that only one copy of the string exists in memory, which saves memory space and speeds up string operations. Note that this value is stored in the string intern pool.
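
A short sketch of this guarantee follows. The class and method names are hypothetical; only String.intern() itself comes from the JDK.

```java
public class InternBasicsDemo {
    // A heap copy differs from the literal, but its interned form is the literal's object.
    public static boolean internMatchesLiteral() {
        String heapCopy = new String("hello");
        return heapCopy != "hello" && heapCopy.intern() == "hello";
    }

    // For any two strings with equal content, intern() returns the same reference.
    public static boolean sameCanonical(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        System.out.println(internMatchesLiteral());                                  // true
        System.out.println(sameCanonical(new String("pool"), new String("pool")));   // true
    }
}
```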

Question: how many objects will new String("ab") create?

Extension: how many objects will new String("a") + new String("b") create?

/**
 * Question:
 * How many objects does new String("ab") create? The bytecode shows that there are two.
 *     One object is created in heap space by the new keyword.
 *     The other is the object "ab" in the string constant pool. Bytecode instruction: ldc
 *
 *
 * Follow-up:
 * What about new String("a") + new String("b")?
 *  Object 1: new StringBuilder()
 *  Object 2: new String("a")
 *  Object 3: "a" in the constant pool
 *  Object 4: new String("b")
 *  Object 5: "b" in the constant pool
 *
 *  In-depth analysis: toString() of StringBuilder:
 *      Object 6: new String("ab")
 *       Note that the call to toString() does not create "ab" in the string constant pool.
 *
 */
public class StringNewTest {
    public static void main(String[] args) {
//        String str = new String("ab");

        String str = new String("a") + new String("b");
    }
}

Interview questions:

/**
 * How can you ensure that the variable s points to data in the string constant pool?
 * There are two ways:
 * Method 1: String s = "hello"; // defined as a literal
 * Method 2: call intern()
 *         String s = new String("hello").intern();
 *         String s = new StringBuilder("hello").toString().intern();
 */
public class StringIntern {
    public static void main(String[] args) {

        String s = new String("1");
        s.intern(); // "1" already exists in the string constant pool before this call
        String s2 = "1";
        System.out.println(s == s2);//jdk6: false   jdk7/8: false


        String s3 = new String("1") + new String("1"); // s3 holds the address of a heap object equivalent to new String("11")
        // After the previous line executes, does "11" exist in the string constant pool? Answer: no!
        s3.intern(); // Puts "11" into the string constant pool.
        // How to understand this: jdk6: a new object "11" is created in the pool, with its own new address.
        // jdk7/8: no new "11" object is created in the pool; the pool stores a reference to the heap object new String("11"), that is, s3
        String s4 = "11"; // s4 holds the address of the "11" that the previous line put into the constant pool
        System.out.println(s3 == s4); //jdk6: false  jdk7/8: true
    }
}

Process Description:

An object is created in heap space and its address is assigned to the variable s. Calling s.intern() checks whether the string constant "ab" already exists in the string constant pool. If it does not, the reference address of the heap object "ab" is saved in the pool, and this reference address stored in the pool is what gets assigned to the variable s2.

Summary of String.intern():

In JDK 1.6, intern() tries to put the string object into the string pool.

If the string is already in the pool, nothing is added, and the address of the existing pooled object is returned.

If it is not, a copy of the object is made and put into the pool, and the address of the pooled copy is returned.

From JDK 1.7 on, intern() tries to put the string object into the string pool.

If the string is already in the pool, nothing is added, and the address of the existing pooled object is returned.

If it is not, the reference address of the object is copied into the pool, and that reference address is returned.
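
The JDK 1.7+ behavior can be observed with a string whose content is not in the pool yet. The following is a sketch that assumes a HotSpot JVM on JDK 7 or later; the class and method names are hypothetical, and System.nanoTime() is used only to make the content very unlikely to be pooled already.

```java
public class InternReferenceDemo {
    // Builds a string at run time whose content is very unlikely to be in the pool yet.
    public static String freshString() {
        return new StringBuilder("intern-demo-").append(System.nanoTime()).toString();
    }

    // When the content is not pooled yet, intern() stores a reference to this very
    // object (JDK 7+ behavior as described above), so the same reference comes back.
    public static boolean internReturnsSameObject() {
        String s = freshString();
        return s.intern() == s;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsSameObject());
    }
}
```

On JDK 6, by contrast, the description above implies that the comparison would be false, because the pool would hold its own copy.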

intern efficiency test

Use intern() to test execution efficiency: for a large number of strings in the program, especially when there are many duplicate strings, using intern() can save memory space.
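
The memory effect can be illustrated by counting how many distinct String objects are retained with and without intern(). This is a minimal sketch rather than a benchmark; the class name InternDedupDemo and the method retainedObjects() are made up for the example.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class InternDedupDemo {
    // Creates `total` strings drawn from `distinct` different values and counts
    // how many separate String objects the array ends up referencing.
    public static int retainedObjects(int total, int distinct, boolean useIntern) {
        String[] data = new String[total];
        for (int i = 0; i < total; i++) {
            String s = new String(String.valueOf(i % distinct)); // fresh heap object
            data[i] = useIntern ? s.intern() : s;
        }
        // An identity-based set counts objects, not equal contents.
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        Collections.addAll(identities, data);
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println(retainedObjects(1000, 10, false)); // 1000
        System.out.println(retainedObjects(1000, 10, true));  // 10
    }
}
```

With intern(), the duplicate heap objects are no longer referenced after the loop and can be garbage collected, which is where the memory saving comes from.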
