[Java] Is String Really Immutable? A Deep Analysis from First Principles

Posted by calevans on Mon, 03 Jan 2022 08:12:23 +0100

Anyone who learns Java is told, sooner or later, that String is immutable. But is that really true?
Prerequisites for this article: reflection, JMM.

1. How to change a String

Open the source of String and you will see that a String object's data is stored in its value array.
In early versions of Java this array is a char[]; since Java 9 it has been a byte[].

public final class String {
    private final byte[] value;
    // ......
}
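
Which of the two array types a given JDK uses can be checked without touching the field's contents. The following is a minimal sketch (the class name ValueFieldInspector is made up for illustration); it only looks up the field's declaration, so it should not need any special JVM flags.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class ValueFieldInspector {
    /** Returns the declaration of String's backing field as text. */
    static String describeValueField() throws NoSuchFieldException {
        // Looking up the declaration is allowed; only reading or writing it is restricted
        Field f = String.class.getDeclaredField("value");
        return Modifier.toString(f.getModifiers()) + " "
                + f.getType().getSimpleName() + " " + f.getName();
    }

    public static void main(String[] args) throws NoSuchFieldException {
        // The element type (byte[] or char[]) depends on the JDK version
        System.out.println(describeValueField());
    }
}
```

On a JDK with a byte[] backing array the description ends in "byte[] value"; on older JDKs it ends in "char[] value".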

So if we replace this array through reflection, can we change the String?
Let's try.

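Create a modifyString method that uses reflection to replace the value array inside the string, then test it from the main function (note that on older Java versions, where the field is a char[], the byte[] here must be changed to char[]):

    // Needs: import java.lang.reflect.Field; import java.nio.charset.StandardCharsets;
    // On JDK 16+ the JVM must be started with --add-opens java.base/java.lang=ALL-UNNAMED
    private static void modifyString(String src, String dst) throws NoSuchFieldException, IllegalAccessException {
        Field valueField = String.class.getDeclaredField("value");
        valueField.setAccessible(true);
        // Assumes both strings are Latin-1, so the bytes match String's internal encoding
        byte[] newValue = dst.getBytes(StandardCharsets.ISO_8859_1);
        valueField.set(src, newValue);
    }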

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        String s = "hello, world!";
        modifyString(s, "you're so cool!");
        System.out.println("s = " + s);
    }

As the output shows, s really did change!

s = you're so cool!
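
Whether this experiment runs at all depends on the JDK: from JDK 16 on, reflective access to java.lang internals is refused by default unless the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED. The sketch below is one defensive way to try the same trick and report whether the JVM allowed it. The class and method names are hypothetical, and it assumes Latin-1 text with compact strings enabled (the default).

```java
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;

final class ReflectionGuardDemo {
    /**
     * Tries to swap the backing array of target.
     * Returns false if the JVM refuses the reflective access.
     */
    static boolean tryReplaceValue(String target, String replacement) {
        try {
            Field valueField = String.class.getDeclaredField("value");
            valueField.setAccessible(true); // may throw on JDK 16+ without --add-opens
            if (valueField.getType() == byte[].class) {
                // Newer JDKs: assumes a Latin-1 replacement and compact strings enabled
                valueField.set(target, replacement.getBytes(StandardCharsets.ISO_8859_1));
            } else {
                // Older JDKs: the backing array is a char[]
                valueField.set(target, replacement.toCharArray());
            }
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // InaccessibleObjectException is a RuntimeException, so it lands here
            System.out.println("Blocked: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // new String(...) yields a separate object, so no pooled literal is disturbed
        String target = new String("hello, world!");
        boolean changed = tryReplaceValue(target, "you're so cool!");
        System.out.println("changed = " + changed + ", target = " + target);
    }
}
```

The target is deliberately created with new String(...) so that a successful run does not alter a literal that other code in the same JVM might rely on.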

A bold idea

After seeing the result above, I had a bold idea.

Keep the same modifyString method, but change the main function to the following:

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        String s = "hello, world!";
        modifyString(s, "you're so cool!");
        System.out.println("hello, world!");
    }

Or even more directly:

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        modifyString("hello, world!", "you're so cool!");
        System.out.println("hello, world!");
    }

Can you guess what will be printed? If you are curious, try it yourself.

2. Principle analysis

1. String Constant Pool

String literals in Java (and strings explicitly interned with intern()) are stored in the string constant pool. Conceptually the pool belongs to the method area, but since Java 7 it has physically lived in the heap (see JMM).
The string constant pool holds the string objects that are already in use. When a string literal is needed, the JVM first looks for a matching object in the pool and returns it if one is found. Otherwise it creates a new string object and places it in the pool.
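
The pooling behavior can be observed with plain reference comparisons, no reflection needed. A small sketch (the class name StringPoolDemo is made up for illustration):

```java
final class StringPoolDemo {
    /** Two identical literals resolve to the same pooled object. */
    static boolean literalsShareOneObject() {
        String a = "hello, world!";
        String b = "hello, world!";
        return a == b;
    }

    /** new String(...) always allocates a separate object with equal content. */
    static boolean newStringIsDistinct() {
        String a = "hello, world!";
        String c = new String("hello, world!");
        return a != c && a.equals(c);
    }

    /** intern() maps any string back to the pooled instance. */
    static boolean internReturnsPooled() {
        String a = "hello, world!";
        String c = new String("hello, world!");
        return c.intern() == a;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject()); // true
        System.out.println(newStringIsDistinct());    // true
        System.out.println(internReturnsPooled());    // true
    }
}
```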

When execution reaches s = "hello, world!", the String variable s points to the "hello, world!" object in the constant pool. To keep things distinct, we will call this pooled object helloworld.
Java passes object references by value, so when modifyString(s, "you're so cool!") is called, the method receives a reference to that same helloworld object. Modifying the string inside the method therefore modifies helloworld.value, which is equivalent to directly modifying the string in the constant pool.
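
The difference between changing the object a reference points to and changing the reference itself can be shown with a mutable type. This is a minimal sketch using StringBuilder; the class and method names are hypothetical.

```java
final class PassByValueDemo {
    /** Reassigning the parameter only changes the method's local copy of the reference. */
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    /** Mutating through the parameter changes the object the caller also sees. */
    static void mutate(StringBuilder sb) {
        sb.append("!");
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("hi");
        reassign(sb);
        System.out.println(sb); // hi
        mutate(sb);
        System.out.println(sb); // hi!
    }
}
```

modifyString falls into the second category: it never reassigns src, it changes a field of the object that src refers to.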

So when we run the following code:

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        String s = "hello, world!";
        String t = "hello, world!";
        modifyString(s, "you're so cool!");
        System.out.println(t);
    }

The output is you're so cool!. Both s and t refer to the same helloworld object in the string constant pool. In other words, s and t behave like pointers, and the real object is helloworld in the constant pool.
When the internal value of the helloworld object is modified, the apparent values of both s and t change with it.

Look at this code again:

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        modifyString("hello, world!", "you're so cool!");
        System.out.println("hello, world!");
    }

By the same reasoning, System.out.println("hello, world!") ends up printing you're so cool!, and the mystery is solved.

2. new String()

Things change when you create a string with new String().
Consider the following code:

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        String s = "hello, world!";
        String t = new String("hello, world!");
        modifyString(s, "you're so cool!");
        System.out.println(t);
    }

The final output is hello, world! rather than you're so cool!. This is because the variable t points to a separate String object created on the heap, not to the helloworld object in the string constant pool.

When t = new String("hello, world!") executes, a new object is allocated on the heap, and the value array of helloworld in the string constant pool is assigned to t.value. Here is the relevant constructor of the String class:

    public String(String original) {
        this.value = original.value;
        this.coder = original.coder;
        this.hash = original.hash;
    }

When modifyString(s, "you're so cool!") modifies the string, it replaces helloworld.value with a new array. It does not replace t.value, which still refers to the original byte array holding "hello, world!".
[Diagram: helloworld.value points to the new array; t.value still points to the original array]
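
The aliasing described here can be mimicked without reflection. The sketch below uses ToyString, a made-up stand-in for java.lang.String whose field is left non-final so that it can be swapped directly; it is only a model of the behavior, not the real class.

```java
import java.nio.charset.StandardCharsets;

final class SharedArrayDemo {
    /** Hypothetical stand-in for java.lang.String; not the real class. */
    static final class ToyString {
        byte[] value; // non-final so it can be swapped without reflection

        ToyString(byte[] value) {
            this.value = value;
        }

        /** Mirrors String(String original): shares the array instead of copying it. */
        ToyString(ToyString original) {
            this.value = original.value;
        }

        @Override
        public String toString() {
            return new String(value, StandardCharsets.ISO_8859_1);
        }
    }

    /** Returns the text of s and t after s's array has been replaced. */
    static String[] replaceArrayOfFirst() {
        ToyString s = new ToyString("hello, world!".getBytes(StandardCharsets.ISO_8859_1));
        ToyString t = new ToyString(s); // t.value and s.value are the same array
        s.value = "you're so cool!".getBytes(StandardCharsets.ISO_8859_1); // only s is repointed
        return new String[] { s.toString(), t.toString() };
    }

    public static void main(String[] args) {
        String[] result = replaceArrayOfFirst();
        System.out.println("s = " + result[0]); // s = you're so cool!
        System.out.println("t = " + result[1]); // t = hello, world!
    }
}
```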

Having seen this, perhaps you have a bold idea of your own?

3. Another bold idea

In the analysis above, you already know that for the following code:

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        String s = "hello, world!";
        String t = new String("hello, world!");
        modifyString(s, "you're so cool!");
        System.out.println("t = " + t);
    }

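    // Needs: import java.lang.reflect.Field; import java.nio.charset.StandardCharsets;
    private static void modifyString(String src, String dst) throws NoSuchFieldException, IllegalAccessException {
        Field valueField = String.class.getDeclaredField("value");
        valueField.setAccessible(true);
        // Assumes Latin-1 text, so the bytes match String's internal encoding
        byte[] newValue = dst.getBytes(StandardCharsets.ISO_8859_1);
        valueField.set(src, newValue); // replaces the array reference
    }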

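t is unchanged, and the program still prints t = hello, world!. So what if we change modifyString so that it overwrites the contents of the existing value array instead of replacing it?

    // Needs: import java.lang.reflect.Field; import java.nio.charset.StandardCharsets;
    private static void modifyString(String src, String dst) throws NoSuchFieldException, IllegalAccessException {
        Field valueField = String.class.getDeclaredField("value");
        valueField.setAccessible(true);
        byte[] oldValue = (byte[]) valueField.get(src);
        // Assumes Latin-1 text, so the bytes match String's internal encoding
        byte[] newValue = dst.getBytes(StandardCharsets.ISO_8859_1);
        // Overwrite in place; the array cannot grow, so copy at most oldValue.length bytes
        System.arraycopy(newValue, 0, oldValue, 0, Math.min(oldValue.length, newValue.length));
    }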

Now when you run the main function again, you will see that the value of t has changed as well. However, because the length of the value array is fixed, only as many characters as the original string held can be shown:

Output:
t = you're so coo
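
The truncation comes purely from copying into a fixed-length array, which can be reproduced on ordinary byte arrays. A minimal sketch, assuming Latin-1 text (the class and method names are made up for illustration):

```java
import java.nio.charset.StandardCharsets;

final class InPlaceOverwriteDemo {
    /** Overwrites the bytes of original with replacement, keeping original's length. */
    static String overwrite(String original, String replacement) {
        byte[] oldValue = original.getBytes(StandardCharsets.ISO_8859_1);
        byte[] newValue = replacement.getBytes(StandardCharsets.ISO_8859_1);
        // Same copy as in modifyString: never write past the end of the old array
        System.arraycopy(newValue, 0, oldValue, 0, Math.min(oldValue.length, newValue.length));
        return new String(oldValue, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Longer replacement: cut off at 13 characters
        System.out.println(overwrite("hello, world!", "you're so cool!")); // you're so coo
        // Shorter replacement: the tail of the old text survives
        System.out.println(overwrite("hello, world!", "hi")); // hillo, world!
    }
}
```

The second call shows the other side of the same limitation: a shorter replacement leaves the rest of the old content in place.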

3. String in Android

Open the String class in the Android SDK and you will find that it has no value array. This is because Android changes the implementation of String to manage the character data directly at the native level, and adds an int field named count to String to represent its length.
A series of value-related methods, such as charAt and compareTo, have been turned into native methods.
Android also disables all of the String constructors; strings are created either from double-quoted literals or through StringFactory.

Android presumably does this for performance reasons, as an optimization at the bytecode level. I think it also has a safety benefit: it keeps the values in the string constant pool from being modified.

Either way, this fancy trick no longer works on Android.

Topics: Java Android string