Understanding Java string classes by combining Java fundamentals with the source code

Posted by JamesThePanda on Tue, 04 Jan 2022 09:37:04 +0100

String class

The string classes mainly refer to String, StringBuffer and StringBuilder. As the source code comments show, String and StringBuffer have existed since JDK 1.0, while StringBuilder was added in JDK 1.5.
Generally speaking, the most commonly used is String, which is immutable, followed by the mutable StringBuilder and StringBuffer. StringBuffer is thread safe because its methods are declared with the synchronized keyword.
StringBuilder and StringBuffer both inherit from the AbstractStringBuilder class and share many of its methods. This abstract class holds the common logic, so apart from thread safety the two classes behave essentially the same.

Understand that the string classes cannot be inherited

None of the above three string classes can be subclassed because they are declared final, for example:

public final class String

When final modifies a class, the class cannot be inherited; when it modifies a method, the method cannot be overridden; when it modifies a primitive variable, the value cannot be changed; and when it modifies a reference variable, the reference cannot be reassigned (the object it points to may still be modified).
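As a minimal sketch of these rules (the class name FinalDemo and its helper methods are hypothetical, not part of the JDK), the following checks the final modifier through reflection and shows that a final reference can still point at an object whose content changes:

```java
import java.lang.reflect.Modifier;

class FinalDemo {
    // Hypothetical helper: reports whether a class is declared final.
    static boolean isFinalClass(Class<?> type) {
        return Modifier.isFinal(type.getModifiers());
    }

    // A final reference cannot be reassigned, but the object behind it may still change.
    static String mutateThroughFinalReference() {
        final StringBuilder sb = new StringBuilder("ab");
        sb.append("cd");               // allowed: the object is modified, not the reference
        // sb = new StringBuilder();   // would not compile: sb is final
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(isFinalClass(String.class));        // true
        System.out.println(isFinalClass(StringBuffer.class));  // true
        System.out.println(isFinalClass(StringBuilder.class)); // true
        System.out.println(mutateThroughFinalReference());     // abcd
    }
}
```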

Understand that strings are immutable

String is immutable. Once a String object is created, its content cannot be changed. If you assign a new value to a String variable, the variable is actually made to point to a different object.
StringBuilder and StringBuffer are mutable, which means that after creation, the content of the object can be changed without changing the reference. What actually changes is the array inside the object, which is a char array in JDK 1.8 and a byte array in JDK 12.
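The difference can be observed by keeping a second reference to the original object. This is only an illustrative sketch; the class and method names are made up for the example:

```java
class ImmutabilityDemo {
    // "Changing" a String variable rebinds it to a new object; the old object is untouched.
    static boolean stringReferenceChanges() {
        String s = "ab";
        String before = s;        // second reference to the original object
        s = s + "cd";             // s now points to a different String
        return s != before && before.equals("ab");
    }

    // Appending to a StringBuilder changes the object in place; the reference stays the same.
    static boolean builderReferenceStays() {
        StringBuilder sb = new StringBuilder("ab");
        StringBuilder before = sb;
        sb.append("cd");
        return sb == before && before.toString().equals("abcd");
    }

    public static void main(String[] args) {
        System.out.println(stringReferenceChanges()); // true
        System.out.println(builderReferenceStays());  // true
    }
}
```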

Understand equals in String

The equals method is defined in the Object class. If it is not overridden, it simply compares the references of the two objects. The equals source code in Object is as follows:

public boolean equals(Object obj) {
	return (this == obj);
}

The equals method is overridden in String, so String's equals no longer just compares the references of the String objects. In JDK 1.8, the source code of equals in String is as follows:

public boolean equals(Object anObject) {
	if (this == anObject) {
		return true;
	}
	if (anObject instanceof String) {
		String anotherString = (String)anObject;
		int n = value.length;
		if (n == anotherString.value.length) {
			char v1[] = value;
			char v2[] = anotherString.value;
			int i = 0;
			while (n-- != 0) {
				if (v1[i] != v2[i])
					return false;
				i++;
			}
			return true;
		}
	}
	return false;
}

As can be seen from the logic above, when the two references are the same, it returns true. When the references differ, it first checks the type and then casts. If the lengths differ, it returns false; otherwise it traverses the underlying character arrays of the two strings and compares the elements one by one, returning true only when all of them are equal.
Note: in JDK 12, the equals of String has been rewritten and the logic is not as intuitive as the above. The root cause is that the underlying storage has changed: in JDK 1.8 it is a char array, while in JDK 12 it is a byte array. This change was introduced in JDK 9 with compact strings (JEP 254).
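A small sketch of the difference between reference comparison and String's equals follows. The class and helper names are hypothetical; the behavior shown does not depend on the JDK version:

```java
class EqualsDemo {
    // Reference comparison, which is what Object.equals does by default.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content comparison, as implemented by String.equals.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "abc";
        String copy = new String("abc");   // a distinct object with the same characters

        System.out.println(sameReference(literal, copy)); // false
        System.out.println(sameContent(literal, copy));   // true
        // The instanceof check fails for a non-String argument, so the result is false.
        System.out.println(literal.equals(new StringBuilder("abc"))); // false
    }
}
```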

Look at compareTo from the source code

compareTo is a method of the Comparable interface and is used to determine the ordering of two objects. The String class implements this interface and its compareTo method. The corresponding source code in JDK 1.8 is as follows:

public int compareTo(String anotherString) {
	int len1 = value.length;
	int len2 = anotherString.value.length;
	int lim = Math.min(len1, len2);
	char v1[] = value;
	char v2[] = anotherString.value;

	int k = 0;
	while (k < lim) {
		char c1 = v1[k];
		char c2 = v2[k];
		if (c1 != c2) {
			return c1 - c2;
		}
		k++;
	}
	return len1 - len2;
}

The logic of this method is also fairly intuitive: take the lengths of the underlying character arrays of the two strings, loop up to the smaller of the two lengths, and compare the characters at each position one by one. At the first position where the characters differ, the difference between them is returned; if no difference is found, the difference between the two lengths is returned.
Note: similarly, because the underlying storage changed, the implementation of compareTo in JDK 12 has also changed considerably.
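To make the return values concrete, here is a hedged re-statement of the JDK 1.8 logic that uses charAt instead of the private char array (the class CompareToDemo and its compare method are hypothetical), compared against the real String.compareTo:

```java
class CompareToDemo {
    // Hypothetical re-statement of the JDK 1.8 logic using only public String methods.
    static int compare(String s1, String s2) {
        int len1 = s1.length();
        int len2 = s2.length();
        int lim = Math.min(len1, len2);
        for (int k = 0; k < lim; k++) {
            char c1 = s1.charAt(k);
            char c2 = s2.charAt(k);
            if (c1 != c2) {
                return c1 - c2;   // the first differing position decides the result
            }
        }
        return len1 - len2;       // one string is a prefix of the other (or they are equal)
    }

    public static void main(String[] args) {
        System.out.println(compare("apple", "apply"));   // -20, i.e. 'e' (101) - 'y' (121)
        System.out.println("apple".compareTo("apply"));  // -20
        System.out.println(compare("ab", "abcd"));       // -2, the length difference
        System.out.println(compare("abc", "abc"));       // 0
    }
}
```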

Look at replace from the source code

replace is used to replace string content. The source code in JDK 1.8 is as follows:

public String replace(CharSequence target, CharSequence replacement) {
	return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
			this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}

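You can see that replace relies on the regular expression classes: the target is compiled with Pattern.LITERAL, so it is matched literally rather than as a regular expression, and Matcher.quoteReplacement escapes the replacement text. If you step into replaceAll and quoteReplacement, you can see that StringBuffer and StringBuilder are also used internally.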

Note: similarly, the logic of this method in JDK 12 has changed greatly.
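The practical consequence of the literal matching can be sketched as follows. The class and helper names are made up for the example; replace and replaceAll are the standard String methods:

```java
class ReplaceDemo {
    // replace treats both target and replacement as plain text.
    static String literalReplace(String s, String target, String replacement) {
        return s.replace(target, replacement);
    }

    // replaceAll treats the first argument as a regular expression.
    static String regexReplace(String s, String regex, String replacement) {
        return s.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        System.out.println(literalReplace("a.b.c", ".", "-")); // a-b-c
        System.out.println(regexReplace("a.b.c", ".", "-"));   // -----  ("." matches any character)
        // "$1" is kept as-is by replace, because the replacement is quoted.
        System.out.println(literalReplace("price", "price", "$1")); // $1
    }
}
```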

Detailed analysis of string concatenation

String concatenation generally uses "+" for String, and append for StringBuilder or StringBuffer.
In fact, not every "+" concatenation is compiled the same way in JDK 8. Take the following code as an example:

public static void main(String[] args) {
	String a1="ab"+"cd";

	String a="ab";
	String b="cd";
	String d=a+b;

	StringBuilder sb=new StringBuilder("ab");
	sb.append("cd");
}

The above code performs three string concatenations: one between literals, one between String variables, and one through StringBuilder.
After compiling, run "javap -c xxx.class" to view the generated bytecode, which is as follows:

0: ldc           #2                  // String abcd
2: astore_1
3: ldc           #3                  // String ab
5: astore_2
6: ldc           #4                  // String cd
8: astore_3
9: new           #5                  // class java/lang/StringBuilder
12: dup
13: invokespecial #6                  // Method java/lang/StringBuilder."<init>":()V
16: aload_2
17: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
20: aload_3
21: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
27: astore        4
29: new           #5                  // class java/lang/StringBuilder
32: dup
33: ldc           #3                  // String ab
35: invokespecial #9                  // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
38: astore        5
40: aload         5
42: ldc           #4                  // String cd
44: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
47: pop
48: return

There is a lot of output above, and fully understanding it requires some JVM background. For now it is enough to read the comments on the right, focusing on the following lines:

0: ldc           #2                  // String abcd

3: ldc           #3                  // String ab
6: ldc           #4                  // String cd
9: new           #5                  // class java/lang/StringBuilder
13: invokespecial #6                  // Method java/lang/StringBuilder."<init>":()V
17: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
21: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;

29: new           #5                  // class java/lang/StringBuilder
33: ldc           #3                  // String ab
35: invokespecial #9                  // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
42: ldc           #4                  // String cd
44: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;

As can be seen from the above, for "+" concatenation of string literals, the compiler already optimizes at compile time: the compiled class file contains the single joined string "abcd".
For "+" concatenation of variables, the two String variables are defined first, then a StringBuilder object is created and initialized, append is called for each operand, and finally toString converts the result back to a String.
For the explicit StringBuilder, a StringBuilder object is created and initialized with the constant "ab", and then append is called with the constant "cd".
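The effect of the compile-time folding can also be observed at runtime through reference comparison. The following is a minimal sketch with hypothetical class and method names; it relies only on the language rule that a constant expression such as "ab" + "cd" is interned, while concatenation of non-final variables produces a new String object:

```java
class ConcatDemo {
    // Literal concatenation is folded by the compiler into the constant "abcd".
    static boolean literalConcatIsFolded() {
        String a1 = "ab" + "cd";
        return a1 == "abcd";      // same interned constant
    }

    // Concatenating non-final variables happens at runtime and creates a new object.
    static boolean variableConcatCreatesNewObject() {
        String a = "ab";
        String b = "cd";
        String d = a + b;
        return d != "abcd" && d.equals("abcd");
    }

    // In a loop, an explicit StringBuilder avoids creating an intermediate String per step.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsFolded());          // true
        System.out.println(variableConcatCreatesNewObject()); // true
        System.out.println(joinWithBuilder(new String[] {"ab", "cd"})); // abcd
    }
}
```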

About StringBuilder capacity expansion

String and StringBuilder both store their content in an underlying array: a char array in JDK 8 and a byte array in later versions. The length of an array cannot change once it is created, so appending to a StringBuilder may require the underlying array to be expanded. In the JDK 8 source code, the relevant code is mainly the following:

public AbstractStringBuilder append(String str) {
	if (str == null)
		return appendNull();
	int len = str.length();
	ensureCapacityInternal(count + len);
	str.getChars(0, len, value, count);
	count += len;
	return this;
}

private void ensureCapacityInternal(int minimumCapacity) {
	// overflow-conscious code
	if (minimumCapacity - value.length > 0) {
		value = Arrays.copyOf(value,
				newCapacity(minimumCapacity));
	}
}

private int newCapacity(int minCapacity) {
	// overflow-conscious code
	int newCapacity = (value.length << 1) + 2;
	if (newCapacity - minCapacity < 0) {
		newCapacity = minCapacity;
	}
	return (newCapacity <= 0 || MAX_ARRAY_SIZE - newCapacity < 0)
		? hugeCapacity(minCapacity)
		: newCapacity;
}

Here, append first adds the current character count (count) to the length of the new string and passes the sum to the capacity expansion method as the minimum capacity.
The capacity expansion method compares this value with the length of the underlying array. When it exceeds the array length, the underlying array is copied into a larger one.
During expansion, the new capacity is normally the current array length multiplied by 2 plus 2; if that is still smaller than the minimum capacity, the minimum capacity is used instead. If the result overflows or exceeds MAX_ARRAY_SIZE, hugeCapacity is called, which throws OutOfMemoryError when the required length exceeds Integer.MAX_VALUE, so Integer.MAX_VALUE is the upper bound on the array length.
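The normal-case growth rule can be sketched as a small standalone function. The class CapacityDemo and its grow method are hypothetical and deliberately ignore the overflow branch; capacity() and ensureCapacity() are the public StringBuilder methods, and the default initial capacity of 16 is taken from the StringBuilder documentation:

```java
class CapacityDemo {
    // Hypothetical re-statement of the normal-case rule: double the current capacity,
    // add 2, and fall back to the requested minimum if that is still too small.
    static int grow(int currentCapacity, int minCapacity) {
        int newCapacity = (currentCapacity << 1) + 2;
        return Math.max(newCapacity, minCapacity);
    }

    public static void main(String[] args) {
        System.out.println(grow(16, 17));   // 34: doubling plus 2 is enough
        System.out.println(grow(16, 100));  // 100: the requested minimum wins

        StringBuilder sb = new StringBuilder();
        System.out.println(sb.capacity());  // 16, the documented default
        sb.ensureCapacity(17);              // request room for at least 17 characters
        System.out.println(sb.capacity());  // the capacity after expansion
    }
}
```

When the final length is roughly known in advance, passing it to the StringBuilder(int capacity) constructor avoids repeated array copies.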

Topics: Java string source code