Java String

Posted by shneoh on Sun, 12 May 2019 00:13:53 +0200

As we know, a Java String is immutable: once created, its content cannot change. The string operations we use most often all return new String objects and leave the original untouched, which is exactly the same as in C#.
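As a small illustration of that immutability, here is a minimal sketch (the class name ImmutableStringDemo and its helper method are made up for this example). It calls two "modifying" methods on a string and shows that the original value is still intact:

```java
class ImmutableStringDemo {
    // Calls methods that look like they modify the string, then returns the original.
    static String afterModifyingCalls() {
        String s = "marson";
        s.toUpperCase();     // returns a new String; the result is discarded here
        s.concat(" shine");  // also returns a new String; s is untouched
        return s;
    }

    public static void main(String[] args) {
        String s = "marson";
        String upper = s.toUpperCase();
        System.out.println(s);                      // marson
        System.out.println(upper);                  // MARSON
        System.out.println(afterModifyingCalls());  // marson
    }
}
```

To keep a result, it has to be assigned to a variable, as with upper above.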

Because strings are immutable, operations on them inevitably have a performance cost. Take the + operator we use all the time:

public class ConcatString{
    public static void main(String[] args) {
        var name = "marson";
        var s = "abc" + name + "shine" + 47+ "nancy" + "summer zhu";
        System.out.println(s);
    }
}

Code like this is very common in day-to-day development. Before any concatenation happens, there are already six operands here, and the + operator then combines them into a new String object. Now imagine how many intermediate objects are created when we concatenate a large number of strings (of unknown length, expected to exceed some threshold), for example in a loop: this hurts both memory usage and performance.

So then there's StringBuilder.

StringBuilder

StringBuilder was designed to address the cost of String's immutability. Internally it maintains a character buffer with a default initial capacity of 16, which grows automatically as needed. Because it is mutable, append modifies the same object instead of creating a new string each time. So StringBuilder can be significantly faster than String when there is a lot of string concatenation.
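The two properties just described can be checked directly. This is only a sketch (BuilderCapacityDemo and its method names are invented for illustration): it reads the default capacity and confirms that append hands back the very same builder object.

```java
class BuilderCapacityDemo {
    // Capacity of a builder created with the no-argument constructor.
    static int defaultCapacity() {
        return new StringBuilder().capacity();
    }

    // append returns the builder it was called on, so chained calls mutate one object.
    static boolean appendReturnsSameObject() {
        StringBuilder sb = new StringBuilder();
        StringBuilder returned = sb.append("abc").append(47);
        return returned == sb;
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity());          // 16
        System.out.println(appendReturnsSameObject());  // true

        StringBuilder sb = new StringBuilder();
        sb.append("abcdefghijklmn1234567890");          // 24 chars, more than 16
        // The buffer has grown; capacity is always at least the current length.
        System.out.println(sb.capacity() >= sb.length()); // true
    }
}
```

If the final size is roughly known in advance, passing it to new StringBuilder(int capacity) avoids the intermediate growth steps.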

Before Java SE 5, StringBuffer played the role of StringBuilder. StringBuffer is thread-safe: if you read its source code carefully, you will find that its methods are full of the synchronized keyword, so its performance overhead is relatively high.

Here is a StringBuilder demo:

public void usingStringBuilder(){
    var sb = new StringBuilder();
    sb.append("abc").append("marson").append("shine")
        .append("summer").append("zhu");
    System.out.println(sb);
}

The Hidden Trap of StringBuilder

Now let's look at a pitfall described in the book Thinking in Java. Here is the code:

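import java.util.ArrayList;

public class InfiniteRecursion {
    @Override
    public String toString() {
        // Bug (intentional): concatenating "this" calls toString() again, recursively.
        return "InfiniteRecursion address: " + this + "\n";
    }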

    public static void main(String[] args) {
        var v = new ArrayList<InfiniteRecursion>();
        for (int i = 0; i < 10; i++) {
            v.add(new InfiniteRecursion());
        }
        System.out.println(v);
    }
}

If you are not careful, you will make the same mistake as the code above and get a StackOverflowError.

This is a stack overflow caused by infinite recursion. The InfiniteRecursion class overrides toString and builds its return value with the + operator. Although the operand is this, it appears in a string concatenation, so it has to be converted to a String, which calls toString again, and so on until the stack is exhausted.
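One way to avoid the recursion is to call super.toString() instead of concatenating this, so that Object's implementation is used. The sketch below assumes that is the intent; SafeToString is a hypothetical name. Note that Object.toString() prints the class name and a hex hash code, which is not strictly a memory address.

```java
import java.util.ArrayList;
import java.util.List;

class SafeToString {
    @Override
    public String toString() {
        // super.toString() is Object.toString(): class name + '@' + hex hash code.
        // It does not call this class's toString(), so there is no recursion.
        return "SafeToString address: " + super.toString() + "\n";
    }

    public static void main(String[] args) {
        List<SafeToString> v = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            v.add(new SafeToString());
        }
        System.out.println(v); // prints normally, no StackOverflowError
    }
}
```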

About the string pool: intern

Java keeps a container for string objects called the string pool. If a string's content is not yet in the pool, the String object is stored there and gets a unique reference. When we later need a string with the same content, we can refer directly to the object in the pool, which reduces the overhead of creating new strings and improves application performance. String's instance method intern serves this purpose:

public class StringIntern {
    public static void main(String[] args) {
        var s = "MarsonShine";
        var ss = new String("MarsonShine");
        var sss = ss.intern();
        System.out.println("s == ss: " + (s == ss));// false
        System.out.println("s == sss: "+(s == sss));//  true
        System.out.println("ss == sss: "+(ss == sss));// false
    }
}

String VS StringBuilder

Finally, let's compare the performance of String and StringBuilder when concatenating strings.

public class StringVsStringBuilder {
    private static final String INIT_STRING = "abcdefghijklmn1234567890";

    public static void main(String[] args) {
        var sw = new Stopwatch();
        sw.start();
        var str = "";
        for (int i = 0; i < 100000; i++) {
            str += INIT_STRING;
        }
        sw.end();
        System.out.println("String + Running time:" + sw.ElapsedMilliseconds() + " ms");

        sw.restart();
        var sb = new StringBuilder();
        for (int i = 0; i < 100000; i++) {
            sb.append(INIT_STRING);
        }
        sw.end();
        System.out.println("StringBuilder append Running time:" + sw.ElapsedMilliseconds() + " ms");
    }
}

This class uses String and StringBuilder in turn to concatenate the fixed-length INIT_STRING 100,000 times.

As you would expect, the latter takes far less time than the former. When only a few strings are concatenated, though, the difference is minimal. In theory, StringBuilder should be slower than String when concatenating just a few strings, but after many tests on my machine I found that StringBuilder was always faster. I even wonder whether my Stopwatch helper class is written incorrectly.

Finally, here is the Stopwatch code:

package performance;

public class Stopwatch {
    private long startTime;
    private long endTime;
    public void start(){
        startTime = System.currentTimeMillis();
    }
    public void end(){
        endTime = System.currentTimeMillis();
    }
    public void restart(){
        startTime = System.currentTimeMillis();
    }
    public long ElapsedMilliseconds(){
        return endTime - startTime;
    }
}
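Regarding the doubt about the Stopwatch helper: System.currentTimeMillis() has millisecond granularity at best, so a handful of concatenations can easily measure as 0 ms either way. A possible alternative, sketched here under the hypothetical name NanoStopwatch, is to use System.nanoTime(), which is intended for measuring elapsed time. This is not a full benchmark harness; JIT warm-up and garbage collection can still distort short measurements.

```java
class NanoStopwatch {
    private long startNanos;
    private long endNanos;

    void start() {
        startNanos = System.nanoTime();
    }

    void end() {
        endNanos = System.nanoTime();
    }

    // Elapsed time between start() and end() in nanoseconds.
    long elapsedNanos() {
        return endNanos - startNanos;
    }

    // Same interval in milliseconds, keeping the fractional part.
    double elapsedMillis() {
        return elapsedNanos() / 1_000_000.0;
    }

    public static void main(String[] args) {
        NanoStopwatch sw = new NanoStopwatch();
        sw.start();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10; i++) {   // deliberately small workload
            sb.append("abcdefghijklmn1234567890");
        }
        sw.end();
        System.out.println("StringBuilder append: " + sw.elapsedMillis() + " ms");
    }
}
```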

Epilogue

I wanted to find out how the + operator in Java is actually invoked and what happens at run time. Does it call a concat method, as in C#?

Later, by decompiling the compiled code, I found that String's + operator becomes an invokedynamic instruction in the bytecode:

The invokedynamic instruction calls java.lang.invoke.StringConcatFactory.makeConcatWithConstants, which builds a CallSite from the MethodHandle and MethodType information to carry out the actual concatenation. However, I am not clear about how the CallSite is invoked, and stepping through it in the debugger did not make it clear either (IDEA cannot step into it).
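To get a feel for the CallSite step, the bootstrap method can also be called by hand. The following is a rough sketch for Java 9 or later, not a reproduction of what javac emits: the class name ManualConcatCallSite, the "concat" name argument, and the recipe string are all chosen for illustration. In the recipe, the character U+0001 marks where each dynamic argument is inserted and every other character is a constant.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class ManualConcatCallSite {
    static String concat(String name, int number) throws Throwable {
        // Shape of the concatenation: (String, int) -> String
        MethodType type = MethodType.methodType(String.class, String.class, int.class);
        // Illustrative recipe: "abc" + <arg 0> + "shine" + <arg 1>
        String recipe = "abc\u0001shine\u0001";
        // The bootstrap method links a call site for this particular shape and recipe.
        CallSite site = StringConcatFactory.makeConcatWithConstants(
                MethodHandles.lookup(), "concat", type, recipe);
        // The call site's target is a MethodHandle that performs the concatenation.
        MethodHandle target = site.getTarget();
        return (String) target.invokeExact(name, number);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(concat("marson", 47)); // abcmarsonshine47
    }
}
```

With invokedynamic, this linking happens once per call site at run time, and later executions reuse the linked target.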

If you know how this works, please let me know ^^
