Analysis of string splicing in Java code

Posted by direland on Sun, 13 Feb 2022 09:49:25 +0100

The string splicing methods discussed in this paper are as follows: "+" sign, StringBuilder, StringJoiner and String#join, which are compared and analyzed to explore the best practices.

conclusion

The following content is boring, so let's start with the conclusion:

  1. The string splicing methods discussed in this paper are as follows: "+" sign, StringBuilder, StringJoiner and String#join
  2. In a simple string splicing scenario, such as "a" + "b" + "c", there is no significant difference in the performance of the above four methods.
  3. In the scenario of circular string splicing, the "+" sign has the lowest performance, and there is no significant difference in the performance of the other three methods. However, according to the verification results, it can be found that the StringBuilder with the specified initial capacity has the highest efficiency. Of course, we should not only consider the performance, but also consider the efficiency of garbage collection to avoid OOM.
  4. At the end of this paper, the StringBuffer is supplemented and compared. In the scenario of no competition for shared resources, the performance of StringBuffer does not deteriorate significantly.

Best practices

  1. Alibaba Java development manual - log specification "5" can be optimized: the form of placeholder is not readable and convenient. Lambda can be considered to delay string splicing and make it more convenient to use.
  2. Alibaba Java development manual - OOP Protocol "23" can be optimized: StringBuilder must be used for loop splicing; When splicing a large number of high-capacity strings, use StringBuilder to specify the initial capacity as much as possible.
  3. Simple string splicing can be done in any way. It is recommended to use "+" sign directly for the best readability.
  4. Try to use the features directly provided by JDK, such as "+" sign, concatenated string, Synchronized keyword, etc., because the compiler + JVM will continue to optimize this, and JDK upgrade can obtain greater benefits. Unless there is a clear reason, it can realize similar functions by itself.
  5. In scenarios where thread safety needs to be considered, StringBuffer can be used for string splicing. However, generally speaking, there is no such requirement, so StringBuffer should not be used to avoid increasing complexity.

Analysis process

environment

  1. System: windows 10 21H1
  2. JDK: OpenJDK 1.8.0_302
  3. Sample code for analysis:
@Slf4j
public class StringConcat {

    @SneakyThrows
    public static void main(String[] args) {
        log.info("java Virtual machine warm-up starts");
        String[] strs = new String[6000000];
        for (int i = 0; i < strs.length; i++) {
            strs[i] = id();
        }
        loopStringJoiner(strs);
        loopStringJoin(strs);
        loopStringBuilder(strs);
        log.info("java Virtual machine warm-up ends");
        Thread.sleep(1000);
        log.info("Start test:");

        Thread.sleep(1000);
        Stopwatch stopwatchLoopPlus = Stopwatch.createStarted();
//        loopPlus(strs);
        log.info("loop-plus: " + stopwatchLoopPlus.elapsed(TimeUnit.MILLISECONDS));

        Thread.sleep(1000);
        Stopwatch stopwatchLoopStringBuilderCapacity = Stopwatch.createStarted();
        loopStringBuilderCapacity(strs);
        log.info("loop-stringBuilderCapacity: " + stopwatchLoopStringBuilderCapacity.elapsed(TimeUnit.MILLISECONDS));

        Thread.sleep(1000);
        Stopwatch stopwatchLoopStringBuilder = Stopwatch.createStarted();
        loopStringBuilder(strs);
        log.info("loop-stringBuilder: " + stopwatchLoopStringBuilder.elapsed(TimeUnit.MILLISECONDS));

        Thread.sleep(1000);
        Stopwatch stopwatchLoopJoin = Stopwatch.createStarted();
        loopStringJoin(strs);
        log.info("loop-String.join: " + stopwatchLoopJoin.elapsed(TimeUnit.MILLISECONDS));

        Thread.sleep(1000);
        Stopwatch stopwatchLoopStringJoiner = Stopwatch.createStarted();
        loopStringJoiner(strs);
        log.info("loop-stringJoiner: " + stopwatchLoopStringJoiner.elapsed(TimeUnit.MILLISECONDS));

        Thread.sleep(1000);
        Stopwatch stopwatchSimplePlus = Stopwatch.createStarted();
        for (int i = 0; i < 500000; i++) {
            simplePlus(id(), id(), id());
        }
        log.info("simple-Plus: " + stopwatchSimplePlus.elapsed(TimeUnit.MILLISECONDS));

        Thread.sleep(1000);
        Stopwatch stopwatchSimpleStringBuilder = Stopwatch.createStarted();
        for (int i = 0; i < 500000; i++) {
            simpleStringBuilder(id(), id(), id());
        }
        log.info("simple-StringBuilder: " + stopwatchSimpleStringBuilder.elapsed(TimeUnit.MILLISECONDS));

        Thread.sleep(1000);
        Stopwatch stopwatchSimpleStringBuffer = Stopwatch.createStarted();
        for (int i = 0; i < 500000; i++) {
            simpleStringBuffer(id(), id(), id());
        }
        log.info("simple-StringBuffer: " + stopwatchSimpleStringBuffer.elapsed(TimeUnit.MILLISECONDS));

    }

    private static String loopPlus(String[] strs) {
        String str = "";
        for (String s : strs) {
            str = str + "+" + s;
        }
        return str;
    }

    private static String loopStringBuilder(String[] strs) {
        StringBuilder str = new StringBuilder();
        for (String s : strs) {
            str.append("+");
            str.append(s);
        }
        return str.toString();
    }

    private static String loopStringBuilderCapacity(String[] strs) {
        StringBuilder str = new StringBuilder(strs[0].length() * strs.length);
        for (String s : strs) {
            str.append("+");
            str.append(s);
        }
        return str.toString();
    }

    private static String loopStringJoin(String[] strs) {
        StringJoiner joiner = new StringJoiner("+");
        for (String str : strs) {
            joiner.add(str);
        }
        return joiner.toString();
    }

    private static String loopStringJoiner(String[] strs) {
        return String.join("+", strs);
    }

    private static String simplePlus(String a, String b, String c) {
        return a + "+" + b + "+" + c;
    }

    private static String simpleStringBuilder(String a, String b, String c) {
        StringBuilder builder = new StringBuilder();
        builder.append(a);
        builder.append("+");
        builder.append(b);
        builder.append("+");
        builder.append(c);
        return builder.toString();
    }

    private static String simpleStringBuffer(String a, String b, String c) {
        StringBuffer buffer = new StringBuffer();
        buffer.append(a);
        buffer.append("+");
        buffer.append(b);
        buffer.append("+");
        buffer.append(c);
        return buffer.toString();
    }

    private static String id() {
        return UUID.randomUUID().toString();
    }

}

Results and summary

- java Virtual machine warm-up starts
- java Virtual machine warm-up ends
- Start test:
- loop-plus: Execution timeout
- loop-stringBuilderCapacity: 285
- loop-stringBuilder: 1968
- loop-String.join: 1313
- loop-stringJoiner: 1238
- simple-Plus: 812
- simple-StringBuilder: 840
- simple-StringBuffer: 857
  1. After many tests, it can be found that in the string circular splicing scenario, the performance of directly using the "+" sign is the lowest, the performance of StringBuilder with initial capacity is the highest, and there is no great difference in the performance of other methods.
  2. After many tests, it can be found that in the scenario of simple string splicing, the performance gap of "+" sign, StringBuilder and StringBuffer is about 5%, which can be understood as test error, and the performance of the three methods can be considered to be consistent.

Code and result analysis

1. Comparison between StringBuilder and StringBuffer

In the scenario of no competition for shared resources, the JVM will use biased locking and other methods to optimize, or even eliminate locks. There is no significant difference in performance whether the Synchronized keyword is used or not.

2. Bytecode analysis

When the compiler uses the "+" method and the "simplePlus" method to optimize the content, it can be seen that the performance of the two methods is basically the same. It can be seen that the simplePlus method can be used to optimize the content directly. However, the following two methods can be used:

  // access flags 0xA
  private static simplePlus(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
    // parameter  a
    // parameter  b
    // parameter  c
   L0
    LINENUMBER 125 L0
    NEW java/lang/StringBuilder
    DUP
    INVOKESPECIAL java/lang/StringBuilder.<init> ()V
    ALOAD 0
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    LDC "+"
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    ALOAD 1
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    LDC "+"
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    ALOAD 2
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
    ARETURN
   L1
    LOCALVARIABLE a Ljava/lang/String; L0 L1 0
    LOCALVARIABLE b Ljava/lang/String; L0 L1 1
    LOCALVARIABLE c Ljava/lang/String; L0 L1 2
    MAXSTACK = 2
    MAXLOCALS = 3

  // access flags 0xA
  private static simpleStringBuilder(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
    // parameter  a
    // parameter  b
    // parameter  c
   L0
    LINENUMBER 129 L0
    NEW java/lang/StringBuilder
    DUP
    INVOKESPECIAL java/lang/StringBuilder.<init> ()V
    ASTORE 3
   L1
    LINENUMBER 130 L1
    ALOAD 3
    ALOAD 0
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    POP
   L2
    LINENUMBER 131 L2
    ALOAD 3
    LDC "+"
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    POP
   L3
    LINENUMBER 132 L3
    ALOAD 3
    ALOAD 1
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    POP
   L4
    LINENUMBER 133 L4
    ALOAD 3
    LDC "+"
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    POP
   L5
    LINENUMBER 134 L5
    ALOAD 3
    ALOAD 2
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    POP
   L6
    LINENUMBER 135 L6
    ALOAD 3
    INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
    ARETURN
   L7
    LOCALVARIABLE a Ljava/lang/String; L0 L7 0
    LOCALVARIABLE b Ljava/lang/String; L0 L7 1
    LOCALVARIABLE c Ljava/lang/String; L0 L7 2
    LOCALVARIABLE builder Ljava/lang/StringBuilder; L1 L7 3
    MAXSTACK = 2
    MAXLOCALS = 4

Topics: Java