String concatenation shows up everywhere in real projects, but it can cause performance problems if you are not careful. Recently, during a code review, I found that a colleague had written the following code, and I filed a bug against it.

@Test
public void testForAdd() {
    String result = "NO_";
    for (int i = 0; i < 10; i++) {
        result += i;
    }
    System.out.println(result);
}

This article works from the surface down to the bytecode to explain why this approach has performance problems.

The IDE hint

If your IDE has a code inspection plugin installed, you will see the “+=” operation in the code above highlighted with a yellow background. This is the plugin’s hint.

String concatenation ‘+=’ in loop

Inspection info: Reports String concatenation in loops. As every String concatenation copies the whole String, usually it is preferable to replace it with explicit calls to StringBuilder.append() or StringBuffer.append().

In other words, the inspection flags the use of “+=” for string concatenation inside a loop. Because every concatenation copies the entire String, it recommends replacing the concatenation with explicit calls to StringBuilder.append() or StringBuffer.append().

The hint gives both the reason and a suggested fix. But is it really as simple as the hint says? After all, on Java 8 the compiler already turns string concatenation into StringBuilder calls by default. Let’s take a closer look.

Disassembling the bytecode

Let’s disassemble the compiled class to see whether the compiler optimized anything for us and whether the entire String still gets copied.

Use the javap -c command to view the bytecode:

public void testForAdd();
  Code:
     0: ldc           #2   // String NO_
     2: astore_1
     3: iconst_0
     4: istore_2
     5: iload_2
     6: bipush        10
     // the for loop condition: if i >= 10, jump to offset 36 and leave the loop
     8: if_icmpge     36
     // create a StringBuilder object and push its reference onto the stack
    11: new           #3   // class java/lang/StringBuilder
    14: dup
    15: invokespecial #4   // Method java/lang/StringBuilder."<init>":()V
    18: aload_1
    19: invokevirtual #5   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    22: iload_2
     // call the append method
    23: invokevirtual #6   // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
     // call the toString method, leaving the resulting String on top of the stack
    26: invokevirtual #7   // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    29: astore_1
    30: iinc          2, 1
    33: goto          5
    36: getstatic     #8   // Field java/lang/System.out:Ljava/io/PrintStream;
    39: aload_1
    40: invokevirtual #9   // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    43: return

The key operations in the disassembled bytecode above are annotated with comments. The “NO_” string is loaded at offset 0, and the loop condition is checked at offset 8; while i is in the range 0 to 9, the loop body that follows is executed. In the loop body, offset 11 creates a StringBuilder object, offset 15 calls the StringBuilder constructor, offsets 19 and 23 call the append method, and offset 26 calls the toString method.

What does this tell us? The compiler did optimize the code for us: it converted the string concatenation in the for loop into a StringBuilder and handled it through the append and toString methods. So is there no problem after all, since the compiler has already optimized it?

But here’s the kicker: every iteration of the for loop creates a new StringBuilder, calls append and toString on it, and then discards it. That is scary: it is at least as costly as creating a new String and copying the entire contents on every iteration.

After the above analysis, the effect of the above code is equivalent to the following:

@Test
public void testForAdd1() {
    String result = "NO_";
    for (int i = 0; i < 10; i++) {
        result = new StringBuilder().append(result).append(i).toString();
    }
    System.out.println(result);
}

Does that make sense? By now, you can see why I filed the bug against my colleague’s code.
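To make the cost concrete, here is a minimal sketch that checks the two forms produce the same string and estimates how many characters get copied along the way. The class and method names are made up for illustration, and the count is a simplified model rather than a measurement: it assumes each iteration copies the current string into a fresh StringBuilder and copies the content once more in toString(), and it ignores internal buffer resizing.

```java
class ConcatCostSketch {

    // The loop as written in the review: += inside a for loop.
    static String viaPlusEquals(int n) {
        String result = "NO_";
        for (int i = 0; i < n; i++) {
            result += i;
        }
        return result;
    }

    // The same loop written out the way the Java 8 bytecode above does it.
    static String viaExplicitBuilder(int n) {
        String result = "NO_";
        for (int i = 0; i < n; i++) {
            result = new StringBuilder().append(result).append(i).toString();
        }
        return result;
    }

    // Simplified cost model (an assumption, not a measurement).
    static long estimatedCharsCopied(int n) {
        long copied = 0;
        int length = "NO_".length();
        for (int i = 0; i < n; i++) {
            copied += length;                      // append(result) copies the old string
            length += String.valueOf(i).length();  // append(i) adds the digits of i
            copied += length;                      // toString() copies the new content
        }
        return copied;
    }

    public static void main(String[] args) {
        System.out.println(viaPlusEquals(10).equals(viaExplicitBuilder(10)));
        System.out.println("n = 10:     " + estimatedCharsCopied(10));
        System.out.println("n = 10000:  " + estimatedCharsCopied(10_000));
    }
}
```

Under this model the copied character count grows roughly with the square of the number of iterations, which is why the pattern is harmless for ten iterations and painful for large loops.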

An improved version

So how can the code be improved to address this problem? Here is the code:

@Test
public void testForAppend() {
    StringBuilder result = new StringBuilder("NO_");
    for (int i = 0; i < 10; i++) {
        result.append(i);
    }
    System.out.println(result);
}

Move the creation of the StringBuilder object outside the loop and call append directly inside the for loop. Let’s look at the bytecode for this version:

public void testForAppend();
  Code:
     0: new           #3   // class java/lang/StringBuilder
     3: dup
     4: ldc           #2   // String NO_
     6: invokespecial #10  // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
     9: astore_1
    10: iconst_0
    11: istore_2
    12: iload_2
    13: bipush        10
    15: if_icmpge     30
    18: aload_1
    19: iload_2
    20: invokevirtual #6   // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
    23: pop
    24: iinc          2, 1
    27: goto          12
    30: getstatic     #8   // Field java/lang/System.out:Ljava/io/PrintStream;
    33: aload_1
    34: invokevirtual #11  // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
    37: return

Compared with the original bytecode, the loop body is much simpler: it no longer creates a StringBuilder or calls toString on every iteration. The problem is solved.
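If you want a rough feel for the difference on your own machine, a sketch like the following can help. It is only a naive timing with System.nanoTime, not a proper benchmark (no warm-up, a single run), and the class name, method names, and the loop size of 5,000 are arbitrary choices for illustration.

```java
class ConcatTimingSketch {

    // += inside the loop: a new StringBuilder and a new String per iteration.
    static String plusEquals(int n) {
        String result = "NO_";
        for (int i = 0; i < n; i++) {
            result += i;
        }
        return result;
    }

    // One StringBuilder created outside the loop and reused for every append.
    static String singleBuilder(int n) {
        StringBuilder result = new StringBuilder("NO_");
        for (int i = 0; i < n; i++) {
            result.append(i);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        int n = 5_000; // arbitrary size, large enough for the gap to show up

        long t0 = System.nanoTime();
        String a = plusEquals(n);
        long t1 = System.nanoTime();
        String b = singleBuilder(n);
        long t2 = System.nanoTime();

        System.out.println("+= in loop:     " + (t1 - t0) / 1_000 + " us");
        System.out.println("single builder: " + (t2 - t1) / 1_000 + " us");
        System.out.println("same result:    " + a.equals(b));
    }
}
```

The absolute numbers depend on the JVM and hardware, so treat them as an indication only; a harness such as JMH is the better tool for real measurements.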

Concatenation inside the for loop body

The scenario above builds one complete string across the iterations of a for loop. In some business scenarios, however, the concatenated string is created and used entirely inside the loop body and is never needed outside it, for example:

@Test
public void testInfoForAppend() {
    for (int i = 0; i < 10; i++) {
        String result = "NO_" + i;
        System.out.println(result);
    }
}

The concatenation inside the loop body can be much more complex than this. As we already know, the compiler turns it into the StringBuilder calls described above, so a new StringBuilder object is created on every iteration. Should we just leave it at that?

Another option is to create a StringBuilder outside the for loop and empty it before each use inside the loop. There are two ways to empty it: delete and setLength.

Here is sample code for both:

@Test
public void testDelete() {
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < 10; i++) {
        result.delete(0, result.length());
        result.append(i);
        System.out.println(result);
    }
}

@Test
public void testSetLength() {
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < 10; i++) {
        result.setLength(0);
        result.append(i);
        System.out.println(result);
    }
}

If you are interested in benchmarking this example or in what happens under the hood, you can dig deeper; here I only give the conclusion. In my tests, both methods performed much better than the default concatenation. In addition, delete was slightly faster than setLength, so I recommend the delete operation.
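One reason reusing the builder helps is that both ways of emptying it leave the internal buffer allocated, so later iterations can write into the same storage. The sketch below illustrates this with the real StringBuilder methods capacity(), delete(), and setLength(); the class name, the helper names, the "NO_" labels, and the initial capacity of 64 are assumptions made for the example.

```java
import java.util.Arrays;

class BuilderResetSketch {

    // Reuses one builder across iterations, emptying it with setLength(0).
    static String[] labelsWithSetLength(int n) {
        StringBuilder sb = new StringBuilder();
        String[] labels = new String[n];
        for (int i = 0; i < n; i++) {
            sb.setLength(0);
            labels[i] = sb.append("NO_").append(i).toString();
        }
        return labels;
    }

    // Same idea, emptying the builder with delete(0, length()).
    static String[] labelsWithDelete(int n) {
        StringBuilder sb = new StringBuilder();
        String[] labels = new String[n];
        for (int i = 0; i < n; i++) {
            sb.delete(0, sb.length());
            labels[i] = sb.append("NO_").append(i).toString();
        }
        return labels;
    }

    // Returns {capacity before reset, capacity after reset, length after reset}.
    static int[] resetAndInspect(boolean useDelete) {
        StringBuilder sb = new StringBuilder(64).append("some content");
        int before = sb.capacity();
        if (useDelete) {
            sb.delete(0, sb.length());
        } else {
            sb.setLength(0);
        }
        return new int[] { before, sb.capacity(), sb.length() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(labelsWithSetLength(3)));
        System.out.println(Arrays.toString(labelsWithDelete(3)));
        System.out.println(Arrays.toString(resetAndInspect(true)));
        System.out.println(Arrays.toString(resetAndInspect(false)));
    }
}
```

Both helpers produce the same labels, and in both cases the capacity is unchanged after the reset while the length drops to zero.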

Summary

Starting from a single IDE hint, we dug into the underlying mechanism and verified it against the bytecode, and found a surprising amount of room for improvement and hidden knowledge along the way. Isn’t that satisfying? Finally, let’s summarize the key points about String and StringBuilder (based on Java 8 and above):

  • For string concatenation outside a loop, just use +; the compiler will optimize it for us.
  • For string concatenation in concurrent scenarios, use StringBuffer, which is thread-safe, instead of StringBuilder.
  • The compiler’s optimization has drawbacks inside a loop. Build a StringBuilder outside the loop and call append inside it.
  • For string concatenation that is used only inside the loop body, build a StringBuilder outside the loop and empty it for each iteration (delete or setLength).
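The second point in the summary can be illustrated with a small sketch in which several threads append to one shared StringBuffer. Because StringBuffer's append is synchronized, no append is lost and the final length is deterministic; the class name, method name, and thread counts here are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class SharedBufferSketch {

    // Starts the given number of threads, each appending one character
    // appendsPerThread times to the same StringBuffer, and returns the final length.
    static int appendFromThreads(int threads, int appendsPerThread) {
        StringBuffer shared = new StringBuffer();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    shared.append('x');
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return shared.length();
    }

    public static void main(String[] args) {
        // Every append is synchronized, so the length is threads * appendsPerThread.
        System.out.println(appendFromThreads(4, 1_000));
    }
}
```

Swapping in a StringBuilder here would not be safe, since its methods are not synchronized and concurrent appends can corrupt its state. When only one thread touches the builder, StringBuilder remains the cheaper choice.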
