I see many beginners online who are unclear about the String class. They ask a lot of questions, and the answers they get are often just as unclear. Even some developers who have used String for years do not fully understand it. So I took the trouble to analyze the Java String class as thoroughly as I could, based on JDK 8.

String and primitive data types

Strings are used everywhere in Java development and can be declared directly as literals. Note, however, that String is not a primitive data type; it is a reference type. A String object lives in heap space, while a local variable that refers to it lives on the stack.

When you assign a value to a reference type variable in Java, the variable (the reference) holds the memory address of an object. When the following line runs, str on the stack points to a block of memory in the heap.

String str = "Hello";

Immutable String

Before we talk about immutable Strings, let’s look at immutable objects in Java.

  • Immutable object: An object is immutable if its state cannot be changed after it is created. So what is an object's state?
  • Object state: The member variables defined in a class are called properties (fields), and the specific values those fields hold in an object at runtime are that object's state.

If you look at the source code of the String class (JDK 8), you will see that its character content is held in a field declared as private final char[] value. A variable marked final can be assigned only once, so value always points to the same array and can never be pointed at another one. final alone does not stop the array's elements from being modified, but String keeps the array private and exposes no method that writes to it. So a String object cannot be changed once it is created (reflection is not considered here): its state is fixed, which makes it immutable. Consider the memory layout for this line:

String str = "Hello";

Using this line as an example, picture the rough layout at runtime: pointer 1 goes from str on the stack to the String object in the heap, and pointer 2 goes from that object's value field to the char array that holds the characters. Because value is final it can be initialized only once, so pointer 2 is fixed. Now reassign str:

str = "World";

What actually changes is the memory that str points to: pointer 1 is detached from the "Hello" object and now points to a different object, "World". Pointer 2 inside the original object is untouched.
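The reassignment described above can be checked with a second reference to the same object. This is a minimal sketch; the class name ReassignDemo and the variable alias are made up for illustration.

```java
public class ReassignDemo {
    // Returns what the alias sees after the other variable is reassigned.
    static String aliasAfterReassign() {
        String str = "Hello";
        String alias = str;  // alias and str now refer to the same String object
        str = "World";       // only the reference held by str changes
        return alias;        // the "Hello" object itself was never modified
    }

    public static void main(String[] args) {
        System.out.println(aliasAfterReassign()); // Hello
    }
}
```

If reassigning str had modified the object, alias would print World as well. It does not, because the assignment only moved the reference.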

String has methods such as replace(), substring(), and toLowerCase(). Look at the following code:

String str = "Hello";
str.toLowerCase();
str.substring(0, 1);
str.replace("H", "h");

The point is simple. These methods look as if they modify the string, but if you read the source code you will see that each one builds and returns a new String; none of them changes the content of the object that str refers to. Any String operation that produces different content produces a new object and leaves the original unchanged. Since the code above discards the return values, str is still "Hello" afterwards.
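To see the effect, capture the return values instead of discarding them. The sketch below is illustrative only; StringMethodsDemo and results() are hypothetical names.

```java
import java.util.Arrays;

public class StringMethodsDemo {
    // Returns the original string followed by the result of each call.
    static String[] results() {
        String str = "Hello";
        String lower = str.toLowerCase();        // new String "hello"
        String first = str.substring(0, 1);      // new String "H"
        String replaced = str.replace("H", "h"); // new String "hello"
        return new String[] {str, lower, first, replaced};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(results())); // [Hello, hello, H, hello]
    }
}
```

The first element is still Hello: every call handed back a different object and left str alone.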

Now let’s look at an exercise

public static void main(String[] args) {
    String str = "Hello World";
    char[] arr = {'H', 'e', 'l', 'l', 'o'};
    change(str);
    changeArr(arr);
    System.out.println(str); // Hello World
    System.out.println(arr); // Wello
}

private static void change(String str) {
    str = "World";
}

private static void changeArr(char[] arr) {
    arr[0] = 'W';
    arr = new char[5];
}

Try this problem yourself; I’m sure most beginners will get it wrong. Why doesn’t the change() method change str? Why does changeArr() change the contents of the array arr?

Before we get to the bottom of that, here is the general idea. Java programs run on the JVM, which has two important areas of memory: the virtual machine stack and the heap. Every method call pushes a stack frame onto the virtual machine stack, and every method return pops that frame off again. The code above has three methods: main, change, and changeArr. Their frames are pushed and popped as they execute.

Now let’s look at the memory layout frame by frame.

The program starts by executing main, whose frame is pushed onto the stack; in it, the String reference str and the array reference arr point to objects in the heap. When change is called, its stack frame is pushed on top. A stack is last in, first out, so the main frame sits at the bottom and the change frame at the top. The change method reassigns its parameter str. Note that this parameter is a copy, made at call time, of the str reference in main. Reference type assignment in Java works the same way.

Consider what happens when you assign one variable to another:

String str2 = str1;

It means that str2 points to the same heap address that str1 points to. Going back to the change method: on entry, the str in change's stack frame points to the same address as the str in main (it is a copy of the reference held in main's frame, which is the part that is easy to miss). The assignment inside change then repoints only that copy.

When change returns, its frame is popped off the stack and every variable (reference) in that frame is destroyed. So the str printed in main is of course unchanged. If you moved the System.out.println(str) statement into change, it would print "World", the object that str in the change frame points to, while the original "Hello World" stays as it was; "World" is simply a different object in memory. So this is also a question of variable scope.

What makes this misleading is that the parameter of change has the same name as the variable in main. Rename the parameter and it becomes clearer.

Now let’s see why the array's contents do change. When changeArr runs, its frame is at the top of the stack and the main frame is at the bottom. We pass arr to changeArr, which gets a copy of the reference in its own stack frame. Through that copy we modify the first element of the array. Since the copy points to the original array, the original content has of course changed. Array contents are mutable. This time you did not just point your key at a different warehouse; you used the key to open the warehouse and swapped the melon inside for a cantaloupe. Whoever opens that warehouse next will of course find a cantaloupe.

Then comes this line:

arr = new char[5];

This is the same situation as str above: it repoints the reference in changeArr's own stack frame to a new block of memory and has no effect on the original arr in main. When changeArr finishes, its frame is popped and the references in it are destroyed.
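Following the earlier suggestion to rename the parameters, here is a hedged sketch of the same exercise with distinct names. PassByValueDemo, reassign, mutateThenReassign, and param are all hypothetical names chosen for illustration.

```java
public class PassByValueDemo {
    // Repoints only the local copy of the reference; the caller's variable is unaffected.
    static void reassign(String param) {
        param = "World";
    }

    // The first statement writes through the copied reference into the caller's array;
    // the second repoints only the local copy.
    static void mutateThenReassign(char[] param) {
        param[0] = 'W';
        param = new char[5];
    }

    static String afterReassign() {
        String str = "Hello World";
        reassign(str);
        return str;
    }

    static String afterMutate() {
        char[] arr = {'H', 'e', 'l', 'l', 'o'};
        mutateThenReassign(arr);
        return new String(arr);
    }

    public static void main(String[] args) {
        System.out.println(afterReassign()); // Hello World
        System.out.println(afterMutate());   // Wello
    }
}
```

With different names it is easier to see that param is a separate variable that merely starts out holding the same address as the caller's variable.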

The benefits of String immutability

The JDK designers clearly made String immutable on purpose. What are the benefits?

  • An object whose state cannot change has no thread safety problems. It needs no locking to be shared safely between threads, which helps performance.
  • Immutability is what makes the string constant pool possible: identical literals can share one object in the pool because no one can modify it.

String comparison

As a reference type, String can be compared with both “==” and the equals method. For reference types, “==” checks whether two references hold the same address, while String's equals compares the character content. The String source code makes this clear: the class overrides equals to return true when the two strings have the same length and every character matches.

This also comes up as an interview question: what is the difference between “==” and equals? Many people answer that for reference types “==” compares addresses and equals compares values. That is not accurate. equals compares values for String only because the String class overrides it. If you define your own class and call its equals method, you’ll find it does not behave the way you expected.

public class Test {
    private String name;
    public static void main(String[] args) {
        Test t1 = new Test("ceshi");
        Test t2 = new Test("ceshi");
        System.out.println(t1.equals(t2));//false
    }
    public Test(String name){
        this.name = name;
    }
}

The result in this example is false. Test does not override equals, so the call falls back to the equals method inherited from Object:

public boolean equals(Object obj) {
        return (this == obj);
    }

In the root class Object, equals and “==” are equivalent, so what equals compares depends on how a subclass overrides it. If a class does not override it, the two behave the same by default.
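For contrast, here is a minimal sketch of a class that does override equals so that it compares by value. TestWithEquals is a hypothetical variant of the Test class above; hashCode is overridden alongside equals, as the Object contract requires for equal objects.

```java
import java.util.Objects;

public class TestWithEquals {
    private final String name;

    public TestWithEquals(String name) {
        this.name = name;
    }

    // Two instances are equal when their names are equal.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof TestWithEquals)) {
            return false;
        }
        TestWithEquals other = (TestWithEquals) obj;
        return Objects.equals(name, other.name);
    }

    // Equal objects must produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(name);
    }

    public static void main(String[] args) {
        TestWithEquals t1 = new TestWithEquals("ceshi");
        TestWithEquals t2 = new TestWithEquals("ceshi");
        System.out.println(t1.equals(t2)); // true
        System.out.println(t1 == t2);      // false
    }
}
```

Now equals reports true for two distinct objects with the same name, while “==” still reports false because the addresses differ.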

The String constant pool

The string constant pool, also known as the string pool, touches on a lot of JVM topics such as memory areas and class loading. For our purposes it is enough to know that the JVM sets aside a block of memory for holding String objects. If the same string is used again later, no new object has to be created; the one in the pool is reused.

As mentioned earlier, a String can be declared directly as a literal. Since String is a reference type, you can of course also create one with the new keyword.

  • When you declare a String as a literal, the JVM first checks whether an equal string already exists in the string constant pool. If it does, the reference points to it directly; if not, the string is placed in the pool first.
  • When you create a String with the new keyword and a literal argument, the literal is resolved against the pool in the same way, being added only if it is not there yet. In addition, new always creates a separate object in heap space, and the reference points to that heap object, not to the pooled one.
  • Because new always creates an extra object in the heap, it is generally not recommended in development; declare strings as literals instead.

String str1 = "Hello";
String str2 = new String("World"); // creates two objects: one in the heap and one in the constant pool
String str3 = new String("Hello"); // creates one object in the heap; the pool already has "Hello", so it is not created again

Let’s look at an example

public static void main(String[] args) {
    String str1 = "Hello";
    String str2 = "Hello";
    System.out.println(str1 == str2); // true
    String str3 = new String("Hello");
    System.out.println(str1 == str3); // false
    String str4 = new String("Hello");
    System.out.println(str3 == str4); // false
}

The result of the above code is straightforward, so I won’t go into detail. Some interviewers will ask you how many objects this line of code creates. Now you know what to say.

String str = new String("Hello");

If the "Hello" literal has already been declared, this line creates only one object, in the heap. Otherwise it creates two objects: one in the heap and one in the constant pool.

If you have read this far and feel invincible, how many objects does each of these two lines create? Try to work it out.

String str = "Hello" + "World";
---------------------------------------------------------
String str = "Hello" + new String("World");

“Addition” of Strings

If you have learned C++, you know that it lets developers overload operators. Java does not. The language provides only two special overloads, “+” and “+=”, for the String class. “+” concatenates two strings into a new object, which occupies new space in memory. Have you ever wondered why strings can use the + operator at all? After all, String is neither one of Java’s eight primitive types nor one of their wrapper types; it is a reference type, so its support for “+” has to be built into the language itself.

Look at the following example

public static void main(String[] args) {
    String str1 = "Hello";
    String str2 = "World";
    String str3 = "Hello" + "World";
    String str4 = "HelloWorld";
    System.out.println(str3 == str4); // true
    String str5 = str1 + "World";
    System.out.println(str5 == str4); // false
    String str6 = str1 + str2;
    System.out.println(str6 == str4); // false
}

It’s worth thinking about why str3 == str4 is true. When “+” joins two string constants, the result is determined by the compiler. Both operands are constants, so the compiler concatenates them and puts the combined string "HelloWorld" into the constant pool. The operands themselves are not added to the pool on account of this expression.

Why are str5 == str4 and str6 == str4 false? In str1 + "World" and str1 + str2, at least one operand is a variable, not a constant, so the result cannot be determined at compile time, only at run time. It therefore does not go into the constant pool the way the previous case did. You might now be wondering how to put a string into the constant pool at run time. String gives us a method for that: intern(). It is a native method that looks up the calling string in the constant pool. If an equal string is already there, it returns a reference to the pooled one; if not, it adds the string to the pool and returns a reference to it.

public static void main(String[] args) {
    String str1 = "hello";
    String str2 = "helloworld";
    String str3 = str1 + "world";
    System.out.println(str3 == str2); // false
    str3 = str3.intern(); // look up str3 in the constant pool and assign the pooled reference back to str3
    System.out.println(str3 == str2); // true
}

The code above is a good example of what the intern() method does.
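The difference between compile-time and run-time concatenation can be put side by side in one sketch. One addition that the text above does not cover, labeled here as such: a local variable declared final with a constant initializer counts as a compile-time constant under the Java Language Specification, so concatenating it is folded just like two literals. ConcatPoolDemo and its method names are hypothetical.

```java
public class ConcatPoolDemo {
    // final + constant initializer: hello is a compile-time constant,
    // so the compiler folds hello + "World" into the literal "HelloWorld".
    static boolean foldedFromFinal() {
        final String hello = "Hello";
        String joined = hello + "World";
        return joined == "HelloWorld";
    }

    // hello is an ordinary variable, so the concatenation happens at run time
    // and yields a new object that is not the pooled literal.
    static boolean builtAtRuntime() {
        String hello = "Hello";
        String joined = hello + "World";
        return joined == "HelloWorld";
    }

    // intern() returns the pooled instance, which is the same object as the literal.
    static boolean internedAtRuntime() {
        String hello = "Hello";
        String joined = (hello + "World").intern();
        return joined == "HelloWorld";
    }

    public static void main(String[] args) {
        System.out.println(foldedFromFinal());   // true
        System.out.println(builtAtRuntime());    // false
        System.out.println(internedAtRuntime()); // true
    }
}
```

The “==” comparisons are deliberate here: they test object identity, which is exactly what the constant pool affects. In ordinary code, compare string content with equals.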

String, StringBuilder, StringBuffer

Thinking about String naturally brings two closely related classes to mind. Because strings are immutable, every modification creates a new object, which is unfriendly in some scenarios: frequent object creation can lead to excessive memory use. So the JDK provides a class called StringBuilder. It can append to, delete from, and modify its character sequence in place, without creating a new object for each operation. This makes up for the awkwardness of String in those scenarios. Internally it is very similar to String. Remember that String maintains a final char[] value? StringBuilder also maintains a char[] value, but it is not final, so the content can be modified and the array replaced by a larger one when it fills up.

StringBuffer is essentially the same as StringBuilder, except that its methods for adding, deleting, and modifying are marked with the synchronized keyword, making it thread-safe. The synchronization has a cost, so its performance is lower than StringBuilder's.
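A short sketch of in-place editing with StringBuilder follows. BuilderMutationDemo and its methods are hypothetical names; the calls used (append, insert, delete, setCharAt) are standard StringBuilder methods.

```java
public class BuilderMutationDemo {
    // Applies several edits to one builder and returns the final content.
    static String edit() {
        StringBuilder sb = new StringBuilder("Hello");
        sb.append(" World");   // Hello World
        sb.insert(0, "Oh, ");  // Oh, Hello World
        sb.delete(0, 4);       // Hello World
        sb.setCharAt(0, 'J');  // Jello World
        return sb.toString();
    }

    // append returns the builder itself, not a new object.
    static boolean appendReturnsSameObject() {
        StringBuilder sb = new StringBuilder("Hello");
        StringBuilder returned = sb.append("!");
        return returned == sb;
    }

    public static void main(String[] args) {
        System.out.println(edit());                    // Jello World
        System.out.println(appendReturnsSameObject()); // true
    }
}
```

Every edit operates on the same builder object. Swapping StringBuilder for StringBuffer in this sketch would give the same output, with synchronized methods.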

You may have rarely or never used these two classes directly, but StringBuilder is at work behind every String “+”. When your program concatenates strings with “+” (and the operands are not all compile-time constants), the JDK 8 compiler generates code that creates a new StringBuilder and calls append to join the strings.

For example:

public static void main(String[] args) {
        String str1 = "hello";
        String str2 = "hello";
        String str3 = str1 + str2;
    }

For the line String str3 = str1 + str2, the compiled code actually creates a new StringBuilder, calls its append method to join str1 and str2, and then produces a new String. You can verify this by disassembling the class file with the javap -v command. So be careful when concatenating strings with + in your code. If you use + inside a loop like this:

public static void main(String[] args) {
    String str = "hello";
    for (int i = 0; i < 1000; i++) {
        str = str + i;
    }
}

That is a real problem: it creates 1000 StringBuilder objects behind the scenes (plus a new String on every iteration), wasting heap memory. It works better to create the StringBuilder ahead of time, outside the loop, and call append inside it.

public static void main(String[] args) {
    String str = "hello";
    StringBuilder sb = new StringBuilder(str);
    for (int i = 0; i < 1000; i++) {
        sb.append(i);
    }
}

Writing it this way creates just one StringBuilder object.
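To make the compiler's rewrite concrete, the sketch below writes out by hand roughly what JDK 8 javac generates for a + b, and checks that the two loop styles build the same string. This is an approximation for illustration, not the exact bytecode; compilers from JDK 9 onward use a different mechanism (invokedynamic) for “+”. ConcatDesugarDemo and its method names are hypothetical.

```java
public class ConcatDesugarDemo {
    // Concatenation as written in source code.
    static String withPlus(String a, String b) {
        return a + b;
    }

    // Roughly what JDK 8 javac emits for a + b.
    static String byHand(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    // One hidden StringBuilder and one new String per iteration.
    static String loopWithPlus(int n) {
        String str = "hello";
        for (int i = 0; i < n; i++) {
            str = str + i;
        }
        return str;
    }

    // A single StringBuilder reused for every iteration.
    static String loopWithBuilder(int n) {
        StringBuilder sb = new StringBuilder("hello");
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("hello", "hello").equals(byHand("hello", "hello"))); // true
        System.out.println(loopWithPlus(3));                                    // hello012
        System.out.println(loopWithPlus(1000).equals(loopWithBuilder(1000)));   // true
    }
}
```

Both loops produce identical content; they differ only in how many temporary objects are allocated along the way.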

That covers the Java String class. If this article helped you, feel free to give it a thumbs up and follow the author; your support is what motivates me to keep writing!