[Lens 1] String object creation


(1) String s = new String("Hello world"); The question is: what is the argument "Hello world"? Is it a String object too? Are we creating a String object from a String object?

(2) String s = "Hello world"; This one looks odd as well: it reads just like assigning a primitive value, as in int i = 1;

We all know that to run a Java program, the compiler first compiles the source files into bytecode files (a.k.a. .class files), which the JVM then interprets and executes. A class file is a binary stream organized in 8-bit bytes, made up of compact items that each have a specific meaning. For example, the first four bytes of the stream are called the magic number (0xCAFEBABE), which distinguishes a class file from a non-class file. The rough structure of the class byte stream is shown on the left.
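The magic number is easy to check for yourself. Below is a minimal sketch, assuming the class file of a JDK class such as java.lang.String can be read as a resource next to the class; the class name ClassMagic and the helper readMagic are made up for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassMagic {
    // Hypothetical helper: returns the first four bytes of the class file of a top-level class.
    static int readMagic(Class<?> type) {
        // The name is resolved relative to the class's package, e.g. java/lang/String.class.
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            // DataInputStream reads big-endian integers, which matches the class file format.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the magic number of String.class in hexadecimal.
        System.out.println(Integer.toHexString(readMagic(String.class)));
    }
}
```

The sketch only looks at the magic number; the version numbers and the constant pool follow right after it in the stream.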

One of the most important items in the class file is the constant pool. It holds the symbolic information from the source code, and different kinds of symbolic information go into constant tables with different tags. As shown above, on the right is the constant pool of the HelloWorld code (listed below), which contains four different kinds of constant tables (four different kinds of constant pool entries).

public class HelloWorld {
    void hello() {
        System.out.println("Hello world");
    }
}

As you can see in the figure above, the “Hello world” string literal in the code is compiled and clearly stored in the string constants table in the class constant pool (the red box on the right).

★ The JVM runs the class file

After the source code is compiled into a class file, the JVM runs it. It first loads the class file with a class loader. It then needs to create a number of in-memory data structures to hold the data from the class file: for example, the class information, the constant pool structure, the bytecode instruction sequences of the methods, the descriptors of the class's methods and fields, and so on. At run time it also needs to create stack frames for method calls, among other things. With so many memory structures to manage, the JVM organizes them into several “runtime data areas.” These are the commonly mentioned “method area,” “heap,” “Java stack,” and so on.

As mentioned above, every string literal in Java source code becomes, at the class file stage, a constant table with tag 8 (CONSTANT_String_info). When the JVM loads the class file, it creates an in-memory data structure for the constant pool and stores it in the method area. For each string literal in a CONSTANT_String_info table, the JVM also automatically creates a String in the heap, called an interned string object. It then replaces the entry in the CONSTANT_String_info table with the direct address of that String object in the heap (constant pool resolution).

The key thing here is the interned string object. All string constants with the same literal value in the source code share a single interned string object. The JVM maintains this property through an internal data structure that records references to the interned strings. In a Java program, you can call String's intern() method to obtain the interned String object for a regular String object. We'll talk about that later.

★ Bytecode instructions

With these two pieces of background, we can tell the two ways of creating a string object apart by looking at the compiled instructions:

(1) String s = new String("Hello world"); The instructions compiled into the class file (as viewed in MyEclipse):

0  new java.lang.String [15]  // Allocate space for a String object in the heap and push its address onto the operand stack.
3  dup  // Duplicate the value on top of the operand stack and push the copy. The operand stack now holds two references to the new String.
4  ldc <String "Hello world"> [17]  // Push the address of the interned "Hello world" string from the constant pool onto the operand stack.
6  invokespecial java.lang.String(java.lang.String) [19]  // Call String's initialization method, popping the top two references off the operand stack (the argument and one copy of the new object).
9  astore_1 [s]  // Pop the top of the operand stack and store it in local variable slot 1. This is the address of the String allocated by new, now initialized.

Note:

The dup instruction copies the reference to the java.lang.String space just allocated and pushes it onto the top of the stack. The reason is that invokespecial finds the java.lang.String(java.lang.String) constructor through constant pool entry [19], but finding the constructor is not enough: it also has to know which object is being constructed. So a reference to the newly allocated object is pushed onto the stack beforehand; invokespecial consumes that copy and pops it off when the call completes. The remaining copy is then popped by astore_1 into the local variable.


In fact, the JVM has already created an interned string in the heap for "Hello world" before this code runs. (Note that if "Hello world" appears as a string constant elsewhere in the source program, all occurrences correspond to the same interned string in the heap.) The value of the interned String is then used to initialize the new String that the new instruction created in the heap. The local variable s actually stores the address of that new heap object. Notice that the JVM-managed heap now holds two String objects with the same value: one interned String object and one newly created String object. If there is another statement String s1 = new String("Hello world"); how many "Hello world" strings are there in the heap? The answer is 3. Think about why!

(2) String s = "Hello world"; After compiling to a class file:

0  ldc <String "Hello world"> [15]  // Push the address of the interned "Hello world" string from the constant pool onto the operand stack.
2  astore_1 [s]  // Pop the top of the operand stack and store it in local variable slot 1. This is the address of the interned string object in the heap.

This is very different from the instructions above: the local variable s stores the heap address of the interned string that already exists (no new object is created). So think about it: if there is also a String s1 = "Hello world"; how many "Hello world" strings are there in the heap? The answer is 1. Do the local variables s and s1 store the same address? Ha ha, you should know this one.

★ Summary: stripped bare, the String type is quite ordinary. What really makes it mysterious is the CONSTANT_String_info constant table and the interned string object. Now we can settle many long-standing disputes.

[Dispute 1] The equality of strings

// code 1
String sa = new String("Hello world");
String sb = new String("Hello world");
System.out.println(sa == sb);  // false

// code 2
String sc = "Hello world";
String sd = "Hello world";
System.out.println(sc == sd);  // true

In code 1, the local variables sa and sb store the heap addresses of two different Strings that the JVM created with new, even though the value (char[]) of both strings is "Hello world". So "==" compares two different heap addresses. The local variables sc and sd in code 2 also store addresses, but both hold the address of the single interned string object in the heap that "Hello world" in the constant pool points to. Naturally they are equal.
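Since "==" compares addresses, the content of two strings is compared with equals(). Here is a minimal sketch of the difference; the class and method names are made up for illustration.

```java
class StringEquality {
    // True only when both variables hold the address of the same heap object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // True when both strings contain the same character sequence.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String sa = new String("Hello world");
        String sb = new String("Hello world");
        System.out.println(sameReference(sa, sb)); // false: two separate heap objects
        System.out.println(sameContent(sa, sb));   // true: same characters
    }
}
```

In everyday code, equals() is almost always the comparison you want; "==" on strings is mainly useful for experiments like the ones in this article.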

[Dispute 2] The inner workings of the string "+" operation

// code 1
String sa = "ab";
String sb = "cd";
String sab = sa + sb;
String s = "abcd";
System.out.println(sab == s);  // false

// code 2
String sc = "ab" + "cd";
String sd = "abcd";
System.out.println(sc == sd);  // true

The local variables sa and sb in code 1 store the addresses of two interned string objects in the heap. When sa+sb is executed, the JVM first creates a StringBuilder on the heap, initializes it with the interned string that sa points to, and then calls append to add the interned string that sb points to. It then calls StringBuilder's toString() method to create a new String in the heap, and finally stores the heap address of that String in the local variable sab. (This describes the code generated by compilers before Java 9; newer compilers emit an invokedynamic call instead, but the result is still a newly created String.) The local variable s stores the address of the interned string object corresponding to "abcd" in the constant pool. So the addresses in sab and s are of course different. Note that code 1 actually leaves five objects in the heap: three interned String objects, one ordinary String, and one StringBuilder object. The "ab"+"cd" in code 2 is merged into the constant "abcd" at compile time, so sc and sd refer to the same literal constant "abcd", which corresponds to the same interned string object, and the addresses are naturally the same.
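Compile-time merging is not limited to literals written side by side. The sketch below (class and method names are made up for illustration) relies on the Java language rule that final variables initialized with constants count as constants themselves, so a "+" between them is also folded by the compiler.

```java
class ConstantFolding {
    // Both operands are constant variables, so a + b is a compile-time constant.
    static boolean foldedFromFinals() {
        final String a = "ab";
        final String b = "cd";
        String joined = a + b;   // compiled as the constant "abcd"
        return joined == "abcd";
    }

    // Without final, the concatenation happens at run time and yields a new String.
    static boolean joinedAtRuntime() {
        String a = "ab";
        String b = "cd";
        String joined = a + b;   // new heap object, not the pooled "abcd"
        return joined == "abcd";
    }

    public static void main(String[] args) {
        System.out.println(foldedFromFinals()); // true
        System.out.println(joinedAtRuntime());  // false
    }
}
```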

[Lens 2] The three String sisters (String, StringBuffer, StringBuilder)

We have dug into String thoroughly enough. But it has two sisters, StringBuffer and StringBuilder, that are worth a look too. Let's go after them as well:

String: an immutable character sequence
StringBuffer: a thread-safe mutable character sequence
StringBuilder: a non-thread-safe mutable character sequence

★ The mutability of StringBuffer and String

Let's take a look at some of the source code for these two classes:

// String
public final class String {
    private final char value[];

    public String(String original) {
        // Split original into characters and assign them to value[]
    }
}

// StringBuffer
public final class StringBuffer extends AbstractStringBuilder {
    char value[]; // this value[] belongs to the parent class AbstractStringBuilder

    public StringBuffer(String str) {
        super(str.length() + 16); // call the parent constructor, which creates a value[] array of size str.length() + 16
        append(str);              // split str into characters and add them to value[]
    }
}

Obviously, the value[] in both String and StringBuffer is used to store the character sequence. However:

(1) The value[] in String is a final array that is assigned only once. For example, new String("abc") makes value[]={'a','b','c'} (see the JDK String implementation), after which the value[] in that String is never changed. That's why we say String is immutable.

Note: beginners often get this wrong. Some people ask: with String str1 = new String("abc"); str1 = new String("cba"); didn't I change the string str1? You need to understand the difference between an object reference and the object itself. The object itself is the instance data (its non-static fields) stored in heap space. An object reference is the address of the object in the heap; the method area and the Java stack generally store object references, not the data of the objects themselves.
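The reference-versus-object distinction can be made visible with a second variable that keeps pointing at the original object. This is only an illustrative sketch; the class name, method name, and variable alias are made up.

```java
class ReferenceVsObject {
    // Returns { value seen through str1, value seen through alias } after reassigning str1.
    static String[] reassign() {
        String str1 = new String("abc");
        String alias = str1;        // a second reference to the same heap object
        str1 = new String("cba");   // str1 now holds the address of a different object
        return new String[] { str1, alias };
    }

    public static void main(String[] args) {
        String[] result = reassign();
        System.out.println(result[0]); // cba: the variable points somewhere else now
        System.out.println(result[1]); // abc: the original object was never modified
    }
}
```

Only the address stored in str1 changed; the String holding "abc" is still intact and still reachable through alias.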

(2) The value[] in a StringBuffer is a plain array, and a new string can be added to the end of the value[] using the append() method. This changes the content and size of value[].

For example, new StringBuffer("abc") makes value[]={'a','b','c','\0','\0',…} (note that the array is constructed with length str.length()+16). If we then call append("abc") on this object, its value[]={'a','b','c','a','b','c','\0',…}. That's why StringBuffer is a mutable string. You can also see from this that the value[] in StringBuffer serves as a buffer for the string. Its performance when accumulating strings is quite good, and we will compare it later. In short, when we say String is immutable and StringBuffer is mutable, we mean whether the value[] character array inside the object can be changed.
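The buffer size and the in-place append can both be observed through StringBuffer's public API. A minimal sketch, with made-up class and method names:

```java
class BufferGrowth {
    // Capacity of a buffer created from a string: its length plus 16.
    static int initialCapacity(String initial) {
        return new StringBuffer(initial).capacity();
    }

    // Appends to the given buffer and returns its new content.
    static String appendInPlace(StringBuffer buffer, String suffix) {
        StringBuffer returned = buffer.append(suffix);
        if (returned != buffer) {
            // append() is documented to return the buffer it was called on.
            throw new IllegalStateException("append returned a different object");
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(initialCapacity("abc"));                         // 19
        System.out.println(appendInPlace(new StringBuffer("abc"), "abc"));  // abcabc
    }
}
```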

StringBuffer and StringBuilder are like twins: there is not much difference between their methods. In terms of thread safety, however, StringBuffer allows multiple threads to operate on the same character sequence safely. This is because many StringBuffer methods in the source code are marked with the synchronized keyword, whereas StringBuilder's are not.

Programmers with multithreading experience will know synchronized. The keyword exists for thread synchronization. When thread A calls a synchronized method M on object O, it must first acquire O's lock before it can execute M; otherwise thread A blocks. Once thread A starts executing M, it holds the lock on O exclusively, so other threads that need to call M on O block. Only when thread A finishes and releases the lock do the blocked threads get a chance to call M. This is the locking mechanism that solves the thread synchronization problem.

Now that you know what synchronized means, you can probably guess the rest: StringBuffer is much safer than StringBuilder in multithreaded programming, and indeed it is. If multiple threads need to operate on the same string buffer, StringBuffer should be the best choice.

Note: is String unsafe too? In fact, the problem does not arise. Strings are immutable: threads can only read a given String object in the heap, never modify it. So what is there to be unsafe about?
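The effect of the synchronized methods can be sketched with several threads appending to one shared StringBuffer. The class name, method name, and the thread and append counts below are made up for illustration.

```java
class ConcurrentAppend {
    // Starts `threads` threads that each append `appendsPerThread` characters
    // to one shared StringBuffer, then returns the final length.
    static int appendFromThreads(int threads, int appendsPerThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    shared.append('x'); // synchronized: one thread at a time
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait until every thread has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        // No append is lost, so the length is threads * appendsPerThread.
        System.out.println(appendFromThreads(4, 1000)); // 4000
    }
}
```

If the shared object were a StringBuilder instead, nothing would guarantee this result: appends could be lost or an exception could be thrown inside a worker thread.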

★ The efficiency of String and StringBuffer (this is a hot topic!)

First, StringBuffer and StringBuilder are twins. StringBuilder was introduced in JDK 1.5, and StringBuffer is its predecessor. StringBuilder is slightly more efficient than StringBuffer and should be the first choice if thread safety is not a concern. In addition, a JVM running a program spends much of its time creating and reclaiming objects.

We use the following code to compare the running time of String and StringBuffer by performing 10,000 string concatenations.

public class RunTime {
    public static void main(String[] args) {
        // ● test code position 1 (declarations before the loop)
        long beginTime = System.currentTimeMillis();
        for (int i = 0; i < 10000; i++) {
            // ● test code position 2
        }
        long endTime = System.currentTimeMillis();
        System.out.println(endTime - beginTime);
    }
}

(1) "+" concatenation of String constants versus "+" concatenation of String variables.

Test 1: before the loop (code position 1), String str = ""; inside the loop (code position 2), str = "Heart" + "Raid";

Test 2: before the loop (code position 1), String s1 = "Heart"; String s2 = "Raid"; String str = ""; inside the loop (code position 2), str = s1 + s2;

Conclusion: "+" concatenation of String constants is somewhat faster than "+" concatenation of String variables.

Reason: in test 1, "Heart"+"Raid" was already concatenated at compile time into the string constant "HeartRaid", which points to an interned string object in the heap. At run time the loop only needs to fetch the address of that interned string object 10,000 times and store it in the local variable str, which really doesn't take much time. In test 2, the local variables s1 and s2 hold the addresses of two different interned string objects, and each concatenation takes three steps: 1. StringBuilder temp = new StringBuilder(s1); 2. temp.append(s2); 3. str = temp.toString(); Although append() is used in the middle, a StringBuilder is created at the start and a String at the end of every iteration. Just imagine: calling these methods 10,000 times and creating these objects 10,000 times. That does not come cheap.
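The three steps can be written out by hand and compared with the "+" operator. This is a sketch of the expansion described above, not the exact code a particular compiler generates; the class and method names are made up.

```java
class ConcatExpansion {
    // What the programmer writes.
    static String plus(String s1, String s2) {
        return s1 + s2;
    }

    // The three-step expansion described in the text.
    static String expanded(String s1, String s2) {
        StringBuilder temp = new StringBuilder(s1); // step 1: create the builder
        temp.append(s2);                            // step 2: append the second string
        return temp.toString();                     // step 3: create the resulting String
    }

    public static void main(String[] args) {
        String viaPlus = plus("Heart", "Raid");
        String viaBuilder = expanded("Heart", "Raid");
        System.out.println(viaPlus.equals(viaBuilder)); // true: same characters
        System.out.println(viaPlus == viaBuilder);      // false: two separate new objects
    }
}
```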

However, "+" concatenation of String variables is used far more widely than "+" concatenation of String constants. That goes without saying.

(2) Cumulative "+" concatenation of String versus append() on a StringBuffer.

Test 1: before the loop (code position 1), String s1 = "Heart"; String s = ""; inside the loop (code position 2), s = s + s1;

Test 2: before the loop (code position 1), String s1 = "Heart"; StringBuffer sb = new StringBuffer(); inside the loop (code position 2), sb.append(s1);

Conclusion: when accumulating a large number of strings, StringBuffer's append() is far more efficient than cumulative "+" concatenation of String objects.

Reason: for s = s + s1, the JVM creates a StringBuilder and uses append to merge the values of the string objects that s and s1 point to. It then calls StringBuilder's toString() method to create a new String object in the heap holding the merged value, and the local variable s is pointed at that newly created String.

Because the value[] in a String cannot be changed, a new String must be created to hold the value after each merge. Looping 10,000 times therefore creates 10,000 String objects and 10,000 StringBuilder objects, which is understandably inefficient.

sb.append(s1); on the other hand, only has to grow its own value[] array to make room for s1. No new String or StringBuilder objects are created in the heap during the loop. No wonder it is efficient.
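For readers who want to try the comparison, here is a self-contained variant of the test. It is a rough sketch rather than a rigorous benchmark: the class and method names are made up, System.nanoTime() is used instead of currentTimeMillis(), and the measured times depend heavily on the JVM and on warm-up.

```java
class ConcatTiming {
    // Cumulative "+" concatenation: a new String is produced on every iteration.
    static String withPlus(String piece, int times) {
        String s = "";
        for (int i = 0; i < times; i++) {
            s = s + piece;
        }
        return s;
    }

    // append() on a single StringBuffer: the same object is extended in place.
    static String withBuffer(String piece, int times) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < times; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        String a = withPlus("Heart", 10000);
        long t1 = System.nanoTime();
        String b = withBuffer("Heart", 10000);
        long t2 = System.nanoTime();

        System.out.println("String +     : " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("StringBuffer : " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("same result  : " + a.equals(b));
    }
}
```

Both methods build the same 50,000-character string; only the number of intermediate objects differs.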

★ Summary of this lens:

(1) For string constants whose value can be determined at compile time, there is no need to create String or StringBuffer objects. Using "+" concatenation directly on string constants is the most efficient.

(2) StringBuffer's append is more efficient than cumulative "+" concatenation of String objects.

(3) Constantly creating objects is a major cause of program inefficiency. So can a given string value be represented by just one String object in the heap? Interned strings obviously achieve this: besides the string constants in a program, for which the JVM creates interned strings automatically, you can also call String's intern() method. When you call intern(), if the pool already contains a string with the same value, the address of that interned object is returned. If not, the string value is added to the pool as a new interned String object, and its address is returned.
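The behavior of intern() described in point (3) can be checked directly. A minimal sketch, with made-up class and method names:

```java
class InternDemo {
    // A String created with new is a separate object from the pooled literal.
    static boolean newStringIsLiteral() {
        return new String("Hello world") == "Hello world";
    }

    // intern() returns the pooled object, which is the same one the literal refers to.
    static boolean internedIsLiteral() {
        String created = new String("Hello world");
        return created.intern() == "Hello world";
    }

    // Two strings built at run time map to a single pooled object once interned.
    static boolean runtimeStringsShareOneObject(String left, String right) {
        String first = left + right;
        String second = left + right;
        return first != second && first.intern() == second.intern();
    }

    public static void main(String[] args) {
        System.out.println(newStringIsLiteral());                         // false
        System.out.println(internedIsLiteral());                          // true
        System.out.println(runtimeStringsShareOneObject("Heart", "Raid")); // true
    }
}
```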