Code run environment: JDK 8

Consider a few questions first:

  1. How are strings implemented in different JDK versions?
  2. What does the immutability of a String mean?
  3. What is the output of the following code?
String s1 = new String("aaa") +new String("");
s1.intern();
String s2 = "aaa";
System.out.println(s1==s2);

String s3 = new String("bbb") +new String("");
s3 = s3.intern();
String s4 = "bbb";
System.out.println(s3==s4);

String s5 = new String("hi") + new String("j");
s5.intern();
String s6 = "hij";
System.out.println(s5 == s6); 

The implementation of a String object

Java6 and earlier versions

A String encapsulates a char array and consists of four member variables: char array, offset, count, and hash.

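The offset and count fields locate the portion of the char array that holds the string's characters. This lets several strings, for example a string and its substrings, share one array object quickly and efficiently while saving memory.

The disadvantage is that memory leaks may result: a short substring keeps the entire original char array reachable, even when the original string is no longer needed.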

Java7 and Java8 versions

By removing offset and count, the String takes up less space.

The String.substring method no longer shares the char[]; it copies the requested range into a new array, which resolves the potential memory leak.

Java9 version

char[] is replaced with byte[], and a new field, coder, is added as an encoding identifier.

In Java programs, strings are typically among the largest consumers of heap memory, and most strings contain only Latin-1 characters, which need only one byte each. A char takes up two bytes, so storing such strings in a char[] wastes half of the space. The Java 9 String class uses a byte array to hold the characters in order to save memory.

The coder field is needed whenever the length of the string is calculated or a method such as indexOf() is called, because it determines how many bytes each character occupies. coder takes one of two values: 0 for Latin-1 (single-byte encoding) and 1 for UTF-16. It is 0 if the String contains only Latin-1 characters, and 1 otherwise.
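
The decision between the two coder values can be sketched as follows. This is only an illustration that runs on JDK 8, not the JDK's actual implementation: the class CoderSketch, its constants, and the methods coderFor and backingBytes are hypothetical names invented for this example. It assumes a string is Latin-1 compatible when every char is at most 0xFF.

```java
class CoderSketch {
    // Hypothetical constants mirroring the two coder values described above.
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Returns LATIN1 if every char fits in one byte, otherwise UTF16.
    static byte coderFor(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Number of bytes a backing array would need under the chosen coder.
    static int backingBytes(String s) {
        return coderFor(s) == LATIN1 ? s.length() : s.length() * 2;
    }

    public static void main(String[] args) {
        String latin = "hello";
        String chinese = "\u4f60\u597d"; // two non-Latin-1 characters
        System.out.println(coderFor(latin));       // 0
        System.out.println(coderFor(chinese));     // 1
        System.out.println(backingBytes(latin));   // 5
        System.out.println(backingBytes(chinese)); // 4
    }
}
```

With a char[] the five-character Latin-1 string would need 10 bytes of character data, so the one-byte layout halves it.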

Immutability of String objects

Take a look at the following code:

String str = "Hello";
str = "World";

Well, str has changed. What about the immutability of strings?

Two things are involved here: the object itself and the object reference. The object itself occupies a region of memory. An object reference is a value that points to that memory address. In these two lines, str is just an object reference.

When the first line of code executes, it creates a string object “Hello”, and str points to the address of the “Hello” object. The second line creates a string object “World”, and str is repointed to the address of the “World” object. Neither the “Hello” object nor the “World” object changes; only the value of the reference str does.
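
A small sketch makes the distinction visible by keeping a second reference to the first object. The class ReferenceDemo and the method valueSeenByAlias are names made up for this illustration.

```java
class ReferenceDemo {
    // Returns the value still seen through a second reference after the
    // first variable has been pointed at a different string.
    static String valueSeenByAlias() {
        String str = "Hello";
        String alias = str; // second reference to the same "Hello" object
        str = "World";      // only the reference str changes
        return alias;       // the "Hello" object itself is untouched
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByAlias()); // Hello
    }
}
```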

Let’s talk about the code implementation of the String class

The String class is declared with the final keyword, so it cannot be subclassed. Because a final class has no subclasses, none of its methods can be overridden, which makes them effectively final.

The char[] field of String is declared private and final. A final field can be assigned only once, so the field always refers to the same array. Because the array is also private and String exposes no method that modifies it, its contents cannot change either. This means that once a String has been created, it cannot be changed. This is what the immutability of strings means.
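
It is worth seeing that final on its own only fixes the reference, not the array contents. The sketch below uses a hypothetical class, FinalFieldDemo, that deliberately exposes a mutator to show the difference; String offers no such method, and that is what keeps its contents fixed.

```java
class FinalFieldDemo {
    // A final field fixes the reference, not the contents of the array.
    private final char[] value = {'a', 'b', 'c'};

    // Hypothetical mutator, present only to show that the array is still
    // writable. String has no equivalent method.
    void overwriteFirst(char c) {
        value[0] = c; // compiles: final does not protect array elements
    }

    String contents() {
        return new String(value);
    }

    public static void main(String[] args) {
        FinalFieldDemo demo = new FinalFieldDemo();
        demo.overwriteFirst('x');
        System.out.println(demo.contents()); // xbc
    }
}
```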

Advantages

  1. It keeps String objects secure by preventing malicious modification.
  2. The hash value can be cached and never changes, so strings work reliably as keys in containers such as HashMap.
  3. It makes the string constant pool possible.
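
The second advantage can be illustrated by comparing a String key with a mutable key. This is a sketch with made-up names (KeyStabilityDemo and its two methods); it uses an ArrayList as the mutable key because its hashCode() depends on its contents.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class KeyStabilityDemo {
    // A String key's hash code can never change after insertion,
    // so the lookup always finds the entry.
    static Integer lookupWithStringKey() {
        Map<String, Integer> map = new HashMap<>();
        map.put("user-1", 42);
        return map.get("user-1");
    }

    // A mutable key that is modified after insertion hashes differently,
    // so the entry can no longer be found through it.
    static Integer lookupWithMutatedKey() {
        Map<List<String>, Integer> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("user");
        map.put(key, 42);
        key.add("1"); // mutating the key changes its hashCode()
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupWithStringKey());  // 42
        System.out.println(lookupWithMutatedKey()); // null
    }
}
```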

How to create a string object

// 1. String literal
String str1 = "Hello World";
// 2. Constructor
String str2 = new String("Hello World");

With the first approach, the JVM checks whether the string is already present in the constant pool. If it is, a reference to that object is returned; otherwise the string is created in the constant pool first. This avoids repeatedly creating string objects with the same value and saves memory.

With the second approach, the “Hello World” constant is written into the constant pool of the class file at compile time. When the class is loaded, the “Hello World” string is created in the string constant pool. When the constructor is called, it receives a reference to that pooled string and creates a new String object in heap memory. Finally, the variable refers to that heap object.
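
The difference between the two approaches shows up when references are compared with ==. The class and method names below are invented for this sketch.

```java
class CreationDemo {
    // Two identical literals resolve to the same pooled object.
    static boolean literalsShareObject() {
        String a = "Hello World";
        String b = "Hello World";
        return a == b;
    }

    // The constructor always creates a separate heap object,
    // even though its contents equal the pooled literal.
    static boolean constructorMakesNewObject() {
        String a = "Hello World";
        String b = new String("Hello World");
        return a != b && a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());       // true
        System.out.println(constructorMakesNewObject()); // true
    }
}
```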

Optimization of String objects

1. Build large strings

When strings are concatenated with +, the compiler rewrites the expression to use StringBuilder. Inside a loop, however, a new StringBuilder instance may be created on every iteration, so it is better to use a StringBuilder explicitly to concatenate strings.

In multithreaded code where the buffer is shared between threads, StringBuffer can be used instead.
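
As a minimal sketch of the two styles (JoinDemo and its methods are illustrative names), both methods below produce the same result, but the second reuses a single builder for the whole loop:

```java
class JoinDemo {
    // Concatenation with + inside the loop: each pass builds a new
    // intermediate String from the previous result.
    static String joinWithPlus(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += i;
        }
        return result;
    }

    // One explicit StringBuilder shared by all iterations.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithPlus(5));    // 01234
        System.out.println(joinWithBuilder(5)); // 01234
    }
}
```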

2. String splitting

The split() method takes a regular expression, and regular expression performance can be very erratic: improper use can cause backtracking problems, which lead to high CPU usage.

So we should be careful with the split() method. Where possible, use String.indexOf() instead of split() to split strings. If indexOf() really cannot meet the requirements, pay attention to backtracking when using split().
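
One way to split on a single-character delimiter with indexOf() is sketched below. The class IndexOfSplit and its split method are hypothetical helpers written for this example. Note one behavioral difference: this version keeps trailing empty strings, whereas String.split drops them by default.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfSplit {
    // Splits s on a single-character delimiter without using a regex.
    static List<String> split(String s, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = s.indexOf(delimiter, start)) != -1) {
            parts.add(s.substring(start, end));
            start = end + 1; // continue after the delimiter
        }
        parts.add(s.substring(start)); // remainder after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("a,b,,c", ',')); // [a, b, , c]
    }
}
```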

3. Use String.intern to save memory

The official interpretation of intern() is as follows:

/**
* Returns a canonical representation for the string object.
* <p>
* A pool of strings, initially empty, is maintained privately by the
* class {@code String}.
* <p>
* When the intern method is invoked, if the pool already contains a
* string equal to this {@code String} object as determined by
* the {@link #equals(Object)} method, then the string from the pool is
* returned. Otherwise, this {@code String} object is added to the
* pool and a reference to this {@code String} object is returned.
* <p>
* It follows that for any two strings {@code s} and {@code t},
* {@code s.intern() == t.intern()} is {@code true}
* if and only if {@code s.equals(t)} is {@code true}.
* <p>
* All literal strings and string-valued constant expressions are
* interned. String literals are defined in section 3.10.5 of the
* <cite>The Java&trade; Language Specification</cite>.
*
* @return  a string that has the same contents as this string, but is
*          guaranteed to be from a pool of unique strings.
*/
public native String intern();

When the intern method is called, it checks whether the string constant pool already holds a string equal to this object, and if so, returns the reference from the constant pool.

If not, the behavior depends on the JDK version:

  1. In JDK 1.6, the string is copied from the heap into the constant pool, and a reference to that copy is returned. The original heap string is collected by the garbage collector once nothing references it.
  2. From JDK 1.7 onward, the constant pool lives in the heap, so the string is no longer copied. Instead, a reference to the first string encountered with that value is added to the constant pool.

Now look again at the three pieces of code at the top. The first comparison can be expanded as follows:

// Two objects are created: the literal "aaa" in the constant pool and a String instance on the heap.
// s1 refers to the heap instance; in JDK 8 both objects share the same underlying char[].
String s1 = new String("aaa");
System.out.println("aaa:" + System.identityHashCode("aaa")); // 621009875 (sample value, varies per run)
System.out.println("s1:" + System.identityHashCode(s1)); // 1265094477

// Empty string object
String s2 = new String("");
// s1 + s2 produces a new object, s3
String s3 = s1 + s2;
System.out.println("s3:" + System.identityHashCode(s3)); // 2125039532
// s1 and s3 are different heap objects, but the constant pool already contains "aaa",
// so both intern() calls return that pooled object and nothing is added to the pool
System.out.println("s1 VS s3:" + (s1.intern() == s3.intern())); // true
// s3 itself is unchanged by the intern() call
System.out.println("s3:" + System.identityHashCode(s3)); // 2125039532
// s4 refers to the object in the constant pool, so s3 and s4 are different objects and the result is false
String s4 = "aaa";
System.out.println("s4:" + System.identityHashCode(s4)); // 621009875
System.out.println(s3 == s4); // false

The second comparison is easier to understand. As described above, intern() returns the string object held in the string constant pool, so after s3 = s3.intern(); s3 refers to the pooled object. s4 refers to the same pooled object, so s3 and s4 are equal.

Finally, the third comparison, this is an interesting piece of code.

// How many objects are created here?
// The constant pool holds "hi" and "j", and the heap holds String instances for "hi" and "j".
// s5 is another heap instance, "hij". Note that the constant pool does not contain "hij" yet.
String s5 = new String("hi") + new String("j");
// Identity hash code of the s5 object (sample value, varies per run)
System.out.println("s5:" + System.identityHashCode(s5)); // 621009875
// This is the key step: the constant pool has no "hij", so a reference to the s5 object
// is recorded in the pool's hash table. The pool entry and s5 now point to the same object.
s5.intern();
// Print the identity hash code again to confirm that s5 has not changed
System.out.println("s5:" + System.identityHashCode(s5)); // 621009875
// The literal "hij" already has an entry in the string constant pool, so s6 refers to that same object
String s6 = "hij";
// s6 has the same identity hash code as s5, so s5 and s6 are equal
System.out.println("s6:" + System.identityHashCode(s6)); // 621009875
System.out.println(s5 == s6); // true

How can intern be used to save memory?

Suppose we create many user objects whose address information, such as province and city, overlaps heavily. We can call intern() on each of these values when assigning it. If an equal string already exists in the constant pool, the pooled object is reused and its reference is returned, so the duplicate object created at the start can be garbage collected.
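
A minimal sketch of this idea follows. The classes InternedUser and InternDemo and the sample city value are assumptions made for illustration; new String(...) stands in for values that arrive as separate objects, for example when parsed from input.

```java
class InternedUser {
    final String name;
    final String city;

    InternedUser(String name, String city) {
        this.name = name;
        // Intern the highly repetitive field so equal values share one object.
        this.city = city.intern();
    }
}

class InternDemo {
    static boolean citiesShareObject() {
        // Each new String(...) is a distinct heap object with equal contents.
        InternedUser u1 = new InternedUser("alice", new String("Hangzhou"));
        InternedUser u2 = new InternedUser("bob", new String("Hangzhou"));
        // After interning, both fields refer to the same pooled object.
        return u1.city == u2.city;
    }

    public static void main(String[] args) {
        System.out.println(citiesShareObject()); // true
    }
}
```

Interning pays off only for values that repeat often; for mostly unique values such as names it adds pool entries without saving anything.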