1. Consult blogs

In-depth understanding of the Java String class

2. Content supplement

2.1 Basic Features

public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    // ... remaining members omitted
}

This excerpt of the String source (from a JDK version in which the backing array is a char[], such as JDK 8) shows some basic features of the class:

  • The String class is declared with the final keyword, so it cannot be subclassed.
  • The String class implements the Serializable, Comparable, and CharSequence interfaces, so it supports the contracts defined by these interfaces.
    • Serializable: String instances can be serialized, for example to be transferred over the network.
    • Comparable: the String class defines an ordering for comparing String instances.
    • CharSequence: the characters of a String instance can be read as a sequence.
  • value, a member variable of the String class, shows that the characters of a String instance are backed by a char[].
  • value is declared private and final, and the class exposes no method that modifies the array, so the character sequence stored in a String instance is immutable: its contents cannot be modified once initialized.
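The features listed above can be checked from ordinary code. The following is a minimal sketch; the class name StringBasicsDemo and its method names are made up for illustration and are not part of the JDK.

```java
import java.io.Serializable;
import java.lang.reflect.Modifier;
import java.util.Locale;

public class StringBasicsDemo {
    /** True if java.lang.String is declared final. */
    static boolean stringIsFinal() {
        return Modifier.isFinal(String.class.getModifiers());
    }

    /** True if a String instance satisfies all three interfaces. */
    static boolean implementsInterfaces() {
        Object s = "abc";
        return s instanceof Serializable
                && s instanceof Comparable
                && s instanceof CharSequence;
    }

    /** A "modifying" method returns a new String; the receiver is left unchanged. */
    static String originalAfterUpperCase() {
        String original = "abc";
        String upper = original.toUpperCase(Locale.ROOT); // new instance "ABC"
        return original;                                   // still "abc"
    }

    public static void main(String[] args) {
        System.out.println(stringIsFinal());
        System.out.println(implementsInterfaces());
        System.out.println(originalAfterUpperCase());
    }
}
```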

2.2 Literal definition

In Java, only the primitive data types, null, and String support literal definitions.

When a String variable is assigned a literal, the String instance it refers to lives in the string constant pool, which the JVM keeps in heap space.

Definition example:

String str = "abc";

2.3 New + constructor definition

2.3.1 new String()

Definition example:

String str = new String();

Constructor source code:

public String() {
    this.value = "".value;
}

2.3.2 new String(String)

Definition example:

// String constants as arguments
String str1 = new String("abc");
// A String reference is used as an argument
String str2 = "Hello world!";
String str3 = new String(str2);

Constructor source code:

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

2.3.3 new String(char[])

Definition example:

char[] chars = {'a', 'b', 'c'};
String str = new String(chars);

Constructor source code:

public String(char value[]) {
    // Create a new array of value.length, copy the contents of value into the new array, and return the new array
    this.value = Arrays.copyOf(value, value.length);
}

// Arrays.copyOf source code
public static char[] copyOf(char[] original, int newLength) {
    // Create a new array copy with newLength
    char[] copy = new char[newLength];
    // Copy the contents of the Original array into the copy
    System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength));
    return copy;
}

// System.arraycopy source code
public static native void arraycopy(Object src, int srcPos, Object dest, int destPos, int length);
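Because the constructor copies the array with Arrays.copyOf, later changes to the caller's array should not reach the String. A small sketch of that behavior follows; CharArrayCopyDemo and buildThenMutate are illustrative names, not JDK API.

```java
public class CharArrayCopyDemo {
    /** Builds a String from an array, then mutates the array. */
    static String buildThenMutate() {
        char[] chars = {'a', 'b', 'c'};
        String str = new String(chars); // copies the contents of chars
        chars[0] = 'x';                 // changes only the caller's array
        return str;
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate()); // abc
    }
}
```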

2.3.4 Summary

Every String constructor shown above ends with an assignment of the form this.value = ...;

With new String() and new String(String), the new instance takes its value from an existing String instance that already holds the same character sequence (new String(char[]) copies the caller's array instead). So where does that existing instance come from?

The compiler knows which value each of these constructor calls needs: new String() always uses the literal "", and new String(String) uses the value of the String instance passed in as an argument. At run time, the JVM checks the string constant pool for an instance with the same value as the literal involved.

  • If one exists, the constructor creates the new instance in heap space and assigns it the value of the pooled instance.
  • If not, the JVM first creates an instance with that value in the string constant pool, and the constructor then creates the new instance in heap space from the pooled instance's value.

Examples:

How many objects are created by the following statement?

String str = new String("abc");

Break this statement into four parts:

  1. String str: declares a variable named str on the stack to hold the address of a String instance; no object is created.
  2. =: assigns the address of a String instance to the str variable; no object is created.
  3. "abc": before new String(String) runs, the JVM checks the string constant pool for an instance whose value is "abc". If one exists, no object is created; otherwise, one is created in the pool.
  4. new String(String): the pooled "abc" instance is passed to the constructor as an argument. The constructor assigns the argument's value to this.value and so creates a String instance with value "abc" in heap space.

To sum up, the statement creates at most two objects: one in the string constant pool (only if "abc" is not already there) and one in heap space.
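The two instances can be told apart by reference comparison. The sketch below assumes nothing beyond the standard String API; the class name NewStringIdentityDemo is illustrative.

```java
public class NewStringIdentityDemo {
    /** Returns {created == literal, created.equals(literal), created.intern() == literal}. */
    static boolean[] compare() {
        String literal = "abc";             // instance from the string constant pool
        String created = new String("abc"); // separate instance in heap space
        return new boolean[] {
            created == literal,             // different objects
            created.equals(literal),        // same contents
            created.intern() == literal     // intern() returns the pooled instance
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```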

[Figure: JVM memory allocation]

2.4 Similarities and differences between the two definitions

  • String instances defined using literals are stored in the string constant pool.
  • String instances defined using literals may share one instance if the values are the same.
  • Instances of strings defined using the new+ constructor are stored in heap space.
  • String instances defined using the new+ constructor must be created independently, even if they have the same value.
  • String instances defined using the new+ constructor rely on a string instance in the string constant pool to supply their value.
  • String instances are immutable, whether they are defined using literals or the new+ constructor.
String str1 = "abc";
String str2 = "abc";
// true, the constant pool uses one instance
System.out.println(str1 == str2);

String str3 = new String("abc");
String str4 = new String("abc");
// false, create separate instances in the heap space
System.out.println(str3 == str4);

2.5 Immutability

Reassigning a String variable makes the variable point to a different String instance, either an existing one or a newly created one. The value of the original instance cannot be modified in place to achieve the reassignment.

Examples:

String str = "abc";
str = "abcd";

The "abc" string instance is initialized with value ['a', 'b', 'c'], so value.length = 3. Because the value of this instance can never change, executing str = "abcd" leaves the "abc" instance untouched. Instead, the JVM looks in the string constant pool for an instance whose value is "abcd". If one exists, its address is assigned to str directly. If not, a new instance with the value "abcd" is created in the pool, and its address is assigned to str.
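A second reference to the original instance makes the effect visible: after the reassignment, the alias still sees "abc". This is a minimal sketch; ReassignmentDemo and the variable alias are illustrative names.

```java
public class ReassignmentDemo {
    /** Returns {value seen through str, value seen through alias} after reassigning str. */
    static String[] reassign() {
        String str = "abc";
        String alias = str; // second reference to the same "abc" instance
        str = "abcd";       // str now points to another instance
        return new String[] {str, alias};
    }

    public static void main(String[] args) {
        String[] r = reassign();
        System.out.println(r[0] + " " + r[1]); // abcd abc
    }
}
```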

2.6 “+” string concatenation

  • Concatenating string constants with "+" is resolved at compile time, and the result is stored in the string constant pool. If the pool already holds an instance with the concatenated value, the address of that instance is returned.
  • Concatenating a string constant and a variable with "+" creates a new instance in heap space at run time.
  • Concatenating two string variables with "+" also creates a new instance in heap space at run time.

Examples:

String str1 = "abc";
String str2 = "def";
String str3 = "abcdef";
String str4 = "abc" + "def";
String str5 = str1 + "def";
String str6 = "abc" + str2;
String str7 = str1 + str2;
String str8 = str7.intern();

// true
System.out.println(str3 == str4);
// All three of the following are false
System.out.println(str3 == str5);
System.out.println(str3 == str6);
System.out.println(str3 == str7);
// true
System.out.println(str3 == str8);
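One case the rules above do not spell out: a final variable initialized with a literal counts as a compile-time constant, so concatenating it behaves like constant + constant. The sketch below illustrates this; ConstantFoldingDemo and its method names are illustrative.

```java
public class ConstantFoldingDemo {
    /** final local initialized with a literal: the "+" is folded at compile time. */
    static boolean finalOperand() {
        final String part = "abc";
        String joined = part + "def"; // compiled as the constant "abcdef"
        return joined == "abcdef";
    }

    /** Ordinary variable: the "+" runs at run time and yields a new heap instance. */
    static boolean nonFinalOperand() {
        String part = "abc";
        String joined = part + "def";
        return joined == "abcdef";
    }

    public static void main(String[] args) {
        System.out.println(finalOperand());    // true
        System.out.println(nonFinalOperand()); // false
    }
}
```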

[Figure: JVM memory allocation]

2.7 StringBuilder and StringBuffer

2.7.1 Overview

StringBuffer and StringBuilder both represent mutable character sequences.

Unlike a String instance, a StringBuffer or StringBuilder instance can be modified many times in place, without creating a new object for every change.
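The in-place behavior can be seen from the fact that append returns the same object it was called on. A short sketch, with MutableBuilderDemo as an illustrative class name:

```java
public class MutableBuilderDemo {
    /** append modifies the builder and returns the builder itself. */
    static boolean appendReturnsSameObject() {
        StringBuilder sb = new StringBuilder("abc");
        StringBuilder returned = sb.append("def");
        return returned == sb;
    }

    /** Several edits applied to one builder instance. */
    static String editInPlace() {
        StringBuilder sb = new StringBuilder("abc");
        sb.append("def");   // abcdef
        sb.insert(0, "x");  // xabcdef
        sb.deleteCharAt(1); // xbcdef
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendReturnsSameObject());
        System.out.println(editInPlace());
    }
}
```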

StringBuffer and StringBuilder offer the same operations. The difference is that most methods in StringBuffer are declared synchronized and are therefore thread-safe, whereas the methods in StringBuilder are not synchronized and should be considered not thread-safe. As a result, StringBuilder is faster than StringBuffer and is recommended in most cases. When one instance is shared between threads, use StringBuffer (or guard the StringBuilder with external synchronization).

In practice, prefer StringBuffer in a multithreaded environment where a shared instance is appended to, inserted into, or deleted from. Prefer StringBuilder in single-threaded code that does a lot of string concatenation, insertion, and deletion.
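As a rough illustration of the shared-instance case, the sketch below lets several threads append to one StringBuffer and checks that no append is lost. SharedBufferDemo and appendFromThreads are illustrative names; the same test with StringBuilder has no guaranteed outcome, so it is not shown.

```java
public class SharedBufferDemo {
    /** Appends one char per iteration from several threads; returns the final length. */
    static int appendFromThreads(int threads, int appendsPerThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    shared.append('x'); // synchronized in StringBuffer
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromThreads(4, 1000)); // 4000
    }
}
```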

2.7.2 Inheritance Structure

2.7.3 Source code analysis

This article analyzes only the StringBuilder source code; StringBuffer works analogously.

  1. AbstractStringBuilder stores the character sequence in a non-final char[] and uses count to track the number of characters in use.

    abstract class AbstractStringBuilder implements Appendable, CharSequence {
        /** The value is used for character storage. */
        char[] value;

        /** The count is the number of characters used. */
        int count;

        // ... remaining members omitted
    }
  2. The StringBuilder source shows that the default initial capacity is 16. If a string is passed in, the initial capacity is the string's length + 16.

    public StringBuilder() {
        super(16);
    }

    public StringBuilder(String str) {
        super(str.length() + 16);
        append(str);
    }
  3. When StringBuilder appends a string, the underlying char[] is reallocated and copied if its capacity is insufficient.

    @Override                                 
    public StringBuilder append(String str) { 
        super.append(str);                    
        return this;                          
    }                                         
    
    public AbstractStringBuilder append(String str) {
        if (str == null)
            // Append the four characters "null" to the end of value
            return appendNull();
        // Number of characters to append
        int len = str.length();
        // If count + len exceeds the capacity (value.length), allocate a new array,
        // normally of size capacity * 2 + 2, and copy the old contents into it
        ensureCapacityInternal(count + len);
        // Copy characters 0..len of str into value, starting at index count
        str.getChars(0, len, value, count);
        // Update count
        count += len;
        return this;
    }
    
    private void ensureCapacityInternal(int minimumCapacity) {
        if (minimumCapacity - value.length > 0) {
            // newCapacity starts from (value.length << 1) + 2
            value = Arrays.copyOf(value, newCapacity(minimumCapacity));
        }
    }

    public static char[] copyOf(char[] original, int newLength) {
        char[] copy = new char[newLength];
        // void arraycopy(Object src, int srcPos, Object dest, int destPos, int length);
        System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength));
        return copy;
    }
    
    public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
        if (srcBegin < 0) {
            throw new StringIndexOutOfBoundsException(srcBegin);
        }
        if (srcEnd > value.length) {
            throw new StringIndexOutOfBoundsException(srcEnd);
        }
        if (srcBegin > srcEnd) {
            throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
        }
        // void arraycopy(Object src, int srcPos, Object dest, int destPos, int length);
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }
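The capacity rules described in this list can be observed through the public capacity() and ensureCapacity() methods. The sketch below uses only those documented methods; BuilderCapacityDemo is an illustrative class name.

```java
public class BuilderCapacityDemo {
    /** Capacity of a builder created with the no-argument constructor. */
    static int defaultCapacity() {
        return new StringBuilder().capacity();
    }

    /** Capacity of a builder created from a string: length + 16. */
    static int capacityFor(String s) {
        return new StringBuilder(s).capacity();
    }

    /** Growth step: the larger of the requested minimum and old capacity * 2 + 2. */
    static int grownCapacity() {
        StringBuilder sb = new StringBuilder(); // capacity 16
        sb.ensureCapacity(17);                  // max(17, 16 * 2 + 2)
        return sb.capacity();
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity());
        System.out.println(capacityFor("abc"));
        System.out.println(grownCapacity());
    }
}
```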

3. Interview questions

3.1 Question 1

Q: What is the output when I execute the following code?

public class Test {

    private String str = "abc";
    private char[] chars = {'t', 'e', 's', 't'};

    public void change(String str, char[] chars) {
        str = "Hello world!";
        chars[0] = 'b';
    }

    public static void main(String[] args) {
        Test test = new Test();
        test.change(test.str, test.chars);
        System.out.println(test.str);
        System.out.println(test.chars);
    }
}

A: "abc" and "best".

Topics tested: method parameter passing and the immutability of String instances.

Analysis:

The Test object on the heap holds a reference named str (the member variable), and the stack holds another reference named str (the parameter of the change method). Both initially point to the same "abc" instance. Java passes references by value, so when change assigns "Hello world!" to its parameter, only the stack copy is redirected to the "Hello world!" instance; because String instances are immutable, the assignment cannot alter the "abc" instance itself. When change returns, the parameter is discarded, and the str that main prints is the member variable, which still points to "abc".

chars prints "best" because a char[] is mutable. During change, the parameter chars on the stack refers to the same array as the member variable, and chars[0] = 'b' modifies that array in place.
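The same contrast appears with any mutable type. The following sketch swaps the char[] for a StringBuilder to show that reassigning a parameter is invisible to the caller while mutating the referenced object is not; ParameterPassingDemo and its methods are illustrative names.

```java
public class ParameterPassingDemo {
    /** Reassigns the parameter only; the caller's variable is unaffected. */
    static void reassign(String s) {
        s = "changed";
    }

    /** Mutates the object the parameter refers to; the caller sees the change. */
    static void mutate(StringBuilder sb) {
        sb.append("!");
    }

    static String[] run() {
        String text = "abc";
        StringBuilder builder = new StringBuilder("abc");
        reassign(text);
        mutate(builder);
        return new String[] {text, builder.toString()};
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " " + r[1]); // abc abc!
    }
}
```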

3.2 Question 2

Q: How many objects are generated in the JVM by executing the following code?

String str1 = "abc";
String str2 = new String("def");
String str3 = str1 + str2;

A: Five objects.

Analysis:

Decompiling the bytecode (as compiled by JDK 8 or earlier, which translates "+" on strings into StringBuilder calls) yields:

String str1 = "abc";
String str2 = new String("def");
(new StringBuilder()).append(str1).append(str2).toString();

The toString method of the StringBuilder class:

public String toString() {
    // Returns the merged string instance
    return new String(value, 0, count);
}

First statement: creates an "abc" string instance in the string constant pool.

Second statement: a “def” string instance is created in the string constant pool and a “def” string instance is created in the heap.

Third statement: a StringBuilder instance and an “abcdef” string instance are created in the heap.

3.3 Question 3

Q: How many objects are generated in the JVM by executing the following code?

String str1 = "abc";
String str2 = new String("def");
String str3 = str1 + str2;
System.out.println(str3);

A: Four objects, judging by the decompiled source.

Analysis:

Decompiling bytecode yields:

String str1 = "abc";
String str2 = new String("def");
String str3 = str1 + str2;
System.out.println(str3);

First statement: creates an "abc" string instance in the string constant pool.

Second statement: a “def” string instance is created in the string constant pool and a “def” string instance is created in the heap.

Third statement: an “abcdef” string instance is created in the heap.

Extension:

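Why does the decompiled code show an explicit StringBuilder only when str3 is not used?

According to the decompiled output, once the string is used later in the code, the "+" expression is shown as written. Note that a decompiler may simply fold the StringBuilder call chain back into "+" in that case: javac from JDK 8 or earlier emits the StringBuilder calls for str1 + str2 whether or not str3 is used, and from JDK 9 on it emits an invokedynamic call instead. Inspecting the bytecode with javap -c shows what is actually executed.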

3.4 Question 4

Q: What is the difference between the following three ways of creating empty strings?

String str1 = "";
String str2 = new String();
String str3 = new String("");

A:

First statement: creates a "" string instance in the string constant pool.

Second statement: creates a "" string instance in the string constant pool (if it is not already there) and a "" string instance in heap space.

Third statement: creates a "" string instance in the string constant pool, creates a "" string instance in heap space, and copies the cached hash of the pooled instance to the heap instance.

Extension:

What is the difference between a string instance whose hash has been computed and one whose hash has not?

When two such instances are used in the same HashMap, operations on the instance whose hash is already cached are faster. For an instance whose hash has not been computed, the call to hashCode first computes the hash and stores it in the instance. For an instance whose hash is already cached, the stored value is reused directly.

3.5 Question 5

Q: What is the difference between the following two ways of creating strings?

// The first way
String str1 = new String("Hello world!");
// The second way
String str2 = "Hello world!";
String str3 = new String(str2);

A:

In the first way, the literal "Hello world!" is passed straight to the constructor. When the statement runs, the JVM looks in the string constant pool for an instance with value "Hello world!", uses it if it exists, and otherwise creates one in the pool first. In the second way, the same pooled instance is first assigned to the variable str2, and the constructor then receives that reference.

So both ways resolve the same literal and call the same constructor; the second way does not save a lookup.

At run time, the second way keeps one more reference than the first: str2 holds the address of the "Hello world!" instance in the string constant pool.

Therefore, the second way uses slightly more memory than the first (one extra reference).

3.6 Question 6

Q: Is it good to concatenate strings using the concat method?

A: No.

Analysis:

The relevant source code is as follows:

// concat source code
public String concat(String str) {
    // Gets the length of the argument string, or itself if it is empty
    int otherLen = str.length();
    if (otherLen == 0) {
        return this;
    }
    // Get the length of its own character array
    int len = value.length;
    // Create a new character array whose length is the sum of itself and the parameter length, copy the contents of its character array into the new character array
    char buf[] = Arrays.copyOf(value, len + otherLen);
    // Copy the contents of the argument's character array into the new character array
    str.getChars(buf, len);
    // Convert the new character array to a string and return
    return new String(buf, true);
}

// Arrays.copyOf source code
public static char[] copyOf(char[] original, int newLength) {
    char[] copy = new char[newLength];
    System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength));
    return copy;
}

// getChars source code
void getChars(char dst[], int dstBegin) {
    System.arraycopy(value, 0, dst, dstBegin, value.length);
}

// System.arraycopy source code
public static native void arraycopy(Object src, int srcPos, Object dest, int destPos, int length);

According to the source, every concat call allocates a new array large enough for both strings and copies the contents of both into it. When many strings are joined one after another, this allocation and copying is repeated at each step, which wastes CPU time and memory. For that reason, concat is not recommended for repeated concatenation; use StringBuilder instead.
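The two approaches can be compared side by side. In this sketch, joinWithConcat allocates a new String on every iteration, while joinWithBuilder reuses one buffer; ConcatDemo and its method names are illustrative, and no timing is claimed.

```java
public class ConcatDemo {
    /** concat with an empty argument returns the receiver itself. */
    static boolean emptyArgumentReturnsThis() {
        String s = "abc";
        return s.concat("") == s;
    }

    /** Each iteration creates a new String holding everything joined so far. */
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String p : parts) {
            result = result.concat(p);
        }
        return result;
    }

    /** One builder accumulates all parts; a String is created once at the end. */
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithConcat(parts));
        System.out.println(joinWithBuilder(parts));
    }
}
```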

3.7 Question 7

Q: Is it good to compare two shorter strings using the equals method?

A: equals always gives the correct answer, but it is not always the cheapest first check.

Analysis:

Look at the equals source of the String class:

// equals source code
public boolean equals(Object anObject) {
    // Return true if the address is the same as the parameter
    if (this == anObject) {
        return true;
    }
    // If they are of the same class, they can be compared
    if (anObject instanceof String) {
        String anotherString = (String) anObject;
        int n = value.length;
        // If the length is the same as the parameter, the bitwise comparison is performed
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            // All characters are the same, return true
            return true;
        }
    }
    return false;
}

The equals method overridden in the String class compares the two strings character by character once their lengths match, so in the worst case its cost grows with the length of the strings.

The hashCode source of the String class:

public int hashCode() {
    // The default is 0, and if the hash has already been computed, it will be returned directly
    int h = hash;
    // Calculate hashcode with value
    if (h == 0 && value.length > 0) {
        char val[] = value;
        // Fold each char (UTF-16 code unit) into the hash: h = 31 * h + c
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}
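The loop above can be reproduced outside the String class to confirm the formula. A minimal sketch, with HashFormulaDemo and manualHash as illustrative names:

```java
public class HashFormulaDemo {
    /** Recomputes the hash with the same recurrence: h = 31 * h + c. */
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(manualHash("abc"));  // 96354
        System.out.println("abc".hashCode());   // 96354
    }
}
```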

As the hashCode source shows, two string instances with the same value always return the same hashCode. The reverse is not guaranteed: two different values can produce the same hashCode.

So comparing the hashCode of two strings can quickly show that their contents differ, and when the hashes are already cached this is cheaper than equals. Equal hashCode values do not prove equal contents, so a match still has to be confirmed with equals.

String str1 = new String("abc");
String str2 = new String("abc");
System.out.println(str1.hashCode() == str2.hashCode());

Extension:

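Why is comparing two long strings using hashCode not recommended?

Because hashCode returns only a 32-bit int, different values can produce the same hashCode, and computing it for a long string means visiting every character anyway. The original recommendation is to limit hashCode comparison to strings of length up to 16; even then a collision is possible, so equals remains the reliable test.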
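A collision is easy to produce even with two-character strings: "Aa" and "BB" both hash to 2112. The sketch below shows this and a comparison that uses the hash only as a pre-check; HashCollisionDemo and safeEquals are illustrative names.

```java
public class HashCollisionDemo {
    /** True if the two strings have the same hashCode. */
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    /** Uses the hash as a cheap pre-check, then confirms with equals. */
    static boolean safeEquals(String a, String b) {
        return a.hashCode() == b.hashCode() && a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameHash("Aa", "BB"));   // true, although the contents differ
        System.out.println(safeEquals("Aa", "BB")); // false
    }
}
```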

3.8 Question 8

Q: The String.intern method seems to reduce memory footprint for the JVM, so should it be used on a large scale in projects?

A: It depends on the JDK version.

In JDK 6 and earlier, the intern method stored strings in the permanent generation. The permanent generation is limited in size and is cleaned up only during full garbage collections, so interning many strings there ties up memory and wastes it. In JDK 7, the string constant pool was moved to heap space, where unreferenced interned strings can be reclaimed by the garbage collector.
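What intern does can be sketched as follows: two strings built at run time are separate objects, but their interned forms are one and the same instance. InternDemo and the helper build are illustrative names.

```java
public class InternDemo {
    /** Builds a string at run time so that it is not a compile-time constant. */
    static String build(String prefix, int n) {
        return new StringBuilder(prefix).append(n).toString();
    }

    /** Returns {a == b, a.intern() == b.intern()}. */
    static boolean[] compare() {
        String a = build("user-", 42);
        String b = build("user-", 42);
        return new boolean[] {
            a == b,                  // two separate heap instances
            a.intern() == b.intern() // one canonical pooled instance
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0] + " " + r[1]); // false true
    }
}
```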

3.9 Question 9

Q: Why are String instances designed to be immutable?

A:

(1) Thread safety. Because String instances are immutable, multiple threads can read the same instance at the same time without risk: when one thread "changes" a string variable, it only points that variable at a new instance, and the values read by the other threads are not affected. Even without locking, there is no thread-safety issue.

(2) Suitable as a HashMap key. (Explained in HashMap Details)
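Why immutability matters for a key can be sketched with a mutable key for contrast. Here an ArrayList key is changed after insertion, which changes its hashCode, so the map no longer finds it; a String key cannot be changed this way. MapKeyDemo and its method names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapKeyDemo {
    /** A mutable key modified after put: the lookup uses a different hash and fails. */
    static boolean mutableKeyStillFound() {
        Map<List<String>, Integer> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        map.put(key, 1);
        key.add("b"); // the key's hashCode changes after insertion
        return map.containsKey(key);
    }

    /** A String key: "modifying" it yields a new object, the stored key is unchanged. */
    static boolean stringKeyStillFound() {
        Map<String, Integer> map = new HashMap<>();
        String key = "a";
        map.put(key, 1);
        String other = key.concat("b"); // new instance; key still refers to "a"
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(mutableKeyStillFound()); // false
        System.out.println(stringKeyStillFound());  // true
    }
}
```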

3.10 Question 10

Q: Can the value of a String instance be changed?

A: Yes.

Analysis: The private and final modifiers are enforced by the compiler and stop ordinary code from reassigning the field. At run time, however, reflection can reach the field and modify the array it refers to. The code below assumes a JDK such as JDK 8, where value is a char[] and reflective access to it is permitted.

Such as the following code:

String str = "abc";
// Get the value field declared in the String class
Field value = String.class.getDeclaredField("value");
// Get permission to modify the private field
value.setAccessible(true);
// Get the value array of the str instance
char[] chars = (char[]) value.get(str);
// Modify the value data
chars[0] = 'b';
// Output: bbc
System.out.println(str);