Overview

Before we look at the String source code, let's briefly review the memory layout of the JVM, which will help us understand the design of strings better:

JVM memory model

  • Method Area: when the virtual machine loads a class file, it parses the type information from the binary data in the file and stores it (including static variables, constants, and so on) in the method area. This memory area is shared by all threads. Within the method area there is a special region called the Constant Pool.
  • Heap: the largest chunk of memory managed by the Java virtual machine. The Java heap is shared by all threads.
  • Stack: also called the virtual machine stack. The JVM allocates a stack for each newly created thread, so a Java program runs by operating on stacks. The stack holds the state of the thread in frames, and the JVM performs only two kinds of operations on it: pushing a frame and popping a frame. The method a thread is currently executing is called the thread's current method.
  • Program Counter Register: the JVM supports multiple threads running simultaneously, and each new thread gets its own PC register (program counter) when it is created. If the thread is executing a Java (non-native) method, the PC register holds the address of the instruction currently being executed; if the method is native, the value of the register is undefined. The register is wide enough to hold a return address or a native pointer.
  • Native Method Stack: stores the call state of native methods.

The constant pool holds data that is determined at compile time and stored in the compiled .class file. It includes constants about classes, methods, interfaces, and so on, as well as string constants. Roughly speaking, Java divides memory into heap memory and stack memory: the former stores objects, and the latter stores local variables of primitive types and references to objects.

Main text

Inheritance relationships

Take a look at the comments in the documentation first.

  • Strings are constant; their values cannot be changed after they are created. String buffers support mutable strings. Because String objects are immutable they can be shared. For example:
  • In other words: a String is a constant whose value cannot be changed after the instance is created, while string buffers support mutable strings. Because String objects are immutable, they can be shared.
String inheritance system

From the comments and the inheritance hierarchy, we know that String is final, that its value cannot be changed once created, and that it implements the CharSequence, Comparable, and Serializable interfaces.

Final:

  • Decorating a class: a class declared final cannot be inherited. In other words, the String class cannot be subclassed.
  • Decorating a method: locks the method so that no subclass can override it and change its meaning.
  • Decorating a variable: a final variable of a primitive type cannot change its value once initialized; a final variable of a reference type cannot point to another object after it is initialized.

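The String class is final and cannot be inherited. The character array underlying String is also final, so the value field cannot be pointed at a different array once it has been assigned. Note that final by itself does not freeze the elements of the array; String is immutable because the array is private and no method of the class modifies it.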
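To make the difference between a final reference and immutable contents concrete, here is a minimal sketch. The class FinalArrayDemo and its VALUE field are hypothetical names used only for illustration; they are not part of the JDK.

```java
class FinalArrayDemo {
    // final fixes the reference, not the contents of the array it points to
    private static final char[] VALUE = {'a', 'b', 'c'};

    static String mutateElement() {
        VALUE[0] = 'x';          // allowed: an element of the array changes
        // VALUE = new char[3];  // would not compile: VALUE is final
        return new String(VALUE);
    }

    public static void main(String[] args) {
        System.out.println(mutateElement());
    }
}
```

String avoids this kind of mutation by keeping its array private and never writing to it after construction.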

CharSequence

CharSequence literally means a sequence of characters, which is what we usually call a string, but CharSequence is an interface while String is a class. Here's a look at the methods inside the interface:

    public interface CharSequence {
        int length();
        char charAt(int index);
        CharSequence subSequence(int start, int end);
        public String toString();
    }

There are very few methods, and none of the String methods we use every day. CharSequence is just a general-purpose interface, so let's take a look at its implementation classes.

CharSequence implementation class
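Because String, StringBuilder, and StringBuffer all implement CharSequence, a method written against the interface accepts any of them. The sketch below illustrates this; the class CharSequenceDemo and its count helper are hypothetical.

```java
class CharSequenceDemo {
    // Counts occurrences of a character using only the CharSequence methods
    static int count(CharSequence seq, char target) {
        int n = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (seq.charAt(i) == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("banana", 'a'));
        System.out.println(count(new StringBuilder("banana"), 'n'));
        System.out.println(count(new StringBuffer("banana"), 'b'));
    }
}
```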


Member variables

    private final char value[]; // final character array; the reference cannot be reassigned once set
    private int hash;           // caches the hash code of the String
    private static final ObjectStreamField[] serialPersistentFields =
            new ObjectStreamField[0]; // stores the serialization information of the object

Constructors

No-argument initialization

    public String() {
        this.value = "".value;
    }
    // Initializes the array to that of the empty string. The statements below
    // create a reference in stack memory and an object in heap memory:
    // String str = new String();
    // str = "hello";

  • 1. An empty String is created on the heap.
  • 2. "hello" is then created in the constant pool as a second String.
  • 3. The reference to the second String is assigned to str, which previously pointed to the first String.

This actually creates two objects.

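Initialization from a String

    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
    // Code example:
    // String str = new String("hello");

One new object is created on the heap. It shares the character array of the "hello" literal, which itself lives in the constant pool.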
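A short check of what this constructor produces: the new object has the same content as the literal but is a different object. The class name NewStringDemo is hypothetical.

```java
class NewStringDemo {
    static boolean sameObject() {
        String literal = "hello";
        String copy = new String("hello");
        return literal == copy;        // compares references
    }

    static boolean sameContent() {
        String literal = "hello";
        String copy = new String("hello");
        return literal.equals(copy);   // compares characters
    }

    public static void main(String[] args) {
        System.out.println(sameObject());   // false
        System.out.println(sameContent());  // true
    }
}
```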

Character array initialization

    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }
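The Arrays.copyOf call matters: because the constructor copies the array, later changes to the caller's array do not leak into the String. A minimal sketch (CharArrayCopyDemo is a hypothetical name):

```java
class CharArrayCopyDemo {
    static String build() {
        char[] source = {'c', 'a', 't'};
        String s = new String(source);
        source[0] = 'b';   // modifies only the caller's array
        return s;          // the String still holds its own copy
    }

    public static void main(String[] args) {
        System.out.println(build()); // cat
    }
}
```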

Byte array initialization

No charset specified

    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

    public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }

    static char[] decode(byte[] ba, int off, int len) {
        String csn = Charset.defaultCharset().name();
        try {
            // use the charset-name decode() variant, which provides caching
            return decode(csn, ba, off, len);
        } catch (UnsupportedEncodingException x) {
            warnUnsupportedCharset(csn);
        }
        try {
            // fall back to ISO-8859-1 if the default charset is unsupported
            return decode("ISO-8859-1", ba, off, len);
        } catch (UnsupportedEncodingException x) {
            // exception handling
        }
    }

Charset specified

    String(byte bytes[], Charset charset)
    String(byte bytes[], String charsetName)
    String(byte bytes[], int offset, int length, Charset charset)
    String(byte bytes[], int offset, int length, String charsetName)

Bytes are the serialized form used for network transmission and storage, so many transmission and storage procedures need to convert between byte[] arrays and Strings. A byte is a unit of raw data and a char is a character; a byte stream is produced by encoding characters with a charset. To convert bytes into a Unicode char[] array without garbled characters, you must specify the charset used to decode them.
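The following sketch shows why the decoding charset has to match the encoding one. CharsetDemo and its two helpers are hypothetical; the sample text uses a Unicode escape for "é" so that the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Encode and decode with the same charset: the text survives
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode as UTF-8 but decode as ISO-8859-1: non-ASCII characters are garbled
    static String wrongDecode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo";
        System.out.println(roundTrip(text).equals(text));   // true
        System.out.println(wrongDecode(text).equals(text)); // false
    }
}
```

In UTF-8 the "é" takes two bytes, and ISO-8859-1 decodes each byte as a separate character, so the wrongly decoded string is one character longer than the original.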

Construction from StringBuffer and StringBuilder

    public String(StringBuffer buffer) {
        synchronized (buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }

Most of the time we don't use these constructors, because StringBuilder and StringBuffer both have a toString method. If thread safety is not a concern, StringBuilder is preferred.

The equals method

  public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }
  • 1. Check whether the two references point to the same object
  • 2. Then check whether the argument is a String
  • 3. If both are Strings, compare the lengths first and then compare the characters one by one

The hashCode method

    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
  • 1. If the hash has already been computed (the cached value is nonzero), return it; an empty String always hashes to 0
  • 2. Otherwise, compute the hash value as s[0]*31^(n-1) + s[1]*31^(n-2) + … + s[n-1] and cache it

Hash values are often used as a quick first check of whether two objects can be equal: different hash values mean the objects differ, while equal hash values still need equals() to confirm. We therefore want hash values to be as unique as possible. As mentioned earlier in our analysis of HashMap principles, the fewer collisions, the more efficient the query.
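The formula can be checked by computing it by hand and comparing with the built-in method. HashDemo and manualHash are hypothetical names; the loop mirrors the source shown above, without the caching.

```java
class HashDemo {
    // h = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], evaluated with Horner's rule
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(manualHash("abc") == "abc".hashCode()); // true
        System.out.println("abc".hashCode());                      // 96354
        // Two different strings can share a hash value (a collision)
        System.out.println("Aa".hashCode() == "BB".hashCode());    // true
    }
}
```

The "Aa" and "BB" pair (both hash to 2112) shows why equal hash values alone cannot prove that two strings are equal.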

Intern method

    public native String intern();
  • Returns a canonical representation for the string object. A pool of strings, initially empty, is maintained privately by the class String. When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned. It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
  • In other words: intern returns the canonical representation of the current String. The String class maintains a pool of strings that is initially empty. When the method is called, if the pool already contains a string equal to the current one, the pooled string is returned. Otherwise, the current string is added to the pool and a reference to it is returned. For any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.

If the constant pool already contains a string with the current value, a reference to the pooled string is returned. If not, the string is added to the pool and a reference to it is returned.
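A small sketch of the behavior described above (InternDemo is a hypothetical class name): a String created with new is a separate heap object, but interning it yields the pooled literal.

```java
class InternDemo {
    static boolean beforeIntern() {
        String literal = "wustor";
        String heap = new String("wustor");
        return literal == heap;            // different objects
    }

    static boolean afterIntern() {
        String literal = "wustor";
        String heap = new String("wustor");
        return literal == heap.intern();   // intern() returns the pooled instance
    }

    public static void main(String[] args) {
        System.out.println(beforeIntern()); // false
        System.out.println(afterIntern());  // true
    }
}
```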

String overloading of “+”

We know that “+” and “+=” are the only two overloaded operators in Java. Java does not support any other overloaded operators. Let’s decompile to see how Java overloads:

public static void main(String[] args) {
     String str1="wustor";
     String str2= str1+ "Android";
}

Compile Main.java and run javap -c Main to disassemble it. The result is shown below:

Disassembly of the Main class

We may not understand all of the bytecode, but we can see a StringBuilder, then "wustor" and "Android", and then calls to StringBuilder's append method. If the compiler already optimizes concatenation for us, why is using StringBuilder explicitly still recommended? Look closely at the third line of the disassembly: it creates a new StringBuilder object. If the "+" concatenation sits inside a loop, a new StringBuilder is created on every iteration, even with the compiler's optimization. In addition, the compiler does not know the final length in advance and cannot presize the buffer, which increases memory overhead. We will examine StringBuilder itself later.
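The loop case can be sketched as follows. ConcatDemo and both methods are hypothetical; the two produce the same text, but the first builds a fresh intermediate String on every iteration while the second reuses one presized buffer. Note that the StringBuilder translation described here applies to the compiler version the article disassembles; since JDK 9, javac emits an invokedynamic call for "+" instead, though repeated concatenation in a loop still creates a new String each time.

```java
class ConcatDemo {
    // Each += produces a new String (and, on older compilers, a new StringBuilder)
    static String withPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // One builder, presized with a rough estimate of the final length
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder(n * 2);
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus(5));    // 01234
        System.out.println(withBuilder(5)); // 01234
    }
}
```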

switch

Switch principle of String

  • 1. First call the String's hashCode method to get the corresponding hash code
  • 2. Switch on this code and map each matching case to a unique identifier
  • 3. Execute the branch selected by that identifier
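The three steps above can be written out by hand. The sketch below is an approximation of what the compiler generates, not its exact output; SwitchDemo and its methods are hypothetical. The equals check after each hash match is there because two different strings can share a hash code.

```java
class SwitchDemo {
    // What we write
    static String viaSwitch(String s) {
        switch (s) {
            case "hello": return "H";
            case "world": return "W";
            default:      return "?";
        }
    }

    // Roughly what it becomes: switch on the hash, confirm with equals, then switch on an int
    static String desugared(String s) {
        int index = -1;
        switch (s.hashCode()) {
            case 99162322:   // "hello".hashCode()
                if (s.equals("hello")) index = 0;
                break;
            case 113318802:  // "world".hashCode()
                if (s.equals("world")) index = 1;
                break;
        }
        switch (index) {
            case 0:  return "H";
            case 1:  return "W";
            default: return "?";
        }
    }

    public static void main(String[] args) {
        System.out.println(viaSwitch("hello") + desugared("hello")); // HH
        System.out.println(viaSwitch("other") + desugared("other")); // ??
    }
}
```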

I’m curious, so I’m going to look at what happens if it’s a char

    public static void main(String[] args) {
        char ch = 'a';
        switch (ch) {
            case 'a':
                System.out.println("hello");
                break;
            case 'b':
                System.out.println("world");
                break;
            default:
                break;
        }
    }

Switch statement for Char

It's basically the same as with String, so I won't go into detail: the char is simply treated as its int value. Java's switch support for String is ultimately just switch support for int.

StringBuilder

Since a String is immutable, concatenating with the overloaded "+" creates multiple objects. A StringBuilder object is mutable, so you can concatenate directly with its append method.

StringBuilder inheritance

    public final class StringBuilder extends AbstractStringBuilder
            implements java.io.Serializable, CharSequence {

        // No-argument constructor: default capacity of 16
        public StringBuilder() {
            super(16);
        }

        // Constructor with an explicit capacity
        public StringBuilder(int capacity) {
            super(capacity);
        }

        // Constructor from a String: capacity is the string length plus 16
        public StringBuilder(String str) {
            super(str.length() + 16);
            append(str);
        }

        @Override
        public StringBuilder append(CharSequence s) {
            super.append(s);
            return this;
        }
        // ...
    }

We see that StringBuilder delegates to methods of its parent class, and from the inheritance hierarchy we know that it is a subclass of AbstractStringBuilder, so let's look at that parent class. AbstractStringBuilder implements the Appendable and CharSequence interfaces, so it can be converted to and from String.

Member variables

    char[] value; // character storage
    int count;    // number of characters in use

Constructors

    AbstractStringBuilder() {
    }

    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }

As you can see, AbstractStringBuilder has only two constructors: an empty one, and one that specifies the size of the character array. If you know the length of the string in advance and it is less than 16, you can pass a smaller capacity to save memory. Unlike String's array, the value array is not final, so the reference can be reassigned to a new array object. That is why a StringBuilder object is mutable.
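The initial capacities chosen by the StringBuilder constructors can be observed through the public capacity() method. CapacityDemo is a hypothetical class name; the values follow the constructor behavior documented for StringBuilder.

```java
class CapacityDemo {
    static int defaultCapacity() {
        return new StringBuilder().capacity();      // 16 by default
    }

    static int fromString() {
        return new StringBuilder("abc").capacity(); // string length + 16
    }

    static int explicit() {
        return new StringBuilder(64).capacity();    // whatever the caller asked for
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity()); // 16
        System.out.println(fromString());      // 19
        System.out.println(explicit());        // 64
    }
}
```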

Append method


    @Override
    public AbstractStringBuilder append(char c) {
        ensureCapacityInternal(count + 1);
        value[count++] = c;
        return this;
    }

    private void ensureCapacityInternal(int minimumCapacity) {
        // overflow-conscious code
        if (minimumCapacity - value.length > 0)
            // the required capacity exceeds the length of the value array, so expand it
            expandCapacity(minimumCapacity);
    }

    void expandCapacity(int minimumCapacity) {
        // grow to twice the current length of the value array, plus 2
        int newCapacity = value.length * 2 + 2;
        if (newCapacity - minimumCapacity < 0)
            newCapacity = minimumCapacity;
        if (newCapacity < 0) {
            if (minimumCapacity < 0) // overflow
                throw new OutOfMemoryError();
            newCapacity = Integer.MAX_VALUE;
        }
        value = Arrays.copyOf(value, newCapacity);
    }
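To see the growth rule in isolation, here is a deliberately simplified sketch. TinyBuilder is a hypothetical class, not JDK code: it starts with a capacity of 4 instead of 16 and omits the overflow handling, but it applies the same "double plus 2" rule shown above.

```java
import java.util.Arrays;

class TinyBuilder {
    private char[] value = new char[4]; // small initial capacity to trigger growth quickly
    private int count;

    TinyBuilder append(char c) {
        if (count + 1 - value.length > 0) {
            // same rule as expandCapacity: twice the current length plus 2
            value = Arrays.copyOf(value, value.length * 2 + 2);
        }
        value[count++] = c;
        return this;
    }

    int capacity() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        TinyBuilder tb = new TinyBuilder();
        for (char c : "abcde".toCharArray()) {
            tb.append(c);
        }
        System.out.println(tb + " " + tb.capacity()); // abcde 10
    }
}
```

Every expansion copies the whole array, which is why presizing a builder is cheaper than letting it grow repeatedly.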

The insert method

insert also has a number of overloads; the char version is used as an example below.

    public AbstractStringBuilder insert(int offset, char c) {
        ensureCapacityInternal(count + 1);
        System.arraycopy(value, offset, value, offset + 1, count - offset);
        value[offset] = c;
        count += 1;
        return this;
    }
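A brief usage sketch of insert (InsertDemo is a hypothetical name): the characters from the offset onward are shifted one position to the right, and the new character is written into the gap.

```java
class InsertDemo {
    static String fix() {
        StringBuilder sb = new StringBuilder("helo");
        sb.insert(3, 'l');   // shifts "o" right and places 'l' at index 3
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(fix()); // hello
    }
}
```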

StringBuffer

StringBuffer inheritance

Similar to StringBuilder, but its public methods are synchronized, which makes it thread-safe.
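The effect of the synchronization can be illustrated with two threads appending to one shared buffer. BufferThreadsDemo is a hypothetical class; because each append call on a StringBuffer is synchronized, no append is lost and the final length is exact.

```java
class BufferThreadsDemo {
    static int appendConcurrently() {
        StringBuffer buffer = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                buffer.append('x');   // each call holds the buffer's lock
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently()); // 2000
    }
}
```

With a StringBuilder in place of the StringBuffer, the same code would have no such guarantee.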

equals and ==

equals(): returns true if the two strings contain the same sequence of characters. ==: returns true only if the two references point to the same object, so two strings with the same value can still compare as unequal with ==.

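Creation statement                  Objects created   Reference points to
String a = "wustor"                 1                 Constant pool
String b = new String("wustor")     1                 Heap memory
String c = new String()             1                 Heap memory
String d = "wust" + "or"            1                 Constant pool
String e = a + b                    3                 Heap memory

(The compiler folds the constant expression "wust" + "or" into the single constant "wustor" at compile time, so d refers to the same pooled string as a.)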
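The reference comparisons implied by these creation statements can be checked directly. EqualityDemo is a hypothetical class; it reuses the variable names a, b, d, and e from the statements above.

```java
class EqualityDemo {
    static boolean[] compare() {
        String a = "wustor";
        String b = new String("wustor");
        String d = "wust" + "or";   // constant expression, folded at compile time
        String e = a + b;           // built at run time
        return new boolean[] {
            a == b,                     // false: b is a separate heap object
            a.equals(b),                // true: same characters
            a == d,                     // true: both refer to the pooled "wustor"
            e == "wustorwustor",        // false: e is a new heap object
            e.equals("wustorwustor")    // true: same characters
        };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(compare()));
    }
}
```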

Other common methods

  • valueOf(): converts a value to a string
  • trim(): removes the spaces at the beginning and end of the string
  • substring(): extracts part of the string
  • indexOf(): finds the first occurrence of a character or substring
  • toCharArray(): converts to a character array
  • getBytes(): gets an array of bytes
  • charAt(): gets the character at a given index
  • length(): returns the length of the string
  • toLowerCase(): converts to lower case
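A quick tour of these methods in one place. CommonMethodsDemo and its summary helper are hypothetical; the results are joined with commas only to keep the output on one line.

```java
import java.nio.charset.StandardCharsets;

class CommonMethodsDemo {
    static String summary() {
        String s = "wustor";
        return String.join(",",
                String.valueOf(42),                                        // "42"
                "  hi  ".trim(),                                           // "hi"
                s.substring(1, 4),                                         // "ust"
                String.valueOf(s.indexOf('s')),                            // "2"
                String.valueOf(s.toCharArray().length),                    // "6"
                String.valueOf(s.getBytes(StandardCharsets.UTF_8).length), // "6"
                String.valueOf(s.charAt(0)),                               // "w"
                String.valueOf(s.length()),                                // "6"
                "WUSTOR".toLowerCase());                                   // "wustor"
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```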

Conclusion

  • String is a final class, and the value of a String cannot be changed once it is created.
  • None of String's methods change the string itself; methods that appear to modify it return a new object.
  • If you need a modifiable string, you should use StringBuilder or StringBuffer.
  • If you only need a string value, use a double-quoted literal; if you need a new object on the heap, use a constructor.
  • Specify the capacity whenever possible when using StringBuilder. This reduces the number of expansions and helps improve efficiency.
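As a final illustration of the first two points, the sketch below calls methods that look like they modify a string and shows that the original is untouched. ImmutabilityDemo is a hypothetical class name.

```java
class ImmutabilityDemo {
    static String original() {
        String s = "wustor";
        s.toUpperCase();   // result discarded; s is unchanged
        s.concat("!");     // result discarded; s is unchanged
        return s;
    }

    static String modified() {
        String s = "wustor";
        return s.toUpperCase().concat("!");   // each call returns a new String
    }

    public static void main(String[] args) {
        System.out.println(original()); // wustor
        System.out.println(modified()); // WUSTOR!
    }
}
```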