Hardly any Java developer is unfamiliar with String. It shows up in almost every project.

But let me ask you: are you sure you really know what String is? Do you know how memory is allocated for a String? Do you know what it looks like in a bytecode file? Do you know what is involved when you create a String? Let's discuss it.

“Eight hours for life, eight hours for development”

Location: Shunhe Village, Lanshan County, Yongzhou City, Hunan Province

Author: Laugh with heart *

Note: This article discusses strings in JDK 8.

I. Basic features of String

1.1. Basic knowledge

  1. Ways to create a String

    • String str1 = "Hello"; Defined as a literal; the string is stored in the string constant pool.
    • String str2 = new String("hello"); Created with new; the object is stored on the heap.
  2. String is declared final, so it cannot be subclassed.

  3. String implements the Serializable and Comparable interfaces, so strings support serialization and ordering comparisons.

    public final class String implements java.io.Serializable, Comparable<String>, CharSequence
  4. In JDK 8 and earlier, String declares private final char[] value; to store its character data. From JDK 9 onward this became private final byte[] value;.

    I only have these versions on my computer for the time being, and I will verify the others when I have time. Suggestions are welcome.
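To make points 1 and 3 concrete, here is a minimal sketch. The class name StringBasicsDemo and the helper sortedCopy are my own names for illustration, not anything from the JDK.

```java
import java.util.Arrays;

class StringBasicsDemo {
    // Hypothetical helper: returns a sorted copy, relying on String's Comparable ordering.
    static String[] sortedCopy(String... words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy); // uses String.compareTo (lexicographic order)
        return copy;
    }

    public static void main(String[] args) {
        String literal = "Hello";             // literal: taken from the string constant pool
        String created = new String("Hello"); // new: always a separate object on the heap

        System.out.println(literal == created);      // false: two different objects
        System.out.println(literal.equals(created)); // true: same characters

        System.out.println(Arrays.toString(sortedCopy("pear", "apple", "fig"))); // [apple, fig, pear]
    }
}
```

The == operator compares references while equals compares content, which is why the first two lines disagree.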

1.2. Why was this change made?

The official site (JEP 254: Compact Strings) gives the motivation:

The current implementation of the String class stores characters in a char array, using two bytes (sixteen bits) for each character. Data gathered from many different applications indicates that strings are a major component of heap usage and, moreover, that most String objects contain only Latin-1 characters. Such characters require only one byte of storage, hence half of the space in the internal char arrays of such String objects is going unused.

Description:

We propose to change the internal representation of the String class from a UTF-16 char array to a byte array plus an encoding-flag field. The new String class will store characters encoded either as ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes per character), based upon the contents of the string. The encoding flag will indicate which encoding is used.

String-related classes such as AbstractStringBuilder, StringBuilder, and StringBuffer will be updated to use the same representation, as will the HotSpot VM’s intrinsic string operations.

This is purely an implementation change, with no changes to existing public interfaces. There are no plans to add any new public APIs or other interfaces.

The prototyping work done to date confirms the expected reduction in memory footprint, substantial reductions of GC activity, and minor performance regressions in some corner cases.

In summary, for strings that contain only Latin-1 characters, a byte[] needs half the space of a char[], which also reduces GC activity.
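The space saving can be estimated with a small sketch. This is not the JDK's implementation: fitsLatin1, charArrayBytes and compactBytes are hypothetical helpers I made up, and they only count the payload bytes of the backing array (object headers are ignored).

```java
class CompactStringSketch {
    // Hypothetical helper: true if every char fits into one Latin-1 byte (code unit <= 0xFF).
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Payload size with the old layout: always two bytes per char.
    static int charArrayBytes(String s) {
        return s.length() * 2;
    }

    // Payload size with the byte[] layout: one byte per char if the whole string is Latin-1, else two.
    static int compactBytes(String s) {
        return fitsLatin1(s) ? s.length() : s.length() * 2;
    }

    public static void main(String[] args) {
        String[] samples = {"hello", "h\u00e9llo", "\u4f60\u597d"};
        for (String s : samples) {
            System.out.println(s.length() + " chars: " + charArrayBytes(s) + " vs " + compactBytes(s));
        }
        // 5 chars: 10 vs 5
        // 5 chars: 10 vs 5
        // 2 chars: 4 vs 4
    }
}
```

Only strings made entirely of Latin-1 characters get the saving; a single character outside that range puts the whole string back at two bytes per character.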

1.3. String immutability

String represents an immutable sequence of characters. This property is called immutability for short.

1. When a string variable is assigned a new value, the new value gets its own memory area. The original value is not overwritten in place.

public static void main(String[] args) {
    String str1 = "hello";
    String str2 = "hello";
    // == compares addresses. This prints true: as explained earlier, both literals refer to the same object in the string constant pool
    System.out.println(str1 == str2);
}

public static void main(String[] args) {
    String str1 = "hello";
    String str2 = "hello";
    str1 = "abc,hao";
    // == compares addresses. The result changes from true to false: str1 now refers to a different object
    System.out.println(str1 == str2);
}

Look at it in bytecode

2. When something is concatenated onto an existing string, a new memory area is allocated for the result. The original value is not modified.

public static void main(String[] args) {
    String str1 = "hello";
    String str2 = "hello";
    str1 += "abc,hao";
    // == compares addresses. The result changes from true to false: str1 now refers to a new object
    System.out.println(str1 == str2);
}

As the bytecode shows, string concatenation is actually carried out with StringBuilder.append(), and the result is returned by toString(). So str1 ends up pointing to a different object, while the original "hello" is unchanged.

The direction of the diagram is similar to the first one, but I won’t draw it to save space.
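The concatenation described above can be written out by hand. This is a sketch of what javac 8 generates for +=, as far as I understand it. The helper name appendLikeJavac8 is mine, and newer compilers (JDK 9 and later) use a different mechanism, so treat this as an illustration for JDK 8 only.

```java
class ConcatExpansionDemo {
    // Hand-written equivalent of `left += right` under JDK 8 (hypothetical helper name).
    static String appendLikeJavac8(String left, String right) {
        return new StringBuilder().append(left).append(right).toString();
    }

    public static void main(String[] args) {
        String str1 = "hello";
        String str2 = "hello";
        str1 = appendLikeJavac8(str1, "abc,hao");

        System.out.println(str1);         // helloabc,hao
        System.out.println(str2);         // hello (the pooled literal is untouched)
        System.out.println(str1 == str2); // false: toString() returned a new object
    }
}
```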

3. When you call String's replace() method to change a character or substring, a new memory area is allocated for the result as well. The original value is not modified.

public static void main(String[] args) {
    String str1 = "hello";
    str1 = str1.replace("h", "q");
}

It is obvious from the bytecode files that the objects are different. 😃
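If you don't want to read bytecode, the same thing can be checked at run time. A minimal sketch, where ImmutabilityDemo and swapHForQ are names I invented for the example:

```java
class ImmutabilityDemo {
    // Hypothetical helper wrapping the replace call from the example above.
    static String swapHForQ(String s) {
        return s.replace("h", "q");
    }

    public static void main(String[] args) {
        String original = "hello";
        String replaced = swapHForQ(original);
        String upper = original.toUpperCase();

        System.out.println(original);             // hello (never modified)
        System.out.println(replaced);             // qello
        System.out.println(upper);                // HELLO
        System.out.println(original == replaced); // false: replace returned a different object
    }
}
```

Every "modifying" method returns a new String, and the variable only changes if you assign the result back to it.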

4. When a string is assigned as a literal (as opposed to with new), the string value lives in the string constant pool.

Summary: taken together, these points show that String is immutable. ☺ 😁

Note: the string constant pool never stores two strings with the same content. Only one copy of each distinct value is kept, as the code above shows, which reduces memory consumption.

The ldc instruction pushes the constant-pool entry that its operand points to onto the operand stack.

II. String memory allocation

Java has eight primitive data types plus one special type, String. To make them faster and more memory-efficient at run time, each of these types comes with a constant pool concept.

A constant pool is like a cache provided at the Java system level. The constant pools for the eight primitive types are managed by the system, while the string constant pool is special. There are two main ways to use it 😶

  • Strings declared directly in double quotes are stored directly in the constant pool. For example, String info = "I am rather in spring";

  • For a String object that was not declared in double quotes, you can use the intern() method that String provides.

public native String intern();
// When intern() is called, if the pool already contains a string equal to this String object, as determined by equals(Object), the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to it is returned.
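Here is a small sketch of both ways side by side. The helper buildAtRuntime is a made-up name. It just builds a String from chars so that the result is a heap object rather than the pooled literal.

```java
class InternDemo {
    // Hypothetical helper: builds the String at run time, so it is not the pooled literal.
    static String buildAtRuntime(char... chars) {
        return new String(chars);
    }

    public static void main(String[] args) {
        String literal = "spring"; // way 1: double quotes, lives in the pool
        String built = buildAtRuntime('s', 'p', 'r', 'i', 'n', 'g');

        System.out.println(built == literal);          // false: heap object vs pooled object
        System.out.println(built.equals(literal));     // true: same content
        System.out.println(built.intern() == literal); // true: way 2, intern() returns the pooled string
    }
}
```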

III. String concatenation

  • Concatenating constants with constants puts the result in the constant pool. This is a compile-time optimization

  • The constant pool never contains two constants with the same content

  • As soon as one operand is a variable, the result is on the heap. Variable concatenation is implemented with StringBuilder

  • If intern() is called on the result of the concatenation, a string that is not yet in the constant pool is put into the pool, and the address of the pooled object is returned

public static void main(String[] args) {
    // Constant + constant: the result is in the constant pool, thanks to compile-time optimization
    String str1 = "hello"; // Stored in the string constant pool.
    String str2 = "h" + "e" + "l" + "l" + "o"; // Folded into "hello" at compile time, see the class file.
    System.out.println(str1 == str2); // true, because both refer to the same object in the constant pool
    System.out.println(str1.equals(str2)); // true
}

Why is this a compile-time optimization? Let's look at the class file.

When the source code is compiled into a .class file, the compiler has already replaced "h" + "e" + "l" + "l" + "o" with "hello", so str2 actually refers to "hello" in the string constant pool.
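One detail worth trying yourself: the compiler folds not only literals but also final variables initialized with a constant, while an ordinary variable forces a run-time concatenation. A sketch of this (class and method names are mine):

```java
class ConstantFoldingDemo {
    static String viaConstant() {
        final String prefix = "he"; // final + constant initializer: a compile-time constant
        return prefix + "llo";      // folded by the compiler into the literal "hello"
    }

    static String viaVariable() {
        String prefix = "he";       // ordinary variable, not a compile-time constant
        return prefix + "llo";      // concatenated at run time, result is on the heap
    }

    public static void main(String[] args) {
        String pooled = "hello";
        System.out.println(pooled == viaConstant());          // true
        System.out.println(pooled == viaVariable());          // false
        System.out.println(pooled == viaVariable().intern()); // true
    }
}
```

So "constant" here means a compile-time constant expression, not only text that is written in double quotes.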

Let’s look at the following problem:

@Test
public void test() {
    String s1 = "Java";
    String s2 = "Study";

    String s3 = "JavaStudy";
    String s4 = "Java" + "Study";
    String s5 = s1 + "Study";
    String s6 = "Java" + s2;
    String s7 = s1 + s2;

    // Which of the following are true and which are false?
    System.out.println(s3 == s4);
    System.out.println(s3 == s5);
    System.out.println(s3 == s6);
    System.out.println(s3 == s7);

    System.out.println(s5 == s6);
    System.out.println(s5 == s7);

    // If you got everything above right, try this one.
    String s8 = s6.intern();
    System.out.println(s3 == s8);
}

The answer is:

true, false, false, false, false, false, true

Why? Let’s take a look at the class file as usual.

s3 == s4 is easy to understand: they compile to the same constant.

Why is s3 != s5? Once this one is explained, the others follow in much the same way.

s5 = s1 + "Study"; This single line of code actually goes through several steps.

s1 + "Study" is actually built with StringBuilder.append(), and toString() returns the resulting object. Digging deeper, StringBuilder.toString() actually does new String(...).

So they point to different objects.

String s8 = s6.intern();

System.out.println(s3 == s8); // true

The comment in the source code explains this particularly clearly:

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

IV. Using intern()

  • intern() is a native method. It calls into lower-level C code

  • The string pool is initially empty and is maintained privately by the String class. When intern() is called, if the pool already contains a string equal to this String object as determined by equals(Object), the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to it is returned.

  • For a String object that was not declared in double quotes, you can use the intern() method that String provides: it looks up the current string in the string constant pool and, if it is not there, puts it into the pool.

Such as:

@Test
public void test2() {
    String str1 = "i miss you";
    String str2 = new String("i miss you").intern();
    System.out.println(str1 == str2); // true
}

In plain terms, an interned string guarantees that only one copy of that string exists in memory. This saves memory and can speed up string operations. Note that this value is stored in the String Intern Pool 😁
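To see the "only one copy" effect, we can count how many distinct objects we end up holding with and without intern(). This is a rough sketch. distinctObjects is a hypothetical helper, and it uses an identity-based set so that equal strings are not merged by equals().

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class InternDedupDemo {
    // Hypothetical helper: creates `copies` equal strings and counts distinct object identities.
    static int distinctObjects(int copies, boolean intern) {
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (int i = 0; i < copies; i++) {
            String s = new String("i miss you"); // a fresh heap object every time
            identities.add(intern ? s.intern() : s);
        }
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(1000, false)); // 1000
        System.out.println(distinctObjects(1000, true));  // 1
    }
}
```

With intern() the temporary heap objects become garbage right away and only the pooled copy stays referenced.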

V. A few small interview questions

I was curious about these myself at the time (the background is JDK 8, and earlier JDKs may produce different results 😊).

1. How many objects does new String("ab") create?

One or two? Is it really so? Are you sure?

public static void main(String[] args) {
    String ab = new String("ab");
}

The code is very simple, and you can’t tell much from the code, so let’s open the bytecode file and look at it.

Analytical process:

  1. First, new String() is executed. The new keyword creates a String object in the heap. That is the first object.
  2. When the literal "ab" is used, the string constant pool is searched first. It is not found there, so it is created in the string constant pool. That is the second object.
  3. Finally, the address of the String object in the heap is stored in the local variable ab.
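The three steps above line up with the bytecode of that single statement. Below is a sketch with the javap -c output written as comments. The constant-pool indices (#2, #3, #4) are only illustrative and may differ in your class file. The two println lines are my addition to check the result at run time.

```java
class NewStringObjectsDemo {
    public static void main(String[] args) {
        String ab = new String("ab");
        // javap -c for the statement above (indices are illustrative):
        //   0: new           #2   // class java/lang/String   -> step 1: allocate the heap object
        //   3: dup
        //   4: ldc           #3   // String ab                -> step 2: load "ab" from the constant pool
        //   6: invokespecial #4   // Method java/lang/String."<init>":(Ljava/lang/String;)V
        //   9: astore_1                                       -> step 3: store the heap reference in ab

        System.out.println(ab == "ab");          // false: heap object vs pooled literal
        System.out.println(ab.intern() == "ab"); // true: intern() returns the pooled literal
    }
}
```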

Conclusion: So the answer is two objects.

2. How many objects does new String("a") + new String("b") create?

How many do you think? Three? Four or five? More? Or fewer?

public static void main(String[] args) {
    String ab = new String("a") + new String("b");
}

Again, from the bytecode file:

Looking only at the bytecode, you can see just five objects, but as I wrote earlier, StringBuilder.toString() basically calls new String(...).

So we actually created six objects.

  • Object 1: new StringBuilder()
  • Object 2: new String("a")
  • Object 3: "a" in the constant pool
  • Object 4: new String("b")
  • Object 5: "b" in the constant pool
  • Object 6: toString() creates a new String("ab")
    • Calling toString() does not create "ab" in the constant pool
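The last point can be observed indirectly. If the concatenated string is not yet in the pool, then on HotSpot JDK 7 and later intern() puts this very object into the pool and returns it. That is implementation behavior as far as I know, not something the Javadoc promises, so take the sketch below as an experiment under that assumption. The helper name and the fragments "x1" and "y2" are arbitrary, and the joined text must not appear as a literal anywhere in the program.

```java
class ConcatInternDemo {
    // Hypothetical helper: true if intern() returned the freshly concatenated object itself,
    // which means no equal string was in the pool before the call.
    static boolean internReturnsSameObject(String left, String right) {
        String joined = new String(left) + new String(right); // result of toString(), on the heap
        return joined.intern() == joined;
    }

    public static void main(String[] args) {
        // Expected to print true on HotSpot JDK 7+, assuming the joined text was never interned before.
        System.out.println(internReturnsSameObject("x1", "y2"));
    }
}
```

Calling the helper a second time with the same fragments gives false, because by then the pool already holds the string from the first call.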

3. How many objects does new String("a" + "b") create?

Let us know in the comments.

VI. Talking to myself

Another day of slacking off 🧐. Java is far too competitive and studying is really tiring: the hardworking people work especially hard, while those who don't are left shivering 😔.

Lying flat still feels more comfortable 🛌. Come join me.