1 Basic String features

  • A String literal is written as a pair of double quotes: ""
  • The String class is declared final and cannot be subclassed
  • It implements Serializable, so strings can be serialized, and Comparable, so strings can be compared for ordering
  • In JDK 8 and earlier, String stores its data internally in a final char[] value
  • Since JDK 9, the data is stored in a byte[] instead
    • A char is 16 bits. Strings are a major part of heap usage, and most of them contain only Latin-1 characters, for which one byte per character is enough, so half of the space in a char[] is wasted
    • Strings that contain other characters, such as Chinese, are stored as UTF-16, using two bytes per char
    • StringBuffer and StringBuilder were changed accordingly
  • String represents an immutable sequence of characters
    • When a string variable is reassigned, a new memory area is allocated for the new value. The original value is not modified in place
    • When an existing string is concatenated, a new memory area is allocated for the result. The original value is not modified in place
    • When String's replace() method is called to change a character or substring, a new memory area is allocated for the result. The original value is not modified in place
  • When a string is assigned with a literal (as opposed to new), its value is placed in the string constant pool
  • The string constant pool never stores two strings with the same content
    • The string pool is a fixed-size Hashtable (the StringTable). If too many strings are put into the pool, hash collisions become frequent and the bucket chains grow long. The direct effect is that calls to String.intern() become much slower
    • -XX:StringTableSize sets the size of the StringTable
    • In JDK 6 the default StringTable size is 1009. In JDK 7 and JDK 8 the default is 60013
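
The immutability and pooling rules above can be checked with a few reference comparisons. This is a minimal sketch; the class and method names are made up for illustration and do not come from the original text.

```java
final class StringImmutabilityDemo {

    /** Two literals with the same content refer to one pooled object. */
    static boolean literalsShareOneObject() {
        String s1 = "abc";
        String s2 = "abc";
        return s1 == s2;
    }

    /** concat() returns a different object and leaves the receiver unchanged. */
    static boolean concatLeavesOriginalUntouched() {
        String original = "abc";
        String extended = original.concat("def");
        return original.equals("abc")
                && extended.equals("abcdef")
                && original != extended;
    }

    /** replace() also returns a new string instead of editing the old one. */
    static boolean replaceLeavesOriginalUntouched() {
        String original = "abc";
        String replaced = original.replace('a', 'm');
        return original.equals("abc") && replaced.equals("mbc");
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());
        System.out.println(concatLeavesOriginalUntouched());
        System.out.println(replaceLeavesOriginalUntouched());
    }
}
```

All three methods compare references with == on purpose: the point is which object a variable refers to, not whether the contents are equal.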

2 String memory allocation

  • Java has eight primitive data types and one special type, String. To make them faster and more memory-efficient at run time, each is given a constant pool
  • The String constant pool is special. There are two main ways to use it
    • A String declared with a double-quoted literal is stored directly in the constant pool
    • A String that was not declared with a literal can be put into the pool with String's intern() method
  • In JDK 6 and earlier, the string constant pool was stored in the permanent generation
  • In JDK 7, the string constant pool was moved to the Java heap, so only the heap size needs adjusting when tuning
  • In JDK 8, the permanent generation was replaced by Metaspace, and the string constant pool remains in the heap
  • Why was it moved?
    • The permanent generation is small by default, so a large number of strings can easily cause an OutOfMemoryError
    • The permanent generation is garbage collected infrequently
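
The two ways of getting a string into the pool (a literal, or intern()) can be compared directly. A small sketch, with an illustrative class name that is not part of the original material:

```java
final class PoolEntryDemo {

    /**
     * A literal is pooled when the class is loaded; a string created with
     * new lives elsewhere in the heap until intern() is used to look up
     * the pooled instance.
     */
    static boolean internReturnsPooledLiteral() {
        String literal = "pooled";            // way 1: double-quoted literal
        String onHeap = new String("pooled"); // a separate heap object
        String interned = onHeap.intern();    // way 2: intern()
        return onHeap != literal && interned == literal;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooledLiteral());
    }
}
```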

3 String concatenation operations

  • Concatenating constants with constants produces a result in the constant pool. This is a compile-time optimization
  • The constant pool never contains two constants with the same content
  • If either operand is a variable, the result is a new object on the heap. Variable concatenation is implemented with StringBuilder (since JDK 9, javac compiles + to an invokedynamic call instead, but the result is still a new heap object)
  • If intern() is called on the result of a concatenation, a string that is not yet in the constant pool is added to it, and the address of the pooled object is returned
```java
// These methods belong inside a test class; @Test is assumed to be JUnit's org.junit.Test.
@Test
public void test1() {
    String s1 = "a" + "b" + "c"; // compile-time optimization: equivalent to "abc"
    String s2 = "abc"; // "abc" is placed in the string constant pool and its address is assigned to s2
    /*
     * In the compiled .class file this becomes:
     * String s1 = "abc";
     * String s2 = "abc";
     */
    System.out.println(s1 == s2);      // true
    System.out.println(s1.equals(s2)); // true
}

@Test
public void test2() {
    String s1 = "javaEE";
    String s2 = "hadoop";

    String s3 = "javaEEhadoop";
    String s4 = "javaEE" + "hadoop"; // compile-time optimization
    // If a variable appears on either side of +, the result is a new
    // object on the heap whose content is "javaEEhadoop".
    String s5 = s1 + "hadoop";
    String s6 = "javaEE" + s2;
    String s7 = s1 + s2;

    System.out.println(s3 == s4); // true
    System.out.println(s3 == s5); // false
    System.out.println(s3 == s6); // false
    System.out.println(s3 == s7); // false
    System.out.println(s5 == s6); // false
    System.out.println(s5 == s7); // false
    System.out.println(s6 == s7); // false

    // intern(): checks whether "javaEEhadoop" exists in the string constant pool.
    // If it does, the address of the pooled "javaEEhadoop" is returned.
    // If it does not, "javaEEhadoop" is added to the pool and the address of that object is returned.
    String s8 = s6.intern();
    System.out.println(s3 == s8); // true
}
```

String splicing

```java
@Test
public void test3() {
    String s1 = "a";
    String s2 = "b";
    String s3 = "ab";
    /*
     * How s1 + s2 is executed (the variable s is only named here for illustration):
     * (1) StringBuilder s = new StringBuilder();
     * (2) s.append("a")
     * (3) s.append("b")
     * (4) s.toString() -> roughly equivalent to new String("ab")
     *
     * Note: StringBuilder is used from JDK 5.0 onward; before JDK 5.0 it was StringBuffer.
     */
    String s4 = s1 + s2;
    System.out.println(s3 == s4); // false
}

/*
 * 1. String concatenation does not always use StringBuilder!
 *    If both operands of + are string constants or constant references,
 *    the compile-time optimization still applies and no StringBuilder is used.
 * 2. final can modify classes, methods, primitive variables and reference
 *    variables; use it wherever it applies.
 */
@Test
public void test4() {
    final String s1 = "a";
    final String s2 = "b";
    String s3 = "ab";
    String s4 = s1 + s2;
    System.out.println(s3 == s4); // true
}

@Test
public void test5() {
    String s1 = "javaEEhadoop";
    String s2 = "javaEE";
    String s3 = s2 + "hadoop";
    System.out.println(s1 == s3); // false

    final String s4 = "javaEE"; // s4 is a constant
    String s5 = s4 + "hadoop";
    System.out.println(s1 == s5); // true
}
```

  • Calling append() on a StringBuilder is much more efficient than concatenating strings with + in a loop
```java
/*
 * Comparing execution efficiency: adding strings with StringBuilder.append()
 * is far more efficient than concatenating Strings with +.
 * Details:
 * 1. append(): only one StringBuilder object is created from start to finish.
 *    String concatenation: many StringBuilder and String objects are created.
 * 2. String concatenation: because so many StringBuilder and String objects
 *    are created, more memory is used, and any GC that runs costs extra time.
 *
 * Room for improvement: in practice, if the total length of the strings to be
 * appended is known not to exceed some limit highLevel, use the constructor
 * that takes a capacity:
 *     StringBuilder s = new StringBuilder(highLevel); // new char[highLevel]
 */
@Test
public void test6() {
    long start = System.currentTimeMillis();

    // method1(100000); // 4014 ms
    method2(100000);    // 7 ms

    long end = System.currentTimeMillis();
    System.out.println("Elapsed time: " + (end - start));
}

public void method1(int highLevel) {
    String src = "";
    for (int i = 0; i < highLevel; i++) {
        src = src + "a"; // each iteration creates a StringBuilder and a String
    }
    // System.out.println(src);
}

public void method2(int highLevel) {
    // Only one StringBuilder is created
    StringBuilder src = new StringBuilder();
    for (int i = 0; i < highLevel; i++) {
        src.append("a");
    }
    // System.out.println(src);
}
```
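
The suggestion to pre-size the builder can be illustrated with StringBuilder.capacity(). This is only a sketch under the assumption that the final length is known in advance; the class name and the fill() helper are hypothetical.

```java
final class BuilderCapacityDemo {

    /**
     * Appends count copies of "a". A negative initialCapacity means
     * "use the default constructor".
     */
    static StringBuilder fill(int count, int initialCapacity) {
        StringBuilder sb = initialCapacity < 0
                ? new StringBuilder()
                : new StringBuilder(initialCapacity);
        for (int i = 0; i < count; i++) {
            sb.append("a");
        }
        return sb;
    }

    public static void main(String[] args) {
        // Pre-sized: the backing array never has to grow.
        System.out.println(fill(100_000, 100_000).capacity());
        // Default-sized: the backing array is reallocated and copied
        // several times on the way up.
        System.out.println(fill(100_000, -1).capacity());
    }
}
```

When the capacity passed to the constructor is at least the final length, the builder finishes with exactly that capacity, which means no intermediate arrays were allocated and copied along the way.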

4 Use of intern()

  • intern() uses equals() to check whether the constant pool already contains a string with the same content. If it does, that pooled string is returned. If it does not, the string is added to the pool
  • This ensures that only one copy of the string exists in memory, which saves memory and speeds up string operations. The value is stored in the string intern pool

4.1 How many objects do new String("ab") and new String("a") + new String("b") create?

```java
public class StringNewTest {
    public static void main(String[] args) {
        // String str = new String("ab");
        String str = new String("a") + new String("b");
    }
}
```
  • How many objects does new String("ab") create? The bytecode shows that there are two
    • One is the object created in the heap by the new keyword
    • The other is the "ab" object in the string constant pool. Bytecode instruction: ldc

  • How about new String("a") + new String("b")?
    • Object 1: new StringBuilder()
    • Object 2: new String("a")
    • Object 3: "a" in the constant pool
    • Object 4: new String("b")
    • Object 5: "b" in the constant pool
    • Object 6: the new String("ab") created by StringBuilder.toString(). Note that toString() does not put "ab" into the string constant pool

4.2 Some Questions

```java
/**
 * How can we make sure that the variable s refers to data in the string constant pool?
 * There are two ways:
 * Way 1: String s = "shkstart"; // literal definition
 * Way 2: call intern()
 *        String s = new String("shkstart").intern();
 *        String s = new StringBuilder("shkstart").toString().intern();
 */
public class StringIntern {
    public static void main(String[] args) {
        String s = new String("1");
        String s1 = s.intern(); // "1" already exists in the string constant pool before this call
        String s2 = "1";
        // s  refers to the object created by new in the heap
        // s1 refers to "1" in the string constant pool
        // s2 refers to "1" in the string constant pool, so s1 == s2
        System.out.println(s == s2);  // jdk6: false  jdk7/8: false
        System.out.println(s1 == s2); // jdk6: true   jdk7/8: true
        // The identity hash values below are from one sample run and vary between runs.
        System.out.println(System.identityHashCode(s));  // 491044090
        System.out.println(System.identityHashCode(s1)); // 644117698
        System.out.println(System.identityHashCode(s2)); // 644117698

        // s3 records the address of an object equivalent to new String("11")
        String s3 = new String("1") + new String("1");
        // Is there an "11" in the string constant pool after the previous line? Answer: no!
        // The next call puts "11" into the string constant pool.
        //   jdk6: a new object "11" is created in the pool, with a new address.
        //   jdk7: no new "11" object is created in the pool; the pool records a
        //         reference to the new String("11") object in the heap.
        s3.intern();
        // s4 records the address of the "11" that the previous line placed in the pool.
        String s4 = "11";
        System.out.println(s3 == s4); // jdk6: false  jdk7/8: true
    }
}
```

4.3 Extension

```java
public class StringIntern1 {
    public static void main(String[] args) {
        // Extension of the exercise in StringIntern
        String s3 = new String("1") + new String("1"); // new String("11")
        // Is there an "11" in the string constant pool after the previous line? Answer: no!
        String s4 = "11"; // creates the object "11" in the string constant pool
        String s5 = s3.intern();
        System.out.println(s3 == s4); // false
        System.out.println(s5 == s4); // true
    }
}
```

4.4 Summary of String.intern()

  • In JDK 1.6, intern() tries to put the string object into the string pool
    • If the pool already contains an equal string, nothing is added, and the address of the existing pooled object is returned
    • If not, a copy of the object is made and put into the pool, and the address of the copy in the pool is returned
  • From JDK 1.7 onward, intern() tries to put the string object into the string pool
    • If the pool already contains an equal string, nothing is added, and the address of the existing pooled object is returned
    • If not, a copy of the reference to the object is put into the pool, and that reference is returned
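
The JDK 7+ branch of this summary can be observed directly. The sketch below assumes a HotSpot JVM on JDK 7 or later (on JDK 6 the first check would behave differently), and the class and method names are illustrative only. A timestamp is mixed into the string purely so that its content is not already in the pool.

```java
final class InternSemanticsDemo {

    /** Builds a string at run time whose content is very unlikely to be pooled already. */
    static String freshString() {
        return new StringBuilder("intern-demo-").append(System.nanoTime()).toString();
    }

    /** Not yet pooled: on JDK 7+ the pool stores a reference to this very object. */
    static boolean internReturnsSameObjectWhenAbsent() {
        String s = freshString();
        return s.intern() == s;
    }

    /** Already pooled: intern() returns the pooled object, not the receiver. */
    static boolean internReturnsPooledObjectWhenPresent() {
        String pooled = "already-pooled";           // literal, so it is in the pool
        String copy = new String("already-pooled"); // separate heap object
        return copy.intern() == pooled && copy != pooled;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsSameObjectWhenAbsent());
        System.out.println(internReturnsPooledObjectWhenPresent());
    }
}
```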

4.5 Practice

```java
public class StringExer1 {
    public static void main(String[] args) {
        // String x = "ab";
        String s = new String("a") + new String("b"); // new String("ab")
        // After the previous line, there is no "ab" in the string constant pool.

        // jdk6: creates the string "ab" in the pool
        // jdk8: does not create a string "ab" in the pool; instead the pool stores a
        //       reference to the new String("ab") object, and that reference is returned
        String s2 = s.intern();

        System.out.println(s2 == "ab"); // jdk6: true   jdk8: true
        System.out.println(s == "ab");  // jdk6: false  jdk8: true
    }
}
```

[Figure: the JDK 1.6 case]

[Figure: the JDK 7/8 case]

  • Large websites need to keep huge numbers of strings in memory. On a social networking site, for example, many users store the same address information, such as Beijing or Haidian District. If all of these strings are passed through intern(), memory usage drops significantly
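
A rough way to see the saving is to count how many distinct String objects remain with and without intern(). This is a sketch with made-up names; new String(...) stands in for values that would really arrive from a request or a database.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class InternMemoryDemo {

    /** Counts distinct String objects by identity, not by content. */
    static int distinctObjects(String[] values) {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        Collections.addAll(identities, values);
        return identities.size();
    }

    /** Simulates loading count address strings, optionally interning each one. */
    static String[] loadAddresses(int count, boolean intern) {
        String[] cities = {"Beijing", "Haidian"};
        String[] result = new String[count];
        for (int i = 0; i < count; i++) {
            // new String(...) forces a separate object, like freshly parsed input.
            String fresh = new String(cities[i % cities.length]);
            result[i] = intern ? fresh.intern() : fresh;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(loadAddresses(1000, false)));
        System.out.println(distinctObjects(loadAddresses(1000, true)));
    }
}
```

Without interning, every entry is its own object. With interning, only one object per distinct value stays reachable and the temporary copies can be garbage collected.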

5 Garbage Collection

```java
/**
 * String garbage collection:
 * -Xms15m -Xmx15m -XX:+PrintStringTableStatistics -XX:+PrintGCDetails
 */
public class StringGCTest {
    public static void main(String[] args) {
        // for (int j = 0; j < 100; j++) {
        //     String.valueOf(j).intern();
        // }
        for (int j = 0; j < 100000; j++) {
            String.valueOf(j).intern();
        }
    }
}
```

  • Background: measurements across many Java applications gave the following results
    • Strings make up 25% of the live data set in the heap
    • Duplicate strings make up 13.5% of the live data set in the heap
    • The average String length is 45
  • The bottleneck for many large-scale Java applications is memory. Almost 25% of the live data set in the Java heap consists of String objects, and almost half of those are duplicates, meaning that equals() returns true for them. Duplicate String objects on the heap are a waste of memory. The G1 garbage collector implements automatic, continuous deduplication of strings to avoid this waste

5.1 Implementation

  • When the garbage collector runs, it visits the live objects on the heap. Each visited object is checked to see whether it is a String that is a candidate for deduplication
  • If so, a reference to the object is inserted into a queue for later processing. A deduplication thread runs in the background and processes the queue. Processing an element means removing it from the queue and then trying to deduplicate the String it references
  • A Hashtable records all the unique char arrays used by String objects

When deduplicating, the Hashtable is checked to see whether an identical char array already exists on the heap.

  • If it does, the String is adjusted to refer to that array. The reference to the original array is released, and the original array is eventually collected by the garbage collector
  • If the lookup fails, the char array is inserted into the Hashtable so that it can be shared later
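
The queue-and-table procedure described above can be modeled in a few lines. This is a toy model and not HotSpot's implementation: Str is a hypothetical stand-in for java.lang.String (whose backing array is not accessible from user code), and a HashMap keyed by content stands in for the collector's internal table.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class DedupSketch {

    /** Hypothetical stand-in for String: a mutable reference to a backing array. */
    static final class Str {
        char[] value;
        Str(String s) { this.value = s.toCharArray(); }
    }

    // content -> the canonical backing array for that content
    private final Map<String, char[]> table = new HashMap<>();
    // candidates found while visiting live objects
    private final Deque<Str> queue = new ArrayDeque<>();

    /** Called for each live candidate the collector encounters. */
    void enqueue(Str candidate) {
        queue.add(candidate);
    }

    /** Drains the queue; returns how many strings were redirected to a shared array. */
    int processQueue() {
        int deduplicated = 0;
        Str s;
        while ((s = queue.poll()) != null) {
            char[] existing = table.putIfAbsent(new String(s.value), s.value);
            if (existing != null && existing != s.value) {
                s.value = existing; // share the array; the old one becomes garbage
                deduplicated++;
            }
        }
        return deduplicated;
    }

    public static void main(String[] args) {
        DedupSketch dedup = new DedupSketch();
        Str a = new Str("Beijing");
        Str b = new Str("Beijing");
        dedup.enqueue(a);
        dedup.enqueue(b);
        System.out.println(dedup.processQueue());
        System.out.println(a.value == b.value);
    }
}
```

Note that only the backing arrays are shared. The Str objects themselves stay distinct, which matches the description: the String is adjusted to refer to the existing array.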

5.2 CLI Options

  • UseStringDeduplication (bool): enables String deduplication. It is disabled by default and must be enabled manually
  • PrintStringDeduplicationStatistics (bool): prints detailed deduplication statistics
  • StringDeduplicationAgeThreshold (uintx): String objects that reach this age are considered candidates for deduplication
