Preface

How much memory a Java String occupies is one of the most frequently asked questions in recent job interviews. Many candidates get it wrong: some say a String takes up no space, some say 1 byte, some 2 bytes, some 3 bytes, and some simply do not know. Most ironic of all, a few even answer 2 to the power of 31 bytes. If that were true, a server would not have enough memory for a single string! As programmers, we should not make this kind of mistake. Today, let's work out how much memory strings use in Java.

The structure of Java objects

First, let’s look at the structure of Java objects in a virtual machine. Here, let’s take the HotSpot virtual machine as an example.

Note: image source http://r6d.cn/wp7q

As can be seen from the above diagram, the structure of the object in memory mainly consists of the following parts:

  • Mark Word (mark field): The Mark Word is 4 bytes on a 32-bit JVM (8 bytes on a 64-bit JVM) and holds a series of flag bits, such as the lightweight lock bits, the biased lock bit, and so on.
  • Klass Pointer: The Klass Pointer is 4 bytes (on a 32-bit JVM, or on a 64-bit JVM with compressed class pointers) and points to the memory address of the object's class metadata.
  • Object actual data: This includes all instance fields of the object, so its size is the sum of the field sizes. For example, byte and boolean are 1 byte, short and char are 2 bytes, int and float are 4 bytes, long and double are 8 bytes, and a reference is 4 bytes (on a 32-bit JVM, or on a 64-bit JVM with compressed references).
  • Align: The last part is padding, which rounds the object size up to a multiple of 8 bytes.

To put it another way:

  • Object header: 8 bytes on a 32-bit JVM, 12 bytes on a 64-bit JVM with compressed pointers (holds the object's class information, identity hash code, and state in the virtual machine)
  • Java primitive data: data of type int, float, char, etc
  • Reference: 4 bytes
  • Padding
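The layout above boils down to simple arithmetic: add the header to the field sizes, then round up to a multiple of 8. The following is a minimal sketch of that calculation, not a measurement tool. The class and method names are hypothetical, and the header size is passed in as a parameter because it depends on the JVM configuration.

```java
// Hypothetical helper that mirrors the layout rules described above.
final class ObjectLayoutSketch {

    // Round a size up to the next multiple of 8 bytes (the alignment described above).
    static long alignTo8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Header plus instance fields, padded to an 8-byte boundary.
    static long instanceSize(long headerBytes, long fieldBytes) {
        return alignTo8(headerBytes + fieldBytes);
    }

    public static void main(String[] args) {
        // 8-byte header + one int (4 bytes) = 12, padded up to 16.
        System.out.println(instanceSize(8, 4));   // prints 16
        // 8-byte header + one long (8 bytes) = 16, no padding needed.
        System.out.println(instanceSize(8, 8));   // prints 16
        // 12-byte header + one long (8 bytes) = 20, padded up to 24.
        System.out.println(instanceSize(12, 8));  // prints 24
    }
}
```

The padding step is easy to forget: an object with a single int field costs the same as one with a single long field when the header is 8 bytes.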

The String type in Java

The space taken by an empty String

Here, let’s take Java 8 as an example. First, let’s look at the member variables in the String class.

/** The value is used for character storage. */
private final char value[];
 
/** Cache the hash code for the string */
private int hash; // Default to 0
 
/** use serialVersionUID from JDK 1.0.2 for interoperability */
private static final long serialVersionUID = -6849794470754667710L;

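In Java, arrays are also objects, so an array has an object header too, followed by a 4-byte length field. The reference to the array is stored in the object that points to it, not in the array itself. An empty char array therefore occupies the header plus the length field, rounded up to a multiple of 8 bytes, which comes to 16 bytes on common HotSpot configurations.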

So, we can figure out the memory footprint of an empty String, as shown below.

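String object: header (12 bytes) + reference to the char array (4 bytes) + int hash (4 bytes) + padding (4 bytes) = 24 bytes
Empty char array: 16 bytes
Total: 24 + 16 = 40 bytes

These figures assume a 64-bit HotSpot JVM with compressed pointers, where the object header is 12 bytes. Note that serialVersionUID is a static field: it is stored once for the class, not in every String instance, so it does not count toward the size of a String.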

So, did you get it right?

The space occupied by a non-empty String

If the String length is greater than 0, we can also calculate the memory footprint of the String, as shown below.

40 + 2 * n

Where n is the length of the string.

Now, some of you might ask: why 40 + 2n? The 40 is the memory footprint of an empty String. As mentioned above, a String stores its data in the char[] member variable, and each char in that array takes up exactly 2 bytes. The character data alone therefore takes 2 * n bytes, and adding the 40 bytes of the empty String gives 40 + 2 * n bytes, where n is the length of the String.

Therefore, when using strings heavily in your code, you should consider the actual memory footprint.

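Note: 40 + 2 * n is a general formula for estimating the memory footprint of a String in Java 8 on a typical 64-bit HotSpot JVM. Because objects are padded to a multiple of 8 bytes, the actual footprint is this value rounded up to the next multiple of 8.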
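The formula is easy to turn into a small helper. The sketch below is only an estimate under the article's assumptions (Java 8, 40 bytes for an empty String, 2 bytes per char). The class and method names are hypothetical, and the rounding to a multiple of 8 is an assumption based on the alignment rule described earlier.

```java
// Hypothetical estimator for the 40 + 2 * n formula.
final class StringFootprintSketch {

    // Footprint of an empty String, as derived in the article.
    static final long EMPTY_STRING_BYTES = 40;
    // Each char in the backing char[] takes 2 bytes.
    static final long BYTES_PER_CHAR = 2;

    // The plain formula: 40 + 2 * n.
    static long estimateBytes(int length) {
        return EMPTY_STRING_BYTES + BYTES_PER_CHAR * length;
    }

    // The same estimate rounded up to a multiple of 8 (assumed alignment).
    static long estimateAlignedBytes(int length) {
        return (estimateBytes(length) + 7) / 8 * 8;
    }

    public static void main(String[] args) {
        System.out.println(estimateBytes(0));         // prints 40
        System.out.println(estimateBytes(32));        // prints 104
        System.out.println(estimateAlignedBytes(5));  // 50 rounded up, prints 56
    }
}
```

For lengths such as 32, where 40 + 2 * n is already a multiple of 8, the two methods return the same value.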

Verify the conclusions

Next, let’s verify the conclusion above. First, create a UUIDUtils class that generates 32-character UUID strings (a UUID with its hyphens removed), as shown below.

package io.mykit.binghe.string.test;

import java.util.UUID;

/**
 * @Author Binghe
 * @Version 1.0.0
 * @Description UUID
 */
public class UUIDUtils {
    public static String getUUID() {
        String uuid = UUID.randomUUID().toString();
        // remove the hyphens, leaving 32 hexadecimal characters
        return uuid.replace("-", "");
    }
}

Next, create a TestString class. In its main() method, create an array of 4,000,000 elements and fill it with UUID strings, as shown below.

package io.mykit.binghe.string.test;

/**
 * @Author Binghe
 * @Version 1.0.0
 * @Description
 */
public class TestString {
    public static void main(String[] args) {
        String[] strContainer = new String[4000000];
        for (int i = 0; i < 4000000; i++) {
            strContainer[i] = UUIDUtils.getUUID();
            System.out.println(i);
        }
        // prevent the program from exiting
        while (true) {}
    }
}

Here, there are 4,000,000 strings, each of length 32, so the memory used to store the strings is (40 + 32 * 2) * 4000000 = 416000000 bytes, which is approximately 416MB.
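The same arithmetic can be written out in code. This is a minimal sketch of the calculation only. The class and method names are hypothetical, and it uses long values so that larger counts would not overflow an int.

```java
// Hypothetical helper that repeats the estimate for the whole array of strings.
final class UuidArrayEstimate {

    // Total bytes for `count` strings of the given length, using 40 + 2 * n per string.
    static long totalBytes(long count, int length) {
        long perString = 40L + 2L * length;
        return perString * count;
    }

    public static void main(String[] args) {
        long total = totalBytes(4_000_000L, 32);
        System.out.println(total);              // prints 416000000
        System.out.println(total / 1_000_000);  // prints 416 (decimal megabytes)
    }
}
```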

We used the JProfiler memory analysis tool to do the analysis:

As you can see, JProfiler reports 321MB + 96632KB, which is approximately 417MB. This is slightly larger than our calculation because the running program also creates some strings internally, and those take up memory as well.

So, JProfiler's memory analysis gave us the result we expected.

That's all for today. I'm Binghe (Glacier). If you have any questions, leave a comment below. You can also add me on WeChat (sun_shine_lyz) and I will invite you to the group, where we can discuss technology and improve together.