Hardcore warning!

What is bytecode

1. First, what is machine code

Machine code is code that the CPU can read and execute directly. It is expressed in binary and is also called machine instruction code. Writing it by hand means managing every CPU resource yourself and remembering every instruction and what it does, which is very tedious. It is the lowest level of code on a computer, and apart from specialists almost nobody studies it anymore.

2. The bytecode

Bytecode is an intermediate binary format compiled from source code; it is less readable than the source. The CPU cannot execute bytecode directly; in Java, the JVM has to translate bytecode into machine code before the CPU can run it.

3. Benefits of using bytecode

Compile once, run anywhere. Java uses bytecode as an intermediate language: the source code is compiled once into a .class file, and that file can then run on many different computers, each with its own JVM.

4. State of bytecode in JVM

5. Add something extra

Compiled language

Source code is compiled into machine code once, ahead of time. Execution is efficient, but portability is low and the result depends on the compiler. Typical examples: C, C++, Pascal, Objective-C and, more recently, Apple's Swift and Google's Go

Interpreted language

The source code is not compiled directly into machine code. It is first compiled into an intermediate binary form (bytecode), which a virtual machine then translates into machine code in a second step. This is less efficient than a compiled language, but highly portable; it depends on the virtual machine. Typical examples: JavaScript, Python, Erlang, PHP, Perl, Ruby

Bytecode in Java

1. How to view the bytecode

  1. Open IDEA and create a .java file
package test;

public class ByteCodeTest {
    private int a = 0;
    public int get() {
        return a;
    }
}

Then call this class from the main method of another class and run it, so that the file gets compiled.

  2. Locate the compiled .class file

Under the out folder there will be a .class file with the same name as the Java file we just wrote.

  3. Download Sublime Text and open the .class file

cafe babe 0000 0034 0016 0a00 0400 1209
0003 0013 0700 1407 0015 0100 0161 0100
0149 0100 063c 696e 6974 3e01 0003 2829
5601 0004 436f 6465 0100 0f4c 696e 654e
756d 6265 7254 6162 6c65 0100 124c 6f63
616c 5661 7269 6162 6c65 5461 626c 6501
0004 7468 6973 0100 134c 7465 7374 2f42
7974 6543 6f64 6554 6573 743b 0100 0367
6574 0100 0328 2949 0100 0a53 6f75 7263
6546 696c 6501 0011 4279 7465 436f 6465
5465 7374 2e6a 6176 610c 0007 0008 0c00
0500 0601 0011 7465 7374 2f42 7974 6543
6f64 6554 6573 7401 0010 6a61 7661 2f6c
616e 672f 4f62 6a65 6374 0021 0003 0004
0000 0001 0002 0005 0006 0000 0002 0001
0007 0008 0001 0009 0000 0038 0002 0001
0000 000a 2ab7 0001 2a03 b500 02b1 0000
0002 000a 0000 000a 0002 0000 0003 0004
0004 000b 0000 000c 0001 0000 000a 000c
000d 0000 0001 000e 000f 0001 0009 0000
002f 0001 0001 0000 0005 2ab4 0002 ac00
0000 0200 0a00 0000 0600 0100 0000 0600
0b00 0000 0c00 0100 0000 0500 0c00 0d00
0000 0100 1000 0000 0200 11

2. A puzzle

I hope an expert can clear up a doubt of mine. I searched Baidu and could not find the answer, though maybe my search terms were the problem. In IDEA, View -> Show Bytecode displays the following:

// class version 52.0 (52)
// access flags 0x21
public class test/ByteCodeTest {

  // compiled from: ByteCodeTest.java

  // access flags 0x2
  private I a

  // access flags 0x1
  public <init>()V
   L0
    LINENUMBER 3 L0
    ALOAD 0
    INVOKESPECIAL java/lang/Object.<init> ()V
   L1
    LINENUMBER 4 L1
    ALOAD 0
    ICONST_0
    PUTFIELD test/ByteCodeTest.a : I
    RETURN
   L2
    LOCALVARIABLE this Ltest/ByteCodeTest; L0 L2 0
    MAXSTACK = 2
    MAXLOCALS = 1

  // access flags 0x1
  public get()I
   L0
    LINENUMBER 6 L0
    ALOAD 0
    GETFIELD test/ByteCodeTest.a : I
    IRETURN
   L1
    LOCALVARIABLE this Ltest/ByteCodeTest; L0 L1 0
    MAXSTACK = 1
    MAXLOCALS = 1
}


What is the difference between this view and the hexadecimal file, and how do you convert between the two?

The composition of Java bytecode

1. Basic data types

| Data type | Meaning |
| --- | --- |
| u1 | Unsigned 1-byte integer |
| u2 | Unsigned 2-byte integer |
| u4 | Unsigned 4-byte integer |
| u8 | Unsigned 8-byte integer |

1 byte = 8 bits. One hexadecimal digit represents 4 bits, so two hexadecimal digits are needed to represent 1 byte.
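As a minimal sketch of how these types can be read from raw bytes (the class and method names here are made up for illustration, not part of any library): class files are big-endian, so the most significant byte comes first.

```java
// Sketch: combining bytes into the u1/u2/u4 types (class files are big-endian).
// UnsignedReadSketch and its method names are hypothetical, chosen for this example.
class UnsignedReadSketch {
    // u1: mask off the sign so the byte reads as 0..255
    static int u1(byte[] b, int off) {
        return b[off] & 0xFF;
    }

    // u2: two bytes, most significant first
    static int u2(byte[] b, int off) {
        return (u1(b, off) << 8) | u1(b, off + 1);
    }

    // u4: four bytes, widened to long so the result is never negative
    static long u4(byte[] b, int off) {
        return ((long) u2(b, off) << 16) | u2(b, off + 2);
    }

    public static void main(String[] args) {
        // First 8 bytes of the dump above: cafe babe 0000 0034
        byte[] head = {(byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe, 0, 0, 0, 0x34};
        System.out.println(Long.toHexString(u4(head, 0))); // cafebabe
        System.out.println(u2(head, 4));                   // 0
        System.out.println(u2(head, 6));                   // 52
    }
}
```

Java's byte type is signed, which is why every read masks with 0xFF before shifting.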

2. Format of Java bytecode

| Type | Count | Name | Meaning |
| --- | --- | --- | --- |
| u4 | 1 | magic | Magic number |
| u2 | 1 | minor_version | Minor version number |
| u2 | 1 | major_version | Major version number |
| u2 | 1 | constant_pool_count | Number of constants |
| cp_info | constant_pool_count-1 | constant_pool | Constant pool list |
| u2 | 1 | access_flags | Access flags |
| u2 | 1 | this_class | Current class |
| u2 | 1 | super_class | Parent class |
| u2 | 1 | interfaces_count | Number of implemented interfaces |
| u2 | interfaces_count | interfaces | Interface list |
| u2 | 1 | fields_count | Number of fields |
| field_info | fields_count | fields | Field list |
| u2 | 1 | methods_count | Number of methods |
| method_info | methods_count | methods | Method list |
| u2 | 1 | attributes_count | Number of attributes |
| attribute_info | attributes_count | attributes | Attribute list |

3. Format interpretation

To save space, Java is strict about the format of the bytecode, so we can read the bytecode by following this format table. The non-basic data types are themselves built from the basic data types and store their data in a fixed layout. Note that the constant pool, interfaces, fields, methods, and attributes are all stored as a count followed by that many entries.

Read bytecode

Take the ByteCodeTest.class file we created above.

1. Magic

cafe babe

This number identifies the file type; the value was chosen by James Gosling, the father of Java. Source code has "magic numbers" too, usually called magic values, which generally means unexplained constant values inside a method.

2. Version number

0000 0034

The minor version is 0x0000 = 0 and the major version is 0x0034 = 52, which corresponds to Java 1.8 (Java 8).
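The mapping from major version to Java release can be sketched as follows. This is a small illustration with made-up names (ClassVersionSketch, javaRelease); it relies on the fact that major versions 45 to 48 belong to JDK 1.1 through 1.4 and that from 49 (Java 5) onward the release number is the major version minus 44.

```java
// Sketch: turning a class-file major version into a Java release name.
// ClassVersionSketch and javaRelease are hypothetical names for this example.
class ClassVersionSketch {
    static String javaRelease(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("unknown major version " + major);
        }
        // 45..48 -> "1.1".."1.4" (45 was also used by JDK 1.0.2); 49 and up -> "5", "6", ...
        return major <= 48 ? "1." + (major - 44) : String.valueOf(major - 44);
    }

    public static void main(String[] args) {
        int minor = 0x0000; // from the bytes 0000
        int major = 0x0034; // from the bytes 0034
        System.out.println("class file version " + major + "." + minor
                + " -> Java " + javaRelease(major));
    }
}
```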

3. Constant_pool

Constant pools store data that does not change.

Constant pool base type

Number of constants (constant_pool_count)

0016

0x0016 = 22. Constant pool indices start at 1, so the pool actually holds constant_pool_count - 1 = 21 constants, numbered #1 to #21.

Constant pool list (constant_pool)

To read a constant, look at its first byte (the tag) to determine its type; the type then tells you its length.

#1

0a00 0400 12

0x0a = 10 corresponds to CONSTANT_Methodref_info. After the tag byte, this type holds two u2 values (2 bytes each), which is 8 hexadecimal digits, so 10 hexadecimal digits in total represent this constant.

0x0004 = 4 and 0x0012 = 18, so this constant references #4 and #18.

All constants
0a00 0400 12
09 0003 0013 
0700 14
07 0015 
0100 0161
0100 0149 
0100 063c 696e 6974 3e
01 0003 2829 56
01 0004 436f 6465 
0100 0f4c 696e 654e 756d 6265 7254 6162 6c65 
0100 124c 6f63 616c 5661 7269 6162 6c65 5461 626c 65
01 0004 7468 6973 
0100 134c 7465 7374 2f42 7974 6543 6f64 6554 6573 743b 
0100 0367 6574 
0100 0328 2949 
0100 0a53 6f75 7263 6546 696c 65
01 0011 4279 7465 436f 6465 5465 7374 2e6a 6176 61
0c 0007 0008 
0c00 0500 06
01 0011 7465 7374 2f42 7974 6543 6f64 6554 6573 74
01 0010 6a61 7661 2f6c 616e 672f 4f62 6a65 6374 
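
Each line is one constant, and its first byte is the tag: 0x0a is CONSTANT_Methodref_info, 0x09 is CONSTANT_Fieldref_info, 0x07 is CONSTANT_Class_info, 0x0c is CONSTANT_NameAndType_info, and 0x01 is CONSTANT_Utf8_info, which stores the actual text.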
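The tag-driven walk can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are made up, the hex is copied from the listing above, and only the five tags that occur in this file are handled (the JVM specification defines more).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Sketch: walking the constant pool of ByteCodeTest.class by tag.
// ConstantPoolSketch and its members are hypothetical names for this example.
class ConstantPoolSketch {
    // Constant pool entries #1..#21, copied from the listing above.
    static final String POOL_HEX = String.join("",
        "0a00 0400 12", "09 0003 0013", "0700 14", "07 0015",
        "0100 0161", "0100 0149", "0100 063c 696e 6974 3e", "01 0003 2829 56",
        "01 0004 436f 6465",
        "0100 0f4c 696e 654e 756d 6265 7254 6162 6c65",
        "0100 124c 6f63 616c 5661 7269 6162 6c65 5461 626c 65",
        "01 0004 7468 6973",
        "0100 134c 7465 7374 2f42 7974 6543 6f64 6554 6573 743b",
        "0100 0367 6574", "0100 0328 2949",
        "0100 0a53 6f75 7263 6546 696c 65",
        "01 0011 4279 7465 436f 6465 5465 7374 2e6a 6176 61",
        "0c 0007 0008", "0c00 0500 06",
        "01 0011 7465 7374 2f42 7974 6543 6f64 6554 6573 74",
        "01 0010 6a61 7661 2f6c 616e 672f 4f62 6a65 6374");

    static byte[] hexToBytes(String hex) {
        String digits = hex.replaceAll("\\s+", "");
        byte[] out = new byte[digits.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(digits.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    static int u2(byte[] b, int off) {
        return ((b[off] & 0xFF) << 8) | (b[off + 1] & 0xFF);
    }

    // Walks constant_pool_count - 1 entries; the tag byte decides each entry's size.
    static List<String> describe(byte[] pool, int constantPoolCount) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        for (int index = 1; index < constantPoolCount; index++) {
            int tag = pool[pos] & 0xFF;
            String text;
            int size;
            switch (tag) {
                case 1: { // CONSTANT_Utf8: u2 length + bytes (plain ASCII in this file)
                    int len = u2(pool, pos + 1);
                    text = "Utf8 " + new String(pool, pos + 3, len, StandardCharsets.UTF_8);
                    size = 3 + len;
                    break;
                }
                case 7: // CONSTANT_Class: u2 name_index
                    text = "Class #" + u2(pool, pos + 1);
                    size = 3;
                    break;
                case 9:  // CONSTANT_Fieldref: u2 class_index + u2 name_and_type_index
                case 10: // CONSTANT_Methodref: same layout
                    text = (tag == 9 ? "Fieldref #" : "Methodref #")
                            + u2(pool, pos + 1) + ".#" + u2(pool, pos + 3);
                    size = 5;
                    break;
                case 12: // CONSTANT_NameAndType: u2 name_index + u2 descriptor_index
                    text = "NameAndType #" + u2(pool, pos + 1) + ":#" + u2(pool, pos + 3);
                    size = 5;
                    break;
                default: // other tags exist in the JVM spec but do not occur in this file
                    throw new IllegalArgumentException("tag " + tag + " not handled in this sketch");
            }
            out.add("#" + index + " " + text);
            pos += size;
        }
        return out;
    }

    public static void main(String[] args) {
        describe(hexToBytes(POOL_HEX), 0x0016).forEach(System.out::println);
    }
}
```

Running main prints one line per constant, starting with "#1 Methodref #4.#18" and ending with "#21 Utf8 java/lang/Object".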

4. Access flags (access_flags)

Access flags

Flag types

| Flag name | Value (hexadecimal) | Bits | Description |
| --- | --- | --- | --- |
| PUBLIC | 0x0001 | 0000000000000001 | Declared public |
| PRIVATE | 0x0002 | 0000000000000010 | Field is private |
| PROTECTED | 0x0004 | 0000000000000100 | Field is protected |
| STATIC | 0x0008 | 0000000000001000 | Field is static |
| FINAL | 0x0010 | 0000000000010000 | Declared final |
| SUPER | 0x0020 | 0000000000100000 | Enables the newer invokespecial semantics of the JVM |
| VOLATILE | 0x0040 | 0000000001000000 | Field is volatile |
| TRANSIENT | 0x0080 | 0000000010000000 | Field is transient |
| INTERFACE | 0x0200 | 0000001000000000 | Interface flag |
| ABSTRACT | 0x0400 | 0000010000000000 | Abstract class flag |
| SYNTHETIC | 0x1000 | 0001000000000000 | Not generated from user code |
| ANNOTATION | 0x2000 | 0010000000000000 | Marks an annotation |
| ENUM | 0x4000 | 0100000000000000 | Marks an enumeration |

The access flags are recorded as a 0 or 1 on each bit; as the table shows, they occupy 16 bits.

0021

The value cannot be looked up in the table directly, because several flags may be combined. 0x0021 = 0000000000100001 in binary; according to the table, the PUBLIC and SUPER bits are set, so this class has the PUBLIC and SUPER flags.
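The bit test described above can be sketched with a bitwise AND against each mask. The names below are hypothetical, and only the class-level flags from the table are included.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: decoding class access flags by testing each bit.
// AccessFlagsSketch and decodeClassFlags are hypothetical names for this example.
class AccessFlagsSketch {
    // Class-level flags from the table above, as parallel mask/name arrays.
    static final int[] MASKS = {0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000};
    static final String[] NAMES = {"PUBLIC", "FINAL", "SUPER", "INTERFACE",
                                   "ABSTRACT", "SYNTHETIC", "ANNOTATION", "ENUM"};

    // A flag is present when its bit is set in the 16-bit value.
    static List<String> decodeClassFlags(int flags) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((flags & MASKS[i]) != 0) {
                out.add(NAMES[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(decodeClassFlags(0x0021)); // [PUBLIC, SUPER]
    }
}
```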

5. Current class (this_class)

The current class is given as an index into the constant pool.

0003

0x0003 = 3, so the current class is constant #3:

0700 14

#3 is a class entry that points to 0x0014 = 20, which is constant #20:

01 0011 7465 7374 2f42 7974 6543 6f64 6554 6573 74

ASCII table lookup result: test/ByteCodeTest

6. Super_class

The parent class of the current class is also given as an index into the constant pool.

0004

0x0004 = 4, so the parent class is constant #4:

07 0015

#4 is a class entry that points to 0x0015 = 21, which is constant #21:

01 0010 6a61 7661 2f6c 616e 672f 4f62 6a65 6374

ASCII table lookup result: java/lang/Object

You can see here that this class inherits from Object. Every class inherits from Object, so even though we did not write it explicitly, the inheritance is still there.

7. Interfaces

The interface implemented by the current class

Number of interfaces (interfaces_count)

0000

This class does not implement any interfaces, so the number of interfaces is zero.

List of interfaces

If there were interfaces, the count would be followed by interfaces_count u2 values (4 hexadecimal digits each), each one an index into the constant pool.

8. Fields

Fields are the member variables of the current class, not the local variables inside a method.

Number of fields (fields_count)

0001

This class has one field, so we read the next 16 hexadecimal digits (four u2 values).

List of fields

Field types

| Descriptor character | Meaning |
| --- | --- |
| B | byte |
| J | long (long integer) |
| C | char (character) |
| S | short (short integer) |
| D | double (double-precision floating point) |
| Z | boolean |
| F | float (single-precision floating point) |
| V | void |
| I | int (integer) |
| L | Object reference type |

Field

0002 0005 0006 0000

The first u2: the access flags of the field; look them up in the access flag table above
The second u2: the field name, as an index into the constant pool
The third u2: the field type, as an index into the constant pool; decode it with the field type table above
The fourth u2: the number of attributes of the field

0002 means the field is private. 0005 points to constant pool entry #5:

0100 0161

ASCII table lookup result: a

0006 points to constant pool entry #6:

0100 0149

ASCII table lookup result: I

The last u2 is 0000, so this field has no attributes. The initial value 0 is not stored with the field; it is assigned in the constructor, which is the ICONST_0 and PUTFIELD pair in the <init> listing above. A field carries a ConstantValue attribute pointing into the constant pool only when it is a compile-time constant, such as a static final int.

So the field is: private int a;
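The field type table translates into a simple lookup on the first character of the descriptor. A minimal sketch, with hypothetical names, that covers only the characters listed in the table:

```java
// Sketch: turning a field descriptor into a readable type name.
// DescriptorSketch and typeName are hypothetical names for this example.
class DescriptorSketch {
    static String typeName(String descriptor) {
        switch (descriptor.charAt(0)) {
            case 'B': return "byte";
            case 'J': return "long";
            case 'C': return "char";
            case 'S': return "short";
            case 'D': return "double";
            case 'Z': return "boolean";
            case 'F': return "float";
            case 'V': return "void";
            case 'I': return "int";
            case 'L': // object reference: L<class name>; for example Ltest/ByteCodeTest;
                return descriptor.substring(1, descriptor.length() - 1).replace('/', '.');
            default:
                throw new IllegalArgumentException("descriptor not handled in this sketch: " + descriptor);
        }
    }

    public static void main(String[] args) {
        System.out.println(typeName("I"));                   // int
        System.out.println(typeName("Ltest/ByteCodeTest;")); // test.ByteCodeTest
    }
}
```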

9. Methods

The methods of the current class

Number of methods (methods_count)

0002

There are two methods, but we only defined one, so where does the other come from? Opening the compiled .class file directly in IDEA shows that the other method is the constructor.

Methods (methods)

Next we read four u2 values (4*4 hexadecimal digits).

Description of method

0001 0007 0008 0001

The first u2 is the access flags of the method; look them up in the access flag table above. The second u2 is the name of the method, as an index into the constant pool. The third u2 is the descriptor of the method (its parameter and return types), as an index into the constant pool. The fourth u2 is the number of attributes of the method.

So this is public <init>()V, and it has one attribute.

Method attributes

We read the next 3*4 hexadecimal digits, which introduce the attribute of the method.

0009 0000 0038

The first u2 is the name of the attribute, as an index into the constant pool. The following u4 is the length of the attribute: the number of bytes that follow, all of which belong to this attribute. The first u2 points to #9:

01 0004 436f 6465

ASCII table lookup result: Code

Code is one of the attributes predefined by the JVM specification; it holds the compiled body of the method. For details, search Baidu for "predefined attributes of the JVM specification"; I will not elaborate here.

The u4 is 0x00000038 = 56, so we read 56 bytes (56*2 hexadecimal digits).

0002 0001 0000 000a 2ab7 0001 2a03 
b500 02b1 0000 0002 000a 0000 000a 
0002 0000 0003 0004 0004 000b 0000 
000c 0001 0000 000a 000c 000d 0000

The first u2: the maximum depth of the operand stack (max_stack)
The second u2: the size of the local variable table (max_locals)
Then a u4: the length of the instructions in bytes (code_length)
Then code_length bytes: the instructions themselves; look them up in the JVM bytecode instruction table
Then a u2: the number of exception table entries
Then a u2: the number of attributes of this Code attribute, followed by the attributes themselves

These nested attributes are read the same way as the attribute above, but note that they are generally predefined JVM attributes, so each one has to be interpreted according to its own structure.

I am not going to read them one by one here.
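To show what the instruction bytes look like once decoded, here is a minimal sketch that handles only the seven opcodes appearing in this class (aload_0, iconst_0, getfield, putfield, invokespecial, ireturn, return). The names are hypothetical, and a real disassembler needs the full instruction table.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: decoding the instruction bytes of the two methods in ByteCodeTest.class.
// CodeBytesSketch and disassemble are hypothetical names for this example.
class CodeBytesSketch {
    static List<String> disassemble(int[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int op = code[pc];
            switch (op) {
                case 0x2a: out.add("aload_0");  pc += 1; break; // push "this"
                case 0x03: out.add("iconst_0"); pc += 1; break; // push int 0
                case 0xb1: out.add("return");   pc += 1; break; // return void
                case 0xac: out.add("ireturn");  pc += 1; break; // return int
                case 0xb4:   // getfield
                case 0xb5:   // putfield
                case 0xb7: { // invokespecial; all three take a u2 constant pool index
                    String name = op == 0xb4 ? "getfield" : op == 0xb5 ? "putfield" : "invokespecial";
                    int index = (code[pc + 1] << 8) | code[pc + 2];
                    out.add(name + " #" + index);
                    pc += 3;
                    break;
                }
                default:
                    throw new IllegalArgumentException(
                            String.format("opcode 0x%02x not handled in this sketch", op));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The 10 code bytes of the constructor: 2a b7 0001 2a 03 b5 0002 b1
        int[] init = {0x2a, 0xb7, 0x00, 0x01, 0x2a, 0x03, 0xb5, 0x00, 0x02, 0xb1};
        // The 5 code bytes of get(): 2a b4 0002 ac
        int[] get = {0x2a, 0xb4, 0x00, 0x02, 0xac};
        System.out.println(disassemble(init));
        System.out.println(disassemble(get));
    }
}
```

The result lines up with the IDEA bytecode view shown earlier: ALOAD 0, INVOKESPECIAL, ALOAD 0, ICONST_0, PUTFIELD, RETURN for the constructor, and ALOAD 0, GETFIELD, IRETURN for get.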

Class attributes

These are the attributes of the current class.

The last few hexadecimal digits describe the class attributes.

Number of attributes

00 01

The class has one attribute.

Description of attributes

00 1000 0000 0200 11

The first u2 is the attribute name, as an index into the constant pool: 0x0010 = 16, and #16 is SourceFile. The following u4 is the length of the attribute in bytes: 0x00000002 = 2. The remaining u2 is the content, again an index into the constant pool: 0x0011 = 17, and #17 is ByteCodeTest.java.
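Because every attribute starts with a name index and a byte length, a reader can step over attributes it does not understand. A minimal sketch of that count + data walk, with hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: walking an attribute list (u2 count, then name index + u4 length + body each).
// AttributeWalkSketch and attributeNameIndexes are hypothetical names for this example.
class AttributeWalkSketch {
    static int u2(int[] b, int off) {
        return (b[off] << 8) | b[off + 1];
    }

    static long u4(int[] b, int off) {
        return ((long) u2(b, off) << 16) | u2(b, off + 2);
    }

    // Returns the constant pool index of each attribute's name, skipping the bodies.
    static List<Integer> attributeNameIndexes(int[] data) {
        List<Integer> names = new ArrayList<>();
        int count = u2(data, 0);
        int pos = 2;
        for (int i = 0; i < count; i++) {
            names.add(u2(data, pos));
            long length = u4(data, pos + 2);
            pos += 6 + (int) length; // name index (2) + length field (4) + body
        }
        return names;
    }

    public static void main(String[] args) {
        // Class attributes of ByteCodeTest.class: 0001 0010 0000 0002 0011
        int[] classAttrs = {0x00, 0x01, 0x00, 0x10, 0x00, 0x00, 0x00, 0x02, 0x00, 0x11};
        System.out.println(attributeNameIndexes(classAttrs)); // [16]
    }
}
```

The same walk applies to the attributes nested inside a Code attribute, which is why the length field matters: it lets a reader skip a body without knowing its structure.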

Summary

Structure of .class files

Magic number
cafe babe
Version number
0000 0034
Constant pool
0016
0a00 0400 12
09 0003 0013 
0700 14
07 0015 
0100 0161
0100 0149 
0100 063c 696e 6974 3e
01 0003 2829 56
01 0004 436f 6465 
0100 0f4c 696e 654e 756d 6265 7254 6162 6c65 
0100 124c 6f63 616c 5661 7269 6162 6c65 5461 626c 65
01 0004 7468 6973 
0100 134c 7465 7374 2f42 7974 6543 6f64 6554 6573 743b 
0100 0367 6574 
0100 0328 2949 
0100 0a53 6f75 7263 6546 696c 65
01 0011 4279 7465 436f 6465 5465 7374 2e6a 6176 61
0c 0007 0008 
0c00 0500 06
01 0011 7465 7374 2f42 7974 6543 6f64 6554 6573 74
01 0010 6a61 7661 2f6c 616e 672f 4f62 6a65 6374
Access flags of the current class
0021
Current class
0003
Parent class
0004
Number of implemented interfaces
0000
Fields
0001 
0002 0005 0006 0000
Number of methods
0002
Method description
0001 0007 0008 0001
Method attribute description
0009 0000 0038 
0002 0001 0000 000a 2ab7 0001 2a03 
b500 02b1 0000 0002 000a 0000 000a 
0002 0000 0003 0004 0004 000b 0000 
000c 0001 0000 000a 000c 000d 0000 

0001 000e 000f 0001 
0009 0000 002f 
0001 0001 0000 0005 2ab4 0002 ac00
0000 0200 0a00 0000 0600 0100 0000 
0600 0b00 0000 0c00 0100 0000 0500 
0c00 0d00 00
Class attributes
00 0100 1000 0000 0200 11

Takeaways

Through this study of bytecode, I now understand how bytecode is composed and how Java source code is compiled into a .class file. It looks intimidating, but because everything is strictly specified, you can read it by following the structure tables.

— — — — — — — — — — — — — — —

The more you know, the more you don’t know.

If you have any questions about this article, please leave a comment or email me. If you think my writing is good, a "like" is appreciated.

Reproduction without permission is prohibited!

Search WeChat for [programmer Xu Xiaobai] and follow to read the latest articles first. I have also prepared 50 high-frequency campus recruitment interview questions and a variety of learning materials.