Source of these notes: the Silicon Valley (Atguigu) complete JVM tutorial, in which Song Hongkang explains the Java virtual machine in detail.

Update: gitee.com/vectorx/NOT…

Codechina.csdn.net/qq_35925558…

Github.com/uxiahnan/NO…


1. Class file structure

1.1. Class bytecode file structure

| Section | Type | Name | Description | Length | Count |
| --- | --- | --- | --- | --- | --- |
| Magic number | u4 | magic | Magic number, identifies the class file format | 4 bytes | 1 |
| Version number | u2 | minor_version | Minor version number | 2 bytes | 1 |
| | u2 | major_version | Major version number | 2 bytes | 1 |
| Constant pool collection | u2 | constant_pool_count | Constant pool counter | 2 bytes | 1 |
| | cp_info | constant_pool | Constant pool table | n bytes | constant_pool_count - 1 |
| Access flags | u2 | access_flags | Access flags | 2 bytes | 1 |
| Index collection | u2 | this_class | Class index | 2 bytes | 1 |
| | u2 | super_class | Superclass index | 2 bytes | 1 |
| | u2 | interfaces_count | Interface counter | 2 bytes | 1 |
| | u2 | interfaces | Interface index collection | 2 bytes | interfaces_count |
| Field table collection | u2 | fields_count | Field counter | 2 bytes | 1 |
| | field_info | fields | Field table | n bytes | fields_count |
| Method table collection | u2 | methods_count | Method counter | 2 bytes | 1 |
| | method_info | methods | Method table | n bytes | methods_count |
| Attribute table collection | u2 | attributes_count | Attribute counter | 2 bytes | 1 |
| | attribute_info | attributes | Attribute table | n bytes | attributes_count |

1.2. Class file data type

| Data type | Definition | Notes |
| --- | --- | --- |
| Unsigned number | Describes numbers, index references, quantity values, or UTF-8 encoded string values. | Unsigned numbers are the basic data types: u1, u2, u4, and u8 denote 1, 2, 4, and 8 bytes respectively. |
| Table | A compound data structure composed of multiple unsigned numbers or other tables. | By convention, all table names end with "_info". Because a table has no fixed length, it is usually preceded by a count. |

1.3. The magic number

Magic Number

  • The 4-byte unsigned integer at the beginning of each Class file is called the Magic Number.
  • Its sole purpose is to determine whether the file is a valid class file that the virtual machine can accept. In other words, the magic number is the class file identifier.
  • The magic value is fixed at 0xCAFEBABE and never changes.
  • If a class file does not start with 0xCAFEBABE, the virtual machine throws the following error during file verification:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.ClassFormatError: Incompatible magic value 1885430635 in class file StringTest
  • The use of magic numbers rather than extensions for identification is mainly for security reasons, since file extensions can be changed at will.
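
The check described above can be sketched in a few lines. This is only an illustration, not how any real virtual machine implements it; the names MagicCheck, readMagic, and hasClassMagic are made up for the example. It also shows one way a value like the 1885430635 in the error message above can arise: that number is 0x7061636B, the ASCII bytes of "pack", which is what a file that begins with the word package would start with.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class MagicCheck {
    static final int CLASS_MAGIC = 0xCAFEBABE;

    /** Reads the first four bytes as a big-endian u4, the byte order class files use. */
    static int readMagic(byte[] data) {
        return ByteBuffer.wrap(data, 0, 4).getInt(); // ByteBuffer is big-endian by default
    }

    /** True if the data is at least four bytes long and starts with 0xCAFEBABE. */
    static boolean hasClassMagic(byte[] data) {
        return data.length >= 4 && readMagic(data) == CLASS_MAGIC;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: a class-file-like header and a snippet of source text.
        byte[] classLike = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        byte[] sourceLike = "package demo;".getBytes(StandardCharsets.US_ASCII);

        System.out.println(hasClassMagic(classLike));  // true
        System.out.println(hasClassMagic(sourceLike)); // false
        System.out.println(readMagic(sourceLike));     // 1885430635
    }
}
```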

1.4. File version number

The four bytes that follow the magic number store the version number of the class file. The fifth and sixth bytes hold the minor version number, minor_version, and the seventh and eighth bytes hold the major version number, major_version.

Together, they form the format version number of the class file. For example, if the major version number of a class file is M and the minor version number is m, the format version of the class file is M.m.

The mapping between version numbers and Java compilers is as follows:

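1.4.1. Mapping between class file versions and compiler versions

| Major version (decimal) | Minor version (decimal) | Compiler version |
| --- | --- | --- |
| 45 | 3 | 1.1 |
| 46 | 0 | 1.2 |
| 47 | 0 | 1.3 |
| 48 | 0 | 1.4 |
| 49 | 0 | 1.5 |
| 50 | 0 | 1.6 |
| 51 | 0 | 1.7 |
| 52 | 0 | 1.8 |
| 53 | 0 | 9 |
| 54 | 0 | 10 |
| 55 | 0 | 11 |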

The class file major version starts at 45 and has increased by 1 with each major JDK release since JDK 1.1.

Compilers of different Java versions produce class files of different versions. Newer Java virtual machines can execute class files generated by older compilers, but older Java virtual machines cannot execute class files generated by newer compilers; if one tries, the JVM throws java.lang.UnsupportedClassVersionError. (In other words, the JVM is backward compatible.)

In practice, a mismatch between the development and production environments can trigger this problem, so pay special attention to whether the JDK used to compile the code matches the JDK version in production.

  • If the JDK version of the VM is 1.k (k >= 2), it supports class file versions from 45.0 to (44+k).0, inclusive.
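
As a minimal sketch of the layout and the mapping described in this section, the following reads minor_version and major_version from the first eight bytes and applies the "major version minus 44" rule implied by the table. The class name ClassVersion and its methods are hypothetical, and the header bytes are written by hand rather than taken from a real class file.

```java
import java.nio.ByteBuffer;

final class ClassVersion {
    /** Returns {major, minor} read from the first 8 bytes of a class file. */
    static int[] readVersion(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header);   // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {           // bytes 1-4: magic number
            throw new IllegalArgumentException("not a class file");
        }
        int minor = buf.getShort() & 0xFFFF;        // bytes 5-6: minor_version
        int major = buf.getShort() & 0xFFFF;        // bytes 7-8: major_version
        return new int[] {major, minor};
    }

    /** Java release that introduced this major version (45 maps to 1, 52 to 8, 55 to 11). */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        int[] v = readVersion(header);
        System.out.println(v[0] + "." + v[1] + " -> Java " + javaRelease(v[0])); // 52.0 -> Java 8
    }
}
```

A loader could compare the major version it reads this way against the highest version it supports before deciding whether to reject the file.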

1.5. Constant pool collection

The constant pool is one of the richest areas of the class file, and it is crucial for resolving the fields and methods in the class file.

As the Java virtual machine has evolved, the content of the constant pool has grown richer. The constant pool can be regarded as the cornerstone of the entire class file.

The version number is followed by the constant pool count and then the constant pool entries.

The number of constants in the constant pool is not fixed, so a u2 unsigned number is placed at the start of the constant pool to hold the capacity count (constant_pool_count). Contrary to the usual convention in Java, this count starts at 1 instead of 0.

| Type | Name | Count |
| --- | --- | --- |
| u2 (unsigned number) | constant_pool_count | 1 |
| cp_info (table) | constant_pool | constant_pool_count - 1 |

As you can see from the table above, the Class file uses a front-loaded capacity counter (constant_pool_count) followed by several consecutive data items (constant_pool) to describe the constant pool contents. We call this continuous set of constant pool data a constant pool set.

  • Constant pool entries store the literals and symbolic references generated at compile time. After the class is loaded, this content is placed in the runtime constant pool in the method area.

1.5.1. Constant pool counters

constant_pool_count (constant pool counter)

  • Because the number of constants varies from class to class, two bytes are set aside to hold the constant pool capacity count.
  • The constant pool capacity count (u2) starts from 1, not 0: constant_pool_count = 1 means the constant pool contains zero entries.
  • The value in the Demo class file is:

The value is 0x0016, which is 22 in decimal. Note that there are actually only 21 constants, with indexes ranging from 1 to 21. Why is that?

We normally count from 0 when writing code, but the constant pool starts at 1 because index 0 is left empty on purpose. Some data items that hold a constant pool index occasionally need to express "does not refer to any constant pool entry", and the index value 0 is reserved for that.

1.5.2. Constant pool table

constant_pool is a table structure whose indexes range from 1 to constant_pool_count - 1, which also tells you how many constant entries it holds.

The constant pool mainly stores two kinds of constants: literals and symbolic references.

It contains all the string constants, class or interface names, field names, and other constants referenced in the class file structure and its substructures. Every entry in the constant pool shares one trait: its first byte is a type marker, called the tag byte, that determines the format of the rest of the entry.

| Type | Tag | Description |
| --- | --- | --- |
| CONSTANT_Utf8_info | 1 | UTF-8 encoded string |
| CONSTANT_Integer_info | 3 | Integer literal |
| CONSTANT_Float_info | 4 | Floating-point literal |
| CONSTANT_Long_info | 5 | Long integer literal |
| CONSTANT_Double_info | 6 | Double-precision floating-point literal |
| CONSTANT_Class_info | 7 | Symbolic reference to a class or interface |
| CONSTANT_String_info | 8 | String literal |
| CONSTANT_Fieldref_info | 9 | Symbolic reference to a field |
| CONSTANT_Methodref_info | 10 | Symbolic reference to a method of a class |
| CONSTANT_InterfaceMethodref_info | 11 | Symbolic reference to a method of an interface |
| CONSTANT_NameAndType_info | 12 | Partial symbolic reference (name and descriptor) to a field or method |
| CONSTANT_MethodHandle_info | 15 | Represents a method handle |
| CONSTANT_MethodType_info | 16 | Represents a method type |
| CONSTANT_InvokeDynamic_info | 18 | Represents a dynamic method call site |

ⅰ. Literal and symbolic references

Before we can interpret these constants, we need to clarify a few concepts.

The constant pool mainly stores two kinds of constants: literals and symbolic references, as shown in the following table:

| Constant | Concrete constant |
| --- | --- |
| Literal | Text strings |
| | Constant values declared as final |
| Symbolic reference | Fully qualified names of classes and interfaces |
| | Names and descriptors of fields |
| | Names and descriptors of methods |

Fully qualified name

com/atguigu/test/Demo is the fully qualified name of a class: the "." characters in the package-qualified class name are simply replaced with "/". To keep consecutive fully qualified names from running together, a ";" is usually appended to mark the end of the name.

Simple name

A simple name is the name of a method or field without any type or parameter decoration. For example, if a class has an add() method and a num field, their simple names are add and num, respectively.

Descriptor

A descriptor describes the data type of a field, or the parameter list (number, types, and order) and return value of a method. Under the descriptor rules, the primitive types (byte, char, double, float, int, long, short, boolean) and void, which represents no return value, are each written as a single uppercase character, while an object type is written as the character L followed by the fully qualified name of the type and a ";", as shown in the following table:

| Identifier | Meaning |
| --- | --- |
| B | Primitive type byte |
| C | Primitive type char |
| D | Primitive type double |
| F | Primitive type float |
| I | Primitive type int |
| J | Primitive type long |
| S | Primitive type short |
| Z | Primitive type boolean |
| V | The special type void |
| L | Object type, for example Ljava/lang/Object; |
| [ | Array type; each [ represents one array dimension. For example, double[] is [D |

A method descriptor lists the parameter list first and the return value second. The parameters are placed, in strict declaration order, inside a pair of parentheses (). For example, the descriptor of the method java.lang.String toString() is ()Ljava/lang/String; and the descriptor of int abc(int[] x, int y) is ([II)I.
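
To make the descriptor rules concrete, here is a small sketch that turns a descriptor back into Java-like type names. It is an illustration only: DescriptorDemo, describeMethod, and describeField are hypothetical names, and the parser assumes well-formed input and does no real validation.

```java
import java.util.StringJoiner;

final class DescriptorDemo {
    private final String text;
    private int pos;

    private DescriptorDemo(String text) { this.text = text; }

    /** Parses one type starting at pos and advances past it. */
    private String nextType() {
        int dims = 0;
        while (text.charAt(pos) == '[') { dims++; pos++; }   // one '[' per array dimension
        String base;
        char c = text.charAt(pos++);
        switch (c) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'V': base = "void"; break;
            case 'L': {                                       // L<fully qualified name>;
                int end = text.indexOf(';', pos);
                base = text.substring(pos, end).replace('/', '.');
                pos = end + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("bad descriptor: " + text);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) sb.append("[]");
        return sb.toString();
    }

    /** Renders a field descriptor, e.g. "[[[D" as "double[][][]". */
    static String describeField(String descriptor) {
        return new DescriptorDemo(descriptor).nextType();
    }

    /** Renders a method descriptor as "returnType (paramTypes)". */
    static String describeMethod(String descriptor) {
        DescriptorDemo p = new DescriptorDemo(descriptor);
        if (p.text.charAt(p.pos++) != '(') {
            throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
        StringJoiner params = new StringJoiner(", ", "(", ")");
        while (p.text.charAt(p.pos) != ')') params.add(p.nextType());
        p.pos++;                                              // skip ')'
        return p.nextType() + " " + params;
    }

    public static void main(String[] args) {
        System.out.println(describeMethod("([II)I"));               // int (int[], int)
        System.out.println(describeMethod("()Ljava/lang/String;")); // java.lang.String ()
        System.out.println(describeField("[[[D"));                  // double[][][]
    }
}
```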

Supplementary notes:

The virtual machine performs dynamic linking only when it loads the class file, which means the class file does not store the final memory layout of any method or field. Symbolic references to those fields and methods therefore cannot be used by the virtual machine directly. At run time, the virtual machine obtains the symbolic reference from the constant pool and, during the resolution phase of class loading, replaces it with a direct reference, that is, translates it into a concrete memory address.

The differences and connections between symbolic and direct references are as follows:

  • Symbolic reference: a symbolic reference describes the referenced target with a set of symbols. The symbols can be any form of literal, as long as they locate the target unambiguously. Symbolic references are independent of the memory layout implemented by the virtual machine, and the referenced target need not have been loaded into memory yet.
  • Direct reference: a direct reference can be a pointer straight to the target, a relative offset, or a handle that locates the target indirectly. Direct references depend on the memory layout implemented by the virtual machine, so the same symbolic reference generally resolves to different direct references on different VM instances. If a direct reference exists, its target must already be in memory.

ⅱ. Types and structures of constants

Each constant in the constant pool is a table. As of JDK 1.7 there are 14 different table structures, one for each tag listed in the table above.

The description of each type also tells us what it describes in the constant pool (mainly literals and symbolic references). For example, CONSTANT_Integer_info describes literal information in the constant pool, and only integer literals.

The constant types with tags 15, 16, and 18 were added in JDK 1.7 to support dynamic language calls.

Details:

  • The CONSTANT_Class_info structure represents a class or interface.
  • The CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info structures represent fields, methods, and interface methods.
  • The CONSTANT_String_info structure represents a constant object of type String.
  • CONSTANT_Integer_info and CONSTANT_Float_info represent 4-byte numeric constants (int and float).
  • The CONSTANT_Long_info and CONSTANT_Double_info structures represent 8-byte numeric constants (long and double).
    • In the constant pool table of a class file, every 8-byte constant takes up the space of two table entries. If a CONSTANT_Long_info or CONSTANT_Double_info structure sits at index n in the constant pool, the next usable entry is at index n+2. The entry at index n+1 is still a valid index but must be considered unusable.
  • The CONSTANT_NameAndType_info structure represents a field or method, but unlike the three Fieldref/Methodref/InterfaceMethodref structures, it does not specify the class or interface to which the field or method belongs.
  • CONSTANT_Utf8_info represents the value of a string constant.
  • The CONSTANT_MethodHandle_info structure represents a method handle.
  • The CONSTANT_MethodType_info structure represents a method type.
  • The CONSTANT_InvokeDynamic_info structure describes the bootstrap method used by the invokedynamic instruction, the dynamic invocation name, the argument and return types, and optionally a series of constants, called static arguments, that are passed to the bootstrap method.

Ways to parse a class file:

  • Parse it byte by byte.

  • Parse it with the javap command, javap -verbose Demo.class, or with the jclasslib tool, which is more convenient.
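
To give a feel for what byte-by-byte parsing of the constant pool involves, here is a rough sketch that walks the entries using only the 14 tags listed in this section. The pool it parses is built by hand in samplePool(), not taken from a real class file, and the names ConstantPoolWalker, walk, and samplePool are made up for the example. It relies on the fact that DataInputStream.readUTF reads a u2 length followed by modified UTF-8 bytes, which matches the CONSTANT_Utf8_info layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ConstantPoolWalker {
    /** Walks a constant pool that starts with its u2 constant_pool_count. */
    static List<String> walk(byte[] pool) {
        List<String> entries = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(pool))) {
            int count = in.readUnsignedShort();
            for (int i = 1; i < count; i++) {        // index 0 is reserved, so start at 1
                int tag = in.readUnsignedByte();     // the u1 tag selects the entry layout
                switch (tag) {
                    case 1: entries.add("#" + i + " Utf8 " + in.readUTF()); break;
                    case 3: entries.add("#" + i + " Integer " + in.readInt()); break;
                    case 4: entries.add("#" + i + " Float " + in.readFloat()); break;
                    case 5: entries.add("#" + i + " Long " + in.readLong()); i++; break;     // uses two slots
                    case 6: entries.add("#" + i + " Double " + in.readDouble()); i++; break; // uses two slots
                    case 7: case 8: case 16:         // one u2 index
                        entries.add("#" + i + " tag " + tag + " -> #" + in.readUnsignedShort());
                        break;
                    case 9: case 10: case 11: case 12: case 18: // two u2 indexes
                        entries.add("#" + i + " tag " + tag + " -> #" + in.readUnsignedShort()
                                + ", #" + in.readUnsignedShort());
                        break;
                    case 15:                         // u1 reference kind + u2 index
                        entries.add("#" + i + " MethodHandle kind " + in.readUnsignedByte()
                                + " -> #" + in.readUnsignedShort());
                        break;
                    default:
                        throw new IllegalStateException("unknown tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    /** Hand-built pool: #1 Utf8 "Demo", #2 Class -> #1, #3 Long 42 (also occupies #4), #5 Integer 7. */
    static byte[] samplePool() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeShort(6);                       // constant_pool_count = highest index + 1
            out.writeByte(1); out.writeUTF("Demo");
            out.writeByte(7); out.writeShort(1);
            out.writeByte(5); out.writeLong(42L);
            out.writeByte(3); out.writeInt(7);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        walk(samplePool()).forEach(System.out::println);
    }
}
```

Note how the Long entry at index 3 pushes the following Integer entry to index 5, which is the two-slot rule for 8-byte constants described above.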

Conclusion 1:

  • What these 14 tables (constant entry structures) have in common is that each begins with a u1 tag byte that indicates which table structure, that is, which constant type, the entry uses.
  • In the constant pool, a CONSTANT_Utf8_info entry stores string information in a modified UTF-8 encoding: literal strings, fully qualified names of classes or interfaces, simple names of fields or methods, and descriptors.
  • Another feature of the 14 structures is that 13 of them occupy a fixed number of bytes; only CONSTANT_Utf8_info has a variable size, determined by its length item. Why? The constant pool holds literals and symbolic references, and in the end these are strings whose size is determined when the program is written. For example, a class name can be long or short, so the size is not known until compilation, when the name is encoded as UTF-8 and its length becomes known.

Conclusion 2:

  • Constant pool: the resource repository of the class file. It is the data item most closely tied to the rest of the class file structure (many of the items that follow point into it), and it is also one of the items that take up the most space in the class file.
  • Why does the constant pool contain these things? Unlike C and C++, Java code has no "link" step when javac compiles it; linking happens dynamically when the virtual machine loads the class file. In other words, the class file does not store the final memory layout of any method or field, so symbolic references to them cannot be used by the virtual machine until they are converted at run time into actual memory entry addresses. When the virtual machine runs, it retrieves symbolic references from the constant pool and resolves them into concrete memory addresses at class creation or at run time. Class creation and dynamic linking are covered in more detail with the virtual machine's class loading process.

1.6. Access flags

Access flags (access_flags)

The access flags immediately follow the constant pool. This two-byte item identifies class-level or interface-level access information, including whether the class file represents a class or an interface, whether it is declared public, whether it is abstract, and, for a class, whether it is declared final. The individual flags are shown below:

| Flag name | Value | Meaning |
| --- | --- | --- |
| ACC_PUBLIC | 0x0001 | The type is public |
| ACC_FINAL | 0x0010 | The type is declared final; only classes can set this flag |
| ACC_SUPER | 0x0020 | Enables the new semantics of the invokespecial bytecode instruction (the improved way of calling superclass methods). Set by default for classes compiled after JDK 1.0.2 |
| ACC_INTERFACE | 0x0200 | This is an interface |
| ACC_ABSTRACT | 0x0400 | The type is abstract. Set for interfaces and abstract classes, clear for all other types |
| ACC_SYNTHETIC | 0x1000 | The class is not generated from user code (it is produced by the compiler and has no source counterpart) |
| ACC_ANNOTATION | 0x2000 | This is an annotation |
| ACC_ENUM | 0x4000 | This is an enumeration |

Class access flags are defined as constants whose names start with ACC_.

Each flag is represented by setting a specific bit in the 16 bits of access_flags. For example, a public final class is marked ACC_PUBLIC | ACC_FINAL.

Setting ACC_SUPER lets a class locate the superclass method in a super.method() call more accurately; modern compilers always set it.
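
Because each flag is a single bit, a combined access_flags value can be decoded with simple masking. The sketch below uses the flag values from the table above; the class name ClassAccessFlags and its decode method are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ClassAccessFlags {
    // Flag names and bit masks, in the order of the table above.
    private static final Map<String, Integer> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put("ACC_PUBLIC", 0x0001);
        FLAGS.put("ACC_FINAL", 0x0010);
        FLAGS.put("ACC_SUPER", 0x0020);
        FLAGS.put("ACC_INTERFACE", 0x0200);
        FLAGS.put("ACC_ABSTRACT", 0x0400);
        FLAGS.put("ACC_SYNTHETIC", 0x1000);
        FLAGS.put("ACC_ANNOTATION", 0x2000);
        FLAGS.put("ACC_ENUM", 0x4000);
    }

    /** Returns the names of all flags whose bit is set in accessFlags. */
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> flag : FLAGS.entrySet()) {
            if ((accessFlags & flag.getValue()) != 0) {
                names.add(flag.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0001 | 0x0010)); // [ACC_PUBLIC, ACC_FINAL]
        System.out.println(decode(0x0021));          // [ACC_PUBLIC, ACC_SUPER]
        System.out.println(decode(0x0601));          // [ACC_PUBLIC, ACC_INTERFACE, ACC_ABSTRACT]
    }
}
```

The value 0x0021 is what one would expect for an ordinary public class compiled by a compiler that sets ACC_SUPER, and 0x0601 for a public interface.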

Supplementary notes:

  1. A class file with the ACC_INTERFACE flag represents an interface rather than a class, and vice versa represents a class rather than an interface.

    • If a class file has the ACC_INTERFACE flag set, the ACC_ABSTRACT flag must also be set, and the ACC_FINAL, ACC_SUPER, and ACC_ENUM flags must not be set.
    • If the ACC_INTERFACE flag is not set, the class file may set any flag in the table above except ACC_ANNOTATION. The exception is mutually exclusive flags such as ACC_FINAL and ACC_ABSTRACT, which must not be set at the same time.
  2. The ACC_SUPER flag determines which execution semantics the invokespecial instruction uses in the class or interface. Compilers targeting the Java virtual machine instruction set should set this flag. In Java SE 8 and later, the Java virtual machine treats every class file as having ACC_SUPER set, regardless of the flag's actual value in the class file and regardless of the class file's version number.

    • The ACC_SUPER flag exists for backward compatibility with code compiled by older Java compilers. In access_flags generated by compilers before JDK 1.0.2, the bit that now represents ACC_SUPER had no assigned meaning, and Oracle's Java virtual machine implementation ignored it if it was set.
  3. The ACC_SYNTHETIC flag means that the class or interface is generated by the compiler, not the source code.

  4. The annotation type must have the ACC_ANNOTATION flag set. If the ACC_ANNOTATION flag is set, then the ACC_INTERFACE flag must also be set.

  5. The ACC_ENUM flag indicates that the class or its parent class is an enumerated type.

1.7. Class index, parent index, and interface index

The access flags are followed by the class index, the superclass index, and the implemented interfaces, in the following format:

| Length | Meaning |
| --- | --- |
| u2 | this_class |
| u2 | super_class |
| u2 | interfaces_count |
| u2 | interfaces[interfaces_count] |

These three items determine the inheritance relationships of the class:

  • The class index determines the fully qualified name of this class.
  • The superclass index determines the fully qualified name of this class's parent. Because the Java language does not allow multiple inheritance, there is only one superclass index. Every Java class except java.lang.Object has a parent class, so no Java class other than java.lang.Object has a superclass index of 0.
  • The interface index collection describes which interfaces the class implements. The implemented interfaces are listed in the order they appear, left to right, after the implements keyword (or after the extends keyword if the class file itself represents an interface).

1.7.1. this_class (class index)

An unsigned 2-byte integer that is an index into the constant pool. It provides the fully qualified name of the class, such as com/atguigu/java1/Demo. The value of this_class must be a valid index into the constant pool table, and the constant pool entry at that index must be a CONSTANT_Class_info structure representing the class or interface defined by this class file.

1.7.2. super_class (superclass index)

An unsigned 2-byte integer that is an index into the constant pool. It provides the fully qualified name of the parent of the current class. If a class does not explicitly extend anything, its parent defaults to java/lang/Object. Because Java does not support multiple inheritance, there is only one parent class.

The class that super_class points to must not be final.

1.7.3. interfaces

A collection of constant pool indexes that provide symbolic references to all implemented interfaces.

Since a class can implement multiple interfaces, the indexes of those interfaces are stored in an array. Each index in the array points to a CONSTANT_Class entry in the constant pool (which must, of course, be an interface rather than a class).

ⅰ. interfaces_count

The value of the interfaces_count item is the number of direct superinterfaces of the current class or interface.

ⅱ. interfaces[] (interface table)

Each member of interfaces[] must be a valid index into the constant pool table, and the array has interfaces_count members. The constant pool entry referenced by each member interfaces[i], where 0 <= i < interfaces_count, must be a CONSTANT_Class_info structure. The interfaces appear in interfaces[] in the same left-to-right order as in the source code; that is, interfaces[0] corresponds to the leftmost interface in the source.
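
The layout of this part of the class file (this_class, super_class, interfaces_count, then interfaces[]) can be read as sketched below. The bytes are hand-written and the index values 2, 4, 6, and 8 are arbitrary placeholders; ClassIndexes is a hypothetical helper, and a real parser would go on to look each index up in the constant pool.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class ClassIndexes {
    final int thisClass;
    final int superClass;
    final int[] interfaces;

    private ClassIndexes(int thisClass, int superClass, int[] interfaces) {
        this.thisClass = thisClass;
        this.superClass = superClass;
        this.interfaces = interfaces;
    }

    /** Reads the items that follow access_flags; every value is a u2 constant pool index. */
    static ClassIndexes read(ByteBuffer buf) {
        int thisClass = buf.getShort() & 0xFFFF;
        int superClass = buf.getShort() & 0xFFFF;       // 0 only for java.lang.Object
        int count = buf.getShort() & 0xFFFF;            // interfaces_count
        int[] interfaces = new int[count];
        for (int i = 0; i < count; i++) {
            interfaces[i] = buf.getShort() & 0xFFFF;    // same left-to-right order as the source
        }
        return new ClassIndexes(thisClass, superClass, interfaces);
    }

    public static void main(String[] args) {
        // this_class = #2, super_class = #4, two interfaces at #6 and #8 (placeholder indexes).
        byte[] bytes = {0, 2, 0, 4, 0, 2, 0, 6, 0, 8};
        ClassIndexes c = read(ByteBuffer.wrap(bytes));
        System.out.println(c.thisClass + " " + c.superClass + " " + Arrays.toString(c.interfaces));
        // 2 4 [6, 8]
    }
}
```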

1.8. Collection of field tables

fields

Used to describe variables declared in an interface or class. Fields include class-level variables and instance-level variables, but do not include local variables declared inside methods or code blocks.

The name of a field and the data type it is declared with are not fixed, so they can only be described by referring to constants in the constant pool.

Each field entry points into the constant pool through indexes and, together with its flags, describes the field completely: its identifier, its access modifier (public, private, or protected), whether it is a class or instance variable (static modifier), whether it is a constant (final modifier), and so on.

Notes:

  • The field table collection does not list fields inherited from a parent class or an implemented interface, but it may list fields that do not exist in the original Java code. For example, to preserve access to the outer class, the compiler automatically adds a field pointing to the outer class instance to an inner class.
  • Fields cannot be overloaded in the Java language: two fields must have different names, regardless of whether their data types and modifiers are the same. In bytecode, however, two fields with the same name are legal as long as their descriptors differ.

1.8.1. Field counters

fields_count (field counter)

The value of fields_count is the number of members in the fields table of the current class file. It is stored in two bytes.

Each member of the fields table is a field_info structure that represents all class or instance fields declared by the class or interface, not including variables declared inside the method or fields inherited from the parent class or interface.

| Type | Name | Meaning | Count |
| --- | --- | --- | --- |
| u2 | access_flags | Access flags | 1 |
| u2 | name_index | Field name index | 1 |
| u2 | descriptor_index | Descriptor index | 1 |
| u2 | attributes_count | Attribute counter | 1 |
| attribute_info | attributes | Attribute collection | attributes_count |

1.8.2. Field table

ⅰ. Field table access identifier

A field can be modified by a variety of keywords, such as access modifiers (public, private, protected), the static modifier, the final modifier, the volatile modifier, and so on. Like a class, a field therefore records these modifiers as access flags. The access flags for fields are as follows:

| Flag name | Value | Meaning |
| --- | --- | --- |
| ACC_PUBLIC | 0x0001 | Whether the field is public |
| ACC_PRIVATE | 0x0002 | Whether the field is private |
| ACC_PROTECTED | 0x0004 | Whether the field is protected |
| ACC_STATIC | 0x0008 | Whether the field is static |
| ACC_FINAL | 0x0010 | Whether the field is final |
| ACC_VOLATILE | 0x0040 | Whether the field is volatile |
| ACC_TRANSIENT | 0x0080 | Whether the field is transient |
| ACC_SYNTHETIC | 0x1000 | Whether the field is generated automatically by the compiler |
| ACC_ENUM | 0x4000 | Whether the field is an enum constant |

ⅱ. Descriptor index

A descriptor describes the data type of a field, or the parameter list (number, types, and order) and return value of a method. Under the descriptor rules, the primitive types (byte, char, double, float, int, long, short, boolean) and void, which represents no return value, are each written as a single uppercase character, while an object type is written as the character L followed by the fully qualified name of the type, as follows:

| Identifier | Meaning |
| --- | --- |
| B | Primitive type byte |
| C | Primitive type char |
| D | Primitive type double |
| F | Primitive type float |
| I | Primitive type int |
| J | Primitive type long |
| S | Primitive type short |
| Z | Primitive type boolean |
| V | The special type void |
| L | Object type, for example Ljava/lang/Object; |
| [ | Array type; each [ represents one array dimension. For example, double[][][] is [[[D |

ⅲ. Set of attribute tables

A field may also carry attributes that store additional information, such as its initial value or annotation information. The number of attributes is stored in attributes_count, and the attributes themselves are stored in the attributes array.

// Taking the ConstantValue attribute as an example, the structure is:
ConstantValue_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 constantvalue_index;
}

Note: For constant attributes, the attribute_length value is always 2.

1.9. Collection of method tables

methods: a collection of constant pool indexes that completely describes the signature of each method.

  • In a bytecode file, each method_info entry corresponds to one method of the class or interface, including its access modifier (public, private, or protected), its return type, and its parameter information.
  • If a method is neither abstract nor native, its body also appears in the bytecode.
  • On the one hand, the methods table describes only methods declared in the current class or interface, not methods inherited from a parent class or superinterface. On the other hand, the methods table may contain methods added automatically by the compiler, most typically the class (or interface) initialization method <clinit>() and the instance initialization method <init>().

Notes:

In the Java language, overloading a method requires not only the same simple name as the original method but also a different signature. The signature is the collection of constant pool symbolic references for the method's parameters; the return value is not part of it, which is why the Java language cannot overload an existing method solely by giving it a different return type. In the class file format the notion is broader: two methods can coexist as long as their descriptors are not exactly the same. That is, two methods with the same name and the same parameter list but different return types can legally coexist in one class file.

In other words, the Java language does not allow a class or interface to declare several methods with the same signature, but a bytecode file does allow it, provided their return types differ.
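
One place where this difference shows up in practice is the bridge method that javac generates for a covariant return type. The sketch below is an illustration under that assumption: Base, Derived, and returnTypes are made-up names, and the reflection calls simply list what ended up in the compiled class.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BridgeDemo {
    static class Base {
        Object value() { return "base"; }
    }

    static class Derived extends Base {
        @Override
        String value() { return "derived"; } // covariant return type: String instead of Object
    }

    /** Return types of every method called name that is declared directly in type. */
    static List<String> returnTypes(Class<?> type, String name) {
        List<String> out = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                out.add(m.getReturnType().getSimpleName() + (m.isBridge() ? " (bridge)" : ""));
            }
        }
        Collections.sort(out); // getDeclaredMethods() has no guaranteed order
        return out;
    }

    public static void main(String[] args) {
        // Derived declares one value() in source, but its class file holds two:
        // the String version and a compiler-generated bridge returning Object.
        System.out.println(returnTypes(Derived.class, "value")); // [Object (bridge), String]
    }
}
```

Both methods have the name value and an empty parameter list; only their descriptors, ()Ljava/lang/Object; and ()Ljava/lang/String;, differ.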

1.9.1. Method counters

methods_count (method counter)

The value of methods_count is the number of members in the methods table of the current class file. It is stored in two bytes.

Each member of the methods table is a method_info structure.

1.9.2. Method table

methods[] (method table)

Each member of the methods table must be a method_info structure that gives a complete description of one method of the current class or interface. If neither ACC_NATIVE nor ACC_ABSTRACT is set in the access_flags item of a method_info structure, the structure also contains the Java virtual machine instructions that implement the method.

The method_info structure can represent every method defined in a class or interface, including instance methods, class methods, instance initialization methods, and class or interface initialization methods.

The structure of the method table is the same as that of the field table.

| Type | Name | Meaning | Count |
| --- | --- | --- | --- |
| u2 | access_flags | Access flags | 1 |
| u2 | name_index | Method name index | 1 |
| u2 | descriptor_index | Descriptor index | 1 |
| u2 | attributes_count | Attribute counter | 1 |
| attribute_info | attributes | Attribute collection | attributes_count |

Method table access flags

Like the field table, the method table has access flags; some are the same as the field flags and some differ.

| Flag name | Value | Meaning |
| --- | --- | --- |
| ACC_PUBLIC | 0x0001 | public: the method can be accessed from outside the package |
| ACC_PRIVATE | 0x0002 | private: the method can only be accessed by this class |
| ACC_PROTECTED | 0x0004 | protected: the method can be accessed by this class and by subclasses |
| ACC_STATIC | 0x0008 | static: a static method |

1.10. Attribute table collection

The attribute table collection that follows the method table collection holds the auxiliary information carried by the class file, such as the name of its source file and any annotations with RetentionPolicy.CLASS or RetentionPolicy.RUNTIME. This kind of information is typically used for verification and execution by the Java virtual machine and for debugging Java programs, and usually does not need to be studied in depth.

In addition, field tables and method tables can have their own attribute tables, which describe information specific to certain scenarios.

The restrictions on the attribute table collection are somewhat looser: the attribute tables are not required to appear in any strict order, and any compiler may write its own attribute information into the collection as long as the names do not clash with existing attribute names. The Java virtual machine ignores attributes it does not recognize at run time.

1.10.1. Attribute counters

attributes_count (attribute counter)

The value of attributes_count is the number of members in the attribute table of the current class file. Each entry in the attribute table is an attribute_info structure.

1.10.2. Attribute table

attributes[] (attribute table)

Each entry in the attribute table must be an attribute_info structure. The structure of the attribute table is flexible; every kind of attribute only needs to follow the general layout below.

General format of an attribute

| Type | Name | Quantity | Meaning |
| --- | --- | --- | --- |
| u2 | attribute_name_index | 1 | Attribute name index |
| u4 | attribute_length | 1 | Attribute length |
| u1 | info | attribute_length | Attribute content |
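
Since every attribute starts with the same name index and length, a reader can step over an attribute it does not understand without decoding its content. The following is a minimal sketch of that idea; the names `AttributeReader` and `RawAttribute` are invented for illustration, and the sample bytes are a hand-built attribute rather than output from a real compiler.

```java
import java.nio.ByteBuffer;

public class AttributeReader {
    /** One raw attribute_info entry: the name index plus the undecoded info bytes. */
    public static final class RawAttribute {
        public final int nameIndex;
        public final byte[] info;

        RawAttribute(int nameIndex, byte[] info) {
            this.nameIndex = nameIndex;
            this.info = info;
        }
    }

    /** Reads one attribute_info from the buffer (ByteBuffer is big-endian by default, like class files). */
    public static RawAttribute read(ByteBuffer buf) {
        int nameIndex = buf.getShort() & 0xFFFF;    // u2 attribute_name_index
        long length = buf.getInt() & 0xFFFFFFFFL;   // u4 attribute_length, treated as unsigned
        if (length > buf.remaining()) {
            throw new IllegalArgumentException("truncated attribute");
        }
        byte[] info = new byte[(int) length];       // u1 info[attribute_length]
        buf.get(info);
        return new RawAttribute(nameIndex, info);
    }

    public static void main(String[] args) {
        // Hypothetical attribute: name index 13, length 2, two info bytes.
        byte[] bytes = {0x00, 0x0D, 0x00, 0x00, 0x00, 0x02, 0x00, 0x0E};
        RawAttribute attr = read(ByteBuffer.wrap(bytes));
        System.out.println(attr.nameIndex + " " + attr.info.length); // prints 13 2
    }
}
```

Resolving the name index against the constant pool is left out here; a full parser would look up the CONSTANT_Utf8 entry at that index to decide how to interpret the info bytes.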

Attribute types

There are actually many types of attributes, and the Code attribute seen above is just one of them. Java 8 defines 23 attributes. The following are attributes predefined by the virtual machine:

| Attribute name | Location | Meaning |
| --- | --- | --- |
| Code | Method table | Bytecode instructions compiled from Java code |
| ConstantValue | Field table | Constant value of a field defined with the final keyword |
| Deprecated | Class, method table, field table | Marks classes, methods, and fields declared deprecated |
| Exceptions | Method table | Exceptions the method declares it may throw |
| EnclosingMethod | Class file | Present only when a class is local or anonymous; identifies the method that encloses the class |
| InnerClasses | Class file | Inner class list |
| LineNumberTable | Code attribute | Mapping between Java source line numbers and bytecode instructions |
| LocalVariableTable | Code attribute | Description of the method's local variables |
| StackMapTable | Code attribute | Added in JDK 1.6; lets the new type checker verify that the types of the target method's local variables and operand stack entries match what the instructions require |
| Signature | Class, method table, field table | Records generic signatures for classes, methods, and fields |
| SourceFile | Class file | Records the source file name |
| SourceDebugExtension | Class file | Stores additional debugging information |
| Synthetic | Class, method table, field table | Marks a class, method, or field as generated by the compiler |
| LocalVariableTypeTable | Code attribute | Uses signatures instead of descriptors; added after generics were introduced so that generic parameterized types can be described |
| RuntimeVisibleAnnotations | Class, method table, field table | Annotations that are visible at run time |
| RuntimeInvisibleAnnotations | Class, method table, field table | Annotations that are not visible at run time |
| RuntimeVisibleParameterAnnotations | Method table | Similar to RuntimeVisibleAnnotations, but applies to method parameters |
| RuntimeInvisibleParameterAnnotations | Method table | Similar to RuntimeInvisibleAnnotations, but applies to method parameters |
| AnnotationDefault | Method table | Records the default value of an annotation element |
| BootstrapMethods | Class file | Stores the bootstrap method specifiers referenced by invokedynamic instructions |

The full list can also be found in the official Java Virtual Machine Specification.

Detailed explanation of some attributes

(1) ConstantValue attribute

The ConstantValue attribute represents the value of a constant field. It is located in the attribute table of the field_info structure.

ConstantValue_attribute{
	u2 attribute_name_index;
	u4 attribute_length;
	u2 constantvalue_index; // Index into the constant pool. The entry at that index gives the constant value this attribute represents (for example, a value of type long is a CONSTANT_Long entry).
}  

(2) Deprecated attribute

The Deprecated attribute was introduced in JDK 1.1 to support the @deprecated tag in documentation comments.

Deprecated_attribute{
	u2 attribute_name_index;
	u4 attribute_length;
}

(3) Code attribute

The Code attribute holds the code of the method body. However, not every method table has a Code attribute. Methods of interfaces and abstract methods, for example, have no concrete method body and therefore no Code attribute. The structure of the Code attribute is shown below:

| Type | Name | Quantity | Meaning |
| --- | --- | --- | --- |
| u2 | attribute_name_index | 1 | Attribute name index |
| u4 | attribute_length | 1 | Attribute length |
| u2 | max_stack | 1 | Maximum depth of the operand stack |
| u2 | max_locals | 1 | Storage space required by the local variable table |
| u4 | code_length | 1 | Length of the bytecode instructions |
| u1 | code | code_length | Bytecode instructions |
| u2 | exception_table_length | 1 | Exception table length |
| exception_info | exception_table | exception_table_length | Exception table |
| u2 | attributes_count | 1 | Attribute counter |
| attribute_info | attributes | attributes_count | Attribute table |

As you can see, the first two entries of the Code attribute match the general attribute format. In other words, the Code attribute follows the general attribute structure, and the fields after the first two are specific to the Code attribute.
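
To make the layout concrete, here is a minimal sketch that walks the fields of a Code attribute in order. The class `CodeAttributeReader` and its sample bytes are invented for illustration. Two details come from the Java Virtual Machine Specification rather than from the table above: each exception table entry consists of four u2 values (8 bytes), and code_length must be greater than 0 and less than 65536. Nested attributes are counted but not decoded.

```java
import java.nio.ByteBuffer;

public class CodeAttributeReader {
    /** Fields of one Code attribute; nested attributes are not decoded in this sketch. */
    public static final class CodeInfo {
        public final int maxStack;
        public final int maxLocals;
        public final byte[] code;
        public final int exceptionTableLength;
        public final int attributesCount;

        CodeInfo(int maxStack, int maxLocals, byte[] code,
                 int exceptionTableLength, int attributesCount) {
            this.maxStack = maxStack;
            this.maxLocals = maxLocals;
            this.code = code;
            this.exceptionTableLength = exceptionTableLength;
            this.attributesCount = attributesCount;
        }
    }

    /** Parses a Code attribute starting at attribute_name_index. */
    public static CodeInfo read(ByteBuffer buf) {
        buf.getShort();                                   // u2 attribute_name_index (not resolved here)
        buf.getInt();                                     // u4 attribute_length (not checked here)
        int maxStack = buf.getShort() & 0xFFFF;           // u2 max_stack
        int maxLocals = buf.getShort() & 0xFFFF;          // u2 max_locals
        int codeLength = buf.getInt();                    // u4 code_length
        if (codeLength <= 0 || codeLength >= 65536) {
            throw new IllegalArgumentException("invalid code_length: " + codeLength);
        }
        byte[] code = new byte[codeLength];               // u1 code[code_length]
        buf.get(code);
        int exceptionTableLength = buf.getShort() & 0xFFFF;
        // Skip the exception table: each entry is four u2 values (8 bytes).
        buf.position(buf.position() + exceptionTableLength * 8);
        int attributesCount = buf.getShort() & 0xFFFF;    // u2 attributes_count
        return new CodeInfo(maxStack, maxLocals, code, exceptionTableLength, attributesCount);
    }

    public static void main(String[] args) {
        byte[] bytes = {
            0x00, 0x09,             // attribute_name_index (hypothetical constant pool slot)
            0x00, 0x00, 0x00, 0x0D, // attribute_length = 13
            0x00, 0x00,             // max_stack = 0
            0x00, 0x01,             // max_locals = 1
            0x00, 0x00, 0x00, 0x01, // code_length = 1
            (byte) 0xB1,            // code: the single opcode 0xb1 (return)
            0x00, 0x00,             // exception_table_length = 0
            0x00, 0x00              // attributes_count = 0
        };
        CodeInfo info = read(ByteBuffer.wrap(bytes));
        System.out.println(info.maxStack + " " + info.maxLocals + " " + info.code.length); // prints 0 1 1
    }
}
```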

(4) InnerClasses attribute

For the sake of illustration, let C denote the class or interface described by a ClassFile structure. If C's constant pool contains at least one CONSTANT_Class_info entry that represents a class or interface that is not a member of a package, then the attribute table of C's ClassFile structure must contain exactly one InnerClasses attribute. The InnerClasses attribute was introduced in JDK 1.1 to support inner classes and inner interfaces.

(5) LineNumberTable attribute

The LineNumberTable attribute is an optional variable-length attribute located in the attribute table of the Code attribute.

The LineNumberTable attribute describes the mapping between Java source line numbers and bytecode offsets. It can be used during debugging to locate the source line that is being executed.

  • start_pc is the bytecode offset; line_number is the Java source line number.

LineNumberTable attributes can appear in any order in the attribute table of the Code attribute. In addition, several LineNumberTable attributes may together describe a single line of the source file. That is, LineNumberTable attributes do not need to correspond one-to-one with source lines.

// Structure of the LineNumberTable attribute:
LineNumberTable_attribute{
    u2 attribute_name_index;
    u4 attribute_length;
    u2 line_number_table_length;
    {
        u2 start_pc;
        u2 line_number;
    } line_number_table[line_number_table_length];
}
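
A debugger or stack trace printer can use this table to turn a bytecode offset into a source line: it picks the entry with the largest start_pc that is not greater than the offset. The following is a minimal sketch of that lookup; the class `LineNumberLookup` and the sample table are made up for illustration, and each entry is represented as a `{start_pc, line_number}` pair.

```java
public class LineNumberLookup {
    /**
     * Returns the source line for the given bytecode offset, or -1 if no entry covers it.
     * Each table row is {start_pc, line_number}; rows do not need to be sorted.
     */
    public static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int line = -1;
        for (int[] entry : table) {
            int startPc = entry[0];
            // Keep the entry whose start_pc is closest to pc without exceeding it.
            if (startPc <= pc && startPc > bestStart) {
                bestStart = startPc;
                line = entry[1];
            }
        }
        return line;
    }

    public static void main(String[] args) {
        // Hypothetical table: offset 0 -> line 3, offset 4 -> line 4, offset 9 -> line 6.
        int[][] table = {{0, 3}, {4, 4}, {9, 6}};
        System.out.println(lineFor(table, 5)); // prints 4
        System.out.println(lineFor(table, 9)); // prints 6
    }
}
```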

(6) LocalVariableTable attribute

LocalVariableTable is an optional variable-length attribute located in the attribute table of the Code attribute. Debuggers use it to determine information about a method's local variables during execution. LocalVariableTable attributes can appear in any order in the attribute table of the Code attribute. Each local variable in the Code attribute can have at most one LocalVariableTable attribute.

  • start_pc and length give the bytecode offsets at which the variable's scope begins and ends (for example, a `this` variable whose scope runs from offset 0 to offset 10).
  • index is the slot of the variable in the local variable table (slots can be reused).
  • name is the variable name.
  • descriptor describes the type of the local variable.

// Structure of the LocalVariableTable attribute:
LocalVariableTable_attribute{
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_table_length;
    {
        u2 start_pc;
        u2 length;
        u2 name_index;
        u2 descriptor_index;
        u2 index;
    } local_variable_table[local_variable_table_length];
}
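
The start_pc and length fields are enough to answer the question a debugger asks most often: which variables are live at a given bytecode offset? The following is a minimal sketch, assuming the interval is start_pc inclusive to start_pc + length exclusive, as the Java Virtual Machine Specification defines it. The class `LocalVariableScope`, its `Entry` type, and the sample entries are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocalVariableScope {
    /** One local_variable_table entry with its name and descriptor already resolved. */
    public static final class Entry {
        public final int startPc;
        public final int length;
        public final String name;
        public final String descriptor;
        public final int index;

        public Entry(int startPc, int length, String name, String descriptor, int index) {
            this.startPc = startPc;
            this.length = length;
            this.name = name;
            this.descriptor = descriptor;
            this.index = index;
        }
    }

    /** Returns the names of the variables whose scope [start_pc, start_pc + length) contains pc. */
    public static List<String> inScope(List<Entry> table, int pc) {
        List<String> names = new ArrayList<>();
        for (Entry e : table) {
            if (pc >= e.startPc && pc < e.startPc + e.length) {
                names.add(e.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical method: "this" lives in slot 0 for offsets 0..9, "i" in slot 1 for offsets 2..9.
        List<Entry> table = Arrays.asList(
                new Entry(0, 10, "this", "LDemo;", 0),
                new Entry(2, 8, "i", "I", 1));
        System.out.println(inScope(table, 1)); // prints [this]
        System.out.println(inScope(table, 5)); // prints [this, i]
    }
}
```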

(7) Signature attribute

The Signature attribute is an optional fixed-length attribute located in the attribute table of the ClassFile, field_info, or method_info structure. It records the generic signature of any class, interface, constructor, method, or field whose declaration in the Java language uses type variables or parameterized types.
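
One visible effect of this attribute is that reflection can still report type arguments after erasure. The following is a minimal sketch; the class `SignatureDemo` and its field `names` are made up for illustration, and the claim that `getGenericType()` draws on the Signature attribute reflects how the reflection API is generally understood to work rather than anything stated in this tutorial.

```java
import java.lang.reflect.Field;
import java.util.List;

public class SignatureDemo {
    // Hypothetical field, declared only so that its generic type can be inspected.
    static List<String> names;

    /** Returns the erased type of the field, which comes from its descriptor. */
    public static String erasedType() {
        return namesField().getType().getName();
    }

    /** Returns the generic type of the field, including its type argument. */
    public static String genericType() {
        return namesField().getGenericType().getTypeName();
    }

    private static Field namesField() {
        try {
            return SignatureDemo.class.getDeclaredField("names");
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedType());  // prints java.util.List
        System.out.println(genericType()); // prints java.util.List<java.lang.String>
    }
}
```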

(8) SourceFile attribute

SourceFile attribute structure

| Type | Name | Quantity | Meaning |
| --- | --- | --- | --- |
| u2 | attribute_name_index | 1 | Attribute name index |
| u4 | attribute_length | 1 | Attribute length |
| u2 | sourcefile_index | 1 | Index of the source file name in the constant pool |

As you can see, the size is fixed: the whole attribute always occupies 8 bytes (2 + 4 + 2), so the value of attribute_length is always 2.
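
The fixed size is easy to check by writing the three fields out by hand. The following is a minimal sketch; the class `SourceFileAttribute` and the constant pool indexes used in `main` are hypothetical values chosen for illustration.

```java
import java.nio.ByteBuffer;

public class SourceFileAttribute {
    /** Encodes a SourceFile attribute: u2 name index, u4 length (always 2), u2 sourcefile_index. */
    public static byte[] encode(int nameIndex, int sourcefileIndex) {
        return ByteBuffer.allocate(8)
                .putShort((short) nameIndex)        // u2 attribute_name_index
                .putInt(2)                          // u4 attribute_length
                .putShort((short) sourcefileIndex)  // u2 sourcefile_index
                .array();
    }

    /** Reads sourcefile_index back from an encoded SourceFile attribute. */
    public static int sourcefileIndex(byte[] attribute) {
        ByteBuffer buf = ByteBuffer.wrap(attribute);
        buf.getShort();                             // skip attribute_name_index
        int length = buf.getInt();
        if (length != 2) {
            throw new IllegalArgumentException("SourceFile attribute_length must be 2, got " + length);
        }
        return buf.getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] bytes = encode(13, 14);              // hypothetical constant pool indexes
        System.out.println(bytes.length);           // prints 8
        System.out.println(sourcefileIndex(bytes)); // prints 14
    }
}
```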

(9) Other attributes

The Java virtual machine predefines more than 20 attributes, and this section does not cover each of them. Once you understand the essence of the attributes above, the remaining ones are easy to interpret.