Contents

  • Preface
  • How do I read the class file
  • Basic concepts
    • Unsigned numbers & tables
    • Magic number & version number
    • Constant pool
    • Access flags
    • Class index, superclass index & interface index collection
    • Field table collection
    • Method table collection
    • Attribute table collection
    • Code attribute
  • Parse class files using javap
  • Reference tables

Preface

I recently studied reflection and dynamic proxies in Java and found that using these two features well requires an understanding of the bytecode in .class files. I then read up on the topic and am collecting my notes in this post, which also serves as a record of my own learning.

How do I read the class file

Java's portability rests on the fact that .java files are compiled into platform-independent bytecode files. The same .class file can run on JVMs on different operating systems. A .class file is a binary stream organized in 8-bit bytes; its data items are arranged in a strict, compact order, without any delimiters.

When a .java file is compiled, a .class file is generated in the specified output path. It can be opened directly in EditPlus using the Hex Viewer mode.
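If no hex editor is at hand, the same view can be approximated in Java. The following is only a minimal sketch: the class name HexDump and its format method are made up for illustration, and the file path is taken from the first command-line argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the JDK: prints a file as rows of hex bytes.
final class HexDump {
    /** Formats bytes as two-digit uppercase hex values, bytesPerLine values per row. */
    static String format(byte[] data, int bytesPerLine) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(i % bytesPerLine == 0 ? '\n' : ' ');
            }
            // Java bytes are signed, so mask to get the unsigned value 0..255.
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java HexDump <path to .class file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(format(bytes, 16));
    }
}
```

For any valid .class file, the first row of the output starts with CA FE BA BE.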

ClassTest.java

package com.classloader;

public class ClassTest {
    public static void main(String[] args) {
        System.out.println("Hello,World!");
    }
}

ClassTest.class

To be able to read a class file, you first need to understand some basic concepts of the .class file format.

Basic concepts

Unsigned numbers & tables

Unsigned numbers are the basic data type. u1, u2, u4, and u8 denote unsigned numbers of 1, 2, 4, and 8 bytes respectively. Unsigned numbers are used to describe numbers, index references, quantity values, or string values encoded in UTF-8, and can be regarded as the basic unit of a .class file.
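Java has no unsigned byte, short, or int types, so reading u1, u2, and u4 values takes a little masking. A minimal sketch, assuming the class file bytes are wrapped in a java.nio.ByteBuffer (the class name UnsignedReader is hypothetical):

```java
import java.nio.ByteBuffer;

// Hypothetical helper for reading the unsigned types used in class files.
final class UnsignedReader {
    // Class files are big-endian, which is also ByteBuffer's default byte order.

    /** Reads a u1 (1 byte) as a value in 0..255. */
    static int u1(ByteBuffer in) {
        return in.get() & 0xFF;
    }

    /** Reads a u2 (2 bytes) as a value in 0..65535. */
    static int u2(ByteBuffer in) {
        return in.getShort() & 0xFFFF;
    }

    /** Reads a u4 (4 bytes); a long is needed to hold values above Integer.MAX_VALUE. */
    static long u4(ByteBuffer in) {
        return in.getInt() & 0xFFFFFFFFL;
    }
}
```

Without the masks, a byte such as 0xFF would be read as -1 instead of 255.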

A table is a compound data type composed of multiple unsigned numbers or other tables; in essence, the entire .class file is one table. (Table names conventionally end with _info.)

For both unsigned numbers and tables, when several items of the same type but of variable quantity need to be described, the usual form is a leading capacity counter followed by that many consecutive data items.
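The "counter plus items" layout can be sketched as follows. This is an illustration only: the helper name readU2Array is made up, and it assumes the simplest case, a u2 counter followed by that many u2 items.

```java
import java.nio.ByteBuffer;

// Hypothetical helper showing the "capacity counter + items" layout.
final class CountedItems {
    /** Reads a u2 counter, then that many consecutive u2 items. */
    static int[] readU2Array(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;       // the capacity counter
        int[] items = new int[count];
        for (int i = 0; i < count; i++) {
            items[i] = in.getShort() & 0xFFFF;    // one data item
        }
        return items;
    }
}
```

For the bytes 00 02 00 05 00 07, the counter is 2 and the items are 5 and 7.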

Magic number & version number

The first four bytes of every .class file are called the “magic number”. Its only purpose is to determine whether the file is a .class file that the virtual machine (HotSpot, for example) can accept.
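A check of the magic number might look like the sketch below. The value 0xCAFEBABE comes from the JVM specification rather than from the text above, and the class name MagicCheck is hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: checks the first four bytes of a class file.
final class MagicCheck {
    /** The magic number defined by the JVM specification. */
    static final int MAGIC = 0xCAFEBABE;

    /** Returns true if the data is at least 4 bytes long and starts with CA FE BA BE. */
    static boolean hasClassMagic(byte[] data) {
        return data.length >= 4 && ByteBuffer.wrap(data).getInt() == MAGIC;
    }
}
```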

The four bytes following the magic number hold the version number: the fifth and sixth bytes are the “minor version number”, and the seventh and eighth bytes are the “major version number”. Java major version numbers start at 45, and each major JDK release after JDK 1.1 increases the major version number by 1. Newer JDKs remain backward compatible with .class files from older versions, but refuse to run .class files whose version is newer than their own. (Note: in the hex viewer, all numbers in the .class file are shown in hexadecimal.)
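As a worked example of the version rule, the sketch below reads the two version fields and maps a major version to a JDK release. The class and method names are hypothetical; the mapping simply applies "45 is JDK 1.1, then +1 per major release".

```java
import java.nio.ByteBuffer;

// Hypothetical helper: reads the version fields that follow the magic number.
final class ClassVersion {
    /** Returns {minor, major} read from bytes 5 to 8 of a class file. */
    static int[] read(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        in.position(4);                        // skip the 4-byte magic number
        int minor = in.getShort() & 0xFFFF;    // bytes 5 and 6
        int major = in.getShort() & 0xFFFF;    // bytes 7 and 8
        return new int[] {minor, major};
    }

    /** Maps a major version to a JDK release name: 45 is "1.1", 48 is "1.4", 49 is "5". */
    static String jdkName(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("Major versions start at 45: " + major);
        }
        int release = major - 44;
        return release <= 4 ? "1." + release : String.valueOf(release);
    }
}
```

For the bytes CA FE BA BE 00 00 00 34, read returns minor 0 and major 52 (0x34), which jdkName maps to "8".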

Constant pool

Immediately after the major version number comes the constant pool entry. The constant pool is the data item most closely associated with other items in the .class file structure, and is also one of the items that occupies the most space. Since the number of constants is not fixed, the constant pool begins with a u2 item holding the constant pool capacity count. This count starts at 1 rather than 0 (unlike the usual programming convention).

The constant pool count of the .class file in the figure above is 0x22, which is 34 in decimal. Because counting starts at 1, the pool holds 33 constants, with indexes 1 to 33. In other words, the 33 tables following the count bytes are the constants.

The constant pool holds two main kinds of constants: literals and symbolic references. Literals are close to the notion of constants at the Java language level, such as text strings and values declared final. Symbolic references cover three kinds of constants: fully qualified names of classes and interfaces, field names and descriptors, and method names and descriptors.

As mentioned earlier, every constant in the constant pool is a table, and there are 11 table types with different structures (class file versions from JDK 7 onward add more). They all share one feature: the first item is a tag of type u1 that indicates the type of the current constant. (For details, see [Reference tables].)

To summarize, a constant is read as follows:

1. The first byte is the tag. Look it up in the constant pool type table to determine the constant's type.
2. Find the table structure for that type, and read the remaining unsigned numbers that follow the tag according to that structure.
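The two steps above can be sketched as a loop over the constant pool. This is a simplified illustration, not a full parser: the class name ConstantPoolWalker is hypothetical, the tag values and entry layouts are taken from the JVM specification, and Utf8 entries are decoded as plain UTF-8 even though class files use a slightly modified encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch; class and method names are hypothetical.
final class ConstantPoolWalker {
    private static int u2(ByteBuffer in) {
        return in.getShort() & 0xFFFF;
    }

    /** Returns one description per constant pool slot; slot 0 is always null. */
    static String[] walk(byte[] classFile) {
        ByteBuffer in = ByteBuffer.wrap(classFile);   // big-endian by default
        in.position(8);                               // skip magic number and version numbers
        int count = u2(in);                           // constant pool capacity count
        String[] pool = new String[count];
        for (int i = 1; i < count; i++) {             // indexes run from 1 to count - 1
            int tag = in.get() & 0xFF;                // step 1: the u1 tag
            switch (tag) {                            // step 2: the structure for that tag
                case 1: {
                    byte[] bytes = new byte[u2(in)];  // u2 length, then the bytes
                    in.get(bytes);
                    // Simplification: class files use "modified UTF-8".
                    pool[i] = "Utf8 " + new String(bytes, StandardCharsets.UTF_8);
                    break;
                }
                case 3:  pool[i] = "Integer " + in.getInt(); break;
                case 4:  pool[i] = "Float " + in.getFloat(); break;
                case 5:  pool[i] = "Long " + in.getLong(); i++; break;     // occupies two slots
                case 6:  pool[i] = "Double " + in.getDouble(); i++; break; // occupies two slots
                case 7:  pool[i] = "Class #" + u2(in); break;
                case 8:  pool[i] = "String #" + u2(in); break;
                case 9:  pool[i] = "Fieldref #" + u2(in) + ".#" + u2(in); break;
                case 10: pool[i] = "Methodref #" + u2(in) + ".#" + u2(in); break;
                case 11: pool[i] = "InterfaceMethodref #" + u2(in) + ".#" + u2(in); break;
                case 12: pool[i] = "NameAndType #" + u2(in) + ":#" + u2(in); break;
                // Tags added after the original 11 (JDK 7 and later).
                case 15: pool[i] = "MethodHandle " + (in.get() & 0xFF) + ":#" + u2(in); break;
                case 16: pool[i] = "MethodType #" + u2(in); break;
                case 17: pool[i] = "Dynamic #" + u2(in) + ":#" + u2(in); break;
                case 18: pool[i] = "InvokeDynamic #" + u2(in) + ":#" + u2(in); break;
                case 19: pool[i] = "Module #" + u2(in); break;
                case 20: pool[i] = "Package #" + u2(in); break;
                default: throw new IllegalArgumentException("Unknown constant tag " + tag);
            }
        }
        return pool;
    }
}
```

Note that a Long or Double constant takes up two index slots, which is why the loop skips one extra index after reading either of them.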

Access flags

After the constant pool ends, the next two bytes are the access flags (access_flags). They identify class-level or interface-level access information, including whether this is a class or an interface, whether it is public, whether it is abstract, and, for a class, whether it is declared final. For the meaning of each flag, see [Reference tables].
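Each flag is one bit of the u2 value, so decoding is a series of bit tests. A minimal sketch, with the class-level flag values taken from the JVM specification (the class name ClassAccessFlags is hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decodes the class-level access_flags value.
final class ClassAccessFlags {
    /** Returns the names of all class-level flags set in the given u2 value. */
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & 0x0001) != 0) names.add("ACC_PUBLIC");
        if ((flags & 0x0010) != 0) names.add("ACC_FINAL");
        if ((flags & 0x0020) != 0) names.add("ACC_SUPER");
        if ((flags & 0x0200) != 0) names.add("ACC_INTERFACE");
        if ((flags & 0x0400) != 0) names.add("ACC_ABSTRACT");
        if ((flags & 0x1000) != 0) names.add("ACC_SYNTHETIC");
        if ((flags & 0x2000) != 0) names.add("ACC_ANNOTATION");
        if ((flags & 0x4000) != 0) names.add("ACC_ENUM");
        return names;
    }
}
```

For example, the value 0x0021 decodes to ACC_PUBLIC and ACC_SUPER, which is what an ordinary public class usually carries.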

Class index, superclass index & interface index collection

The class index (this_class) and the superclass index (super_class) are each a u2 value, and the interface index collection (interfaces) is a set of u2 values preceded by a u2 counter. The access flags are followed by the class index and the superclass index, which together take up four bytes. Each of them points to a class descriptor constant of type CONSTANT_Class_info. The index stored in that CONSTANT_Class_info constant in turn leads to a CONSTANT_Utf8_info constant holding the fully qualified name string.
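The two-step lookup (class index to CONSTANT_Class_info, then its name index to CONSTANT_Utf8_info) can be sketched with a simplified model of the constant pool. Everything here is an assumption made for illustration: the class name ClassNameResolver, the two maps standing in for the parsed pool, and the index values used in the example.

```java
import java.util.Map;

// Hypothetical model: the constant pool is represented by two lookup maps.
final class ClassNameResolver {
    /**
     * @param classIndex   value of this_class or super_class
     * @param classEntries pool index of each CONSTANT_Class_info mapped to its name index
     * @param utf8Entries  pool index of each CONSTANT_Utf8_info mapped to its string
     * @return the class name with '/' replaced by '.'
     */
    static String resolve(int classIndex,
                          Map<Integer, Integer> classEntries,
                          Map<Integer, String> utf8Entries) {
        Integer nameIndex = classEntries.get(classIndex);
        if (nameIndex == null) {
            throw new IllegalArgumentException("#" + classIndex + " is not a CONSTANT_Class_info");
        }
        String internalName = utf8Entries.get(nameIndex);
        if (internalName == null) {
            throw new IllegalArgumentException("#" + nameIndex + " is not a CONSTANT_Utf8_info");
        }
        // The class file stores names with slashes, e.g. com/classloader/ClassTest.
        return internalName.replace('/', '.');
    }
}
```

If this_class were 5, entry #5 were a CONSTANT_Class_info with name index 6, and entry #6 held com/classloader/ClassTest, the result would be com.classloader.ClassTest.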

Field table collection

The two bytes following the interface index collection are fields_count, which states how many field tables the field table collection contains. Each field table corresponds to one field declared in the .java file, and its items describe the field's modifiers, name, descriptor, and so on. For the structure of the field table and the types of its items, see [Reference tables].

Method table collection

After the field table collection, the next two bytes are methods_count, which states how many method tables the method table collection contains. Each method table corresponds to one method in the .java file, and its items describe the method's modifiers, name, descriptor, and so on. For the structure of the method table and the types of its items, see [Reference tables].
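Field tables and method tables share the same layout in the JVM specification: access_flags, name_index, descriptor_index, and then a counted list of attributes. The sketch below reads such a collection and skips over the attribute contents. The class name MemberTableReader is hypothetical, and the ByteBuffer is assumed to be positioned at fields_count or methods_count.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads a field or method table collection.
final class MemberTableReader {
    /** Returns {access_flags, name_index, descriptor_index} for each member. */
    static List<int[]> readMembers(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;            // fields_count or methods_count
        List<int[]> members = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int accessFlags = in.getShort() & 0xFFFF;
            int nameIndex = in.getShort() & 0xFFFF;
            int descriptorIndex = in.getShort() & 0xFFFF;
            skipAttributes(in);
            members.add(new int[] {accessFlags, nameIndex, descriptorIndex});
        }
        return members;
    }

    /** Skips a counted list of attributes without interpreting them. */
    static void skipAttributes(ByteBuffer in) {
        int attributesCount = in.getShort() & 0xFFFF;
        for (int i = 0; i < attributesCount; i++) {
            in.getShort();                             // attribute_name_index (u2)
            int length = in.getInt();                  // attribute_length (u4), assumed below 2^31
            in.position(in.position() + length);       // jump over the attribute body
        }
    }
}
```

The name and descriptor indexes returned here point into the constant pool, where the actual strings are stored.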

Attribute table collection

The two bytes following the method table collection are attributes_count, which introduces the attribute table collection. A .class file has many kinds of attributes, and the name of each attribute is given by referencing a CONSTANT_Utf8_info constant in the constant pool.

Code attribute

After javac compiles the code in a Java method body, the resulting bytecode instructions are stored in the Code attribute. This leads on to the bytecode execution engine, which will be covered in other posts. The Code attribute is stored in the attribute table collection of a method table; see [Reference tables] for its structure.

Parse class files using javap

To parse .class files, the JDK provides a class file parsing tool called javap. Its output is intuitive and easy to read. Combined with the concepts described above, it should not be difficult to understand. To use it, type the following in CMD:

javap -verbose <class name>

The output looks something like this (using ClassTest.class as an example):

Reference tables

Constant pool type table

All structure types

Access flags

Field table structure

Field table access flag

Each flag's meaning matches the second half of its name, which is the corresponding field modifier.

Meaning of descriptor characters

For array types, each dimension is described with a leading “[”. For example, java.lang.String[][] is recorded as [[Ljava/lang/String; and int[] is recorded as [I.
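The descriptor rules can be sketched as a small decoder that turns a field descriptor back into a Java type name. The class name DescriptorDecoder is hypothetical, and the base type characters (B, C, D, F, I, J, S, Z) are taken from the JVM specification.

```java
// Hypothetical helper: converts a field descriptor into a Java type name.
final class DescriptorDecoder {
    static String toJavaType(String descriptor) {
        // Count the leading '[' characters: one per array dimension.
        int dims = 0;
        while (dims < descriptor.length() && descriptor.charAt(dims) == '[') {
            dims++;
        }
        String element = descriptor.substring(dims);
        String base;
        switch (element) {
            case "B": base = "byte"; break;
            case "C": base = "char"; break;
            case "D": base = "double"; break;
            case "F": base = "float"; break;
            case "I": base = "int"; break;
            case "J": base = "long"; break;
            case "S": base = "short"; break;
            case "Z": base = "boolean"; break;
            default:
                // Object types look like Ljava/lang/String; (note the trailing semicolon).
                if (element.length() > 2 && element.startsWith("L") && element.endsWith(";")) {
                    base = element.substring(1, element.length() - 1).replace('/', '.');
                } else {
                    throw new IllegalArgumentException("Bad descriptor: " + descriptor);
                }
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }
}
```

With this sketch, [[Ljava/lang/String; decodes to java.lang.String[][] and [I decodes to int[].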

Method table structure

Method access flag

Code attribute

14 January 2020, 17:03:40
