Exploring the AST

What is an AST?

AST stands for Abstract Syntax Tree.

An AST is an abstract representation of the syntactic structure of source code. It represents that structure as a tree, in which each node corresponds to a construct in the source code. (Source: Wikipedia, "Abstract syntax tree".)

What does an AST look like?

From the explanation above we know that an AST is produced by analyzing source code, so let's use a small piece of code to see what the structure looks like. An online AST preview tool can show the specific AST for this code. Here, @babel/parser is selected as the parser.

🌰 Example: a simple console.log

console.log('hello world');

🌰 Analysis: AST of console.log

As you can see, the AST produced by parsing console.log is actually an object. Expanding it layer by layer, we find object (console) and property (log) under body -> expression -> callee. In simplified form, the tree is: Program -> ExpressionStatement -> CallExpression -> callee (MemberExpression) -> object (console) and property (log).

The object contains many other properties; their meanings are explained in the AST node documentation.
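To make the shape concrete, here is a trimmed, hand-written sketch of the object for the example above. It assumes the node types that @babel/parser uses (File, Program, ExpressionStatement, CallExpression, MemberExpression, Identifier, StringLiteral) and leaves out position data and other metadata, so it is an illustration rather than the parser's full output.

```javascript
// Trimmed, hand-written copy of the AST for: console.log('hello world');
// Position fields (start, end, loc) and other metadata are omitted.
const ast = {
  type: 'File',
  program: {
    type: 'Program',
    body: [
      {
        type: 'ExpressionStatement',
        expression: {
          type: 'CallExpression',
          callee: {
            type: 'MemberExpression',
            object: { type: 'Identifier', name: 'console' },
            property: { type: 'Identifier', name: 'log' },
          },
          arguments: [{ type: 'StringLiteral', value: 'hello world' }],
        },
      },
    ],
  },
};

// Follow body -> expression -> callee, as described above.
const callee = ast.program.body[0].expression.callee;
console.log(callee.object.name);   // console
console.log(callee.property.name); // log
```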

What can an AST do?

If the AST is an object, can we pull out console and log with object destructuring? And now that we can reach the nodes, can we apply other operations to the source code? This is why an AST can be used to write a webpack loader or plugin that pre-processes source code.
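The answer to the first question is yes. As a minimal sketch, assuming a hand-written CallExpression node shaped like the one in the example (the variable names callExpression, objectName, and propertyName are made up for illustration):

```javascript
// Hand-written stand-in for the CallExpression node of console.log('hello world').
const callExpression = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'console' },
    property: { type: 'Identifier', name: 'log' },
  },
  arguments: [{ type: 'StringLiteral', value: 'hello world' }],
};

// Nested destructuring pulls both names out in one statement.
const {
  callee: {
    object: { name: objectName },
    property: { name: propertyName },
  },
} = callExpression;

console.log(`${objectName}.${propertyName}`); // console.log
```

Destructuring like this throws if an intermediate property is missing, so real code would usually check the node type first.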

Common AST use cases include code transformation via Babel, removing unused variables, removing console.log calls, on-demand loading of antd components (babel-plugin-import), syntax highlighting, and so on. In each case, the tool traverses the JavaScript AST, produces a new AST, and returns it.
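As a rough illustration of the "remove console.log" case, the sketch below filters a hand-written Program node in plain JavaScript. The helpers isConsoleLog and removeConsoleLog are hypothetical names, not Babel APIs, and only top-level statements are checked; a real plugin would visit nested nodes as well.

```javascript
// True when a node is a statement of the form console.log(...).
function isConsoleLog(node) {
  if (node.type !== 'ExpressionStatement') return false;
  const call = node.expression;
  return (
    call.type === 'CallExpression' &&
    call.callee.type === 'MemberExpression' &&
    call.callee.object.name === 'console' &&
    call.callee.property.name === 'log'
  );
}

// Return a new Program node without console.log statements (input is not mutated).
function removeConsoleLog(program) {
  return { ...program, body: program.body.filter((node) => !isConsoleLog(node)) };
}

// Hand-written AST for:  console.log('debug');  save();
const program = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: {
          type: 'MemberExpression',
          object: { type: 'Identifier', name: 'console' },
          property: { type: 'Identifier', name: 'log' },
        },
        arguments: [{ type: 'StringLiteral', value: 'debug' }],
      },
    },
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'save' },
        arguments: [],
      },
    },
  ],
};

const cleaned = removeConsoleLog(program);
console.log(cleaned.body.length); // 1
```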

Supplement: the source code execution process

Lexical Analysis (token)

Because the JS engine executes code from top to bottom and from left to right, the first stage of compilation is to scan the source text. The scanner reads the text from left to right and breaks it into words. These words are then passed to the tokenizer, which runs them through a series of recognizers (keyword recognizer, identifier recognizer, constant recognizer, operator recognizer, and so on) to determine the category of each word. The product of this process is the token sequence. Each token is represented as a <type, value> pair, where type is the category of the word and value is its attribute value.

Lexical analysis preview

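console.log('hello world') is split into six tokens: the identifier console, the punctuator ".", the identifier log, the punctuator "(", the string 'hello world', and the punctuator ")".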
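A minimal hand-rolled tokenizer can show how such a token sequence might be produced. This is only a sketch: the function name tokenize and the token type names (Identifier, String, Punctuator) are assumptions for the example, and it recognizes only identifiers, single-quoted strings without escapes, and a few punctuators.

```javascript
// Minimal tokenizer sketch: handles identifiers, single-quoted strings
// (no escapes), and a handful of punctuators. Anything else throws.
function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (/\s/.test(ch)) {
      i++; // skip whitespace
    } else if (/[A-Za-z_$]/.test(ch)) {
      // Identifier recognizer: consume letters, digits, _ and $.
      let end = i + 1;
      while (end < source.length && /[A-Za-z0-9_$]/.test(source[end])) end++;
      tokens.push({ type: 'Identifier', value: source.slice(i, end) });
      i = end;
    } else if (ch === "'") {
      // String recognizer: read up to the closing quote.
      const close = source.indexOf("'", i + 1);
      if (close === -1) throw new SyntaxError('Unterminated string');
      tokens.push({ type: 'String', value: source.slice(i, close + 1) });
      i = close + 1;
    } else if ('.(),;'.includes(ch)) {
      tokens.push({ type: 'Punctuator', value: ch });
      i++;
    } else {
      throw new SyntaxError(`Unexpected character: ${ch}`);
    }
  }
  return tokens;
}

const tokens = tokenize("console.log('hello world')");
// Print each token in the <type, value> form.
console.log(tokens.map((t) => `<${t.type}, ${t.value}>`).join('\n'));
```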

Syntax Analysis (AST)

As I understand it, syntax analysis takes the token sequence produced by lexical analysis and, following the JavaScript grammar rules, outputs the AST (see the "Grammar Naming Rules Reference"). This tree can be traversed freely, and its nodes can be added, removed, modified, and inspected, because the tree is generated before the code is turned into a form the machine executes.
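The step from tokens to a tree can be sketched with a tiny recursive-descent parser. The grammar below is an assumption made up for this example and covers only calls such as console.log('hello world'); the function name parse and the token shapes are likewise hypothetical, and real JavaScript parsers handle far more.

```javascript
// Tiny recursive-descent parser sketch.
// Grammar (an assumption for this example, far smaller than real JavaScript):
//   Call   := Member '(' [ String { ',' String } ] ')'
//   Member := Identifier { '.' Identifier }
function parse(tokens) {
  let pos = 0;
  const peek = () => tokens[pos];
  const expect = (type, value) => {
    const token = tokens[pos++];
    if (!token || token.type !== type || (value !== undefined && token.value !== value)) {
      throw new SyntaxError(`Expected ${value ?? type} at token ${pos - 1}`);
    }
    return token;
  };
  const isPunctuator = (value) => peek()?.type === 'Punctuator' && peek().value === value;

  function parseMember() {
    let node = { type: 'Identifier', name: expect('Identifier').value };
    while (isPunctuator('.')) {
      pos++; // consume '.'
      const property = { type: 'Identifier', name: expect('Identifier').value };
      node = { type: 'MemberExpression', object: node, property };
    }
    return node;
  }

  function parseCall() {
    const callee = parseMember();
    expect('Punctuator', '(');
    const args = [];
    while (!isPunctuator(')')) {
      if (args.length > 0) expect('Punctuator', ',');
      const raw = expect('String').value;
      args.push({ type: 'StringLiteral', value: raw.slice(1, -1) }); // strip quotes
    }
    expect('Punctuator', ')');
    return { type: 'CallExpression', callee, arguments: args };
  }

  const expression = parseCall();
  if (pos < tokens.length) throw new SyntaxError('Unexpected trailing tokens');
  return { type: 'Program', body: [{ type: 'ExpressionStatement', expression }] };
}

// Hand-written token sequence for console.log('hello world').
const tokens = [
  { type: 'Identifier', value: 'console' },
  { type: 'Punctuator', value: '.' },
  { type: 'Identifier', value: 'log' },
  { type: 'Punctuator', value: '(' },
  { type: 'String', value: "'hello world'" },
  { type: 'Punctuator', value: ')' },
];

const ast = parse(tokens);
console.log(JSON.stringify(ast.body[0].expression.callee.property));
// {"type":"Identifier","name":"log"}
```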


Now that we have a basic understanding of how source code goes from parsing to execution, we can try writing a simple webpack loader or plugin, either to practice what we have learned or to intervene directly at specific points in the process:

parse -> traverse -> generate
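The three stages can be sketched end to end in plain JavaScript. This is a toy under stated assumptions: the parse step is replaced by a hand-written AST, and traverse and generate are hypothetical helpers written for this example, not the APIs of @babel/traverse or @babel/generator. The generator supports only the node types that appear in the example.

```javascript
// Step 1 (parse): represented here by a hand-written AST for console.log('hello world');
const ast = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: {
          type: 'MemberExpression',
          object: { type: 'Identifier', name: 'console' },
          property: { type: 'Identifier', name: 'log' },
        },
        arguments: [{ type: 'StringLiteral', value: 'hello world' }],
      },
    },
  ],
};

// Step 2 (traverse): visit every node depth-first and call the visitor
// registered for that node's type, if there is one.
function traverse(node, visitors) {
  if (Array.isArray(node)) {
    node.forEach((child) => traverse(child, visitors));
    return;
  }
  if (node === null || typeof node !== 'object') return;
  if (typeof node.type === 'string' && visitors[node.type]) visitors[node.type](node);
  Object.values(node).forEach((child) => traverse(child, visitors));
}

// Step 3 (generate): print the small node subset used here back to source text.
function generate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'ExpressionStatement':
      return `${generate(node.expression)};`;
    case 'CallExpression':
      return `${generate(node.callee)}(${node.arguments.map(generate).join(', ')})`;
    case 'MemberExpression':
      return `${generate(node.object)}.${generate(node.property)}`;
    case 'Identifier':
      return node.name;
    case 'StringLiteral':
      return `'${node.value}'`;
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// Example intervention: rewrite console.log(...) into console.warn(...).
traverse(ast, {
  MemberExpression(node) {
    if (node.object.name === 'console' && node.property.name === 'log') {
      node.property.name = 'warn';
    }
  },
});

console.log(generate(ast)); // console.warn('hello world');
```

Mutating nodes in place keeps the sketch short; a transform that must preserve the original tree would copy nodes instead.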