What is an abstract syntax tree?

In computer science, an abstract syntax tree (AST) is a tree-like representation of the abstract syntactic structure of source code.

The browsers we use every day convert JS code into an abstract syntax tree before further analysis and other operations. Converting JS into an abstract syntax tree therefore makes the program easier to analyze.

As shown in the figure:

The variable declaration statement on the left of the figure looks like the structure on the right after being converted to an AST.

Look at the left side first:

var is a keyword

AST is an identifier

= is Equal; the equals sign takes several forms, which we will see later

"is tree" is a string

; is Semicolon

Let’s look at the right picture again:

First, the abstract syntax tree that a piece of code is converted into is an object whose top-level type attribute is 'Program'. Its second attribute is body, which is an array.

Each item in the body array is an object that contains all the descriptive information for one statement:

	type: the type of the statement -- a variable declaration statement
	kind: the keyword of the variable declaration -- var
	declarations: array of declaration contents, each item of which is also an object
		type: the type of the declaration
		id: object describing the variable name
			type: the type
			name: the name of the variable
		init: object describing the initial value
			type: the type
			value: the value, "is tree" without quotes
			raw: "\"is tree\"" with quotes
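To make the structure above concrete, here is a hand-written sketch of the tree for the statement var AST = "is tree";. It follows the ESTree shape that parsers such as esprima emit, but the object is typed out by hand rather than produced by a parser, so treat it as an illustration rather than exact parser output.

```javascript
// Hand-written AST for: var AST = "is tree";
// (illustrative only; a real parser attaches extra fields such as sourceType)
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'AST' },
          init: { type: 'Literal', value: 'is tree', raw: '"is tree"' }
        }
      ]
    }
  ]
};

const statement = ast.body[0];
const declarator = statement.declarations[0];
console.log(statement.kind);        // var
console.log(declarator.id.name);    // AST
console.log(declarator.init.value); // is tree
console.log(declarator.init.raw);   // "is tree"
```

Reading the tree is plain property access, which is why an AST is so convenient for tools: every piece of the statement has a named place.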

What are the uses of abstract syntax trees?

Code syntax checking, code style checking, code formatting, code highlighting, code error prompts, code completion, and so on. For example:

JSLint and JSHint check code for errors and style problems and find potential mistakes

IDEs use it for error prompts, formatting, highlighting, and autocompletion

Code obfuscation and compression, such as UglifyJS2

Optimizing and changing code, restructuring it to reach the desired structure. For example:

Converting between module specifications such as CommonJS, AMD, CMD, and UMD

Converting CoffeeScript, TypeScript, JSX, and similar code into native JavaScript

What tools or libraries can be used to convert source code into abstract syntax trees?

A JavaScript parser converts JavaScript source code into an abstract syntax tree.

The browser's parser converts JS source code into an abstract syntax tree, which is then converted into bytecode or compiled directly into machine code.

Generally speaking, every JS engine has its own abstract syntax tree format, for example Chrome's V8 engine and Firefox's SpiderMonkey engine. MDN provides a detailed description of the SpiderMonkey AST format, which has become the de facto industry standard. SpiderMonkey, part of the Mozilla project, is a JavaScript engine implemented in C/C++. To run JavaScript code in SpiderMonkey, an application must have three elements: a JSRuntime, a JSContext, and a global object.

Common JavaScript parsers

esprima

traceur

acorn

shift

Let’s take esprima as an example

Installation

 npm install esprima estraverse escodegen -S

Working with esprima involves three libraries, whose names and roles are as follows:

Esprima transforms source code into an abstract syntax tree


	let esprima = require('esprima'); // import esprima
	let jsOrigin = 'function eat(){};'; // define a piece of JS source
	let AST = esprima.parse(jsOrigin); // parse converts the JS source into an abstract syntax tree
	console.log(AST); // print the generated abstract syntax tree
	/*
	{
	    type: 'Program', // top-level type attribute
	    body: [
	        FunctionDeclaration {
	            type: 'FunctionDeclaration',
	            id: [Identifier],
	            params: [],
	            body: [BlockStatement],
	            generator: false, // whether it is a generator function
	            expression: false, // whether it is an expression
	            async: false // whether it is an async function
	        },
	        EmptyStatement { type: 'EmptyStatement' }
	    ],
	    sourceType: 'script'
	}
	*/

Estraverse traverses and updates the abstract syntax tree

Before introducing this library, take a look at its page on npm: it has more than 5 million downloads and yet no README documentation.

Before we walk through the abstract syntax tree, we need to understand the order in which it is traversed: estraverse walks the tree depth-first.

	
	let estraverse = require('estraverse');
	

	estraverse.traverse(AST, {
	    enter(node){
	        console.log('enter', node.type)
	        if(node.type === 'Identifier') {
	            node.name += '_enter'
	        }
	    },
	    leave(node){
	        console.log('leave', node.type)
	        if(node.type === 'Identifier') {
	            node.name += '_leave'
	        }
	    }
	})
	
	// enter Program
	// enter FunctionDeclaration
	// enter Identifier
	// leave Identifier
	// enter BlockStatement
	// leave BlockStatement
	// leave FunctionDeclaration
	// enter EmptyStatement
	// leave EmptyStatement
	// leave Program


From the node types printed above, it is easy to see that every node of the abstract syntax tree is visited twice, once on entering and once on leaving. The following figure makes the traversal order clearer.
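The enter/leave order can also be reproduced without any library. The sketch below is a hypothetical walk helper, not estraverse's implementation: it assumes that any object with a string type property is a node and follows every property, whereas estraverse uses a table of child keys per node type. The tree is written by hand for function eat(){};.

```javascript
// Minimal depth-first walker (hypothetical helper, not estraverse itself).
// Assumption: any object with a string `type` property is an AST node.
function walk(node, visitor) {
  if (visitor.enter) visitor.enter(node);
  for (const key of Object.keys(node)) {
    const child = node[key];
    if (Array.isArray(child)) {
      child.forEach(item => {
        if (item && typeof item.type === 'string') walk(item, visitor);
      });
    } else if (child && typeof child.type === 'string') {
      walk(child, visitor);
    }
  }
  if (visitor.leave) visitor.leave(node);
}

// Hand-written AST for: function eat(){};
const tree = {
  type: 'Program',
  body: [
    {
      type: 'FunctionDeclaration',
      id: { type: 'Identifier', name: 'eat' },
      params: [],
      body: { type: 'BlockStatement', body: [] }
    },
    { type: 'EmptyStatement' }
  ]
};

const events = [];
walk(tree, {
  enter(node) { events.push('enter ' + node.type); },
  leave(node) { events.push('leave ' + node.type); }
});
console.log(events.join('\n'));
```

A node's leave hook runs only after all of its children have been entered and left, which is what produces the nested order in the printed log.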

Given this traversal order, the identifier's name is changed once when the node is entered and once more when it is left. Next we need to verify that the name on the AST node really was changed. There are two options. Option one is to print the abstract syntax tree directly; this is simple, so I won't demonstrate it here. Option two is to convert the modified abstract syntax tree back into source code and check whether the name was changed, which makes the result clear at a glance. So how do we turn an abstract syntax tree back into source code? This brings us to the third library, escodegen.

Escodegen restores the abstract syntax tree to JS source code


	let escodegen = require('escodegen');
	
	let originReback = escodegen.generate(AST);
	console.log(originReback);
    // function eat_enter_leave() {};


From the source code restored above we can see that the variable name has indeed been changed.
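Code generation is essentially a recursive print of the tree, one rule per node type. The sketch below is a hypothetical generate helper that covers only the five node types in this example; escodegen handles the full grammar and its own formatting, so the exact whitespace differs.

```javascript
// Minimal code generator for a handful of node types
// (hypothetical helper; escodegen covers the full grammar and formatting).
function generate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'FunctionDeclaration':
      return 'function ' + generate(node.id) +
        '(' + node.params.map(generate).join(', ') + ') ' +
        generate(node.body);
    case 'BlockStatement':
      return '{' + node.body.map(generate).join(' ') + '}';
    case 'Identifier':
      return node.name;
    case 'EmptyStatement':
      return ';';
    default:
      throw new Error('Unsupported node type: ' + node.type);
  }
}

// Hand-written AST of the function after the rename in the traversal step.
const renamed = {
  type: 'Program',
  body: [
    {
      type: 'FunctionDeclaration',
      id: { type: 'Identifier', name: 'eat_enter_leave' },
      params: [],
      body: { type: 'BlockStatement', body: [] }
    },
    { type: 'EmptyStatement' }
  ]
};

console.log(generate(renamed));
// function eat_enter_leave() {}
// ;
```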

Now let's explore how to use abstract syntax trees to convert arrow functions into normal functions.

We all know that Babel is what we use to convert ES6 syntax into ES5 syntax, so let's see how Babel converts an arrow function into a normal function.

First we need two Babel packages: babel-core, the core module, and babel-types, the types module.

	npm i babel-core babel-types -S


Next, we compare the abstract syntax trees of a normal function and an arrow function, find the differences, and then change as little as possible, reusing nodes wherever we can, so that the arrow function is converted into a normal function.

Let’s take this arrow function as an example:


	let sum = (a,b) => a+b;
	------>
	var sum = function sum(a, b) {
	  return a + b;
	};

As the comparison shows, the ASTs of the normal function and the arrow function differ in the init node, so what we need to do is convert the arrow function's ArrowFunctionExpression into a FunctionExpression.

We use babel-types to generate a new partial AST that replaces the original one. To find out how to build a particular node, look it up in the babel-types documentation.
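The difference in init can be seen on plain objects. The sketch below writes the arrow function's init node by hand in ESTree shape and uses a hypothetical toFunctionExpression helper to build the replacement node. It reuses params and wraps an expression body in a return statement; it deliberately ignores the different this binding of arrow functions, which a real transform has to handle.

```javascript
// init node of: let sum = (a, b) => a + b;  (hand-written, ESTree shape)
const arrow = {
  type: 'ArrowFunctionExpression',
  id: null,
  params: [
    { type: 'Identifier', name: 'a' },
    { type: 'Identifier', name: 'b' }
  ],
  body: {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Identifier', name: 'a' },
    right: { type: 'Identifier', name: 'b' }
  },
  expression: true
};

// Hypothetical helper: build the equivalent FunctionExpression node.
// params are reused as-is; an expression body is wrapped in `return`.
function toFunctionExpression(node) {
  const body = node.body.type === 'BlockStatement'
    ? node.body
    : {
        type: 'BlockStatement',
        body: [{ type: 'ReturnStatement', argument: node.body }]
      };
  return {
    type: 'FunctionExpression',
    id: null,
    params: node.params,
    body,
    generator: false,
    async: false
  };
}

const fn = toFunctionExpression(arrow);
console.log(fn.type);                    // FunctionExpression
console.log(fn.body.body[0].type);       // ReturnStatement
console.log(fn.params === arrow.params); // true
```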

	const babel = require('babel-core'); // Babel core module
	const types = require('babel-types'); // types module
	let code = 'let sum = (a,b) => a+b;';
	// the ES5 form we want to end up with, for reference
	let es5Code = function (a,b) {
	    return a+b;
	};
	// Babel transforms use the visitor pattern: a visitor handles
	// one specific node type of an object or a group of objects
	let visitor = {
	    ArrowFunctionExpression(path) {
	        // reuse the params
	        let params = path.node.params;
	        // wrap the expression body in a return statement
	        // (assumes the arrow function body is an expression, as in this example)
	        let blockStatement = types.blockStatement([types.returnStatement(path.node.body)])
	        let func = types.functionExpression(null, params, blockStatement, false, false);
	        path.replaceWith(func)

	    }
	};

	let arrayPlugin = {visitor};
	// Inside Babel, the code is first converted to an AST and then traversed
	let result = babel.transform(code, {
	    plugins: [
	        arrayPlugin
	    ]
	});
	
	console.log(result.code);
    // let sum = function (a, b) {
    //     return a + b;
    // };


Let's write a Babel plugin that pre-computes constant expressions


	let code = `const result = 1000 * 60 * 60 * 24`;
	//let code = `const result = 1000 * 60`;
	let babel = require('babel-core');
	let types = require('babel-types');
	// pre-computation
	let visitor = {
	    BinaryExpression(path){
	        let node = path.node;
	        if(!isNaN(node.left.value) && !isNaN(node.right.value)){
	            let result = eval(node.left.value+node.operator+node.right.value);
	            result = types.numericLiteral(result);
	            path.replaceWith(result);
	            // If the parent of this expression is also a binary expression, evaluate it recursively
	            if(path.parentPath.node.type == 'BinaryExpression'){
	                visitor.BinaryExpression.call(null,path.parentPath);
	            }
	        }
	    }
	}
	let r = babel.transform(code,{
	    plugins:[
	        {visitor}
	    ]
	});
	console.log(r.code);

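The same pre-computation can be sketched on a hand-written tree without Babel. This is a variation, not the plugin above: it swaps eval for a lookup table of operators and folds bottom-up (children first) instead of re-visiting the parent after each replacement. The num, mul, and fold helpers are hypothetical names introduced for the illustration.

```javascript
// Hand-written AST fragment for: 1000 * 60 * 60 * 24
const num = value => ({ type: 'Literal', value });
const mul = (left, right) =>
  ({ type: 'BinaryExpression', operator: '*', left, right });
const expr = mul(mul(mul(num(1000), num(60)), num(60)), num(24));

// Lookup table used instead of eval (an assumption of this sketch).
const operators = {
  '+': (a, b) => a + b,
  '-': (a, b) => a - b,
  '*': (a, b) => a * b,
  '/': (a, b) => a / b
};

// Hypothetical helper: fold the children first, then the node itself.
function fold(node) {
  if (node.type !== 'BinaryExpression') return node;
  const left = fold(node.left);
  const right = fold(node.right);
  const apply = operators[node.operator];
  const bothNumbers =
    left.type === 'Literal' && typeof left.value === 'number' &&
    right.type === 'Literal' && typeof right.value === 'number';
  if (apply && bothNumbers) {
    return num(apply(left.value, right.value));
  }
  // Not foldable: keep the node, but with its folded children.
  return Object.assign({}, node, { left, right });
}

console.log(fold(expr).value); // 86400000
```

Folding children first means a chain such as 1000 * 60 * 60 * 24 collapses in a single pass, and an expression that contains an identifier is simply left in place with whatever constant parts could be reduced.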

That is my understanding of abstract syntax trees. If anything here is incorrect, please point it out.

Don't get bored with the familiar; make a little progress every day. Don't be afraid of the new; learn a little every day.