Preface

In my earlier studies I ran into ASTs many times and passed them by. Examples include webpack, Taro, reverse-engineering front-end code (with some odd obfuscation), and some "toys" I built myself. But every time, after studying for a long while, I would forget the beginning by the time I reached the end. The official documents, though well written, are painful to reread, so I am writing this down so I don't forget, and to share it with you. The emphasis here is on hands-on work; the concepts are explained in more detail than I can manage in the references and tools section at the end of this article.

Babel's workflow has three stages: the code is first parsed into an abstract syntax tree (AST) by Babylon (now published as @babel/parser), the tree is then traversed, analyzed and transformed (the main work), and finally new code is generated from the transformed AST.

The AST syntax tree looks like this:

Preparation

You should be comfortable with the following before starting this article:

  • ES6
  • Node.js and VS Code (including debugging Node scripts)
  • Babel
  • The importance of reading documentation carefully!

Set up the environment

  1. mkdir AST_Test && cd AST_Test
  2. cnpm init -y
  3. cnpm i @babel/generator @babel/parser @babel/traverse @babel/types --save
  4. touch index.js

In package.json, add "type": "module" to enable ES module (ESM) syntax (check that your Node version supports it).

Without further ado, here is what the four packages do:

  • @babel/parser: parses JavaScript code into an AST. (Parsing package)
  • @babel/traverse: walks the AST so you can transform it into what you need. (Transformation package)
  • @babel/types: AST structures are fairly complex; this is a toolkit that helps build those syntax nodes.
  • @babel/generator: turns the AST back into JavaScript code. (Generation package)

@babel/types is a helper package, and the other three correspond to the three steps of working with an AST: parse, transform, generate. We will follow these three steps in order.

Parsing

Let’s first modify index.js and run the following code:

import parser from "@babel/parser";
const { parse } = parser;
const code = `var square = function (n) { return n * n; } `
const ast = parse(code);
console.log(ast.program.body)

This parses a piece of JavaScript into an AST and prints it. The output can be a little confusing if you are seeing it for the first time.

There is a lot of information, so let's focus on what is familiar. Look at the curly braces: each pair of curly braces is a Node, and every Node has an important attribute, type.
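To make "every Node has a type" concrete, here is a minimal sketch in plain JavaScript that does not use Babel at all. The object is a hand-written, trimmed-down version of what the parser prints for the code above (real output also carries position fields such as start, end and loc), and collectTypes is a hypothetical helper written only for this illustration.

```javascript
// Hand-written, trimmed-down AST for: var square = function (n) { return n * n; }
// Real parser output also carries start/end/loc and other fields, omitted here.
const ast = {
  type: "VariableDeclaration",
  kind: "var",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name: "square" },
      init: {
        type: "FunctionExpression",
        id: null,
        params: [{ type: "Identifier", name: "n" }],
        body: {
          type: "BlockStatement",
          body: [
            {
              type: "ReturnStatement",
              argument: {
                type: "BinaryExpression",
                operator: "*",
                left: { type: "Identifier", name: "n" },
                right: { type: "Identifier", name: "n" },
              },
            },
          ],
        },
      },
    },
  ],
};

// Hypothetical helper: collect the `type` of every node, depth first.
function collectTypes(value, out = []) {
  if (Array.isArray(value)) {
    value.forEach((item) => collectTypes(item, out));
  } else if (value && typeof value === "object") {
    if (typeof value.type === "string") out.push(value.type);
    Object.keys(value).forEach((key) => collectTypes(value[key], out));
  }
  return out;
}

const types = collectTypes(ast);
console.log(types.join(" > "));
```

Reading the printed chain from left to right mirrors how the source nests: a declaration, its declarator, the function expression, and finally the expression inside the return statement.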

Using astexplorer

I highly recommend astexplorer.net: you can view the AST of a piece of code directly online, and it supports multiple programming languages.

In the area outlined in red, I have collapsed some of the less important content (it is still useful, but ignored here). So let's look at the properties and what they represent.

Demo code given by the website:

let tips = [
  "Click on any AST node with a '+' to expand it",
  "Hovering over a node highlights the corresponding location in the source code",
  "Shift click on an AST node to expand the whole subtree"
];

function printTips() {
  tips.forEach((tip, i) => console.log(`Tip ${i}: ` + tip));
}

The Program node has a sourceType property with two possible values, module and script: one for ES module code, the other for classic scripts.

Next, the body property is an array containing a VariableDeclaration and a FunctionDeclaration. They correspond to two chunks of our code.

Analysis

VariableDeclaration

When it’s fully unfolded, it looks something like this.

  • kind: "var" | "let" | "const" (required)
  • declarations: an array of VariableDeclarator nodes (required)

So what is in a VariableDeclarator?

  • id: the identifier being declared.
  • init: an Expression, or null.

FunctionDeclaration

For simplicity, I have trimmed the demo code here:

function printTips() {
  return 1;
}

The AST is as shown below:

  • id: an Identifier (optional, can be null)
  • params: an array of parameter nodes such as Identifier (required)
  • body: a BlockStatement (required)

The block contains a return statement with an argument of 1.
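As a rough illustration of those three fields, the sketch below writes the node for the simplified demo by hand as a plain object and reads id, params and body back out. It does not call Babel; the object is trimmed (no position information), and summarize is a hypothetical helper that exists only for this example.

```javascript
// Hand-written, trimmed-down node for: function printTips() { return 1; }
const fn = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "printTips" },
  params: [],
  body: {
    type: "BlockStatement",
    body: [
      { type: "ReturnStatement", argument: { type: "NumericLiteral", value: 1 } },
    ],
  },
};

// Hypothetical helper: summarize the three fields discussed above.
function summarize(node) {
  const name = node.id ? node.id.name : "(anonymous)";
  const params = node.params.map((p) => p.name).join(", ");
  const statements = node.body.body.map((s) => s.type).join(", ");
  return `${name}(${params}) -> [${statements}]`;
}

const summary = summarize(fn);
console.log(summary);
```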

How to look things up

Now that we have covered those two statements, you may be starting to wonder: what a bother, do I have to memorize all of this?

Of course not. Nobody can memorize it all; we rely on the documentation. Remember the four packages mentioned earlier? One of them is @babel/types. Open its documentation and you will find every node definition and its interface.

Transformation

This is a big chapter, and it is where we do the most tinkering with code. Following the convention of other tutorials, we start with a demo that converts an arrow function into an anonymous function (ES6 to ES5).

Try it first; the explanation follows.

//index.js
import parser from "@babel/parser";
import traverse from "@babel/traverse";
import generate from "@babel/generator";
import t from '@babel/types';
const code = `var a = () => '';`;

const ast = parser.parse(code);

traverse.default(ast, {
    ArrowFunctionExpression(path) {
        var func = t.functionExpression(
           null,
            [],
            t.blockStatement([t.returnStatement(path.node.body)])
        )
        path.replaceWith(func)
    }
})

const res = generate.default(ast)

console.log(res.code)

The parsing step, const ast = parser.parse(code);, was covered at the beginning. The key logic is:

traverse.default(ast, {
    ArrowFunctionExpression(path) {
        var func = t.functionExpression(
           null,
            [],
            t.blockStatement([t.returnStatement(path.node.body)])
        )
        path.replaceWith(func)
    }
})

Visitor

In the process of transformation, we need to keep in mind the concept of Visitor.

Every node in the parsed tree is visited in turn, and the matching logic runs whenever the traversal meets things like arrow functions, variable declarations, and so on.

You can think of a visitor as a hook function that triggers the logic you wrote. For example, in the code above, an ArrowFunctionExpression triggers the logic inside.

A hook can also be split in two: enter (called when the traversal enters the node) and exit (called when it leaves). Every node type supports both:

traverse.default(ast, {
    ArrowFunctionExpression: {
        enter(path) {
            console.log("Entered!");
        },
        exit(path) {
            console.log("Exited!");
        }
    }
})

This triggers enter first and then exit. Matching every node globally is generally not recommended; target node types as precisely as you can. To run the same logic for more than one node type, you can write:

traverse.default(ast, {
    ['ArrowFunctionExpression|Identifier'](path) {
        console.log(path.node.type)
    }
})

This fires for both arrow functions and identifiers. Which node types can be used as hooks? They are also listed in the @babel/types documentation mentioned earlier.
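To see the ordering of enter and exit and the "A|B" key form without installing anything, here is a minimal sketch of a visitor-driven walk in plain JavaScript. It is a simplified stand-in and not Babel's implementation: the normalize and walk functions are hypothetical, the AST is written by hand, and the handlers receive the raw node rather than a path.

```javascript
// Simplified stand-in for a visitor-driven traversal; not Babel's implementation.
function normalize(visitor) {
  const table = {};
  for (const key of Object.keys(visitor)) {
    const handler = visitor[key];
    // A bare function is treated as shorthand for { enter: fn }.
    const hooks = typeof handler === "function" ? { enter: handler } : handler;
    // "A|B" registers the same hooks for both node types.
    for (const type of key.split("|")) table[type] = hooks;
  }
  return table;
}

function walk(node, table) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, table));
    return;
  }
  if (!node || typeof node !== "object" || typeof node.type !== "string") return;
  const hooks = table[node.type] || {};
  if (hooks.enter) hooks.enter(node);
  for (const key of Object.keys(node)) walk(node[key], table);
  if (hooks.exit) hooks.exit(node);
}

// Hand-written, trimmed-down AST for: var a = () => b;
const ast = {
  type: "VariableDeclaration",
  kind: "var",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name: "a" },
      init: {
        type: "ArrowFunctionExpression",
        params: [],
        body: { type: "Identifier", name: "b" },
      },
    },
  ],
};

const log = [];
walk(ast, normalize({
  ArrowFunctionExpression: {
    enter() { log.push("enter arrow"); },
    exit() { log.push("exit arrow"); },
  },
  "Identifier|VariableDeclarator"(node) { log.push("saw " + node.type); },
}));
console.log(log);
```

The exit hook for the arrow function only runs after everything inside it has been visited, which is why the identifier b is logged between the two arrow entries.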

Path

Note that a path and an AST node are two different things, although they are related; do not confuse them. The official tutorial does mention this, though perhaps without much emphasis.

Path is an object that represents the connection between two nodes.

We modify the AST through paths. To understand them properly, step through the traversal in the VS Code debugger and inspect a path object.

A path has many attributes, covering path operations and path-related information. Let's focus on:

  • node: the current node
  • parentPath: the path of the parent node
  • scope: the scope the node is in

As you can see, Node contains the most needed content. Back to the previous code:

traverse.default(ast, {
    ArrowFunctionExpression(path) {
        var func = t.functionExpression(
           null,
            [],
            t.blockStatement([t.returnStatement(path.node.body)])
        )
        path.replaceWith(func)
    }
})

Here I call the t object (the @babel/types toolkit) to build an ordinary anonymous function. You might wonder how I knew what to write. The answer is VS Code:

When we call a builder, the editor prompts us with its parameters. As for which node type to build, paste the target code into astexplorer and you will see.

We operate on path to implement some of the logic we want. In this example I used the path.replaceWith method.
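One way to picture why replacement goes through the path rather than the node: the path knows which slot of the parent the node sits in. The sketch below is a simplified stand-in in plain JavaScript, under the assumption that a path can be reduced to a parent path, a container and a key. MiniPath and its fields are illustrative names, not Babel's NodePath, and the nodes are written by hand instead of being built with @babel/types.

```javascript
// Simplified stand-in for a path: it remembers where a node hangs in its parent.
// The class name and fields are illustrative, not Babel's NodePath.
class MiniPath {
  constructor(parentPath, container, key) {
    this.parentPath = parentPath; // path of the parent node, or null at the root
    this.container = container;   // object or array that holds the node
    this.key = key;               // property name or index inside the container
  }
  get node() {
    return this.container[this.key];
  }
  replaceWith(newNode) {
    // Replacing means rewriting the slot in the parent, not mutating the old node.
    this.container[this.key] = newNode;
  }
}

// Hand-written AST fragment for: a = () => 1
const assignment = {
  type: "AssignmentExpression",
  operator: "=",
  left: { type: "Identifier", name: "a" },
  right: {
    type: "ArrowFunctionExpression",
    params: [],
    body: { type: "NumericLiteral", value: 1 },
  },
};

const root = new MiniPath(null, { root: assignment }, "root");
const arrowPath = new MiniPath(root, assignment, "right");

// Same shape as a function expression wrapping a return statement, as plain objects.
arrowPath.replaceWith({
  type: "FunctionExpression",
  id: null,
  params: [],
  body: {
    type: "BlockStatement",
    body: [{ type: "ReturnStatement", argument: arrowPath.node.body }],
  },
});

console.log(assignment.right.type);
```

After the call, the tree itself has changed: assignment.right is now the function expression, and the old arrow body has been moved into the return statement.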

Other commonly used path members include:

  • path.toString(): converts the path's node to code
  • path.traverse(): traverses the subtree with a nested visitor, so state can be passed in instead of kept globally
  • path.get(): makes it easier to get a path, for example path.get('body.0'), which can be understood as the path for path.node.body[0]. Note that it only returns paths; it cannot be used to read ordinary properties!
  • path.isXX(): XX is a node type; checks whether the path's node matches that type (the types package offers the same checks)
  • path.getFunctionParent(): finds the nearest parent function or program
  • path.getStatementParent(): walks up the syntax tree until it finds a parent path that is in a list
  • path.findParent((path) => path.isObjectExpression()): calls the callback on each parent path with its NodePath as the argument, and returns that NodePath when the callback returns true
  • path.find((path) => path.isObjectExpression()): the same, but the current path is checked as well
  • path.inList: whether the path has sibling nodes
  • path.getSibling(index): gets a sibling path
  • path.key: the index of the path within its container
  • path.container: the container of the path (an array of all sibling nodes)
  • path.listKey: the key of the container
  • path.skip(): skips traversing the children of the current node
  • path.stop(): stops the traversal
  • path.replaceWithMultiple(): replaces the node with multiple nodes, passed in as an array
  • path.replaceWithSourceString(): replaces the node with a string of source code
  • path.insertBefore(): inserts a node before this node
  • path.insertAfter(): inserts a node after this node
  • path.remove(): removes the node
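The difference between findParent and find in the list above is easy to miss, so here is a minimal sketch of the two upward searches in plain JavaScript. It is a simplified stand-in rather than Babel's code: makePath is a hypothetical factory, and the chain of paths is built by hand and leaves out intermediate nodes such as the block statement.

```javascript
// Simplified stand-in for upward searches on a path; not Babel's implementation.
function makePath(node, parentPath = null) {
  return {
    node,
    parentPath,
    // Start at the parent and walk up until the callback returns true.
    findParent(callback) {
      let current = this.parentPath;
      while (current) {
        if (callback(current)) return current;
        current = current.parentPath;
      }
      return null;
    },
    // Same walk, but the current path is tested first.
    find(callback) {
      return callback(this) ? this : this.findParent(callback);
    },
  };
}

// Hand-built chain for: function outer() { return () => x; }, from the outside in.
const fnPath = makePath({ type: "FunctionDeclaration" });
const returnPath = makePath({ type: "ReturnStatement" }, fnPath);
const arrowPath = makePath({ type: "ArrowFunctionExpression" }, returnPath);
const xPath = makePath({ type: "Identifier", name: "x" }, arrowPath);

const isFunction = (p) =>
  p.node.type === "FunctionDeclaration" || p.node.type === "ArrowFunctionExpression";

const nearest = xPath.findParent(isFunction); // closest function around x
const self = arrowPath.find(isFunction);      // the arrow itself qualifies
const above = arrowPath.findParent(isFunction); // skips the arrow, finds outer
console.log(nearest.node.type, self.node.type, above.node.type);
```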

There are a lot of general operations on path, and although I have listed them, unfortunately I cannot find complete documentation that covers each method clearly. Here is how I do it (if you know a better way or have found such documentation, please leave a comment).

replaceWith is one of many functions mounted on the prototype of the path object; expanding the prototype in the debugger gives you a sense of what is available.

After expanding it you can also jump to where a function is implemented and read the source. For example, for the path.addComment method, the core logic in the source is as follows:

function addComments(node, type, comments) {
  if (!comments || !node) return node;
  const key = `${type}Comments`;

  if (node[key]) {
    if (type === "leading") {
      node[key] = comments.concat(node[key]);
    } else {
      node[key] = node[key].concat(comments);
    }
  } else {
    node[key] = comments;
  }

  return node;
}

Breakpoint debugging shows that for path.addComment the first parameter must be the comment type, the second is the comment content, and the third controls whether it is a single-line or a multi-line comment. Adding the comment by hand also works:

  path.node.leadingComments = [
    {
      type: "CommentBlock",
      value: "Here's my note 2.",
    },
  ];
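To check how the quoted addComments logic treats leading and trailing comments, the sketch below reuses the same function body (lightly condensed) on a hand-written node object. The node and the comment values are made up for the illustration; nothing here goes through Babel's generator, so it only shows how the comment arrays end up ordered.

```javascript
// Same logic as the addComments helper quoted above, lightly condensed.
function addComments(node, type, comments) {
  if (!comments || !node) return node;
  const key = `${type}Comments`;
  if (node[key]) {
    node[key] = type === "leading"
      ? comments.concat(node[key])   // new leading comments go in front
      : node[key].concat(comments);  // other kinds are appended
  } else {
    node[key] = comments;
  }
  return node;
}

// Hand-written node that already carries one leading comment.
const node = {
  type: "Identifier",
  name: "a",
  leadingComments: [{ type: "CommentLine", value: " existing" }],
};

addComments(node, "leading", [{ type: "CommentBlock", value: " new leading " }]);
addComments(node, "trailing", [{ type: "CommentLine", value: " new trailing" }]);

const leading = node.leadingComments.map((c) => c.value);
const trailing = node.trailingComments.map((c) => c.value);
console.log(leading, trailing);
```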

Scope

Scope should be thoroughly familiar to all of us. Code first!

//index.js
import parser from "@babel/parser";
import traverse from "@babel/traverse";
import generate from "@babel/generator";
import t from '@babel/types';
const code = `function a() { var str = 'Hello!'; return str; };`;

const ast = parser.parse(code);

traverse.default(ast, {
    VariableDeclaration(path) {
        console.log(path.scope)
    }
})

const res = generate.default(ast)

console.log(res.code)

The output describes a scope. Start with one important property, bindings, which holds the variable bindings of the scope.

Let's look at the scope methods:

  • path.scope.hasBinding("n"): checks whether the variable n is bound (defined), looking up through the enclosing scopes.
  • path.scope.hasOwnBinding("n"): checks whether n is bound in this scope itself.
  • path.scope.generateUidIdentifier("uid"): generates an identifier whose name is guaranteed not to exist in the current scope.
  • path.scope.rename("n", "x"): renames a variable.
  • path.scope.parent.push(node): inserts a node into the parent scope.
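The lookups in the list above can be pictured with a small chain of scope objects. This is a simplified sketch in plain JavaScript, not Babel's Scope class: MiniScope, declare and the way generated names are reserved are assumptions made for the illustration, and real scopes track far more (references, constant violations, globals).

```javascript
// Simplified stand-in for scope lookups; not Babel's Scope class.
class MiniScope {
  constructor(parent = null) {
    this.parent = parent;
    this.bindings = {}; // variable name -> information about the declaration
  }
  declare(name, kind) {
    this.bindings[name] = { kind };
  }
  hasOwnBinding(name) {
    return Object.prototype.hasOwnProperty.call(this.bindings, name);
  }
  hasBinding(name) {
    // Walk up through the enclosing scopes.
    for (let scope = this; scope; scope = scope.parent) {
      if (scope.hasOwnBinding(name)) return true;
    }
    return false;
  }
  generateUid(name = "temp") {
    // Try _name, _name2, _name3, ... until nothing visible from here uses it.
    let i = 1;
    let uid;
    do {
      uid = "_" + name + (i > 1 ? i : "");
      i++;
    } while (this.hasBinding(uid));
    this.declare(uid, "uid"); // assumption: reserve it so the next call differs
    return uid;
  }
}

// Scopes for: var n = 123; function square(n) { return n * n; }
const program = new MiniScope();
program.declare("n", "var");
program.declare("square", "hoisted");
const fnScope = new MiniScope(program);
fnScope.declare("n", "param");

const seesSquare = fnScope.hasBinding("square");    // found in the parent scope
const ownsSquare = fnScope.hasOwnBinding("square"); // not declared in the function
const uidA = program.generateUid();
const uidB = fnScope.generateUid();
console.log(seesSquare, ownsSquare, uidA, uidB);
```

Because the second generated name has to avoid everything visible from the function scope, including the name just reserved in the program scope, the two calls cannot collide.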

Here I will focus on the path.scope.parent.push method. First, here is an example from the official documentation; see whether it raises the same doubt for you as it did for me.

Sometimes you might want to push a VariableDeclaration so that you can assign to it.

FunctionDeclaration(path) {
  const id = path.scope.generateUidIdentifierBasedOnNode(path.node.id);
  path.remove();
  path.scope.parent.push({ id, init: path.node });
}

- function square(n) {
+ var _square = function square(n) {
    return n * n;
- }
+ };

If you try this yourself, you will soon find that the output is not what the documentation shows:

var _square;

Here is my conversion code that works:

import parser from "@babel/parser";
import traverse from "@babel/traverse";
import generate from "@babel/generator";
import t from '@babel/types';
const code = ` function square(n) { return n * n; } `;

const ast = parser.parse(code);

traverse.default(ast, {
    FunctionDeclaration(path) {
        const funcName = path.node.id.name; // Save the function name
        const oldPath = path.node.body; // The original function body
        const params = path.node.params[0].name; // Name of the first parameter
        path.remove(); // Remove the original node (the parts we need were saved in variables above; remove without saving them and they are lost with the path!)
        path.scope.parent.push(
            t.variableDeclarator(
                path.scope.generateUidIdentifier(funcName), // Generate a unique variable name
                t.functionExpression( // Define the function
                    t.identifier(funcName),
                    [t.identifier(params)],
                    oldPath
                )
            )
        );
    }
})

const res = generate.default(ast)

console.log(res.code)

This conversion matches our expectations. But that is not the point I wanted to make. When I first saw this API, the official example path.scope.parent.push({ id, init: path.node }); led me to believe you had to pass in an object with a particular structure. My tests show that a node can be passed in as well, so if the example misled you in the same way, my version may be of some help.

Generation

There is not much to say about generation; see the @babel/generator documentation for more.

const res = generate.default(ast)
console.log(res)

Notes

The generateUid family of APIs is really handy

Previously I only knew it could create variable names automatically; I did not expect it to be so useful here.

// The code to be converted:
var n = 123; // n in the outer scope
function square(n) {
    return n * n; // n in the inner scope
}
console.log(n)

Requirement: Rename both n’s separately!

import parser from "@babel/parser";
import traverse from "@babel/traverse";
import generate from "@babel/generator";
import t from '@babel/types';
const code = `
var n = 123
function square(n) {
    return n * n;
}
console.log(n)
`;
const ast = parser.parse(code);
traverse.default(ast, {
    ['FunctionDeclaration|VariableDeclaration'](path) {
        // A hard-coded name would not work here; a generated uid keeps the two n's distinct
        path.scope.rename('n', path.scope.generateUid())
    }
})
const res = generate.default(ast)
console.log(res.code)

Final output:

var _temp = 123;
function square(_temp2) {
  return _temp2 * _temp2;
}
console.log(_temp);

References and tools

  • @babel/parser
  • @babel/traverse
  • @babel/types
  • @babel/generator
  • babel-plugin-handbook
  • Babel Plugin Manual – Chinese
  • Javascript-Babel-API
  • astexplorer