Creating Custom JavaScript Syntax with Babel

This article walks through the compiler workflow and its basic concepts, shows how they correspond to the pieces of Babel, and then uses that knowledge to complete a working example.

The target

The ultimate goal of this article is to implement a new JavaScript syntax feature with Babel: a function can be curried by placing an @@ tag after the function keyword. In its final form:

// '@@' makes the function `foo` curried
function @@ foo(a, b, c) {
  return a + b + c;
}
console.log(foo(1, 2)(3)); // 6

To achieve this, we need to:

  1. Know some basic concepts of the compiler and have some understanding of its workflow.
  2. Understand the structure of Babel project and its development and testing process

So let’s get started.

Compiler workflow and basic concepts

AST

AST stands for Abstract Syntax Tree: a tree-shaped description of the language structure of a piece of source text. It is an intermediate representation (IR) that sits between the source language and the target language. The counterpart of the AST is the Parse Tree (also known as the CST, Concrete Syntax Tree). The main differences between the two are:

  1. The AST does not contain syntax details such as commas, parentheses, and semicolons
  2. The AST collapses chains of single-child nodes (nodes with only one child, such as factor -> 3 and factor -> 4 in the example below)
  3. Operator tokens (e.g. +, -, *, /) do not appear as leaf nodes but as internal nodes (usually as attributes of the parent node, e.g. "operator": "*")
Parse Tree                  Abstract Syntax Tree
==========                  ====================
        exp                          *
         |                          / \
        term                       3   +
        /|\                           / \
  term   *   factor                  4   2
    |         /|\
 factor     ( exp )
    |         /|\
    3      exp + term
            |      |
          term   factor
            |      |
         factor    2
            |
            4
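Spelled out as a plain object, the AST side of that diagram (for `3 * (4 + 2)`) looks roughly like the following. This is a sketch in an ESTree-like shape; the field names are illustrative:

```javascript
// ESTree-style AST for `3 * (4 + 2)` -- note the operator is a property
// of the internal BinaryExpression node, not a leaf of its own.
const ast = {
  type: "BinaryExpression",
  operator: "*",
  left: { type: "Literal", value: 3 },
  right: {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "Literal", value: 4 },
    right: { type: "Literal", value: 2 },
  },
};

// A tiny evaluator shows the tree carries all the semantics the
// parse-tree details (parentheses, grammar chains) existed to express:
function evaluate(node) {
  if (node.type === "Literal") return node.value;
  const l = evaluate(node.left);
  const r = evaluate(node.right);
  return node.operator === "*" ? l * r : l + r;
}

console.log(evaluate(ast)); // 18
```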

Here is an example of AST parsing:

function * foo(){
  yield 1;
  function bar(){
   return 2;
  }
  return bar();
}

This code has a foo function, which is a generator function, with a yield expression, a return statement, and an internal function. The AST structure after esprima parsing is as follows:

esprima parsed AST

{
  "type": "Program",
  "body": [
    {
      "type": "FunctionDeclaration",
      "id": {
        "type": "Identifier",
        "name": "foo",
        "range": [
          11,
          14
        ]
      },
      "params": [],
      "body": {
        "type": "BlockStatement",
        "body": [
          {
            "type": "ExpressionStatement",
            "expression": {
              "type": "YieldExpression",
              "argument": {
                "type": "Literal",
                "value": 1,
                "raw": "1",
                "range": [
                  26,
                  27
                ]
              },
              "delegate": false,
              "range": [
                20,
                27
              ]
            },
            "range": [
              20,
              28
            ]
          },
          {
            "type": "FunctionDeclaration",
            "id": {
              "type": "Identifier",
              "name": "bar",
              "range": [
                40,
                43
              ]
            },
            "params": [],
            "body": {
              "type": "BlockStatement",
              "body": [
                {
                  "type": "ReturnStatement",
                  "argument": {
                    "type": "Literal",
                    "value": 2,
                    "raw": "2",
                    "range": [
                      57,
                      58
                    ]
                  },
                  "range": [
                    50,
                    59
                  ]
                }
              ],
              "range": [
                45,
                63
              ]
            },
            "generator": false,
            "expression": false,
            "async": false,
            "range": [
              31,
              63
            ]
          },
          {
            "type": "ReturnStatement",
            "argument": {
              "type": "CallExpression",
              "callee": {
                "type": "Identifier",
                "name": "bar",
                "range": [
                  73,
                  76
                ]
              },
              "arguments": [],
              "range": [
                73,
                78
              ]
            },
            "range": [
              66,
              79
            ]
          }
        ],
        "range": [
          16,
          81
        ]
      },
      "generator": true,
      "expression": false,
      "async": false,
      "range": [
        0,
        81
      ]
    }
  ],
  "sourceType": "module",
  "range": [
    0,
    81
  ]
}

Each object of the shape { type: string, ...properties } is a node in the AST tree. Walking through the top-level node:

  • type: 'FunctionDeclaration' — a function declaration; together with the properties below it forms the foo function node
  • id — the function identifier
    • type: 'Identifier'
    • name: 'foo' — the function name
  • params: [] — the function parameter list
  • generator: true — the generator function flag
  • body — the function body
    • type: 'BlockStatement' — the statement block inside the function
      • body — the contents of the block, containing three nodes
        • type: 'ExpressionStatement' — an expression statement whose expression.type is YieldExpression
        • type: 'FunctionDeclaration' — the inner bar function
        • type: 'ReturnStatement' — the return statement that calls bar()

Note:

  1. The structure generated by @babel/parser is similar; Esprima's output is just more concise and easier to show here.
  2. In ASTExplorer you can view the AST of any input code; it supports multiple languages.

This process of converting a piece of code from text input to AST output is called parse in most compilers, and it generally goes through two phases: Lexical Analysis and Syntactic Analysis.

Parse: Lexical Analysis

The first step of lexical analysis, and thus the first step of AST parsing, is text scanning: breaking the text into the smallest possible pieces, called lexemes. Lexemes are still language-independent; the result looks something like this:

Source:  function * foo(){}
             ⬇️
Lexemes: 【function】【*】【foo】【(】【)】【{】【}】

Next, the goal of lexical analysis is to convert these lexemes into tokens; the component that does this is generally called a tokenizer. The resulting tokens carry language-specific meaning: each one is marked with the role it plays in the language it belongs to:

[
    {
        "type": "Keyword",
        "value": "function"
    },
    {
        "type": "Punctuator",
        "value": "*"
    },
    {
        "type": "Identifier",
        "value": "foo"
    },
    {
        "type": "Punctuator",
        "value": "("
    },
    // ...
]

In a concrete implementation, text scanning and token recognition are generally carried out at the same time, and the entire tokenizer process looks something like this:

// Text scan pointer
let index = 0;
// Completed token list
let tokens = [];

// Iterate over the input text
while (index < input.length) {
  // Skip comments and whitespace, moving index forward
  skipComment();
  skipSpace();
  // Get the character at the current scan position
  const char = input.charAt(index);
  // Match punctuation, operators, etc. one by one
  switch (char) {
    case Char.comma: // ,
      tokens.push({
        type: Char.comma
      });
      index++;
      break;
    // ... other symbols

    // Handle identifier or keyword tokens
    default: {
      // Read the full word
      let word = readWord(char, index);
      // Decide whether word is a keyword, and set the token type accordingly
      tokens.push({
        type: isKeyword(word) ? 'Keyword' : 'Identifier',
        value: word
      });
      index += word.length;
    }
  }
}
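As a reference point, here is a minimal runnable version of that loop that tokenizes the earlier example. The keyword and punctuator sets are illustrative, not @babel/parser's real tables:

```javascript
// Minimal tokenizer sketch: whitespace skipping, single-character
// punctuators, and keyword-vs-identifier word reading.
const KEYWORDS = new Set(["function", "return", "if", "var", "const"]);
const PUNCTUATORS = new Set(["*", "(", ")", "{", "}", ",", ";"]);

function tokenize(input) {
  const tokens = [];
  let index = 0;
  while (index < input.length) {
    const char = input[index];
    // Skip whitespace
    if (/\s/.test(char)) { index++; continue; }
    // Single-character punctuators and operators
    if (PUNCTUATORS.has(char)) {
      tokens.push({ type: "Punctuator", value: char });
      index++;
      continue;
    }
    // Read a full word, then decide keyword vs identifier
    let word = "";
    while (index < input.length && /[A-Za-z_$]/.test(input[index])) {
      word += input[index++];
    }
    if (word === "") throw new SyntaxError(`Unexpected character: ${char}`);
    tokens.push({ type: KEYWORDS.has(word) ? "Keyword" : "Identifier", value: word });
  }
  return tokens;
}

console.log(tokenize("function * foo(){}").map(t => t.value));
// -> ["function", "*", "foo", "(", ")", "{", "}"]
```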

Esprima provides an online tool that makes it easy to view token generation: esprima.org/demo/parse….

Parse: Syntactic Analysis

Syntactic analysis receives tokens as input and outputs the AST. In this process, the parser iterates over the tokens one by one and, based on the characteristics of each token and of the language, adds increasingly detailed structural marks to the AST tree.

AST node

An AST consists of many nodes. A basic node looks like this; it is the information every node on the AST carries:

type NodeBase = {
  // Node type, for example, ExpressionStatement/FunctionDeclaration
  type: string;
  // The index of the starting position text in the source code corresponding to the node
  start: number;
  // The index of the end position text in the source code corresponding to the node
  end: number;
  // Source location, row and column values for start and end characters, used to generate sourcemap
  loc: SourceLocation;
}

Nodes of specific types need to carry different information, so NodeBase is extended accordingly. For a regular expression literal, for example, its expression (pattern) and match mode (flags) need to be recorded:

// Regular expression literals
type RegExpLiteral = NodeBase & {
  type: "RegExpLiteral";
  // the expression
  pattern: string;
  // the match-mode flags: "gimsuy"
  flags: RegExp$flags;
};
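As a concrete illustration of the type above, the literal /ab+c/gi would carry the following type-specific fields. The node shape here is a sketch (positions omitted):

```javascript
// Sketch of the node a parser would produce for the regular
// expression literal /ab+c/gi (NodeBase fields omitted).
const regExpNode = {
  type: "RegExpLiteral",
  pattern: "ab+c", // the expression between the slashes
  flags: "gi",     // the match-mode flags
};

// The two fields are enough to rebuild the runtime value:
const re = new RegExp(regExpNode.pattern, regExpNode.flags);
console.log(re.test("xABBc")); // true -- case-insensitive match on "ABBc"
```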

Some node types can hold child nodes; these usually add a body attribute, and nesting bodies forms the complete AST:

// for statement, which can hold child nodes
type ForStatement = NodeBase & {
  type: "ForStatement";
  // for (init; ; )
  init?: VariableDeclaration | Expression;
  // for (init; test; )
  test?: Expression;
  // update statement: for (init; test; update)
  update?: Expression;
  // sub-statement: for (init; test; update) { body }
  body: Statement;
};

The body can also be an array; for example, the root Program node may contain multiple sub-statements. Only a few node types are like this: Program / BlockStatement / ClassBody / StaticBlock:

type ClassBody = NodeBase & {
  type: "ClassBody";
  // ClassBody is a node that can contain multiple substatements
  body: Array<ClassMember | StaticBlock | TsIndexSignature>;
};

PS: in Babel's source you can find the definitions for all node types, including the attributes each node should contain, their value types, and the validation rules.

The parsing process

Let's take a look at how syntactic analysis consumes the results of lexical analysis and marks up nodes, again using `function * foo(){}` as the example, and focusing on how the parser determines that foo is a generator.

After Lexical Analysis, we already know that the token list contains the keyword function, generator token *, and so on, which are represented in Babel as follows:

[tt._function, tt.star, ...]

Note: in @babel/parser, token types for syntactic keywords start with an underscore, as in _function here.

parseStatementContent uses a switch statement to identify the statement type from the keyword at its start. When the token is tt._function, parsing is handed to parseFunctionStatement:

parseStatementContent() {
  switch (starttype) {
    case tt._function:
      return this.parseFunctionStatement(node, false, !context);
    // case tt._if:
    // case tt._return:
    // case tt._const:
    // case tt._var:
    // case tt.braceL:
    // ...
  }
}

Supplementary note: when a statement does not begin with a statement keyword or an opening curly brace { (that is, it matches no case of the switch), @babel/parser attempts to parse it as an expression. There are two possible results: 1. a LabeledStatement, or 2. an ExpressionStatement.

Next we reach parseFunctionStatement. Since we already know this is a function statement, it first calls this.next() to move the pointer to the next token, tt.star, before continuing:

parseFunctionStatement() {
  this.next();
  return this.parseFunction(node);
}

After calling this.next():

[tt._function, tt.star, ...]  =>  [tt._function, tt.star, ...]
      ⬆️                                            ⬆️

The last and most critical step, this.eat(tt.star):

parseFunction(node){
  // ...
  node.generator = this.eat(tt.star);
  // ...
}

Here node is the AST node being processed, and generator is marked on it as a boolean property. this.eat does two things:

  1. Check whether the type of the token where the pointer resides is the same as that of the input parameter
  2. If a match is made, the pointer continues to move to the next token
eat(type: TokenType): boolean {
  if (this.state.type === type) {
    this.next();
    return true;
  } else {
    return false;
  }
}

Because the current token is tt.star, the resulting AST has generator: true on this node, exactly matching the AST example from the beginning of this article.

Token operation

As in the example above, during parsing the compiler needs to keep moving the pointer and trying to match specific syntax features in order to consume all tokens. These pointer operations generalize to:

  • match(type: TokenType): boolean — whether the token at the current pointer matches the given token type
  • next(): void — moves the pointer to the next token
  • eat(type: TokenType): boolean — returns whether the token at the current pointer matches the given token type, and if so also calls next()
  • lookahead(): LookaheadState — peeks at the next token without moving the pointer; often used when the handling of the current token depends on the next one
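These four operations can be sketched over a plain token array. This is a simplification: the real Tokenizer works on scan state rather than a prebuilt list, and the token names here are illustrative:

```javascript
// Minimal sketch of the four pointer operations over a token list.
class TokenCursor {
  constructor(tokens) {
    this.tokens = tokens;
    this.pos = 0;
  }
  get current() { return this.tokens[this.pos]; }
  // Does the token at the pointer have the given type?
  match(type) { return this.current !== undefined && this.current.type === type; }
  // Move the pointer to the next token
  next() { this.pos++; }
  // match + next in one step, reporting whether it matched
  eat(type) {
    if (this.match(type)) { this.next(); return true; }
    return false;
  }
  // Peek at the next token without moving the pointer
  lookahead() { return this.tokens[this.pos + 1]; }
}

// Tokens for `function * foo`:
const cursor = new TokenCursor([
  { type: "_function" }, { type: "star" }, { type: "name", value: "foo" },
]);

cursor.eat("_function");                // true, pointer moves to `star`
const isGenerator = cursor.eat("star"); // true -- this is how node.generator gets set
console.log(isGenerator, cursor.current); // true { type: 'name', value: 'foo' }
```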

The Transform transformation

After the parse phase you have an AST. The Transform phase is about modifying that AST for the language you are compiling; all Babel plugins work in this phase. The modification logic is provided by a visitor object: the compiler does a depth-first traversal of the AST and calls the matching method on the visitor whenever a node's type matches:

const MyVisitor = {
  Identifier(path) {
    // Called every time a { type: 'Identifier' } node is encountered
  }
};

You may notice that the method above takes a path parameter. It carries both the node's own information and its links to related nodes. In practice a node is rarely handled in isolation: the surrounding nodes usually have to be considered too, and path provides that context. It also provides methods for the common operations, such as deleting, modifying, and creating nodes. The Babel Plugin Handbook covers all of this in detail and serves as the reference manual for plugin authors.
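To make the traversal concrete, here is a toy depth-first walk that calls visitor methods by node type. It ignores all of path's machinery and just dispatches on type:

```javascript
// Toy depth-first traversal: call visitor[node.type] for every node.
function traverse(node, visitor) {
  if (node === null || typeof node !== "object") return;
  if (Array.isArray(node)) {
    node.forEach(child => traverse(child, visitor));
    return;
  }
  // Dispatch on the node's type if the visitor handles it
  if (typeof node.type === "string" && visitor[node.type]) {
    visitor[node.type](node);
  }
  // Recurse into every child property
  for (const key of Object.keys(node)) {
    if (key !== "type") traverse(node[key], visitor);
  }
}

// Count identifiers in a small hand-written AST for `foo + bar`:
const ast = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Identifier", name: "foo" },
  right: { type: "Identifier", name: "bar" },
};

const names = [];
traverse(ast, {
  Identifier(node) { names.push(node.name); },
});
console.log(names); // [ 'foo', 'bar' ]
```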

Generate

Generate, the last step in the compiler, produces the target-language code from the modified AST.

Extension: the-super-tiny-compiler

If this section interests you, it is worth looking at the the-super-tiny-compiler project, an ultra-small compiler implemented in JavaScript. Its goal is to compile Lisp-style function calls into C-style calls, for example: (add 2 2) => add(2, 2).
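A toy version of that pipeline (tokenize → parse → generate, skipping the transform step) fits in a few lines. This is a sketch in the spirit of that project, not its actual code:

```javascript
// Toy Lisp-to-C compiler for call expressions like (add 2 (subtract 4 2)).
function tokenize(input) {
  // Pad parentheses with spaces, then split on whitespace
  return input.replace(/\(/g, " ( ").replace(/\)/g, " ) ").trim().split(/\s+/);
}

function parse(tokens) {
  const token = tokens.shift();
  // Anything that is not "(" is a literal/atom
  if (token !== "(") return { type: "Literal", value: token };
  // "(" starts a call: first word is the callee, the rest are params
  const node = { type: "CallExpression", name: tokens.shift(), params: [] };
  while (tokens[0] !== ")") node.params.push(parse(tokens));
  tokens.shift(); // consume ")"
  return node;
}

function generate(node) {
  if (node.type === "Literal") return node.value;
  return `${node.name}(${node.params.map(generate).join(", ")})`;
}

const compile = input => generate(parse(tokenize(input)));
console.log(compile("(add 2 (subtract 4 2))")); // "add(2, subtract(4, 2))"
```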

Babel engineering

Engineering structure

Babel manages the project, including the official plugins, as a monorepo, which means cloning a single repository gets you all of Babel's code.

git clone [email protected]:babel/babel.git

Going into the packages directory, we find the following packages that map onto what we just learned about the compiler:

  • @babel/parser — responsible for lexical and syntactic analysis, producing the AST; originally called babylon, a fork of Acorn
  • @babel/traverse — the AST iterator; takes two arguments: the AST and a visitor object
  • @babel/plugin-* — the official plugins, the concrete implementations of AST modifications; what they return is mainly a visitor object

    import { declare } from "@babel/helper-plugin-utils";

    export default declare((api, options, dirname) => {
      return {
        name: 'plugin-name',
        visitor: { /* visitor object */ }
      };
    });
  • @babel/preset-* — sets of plugins
  • @babel/generator — generates the final code from the AST

There are also packages that are combinations of the above packages for different usage scenarios:

  • @babel/core — wraps the various translation entry points to cover input (string/file), output (code/AST), calling style (sync/async), and configuration scenarios: parse / transform / transformFromAst / transformFile, etc.
  • @babel/cli — runs @babel/core from the command line
  • @babel/register — hooks into Node's require via pirates and compiles code at run time; dynamic compilation performs poorly, so it is rarely used.

    require("@babel/register");

Finally, there are a few more packages as ancillary features:

  • @babel/types — contains APIs for manually constructing every kind of AST node. Each node type has three methods of the form (no prefix / is / assert) + node type(...args), corresponding to creating, testing, and asserting the type. For functionExpression, for example, the API includes functionExpression to create a function expression node, isFunctionExpression to test whether a node is a function expression, and assertFunctionExpression to assert that it is. The package exposes a lot of APIs; they are heavily used when writing plugins, but it is enough to know they exist and consult the documentation when you actually need to construct an AST.
  • @babel/template — creates an AST from string templates; handy when generating ASTs at scale
  • @babel/helpers — a set of helper functions created with @babel/template's template.program.ast(tpl), usually common runtime functions. Inside a plugin you can call this.addHelper(helperName) to inject one of these helpers and easily create the corresponding AST. Common functions such as _classCallCheck and _defineProperties are preset in this package.
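The three API flavors of @babel/types described above can be sketched in plain JavaScript. This is an illustrative reimplementation for one node type, not the real package code:

```javascript
// Sketch of the three @babel/types API flavors for one node type.
function identifier(name) {                 // builder: create the node
  return { type: "Identifier", name };
}
function isIdentifier(node) {               // predicate: check the type
  return node != null && node.type === "Identifier";
}
function assertIdentifier(node) {           // assertion: throw on mismatch
  if (!isIdentifier(node)) {
    throw new TypeError(`Expected Identifier, got ${node && node.type}`);
  }
}

const id = identifier("currying");
console.log(isIdentifier(id));                  // true
console.log(isIdentifier({ type: "Literal" })); // false
```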

Compile engineering

Initialization

Babel builds the project using Makefiles for orchestration and Gulp for execution.

First, you need to initialize the project like this:

make bootstrap
# or
yarn bootstrap

Build and watch

# build
make build
# or
yarn build

# watch
make watch

Test

# test + lint
make test

# test only
make test-only

# Test only one module
TEST_ONLY=babel-<module-name> make test-only

# Test only the cases in a module whose names match TEST_GREP
TEST_ONLY=babel-<module-name> TEST_GREP="text" make test-only

Because Babel's unit tests are actually executed with Jest, they can also be run directly with jest:

jest [TestPathPattern]

Okay, now run make watch and let's start coding.

Add syntax features to Babel

First, a review of the goal: we want to curry functions by adding @@ after function. JavaScript has no such syntax, so trying to compile it raises an error:

function @@ foo(a, b, c) {
  return a + b + c;
}

SyntaxError: unknown: Unexpected token (1:9)

Lexical analysis support: Add new tokens

The compiler does not recognize @@ as a token, so the first thing to adjust is the lexical analysis part of the parser module: we add our newly defined @@ token. The relevant code is in packages/babel-parser/src/tokenizer/types.js. The main content of this file is the types definition, which declares the token types JavaScript supports; from the layout you can see they fall into a few categories: punctuation, operators, keywords, and so on. Add @@ here:

export const types: { [name: string]: TokenType } = {
  num: new TokenType("num", { startsExpr }),
  string: new TokenType("string", { startsExpr }),

  // Punctuation
  comma: new TokenType(",", { beforeExpr }),
  colon: new TokenType(":", { beforeExpr }),
  atat: new TokenType("@@"),

  // Operators
  eq: new TokenType("=", { beforeExpr, isAssign }),
  logicalOR: createBinop("||", 1),

  // Keywords
  _function: createKeyword("function", { startsExpr }),
  _if: createKeyword("if"),
};

Once the new token type is added, the tokenizer needs to recognize it. That code lives in tokenizer/index.js, next to the types definition:

getTokenFromCode(code: number): void {
  switch (code) {
    // ...
    case charCodes.atSign:
      // If the next character is also @, i.e. @@
      if (this.input.charCodeAt(this.state.pos + 1) === charCodes.atSign) {
        // Create an @@ token
        this.finishOp(tt.atat, 2);
      } else {
        // Otherwise create an @ token
        this.finishOp(tt.at, 1);
      }
      return;
    // ...
  }
}

finishOp(type: TokenType, size: number): void is the method to call once a token has been recognized: it records the current token's information and moves the scan position forward. The main flow the Tokenizer uses during scanning looks like this:

export default class Tokenizer extends ParserErrors {
  // List of parsed tokens
  tokens: Array<Token | N.Comment> = [];

  nextToken(): void {
    // Scan starting from this.state.pos
    this.getTokenFromCode(this.codePointAtPos(this.state.pos));
  }

  finishOp(type: TokenType, size: number): void {
    const str = this.input.slice(this.state.pos, this.state.pos + size);
    // Move the position forward by size characters
    this.state.pos += size;
    this.finishToken(type, str);
  }

  finishToken(type: TokenType, val: any): void {
    // Record the token information
    this.state.end = this.state.pos;
    const prevType = this.state.type;
    this.state.type = type;
    this.state.value = val;
  }

  next(): void {
    if (this.options.tokens) {
      // Add the token to the list
      this.pushToken(new Token(this.state));
    }
    // ...
    this.nextToken();
  }
}
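The decision getTokenFromCode makes for @ is just character-code comparison and can be sketched standalone. charCodes.atSign is the character code 64; the token names follow the types defined above:

```javascript
// Standalone sketch of the @@ scanning decision: peek one character
// ahead at an "@" to choose between an `@` token and an `@@` token.
const AT_SIGN = "@".charCodeAt(0); // 64, charCodes.atSign in Babel

function scanAt(input, pos) {
  if (input.charCodeAt(pos) !== AT_SIGN) return null;
  if (input.charCodeAt(pos + 1) === AT_SIGN) {
    // Two characters consumed, like finishOp(tt.atat, 2)
    return { type: "atat", value: "@@", end: pos + 2 };
  }
  // One character consumed, like finishOp(tt.at, 1)
  return { type: "at", value: "@", end: pos + 1 };
}

console.log(scanAt("function @@ foo(){}", 9)); // { type: 'atat', value: '@@', end: 11 }
console.log(scanAt("@decorator", 0));          // { type: 'at', value: '@', end: 1 }
```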

Now the new @@ token should be scanned correctly. To verify this step, let's run the parser over code that uses the new feature: create a test file in the test directory with some simple parsing code:

import { parse } from "../lib";

describe("curry function syntax", function () {
  it("should parse", function () {
    const code = `function @@ foo(){}`;
    const ast = parse(code);
    console.log(ast);
  });
});
> jest curry-function

SyntaxError: Unexpected token (1:9)
    at Parser._raise (/path-to-babel-project/packages/babel-parser/lib/parser/error.js:105:17)
    at Parser.raiseWithData (/path-to-babel-project/packages/babel-parser/lib/parser/error.js:98:17)
    at Parser.raise (/path-to-babel-project/packages/babel-parser/lib/parser/error.js:59:17)
    at Parser.unexpected (/path-to-babel-project/packages/babel-parser/lib/parser/util.js:123:16)
    at Parser.parseIdentifierName (/path-to-babel-project/packages/babel-parser/lib/parser/expression.js:1645:18)

It still throws, but that is expected: so far only the lexical side of the @@ feature is in place, and the existing syntactic analysis cannot handle the token yet; we address that next. For now, look at the call stack, go to the frame closest to the error in the normal flow (highlighted above), and print some logs:

parseIdentifierName(pos: number, liberal?: boolean): string {
  console.log(this.state.type);       // The token currently being processed
  console.log(this.lookahead().type); // The next token
  // ...
}
// The token currently being processed
TokenType {
  label: '@@',
  // ...
}

// The next token
TokenType {
  label: 'name',
  // ...
}

The output above confirms that Babel can now tokenize the new syntax correctly.

A closer look at babel-parser (optional)

If you debugged the code while following the steps above, a few things may have been confusing; let's clear them up first:

1. Why is Tokenizer's tokens list always empty?

Babel has no standalone method for generating tokens; it only exposes methods that produce an AST. To see the tokens you need to pass tokens: true to parse, which tells babel-parser to collect them; the parser checks this option during execution to decide whether to push each token onto the list:

export default class Tokenizer extends ParserErrors {
  next(): void {
    if (this.options.tokens) {
      // Add the token to the list
      this.pushToken(new Token(this.state));
    }
  }
}

// When calling parse, pass tokens: true
const astWithTokens = parse(code, {
  tokens: true
});

// astWithTokens:
{
  // ...
  tokens: [Token, Token, ...]
}

2. Why doesn't the collected token list contain the tokens that come after a parse error, like the parentheses in the example above?

Because babel-parser does not tokenize everything first and then hand the result to syntactic analysis (unlike the-super-tiny-compiler). Lexical and syntactic analysis are interleaved: lex a token, parse it, lex the next token, parse it, and so on. So when a parse error is thrown mid-way, lexing does not continue.

Structurally, the parts of babel-parser are not composed; instead, each layer extends the previous one in a chain:

Parser -> StatementParser -> ExpressionParser -> LValParser -> NodeUtils -> UtilParser -> Tokenizer -> ParserErrors -> CommentsParser -> BaseParser

Next (), as used in the above code, is implemented in Tokenizer and is called primarily in StatementParser and ExpressionParser.

3. What if I only want the tokens and want to skip parsing?

From question 2 it should be clear that Babel cannot do this. But some scenarios only need lexical analysis, such as syntax highlighting, where the extra, unused parsing work is wasted overhead. Esprima addresses this with a separate tokenizer module dedicated to producing tokens. Note that Esprima's normal parse flow still interleaves lexical and syntactic analysis the way Babel's does; the standalone tokenizer is provided purely as an external module and is not wired into its syntactic analysis.
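The interleaving described in question 2 is easy to demonstrate with a toy lexer that produces tokens on demand. All names here are illustrative:

```javascript
// Toy demonstration: because tokens are produced lazily (on demand),
// a parse error stops the lexer -- later tokens are never lexed at all.
const lexed = [];
function* lexer(words) {
  for (const word of words) {
    lexed.push(word); // record what actually got lexed
    yield word;
  }
}

function parseDeclaration(tokens) {
  const kw = tokens.next().value; // pulls exactly one token from the lexer
  if (kw !== "const") throw new SyntaxError(`Unexpected token ${kw}`);
  return { type: "Declaration", name: tokens.next().value };
}

const tokens = lexer(["let", "x", "=", "1"]); // nothing is lexed yet
try {
  parseDeclaration(tokens);
} catch (e) {
  console.log(e.message); // "Unexpected token let"
}
console.log(lexed); // [ 'let' ] -- "x", "=", "1" were never lexed
```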

Parsing support: marking to AST

Recall that for generator functions the parsing phase adds generator: true to the FunctionDeclaration node. We will do something similar: if the function carries an @@ token, we add a curry: boolean flag to the same node, which the Transform phase will pick up later. Find the parseFunction method in parser/statement.js:

parseFunction<T: N.NormalFunction>(
  node: T,
  statement?: number = FUNC_NO_FLAGS,
  isAsync?: boolean = false,
): T {
  // ...
  node.generator = this.eat(tt.star);
  node.curry = this.eat(tt.atat);
  // ...
}

With this.eat, a function can be both a generator and curried, but when both tokens are present the * must be written before the @@. You could of course use different token-handling logic here to define your own syntax rules. Let's test against the rules above:

describe("curry function syntax", function () {
  const code = `function * @@ foo(){}`;
  const ast = parse(code);

  it("should have generator", function () {
    expect(ast.program.body[0].generator).toBe(true);
  });

  it("should have curry", function () {
    expect(ast.program.body[0].curry).toBe(true);
  });
});
$ jest curry-function
PASS packages/babel-parser/test/curry-function
  curry function syntax
    ✓ should have generator (1 ms)
    ✓ should have curry (1 ms)

At this point, the two things that the parse phase does, lexical analysis and syntax analysis, are complete.

Transform: Write plug-ins

In the previous step we finished adding the curry: boolean property to the AST's FunctionDeclaration node. Next comes the AST transformation itself, which Babel implements through its plugin mechanism.

Assume we have a function called currying that takes a function as an argument and returns its curried version. The Transform target is then: wrap every FunctionDeclaration node whose curry attribute is true in a call to currying.

// Original code:
function @@ foo(a, b, c) {
  return a + b + c;
}

// Expected code after Transform:
const foo = currying(function foo(a, b, c) {
  return a + b + c;
});
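Before wiring this into Babel, it helps to pin down what currying must do at run time. Here is a plain-JavaScript version using the same logic as the helper added later in this article:

```javascript
// A currying wrapper: collect arguments until the wrapped function's
// declared arity (fn.length) is satisfied, then call it.
function currying(fn) {
  const numParamsRequired = fn.length;
  function curryFactory(params) {
    return function (...args) {
      const newParams = params.concat(args);
      if (newParams.length >= numParamsRequired) {
        return fn(...newParams);
      }
      // Not enough arguments yet: return another collector
      return curryFactory(newParams);
    };
  }
  return curryFactory([]);
}

// The behavior the @@ syntax should ultimately produce:
const foo = currying(function foo(a, b, c) {
  return a + b + c;
});
console.log(foo(1, 2)(3)); // 6
console.log(foo(1)(2)(3)); // 6
```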

In addition, the transform must use our modified parser; the plugin's parserOverride option overrides the default one:

import customParser from '/path-to-babel-project/packages/babel-parser/lib/index.js';

function myCustomPlugin({ types }) {
  return {
    parserOverride(code, opts) {
      return customParser.parse(code, opts);
    }
  };
}

Next we use the visitor object to access and modify the AST. This code may look tedious, but read through it; the comments explain what each statement does:

function myCustomPlugin({ types: t }) {
  return {
    // The visitor object
    visitor: {
      // Visit FunctionDeclaration nodes
      FunctionDeclaration(path) {
        // The FunctionDeclaration nodes marked curry in the parse phase
        if (path.node.curry) {
          // Reset the curry flag
          path.node.curry = false;
          // Start the replacement, reading the arguments inside out:
          path.replaceWith(
            // Variable declaration: t.variableDeclaration(kind, declarations)
            t.variableDeclaration(
              // Declaration kind: "var" | "let" | "const" -- const here
              'const',
              [
                // t.variableDeclarator(id, init)
                t.variableDeclarator(
                  // Use the original function name as the variable name
                  t.identifier(path.get('id.name').node),
                  // Call expression: t.callExpression(callee, arguments)
                  t.callExpression(
                    // Call `currying`
                    t.identifier('currying'),
                    // Pass in the original function as the argument:
                    [
                      // Method 1: convert the current FunctionDeclaration to an expression
                      // t.toExpression(path.node),

                      // Method 2: manually create a function expression
                      t.functionExpression(
                        null,             // anonymous function
                        path.node.params, // same parameters
                        path.node.body,   // same body
                        false,            // not a generator
                        false             // not async
                      )
                    ]
                  )
                )
              ]
            )
          );
        }
      }
    }
  };
}

That's all we need for the visitor, which breaks down into two parts:

  1. Use path to access information about the current node.
  2. Use t to create AST nodes. Here t is babel-types, passed in by Babel when the plugin executes; it is equivalent to import * as t from "@babel/types" or import { types as t } from "@babel/core". Receiving it under the name t seems to be a convention (the official plugins all write it this way).

Now that the plugin works, try it with transformSync:

const code = `function @@ test(a, b, c) { return a + b + c }`;

const output = babel.transformSync(code, {
  plugins: [ myCustomPlugin ],
});

// output: {
//   ...,
//   code: 'const test = currying(function (a, b, c) {\n  return a + b + c;\n});'
// }

One question remains: where should the currying function itself live? Referring back to the section on Babel's project structure, the answer is @babel/helpers. Unfortunately, @babel/helpers is not currently open to extension by plugins, so this part has to be added in the Babel source code:

helpers.currying = helper("7.6.0" /* min version */)`
  export default function currying(fn) {
    const numParamsRequired = fn.length;
    function curryFactory(params) {
      return function (...args) {
        const newParams = params.concat(args);
        if (newParams.length >= numParamsRequired) {
          return fn(...newParams);
        }
        return curryFactory(newParams);
      };
    }
    return curryFactory([]);
  }
`;

Next, modify the plugin to call the helper instead of a bare currying identifier:

// ...
t.callExpression(
  // t.identifier('currying'),
  this.addHelper("currying"),
  [
    t.functionExpression(null, path.node.params, path.node.body, false, false)
  ]
)
// ...

Re-execute the code:

// output: {
//   ...,
//   code: 'function _currying(fn) { const numParamsRequired = fn.length; function curryFactory(params) { return function (...args) { const newParams = params.concat(args); if (newParams.length >= numParamsRequired) { return fn(...newParams); } return curryFactory(newParams); }; } return curryFactory([]); }\n' +
//     '\n' +
//     'const test = _currying(function (a, b, c) {\n' +
//     '  return a + b + c;\n' +
//     '});'
// }

The currying helper is now printed into the compiled output, and no matter how many functions are curried, only one copy of it is emitted.

At this point, the plugin is fully implemented. Finally, let’s review the implementation process:

  1. Based on our understanding of lexical and syntactic analysis, modify the babel-parser package to support the @@ syntax in the parse phase and output AST nodes carrying the curry: boolean mark.
  2. In the Transform phase, use the Babel plugin mechanism to visit FunctionDeclaration nodes, and use the methods provided by babel-types to rewrite the AST of nodes with curry: true into a currying wrapper call.
  3. Put the currying implementation into babel-helpers and call it via the plugin's this.addHelper, so the whole feature works at run time.

References

  • Leveling Up One’s Parsing Game With ASTs
  • Creating custom JavaScript syntax with Babel
  • What’s the difference between parse trees and abstract syntax trees?
  • the-super-tiny-compiler
  • Esprima
  • Babel Plugin Handbook