Study notes on TypeScript compilation principles

The source files for the TypeScript compiler are in the src/compiler directory of the TypeScript source code.

I. The TypeScript compiler

1. Key components

The TypeScript compiler is divided into five key parts:

Scanner, in scanner.ts
Parser, in parser.ts
Binder, in binder.ts
Checker, in checker.ts
Emitter, in emitter.ts

These key components collaborate as follows:

1) The source code passes through the scanner to produce a token stream.

2) The token stream passes through the parser to produce the AST (abstract syntax tree).

3) The AST passes through the binder to produce symbols.

Symbols are the result of binding and are the main building blocks of TypeScript’s semantic system. A symbol connects a declaration node in the AST to the other declarations of the same entity. The checker uses symbols together with the AST to validate the semantics of the source code.

4) The AST and the symbols are fed to the checker, which performs type validation.

5) When JavaScript output is needed, the AST passes through the emitter, which calls on the checker for type information and outputs JavaScript code.

2. Core tools: core.ts

core.ts is the core set of utilities used by the TypeScript compiler. In it, let objectAllocator: ObjectAllocator is a variable defined as a global singleton that provides the following definitions:

getNodeConstructor
getSymbolConstructor
getTypeConstructor
getSignatureConstructor (signatures for index, call, and construct)

3. Key data structures: types.ts

types.ts contains the key data structures and interfaces used throughout the compiler. Here are some of the key parts:

SyntaxKind: AST node types are identified through the SyntaxKind enumeration
TypeChecker: the interface provided by the type checker
CompilerHost: used for interaction between the Program and the System
Node: an AST node

4. System file: system.ts

All interactions between the TypeScript compiler and the operating system go through the System interface.

II. Program: program.ts

In the TypeScript compiler, a compilation context is represented as a Program, which contains the SourceFiles and the compiler options.

1. Using CompilerHost

CompilerHost is the mechanism for interacting with the operating environment: Program --uses--> CompilerHost --uses--> System

The reason for using CompilerHost as an intermediary is that it lets the interface be tailored to the needs of Program without regard to the operating environment. For example, Program does not need to care about the fileExists function of System.
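As a rough sketch of this layering, the snippet below implements a CompilerHost that serves a single file from memory, so the Program never touches System or the disk. The file name main.ts, its contents, and the reads array are made up for illustration; noLib and types: [] are set only so that no library or @types files are requested.

```typescript
import * as ts from 'typescript';

// Hypothetical in-memory "file system" with a single file.
const files = new Map<string, string>([['main.ts', 'export const answer: number = 42;']]);
// Records every file the Program asks the host for.
const reads: string[] = [];

const host: ts.CompilerHost = {
  getSourceFile: (fileName, languageVersion) => {
    reads.push(fileName);
    const text = files.get(fileName);
    return text === undefined ? undefined : ts.createSourceFile(fileName, text, languageVersion);
  },
  getDefaultLibFileName: () => 'lib.d.ts',
  writeFile: () => {},
  getCurrentDirectory: () => '/',
  getCanonicalFileName: fileName => fileName,
  useCaseSensitiveFileNames: () => true,
  getNewLine: () => '\n',
  fileExists: fileName => files.has(fileName),
  readFile: fileName => files.get(fileName),
};

// The Program only ever talks to the host, never to the operating system directly.
const program = ts.createProgram(['main.ts'], { noLib: true, types: [] }, host);
console.log(program.getSourceFiles().map(f => f.fileName), reads);
```

Swapping this host for the one returned by ts.createCompilerHost would make the same Program read from disk, which is the point of the indirection.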

2. SourceFile

Program provides an API for getting the source files: getSourceFiles(): SourceFile[];. Each element of the result is the root node of an abstract syntax tree (called a SourceFile).

SourceFile comprises two parts: SyntaxKind.SourceFile and interface SourceFile.

Each SourceFile is the root node of an AST and is included in the Program.

III. Abstract syntax tree

1. Node

Nodes are the basic building blocks of the abstract syntax tree (AST).

An AST node is documented by two key parts. One is the node’s SyntaxKind enumeration member, which identifies its type within the AST. The other is its interface, the API the node provides once it is instantiated in the AST.

Here are some key members of interface Node:

TextRange: identifies the start and end positions of the node in the source file.
parent?: Node: the parent of the current node in the AST.

Node also has other members, such as flags and modifiers. You can find them by searching for interface Node in the source code, but the members listed above are the ones that matter most for traversing nodes.
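To make these members concrete, here is a small sketch that parses a one-line file and reads kind, pos, end, and parent from its first statement. The file name and source text are arbitrary; passing true as the last argument of createSourceFile asks the parser to set parent pointers.

```typescript
import * as ts from 'typescript';

const text = 'var foo = 123;';
const sf = ts.createSourceFile('foo.ts', text, ts.ScriptTarget.Latest, /* setParentNodes */ true);

// The first (and only) statement is a VariableStatement node.
const statement = sf.statements[0];

console.log(statement.kind === ts.SyntaxKind.VariableStatement); // node type
console.log(statement.pos, statement.end);                       // TextRange members
console.log(statement.parent === sf);                            // parent pointer
```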

2. Visiting a node’s children

There is a utility function, ts.forEachChild, that can be used to visit all the child nodes of any AST node.

Here’s a simplified code snippet to show how it works:

export function forEachChild<T>(node: Node, cbNode: (node: Node) => T, cbNodeArray?: (nodes: Node[]) => T): T {
    if (!node) {
        return;
    }

    switch (node.kind) {
        case SyntaxKind.BinaryExpression:
            return visitNode(cbNode, (<BinaryExpression>node).left) ||
                visitNode(cbNode, (<BinaryExpression>node).operatorToken) ||
                visitNode(cbNode, (<BinaryExpression>node).right);
        case SyntaxKind.IfStatement:
            return visitNode(cbNode, (<IfStatement>node).expression) ||
                visitNode(cbNode, (<IfStatement>node).thenStatement) ||
                visitNode(cbNode, (<IfStatement>node).elseStatement);
        // (other node kinds omitted)
    }
}

This function checks node.kind to determine the node’s interface and then calls cbNode on its children. Note, however, that it does not call visitNode for every child (SyntaxKind.SemicolonToken, for example, is skipped). To get all the children of an AST node, call the node’s member function getChildren.
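The difference between the two traversals can be observed directly. The sketch below collects the child kinds of a variable statement both ways; the source text is arbitrary, and the source file is passed to getChildren explicitly so the example does not rely on parent pointers.

```typescript
import * as ts from 'typescript';

const sf = ts.createSourceFile('foo.ts', 'var foo = 123;', ts.ScriptTarget.Latest, true);
const statement = sf.statements[0];

// Children as seen by forEachChild: only the "interesting" AST properties.
const viaForEachChild: ts.SyntaxKind[] = [];
ts.forEachChild(statement, child => {
  viaForEachChild.push(child.kind);
});

// Children as seen by getChildren: every token, including the semicolon.
const viaGetChildren = statement.getChildren(sf).map(child => child.kind);

console.log(viaForEachChild.indexOf(ts.SyntaxKind.SemicolonToken) !== -1);
console.log(viaGetChildren.indexOf(ts.SyntaxKind.SemicolonToken) !== -1);
```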

The following function prints the details of an AST node and of all its descendants:

function printAllChildren(node: ts.Node, depth = 0) {
    // Indent with four dashes per nesting level
    console.log(new Array(depth + 1).join('----'), ts.syntaxKindToName(node.kind), node.pos, node.end);
    depth++;
    node.getChildren().forEach(c => printAllChildren(c, depth));
}

3. SyntaxKind enumeration

SyntaxKind is defined as a constant enumeration, as follows:

export const enum SyntaxKind {
    Unknown,
    EndOfFileToken,
    SingleLineCommentTrivia,
    // ...
}

Because it is a const enum, its members are inlined (for example, ts.SyntaxKind.EndOfFileToken becomes 1), so working with the AST does not incur the cost of dereferencing the enum object. The compiler itself, however, is built with the --preserveConstEnums flag so that the enumeration is still available at run time; in JavaScript you can therefore still use ts.SyntaxKind.EndOfFileToken where needed. In addition, an enum member can be converted to a readable string with the following function:

export function syntaxKindToName(kind: ts.SyntaxKind) {
    return (<any>ts).SyntaxKind[kind];
}

4. AST trivia

Trivia are the parts of the source text that are largely insignificant for understanding the code: whitespace, comments, conflict markers, and so on. To keep the AST lightweight, trivia are not stored in it, but they can be retrieved on demand through a few ts APIs.

Before showing these APIs, you need to understand the following:

1) Trivia ownership

In general, a token owns any trivia after it on the same line, up to the next token. Any comment after that line is associated with the following token.

Leading and trailing comments in a file are handled as follows: the first token in the source file owns all the initial trivia, and the trivia at the end of the file are attached to the end-of-file token, which otherwise has zero width.

2) Trivia APIs

In most basic uses, comments are the trivia of interest. The comments belonging to a node can be obtained with the following functions.

Function	Description
ts.getLeadingCommentRanges	Given the source text and a position within it, returns the ranges of the comments between the first line break following that position and the token itself.
ts.getTrailingCommentRanges	Given the source text and a position within it, returns the ranges of the comments up to the first line break following that position.

Suppose the following is part of a source file:

debugger;/*hello*/
    //bye
  /*hi*/  function

So, for function, getLeadingCommentRanges returns only the last two comments, //bye and /*hi*/. Likewise, calling getTrailingCommentRanges at the end of the debugger statement returns the comment /*hello*/.
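This behavior can be checked with a few lines. The sketch below runs both functions on the snippet above, starting at the position just after debugger;. The helper show is made up for this example; it only slices the comment text out of the returned ranges.

```typescript
import * as ts from 'typescript';

const text = 'debugger;/*hello*/\n    //bye\n  /*hi*/  function';
// Position right after the `debugger;` statement.
const pos = text.indexOf(';') + 1;

// Hypothetical helper: turn comment ranges into the comment text they cover.
const show = (ranges: ts.CommentRange[] | undefined): string[] =>
  (ranges || []).map(range => text.slice(range.pos, range.end));

// Comments after the first line break, up to the next token (`function`).
const leading = show(ts.getLeadingCommentRanges(text, pos));
// Comments on the same line as the position, up to the first line break.
const trailing = show(ts.getTrailingCommentRanges(text, pos));

console.log(leading, trailing);
```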

3) Token start and full start positions

A node has a so-called token start and a full start.

Token start: the more natural of the two, the position in the file where the text of the token starts.

Full start: the position from which the scanner started scanning after the last significant token.

AST nodes provide the getStart and getFullStart APIs to obtain these two positions. Continuing the example above, for function the token start is the position of function, while the full start is the position of /*hello*/.

Note that the full start even includes trivia that would otherwise be owned by the previous node.
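A short sketch of the two positions, using the same snippet completed into a parseable function declaration (the name f and the empty body are added only so that the parser produces a full node):

```typescript
import * as ts from 'typescript';

const text = 'debugger;/*hello*/\n    //bye\n  /*hi*/  function f() {}';
const sf = ts.createSourceFile('trivia.ts', text, ts.ScriptTarget.Latest, true);

// statements[0] is the debugger statement, statements[1] the function declaration.
const fn = sf.statements[1];

const fullStart = fn.getFullStart(); // right after `debugger;`, i.e. where /*hello*/ begins
const tokenStart = fn.getStart(sf);  // where the `function` keyword begins

console.log(fullStart, tokenStart);
```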

IV. Scanner: scanner.ts

The parser controls the scanner to convert the source code into an abstract syntax tree (AST). That is, the source code goes through the scanner into a token stream, and the token stream goes through the parser into an AST.

1. The parser’s use of the scanner

To avoid the overhead of creating scanners repeatedly, a scanner singleton is created in parser.ts. The parser uses the initializeState function to prime the scanner as needed.

Here is a simplified version of the code in the parser that you can run to demonstrate the above concepts:

import * as ts from 'typescript';

// Singleton scanner
const scanner = ts.createScanner(ts.ScriptTarget.Latest, /* skipTrivia */ true);

// This function is similar to the initializeState function in the parser
function initializeState(text: string) {
    scanner.setText(text);
    scanner.setOnError((message: ts.DiagnosticMessage, length: number) => {
        console.error(message);
    });
    scanner.setScriptTarget(ts.ScriptTarget.ES5);
    scanner.setLanguageVariant(ts.LanguageVariant.Standard);
}

// Usage example
initializeState(`var foo = 123;`.trim());

// Start scanning
var token = scanner.scan();
while (token != ts.SyntaxKind.EndOfFileToken) {
    console.log(ts.formatSyntaxKind(token));
    token = scanner.scan();
}

This code outputs the following:

VarKeyword
Identifier
FirstAssignment
FirstLiteralToken
SemicolonToken

2. Scanner state

After scan is called, the scanner updates its local state, such as the scan position and the details of the current token. The scanner provides a set of utility functions for reading this state. In the example below, we create a scanner and use it to identify the tokens in the code and their positions (code in compiler/scanner/runScannerWithPosition.ts).

// Usage example
initializeState(`var foo = 123;`.trim());

// Start scanning
var token = scanner.scan();
while (token != ts.SyntaxKind.EndOfFileToken) {
    let currentToken = ts.formatSyntaxKind(token);
    let tokenStart = scanner.getStartPos();
    token = scanner.scan();
    let tokenEnd = scanner.getStartPos();
    console.log(currentToken, tokenStart, tokenEnd);
}

This code outputs the following:

VarKeyword 0 3
Identifier 3 7
FirstAssignment 7 9
FirstLiteralToken 9 13
SemicolonToken 13 14

3. Independent scanner

Even though the TypeScript parser has a singleton scanner, you can still create a separate scanner with createScanner and use its setText and setTextPos methods to scan from any position in a file.
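Here is a minimal sketch of a standalone scanner that starts in the middle of a text. It uses the optional start and length arguments of setText rather than setTextPos, which is a deliberate substitution: as far as I know setTextPos has been renamed in recent TypeScript releases, while setText has kept the same shape.

```typescript
import * as ts from 'typescript';

const text = 'var foo = 123;';
const scanner = ts.createScanner(ts.ScriptTarget.Latest, /* skipTrivia */ true);

// Begin scanning at `foo`, skipping the `var` keyword entirely.
const start = text.indexOf('foo');
scanner.setText(text, start, text.length - start);

const tokens: string[] = [];
let token = scanner.scan();
while (token !== ts.SyntaxKind.EndOfFileToken) {
  tokens.push(scanner.getTokenText());
  token = scanner.scan();
}

console.log(tokens);
```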

V. Parser: parser.ts

As described earlier, the parser controls the scanner to convert the source code into an AST. The parser implementation uses the singleton pattern (for reasons similar to the scanner’s). It is actually implemented as namespace Parser, which contains the parser’s various state variables and a singleton scanner (const scanner).

1. The program’s use of the parser

The parser is driven indirectly by the program (through the previously mentioned CompilerHost). Basically, the simplified call stack looks like this:

Program -> CompilerHost.getSourceFile -> (global function in parser.ts) createSourceFile -> Parser.parseSourceFile

parseSourceFile not only primes the state of the parser but also calls initializeState to prime the state of the scanner. It then continues parsing the source code using parseSourceFileWorker. The following example calls the exported createSourceFile function and prints the resulting AST:

import * as ts from 'typescript';

function printAllChildren(node: ts.Node, depth = 0) {
    // Indent with four dashes per nesting level
    console.log(new Array(depth + 1).join('----'), ts.formatSyntaxKind(node.kind), node.pos, node.end);
    depth++;
    node.getChildren().forEach(c => printAllChildren(c, depth));
}

var sourceCode = `var foo = 123;`.trim();
var sourceFile = ts.createSourceFile('foo.ts', sourceCode, ts.ScriptTarget.ES5, true);

printAllChildren(sourceFile);

This code outputs the following:

SourceFile 0 14
---- SyntaxList 0 14
-------- VariableStatement 0 14
---------------- VariableDeclarationList 0 13
-------------------- VarKeyword 0 3
-------------------- SyntaxList 3 13
------------------------ VariableDeclaration 3 13
---------------------------- Identifier 3 7
---------------------------- FirstAssignment 7 9
---------------------------- FirstLiteralToken 9 13
------------ SemicolonToken 13 14
---- EndOfFileToken 14 14

3. Parser functions

parseSourceFile sets the initial state and hands the work off to the parseSourceFileWorker function.

parseSourceFileWorker

The parseSourceFileWorker function creates a SourceFile AST node and then parses the source code using the parseStatements function. Once that returns, the SourceFile node is completed with additional information such as nodeCount, identifierCount, and so on.

parseStatements

The parseStatements function is one of the most important parseFoo functions. It switches on the current token returned by the scanner and calls the corresponding parseFoo function. For example, if the current token is a SemicolonToken, it calls parseEmptyStatement to create an AST node for the empty statement.

Node creation

The parser has a series of parseFoo functions that create nodes of type Foo. They are usually called by other parser functions when a node of the corresponding type is expected. A typical example is parsing an empty statement (such as ;;;;;;), for which the parseEmptyStatement function is used, as follows:

function parseEmptyStatement(): Statement {
    let node = <Statement>createNode(SyntaxKind.EmptyStatement);
    parseExpected(SyntaxKind.SemicolonToken);

    return finishNode(node);
}

It shows three key functions: createNode, parseExpected, and finishNode.

createNode

createNode(kind: SyntaxKind, pos?: number): Node is responsible for creating a node, setting the SyntaxKind passed in, and setting the initial position (by default, the position provided by the current scanner state).

parseExpected

parseExpected(kind: SyntaxKind, diagnosticMessage?: DiagnosticMessage): boolean checks whether the current token in the parser state matches the specified SyntaxKind. If it does not, the function reports the diagnosticMessage passed in, or else creates a generic message of the form "foo expected". Internally it uses the parseErrorAtPosition function (which uses the scan positions) to give good error reporting.

finishNode

finishNode<T extends Node>(node: T, end?: number): T sets the end position of the node and adds some other useful information.

This includes the context flags (parserContextFlags) and whether any parse errors occurred before the node (if so, the AST node cannot be reused in incremental parsing).
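The interplay of these three helpers can be modeled in a few lines. The following is a toy parser, not the compiler's implementation: it only understands semicolons, and every name in it (ToyNode, createToyParser, and so on) is invented for this sketch. It keeps the same shape, though: createNode records the start position, parseExpected consumes a token or records an error, and finishNode records the end position.

```typescript
// Toy model of the parser helpers; nothing here is the real compiler code.
type Kind = 'EmptyStatement' | 'SemicolonToken' | 'EndOfFile' | 'Unknown';

interface ToyNode {
  kind: Kind;
  pos: number;
  end: number;
}

function createToyParser(text: string) {
  let pos = 0; // stands in for the scanner state
  const errors: string[] = [];

  const currentToken = (): Kind =>
    pos >= text.length ? 'EndOfFile' : text.charAt(pos) === ';' ? 'SemicolonToken' : 'Unknown';

  // Creates a node of the given kind starting at the current scan position.
  function createNode(kind: Kind): ToyNode {
    return { kind, pos, end: -1 };
  }

  // Consumes the current token if it matches; otherwise records a generic error.
  function parseExpected(kind: Kind): boolean {
    if (currentToken() === kind) {
      pos++;
      return true;
    }
    errors.push(`'${kind}' expected at position ${pos}`);
    return false;
  }

  // Sets the end position of the node from the current scan position.
  function finishNode(node: ToyNode): ToyNode {
    node.end = pos;
    return node;
  }

  function parseEmptyStatement(): ToyNode {
    const node = createNode('EmptyStatement');
    parseExpected('SemicolonToken');
    return finishNode(node);
  }

  function parseStatements(): ToyNode[] {
    const result: ToyNode[] = [];
    while (currentToken() === 'SemicolonToken') {
      result.push(parseEmptyStatement());
    }
    return result;
  }

  return { parseStatements, parseEmptyStatement, errors };
}

const toyParser = createToyParser(';;;');
const toyStatements = toyParser.parseStatements();
console.log(toyStatements, toyParser.errors);
```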

VI. Binder: binder.ts

Most JavaScript transpilers are simpler than TypeScript’s compiler because they offer little in the way of code analysis. A typical JavaScript transpiler has only the following flow:

source code ~~scanner~~> tokens ~~parser~~> AST ~~emitter~~> JavaScript

This flow does help in understanding how TypeScript generates JavaScript, but it is missing a key feature: TypeScript’s semantic system. To assist type checking (performed by the checker), the binder connects the various parts of the source code into a coherent type system that the checker can then use. The main responsibility of the binder is to create symbols.

1. Symbols

Symbols connect declaration nodes in the AST to the other declarations of the same entity. Symbols are the basic building blocks of the semantic system. The symbol constructor is defined in core.ts, and the binder obtains it through objectAllocator.getSymbolConstructor. Here is the code for the symbol constructor:

function Symbol(flags: SymbolFlags, name: string) {
    this.flags = flags;
    this.name = name;
    this.declarations = undefined;
}

SymbolFlags is a flag enumeration used to identify additional symbol categories, such as the variable scope flags FunctionScopedVariable and BlockScopedVariable.
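These flags can be observed from the outside through the public type checker API. The sketch below builds a Program over one in-memory file (the file name, source text, and host overrides are illustrative, and noLib is set only to keep the example small) and inspects the flags of a var and a let symbol.

```typescript
import * as ts from 'typescript';

const fileName = 'flags.ts';
const source = 'var a = 1;\nlet b = 2;';
const options: ts.CompilerOptions = { noLib: true, types: [] };

// Serve the single in-memory file through an otherwise default host.
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) =>
  name === fileName ? ts.createSourceFile(name, source, languageVersion, true) : undefined;
host.fileExists = name => name === fileName;
host.readFile = name => (name === fileName ? source : undefined);

const program = ts.createProgram([fileName], options, host);
const checker = program.getTypeChecker();
const sourceFile = program.getSourceFile(fileName)!;

// Looks up the symbol created for the first declaration of the given statement.
function flagsOfDeclaration(index: number): ts.SymbolFlags {
  const statement = sourceFile.statements[index] as ts.VariableStatement;
  const symbol = checker.getSymbolAtLocation(statement.declarationList.declarations[0].name);
  return symbol ? symbol.flags : ts.SymbolFlags.None;
}

const varFlags = flagsOfDeclaration(0);
const letFlags = flagsOfDeclaration(1);
console.log((varFlags & ts.SymbolFlags.FunctionScopedVariable) !== 0);
console.log((letFlags & ts.SymbolFlags.BlockScopedVariable) !== 0);
```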

2. The checker’s use of the binder

In effect, the binder is called internally by the type checker, which in turn is called by the program. The simplified call stack looks like this:

program.getTypeChecker ->
    ts.createTypeChecker ->
        initializeTypeChecker ->
            for each SourceFile `ts.bindSourceFile` (in the binder)
            // then
            for each SourceFile `ts.mergeSymbolTable` (in the checker)

SourceFile is the binder’s unit of work; binder.ts is driven by checker.ts.

3. Binder functions

bindSourceFile and mergeSymbolTable are the two key binder functions.

For now, let’s focus on bindSourceFile.

bindSourceFile

This function checks whether file.locals is defined and, if it is not, delegates to the local function bind.

Note: locals is defined on Node and is of type SymbolTable. SourceFile is also a Node (in fact, the root node of the AST).

The TypeScript compiler makes extensive use of local functions. Local functions very likely use variables from the parent function (captured through closures). For example, bind, a local function within bindSourceFile, sets up (directly or through the functions it calls) the symbolCount and classifiableNames state, which is then stored on the returned SourceFile.

bind

bind can handle any node (not just SourceFile). The first thing it does is assign node.parent; then it hands over to bindWorker, which does the heavy lifting; finally, it calls bindChildren.

bindChildren simply stores the binder state (such as the current parent) in its local variables, calls bind on each child node, and then restores the binder state.

bindWorker

bindWorker switches on node.kind (of type SyntaxKind) and delegates the work to the appropriate bindXXX function (also defined in binder.ts). For example, if the node is a SourceFile, it (eventually, and only if the file is an external module) calls bindAnonymousDeclaration.

bindXXX functions

The bindXXX family of functions shares some common patterns and utility functions. One of the most commonly used is the createSymbol function, whose full code looks like this:

function createSymbol(flags: SymbolFlags, name: string): Symbol {
    symbolCount++;
    return new Symbol(flags, name);
}

As you can see, it simply updates symbolCount (a local variable of bindSourceFile) and creates the symbol with the specified parameters.

4. Binder declarations

1) Links between declaration nodes and symbols are established by several functions. One of them, used to bind the SourceFile node to the source file symbol (in the case of an external module), is addDeclarationToSymbol.

Note: The symbol for an external module source file is set up with flags: SymbolFlags.ValueModule and name: '"' + removeFileExtension(file.fileName) + '"'.

function addDeclarationToSymbol(symbol: Symbol, node: Declaration, symbolFlags: SymbolFlags) {
  symbol.flags |= symbolFlags;

  // Create a connection between the AST node and the symbol
  node.symbol = symbol;

  if (!symbol.declarations) {
    symbol.declarations = [];
  }
  // Add the node as a declaration of the symbol
  symbol.declarations.push(node);

  if (symbolFlags & SymbolFlags.HasExports && !symbol.exports) {
    symbol.exports = {};
  }

  if (symbolFlags & SymbolFlags.HasMembers && !symbol.members) {
    symbol.members = {};
  }

  if (symbolFlags & SymbolFlags.Value && !symbol.valueDeclaration) {
    symbol.valueDeclaration = node;
  }
}

The above code does the following:
● Creates a connection from the AST node to the symbol (node.symbol).
● Adds the node as a declaration of the symbol.
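One visible consequence of this linking is that several declarations of the same entity end up on a single symbol. The sketch below declares the same interface twice in one in-memory file and checks, through the public checker API, that both declaration names resolve to one symbol with two declarations. The file name, source text, and host overrides are illustrative.

```typescript
import * as ts from 'typescript';

const fileName = 'point.ts';
const source = 'interface Point { x: number }\ninterface Point { y: number }';
const options: ts.CompilerOptions = { noLib: true, types: [] };

// Serve the single in-memory file through an otherwise default host.
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) =>
  name === fileName ? ts.createSourceFile(name, source, languageVersion, true) : undefined;
host.fileExists = name => name === fileName;
host.readFile = name => (name === fileName ? source : undefined);

const program = ts.createProgram([fileName], options, host);
const checker = program.getTypeChecker();
const sourceFile = program.getSourceFile(fileName)!;

const first = sourceFile.statements[0] as ts.InterfaceDeclaration;
const second = sourceFile.statements[1] as ts.InterfaceDeclaration;

// Both declaration nodes are linked to the same symbol.
const firstSymbol = checker.getSymbolAtLocation(first.name);
const secondSymbol = checker.getSymbolAtLocation(second.name);
const declarationCount = firstSymbol && firstSymbol.declarations ? firstSymbol.declarations.length : 0;

console.log(firstSymbol === secondSymbol, declarationCount);
```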

A declaration is a node with an optional name. Here is its definition in types.ts.

interface Declaration extends Node {
    _declarationBrand: any;
    name?: DeclarationName;
}

5. Binder container

An AST node can act as a container. This determines the kinds of SymbolTables that the node and its associated symbol will have. A container is an abstract concept (it has no associated data structure). The concept is driven by a few things, and the ContainerFlags enumeration is one of them.

The getContainerFlags function (located at binder.ts) drives ContainerFlags as shown in the following example.

function getContainerFlags(node: Node): ContainerFlags {
  switch (node.kind) {
    case SyntaxKind.ClassExpression:
    case SyntaxKind.ClassDeclaration:
    case SyntaxKind.InterfaceDeclaration:
    case SyntaxKind.EnumDeclaration:
    case SyntaxKind.TypeLiteral:
    case SyntaxKind.ObjectLiteralExpression:
      return ContainerFlags.IsContainer;

    case SyntaxKind.CallSignature:
    case SyntaxKind.ConstructSignature:
    case SyntaxKind.IndexSignature:
    case SyntaxKind.MethodDeclaration:
    case SyntaxKind.MethodSignature:
    case SyntaxKind.FunctionDeclaration:
    case SyntaxKind.Constructor:
    case SyntaxKind.GetAccessor:
    case SyntaxKind.SetAccessor:
    case SyntaxKind.FunctionType:
    case SyntaxKind.ConstructorType:
    case SyntaxKind.FunctionExpression:
    case SyntaxKind.ArrowFunction:
    case SyntaxKind.ModuleDeclaration:
    case SyntaxKind.SourceFile:
    case SyntaxKind.TypeAliasDeclaration:
      return ContainerFlags.IsContainerWithLocals;

    case SyntaxKind.CatchClause:
    case SyntaxKind.ForStatement:
    case SyntaxKind.ForInStatement:
    case SyntaxKind.ForOfStatement:
    case SyntaxKind.CaseBlock:
      return ContainerFlags.IsBlockScopedContainer;

    case SyntaxKind.Block:
      // Do not treat a block directly inside a function as a block-scoped container.
      // Locals declared in such a block should go into the function's locals; otherwise 'x'
      // in the following example would not be reported as a redeclaration of a block-scoped local:
      //
      //   function foo() {
      //     var x;
      //     let x;
      //   }
      //
      // If 'var x' were placed in the function's locals and 'let x' in the locals of the block, there would be no conflict.
      //
      // By not creating a new block-scoped container here, both 'var x' and 'let x' go into the function container's locals, so the conflict is detected.
      return isFunctionLike(node.parent) ? ContainerFlags.None : ContainerFlags.IsBlockScopedContainer;
  }

  return ContainerFlags.None;
}

This function is called only in the binder function bindChildren, which sets the node to container or blockScopedContainer based on the result of the getContainerFlags run. The function bindChildren is shown below.

// All container nodes are stored in a linked list in declaration order.
// The getLocalNameOfContainer function in the type checker uses this list to verify the uniqueness of the local names used by the container.
function bindChildren(node: Node) {
  // Save the parent, container, and block container before recursing into the children, and restore these values once the children have been processed.
  let saveParent = parent;
  let saveContainer = container;
  let savedBlockScopeContainer = blockScopeContainer;

  // Now to make this node the parent, we recurse to its children.
  parent = node;

  // Depending on the type of node, the current container or block container needs to be adjusted. If the current node is a container, it is automatically treated as the current block container.
  // Since we know that the container may contain local variables, we initialize the.locals field beforehand.
  // This is done because it is likely that some children will need to be put into.locals (for example, function arguments or variable declarations).
  //
  // However, we will not actively create.locals for block containers, as block containers usually do not have block-scoped variables.
  // We don't want to assign an object to every block we encounter, in most cases it's not necessary.
  //
  // Finally, if it's a block container, we clean up any.locals objects that might exist in the container. This often happens in incremental compilation scenarios.
  // Since we can reuse the last compiled node, which may already have locals objects created.
  // It must be cleaned up to avoid accidentally moving obsolete data from the previous compilation.
  let containerFlags = getContainerFlags(node);
  if (containerFlags & ContainerFlags.IsContainer) {
    container = blockScopeContainer = node;

    if (containerFlags & ContainerFlags.HasLocals) {
      container.locals = {};
    }

    addToContainerChain(container);
  } else if (containerFlags & ContainerFlags.IsBlockScopedContainer) {
    blockScopeContainer = node;
    blockScopeContainer.locals = undefined;
  }

  forEachChild(node, bind);

  container = saveContainer;
  parent = saveParent;
  blockScopeContainer = savedBlockScopeContainer;
}

6. Binder symbol table

The SymbolTable is implemented as a simple hash map. Here is its interface (in types.ts).

interface SymbolTable {
    [index: string]: Symbol;
}

Symbol tables are initialized by the binder. Here are some of the symbol tables used by the compiler.

On Node:

locals?: SymbolTable;

On Symbol:

members?: SymbolTable;
exports?: SymbolTable;

Note: bindChildren initializes locals (to {}) based on ContainerFlags.
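The members and exports tables are also exposed on symbols in the public API, so their contents can be listed. In the sketch below (file name, class, and host overrides are illustrative) the instance property lands in members and the static method in exports. One assumption to flag: in recent TypeScript versions the table is a Map-like object keyed by escaped names rather than the plain object shown above, which is why the helper namesIn uses forEach and ts.unescapeLeadingUnderscores.

```typescript
import * as ts from 'typescript';

const fileName = 'counter.ts';
const source = 'class Counter {\n  count = 0;\n  static create() { return new Counter(); }\n}';
const options: ts.CompilerOptions = { noLib: true, types: [] };

// Serve the single in-memory file through an otherwise default host.
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) =>
  name === fileName ? ts.createSourceFile(name, source, languageVersion, true) : undefined;
host.fileExists = name => name === fileName;
host.readFile = name => (name === fileName ? source : undefined);

const program = ts.createProgram([fileName], options, host);
const checker = program.getTypeChecker();
const classDeclaration = program.getSourceFile(fileName)!.statements[0] as ts.ClassDeclaration;
const classSymbol = checker.getSymbolAtLocation(classDeclaration.name!);

// Hypothetical helper: list the names stored in a symbol table.
function namesIn(table: ts.SymbolTable | undefined): string[] {
  const names: string[] = [];
  if (table) {
    table.forEach((_symbol, key) => names.push(ts.unescapeLeadingUnderscores(key)));
  }
  return names;
}

const memberNames = namesIn(classSymbol ? classSymbol.members : undefined);
const exportNames = namesIn(classSymbol ? classSymbol.exports : undefined);
console.log(memberNames, exportNames);
```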

Populating symbol tables:

The symbol table is populated with symbols, primarily by calling declareSymbol, the full code for which is shown below.

/**
 * Declares a symbol for the specified node and adds it to the symbol table. Reports an error when identifier names conflict.
 * @param symbolTable - The symbol table to which the node's symbol is added
 * @param parent - The declaration of the parent of the specified node
 * @param node - The declaration (node) to add to the symbol table
 * @param includes - SymbolFlags specifying additional declaration kinds for the node (for example, export, ambient, and so on)
 * @param excludes - Flags of declarations that cannot coexist in the symbol table, used to report forbidden declarations
 */
function declareSymbol(
  symbolTable: SymbolTable,
  parent: Symbol,
  node: Declaration,
  includes: SymbolFlags,
  excludes: SymbolFlags
): Symbol {
  Debug.assert(!hasDynamicName(node));

  // The name of a default-exported function or class node is always "default".
  let name = node.flags & NodeFlags.Default && parent ? 'default' : getDeclarationName(node);

  let symbol: Symbol;
  if (name !== undefined) {
    // Check if there is a symbol with the same name in the symbol table. If not, create a new symbol with that name and add it to the table.
    // Note that we have not specified any flags for the new symbol. This ensures you don't run afoul of the incoming Excludes sign.
    //
    // If an existing symbol conflicts with the new symbol to be created
    // (for example, a 'var' symbol and a 'class' symbol in the same symbol table will conflict),
    // report the problem on each declaration of that symbol, and then create a new symbol for this declaration.
    //
    // If the new symbol neither has a same-named entry in the symbol table nor conflicts with an existing symbol, this node is added as the sole declaration of the new symbol.
    //
    // Otherwise, merge into existing compatible symbols (for example, if there are multiple 'var' names in the same container). In this case, the node is added to the declared list of symbols.
    symbol = hasProperty(symbolTable, name)
      ? symbolTable[name]
      : (symbolTable[name] = createSymbol(SymbolFlags.None, name));

    if (name && includes & SymbolFlags.Classifiable) {
      classifiableNames[name] = name;
    }

    if (symbol.flags & excludes) {
      if (node.name) {
        node.name.parent = node;
      }

      // Report the error location of each repeated declaration
      // Report a previously encountered declaration error
      let message =
        symbol.flags & SymbolFlags.BlockScopedVariable
          ? Diagnostics.Cannot_redeclare_block_scoped_variable_0
          : Diagnostics.Duplicate_identifier_0;
      forEach(symbol.declarations, declaration => {
        file.bindDiagnostics.push(
          createDiagnosticForNode(declaration.name || declaration, message, getDisplayName(declaration))
        );
      });
      file.bindDiagnostics.push(createDiagnosticForNode(node.name || node, message, getDisplayName(node)));

      symbol = createSymbol(SymbolFlags.None, name);
    }
  } else {
    symbol = createSymbol(SymbolFlags.None, '__missing');
  }

  addDeclarationToSymbol(symbol, node, includes);
  symbol.parent = parent;

  return symbol;
}

7. Binder error reporting

Binding errors are added to the bindDiagnostics list of the source file.
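As an illustration, the var/let conflict discussed in the getContainerFlags comments can be reproduced from the outside. The sketch below assumes that bind-time errors are surfaced together with checker errors by program.getSemanticDiagnostics, which matches my understanding of the public API; the file name, source text, and host overrides are illustrative.

```typescript
import * as ts from 'typescript';

const fileName = 'redeclare.ts';
const source = 'function foo() {\n  var x;\n  let x;\n}';
const options: ts.CompilerOptions = { noLib: true, types: [] };

// Serve the single in-memory file through an otherwise default host.
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) =>
  name === fileName ? ts.createSourceFile(name, source, languageVersion, true) : undefined;
host.fileExists = name => name === fileName;
host.readFile = name => (name === fileName ? source : undefined);

const program = ts.createProgram([fileName], options, host);

// Flatten each diagnostic into a plain message string.
const messages = program
  .getSemanticDiagnostics(program.getSourceFile(fileName))
  .map(diagnostic => ts.flattenDiagnosticMessageText(diagnostic.messageText, '\n'));

console.log(messages);
```

Each of the two declarations of x should receive its own diagnostic, mirroring the forEach over symbol.declarations in declareSymbol.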

VII. Checker: checker.ts

The checker is what makes TypeScript unique and more powerful than other JavaScript transpilers. It is located in checker.ts and currently has more than 23,000 lines of code, making it the largest part of the compiler.

1. The program’s use of the checker

The checker is initialized by the program. Here is a sample call stack (also shown in the binder section).

program.getTypeChecker ->
    ts.createTypeChecker (in the checker) ->
        initializeTypeChecker (in the checker) ->
            for each SourceFile `ts.bindSourceFile` (in the binder)
            // then
            for each SourceFile `ts.mergeSymbolTable` (in the checker)

2. Relationship with the emitter

The real type checking happens when getDiagnostics is called.

This function is called, for example, when a request is made to program.emit. In that case the checker returns an EmitResolver (the program obtains it by calling the checker’s getEmitResolver function); the EmitResolver is simply a collection of local functions of createTypeChecker.

Here is the call stack for checkSourceFile (checkSourceFile is a local function of createTypeChecker).

program.emit ->
    emitWorker (program local) ->
        createTypeChecker.getEmitResolver ->
            // First, call the following local functions of createTypeChecker
            call getDiagnostics ->
                getDiagnosticsWorker ->
                    checkSourceFile

            // then
            return resolver
            // (resolver has already been initialized in createTypeChecker by a call to the local createResolver() function)
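From the outside, the same laziness can be seen when diagnostics are requested explicitly. In this sketch (file name, source text, and host overrides are illustrative) creating the Program and the checker reports nothing by itself; the type error only appears once getSemanticDiagnostics asks the checker to check the file. The code 2322 is the one TypeScript uses for "Type ... is not assignable to type ...".

```typescript
import * as ts from 'typescript';

const fileName = 'mismatch.ts';
const source = 'const n: number = "text";';
const options: ts.CompilerOptions = { noLib: true, types: [] };

// Serve the single in-memory file through an otherwise default host.
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) =>
  name === fileName ? ts.createSourceFile(name, source, languageVersion, true) : undefined;
host.fileExists = name => name === fileName;
host.readFile = name => (name === fileName ? source : undefined);

const program = ts.createProgram([fileName], options, host);

// Requesting diagnostics is what triggers checking of the source file.
const diagnostics = program.getSemanticDiagnostics(program.getSourceFile(fileName));
const codes = diagnostics.map(diagnostic => diagnostic.code);

console.log(codes);
```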

3. Global namespace merging

The following code exists in initializeTypeChecker.

// initialize the global SymbolTable.
forEach(host.getSourceFiles(), file => {
  if (!isExternalModule(file)) {
    mergeSymbolTable(globals, file.locals);
  }
});

This code essentially merges all of the global symbols into the symbol table let globals: SymbolTable = {} (in createTypeChecker). mergeSymbolTable primarily calls the mergeSymbol function.
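The effect of this merge is observable with two script files (files without imports or exports) that declare the same global interface. In the sketch below the file names, the interface Settings, and the host overrides are all illustrative; the merged symbol should carry one declaration per file and expose the properties of both.

```typescript
import * as ts from 'typescript';

// Two non-module files contributing to the same global interface.
const files = new Map<string, string>([
  ['a.ts', 'interface Settings { a: number }'],
  ['b.ts', 'interface Settings { b: string }'],
]);
const options: ts.CompilerOptions = { noLib: true, types: [] };

// Serve the in-memory files through an otherwise default host.
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) => {
  const text = files.get(name);
  return text === undefined ? undefined : ts.createSourceFile(name, text, languageVersion, true);
};
host.fileExists = name => files.has(name);
host.readFile = name => files.get(name);

const program = ts.createProgram(Array.from(files.keys()), options, host);
const checker = program.getTypeChecker();

const declaration = program.getSourceFile('b.ts')!.statements[0] as ts.InterfaceDeclaration;
const symbol = checker.getSymbolAtLocation(declaration.name)!;

// Properties and declarations of the merged global symbol.
const propertyNames = checker
  .getPropertiesOfType(checker.getDeclaredTypeOfSymbol(symbol))
  .map(property => property.name)
  .sort();
const declaringFiles = (symbol.declarations || []).map(d => d.getSourceFile().fileName);

console.log(propertyNames, declaringFiles);
```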

4. Checker error reporting

The checker reports errors using the local error function, shown below.

function error(location: Node, message: DiagnosticMessage, arg0?: any, arg1?: any, arg2?: any): void {
  let diagnostic = location
    ? createDiagnosticForNode(location, message, arg0, arg1, arg2)
    : createCompilerDiagnostic(message, arg0, arg1, arg2);
  diagnostics.add(diagnostic);
}

VIII. Emitter

The TypeScript compiler provides two emitters.

emitter.ts: the emitter that compiles TypeScript into JavaScript.
declarationEmitter.ts: the emitter that creates declaration files (.d.ts) for TypeScript source files (.ts).

This section looks at emitter.ts.

1. The program’s use of the emitter

Program provides an emit function. This function primarily delegates to the emitFiles function in emitter.ts.

Here is an example call stack:

program.emit -> `emitWorker` (local to createProgram in program.ts) -> `emitFiles` (function in emitter.ts)

2. Emitter function

1) emitFiles

emitFiles is defined in emitter.ts. Its signature is:

export function emitFiles(resolver: EmitResolver, host: EmitHost, targetSourceFile?: SourceFile): EmitResult

EmitHost is a simplified (narrowed-down) version of CompilerHost, and in many use cases the object passed at run time is actually a CompilerHost.

2) emitJavaScript

This function is well commented, as shown below.

function emitJavaScript(jsFilePath: string, root?: SourceFile) {
    let writer = createTextWriter(newLine);
    let write = writer.write;
    let writeTextOfNode = writer.writeTextOfNode;
    let writeLine = writer.writeLine;
    let increaseIndent = writer.increaseIndent;
    let decreaseIndent = writer.decreaseIndent;

    let currentSourceFile: SourceFile;
    // The name of the exporter function, if the file is a System external module:
    // System.register([...], function (<exporter>) {...})
    // export var x; ... x = 1
    // =>
    // var x; ... exporter("x", x = 1)
    let exportFunctionForFile: string;

    let generatedNameSet: Map<string> = {};
    let nodeToGeneratedName: string[] = [];
    let computedPropertyNamesToGeneratedNames: string[];

    let extendsEmitted = false;
    let decorateEmitted = false;
    let paramEmitted = false;
    let awaiterEmitted = false;
    let tempFlags = 0;
    let tempVariables: Identifier[];
    let tempParameters: Identifier[];
    let externalImports: (ImportDeclaration | ImportEqualsDeclaration | ExportDeclaration)[];
    let exportSpecifiers: Map<ExportSpecifier[]>;
    let exportEquals: ExportAssignment;
    let hasExportStars: boolean;

    /** Writes the emitted output to disk */
    let writeEmittedFiles = writeJavaScriptFile;

    let detachedCommentsInfo: { nodePos: number; detachedCommentEndPos: number }[];

    let writeComment = writeCommentRange;

    /** Emits a node */
    let emit = emitNodeWithoutSourceMap;

    /** Called before the node is emitted */
    let emitStart = function (node: Node) {};

    /** Called after the node is emitted */
    let emitEnd = function (node: Node) {};

    /**
     * Emits the text for the given token, searching from startPos.
     * By default the text written is the one provided by tokenKind,
     * but if the optional emitFn callback is given, it is used to emit the text instead.
     * @param tokenKind the kind of token to search for and emit
     * @param startPos the position in the source at which to start searching for the token
     * @param emitFn if given, is invoked to emit the text
     */
    let emitToken = emitTokenText;

    /**
     * Called before starting a lexical scope, such as a function or class, in the emitted code because of the node.
     * @param scopeDeclaration the node that starts the lexical scope
     * @param scopeName optional name of the scope, used instead of deducing one from the declaration node
     */
    let scopeEmitStart = function (scopeDeclaration: Node, scopeName?: string) {};

    /** Called when leaving the scope */
    let scopeEmitEnd = function () {};

    /** The source map data to be encoded */
    let sourceMapData: SourceMapData;

    if (compilerOptions.sourceMap || compilerOptions.inlineSourceMap) {
        initializeEmitterWithSourceMaps();
    }

    if (root) {
        // Do not call emit directly, as currentSourceFile would not be set
        emitSourceFile(root);
    }
    else {
        forEach(host.getSourceFiles(), sourceFile => {
            if (!isExternalModuleOrDeclarationFile(sourceFile)) {
                emitSourceFile(sourceFile);
            }
        });
    }

    writeLine();
    writeEmittedFiles(writer.getText(), /*writeByteOrderMark*/ compilerOptions.emitBOM);
    return;

    // A bunch of local functions follow
}

It basically sets up a bunch of local variables and functions (these local functions make up most of emitter.ts) and then hands off to the local function emitSourceFile, which starts the emit. The emitSourceFile function sets currentSourceFile and then hands off to the local emit function.

function emitSourceFile(sourceFile: SourceFile): void {
    currentSourceFile = sourceFile;
    exportFunctionForFile = undefined;
    emit(sourceFile);
}

The emit function handles the emission of comments and of the actual JavaScript. Emitting the actual JavaScript is the job of the emitJavaScriptWorker function.
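The structure described above, a writer plus shared local state that local functions close over, can be sketched as follows. This is a simplified stand-in and not the compiler's code: SketchWriter, createSketchWriter, emitSketch, emitFile and emitBody are hypothetical names, and the writer only mirrors the method names seen on writer in emitJavaScript.

```typescript
// A simplified stand-in for the writer used by emitJavaScript (an assumption, not the real implementation).
interface SketchWriter {
  write(text: string): void;
  writeLine(): void;
  increaseIndent(): void;
  decreaseIndent(): void;
  getText(): string;
}

function createSketchWriter(newLine: string): SketchWriter {
  let output = "";
  let indent = 0;
  let lineStart = true;

  return {
    write(text) {
      if (text.length === 0) return;
      if (lineStart) {
        // Four spaces per indent level, written lazily at the start of a line.
        for (let i = 0; i < indent; i++) output += "    ";
        lineStart = false;
      }
      output += text;
    },
    writeLine() {
      if (!lineStart) {
        output += newLine;
        lineStart = true;
      }
    },
    increaseIndent() { indent++; },
    decreaseIndent() { indent = Math.max(0, indent - 1); },
    getText() { return output; },
  };
}

// Local functions share state through closure, like currentSourceFile in emitJavaScript.
function emitSketch(fileNames: string[]): string {
  const writer = createSketchWriter("\n");
  let currentFile: string | undefined;

  function emitFile(fileName: string): void {
    currentFile = fileName; // set the shared state first, then hand off to the local emit
    emitBody();
  }

  function emitBody(): void {
    writer.write(`// from ${currentFile}`);
    writer.writeLine();
    writer.write("function f() {");
    writer.writeLine();
    writer.increaseIndent();
    writer.write("return 1;");
    writer.writeLine();
    writer.decreaseIndent();
    writer.write("}");
    writer.writeLine();
  }

  fileNames.forEach(emitFile);
  return writer.getText();
}

const sketchOutput = emitSketch(["a.ts"]);
```

Keeping the state in closure variables means the per-node functions need no extra parameters, which is presumably why emitSourceFile must run before emit: it is the only place where the shared file variable gets set.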

3) emitJavaScriptWorker

The complete function is shown below.

function emitJavaScriptWorker(node: Node) {
  // Check if the node can be emitted regardless of the ScriptTarget
  switch (node.kind) {
    case SyntaxKind.Identifier:
      return emitIdentifier(<Identifier>node);
    case SyntaxKind.Parameter:
      return emitParameter(<ParameterDeclaration>node);
    case SyntaxKind.MethodDeclaration:
    case SyntaxKind.MethodSignature:
      return emitMethod(<MethodDeclaration>node);
    case SyntaxKind.GetAccessor:
    case SyntaxKind.SetAccessor:
      return emitAccessor(<AccessorDeclaration>node);
    case SyntaxKind.ThisKeyword:
      return emitThis(node);
    case SyntaxKind.SuperKeyword:
      return emitSuper(node);
    case SyntaxKind.NullKeyword:
      return write('null');
    case SyntaxKind.TrueKeyword:
      return write('true');
    case SyntaxKind.FalseKeyword:
      return write('false');
    case SyntaxKind.NumericLiteral:
    case SyntaxKind.StringLiteral:
    case SyntaxKind.RegularExpressionLiteral:
    case SyntaxKind.NoSubstitutionTemplateLiteral:
    case SyntaxKind.TemplateHead:
    case SyntaxKind.TemplateMiddle:
    case SyntaxKind.TemplateTail:
      return emitLiteral(<LiteralExpression>node);
    case SyntaxKind.TemplateExpression:
      return emitTemplateExpression(<TemplateExpression>node);
    case SyntaxKind.TemplateSpan:
      return emitTemplateSpan(<TemplateSpan>node);
    case SyntaxKind.JsxElement:
    case SyntaxKind.JsxSelfClosingElement:
      return emitJsxElement(<JsxElement | JsxSelfClosingElement>node);
    case SyntaxKind.JsxText:
      return emitJsxText(<JsxText>node);
    case SyntaxKind.JsxExpression:
      return emitJsxExpression(<JsxExpression>node);
    case SyntaxKind.QualifiedName:
      return emitQualifiedName(<QualifiedName>node);
    case SyntaxKind.ObjectBindingPattern:
      return emitObjectBindingPattern(<BindingPattern>node);
    case SyntaxKind.ArrayBindingPattern:
      return emitArrayBindingPattern(<BindingPattern>node);
    case SyntaxKind.BindingElement:
      return emitBindingElement(<BindingElement>node);
    case SyntaxKind.ArrayLiteralExpression:
      return emitArrayLiteral(<ArrayLiteralExpression>node);
    case SyntaxKind.ObjectLiteralExpression:
      return emitObjectLiteral(<ObjectLiteralExpression>node);
    case SyntaxKind.PropertyAssignment:
      return emitPropertyAssignment(<PropertyDeclaration>node);
    case SyntaxKind.ShorthandPropertyAssignment:
      return emitShorthandPropertyAssignment(<ShorthandPropertyAssignment>node);
    case SyntaxKind.ComputedPropertyName:
      return emitComputedPropertyName(<ComputedPropertyName>node);
    case SyntaxKind.PropertyAccessExpression:
      return emitPropertyAccess(<PropertyAccessExpression>node);
    case SyntaxKind.ElementAccessExpression:
      return emitIndexedAccess(<ElementAccessExpression>node);
    case SyntaxKind.CallExpression:
      return emitCallExpression(<CallExpression>node);
    case SyntaxKind.NewExpression:
      return emitNewExpression(<NewExpression>node);
    case SyntaxKind.TaggedTemplateExpression:
      return emitTaggedTemplateExpression(<TaggedTemplateExpression>node);
    case SyntaxKind.TypeAssertionExpression:
      return emit((<TypeAssertion>node).expression);
    case SyntaxKind.AsExpression:
      return emit((<AsExpression>node).expression);
    case SyntaxKind.ParenthesizedExpression:
      return emitParenExpression(<ParenthesizedExpression>node);
    case SyntaxKind.FunctionDeclaration:
    case SyntaxKind.FunctionExpression:
    case SyntaxKind.ArrowFunction:
      return emitFunctionDeclaration(<FunctionLikeDeclaration>node);
    case SyntaxKind.DeleteExpression:
      return emitDeleteExpression(<DeleteExpression>node);
    case SyntaxKind.TypeOfExpression:
      return emitTypeOfExpression(<TypeOfExpression>node);
    case SyntaxKind.VoidExpression:
      return emitVoidExpression(<VoidExpression>node);
    case SyntaxKind.AwaitExpression:
      return emitAwaitExpression(<AwaitExpression>node);
    case SyntaxKind.PrefixUnaryExpression:
      return emitPrefixUnaryExpression(<PrefixUnaryExpression>node);
    case SyntaxKind.PostfixUnaryExpression:
      return emitPostfixUnaryExpression(<PostfixUnaryExpression>node);
    case SyntaxKind.BinaryExpression:
      return emitBinaryExpression(<BinaryExpression>node);
    case SyntaxKind.ConditionalExpression:
      return emitConditionalExpression(<ConditionalExpression>node);
    case SyntaxKind.SpreadElementExpression:
      return emitSpreadElementExpression(<SpreadElementExpression>node);
    case SyntaxKind.YieldExpression:
      return emitYieldExpression(<YieldExpression>node);
    case SyntaxKind.OmittedExpression:
      return;
    case SyntaxKind.Block:
    case SyntaxKind.ModuleBlock:
      return emitBlock(<Block>node);
    case SyntaxKind.VariableStatement:
      return emitVariableStatement(<VariableStatement>node);
    case SyntaxKind.EmptyStatement:
      return write(';');
    case SyntaxKind.ExpressionStatement:
      return emitExpressionStatement(<ExpressionStatement>node);
    case SyntaxKind.IfStatement:
      return emitIfStatement(<IfStatement>node);
    case SyntaxKind.DoStatement:
      return emitDoStatement(<DoStatement>node);
    case SyntaxKind.WhileStatement:
      return emitWhileStatement(<WhileStatement>node);
    case SyntaxKind.ForStatement:
      return emitForStatement(<ForStatement>node);
    case SyntaxKind.ForOfStatement:
    case SyntaxKind.ForInStatement:
      return emitForInOrForOfStatement(<ForInStatement>node);
    case SyntaxKind.ContinueStatement:
    case SyntaxKind.BreakStatement:
      return emitBreakOrContinueStatement(<BreakOrContinueStatement>node);
    case SyntaxKind.ReturnStatement:
      return emitReturnStatement(<ReturnStatement>node);
    case SyntaxKind.WithStatement:
      return emitWithStatement(<WithStatement>node);
    case SyntaxKind.SwitchStatement:
      return emitSwitchStatement(<SwitchStatement>node);
    case SyntaxKind.CaseClause:
    case SyntaxKind.DefaultClause:
      return emitCaseOrDefaultClause(<CaseOrDefaultClause>node);
    case SyntaxKind.LabeledStatement:
      return emitLabelledStatement(<LabeledStatement>node);
    case SyntaxKind.ThrowStatement:
      return emitThrowStatement(<ThrowStatement>node);
    case SyntaxKind.TryStatement:
      return emitTryStatement(<TryStatement>node);
    case SyntaxKind.CatchClause:
      return emitCatchClause(<CatchClause>node);
    case SyntaxKind.DebuggerStatement:
      return emitDebuggerStatement(node);
    case SyntaxKind.VariableDeclaration:
      return emitVariableDeclaration(<VariableDeclaration>node);
    case SyntaxKind.ClassExpression:
      return emitClassExpression(<ClassExpression>node);
    case SyntaxKind.ClassDeclaration:
      return emitClassDeclaration(<ClassDeclaration>node);
    case SyntaxKind.InterfaceDeclaration:
      return emitInterfaceDeclaration(<InterfaceDeclaration>node);
    case SyntaxKind.EnumDeclaration:
      return emitEnumDeclaration(<EnumDeclaration>node);
    case SyntaxKind.EnumMember:
      return emitEnumMember(<EnumMember>node);
    case SyntaxKind.ModuleDeclaration:
      return emitModuleDeclaration(<ModuleDeclaration>node);
    case SyntaxKind.ImportDeclaration:
      return emitImportDeclaration(<ImportDeclaration>node);
    case SyntaxKind.ImportEqualsDeclaration:
      return emitImportEqualsDeclaration(<ImportEqualsDeclaration>node);
    case SyntaxKind.ExportDeclaration:
      return emitExportDeclaration(<ExportDeclaration>node);
    case SyntaxKind.ExportAssignment:
      return emitExportAssignment(<ExportAssignment>node);
    case SyntaxKind.SourceFile:
      return emitSourceFileNode(<SourceFile>node);
  }
}

Recursion happens simply by calling other emitXXX functions from these functions as needed, for example from emitFunctionDeclaration:

function emitFunctionDeclaration(node: FunctionLikeDeclaration) {
  if (nodeIsMissing(node.body)) {
    return emitOnlyPinnedOrTripleSlashComments(node);
  }

  if (node.kind !== SyntaxKind.MethodDeclaration && node.kind !== SyntaxKind.MethodSignature) {
    // Methods emit their comments as part of emitting the method declaration.
    emitLeadingComments(node);
  }

  // When targeting below ES6, emit function-like declarations, including arrow functions, using the function keyword.
  // When targeting ES6, emit arrow functions natively by omitting the function keyword and using a fat arrow instead.
  if (!shouldEmitAsArrowFunction(node)) {
    if (isES6ExportedDeclaration(node)) {
      write('export ');
      if (node.flags & NodeFlags.Default) {
        write('default ');
      }
    }

    write('function');
    if (languageVersion >= ScriptTarget.ES6 && node.asteriskToken) {
      write('*');
    }
    write(' ');
  }

  if (shouldEmitFunctionName(node)) {
    emitDeclarationName(node);
  }

  emitSignatureAndBody(node);
  if (
    languageVersion < ScriptTarget.ES6 &&
    node.kind === SyntaxKind.FunctionDeclaration &&
    node.parent === currentSourceFile &&
    node.name
  ) {
    emitExportMemberAssignments((<FunctionDeclaration>node).name);
  }
  if (node.kind !== SyntaxKind.MethodDeclaration && node.kind !== SyntaxKind.MethodSignature) {
    emitTrailingComments(node);
  }
}
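The dispatch-and-recurse pattern used by emitJavaScriptWorker and functions such as emitFunctionDeclaration can be shown on a tiny, hypothetical AST. The Expr type and the functions emitExpression, emitBinary and emitParen below are invented for this sketch; only the overall shape (one switch over the node kind, per-kind functions that call back into emit for their children) follows the text.

```typescript
// A tiny hypothetical AST; the real compiler distinguishes nodes by SyntaxKind in the same spirit.
type Expr =
  | { kind: "NumericLiteral"; text: string }
  | { kind: "Identifier"; text: string }
  | { kind: "BinaryExpression"; left: Expr; operator: string; right: Expr }
  | { kind: "ParenthesizedExpression"; expression: Expr };

function emitExpression(node: Expr): string {
  let output = "";
  const write = (text: string): void => { output += text; };

  // Central dispatcher: one case per node kind, like emitJavaScriptWorker.
  function emit(n: Expr): void {
    switch (n.kind) {
      case "NumericLiteral":
      case "Identifier":
        return write(n.text);
      case "BinaryExpression":
        return emitBinary(n);
      case "ParenthesizedExpression":
        return emitParen(n);
    }
  }

  // Per-kind emitters recurse by calling back into emit for their children.
  function emitBinary(n: Extract<Expr, { kind: "BinaryExpression" }>): void {
    emit(n.left);
    write(" " + n.operator + " ");
    emit(n.right);
  }

  function emitParen(n: Extract<Expr, { kind: "ParenthesizedExpression" }>): void {
    write("(");
    emit(n.expression);
    write(")");
  }

  emit(node);
  return output;
}

const tree: Expr = {
  kind: "BinaryExpression",
  left: { kind: "Identifier", text: "x" },
  operator: "*",
  right: {
    kind: "ParenthesizedExpression",
    expression: {
      kind: "BinaryExpression",
      left: { kind: "NumericLiteral", text: "1" },
      operator: "+",
      right: { kind: "Identifier", text: "y" },
    },
  },
};

const emitted = emitExpression(tree); // "x * (1 + y)"
```

No explicit tree walker is needed: the recursion falls out of each per-kind function emitting its own children.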

3. Emitter SourceMap

As mentioned earlier, most of the code in emitter.ts lives inside the local function emitJavaScript (we showed the initialization part of this function earlier).

It basically sets up a bunch of local variables and hands them off to emitSourceFile for processing. Let’s look at the function again, this time focusing on the SourceMap part.

function emitJavaScript(jsFilePath: string, root?: SourceFile) {

    // Irrelevant code has been removed
    let writeComment = writeCommentRange;

    /** Writes the emitted output to disk */
    let writeEmittedFiles = writeJavaScriptFile;

    /** Emits a node */
    let emit = emitNodeWithoutSourceMap;

    /** Called before the node is emitted */
    let emitStart = function (node: Node) {};

    /** Called after the node is emitted */
    let emitEnd = function (node: Node) {};

    /**
     * Emits the text for the given token, searching from startPos.
     * By default the text written is the one provided by tokenKind,
     * but if the optional emitFn callback is given, it is used to emit the text instead.
     * @param tokenKind the kind of token to search for and emit
     * @param startPos the position in the source at which to start searching for the token
     * @param emitFn if given, is invoked to emit the text
     */
    let emitToken = emitTokenText;

    /**
     * Called before starting a lexical scope, such as a function or class, in the emitted code because of the node.
     * @param scopeDeclaration the node that starts the lexical scope
     * @param scopeName optional name of the scope, used instead of deducing one from the declaration node
     */
    let scopeEmitStart = function (scopeDeclaration: Node, scopeName?: string) {};

    /** Called when leaving the scope */
    let scopeEmitEnd = function () {};

    /** The source map data to be encoded */
    let sourceMapData: SourceMapData;
    let sourceMapData: SourceMapData;

    if (compilerOptions.sourceMap || compilerOptions.inlineSourceMap) {
        initializeEmitterWithSourceMaps();
    }

    if (root) {
        // Do not call emit directly as currentSourceFile will not be set
        emitSourceFile(root);
    }
    else {
        forEach(host.getSourceFiles(), sourceFile => {
            if (!isExternalModuleOrDeclarationFile(sourceFile)) {
                emitSourceFile(sourceFile);
            }
        });
    }

    writeLine();
    writeEmittedFiles(writer.getText(), /*writeByteOrderMark*/ compilerOptions.emitBOM);
    return;
}

The important call here is initializeEmitterWithSourceMaps, a function local to emitJavaScript that overrides some of the local functions already defined above. The overrides can be found at the bottom of initializeEmitterWithSourceMaps:

// The last part of the initializeEmitterWithSourceMaps function

writeEmittedFiles = writeJavascriptAndSourceMapFile;
emit = emitNodeWithSourceMap;
emitStart = recordEmitNodeStartSpan;
emitEnd = recordEmitNodeEndSpan;
emitToken = writeTextWithSpanRecord;
scopeEmitStart = recordScopeNameOfNode;
scopeEmitEnd = recordScopeNameEnd;
writeComment = writeCommentRangeWithMap;

In other words, most of the emitter code does not care about SourceMap; it calls these local functions in the same way whether or not source maps are enabled.
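The override pattern can be illustrated with a miniature example. Everything here is hypothetical (emitNames, Span, initializeWithSourceMaps, and the idea of recording character spans instead of real source map entries); it only demonstrates how reassigning hook variables leaves the core emit loop unchanged.

```typescript
// Hypothetical miniature of the pattern: the emitter calls hooks through local variables,
// and an initializer may reassign them before emission starts.
interface Span {
  name: string;
  start: number;
  end: number;
}

function emitNames(names: string[], withSourceMap: boolean): { text: string; spans: Span[] } {
  let text = "";
  const spans: Span[] = [];
  let pendingStart = 0;

  // Default hooks do nothing, like the empty emitStart/emitEnd in emitJavaScript.
  let emitStart = function (_name: string): void {};
  let emitEnd = function (_name: string): void {};

  function initializeWithSourceMaps(): void {
    // Override the hooks; the core emit loop below is left untouched.
    emitStart = function (): void {
      pendingStart = text.length;
    };
    emitEnd = function (name: string): void {
      spans.push({ name, start: pendingStart, end: text.length });
    };
  }

  if (withSourceMap) {
    initializeWithSourceMaps();
  }

  // Core emit logic: identical with or without source maps.
  for (const name of names) {
    emitStart(name);
    text += name + ";";
    emitEnd(name);
  }
  return { text, spans };
}

const plain = emitNames(["a", "bc"], false);
const mapped = emitNames(["a", "bc"], true);
```

Both calls produce the same text; only the mapped run also collects spans, because the hooks were swapped before the loop ran.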

Reference: jkchao.github.io/typescript-…

