Scenario

  • The compiler textbook known as the Dragon Book needs no introduction; it is a programmer's bible. I skimmed part of it a year ago and by now remember almost nothing, for two reasons: I took no reading notes, and I did not practice. As the old saying goes, what you learn on paper always feels shallow; to really understand something you have to practice it.

  • Why did I decide to reread the Dragon Book? It comes back to a pain point in my project: it runs too slowly, and as the project has grown, starting it up has become an extremely long wait. I began to wonder why Node.js has to be the foundation of front-end engineering. Why not another language? Why not Go? Go can do this too. But how do you write a compiler? Read the Dragon Book!

  • What would ideal front-end engineering look like? An efficient compile-and-develop experience; a simple directory containing one executable plus the front-end source files; and a container that smooths over platform differences, so there is no need to worry about Windows versus macOS.

Where the idea came from

  • I first came across esbuild last year, when I tried to run a React project with it. The attempt failed, but the idea of using Go for build tooling was planted in my mind.
  • Is Go really faster than Node.js? Actions speak louder than words, and scripting languages do tend to be slower. Here are Node.js and Go each summing the integers below 100,000:
// nodejs code
console.time("test");
var sum = 0;
var target = 100000;
for (let i = 0; i < target; i++) {
  sum += i;
}
console.log("JS:sum:", sum);
console.timeEnd("test");
// golang code
package main

import (
	"fmt"
	"time"
)

func main() {
	start := time.Now()
	var sum = 0
	var target = 100000
	for i := 0; i < target; i++ {
		sum += i
	}
	fmt.Println("GO:sum:", sum)
	cost := time.Since(start)
	fmt.Println("Runtime:", cost)
}

language    execution time
nodejs      13.074 ms
nodejs      15.89 ms
nodejs      14.844 ms
nodejs      13.337 ms
nodejs      13.316 ms
average     14.1448 ms

language    execution time
golang      67.087 µs
golang      66.343 µs
golang      73.034 µs
golang      71.219 µs
golang      68.233 µs
average     69.1832 µs
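  • Is this even a contest? Look at the units: the Node.js times are in ms, while the Go times are in µs. Going by the averages above, Go is roughly 200 times faster on this small loop.
  • A follow-up question: is concurrent Go code necessarily faster than sequential code? I will leave that open for now.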
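A single timing of such a short loop is noisy, which is why the tables above list five runs. As a minimal sketch, the repetition could be done inside the program itself. The helper names `sumTo` and `averageRuntime` are made up for this illustration and are not part of the original code.

```go
package main

import (
	"fmt"
	"time"
)

// sumTo adds the integers 0..target-1, the same loop as in the post.
func sumTo(target int) int {
	sum := 0
	for i := 0; i < target; i++ {
		sum += i
	}
	return sum
}

// averageRuntime (hypothetical helper) runs fn the given number of times
// and returns the mean duration of one run. runs must be at least 1.
func averageRuntime(runs int, fn func()) time.Duration {
	var total time.Duration
	for r := 0; r < runs; r++ {
		start := time.Now()
		fn()
		total += time.Since(start)
	}
	return total / time.Duration(runs)
}

func main() {
	var result int
	avg := averageRuntime(5, func() { result = sumTo(100000) })
	fmt.Println("GO:sum:", result)
	fmt.Println("Average runtime over 5 runs:", avg)
}
```

The measured average will differ from machine to machine, so no expected timing is given here.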

Sharing what I learned about compiler principles – getting down to business

Learning methodology

  • I had originally written part of this post as reading notes, but even I found them sleep-inducing when I reread them. They were a record of the core concepts in the book, so in the end I decided to keep only a few key diagrams.

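  1. A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language).
graph TD
  source[source program] --> compiler([compiler]) --> target[target program]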

One important task of a compiler is to report the errors it finds in the source program during translation.

If the target program is an executable machine-language program, the user can then run it to process input and produce output.

graph TD
  input --> program([target program]) --> output
  2. An interpreter does not produce a target program as a translation. Instead, it directly executes the operations specified in the source program on the input the user supplies.
graph TD
  source[source program] --> interpreter([interpreter])
  input --> interpreter
  interpreter --> output
  3. The machine-language target program produced by a compiler is usually much faster than an interpreter. An interpreter, on the other hand, can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.

  4. The phases of compilation, in order:

graph TD
  chars[character stream] --> lexer[(lexical analyzer)] --> tokens[token stream]
  tokens --> parser[(syntax analyzer)] --> tree[syntax tree]
  tree --> irgen[(intermediate code generator)] --> ir[intermediate representation]
  ir --> codegen[(code generator)] --> target1[target machine language]
  target1 --> optimizer[(machine-dependent code optimizer)] --> target2[target machine language]

Learning summary

  • After a run of diagrams like the ones above you may be getting impatient, but theory matters for the guidance and the ideas it provides. Without it you are just hacking away, and the code will be hard to read and maintain later on.
  1. Take one stream of characters as input and output another form of code that the target can execute (for example, converting less to CSS).
  2. Converting the text is not enough; checking the code against the rules and reporting warnings along the way is also necessary (for example, if width is written as widdth, the user needs to be told).
  3. The compilation process is as follows: read the characters, strip meaningless whitespace and meaningless symbols such as “;”, and generate token objects; traverse the token list to generate an AST object; traverse the AST depth-first to generate the target code.
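The warning described in point 2 could be sketched as a lookup of each attribute name in a set of known properties. This is only an illustration: `knownProperties` and `checkProperty` are hypothetical names, and the property list is deliberately tiny.

```go
package main

import "fmt"

// knownProperties is a deliberately tiny, hypothetical whitelist; a real
// checker would need the full set of CSS property names.
var knownProperties = map[string]bool{
	"width":  true,
	"height": true,
	"color":  true,
	"margin": true,
}

// checkProperty returns a warning for an unknown property name,
// or an empty string when the name is known.
func checkProperty(name string, line int) string {
	if knownProperties[name] {
		return ""
	}
	return fmt.Sprintf("line %d: unknown property %q", line, name)
}

func main() {
	// Prints: line 3: unknown property "widdth"
	fmt.Println(checkProperty("widdth", 3))
}
```

In the tokenizer shown later in this post, such a check would naturally run at the point where an Attribute token is created.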

Code practice and theory interspersed

Code practice – Read files

  • Designing a real library is far from this simple. What follows is only a bare-bones implementation; a real library needs configuration options and a plugin mechanism, as well as features such as source maps.
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"time"
	// plus the author's own pkg package (its import path is not shown in the post)
)

func main() {
	// Start timing
	start := time.Now()
	// Open the source file
	file, err := os.Open("test.less")
	if err != nil {
		fmt.Printf("Error: %s\n", err)
		return
	}
	defer file.Close()
	// Buffered reader
	br := bufio.NewReader(file)
	for {
		// Read line by line
		line, _, err := br.ReadLine()
		if err == io.EOF {
			break
		}
		if err != nil {
			fmt.Printf("Error: %s\n", err)
			return
		}
		// Turn the line into tokens
		pkg.ReadLine1(line)
	}
	// Declare the AST root object
	var astData pkg.DataNode
	astData.SelectName = "root"
	astData.Children = make([]pkg.DataNode, 0)
	// Generate the entire AST by traversing the tokens
	astData = pkg.GenerateAST(astData)
	// Depth-first traversal generates the result string
	astDataString := pkg.GenerateChild(astData)
	// If the target file already exists, delete it so that a fresh file is generated
	if pkg.CheckFileIsExist("test.css") {
		_ = os.Remove("test.css")
	}
	// Write to the target file
	pkg.WriteFile1(astDataString)
	// Calculate the running time
	cost := time.Since(start)
	fmt.Println("Runtime:", cost)
}

Code practices – Define structures

// Token structure
// Example:
// {
//   TypeName: "Select", // selector
//   Value: "#video"     // selector value
// }
type Token struct {
	TypeName string
	Value    string
}

// Attr is a style attribute
// Example:
// {
//   Name: "width",  // attribute name
//   Value: "100px"  // attribute value
// }
type Attr struct {
	Name  string
	Value string
}

// DataNode is an AST node
type DataNode struct {
	SelectName   string
	Declarations []Attr
	Children     []DataNode
}

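To make the shape of the tree concrete, here is a small hand-built example using the same struct definitions. It assumes that a nested node stores its full selector path in SelectName (the later GenerateChild function prints SelectName directly, so this seems likely, but it is an assumption). `exampleTree` is a hypothetical helper.

```go
package main

import "fmt"

// Attr and DataNode mirror the structures defined in the post.
type Attr struct {
	Name  string
	Value string
}

type DataNode struct {
	SelectName   string
	Declarations []Attr
	Children     []DataNode
}

// exampleTree (hypothetical) builds by hand the tree for the less input
//   #body { width: 100px; #child { height: 50px; } }
// Assumption: the child stores the full selector path "#body #child".
func exampleTree() DataNode {
	return DataNode{
		SelectName:   "#body",
		Declarations: []Attr{{Name: "width", Value: "100px"}},
		Children: []DataNode{
			{
				SelectName:   "#body #child",
				Declarations: []Attr{{Name: "height", Value: "50px"}},
			},
		},
	}
}

func main() {
	tree := exampleTree()
	fmt.Println(tree.SelectName, len(tree.Declarations), len(tree.Children))
	fmt.Println(tree.Children[0].SelectName, tree.Children[0].Declarations[0].Value)
}
```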

Code practice – Generate tokens

// ReadLine1 turns the bytes of one line into tokens.
// Tokens and variableMap are package-level variables; TrimSpace and
// variableFormat are helpers defined elsewhere in the package.
func ReadLine1(lineData []byte) {
	dataString := string(lineData)
	// less variable definition, for example @big:100px
	if strings.HasPrefix(dataString, "@") {
		// Cache the variable
		variableFormat(dataString)
		return
	}
	// Opening line of a style block, such as #video {
	if index := strings.Index(dataString, "{"); index >= 0 {
		selectToken := Token{
			TypeName: "Select",
			Value:    TrimSpace(dataString[:index]),
		}
		Tokens = append(Tokens, selectToken)
	}
	// Closing line of a style block, such as }
	if strings.Contains(dataString, "}") {
		punctuatorToken := Token{
			TypeName: "Punctuator",
			Value:    "}",
		}
		Tokens = append(Tokens, punctuatorToken)
	}
	// Property line in the middle, for example width:100px;
	if index := strings.Index(dataString, ":"); index >= 0 {
		// Style attribute name, such as width
		attributeToken := Token{
			TypeName: "Attribute",
			Value:    TrimSpace(dataString[:index]),
		}
		Tokens = append(Tokens, attributeToken)
		// Attribute value without the trailing ";", such as @big or 100px
		value := TrimSpace(strings.TrimSuffix(TrimSpace(dataString[index+1:]), ";"))
		// If the value is a variable defined in less, substitute the cached value
		if attrValue, ok := variableMap[value]; ok {
			value = attrValue
		}
		Tokens = append(Tokens, Token{TypeName: "Value", Value: value})
	}
}

Theory – Generate tokens

  • Remove whitespace: in a string such as “ width : 100px;” there may be whitespace before or after width or the colon, and it needs to be removed.
  • Remove meaningless symbols: during compilation, symbols such as “;” carry no real meaning and can be dropped (in the less scenario, for example).
  • Terminator: when reading the file, a symbol such as “}” is an explicit terminator, which helps the logic that turns tokens into the AST.
  • Map table: in less, a name such as @big is a variable reference. During later parsing, @big has to be replaced with 100px, so we need a cache to hold this kind of data.
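The post does not show the variableFormat helper or the variable cache, so the following is only a guess at how the map table could work. The body of `variableFormat` and the helper `resolve` are assumptions made for illustration; `strings.Cut` requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"strings"
)

// variableMap caches less variables, for example "@big" -> "100px".
var variableMap = map[string]string{}

// variableFormat parses a definition such as "@big: 100px;" and stores it.
// Lines without a colon are ignored. (Assumed behavior, not the author's code.)
func variableFormat(line string) {
	name, value, found := strings.Cut(line, ":")
	if !found {
		return
	}
	value = strings.TrimSuffix(strings.TrimSpace(value), ";")
	variableMap[strings.TrimSpace(name)] = strings.TrimSpace(value)
}

// resolve (hypothetical) returns the cached value for a variable reference,
// or the input unchanged when it is not a known variable.
func resolve(value string) string {
	if v, ok := variableMap[value]; ok {
		return v
	}
	return value
}

func main() {
	variableFormat("@big: 100px;")
	fmt.Println(resolve("@big")) // prints 100px
	fmt.Println(resolve("50px")) // prints 50px
}
```

A plain map is enough here because variables are defined before use and looked up by exact name; scoped variables would need something richer.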

Code practice – Generate ast

// tokens: the token list
// index: position of tokens[0] within the parent's token list
// characterList: symbol table (selectors of the enclosing levels)
func GenerateChildren1(tokens []Token, index int, characterList []string) (DataNode, int) {
	child := DataNode{}
	attr := Attr{}
	// less allows nested rules:
	// #body{
	//   #child{
	//   }
	// }
	var isBodyClose = false
	// Index of the last token that has already been consumed
	conNum := -1
	for childIndex := 0; childIndex < len(tokens); childIndex++ {
		if childIndex <= conNum {
			continue
		}
		token := tokens[childIndex]
		if token.TypeName == "Select" && !isBodyClose {
			// First selector: this is the body of the current level
			isBodyClose = true
			child = generateChildNode1(token, characterList)
			characterList = append(characterList, token.Value)
			continue
		} else if token.TypeName == "Select" && isBodyClose {
			// A second selector means a nested level: recurse to build the child node
			childTokens := tokens[childIndex:]
			childNode, i := GenerateChildren1(childTokens, childIndex, characterList)
			conNum = i
			child.Children = append(child.Children, childNode)
			continue
		} else if token.Value == "}" {
			// End of the body
			return child, index + childIndex
		} else if token.TypeName == "Attribute" && childIndex+1 < len(tokens) {
			// Store the attribute name and value in the current selector
			attr.Name = token.Value
			attr.Value = tokens[childIndex+1].Value
			child.Declarations = append(child.Declarations, attr)
			conNum = childIndex + 1
		}
	}
	// No closing brace was found (unbalanced input): return what was collected
	return child, index
}

Theory – Generate AST

  • Symbol table: in less, nesting such as #body{#child{width:100px}} means the selector of the outer level must be carried down to the inner level, because the resulting CSS is actually #body #child{width:100px}. The selector path #body #child grows as the nesting gets deeper, forming a tree-like chain: body->parent->child->grandson. In the code above I use an array, characterList, for this, appending each new level to it as it appears.
  • Recursion: because child levels can appear, we use recursion to traverse all the child nodes.
  • End of body: with recursion you must pay attention to the termination condition. In less, the “}” symbol is the natural end marker.

Code practice – Write to file

// GenerateChild turns an abstract syntax tree node into a CSS string
func GenerateChild(child DataNode) string {
	stringLines := child.SelectName + "{" + "\n"
	for _, declaration := range child.Declarations {
		stringLines += declaration.Name + ":" + declaration.Value + ";\n"
	}
	// If the body has no attributes, emit nothing for this node
	if strings.HasSuffix(stringLines, "{\n") {
		stringLines = ""
	} else {
		stringLines += "}\n"
	}
	// Recurse into the child nodes
	for _, childNode := range child.Children {
		stringLines += GenerateChild(childNode)
	}
	return stringLines
}

// WriteFile1 writes the generated CSS to the target file
func WriteFile1(data string) {
	// os.Create creates the file, or truncates it if it already exists
	file, err := os.Create("test.css")
	if err != nil {
		fmt.Printf("open file error=%v\n", err)
		return
	}
	defer file.Close()
	write := bufio.NewWriter(file)
	if _, err := write.WriteString(data); err != nil {
		fmt.Printf("write file error=%v\n", err)
		return
	}
	if err := write.Flush(); err != nil {
		fmt.Printf("flush file error=%v\n", err)
	}
}

Theory – Write to file

  • Abstract syntax tree: we take this tree and generate each line of CSS code with a depth-first traversal.
  • Write efficiency: using bufio can improve write efficiency, because small writes are collected in an in-memory buffer and sent to the file in larger chunks.
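As a small sketch of the buffered-write point, the writer can be separated from the file so that errors are returned instead of printed. `writeCSS` and `cssToString` are hypothetical names; only standard library calls are used.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// writeCSS sends data to w through a buffered writer and returns the first error.
func writeCSS(w io.Writer, data string) error {
	bw := bufio.NewWriter(w)
	if _, err := bw.WriteString(data); err != nil {
		return err
	}
	// Flush pushes any bytes still held in the buffer to w.
	return bw.Flush()
}

// cssToString runs writeCSS against an in-memory builder, which is handy for checks.
func cssToString(data string) (string, error) {
	var sb strings.Builder
	err := writeCSS(&sb, data)
	return sb.String(), err
}

func main() {
	if err := writeCSS(os.Stdout, "#body{\nwidth:100px;\n}\n"); err != nil {
		fmt.Fprintln(os.Stderr, "write error:", err)
	}
	out, _ := cssToString("#child{\nheight:50px;\n}\n")
	fmt.Print(out)
}
```

Forgetting Flush is the usual mistake with bufio.Writer: without it, data that still sits in the buffer never reaches the file.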

Conclusion

  • The current code is only simple logic and does not get to the heart of compiler principles. For example, arithmetic in less, and the state machines, syntax analysis, and intermediate code described by the theory have not been written at all. These are my learning goals and practice scenarios for the next stage.
  • To do: complete the less-to-CSS scenarios, improve compilation efficiency, and use better algorithms and design patterns.
  • Does Go concurrency always improve efficiency? The answer is no. It depends on the scenario: whether the work must run in order, how fine-grained each unit of work is, how time-consuming each operation is, how many CPU cores are available, and so on. Take a look at a code example:
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	start := time.Now()
	var sum = 0
	var target = 100000
	// WaitGroup to wait for all goroutines, mutex to protect sum
	var waitGroup sync.WaitGroup
	var mutex sync.Mutex
	for i := 0; i < target; i++ {
		waitGroup.Add(1)
		go func(val int) {
			mutex.Lock()
			sum += val
			mutex.Unlock()
			waitGroup.Done()
		}(i)
	}
	waitGroup.Wait()
	fmt.Println("GO:sum:", sum)
	cost := time.Since(start)
	fmt.Println("Runtime:", cost)
}

  • As you can see, getting better performance out of concurrency takes a more careful design. Naive concurrency does not improve execution efficiency; it reduces it.
  • Embrace change. I have lived through front-end JSP, the rise of the three major frameworks, and the popularity of Node.js scaffolding. I have lived through the separation of front end and back end, and the Node.js middle layer. Whether the topic is toolchains or middle-tier services, someone will bring up the ecosystem, but has the Node.js ecosystem really been around that long?
  • Focus on the underlying libraries, focus on the core libraries, improve performance, improve the development experience. Go for it.
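Returning to the concurrency example above: one possible "more careful design" is to give each goroutine a contiguous chunk of the range and its own result slot, so no mutex is needed and only a handful of goroutines are started. This is a sketch, not a measured result; `parallelSum` is a hypothetical function, and for a loop this small the sequential version may still win.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelSum (hypothetical) sums 0..target-1 using one contiguous chunk per
// worker. Each goroutine writes only to its own slot, so no mutex is needed.
func parallelSum(target, workers int) int {
	if workers < 1 {
		workers = 1
	}
	partial := make([]int, workers)
	chunk := (target + workers - 1) / workers // ceiling division
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if hi > target {
			hi = target
		}
		wg.Add(1)
		go func(slot, lo, hi int) {
			defer wg.Done()
			s := 0
			for i := lo; i < hi; i++ {
				s += i
			}
			partial[slot] = s
		}(w, lo, hi)
	}
	wg.Wait()
	// Combine the per-worker results sequentially.
	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	fmt.Println("GO:sum:", parallelSum(100000, runtime.NumCPU()))
}
```

The difference from the mutex version is that the goroutines share nothing while they run; the only synchronization point is the final Wait.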