Preface

If you have written Vue, you know how convenient single-file components (.vue files) are. We also know that Vue renders through a virtual DOM underneath, so how exactly does the template in a .vue file get turned into virtual DOM? This part has always been a black box to me and I had never studied it in depth, so today I'm going to find out.

Vue 3 is about to be released, and I originally wanted to look at Vue 3's template compilation directly. But when I opened the Vue 3 source, I realized I didn't even know how Vue 2 compiles templates. As Lu Xun taught us when we were kids, you can't get fat from a single bite, so I'll go back and read the Vue 2 template compiler source first, and leave Vue 3 until its official release.

Vue build versions

Many people use Vue through the project scaffold generated by vue-cli, without realizing that Vue actually ships two builds.

  • vue.js: the full build, which includes the template compiler.
  • vue.runtime.js: the runtime-only build. It cannot compile templates, so templates must be compiled ahead of time with vue-loader.

In short, if you use vue-loader, templates are compiled at build time and vue.runtime.min.js is enough. If you load Vue directly in the browser, you need vue.min.js, which compiles templates at run time.

Compilation entry point

src/platforms/web/entry-runtime-with-compiler.js

// Omit some of the code, keeping only the key parts
import { compileToFunctions } from './compiler/index'

const mount = Vue.prototype.$mount
Vue.prototype.$mount = function (el, hydrating) {
  const options = this.$options
  
  // If there is no render method, then compile the template
  if (!options.render) {
    let template = options.template
    if (template) {
      // Call compileToFunctions and get the render method
      const { render, staticRenderFns } = compileToFunctions(template, {
        shouldDecodeNewlines,
        shouldDecodeNewlinesForHref,
        delimiters: options.delimiters,
        comments: options.comments
      }, this)
      // The render method is the method that generates the virtual DOM
      options.render = render
    }
  }
  return mount.call(this, el, hydrating)
}

Next, let's see where the compileToFunctions method in ./compiler/index comes from.

import { baseOptions } from './options'
import { createCompiler } from 'compiler/index'

// Generate the compile function via the createCompiler method
const { compile, compileToFunctions } = createCompiler(baseOptions)
export { compile, compileToFunctions }

The main logic lives in the compiler module and is a bit convoluted. Since this article is not a line-by-line source analysis, I won't paste the whole source; just look at the logic of this part.

export function createCompiler(baseOptions) {
  const baseCompile = (template, options) => {
    // Parse the HTML and convert it to an AST
    const ast = parse(template.trim(), options)
    // Optimize the AST by marking static nodes
    optimize(ast, options)
    // Convert the AST to executable code
    const code = generate(ast, options)
    return {
      ast,
      render: code.render,
      staticRenderFns: code.staticRenderFns
    }
  }
  const compile = (template, options) => {
    const tips = []
    const errors = []
    // Collect errors and tips reported during compilation
    options.warn = (msg, tip) => {
      (tip ? tips : errors).push(msg)
    }
    // Compile
    const compiled = baseCompile(template, options)
    compiled.errors = errors
    compiled.tips = tips

    return compiled
  }
  const createCompileToFunctionFn = (compile) => {
    // Compilation cache
    const cache = Object.create(null)
    return (template, options, vm) => {
      // A template that has already been compiled is served from the cache
      if (cache[template]) {
        return cache[template]
      }
      const compiled = compile(template, options)
      return (cache[template] = compiled)
    }
  }
  return {
    compile,
    compileToFunctions: createCompileToFunctionFn(compile)
  }
}
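One step the simplified code leaves out: baseCompile produces render as a string of code, while $mount needs a callable function. In Vue 2's source, compileToFunctions bridges that gap with new Function before caching the result. Here is a minimal sketch of that idea. The names toFunction, createCompileToFunctions and stubCompile are hypothetical and exist only for this illustration; the stub stands in for the real parse/optimize/generate pipeline.

```javascript
// Turn a string of generated code into a callable function.
function toFunction(code) {
  return new Function(code)
}

// Wrap a compile function so its results are converted and cached per template.
function createCompileToFunctions(compile) {
  const cache = Object.create(null)
  return function compileToFunctions(template, options) {
    if (cache[template]) {
      return cache[template]
    }
    const compiled = compile(template, options)
    return (cache[template] = {
      render: toFunction(compiled.render),
      staticRenderFns: compiled.staticRenderFns.map(toFunction)
    })
  }
}

// Hypothetical stub compiler: always returns the same tiny render string.
let compileCalls = 0
const stubCompile = () => {
  compileCalls++
  return { render: "with(this){return tag + '!'}", staticRenderFns: [] }
}

const compileToFns = createCompileToFunctions(stubCompile)
const first = compileToFns('<div></div>', {})
const second = compileToFns('<div></div>', {}) // served from the cache
const rendered = first.render.call({ tag: 'div' })
```

The second call returns the very same object as the first, and the stub compiler runs only once, which is the point of keying the cache by the template string.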

Main process

As you can see, the main compilation logic lives in the baseCompile method, which has three steps:

  1. Parse: convert the template code into an AST;
  2. Optimize: optimize the AST to make subsequent virtual DOM updates cheaper;
  3. Generate: convert the AST into executable code.

const baseCompile = (template, options) => {
  // Parse the HTML and convert it to an AST
  const ast = parse(template.trim(), options)
  // Optimize the AST to mark static nodes
  optimize(ast, options)
  // Convert the AST to executable code
  const code = generate(ast, options)
  return {
    ast,
    render: code.render,
    staticRenderFns: code.staticRenderFns
  }
}

parse

AST

Let's start with the parse method. Its main job is to parse the HTML and convert it into an AST (abstract syntax tree). Anyone who has worked with ESLint or Babel will be familiar with ASTs, so let's see what the AST looks like after parse.

Here's an ordinary Vue template:

new Vue({
  el: '#app',
  template: `
    <div>
      <h2 v-if="message">{{message}}</h2>
      <button @click="showName">showName</button>
    </div>
  `,
  data: {
    name: 'shenfq',
    message: 'Hello Vue!'
  },
  methods: {
    showName() {
      alert(this.name)
    }
  }
})

The AST after parse:

The AST is a tree-shaped object in which each level represents a node. The first level is the div (tag: "div"). The div's child nodes are in its children property: the h2 tag, a blank line, and the button tag. Each node also has a type property that marks what kind of node it is. Here the div has type 1, meaning it is an element node. There are three node types:

  1. Element node;
  2. Expression;
  3. Text.

The blank line between the h2 and button tags is a text node with type 3, and the child of the h2 tag is an expression node.
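As a rough picture of that structure, here is a hand-written and heavily abridged sketch of the AST for the template above. It is not captured compiler output: the real nodes carry many more fields (attrsList, attrsMap, parent and so on), and collectTypes is a hypothetical helper added only to walk the tree.

```javascript
// Hand-written, abridged AST for the example template (many fields omitted).
const ast = {
  type: 1, // element
  tag: 'div',
  children: [
    {
      type: 1,
      tag: 'h2',
      if: 'message', // comes from v-if="message"
      children: [
        { type: 2, expression: '_s(message)', text: '{{message}}' } // expression
      ]
    },
    { type: 3, text: ' ' }, // the blank line between h2 and button
    {
      type: 1,
      tag: 'button',
      children: [{ type: 3, text: 'showName' }] // static text
    }
  ]
}

// Hypothetical helper: collect node types in depth-first order.
function collectTypes(node, out = []) {
  out.push(node.type)
  for (const child of node.children || []) collectTypes(child, out)
  return out
}

const nodeTypes = collectTypes(ast)
```

Walking the tree depth-first visits the div, the h2, its expression, the whitespace text node, the button, and finally the button's text.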

Parsing HTML

The overall logic of parse is fairly complex, so let's look at a simplified version of the code to follow the process.

import { parseHTML } from './html-parser'

export function parse(template, options) {
  let root
  parseHTML(template, {
    // some options...
    start() {}, // Callback invoked when a start tag is parsed
    end() {}, // Callback invoked when an end tag is parsed
    chars() {}, // Callback invoked when text is parsed
    comment() {} // Callback invoked when a comment is parsed
  })
  return root
}

As you can see, parse does its work mainly through parseHTML. parseHTML itself comes from the open-source library htmlparser.js, which the Vue team modified to fix some issues.

Let’s take a look at the parseHTML logic.

export function parseHTML(html, options) {
  let index = 0
  let last, lastTag
  const stack = []
  while (html) {
    last = html
    let textEnd = html.indexOf('<')

    // The "<" character is at the start of the current HTML string
    if (textEnd === 0) {
      // 1. Matched a comment: <!-- -->
      if (/^<!\--/.test(html)) {
        const commentEnd = html.indexOf('-->')
        if (commentEnd >= 0) {
          // Call the options.comment callback with the comment content
          options.comment(html.substring(4, commentEnd))
          // Cut the comment off the front of the string
          advance(commentEnd + 3)
          continue
        }
      }

      // 2. Matched a conditional comment, e.g. <![if !IE]>
      if (/^<!\[/.test(html)) {
        // ... the logic is similar to the comment case
      }

      // 3. Matched a doctype, e.g. <!DOCTYPE html>
      const doctypeMatch = html.match(/^<!DOCTYPE [^>]+>/i)
      if (doctypeMatch) {
        // ... the logic is similar to the comment case
      }

      // 4. Matched an end tag: </xxx>
      const endTagMatch = html.match(endTag)
      if (endTagMatch) {
        // ... covered in detail below
      }

      // 5. Matched a start tag: <xxx>
      const startTagMatch = parseStartTag()
      if (startTagMatch) {
        // ... covered in detail below
      }
    }

    // The "<" character is in the middle of the current HTML string
    let text, rest, next
    if (textEnd > 0) {
      // Everything from "<" onward is left for the next pass
      rest = html.slice(textEnd)
      // Everything before "<" is treated as text
      text = html.substring(0, textEnd)
      advance(textEnd)
    }

    // There is no "<" character in the current HTML string
    if (textEnd < 0) {
      text = html
      html = ''
    }

    // If there is text, pass it to the options.chars callback
    if (options.chars && text) {
      options.chars(text)
    }
  }

  // Move forward by cutting n characters off the front of the HTML string
  function advance(n) {
    index += n
    html = html.substring(n)
  }
}

The code above is a simplified version of parseHTML. On each pass, the while loop takes a piece of text from the front of the HTML string and uses regular expressions to determine what kind of text it is, then handles it accordingly, much like the finite state machines common in compiler theory. Each pass splits the string at the first "<" character: whatever comes before it is treated as text, and whatever starts at it is classified with regular expressions into one of a small, fixed set of cases.
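To see that loop shape in isolation, here is a tiny stand-alone scanner written in the same spirit. It is only a sketch: scan and its token objects are hypothetical, the two regular expressions are simplified stand-ins that skip attributes, and comments and doctypes are ignored.

```javascript
// Look at the front of the string, classify it, emit a token, then advance.
function scan(html) {
  const tokens = []
  while (html) {
    const textEnd = html.indexOf('<')
    if (textEnd === 0) {
      // End tag such as </div>
      const end = html.match(/^<\/([a-zA-Z_][\w\-.]*)[^>]*>/)
      if (end) {
        tokens.push({ kind: 'end', value: end[1] })
        html = html.substring(end[0].length)
        continue
      }
      // Start tag such as <div> (attributes are skipped, not parsed)
      const start = html.match(/^<([a-zA-Z_][\w\-.]*)[^>]*>/)
      if (start) {
        tokens.push({ kind: 'start', value: start[1] })
        html = html.substring(start[0].length)
        continue
      }
    }
    // Text runs up to the next "<", or to the end if there is none.
    // A leading "<" that matched neither pattern is consumed as one text char.
    const cut = textEnd > 0 ? textEnd : textEnd < 0 ? html.length : 1
    tokens.push({ kind: 'text', value: html.substring(0, cut) })
    html = html.substring(cut)
  }
  return tokens
}

const scanned = scan('<div>hi <b>there</b></div>')
const summary = scanned.map(t => `${t.kind}:${t.value}`).join('|')
```

Every branch shortens the string, so the loop is guaranteed to terminate, which is the same role advance plays in parseHTML.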

The rest of the logic is not complicated; it is mainly about start and end tags. Let's first look at the regular expressions related to them.

const ncname = '[a-zA-Z_][\\w\\-\\.]*'
const qnameCapture = `((?:${ncname}\\:)?${ncname})`
const startTagOpen = new RegExp(`^<${qnameCapture}`)

This looks like a long regular expression, but it is not hard to untangle. A regex visualization tool is recommended here. Let's look at startTagOpen in such a tool:

One puzzling point is why the tag name can contain a ":". That is the XML namespace syntax, which is rarely used nowadays, so we can ignore it and simplify the regular expressions:

const ncname = '[a-zA-Z_][\\w\\-\\.]*'
// Group 1 captures the tag name; the parsing code below relies on it
const startTagOpen = new RegExp(`^<(${ncname})`)
const startTagClose = /^\s*(\/?)>/
const endTag = new RegExp(`^<\\/(${ncname})[^>]*>`)

Besides the regular expressions for the start and end of tags, there is a particularly long one for extracting tag attributes.

const attribute = /^\s*([^\s"'<>\/=]+)(?:\s*(=)\s*(?:"([^"]*)"+|'([^']*)'+|([^\s"'=<>`]+)))?/

It is easy to follow once you put it into the tool: the = sign is the dividing line, with the attribute name before it and the attribute value after it.
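The capture groups are easier to see by running the expression against a few inputs. In this sketch the regular expression is the one quoted above, while readAttr is a hypothetical helper written just for the demonstration.

```javascript
const attribute = /^\s*([^\s"'<>\/=]+)(?:\s*(=)\s*(?:"([^"]*)"+|'([^']*)'+|([^\s"'=<>`]+)))?/

// Hypothetical helper: read one attribute from the front of a string.
// Group 1 is the name; groups 3, 4 and 5 hold the value for
// double-quoted, single-quoted and unquoted forms respectively.
function readAttr(source) {
  const m = source.match(attribute)
  if (!m) return null
  return { name: m[1], value: m[3] || m[4] || m[5] || '' }
}

const doubleQuoted = readAttr(' id="app"')
const singleQuoted = readAttr(" class='box'")
const unquoted = readAttr(' width=100')
const bare = readAttr(' disabled>')
```

Only one of groups 3 to 5 is ever filled for a given attribute, which is why the value can be picked with a chain of || operators.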

Having the regex sorted out will make it easier for us to look at the rest of the code.

while (html) {
  last = html
  let textEnd = html.indexOf('<')

  // The "<" character is at the start of the current HTML string
  if (textEnd === 0) {
    // some code ...

    // 4. Matched an end tag: </xxx>
    const endTagMatch = html.match(endTag)
    if (endTagMatch) {
      const curIndex = index
      advance(endTagMatch[0].length)
      parseEndTag(endTagMatch[1], curIndex, index)
      continue
    }

    // 5. Matched a start tag: <xxx>
    const startTagMatch = parseStartTag()
    if (startTagMatch) {
      handleStartTag(startTagMatch)
      continue
    }
  }
}

// Move forward by cutting n characters off the front of the HTML string
function advance(n) {
  index += n
  html = html.substring(n)
}

// Check whether we are at a start tag; if so, extract the tag name and attributes
function parseStartTag () {
  // Extract "<xxx"
  const start = html.match(startTagOpen)
  if (start) {
    const [fullStr, tag] = start
    const match = {
      attrs: [],
      start: index,
      tagName: tag
    }
    advance(fullStr.length)
    let end, attr
    // Keep extracting attributes until ">" or "/>" is reached
    while (
      !(end = html.match(startTagClose)) &&
      (attr = html.match(attribute))
    ) {
      advance(attr[0].length)
      match.attrs.push(attr)
    }
    if (end) {
      // "/>" means a self-closing tag
      match.unarySlash = end[1]
      advance(end[0].length)
      match.end = index
      return match
    }
  }
}

// Handle a start tag
function handleStartTag (match) {
  const tagName = match.tagName
  const unary = match.unarySlash
  const len = match.attrs.length
  const attrs = new Array(len)
  for (let i = 0; i < len; i++) {
    const args = match.attrs[i]
    // Groups 3, 4 and 5 correspond to three ways of writing an attribute value
    // 3: attr="xxx" double quotes
    // 4: attr='xxx' single quotes
    // 5: attr=xxx   no quotes
    const value = args[3] || args[4] || args[5] || ''
    attrs[i] = {
      name: args[1],
      value
    }
  }

  if (!unary) {
    // Not a self-closing tag, so push it onto the stack
    stack.push({
      tag: tagName,
      lowerCasedTag: tagName.toLowerCase(),
      attrs: attrs
    })
    lastTag = tagName
  }

  if (options.start) {
    // Start tag callback
    options.start(tagName, attrs, unary, match.start, match.end)
  }
}

// Handle an end tag
function parseEndTag (tagName, start, end) {
  let pos, lowerCasedTagName
  if (start == null) start = index
  if (end == null) end = index

  if (tagName) {
    lowerCasedTagName = tagName.toLowerCase()
  }

  // Look for the nearest open tag of the same type on the stack
  if (tagName) {
    for (pos = stack.length - 1; pos >= 0; pos--) {
      if (stack[pos].lowerCasedTag === lowerCasedTagName) {
        break
      }
    }
  } else {
    pos = 0
  }

  if (pos >= 0) {
    // Close the matched tag and every tag opened after it
    for (let i = stack.length - 1; i >= pos; i--) {
      if (options.end) {
        // End tag callback
        options.end(stack[i].tag, start, end)
      }
    }

    // Remove the closed tags from the stack
    stack.length = pos
    lastTag = pos && stack[pos - 1].tag
  }
}

When a start tag is parsed and it is not a self-closing tag, it is pushed onto a stack. Each time an end tag is parsed, the stack is searched from the top down until the matching tag is found, and every tag above that match is closed along with it. Here's an example:

<div>
  <h2>test</h2>
  <p>
  <p>
</div>

After the start tags of div and h2 are parsed, the stack holds two elements. When h2 is closed, it is popped off the stack. The two unclosed p tags are then parsed, leaving three elements on the stack (div, p, p). If the end tag of the div is parsed at this point, the two unclosed p tags inside the div are closed along with the div itself, and the stack is emptied.
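The walk-through above can be replayed in a few lines. This sketch follows the article's simplified stack logic only; simulate and its event format are hypothetical, and the extra rules in Vue's real parser for tags that may be left open are not modelled.

```javascript
// Replay a sequence of start/end tag events against a tag stack.
function simulate(events) {
  const stack = []
  const closed = [] // tags in the order they get closed
  for (const [kind, tag] of events) {
    if (kind === 'start') {
      stack.push(tag)
      continue
    }
    // End tag: search from the top of the stack for the nearest match.
    let pos = stack.length - 1
    while (pos >= 0 && stack[pos] !== tag) pos--
    if (pos >= 0) {
      // Close the match and everything opened after it, top of stack first.
      for (let i = stack.length - 1; i >= pos; i--) closed.push(stack[i])
      stack.length = pos
    }
  }
  return { stack, closed }
}

const result = simulate([
  ['start', 'div'],
  ['start', 'h2'],
  ['end', 'h2'],
  ['start', 'p'],
  ['start', 'p'],
  ['end', 'div']
])
```

The closing order comes out as h2, p, p, div, and the stack ends up empty, matching the description.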

To make it easier to understand, we have recorded a GIF, as follows:

With the parseHTML logic out of the way, let's go back to the call to parseHTML, which is passed four callbacks: for the start of a tag, the end of a tag, text, and comments.

parseHTML(template, {
  // some options...

  // Callback invoked when a start tag is parsed
  start(tag, attrs, unary) {},
  // Callback invoked when an end tag is parsed
  end(tag) {},
  // Callback invoked when text is parsed
  chars(text) {},
  // Callback invoked when a comment is parsed
  comment(text) {}
})

Handling the start tag

First, the start tag callback: it creates an AST node, processes the attributes on the tag, and finally places the AST node into the tree.

function makeAttrsMap(attrs) {
  const map = {}
  for (let i = 0, l = attrs.length; i < l; i++) {
    const { name, value } = attrs[i]
    map[name] = value
  }
  return map
}
function createASTElement(tag, attrs, parent) {
  const attrsList = attrs
  const attrsMap = makeAttrsMap(attrsList)
  return {
    type: 1,       // Node type
    tag,           // Node name
    attrsMap,      // Node attribute map
    attrsList,     // Array of node attributes
    parent,        // Parent node
    children: []   // Child nodes
  }
}

const stack = []
let root // The root node
let currentParent // Holds the current parent node
parseHTML(template, {
  // some options...

  // Resolve the callback to the beginning of the tag position
  start(tag, attrs, unary) {
    // Create an AST node
    let element = createASTElement(tag, attrs, currentParent)

    // Process directives: v-for, v-if, v-once
    processFor(element)
    processIf(element)
    processOnce(element)
    processElement(element, options)

    // Attach the node to the AST
    // If there is no root node yet, make this element the root
    if (!root) {
      root = element
      checkRootConstraints(root)
    }
    // The parent node exists
    if (currentParent) {
      // Push the element into the children of the parent node
      currentParent.children.push(element)
      element.parent = currentParent
    }
    if (!unary) {
      // A non-self-closing tag is pushed onto the stack and becomes the current parent
      currentParent = element
      stack.push(element)
    }
  }
})

Handling the end tag

The logic for an end tag is relatively simple: pop the most recently opened tag off the stack and close it.

parseHTML(template, {
  // some options...

  // Callback invoked when an end tag is parsed
  end() {
    const element = stack[stack.length - 1]
    const lastNode = element.children[element.children.length - 1]
    // Remove trailing whitespace
    if (lastNode && lastNode.type === 3 && lastNode.text === ' ') {
      element.children.pop()
    }
    // Pop the stack and reset the current parent node
    stack.length -= 1
    currentParent = stack[stack.length - 1]
  }
})

Handling text

Once the tags are handled, the text inside them needs processing. There are two cases: text that contains expressions, and purely static text.

parseHTML(template, {
  // some options...

  // Callback invoked when text is parsed
  chars(text) {
    if (!currentParent) {
      // Text with no parent element is ignored
      return
    }

    const children = currentParent.children
    text = text.trim()
    if (text) {
      // parseText parses the expressions in the text
      // delimiters are the expression markers, ['{{', '}}'] by default
      const res = parseText(text, delimiters)
      if (res) {
        // Expression
        children.push({
          type: 2,
          expression: res.expression,
          tokens: res.tokens,
          text
        })
      } else {
        // Static text
        children.push({
          type: 3,
          text
        })
      }
    }
  }
})

Let’s see how parseText parses expressions.

// Build a regular expression that matches expressions
const buildRegex = delimiters => {
  const open = delimiters[0]
  const close = delimiters[1]
  return new RegExp(open + '((?:.|\\n)+?)' + close, 'g')
}

function parseText (text, delimiters) {
  // delimiters default to {{ }}
  const tagRE = buildRegex(delimiters || ['{{', '}}'])
  // No expression was matched
  if (!tagRE.test(text)) {
    return
  }
  const tokens = []
  const rawTokens = []
  let lastIndex = tagRE.lastIndex = 0
  let match, index, tokenValue
  while ((match = tagRE.exec(text))) {
    // The starting position of the expression
    index = match.index
    // Extract the static character from the beginning of the expression and put it into the token
    if (index > lastIndex) {
      rawTokens.push(tokenValue = text.slice(lastIndex, index))
      tokens.push(JSON.stringify(tokenValue))
    }
    // Extract the content inside the expression and wrap it with the _s() method
    const exp = match[1].trim()
    tokens.push(`_s(${exp})`)
    rawTokens.push({ '@binding': exp })
    lastIndex = index + match[0].length
  }
  // The expression is followed by other static characters, which are put into the token
  if (lastIndex < text.length) {
    rawTokens.push(tokenValue = text.slice(lastIndex))
    tokens.push(JSON.stringify(tokenValue))
  }
  return {
    expression: tokens.join('+'),
    tokens: rawTokens
  }
}


The expression is first extracted with a regular expression:

The code alone can be hard to follow, so let's go straight to an example. Here is a piece of text that contains an expression.

<div>Logged in: {{isLogin ? 'Yes' : 'No'}}</div>
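To make the result concrete, here is a compact, self-contained restatement of parseText applied to the text inside that div. It is a sketch rather than Vue's exact code: escapeRegex is a hypothetical helper added so that delimiters are always treated literally, and pushStatic is a small local helper.

```javascript
// Hypothetical helper: escape characters that are special in a RegExp.
const escapeRegex = s => s.replace(/[-.*+?^${}()|[\]\/\\]/g, '\\$&')

function parseText(text, delimiters = ['{{', '}}']) {
  const [open, close] = delimiters.map(escapeRegex)
  const tagRE = new RegExp(open + '((?:.|\\n)+?)' + close, 'g')
  if (!tagRE.test(text)) return // no expression in this text
  tagRE.lastIndex = 0 // test() moved lastIndex, so rewind before exec()

  const tokens = []    // pieces of generated code
  const rawTokens = [] // static strings and { '@binding': exp } objects
  const pushStatic = str => {
    rawTokens.push(str)
    tokens.push(JSON.stringify(str))
  }

  let lastIndex = 0
  let match
  while ((match = tagRE.exec(text))) {
    // Static text between the previous expression and this one
    if (match.index > lastIndex) pushStatic(text.slice(lastIndex, match.index))
    const exp = match[1].trim()
    tokens.push(`_s(${exp})`)
    rawTokens.push({ '@binding': exp })
    lastIndex = match.index + match[0].length
  }
  // Static text after the last expression
  if (lastIndex < text.length) pushStatic(text.slice(lastIndex))

  return { expression: tokens.join('+'), tokens: rawTokens }
}

const parsed = parseText("Logged in: {{isLogin ? 'Yes' : 'No'}}")
```

Here parsed.expression is a string of code that concatenates the quoted static prefix with an _s(...) call wrapping the ternary, and parsed.tokens holds the static prefix followed by one '@binding' object.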

optimize

After the series of steps above, we have the AST of the Vue template. Because Vue is reactive by design, the AST then goes through a round of optimization so that static data never enters the update phase of the virtual DOM, which improves performance.

export function optimize (root, options) {
  if (!root) return
  // Mark the static node
  markStatic(root)
}

Put simply, it sets the static property of every static node to true.

function isStatic (node) {
  if (node.type === 2) { // Expression that returns false
    return false
  }
  if (node.type === 3) { // Static text, return true
    return true
  }
  // Some conditions have been omitted
  return !!(
    !node.hasBindings && // No dynamic bindings
    !node.if && !node.for && // No v-if / v-for
    !isBuiltInTag(node.tag) && // Not a built-in tag (slot / component)
    !isDirectChildOfTemplateFor(node) && // Not a direct child of a template with v-for
    Object.keys(node).every(isStaticKey) // Every key on the node is a static key
  )
}

function markStatic (node) {
  node.static = isStatic(node)
  if (node.type === 1) {
    // For an element node, all child nodes need to be traversed
    for (let i = 0, l = node.children.length; i < l; i++) {
      const child = node.children[i]
      markStatic(child)
      if (!child.static) {
        // If any child is not static, this node must be dynamic as well
        node.static = false
      }
    }
  }
}
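The way a single dynamic child turns its ancestors dynamic is easy to check on a small tree. In this sketch isStatic is deliberately reduced to the node type plus the hasBindings, if and for flags, so it is an assumption-laden simplification rather than Vue's full check, and the tree is made up for the example.

```javascript
// Reduced isStatic: ignores built-in tags, template v-for and static keys.
function isStatic(node) {
  if (node.type === 2) return false // expression
  if (node.type === 3) return true  // plain text
  return !node.hasBindings && !node.if && !node.for
}

function markStatic(node) {
  node.static = isStatic(node)
  if (node.type === 1) {
    for (const child of node.children) {
      markStatic(child)
      // One dynamic child makes the parent dynamic too.
      if (!child.static) node.static = false
    }
  }
}

// Made-up tree: <div><p>fixed</p><span>{{msg}}</span></div>
const tree = {
  type: 1,
  tag: 'div',
  children: [
    { type: 1, tag: 'p', children: [{ type: 3, text: 'fixed' }] },
    { type: 1, tag: 'span', children: [{ type: 2, expression: '_s(msg)' }] }
  ]
}
markStatic(tree)
```

After the call, the p subtree is marked static, while the span and the root div are not, because the expression node under the span propagates upward.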

generate

Once we have the optimized AST, it needs to be converted into the render method. Using the earlier template again, let's see what the generated code looks like:

<div>
  <h2 v-if="message">{{message}}</h2>
  <button @click="showName">showName</button>
</div>

{
  render: `with(this){return _c('div',[(message)?_c('h2',[_v(_s(message))]):_e(),_v(" "),_c('button',{on:{"click":showName}},[_v("showName")])])}`
}

Expand the generated code:

with (this) {
  return _c(
    'div',
    [
      (message) ? _c('h2', [_v(_s(message))]) : _e(),
      _v(' '),
      _c('button', { on: { click: showName } }, [_v('showName')])
    ]
  )
}

All the underscore-prefixed names can look confusing at first. _c corresponds to the createElement method of the virtual DOM, and the other underscore helpers are defined in core/instance/render-helpers, each with its own narrow job.
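One way to get a feel for these helpers is to run the generated code against stand-ins. In the sketch below, ctx and its four helpers are hypothetical stubs that build plain objects instead of real VNodes; they only mimic the roles of createElement (_c), text node creation (_v), stringification (_s) and the empty placeholder node (_e).

```javascript
// Stub context: data, a method, and simplified render helpers.
const ctx = {
  message: 'Hello Vue!',
  showName() {},
  _c: (tag, a, b) => {
    // As with createElement, the data object is optional.
    const hasData = a !== undefined && !Array.isArray(a)
    return { tag, data: hasData ? a : undefined, children: (hasData ? b : a) || [] }
  },
  _v: text => ({ text }),                    // text node
  _s: value => String(value),                // simplified stringification
  _e: () => ({ text: '', isComment: true })  // empty placeholder node
}

// The generated code for the example template.
const code = `with(this){return _c('div',[(message)?_c('h2',[_v(_s(message))]):_e(),_v(" "),_c('button',{on:{"click":showName}},[_v("showName")])])}`

// with(this) makes message, showName and the helpers resolve against ctx.
const render = new Function(code)
const vnode = render.call(ctx)
```

The result is a nested object whose root is the div with three children: the h2, the whitespace text node, and the button carrying its click handler in data.on.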

The conversion itself is just straightforward string concatenation. Below is a simplified part of the logic, without going into much detail.

export function generate(ast, options) {
  const state = new CodegenState(options)
  const code = ast ? genElement(ast, state) : '_c("div")'
  return {
    render: `with(this){return ${code}}`,
    staticRenderFns: state.staticRenderFns
  }
}

export function genElement (el, state) {
  let code
  const data = genData(el, state)
  const children = genChildren(el, state, true)
  code = `_c('${el.tag}'${
    data ? `,${data}` : '' // data
  }${
    children ? `,${children}` : '' // children
  })`
  return code
}
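The string concatenation can be tried end to end on a tiny AST. This is a sketch under clear assumptions: genNode and this cut-down genElement are hypothetical, handle only elements, expressions and static text, and skip the data object, directives and static render functions entirely.

```javascript
// Generate code for one AST node.
function genNode(node) {
  if (node.type === 1) return genElement(node)
  // Expression nodes already hold code; static text must be quoted.
  if (node.type === 2) return `_v(${node.expression})`
  return `_v(${JSON.stringify(node.text)})`
}

// Generate a _c(...) call for an element and, recursively, its children.
function genElement(el) {
  const children = el.children.map(genNode).join(',')
  return `_c('${el.tag}'${children ? `,[${children}]` : ''})`
}

// Made-up AST for: <p>Hi {{name}}<b>!</b></p>
const sampleAst = {
  type: 1,
  tag: 'p',
  children: [
    { type: 2, expression: '"Hi "+_s(name)' },
    { type: 1, tag: 'b', children: [{ type: 3, text: '!' }] }
  ]
}

const generated = genElement(sampleAst)
const renderCode = `with(this){return ${generated}}`
```

The output has the same shape as the generated code shown earlier: nested _c calls with _v wrapping both the expression and the static text.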

Conclusion

That covers the whole Vue template compilation process, with the focus on parsing HTML to produce the AST. This article only sketches the main flow and leaves out many details, such as how template/slot and directives are handled. If you want those details, read the source code directly. I hope you found this article useful.