Author: Ji Zhi

Preface

To understand the MarkdownIt source code, you first need to understand its two base classes, Ruler and Token.

Token

A token is commonly known as a lexical unit.

MarkdownIt (md) receives a string and runs it through a series of parsers that turn it into a sequence of tokens. It then calls the render rule that corresponds to each token, takes the tokens as input, and finally outputs an HTML string.

Let’s start with the definition of Token, which is located in lib/token.js.

function Token(type, tag, nesting) {
  this.type     = type;
  this.tag      = tag;
  this.attrs    = null;
  this.map      = null;
  this.nesting  = nesting;
  this.level    = 0;
  this.children = null;
  this.content  = '';
  this.markup   = '';
  this.info     = '';
  this.meta     = null;
  this.block    = false;
  this.hidden   = false;
}

  • type

    The token type. For example, paragraph_open, paragraph_close, and hr correspond to <p>, </p>, and <hr>, respectively.

  • tag

    The tag name, such as p or strong. An empty string ('') is used for tokens that have no tag, such as plain text.

  • attrs

    The attributes of the HTML element. If present, it is a two-dimensional array of [name, value] pairs, such as [["href", "http://dev.nodeca.com"]].

  • map

    The source line range of the token. The array has exactly two elements: the first is the start line and the second is the end line.

  • nesting

    The tag type: 1 indicates an opening tag, 0 indicates a self-closing tag, and -1 indicates a closing tag. For example, <p>, <hr>, and </p>.

  • level

    The nesting level of the token.

  • children

    An array of child tokens. Only tokens of type inline or image have children, because the content of an inline token goes through a further parser that extracts finer-grained tokens, as in the following scenario.

    const MarkdownIt = require('markdown-it')
    const md = new MarkdownIt()

    const src = '__Advertisement :)__'
    const result = md.render(src)

    // Parsing first produces an inline token like this:
    // {
    //   ...,
    //   content: "__Advertisement :)__",
    //   children: [Token, ...]
    // }
    // The content still needs to be parsed: the "__" markers are extracted and
    // rendered as <strong> tags. The inline token's children array stores the
    // resulting child tokens.
  • content

    The content between the tags.

  • markup

    The markup string of a particular syntax. For example, three backticks indicate a fenced code block, "**" is the emphasis syntax, and "-" or "+" marks a list item.

  • info

    A token whose type is fence has an info attribute. What is a fence? It looks like this:

    /**
    ```js
    let md = new MarkdownIt()
    ```
    **/

    The content inside the comment above is a fence. The info of its token is js, and its markup is the three-backtick fence marker.

  • meta

    A place where plugins can store arbitrary data.

  • block

    True for block-level tokens and false for inline ones: tokens generated by ParserBlock have block set to true, and tokens generated by ParserInline have it set to false.

  • hidden

    If true, the token is not rendered.
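
To make the fields above concrete, here is a minimal sketch of the token stream that a paragraph containing bold text might produce. The Token constructor is a condensed copy of the one shown earlier, and the tokens are built by hand for illustration; the stream MarkdownIt actually produces may differ in details such as extra empty text tokens.

```javascript
// Condensed copy of the Token constructor shown above (illustration only).
function Token(type, tag, nesting) {
  this.type = type;
  this.tag = tag;
  this.attrs = null;
  this.map = null;
  this.nesting = nesting;
  this.level = 0;
  this.children = null;
  this.content = '';
  this.markup = '';
  this.info = '';
  this.meta = null;
  this.block = false;
  this.hidden = false;
}

// Hand-built tokens for the one-line source "__Advertisement :)__".
const pOpen = new Token('paragraph_open', 'p', 1);
pOpen.block = true;
pOpen.map = [0, 1]; // start line, end line

const inline = new Token('inline', '', 0);
inline.block = true;
inline.level = 1; // nested inside the paragraph
inline.content = '__Advertisement :)__';

// The inline token's content is parsed again into child tokens.
const strongOpen = new Token('strong_open', 'strong', 1);
strongOpen.markup = '__';
const textToken = new Token('text', '', 0);
textToken.level = 1;
textToken.content = 'Advertisement :)';
const strongClose = new Token('strong_close', 'strong', -1);
strongClose.markup = '__';
inline.children = [strongOpen, textToken, strongClose];

const pClose = new Token('paragraph_close', 'p', -1);
pClose.block = true;

const tokens = [pOpen, inline, pClose];

// In a well-formed stream the nesting values sum to zero:
// every opening tag (1) is matched by a closing tag (-1).
const balance = (list) => list.reduce((sum, t) => sum + t.nesting, 0);
console.log(balance(tokens), balance(inline.children)); // 0 0
```

The nesting sum is a quick sanity check when writing a plugin that emits its own open and close tokens.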

Let's look at the prototype methods.

  • attrIndex()

    Token.prototype.attrIndex = function attrIndex(name) {
      var attrs, i, len;

      if (!this.attrs) { return -1; }

      attrs = this.attrs;

      for (i = 0, len = attrs.length; i < len; i++) {
        if (attrs[i][0] === name) { return i; }
      }
      return -1;
    };

    Returns the index of the attribute with the given name, or -1 if the token has no such attribute.

  • attrPush()

    Token.prototype.attrPush = function attrPush(attrData) {
      if (this.attrs) {
        this.attrs.push(attrData);
      } else {
        this.attrs = [ attrData ];
      }
    };

    Adds a [name, value] pair to the attrs array, creating the array if necessary.

  • attrSet

    Token.prototype.attrSet = function attrSet(name, value) {
      var idx = this.attrIndex(name),
          attrData = [ name, value ];

      if (idx < 0) {
        this.attrPush(attrData);
      } else {
        this.attrs[idx] = attrData;
      }
    };

    Overrides or adds a [name, value] pair.

  • attrGet

    Token.prototype.attrGet = function attrGet(name) {
      var idx = this.attrIndex(name), value = null;
      if (idx >= 0) {
        value = this.attrs[idx][1];
      }
      return value;
    };

    Returns the attribute value for the given name, or null if the attribute does not exist.

  • attrJoin

    Token.prototype.attrJoin = function attrJoin(name, value) {
      var idx = this.attrIndex(name);

      if (idx < 0) {
        this.attrPush([ name, value ]);
      } else {
        this.attrs[idx][1] = this.attrs[idx][1] + ' ' + value;
      }
    };

    Appends the value to the existing value of the named attribute, separated by a space. If the attribute does not exist yet, it is added.
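
The following sketch shows how the attribute helpers above behave together. The Token class here is a condensed stand-in written for this example (it keeps only the fields the helpers need), and the link_open token and its attribute values are made up for illustration.

```javascript
// Condensed stand-in for Token with the attribute helpers shown above
// (illustration only; the real class lives in lib/token.js).
class Token {
  constructor(type, tag, nesting) {
    this.type = type;
    this.tag = tag;
    this.nesting = nesting;
    this.attrs = null;
  }
  attrIndex(name) {
    if (!this.attrs) { return -1; }
    return this.attrs.findIndex((pair) => pair[0] === name);
  }
  attrPush(attrData) {
    if (this.attrs) { this.attrs.push(attrData); } else { this.attrs = [attrData]; }
  }
  attrSet(name, value) {
    const idx = this.attrIndex(name);
    if (idx < 0) { this.attrPush([name, value]); } else { this.attrs[idx] = [name, value]; }
  }
  attrGet(name) {
    const idx = this.attrIndex(name);
    return idx >= 0 ? this.attrs[idx][1] : null;
  }
  attrJoin(name, value) {
    const idx = this.attrIndex(name);
    if (idx < 0) { this.attrPush([name, value]); } else { this.attrs[idx][1] += ' ' + value; }
  }
}

const link = new Token('link_open', 'a', 1);
link.attrSet('href', 'http://dev.nodeca.com');
link.attrJoin('class', 'external');          // no class yet, so the pair is added
link.attrJoin('class', 'nofollow');          // class exists, so the value is appended
link.attrSet('href', 'https://example.com'); // replaces the existing href pair

console.log(link.attrs);
// [ [ 'href', 'https://example.com' ], [ 'class', 'external nofollow' ] ]
console.log(link.attrGet('title')); // null
```

attrJoin is the natural choice for space-separated attributes such as class, while attrSet suits single-valued attributes such as href.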

Token summary

Token is the most basic class within MarkdownIt and the smallest unit the source is divided into. It is the product of parsing and the basis for the rendered output.

Ruler

Now let's look at MarkdownIt's other base class, Ruler. It can be thought of as the manager of a chain of responsibility: internally it stores many rule functions. Rule functions come in two kinds. Parse rules parse the string passed in by the user and generate tokens. Render rules consume those tokens: a different render rule is called depending on the token type, and the final result is the HTML string.
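
The division of labor between the two kinds of rules can be sketched with a toy pipeline. Nothing below is MarkdownIt's own code: paragraphRule, renderRules, escapeHtml, and render are hypothetical names, and plain objects stand in for Token instances.

```javascript
// Toy parse rule: turns each non-empty line into paragraph tokens.
// Plain objects stand in for Token instances (illustration only).
function paragraphRule(src) {
  const tokens = [];
  for (const line of src.split('\n')) {
    if (line.trim() === '') { continue; }
    tokens.push({ type: 'paragraph_open', tag: 'p', nesting: 1 });
    tokens.push({ type: 'text', tag: '', nesting: 0, content: line.trim() });
    tokens.push({ type: 'paragraph_close', tag: 'p', nesting: -1 });
  }
  return tokens;
}

// Escapes the characters that are unsafe inside HTML text.
function escapeHtml(str) {
  return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Toy render rules, looked up by token type.
const renderRules = {
  paragraph_open: () => '<p>',
  paragraph_close: () => '</p>\n',
  text: (tokens, idx) => escapeHtml(tokens[idx].content)
};

// Calls the render rule that matches each token's type and joins the output.
function render(tokens) {
  return tokens
    .map((token, idx) => renderRules[token.type](tokens, idx))
    .join('');
}

const html = render(paragraphRule('Hello\n\n1 < 2'));
console.log(html);
// <p>Hello</p>
// <p>1 &lt; 2</p>
```

Because the renderer only looks at token.type, a new syntax needs one parse rule that emits a new token type and one render rule keyed by that type.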

Let's start with the constructor.

function Ruler() {
  this.__rules__ = [];

  this.__cache__ = null;
}
  • __rules__

    Holds all rule objects. Each one has the following structure:

    {
      name: XXX,
      enabled: Boolean, // whether the rule is enabled
      fn: Function(), // the rule's handler function
      alt: [ name2, name3 ] // names of the responsibility chains it belongs to
    }

    The alt property can be confusing. I'll leave it open for now and cover it in detail when we look at the __compile__ method.

  • __cache__

    Stores the compiled rule chains. Its structure is as follows:

    { chainName: [ rule1.fn, rule2.fn, ... ] }

    Note: there is a default rule chain whose name is the empty string (''). Its value is an array containing the fn of every enabled rule.

Let's look at what each method on the prototype does.

  • __find__

    Ruler.prototype.__find__ = function (name) {
      for (var i = 0; i < this.__rules__.length; i++) {
        if (this.__rules__[i].name === name) {
          return i;
        }
      }
      return -1;
    };

    Finds the index of a rule in __rules__ by its name, returning -1 if there is no such rule.

  • __compile__

    Ruler.prototype.__compile__ = function () {
      var self = this;
      var chains = [ '' ];

      // collect unique names
      self.__rules__.forEach(function (rule) {
        if (!rule.enabled) { return; }

        rule.alt.forEach(function (altName) {
          if (chains.indexOf(altName) < 0) {
            chains.push(altName);
          }
        });
      });

      self.__cache__ = {};

      chains.forEach(function (chain) {
        self.__cache__[chain] = [];
        self.__rules__.forEach(function (rule) {
          if (!rule.enabled) { return; }

          if (chain && rule.alt.indexOf(chain) < 0) { return; }

          self.__cache__[chain].push(rule.fn);
        });
      });
    };

    Generates the rule chain (responsibility chain) information.

    1. First, it walks __rules__ to collect the key names of all rule chains. This is where a rule's alt property matters: it lists the chains the rule belongs to in addition to the default chain. The default chain has the empty string ('') as its key, and the fn of every enabled rule belongs to it.
    2. Each rule's fn is then added to the array under every chain key it belongs to, and the result is cached on the __cache__ property.

    Here’s an example:

    let ruler = new Ruler()
    ruler.push('rule1', rule1Fn, {
      alt: [ 'chainA' ] // alt must be an array, because __compile__ calls alt.forEach
    })
    ruler.push('rule2', rule2Fn, {
      alt: [ 'chainB' ]
    })
    ruler.push('rule3', rule3Fn, {
      alt: [ 'chainB' ]
    })
    ruler.__compile__()

    // ruler.__cache__ now has the following structure:
    // {
    //   '': [rule1Fn, rule2Fn, rule3Fn],
    //   'chainA': [rule1Fn],
    //   'chainB': [rule2Fn, rule3Fn],
    // }
    // That gives three rule chains: '', 'chainA', and 'chainB'.

  • at

    Ruler.prototype.at = function (name, fn, options) {
      var index = this.__find__(name);
      var opt = options || {};

      if (index === -1) { throw new Error('Parser rule not found: ' + name); }

      this.__rules__[index].fn = fn;
      this.__rules__[index].alt = opt.alt || [];
      this.__cache__ = null;
    };

    Replaces the fn of the named rule and resets the chains it belongs to (to options.alt, or to an empty array if none is given). Throws if the rule does not exist.

  • before

    Ruler.prototype.before = function (beforeName, ruleName, fn, options) {
      var index = this.__find__(beforeName);
      var opt = options || {};

      if (index === -1) { throw new Error('Parser rule not found: ' + beforeName); }

      this.__rules__.splice(index, 0, {
        name: ruleName,
        enabled: true,
        fn: fn,
        alt: opt.alt || []
      });

      this.__cache__ = null;
    };

    Inserts a new rule before the named rule.

  • after

    Ruler.prototype.after = function (afterName, ruleName, fn, options) {
      var index = this.__find__(afterName);
      var opt = options || {};

      if (index === -1) { throw new Error('Parser rule not found: ' + afterName); }

      this.__rules__.splice(index + 1, 0, {
        name: ruleName,
        enabled: true,
        fn: fn,
        alt: opt.alt || []
      });

      this.__cache__ = null;
    };

    Inserts a new rule after the named rule.

  • push

    Ruler.prototype.push = function (ruleName, fn, options) {
      var opt = options || {};

      this.__rules__.push({
        name: ruleName,
        enabled: true,
        fn: fn,
        alt: opt.alt || []
      });

      this.__cache__ = null;
    };

    Appends a new rule to the end of the rule list.

  • enable

    Ruler.prototype.enable = function (list, ignoreInvalid) {
      if (!Array.isArray(list)) { list = [ list ]; }
    
      var result = [];
    
      // Search by name and enable
      list.forEach(function (name) {
        var idx = this.__find__(name);
    
        if (idx < 0) {
          if (ignoreInvalid) { return; }
          throw new Error('Rules manager: invalid rule name ' + name);
        }
        this.__rules__[idx].enabled = true;
        result.push(name);
      }, this);
    
      this.__cache__ = null;
      return result;
    };

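    Enables the rules named in list without affecting other rules, and returns the names that were found and enabled. An unknown name throws unless ignoreInvalid is set.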

  • enableOnly

    Ruler.prototype.enableOnly = function (list, ignoreInvalid) {
      if (!Array.isArray(list)) { list = [ list ]; }
    
      this.__rules__.forEach(function (rule) { rule.enabled = false; });
    
      this.enable(list, ignoreInvalid);
    };

    Disables all rules first, then enables only the rules named in list.

  • getRules

    Ruler.prototype.getRules = function (chainName) {
      if (this.__cache__ === null) {
        this.__compile__();
      }
      return this.__cache__[chainName] || [];
    };

    Returns the array of fn functions for the given chain name, compiling the cache first if it has been invalidated. An unknown chain name yields an empty array.
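
As a rough sketch of how the ordering methods and getRules fit together, the example below uses a condensed stand-in for Ruler written for this article. It follows the behavior of the methods shown above but is not the library's code: __insert__ and __indexOf__ are hypothetical helpers, at, enable, and enableOnly are omitted, and the rule names and functions are made up.

```javascript
// Condensed stand-in for Ruler, following the methods shown above
// (illustration only; at, enable and enableOnly are omitted).
class Ruler {
  constructor() {
    this.__rules__ = [];
    this.__cache__ = null;
  }
  __find__(name) {
    return this.__rules__.findIndex((rule) => rule.name === name);
  }
  __indexOf__(name) { // hypothetical helper: __find__ plus the "not found" check
    const index = this.__find__(name);
    if (index === -1) { throw new Error('Parser rule not found: ' + name); }
    return index;
  }
  __insert__(index, name, fn, options = {}) { // hypothetical helper shared by push/before/after
    this.__rules__.splice(index, 0, { name, enabled: true, fn, alt: options.alt || [] });
    this.__cache__ = null; // any change invalidates the compiled chains
  }
  __compile__() {
    const enabled = this.__rules__.filter((rule) => rule.enabled);
    const chains = new Set(['']);
    enabled.forEach((rule) => rule.alt.forEach((alt) => chains.add(alt)));
    this.__cache__ = {};
    chains.forEach((chain) => {
      this.__cache__[chain] = enabled
        .filter((rule) => chain === '' || rule.alt.includes(chain))
        .map((rule) => rule.fn);
    });
  }
  push(name, fn, options) {
    this.__insert__(this.__rules__.length, name, fn, options);
  }
  before(beforeName, name, fn, options) {
    this.__insert__(this.__indexOf__(beforeName), name, fn, options);
  }
  after(afterName, name, fn, options) {
    this.__insert__(this.__indexOf__(afterName) + 1, name, fn, options);
  }
  getRules(chainName) {
    if (this.__cache__ === null) { this.__compile__(); }
    return this.__cache__[chainName] || [];
  }
}

// Made-up rule functions; real rules would inspect parser state.
const heading = () => 'heading';
const paragraph = () => 'paragraph';
const fence = () => 'fence';
const table = () => 'table';

const ruler = new Ruler();
ruler.push('heading', heading);
ruler.push('paragraph', paragraph);
ruler.before('heading', 'fence', fence, { alt: ['paragraph'] });
ruler.after('fence', 'table', table);

const names = ruler.__rules__.map((rule) => rule.name);
console.log(names);                                     // [ 'fence', 'table', 'heading', 'paragraph' ]
console.log(ruler.getRules('').length);                 // 4
console.log(ruler.getRules('paragraph')[0] === fence);  // true
console.log(ruler.getRules('unknown').length);          // 0
```

Rule order matters because the functions in a chain are tried in sequence, so before and after let a plugin slot its rule in at exactly the right priority.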

Ruler summary

Ruler is quite flexible. Methods such as at, before, after, and enable let users replace, reorder, and toggle rules, which gives MarkdownIt a great deal of extensibility. As a user, you can rely on this design to meet specific requirements.
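
One such requirement is restricting which syntax is active. The sketch below uses a minimal stand-in, here called MiniRuler, that copies only the toggling logic of enable and enableOnly shown earlier; enabledNames is a hypothetical helper for inspecting the result, and the rule names are made up.

```javascript
// Minimal stand-in with just enough of Ruler to show rule toggling
// (illustration only; chains and the cache are left out).
class MiniRuler {
  constructor() {
    this.__rules__ = [];
  }
  __find__(name) {
    return this.__rules__.findIndex((rule) => rule.name === name);
  }
  push(name, fn) {
    this.__rules__.push({ name, enabled: true, fn, alt: [] });
  }
  enable(list, ignoreInvalid) {
    const wanted = Array.isArray(list) ? list : [list];
    const result = [];
    wanted.forEach((name) => {
      const idx = this.__find__(name);
      if (idx < 0) {
        if (ignoreInvalid) { return; }
        throw new Error('Rules manager: invalid rule name ' + name);
      }
      this.__rules__[idx].enabled = true;
      result.push(name);
    });
    return result;
  }
  enableOnly(list, ignoreInvalid) {
    this.__rules__.forEach((rule) => { rule.enabled = false; });
    this.enable(list, ignoreInvalid);
  }
  enabledNames() { // hypothetical helper for inspecting the result
    return this.__rules__.filter((rule) => rule.enabled).map((rule) => rule.name);
  }
}

const ruler = new MiniRuler();
['heading', 'fence', 'table', 'paragraph'].forEach((name) => ruler.push(name, () => name));

// Keep only two rules active; everything else is switched off.
ruler.enableOnly(['paragraph', 'heading']);
const afterEnableOnly = ruler.enabledNames();
console.log(afterEnableOnly); // [ 'heading', 'paragraph' ]

// Turn one more rule back on; the unknown name is skipped because ignoreInvalid is true.
const accepted = ruler.enable(['table', 'footnote'], true);
console.log(accepted);             // [ 'table' ]
console.log(ruler.enabledNames()); // [ 'heading', 'table', 'paragraph' ]
```

Note that the enabled names come back in rule order, not in the order they were requested, because enabling never moves a rule within the list.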

Conclusion

With the base classes Token and Ruler analyzed, we can dig further into the source code of MarkdownIt. In future articles, we'll look at how tokens are generated by parsing the src string, and how renderer.render turns those tokens into the final output string. The next article starts with MarkdownIt's entry-point parser, CoreParser.