Previous article:

Write a compiler from scratch in PHP

As we saw in the previous part, we converted the input string into an array of Token objects and stored it in ListLexer. So, what does a Token look like?


/*
 * This file is part of the ByteFerry/Rql-Parser package.
 *
 * (c) BardoQi <67158925@qq.com>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

declare(strict_types=1);


namespace ByteFerry\RqlParser\Lexer;

use ByteFerry\RqlParser\Exceptions\ParseException;
/**
 * Class Token
 *
 * @package ByteFerry\RqlParser
 */
class Token
{
    /**
     * The symbol type; its value is one of the type constants defined in the Symbols class.
     *
     * @var int
     */
    protected $type = 0;

    /**
     * The symbol content, i.e. the string of this token.
     *
     * @var string
     */
    protected $symbol = ' ';

    /**
     * The lexer type of the next node.
     *
     * @var int
     */
    protected $next_type = -1;

    /**
     * The lexer type of the previous node.
     *
     * @var int
     */
    protected $previous_type = -1;

    /**
     * The nesting level, used for syntax verification.
     *
     * @var int
     */
    protected $level = 0;

    /**
     * Static factory: creates a Token so that calling code does not need new.
     *
     * @param int    $type
     * @param string $symbol
     * @param int    $previous_type
     *
     * @return static
     */
    public static function from($type, $symbol, $previous_type = -1)
    {
        /**
         * Ensure the syntax of the RQL with a simple ABNF definition:
         * we defined a set of ABNF rules, so we use them here to check
         * whether the syntax is correct.
         */
        if(-1 !== $previous_type){
            if(!in_array($type, Symbols::$rules[$previous_type])){
                throw new ParseException('Syntax error in Node of ' . $symbol);
            }
        }
        $instance = new static();
        // The following is the initialization
        $instance->type = $type;
        $instance->symbol = $symbol;
        $instance->previous_type = $previous_type;
        return $instance;
    }

    /**
     * RQL has one kind of data that does not use the function form: arrays.
     * This factory creates the 'arr' token that stands in for them.
     *
     * @param int $previousType
     *
     * @return \ByteFerry\RqlParser\Lexer\Token
     */
    public static function makeArrayToken($previousType){
        $instance = new static();
        $instance->type = Symbols::T_WORD;
        $instance->symbol = 'arr';
        $instance->previous_type = $previousType;
        return $instance;
    }

    /**
     * @param $level
     *
     * @return void
     */
    public function setLevel($level){
        $this->level = $level;
    }

    /**
     * @param $type
     *
     * @return void
     */
    public function setNextType($type)
    {
        $this->next_type = $type;
    }

    /**
     * @return int
     */
    public function getType()
    {
        return $this->type;
    }

    /**
     * @param $type
     *
     * @return void
     */
    public function setPrevType($type){
        $this->previous_type=$type;
    }

    /**
     * @return string
     */
    public function getSymbol()
    {
        return $this->symbol;
    }

    /**
     * @return bool
     */
    public function isClose(){
        return ($this->type === Symbols::T_CLOSE_PARENTHESIS);
    }

    /**
     * @return int
     */
    public function getPrevType(){
        return $this->previous_type;
    }

    /**
     * @return bool
     */
    public function isPunctuation(){
        return !( ($this->type === Symbols::T_WORD)
               || ($this->type === Symbols::T_STRING) );
    }
}

As you can see, this class contains many single-line methods. That is typical of object-oriented code. Many beginners avoid single-line methods, and as a result some of their functions grow excessively long.
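The successor check inside Token::from() can be tried out in isolation. The following is only a minimal sketch: the type constants, the RULES table, and the function name isAllowedAfter are hypothetical stand-ins, since the real rules live in Symbols::$rules, which is not shown in this part.

```php
<?php
declare(strict_types=1);

// Hypothetical token types; the real constants are defined in the Symbols class.
const T_WORD  = 1;
const T_OPEN  = 2;
const T_CLOSE = 3;
const T_COMMA = 4;

// Hypothetical rules table: for each previous type, the types allowed to follow it.
const RULES = [
    T_WORD  => [T_OPEN, T_COMMA, T_CLOSE],
    T_OPEN  => [T_WORD, T_OPEN, T_CLOSE],
    T_CLOSE => [T_COMMA, T_CLOSE],
    T_COMMA => [T_WORD, T_OPEN],
];

/**
 * Returns true when $type may follow $previousType.
 * -1 means "no previous token" and is always accepted, as in Token::from().
 */
function isAllowedAfter(int $type, int $previousType = -1): bool
{
    if ($previousType === -1) {
        return true;
    }
    return in_array($type, RULES[$previousType] ?? [], true);
}

var_dump(isAllowedAfter(T_OPEN, T_WORD));  // bool(true): a word followed by "("
var_dump(isAllowedAfter(T_COMMA, T_OPEN)); // bool(false): "(" followed by ","
```

Keeping the grammar in a lookup table means the factory needs only one in_array() call, and changing the grammar does not touch the Token class.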

Next, let's look at the ListLexer class:

/*
 * This file is part of the ByteFerry/Rql-Parser package.
 *
 * (c) BardoQi <67158925@qq.com>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

declare(strict_types=1);

namespace ByteFerry\RqlParser\Lexer;

use ByteFerry\RqlParser\Abstracts\BaseObject;
use ByteFerry\RqlParser\Exceptions\ParseException;


/**
 * Class ListLexer
 *
 * @package ByteFerry\RqlParser\ListLexer
 */
class ListLexer extends BaseObject
{
    /**
     * The token array is stored here.
     *
     * @var array
     */
    protected $items = [];

    /**
     * @var int
     */
    protected $level = 0;

    /**
     * @var int
     */
    protected $position = 0;



    /**
     * @param Token $token
     *
     * @return void
     */
    public function addItem(Token $token){
        if($token->getType() === Symbols::T_OPEN_PARENTHESIS){
            /**
             * A "(" that directly follows a "," is the array operator,
             * so we insert an 'arr' node before it.
             */
            if($token->getPrevType()===Symbols::T_COMMA){
                $this->items[$this->position++] = Token::makeArrayToken(Symbols::T_COMMA);  // create the 'arr' token with Token::makeArrayToken
            }
            $token->setPrevType(Symbols::T_WORD);  // the token before "(" is treated as T_WORD (for arrays, the inserted 'arr')
            $this->level++; // one level deeper
        }
        if($token->getType() === Symbols::T_CLOSE_PARENTHESIS){
            // A closing parenthesis takes us one level back up. When all parentheses
            // match, the final level is 0; this is the core of the check algorithm.
            // (Working out why is left to the reader.)
            $this->level--;
        }
        $token->setLevel($this->level);
        $this->items[$this->position++] = $token;
    }

    /**
     * Sets the next_type of the previous node.
     *
     * @param int $type
     *
     * @return void
     */
    public function setNextType($type){
        // After addItem(), position points past the token that was just added
        // (which sits at position - 1), so the previous token is at position - 2.
        if(isset($this->items[$this->position-2])){
            $this->items[$this->position-2]->setNextType($type);
        }
    }

    /**
     * @return mixed
     */
    public function current(){
        return $this->items[$this->position];
    }

    /**
     * Consumes a token; this is the key function of the class.
     *
     * @return bool|mixed
     */
    public function consume(){
        /**
         * if we have reached the end we must return
         */
        if($this->isEnd()){
            return false;   // nothing left to consume
        }
        /**
         * get the next token
         */
        $token = $this->items[++$this->position];
        /**
         * We only consume word or string tokens,
         * so we skip punctuation by calling isPunctuation()
         */
        for(; $token->isPunctuation() && !$this->isEnd(); $token = $this->items[++$this->position]){
            /**
             * if we meet the close flag we must return.
             */
            if($token->isClose()){
                return $token;
            }
        }
        return $token;
    }


    /**
     * @return mixed
     */
    public function rewind()
    {
        $this->position = 0;
        return $this->items[$this->position];
    }

    /**
     * @return int
     */
    public function getNextIndex()
    {
        return ++$this->position;
    }


    /**
     * @return mixed
     */
    public function isClose(){
        return $this->items[$this->position]->isClose();
    }

    /**
     * @return bool
     */
    public function isEnd(){
        return $this->position+1 >= count($this->items);
    }

    /**
     * @return int
     */
    public function getLevel(){
        return $this->level;
    }
}
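The level counter in addItem() is what makes parenthesis matching checkable. As a rough illustration only, here is the same idea applied to a raw string instead of tokens. The function name parenthesesBalanced is hypothetical, and the early return on a negative level is an addition of this sketch; ListLexer itself only increments and decrements the counter.

```php
<?php
declare(strict_types=1);

/**
 * Returns true when every "(" in $input has a matching ")".
 * The counter rises on "(" and falls on ")", just like $level in addItem().
 */
function parenthesesBalanced(string $input): bool
{
    $level = 0;
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        if ($input[$i] === '(') {
            $level++;
        } elseif ($input[$i] === ')') {
            $level--;
            if ($level < 0) {
                return false; // a ")" appeared before its "("
            }
        }
    }
    return $level === 0; // leftover depth means an unclosed "("
}

var_dump(parenthesesBalanced('and(eq(a,1),gt(b,2))')); // bool(true)
var_dump(parenthesesBalanced('and(eq(a,1)'));          // bool(false)
```

Each "(" adds one and each ")" removes one, so the counter can only return to 0 when the two occur in equal numbers. That is why a final level of 0 is the success condition.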

Notice that the single-line methods of the Token class simplify the code here. ListLexer also has single-line methods of its own that simplify consume(), so the function ends up with far fewer lines. That concludes the lexical part. Next comes the abstract syntax tree, so let's move on to the NodeVisitor:



      

/*
 * This file is part of the ByteFerry/Rql-Parser package.
 *
 * (c) BardoQi <67158925@qq.com>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

declare(strict_types=1);

namespace ByteFerry\RqlParser\AstBuilder;

use ByteFerry\RqlParser\Exceptions\ParseException;
use ByteFerry\RqlParser\Lexer\Symbols;

/**
 * Class NodeVisitor
 *
 * @package ByteFerry\RqlParser\Ast
 */
class NodeVisitor
{

    /**
     * @param $name
     *
     * @return mixed
     */
    protected static function fromAlias($name)
    {
        return Symbols::$type_alias[$name] ?? $name;
    }

    /**
     * @param $operator
     *
     * @return mixed
     */
    protected static function getNodeType($operator)
    {
        return Symbols::$type_mappings[$operator] ?? null;
    }

    /**
     * @param $node_type
     *
     * @return mixed|null
     */
    protected static function getClass($node_type)
    {
        return Symbols::$class_mapping[$node_type] ?? Symbols::$class_mapping['N_CONSTANT'];
    }


    /**
     * @param $symbol
     *
     * @return \ByteFerry\RqlParser\AstBuilder\NodeInterface
     */
    public static function visit($symbol){
        $operator = self::fromAlias($symbol);
        $node_type = self::getNodeType($operator);
        $node_class = self::getClass($node_type);
        if(null === $node_class){
            throw new ParseException('Node class of ' . $node_type . ' not found!');
        }
        return $node_class::of($operator, $symbol);
    }
}

The code is fairly simple. As it turns out, this is not really the visitor pattern: the class just takes a symbol, looks up the matching node class, and returns an instance. Next, we need to understand the contents of the abstract syntax tree. (to be continued)
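The three lookups in visit() can be sketched as a small table-driven factory. Everything named below (ConstantNode, PredicateNode, the three constant tables, and the free function visit) is a hypothetical stand-in, since the real tables in Symbols and the real node classes are not shown in this part. Unlike getNodeType(), the sketch falls back to the type name 'N_CONSTANT' directly instead of passing null along.

```php
<?php
declare(strict_types=1);

// Hypothetical node classes standing in for the package's real nodes.
class ConstantNode
{
    public $operator;
    public $symbol;

    public static function of(string $operator, string $symbol): self
    {
        $node = new static(); // late static binding: subclasses get their own type
        $node->operator = $operator;
        $node->symbol = $symbol;
        return $node;
    }
}

class PredicateNode extends ConstantNode
{
}

// Hypothetical lookup tables standing in for the static arrays on Symbols.
const TYPE_ALIAS    = ['equal' => 'eq'];
const TYPE_MAPPINGS = ['eq' => 'N_PREDICATE'];
const CLASS_MAPPING = [
    'N_PREDICATE' => PredicateNode::class,
    'N_CONSTANT'  => ConstantNode::class,
];

function visit(string $symbol): ConstantNode
{
    $operator  = TYPE_ALIAS[$symbol] ?? $symbol;            // resolve an alias, else keep the name
    $nodeType  = TYPE_MAPPINGS[$operator] ?? 'N_CONSTANT';  // unknown operators become constants
    $nodeClass = CLASS_MAPPING[$nodeType] ?? CLASS_MAPPING['N_CONSTANT'];
    return $nodeClass::of($operator, $symbol);
}

echo get_class(visit('equal')), PHP_EOL; // PredicateNode
echo get_class(visit('42')), PHP_EOL;    // ConstantNode
```

Each step is a single array lookup with a ?? fallback, which is why the original class stays so short.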

Read on:

Write a compiler from scratch in PHP (3)

Write a compiler from scratch in PHP (4)

Write a compiler from scratch in PHP (5)