instruction

Before we generated the function prototype we declared the instruction table in the form of number[].

This means that the instruction is really just a number.

export type Instructon = number
Copy the code

A computer is used to process a string of 01010s. In this sense, we are all numbers.

Instruction can be divided into two types according to length, one is variable length, the other is fixed length. Obviously the former advantage is small size, the latter advantage is fast parsing. Here our machine chooses the fixed length, easy to write.

Referring to lua’s instruction set, we also set the instruction to be 4 bytes in length, a total of 32 bits, of which 6 bits represent opcodes and the other 26 bits represent operands.

There are several opcodes that we have covered so far.

export enum OpCode {
    / / jump
    move, jmp,
    / / load
    loadNil, loadK, loadKX
    / / integer
    iadd, isub, imul, idiv,
    / / function
    func, call, return
}
Copy the code

Different opcodes require different numbers and types of parameters, which divides opcodes into the following four types.

export enum OpMode {
    IABC, IABx, IAsBx, IAx,
}

export type OpModeArgu = {
    [OpMode.IABC]: [number.number.number],
    [OpMode.IABx]: [number.number],
    [OpMode.IAsBx]: [number.number],
    [OpMode.IAx]: [number],}Copy the code
  • IABC means there are three parameters. A has 8 bits, B and C have 9 bits respectively.

  • IABx indicates that there are two parameters, where A has 8 bits and Bx has 18 bits.

  • IAsBx means that there are two parameters, where A has 8 bits and sBx has 18 bits, but sBx is interpreted as A signed integer.

  • IAx means that there’s one parameter, Ax, that’s all 26 bits.

IABC is the most commonly used type, but because the bit is so small that 9 bits represents 512 at most, IABx is required for some special operations. If 18 bit is not enough, you may need to perform an operation in conjunction with the IAx instruction. There are also operations such as jump, which can jump up or down, so IAsBx is required to provide a signed integer.

We’re going to assign a type to each of the opcodes that we declared before.

export type OpCodeMode = {
    [OpCode.move]: OpMode.IABC,
    [OpCode.jmp]: OpMode.IAsBx,

    [OpCode.loadNil]: OpMode.IABC,
    [OpCode.loadK]: OpMode.IABx,
    [OpCode.loadKX]: OpMode.IABx,

    [OpCode.iadd]: OpMode.IABC,
    [OpCode.isub]: OpMode.IABC,
    [OpCode.imul]: OpMode.IABC,
    [OpCode.idiv]: OpMode.IABC,
    
    [OpCode.func]: OpMode.IABC,
    [OpCode.call]: OpMode.IABC,
    [OpCode.return]: OpMode.IABC,
}

export const OpCodeMode: OpCodeMode = {
    [OpCode.move]: OpMode.IABC,
    [OpCode.jmp]: OpMode.IAsBx,
    
    [OpCode.loadNil]: OpMode.IABC,
    [OpCode.loadK]: OpMode.IABx,
    [OpCode.loadKX]: OpMode.IABx,

    [OpCode.iadd]: OpMode.IABC,
    [OpCode.isub]: OpMode.IABC,
    [OpCode.imul]: OpMode.IABC,
    [OpCode.idiv]: OpMode.IABC,
    
    [OpCode.func]: OpMode.IABC,
    [OpCode.call]: OpMode.IABC,
    [OpCode.return]: OpMode.IABC,
}
Copy the code

Method of instruction

Based on the already defined format, you can write the convenient generation function, parsing tool function.

  • Read an opcode from an instruction
// Obtain the instruction opcode
const code = (i:Instructon):OpCode= > i & 0x3f
Copy the code
  • For each type, write a corresponding generating function
// Generate instructions based on opcodes and parameters
const create = {
    [OpMode.IABC]: (code: OpCode, args: OpModeArgu[OpMode.IABC]): Instructon= > {
        const [a, b, c] = args
        const vcode = code & 0x3f
        const va = (a & 0xff) < <6
        const vb = (b & 0x1ff) < <14
        const vc = (c & 0x1ff) < <23
        return vcode + va + vb + vc
    },
    [OpMode.IABx]: (code: OpCode, args: OpModeArgu[OpMode.IABx]): Instructon= > {
        const [a, bx] = args
        const vcode = code & 0x3f
        const va = (a & 0xff) < <6
        const vb = bx << 14
        return vcode + va + vb
    },
    [OpMode.IAsBx](code: OpCode, args: OpModeArgu[OpMode.IAsBx]): Instructon {
        const [a, bx] = args
        const vcode = code & 0x3f
        const va = (a & 0x1ff) < <6
        const vb = (bx + MAXARG_sBx) << 14
        return vcode + va + vb
    },
    [OpMode.IAx](code: OpCode, args: OpModeArgu[OpMode.IAx]): Instructon {
        const [a] = args
        const vcode = code & 0x3f
        const va = a << 6
        return vcode + va
    }
}
Copy the code
  • For each type, write a corresponding parameter parsing function
Export const inputs = {[opmode.iABC](ins: Instructon): OpModeArgu[OpMode.IABC] { return [ ins >> 6 & 0xff, ins >> 14 & 0x1ff, ins >> 23 & 0x1ff ] }, [OpMode.IABx](ins: Instructon): OpModeArgu[OpMode.IABx] { return [ ins >> 6 & 0xff, ins >> 14, ] }, [OpMode.IAsBx](ins: Instructon): OpModeArgu[OpMode.IAsBx] { return [ ins >> 6 & 0xff, (ins >> 14) - MAXARG_sBx, ] }, [OpMode.IAx](ins: Instructon): OpModeArgu[OpMode.IAx] { return [ins >> 6] } }Copy the code
  • For each opcode, write a corresponding handler. This function takes an instruction and a stack frame, reads parameters from the instruction, and then operates on the stack frame. (Detailed content will be improved later)
const deal:{[key in OpCode]: (i: Instructon, stack: LxCallStack) = > void} = {
    / / to be completed
}
Copy the code

Finally, according to the above tool methods, aspire to throw two functions, respectively corresponding to the generation of instructions and instruction execution.

export const ins = {
    create: <T extends OpCode>(code: T, ... args: GetOpCodeArgu<T>) => { const mode: OpMode = OpCodeMode[code] return create[mode](code, args as any) }, excute: (i: Instructon, frame: LxCallFrame) => { const code: OpCode = i & 0x3f return deal[code](i, frame) } }Copy the code

Basic Operations (JMP, move)

Start with the two basic opcodes

export type BaseOP = OpCode.move | OpCode.jmp
Copy the code
  • Move: copies the value of the index on stack B to the index on stack A. You can just call the copy method on the frame.

  • JMP: Jump to current instruction relative position SBX. You can call the addPC method on the frame directly.

export const dealBase = {
    [OpCode.move]: (i: Instructon, frame: LxCallFrame) = > {
        const [a, b, _] = inputs[OpCodeMode[OpCode.move]](i)
        frame.copy(b, a)
    },
    [OpCode.jmp]: (i: Instructon, frame: LxCallFrame) = > {
        const [a, sbx] = inputs[OpCodeMode[OpCode.jmp]](i)
        frame.addPC(sbx)
        if(a ! = =0) { throw new Error('Op: todo')}}}Copy the code

Integer operations (iADD, ISub, IMul, idiV)

Take the value from index B and c, calculate the value and write it to index A.

export type ArithIntOP = OpCode.iadd | OpCode.isub | OpCode.imul | OpCode.idiv

const toInt = (val: LxValue | null) = > {
    if (val && val[0] === LxType.int)
        return val
    else
        throw new Error('parse int error')}const table = {
    [OpCode.iadd]: (a: LxValue, b: LxValue): [LxType.int, number] = > {const va = toInt(a)
        const vb = toInt(b)
        return [LxType.int, va[1] + vb[1]]
    },
    [OpCode.isub]: (a: LxValue, b: LxValue): [LxType.int, number] = > {const va = toInt(a)
        const vb = toInt(b)

        return [LxType.int, va[1] - vb[1]]
    },
    [OpCode.imul]: (a: LxValue, b: LxValue): [LxType.int, number] = > {const va = toInt(a)
        const vb = toInt(b)

        return [LxType.int, va[1] * vb[1]]
    },
    [OpCode.idiv]: (a: LxValue, b: LxValue): [LxType.int, number] = > {const va = toInt(a)
        const vb = toInt(b)

        if(! b)throw new Error('state: division by zero')

        return [LxType.int, Math.floor(va[1] / vb[1])]}},const dealBinaryArith = (op: ArithIntOP) = > (i: Instructon, frame: LxCallFrame) = > {
    const [a, b, c] = inputs[OpCodeMode[OpCode.move]](i)

    frame.pushValue(b)
    frame.pushValue(c)

    const n = frame.pop()
    const m = frame.pop()

    if(! m || ! n)throw new Error('state: val is not exist! ')

    const val = table[op](m, n)

    frame.push(val)
    frame.replace(a)
}

export const dealIntArith = {
    [OpCode.iadd]: dealBinaryArith(OpCode.iadd),
    [OpCode.isub]: dealBinaryArith(OpCode.isub),
    [OpCode.imul]: dealBinaryArith(OpCode.imul),
    [OpCode.idiv]: dealBinaryArith(OpCode.idiv),
}
Copy the code

Function operations (func, call, return)

  • Func: puts the function prototype at the BX index position on the prototype table, and the generating function at the stack A index.

  • Call: reads a function from a, continues to read b arguments backwards, pushes to the top of the stack, records the result index on C, calls the call method on the frame

  • Return: reads the index on a and calls the return method on the frame directly.

export type FuncOP = OpCode.func | OpCode.call | OpCode.return

export const dealFunc = {
    [OpCode.func]: (i: Instructon, frame: LxCallFrame) = > {
        const [a, bx] = inputs[OpCodeMode[OpCode.func]](i)
        frame.loadProto(bx)
        frame.replace(a)
    },
    [OpCode.call]: (i: Instructon, frame: LxCallFrame) = > {
        const [a, b, c] = inputs[OpCodeMode[OpCode.call]](i)
        for (let i = a; i <= a + b; i++) {
            frame.pushValue(i)
        }
        frame.call(b, c)
    },
    [OpCode.return]: (i: Instructon, frame: LxCallFrame) = > {
        const [a, _] = inputs[OpCodeMode[OpCode.return]](i)
        frame.return(a)
    },
}

Copy the code

Read operations (loadNil, loadK, loadKX)

  • LoadNil: b stack bits starting at a are initialized to Nil

  • LoadK: reads a constant from bx and writes it to a

  • LoadKX: take the stack index of a of this instruction, take the constant index of ax of the next instruction, and assign the value.

export type LoadOP = OpCode.loadNil | OpCode.loadK | OpCode.loadKX


export const dealLoad = {
    [OpCode.loadNil]: (i: Instructon, frame: LxCallFrame) = > {
        const [a, b, _] = inputs[OpCodeMode[OpCode.loadNil]](i)
        frame.push(nil())
        for (let i = a; i < a + b; i++) { frame.copy(-1, i) }
        frame.pop()
    },
    [OpCode.loadK]: (i: Instructon, frame: LxCallFrame) = > {
        const [a, bx] = inputs[OpCodeMode[OpCode.loadK]](i)
        frame.getConst(bx)
        frame.replace(a)
    },
    [OpCode.loadKX]: (i: Instructon, frame: LxCallFrame) = > {
        const [a, _] = inputs[OpCodeMode[OpCode.loadKX]](i)
        const [ax] = inputs[OpMode.IAx](i)
        frame.getConst(ax)
        frame.replace(a)
    },
}

Copy the code

Adjusting and optimizing

  • We can implement the handler corresponding to the opcode
const deal:{[key in OpCode]: (i: Instructon, frame: LxCallFrame) = > void} = {... dealBase, ... dealIntArith, ... dealFunc, ... dealLoad, }Copy the code
  • Because of the operand parameteraThe size of the stack index is only 8bit, which means our stack index range is between 0 and 255. And we use the 9bitb,cWhen you look for data in the stack, the highest bit is not actually used. We can expand on that, and use the highest bit to determine whether to look in the stack or in the constant list. For integer operations, this means possible bar savingsloadKThe instructions. This is also a performance optimization method in Lua.

Add a getRk method to our stack frame

export class LxCallFrame extends Stack {
    // ...
    getRk(idx: number) {
        if (idx > 0xff) {
            this.getConst(idx & 0xff)}else {
            this.pushValue(idx)
        }
    }
}
Copy the code

In integer arithmetic operations, use getRK instead of pushValue

const dealBinaryArith = (op: ArithIntOP) => (i: Instructon, frame: LxCallFrame) => {
    // ...
    frame.getRK(b)
    frame.getRK(c)
    // ...
}
Copy the code
  • Since we have implemented a function for executing instructions, we can add methods for executing instructions to the stack frame

export class LxCallFrame extends Stack {
    // ...
    // Execute the next instruction
    nextStep() {
        ins.excute(this.fetch(), this)}}Copy the code
  • Extend the abstract class LxCallStack to the entity class LxVM

    • Accepts a main function prototype in the constructor and generates the bottom call frame. (Since the frame initialization requires a stack, and at the time of lxVM construction, there was no stack, you need to adjust the LxCallStack constructor.)

    • The nextStep method calls the nextStep on the top call frame

    • The terminate method terminates the program

export abstract class LxCallStack {

    #top: LxCallFrame
    
    constructor(top:(stack:LxCallStack)=> LxCallFrame) {
        this.#top = top(this)}top(){
        return this.#top
    }
    
    push(frame: LxCallFrame) {
        frame.prev = this.#top
        this.#top = frame
    }
    
    pop() {
        const top = this.#top
        if (top.prev) {
            this.#top = top.prev
        } else {
            this.terminate()
        }
        return top
    }
    abstract terminate():void
}


export class LxVM extends LxCallStack {
    
    constructor(main:Proto) {
        super((stack) = >new LxCallFrame(main,stack))
        
    }

    nextStep(){
        this.top().nextStep()
    }

    terminate(){
        console.log('terminate'.this)}}Copy the code

conclusion

At this point, our virtual machine is complete. The final step is to parse the original abstract syntax tree and generate the corresponding function prototype!