In the last article, we detailed the performance bottlenecks in the front end and the various efforts that were being made to address the performance issues before WebAssembly was born.

In this article, we’ll take a look at WebAssembly technology itself and its current state of the art.

What is WebAssembly?

WebAssembly is not a new programming language. The official definition reads:

WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the Web for client and server applications.

Reading this definition, you may feel that you know every word, yet taken together…

Let’s take the keywords and analyze them one by one:

Stack virtual machine:

Computer theory describes several common computational models: the stack machine, the accumulator machine, and the register machine. Each is named after the kind of storage the CPU works with while computing. As the names imply, a stack machine keeps its data on a stack, an accumulator machine keeps it in an accumulator, and a register machine keeps it in registers. Strictly speaking, "data" here means operands. Each of the three models has its advantages and disadvantages, and Wasm uses the stack machine to achieve its goals.

The stack machine model is, as the name says, a machine built around a stack structure. A computer based on this model, whether virtual or physical, uses a stack to store and exchange data. A stack is a LIFO (last in, first out) data structure, meaning that the item placed in the stack last is the first one that can be retrieved.

Next, we will simulate how a stack machine actually runs. In this process we use simple commands such as “push”, “pop” and “add”, which you can think of as a kind of assembly instruction. When most instructions execute, they take the operands they need from the stack container of the stack machine, and the stack machine then processes those operands according to the function of the instruction. When this is done, if the instruction has a result to return, that value is pushed back into the stack container.

Suppose we need to evaluate the expression “1 + 2”. How would the stack machine execute it? As mentioned earlier, the stack container in the stack machine serves primarily as a place for storing and exchanging data during program execution. For this expression, assuming no optimization is applied, the compiler usually generates three instructions: “push 1”, “push 2” and “add”.

As shown in the figure above, the instructions generated by the compiler are listed on the left and are executed from top to bottom. The current state of the stack container in the stack machine is shown on the right. As you can see, the stack container starts out empty. Next, the stack machine executes the first instruction, “push 1.” The push instruction pushes the operand that follows it directly onto the stack. After the instruction is executed, the state of the stack container is as shown in the figure below.

We mark instructions that have been executed in red. At this point, the bottom of the stack container holds the operand “1” pushed in by the first push instruction. In the same manner, the stack machine continues with the second instruction, “push 2.” After the execution of this instruction, the state of the stack container is shown in the figure below.

As you can see, the stack container currently holds operands “1” and “2” pushed in through the first two push instructions. Next, the stack machine proceeds to execute the third “add” instruction.

This instruction requires two operands, so before executing it the stack machine first checks whether the stack container holds at least two elements. If it does, the stack machine takes the two operands from the top of the stack container and adds them together, and the result is pushed back into the stack container. After this last add instruction is executed, the state of the stack container is as shown in the figure below. When all instructions are executed, the stack container will store the result value of the expression “1 + 2” evaluated by the stack machine.
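The walkthrough above can be sketched in a few lines of Python. This is only a toy model written for illustration: the function name `run` and the tuple representation of instructions are assumptions of this sketch, not part of any real stack machine or of Wasm.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (instruction name, optional operand).
Instruction = Tuple[str, Optional[int]]


def run(program: List[Instruction]) -> List[int]:
    """Execute a toy stack-machine program and return the final stack."""
    stack: List[int] = []
    for op, arg in program:
        if op == "push":
            # Push the operand that follows the instruction onto the stack.
            stack.append(arg)
        elif op == "pop":
            if not stack:
                raise RuntimeError("pop needs one operand on the stack")
            stack.pop()
        elif op == "add":
            # Check that at least two operands are available first.
            if len(stack) < 2:
                raise RuntimeError("add needs two operands on the stack")
            right = stack.pop()
            left = stack.pop()
            # Push the result back into the stack container.
            stack.append(left + right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


if __name__ == "__main__":
    # The three instructions generated for the expression "1 + 2".
    print(run([("push", 1), ("push", 2), ("add", None)]))
```

Running the three instructions for “1 + 2” leaves a single value, the sum, on the stack.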


If you are interested in the other two models, you can look them up yourself.

ISA and V-ISA

If you have looked at all three models of computation, you will have noticed that the instructions of each model have a different basic structure. They differ, for example, in the number of operands an instruction can accept, in where the data being operated on is stored, and in the details of how instructions interact with each other.

An instruction set architecture (ISA) usually refers to an instruction set that runs on real physical hardware, such as i386 or x86-64. An instruction set designed for a virtual architecture is usually called a V-ISA, or Virtual ISA.

Most V-ISAs are designed on the stack machine model.

Wasm is one such V-ISA. The main reason Wasm chose the stack machine model for its instruction design is that a stack machine is relatively simple to design and implement. That allows rapid prototyping and leaves room for trial and error as Wasm develops.

Another important reason is that the stack container of the stack machine model makes validating the instruction code of a Wasm module much easier.

A simple implementation also makes it easy to integrate a Wasm engine into the browser. Because Wasm's control flow is structured, an engine can convert Wasm instructions into SSA (Static Single Assignment) form, so Wasm code can perform well even though it is defined on a stack machine. The moderate instruction length of the stack machine model also gives a Wasm binary module a higher density of instruction code for the same size.
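To give a feel for why a stack makes validation easy, here is a minimal sketch that checks a program without running it, reusing the toy “push”, “pop” and “add” commands from earlier. It only tracks the stack height. Real Wasm validation also tracks value types and control flow, so treat the table `STACK_EFFECT` and the function `validate` as hypothetical names for illustration only.

```python
from typing import List, Optional, Tuple

Instruction = Tuple[str, Optional[int]]

# Hypothetical table: how many values each instruction pops and pushes.
STACK_EFFECT = {
    "push": (0, 1),
    "pop": (1, 0),
    "add": (2, 1),
}


def validate(program: List[Instruction]) -> int:
    """Check stack usage statically and return the final stack height.

    Raises ValueError if an instruction is unknown or would find
    too few operands on the stack.
    """
    height = 0
    for index, (op, _arg) in enumerate(program):
        if op not in STACK_EFFECT:
            raise ValueError(f"instruction {index}: unknown instruction {op}")
        pops, pushes = STACK_EFFECT[op]
        if height < pops:
            raise ValueError(
                f"instruction {index}: {op} needs {pops} operand(s), "
                f"but the stack holds {height}"
            )
        # Only the height is simulated; no values are ever computed.
        height += pushes - pops
    return height


if __name__ == "__main__":
    print(validate([("push", 1), ("push", 2), ("add", None)]))
```

A single pass over the instructions is enough, because each instruction's effect on the stack is known in advance.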

Wasm virtual instruction set

So far, we have seen that Wasm is a V-ISA based on the stack machine model. So let’s take a look at what it really looks like. The following is a standard sequence of Wasm instructions. It does the same thing as the “1 + 2” example we used earlier when introducing the stack machine.

i32.const 1
i32.const 2
i32.add

The first two instructions use “i32.const”, which pushes the number that immediately follows it into the stack container of the stack machine as a value of type i32, a 32-bit integer. The final instruction, “i32.add”, takes the two i32 values at the top of the stack, adds them together, and then pushes the result back onto the stack.

As before, the stack machine checks that the stack container holds at least two i32 values at the top before actually executing this instruction. As you can see, Wasm instructions are executed in exactly the same way as in the stack machine model we walked through. By now, the question raised at the beginning of this article, what Wasm actually is, should have its answer.

One more thing worth mentioning is the analogy with assembly language and machine code. The “i32.const” and “i32.add” we see here are the text mnemonics for instructions in the Wasm V-ISA instruction set. When these mnemonics are compiled into a Wasm binary module, each one is replaced by its binary bytecode (commonly known as an opcode, which you can simply think of as a binary number), and an encoding scheme is used to keep the whole binary module file small.
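As a rough illustration of the mnemonic-to-opcode step, the sketch below turns the three instructions above into bytes. The opcode values 0x41 (“i32.const”) and 0x6A (“i32.add”) and the signed LEB128 encoding of the constant follow the Wasm binary format. The function names are made up for this example, and the output is only the instruction bytes, not a complete module (no header, sections, or end marker).

```python
from typing import List, Optional, Tuple

# Opcodes for the two mnemonics used in this article.
OPCODES = {
    "i32.const": 0x41,
    "i32.add": 0x6A,
}


def encode_sleb128(value: int) -> bytes:
    """Encode an integer as signed LEB128, 7 bits per byte."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # Python's >> keeps the sign for negative numbers
        sign_bit_set = (byte & 0x40) != 0
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        if done:
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)  # high bit set: more bytes follow


def assemble(instructions: List[Tuple[str, Optional[int]]]) -> bytes:
    """Translate (mnemonic, immediate) pairs into raw instruction bytes."""
    out = bytearray()
    for name, immediate in instructions:
        out.append(OPCODES[name])
        if name == "i32.const":
            out += encode_sleb128(immediate)
    return bytes(out)


if __name__ == "__main__":
    code = assemble([("i32.const", 1), ("i32.const", 2), ("i32.add", None)])
    print(code.hex())
```

Small constants fit in a single byte with this variable-length encoding, which is one reason the binary form stays compact.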

A compilation target for high-level languages

Although Wasm has this “mnemonic” text form similar to assembly language, it is mostly used as the final compilation target for high-level programming languages such as C/C++.

The compiler automatically handles the conversion from high-level language source code to Wasm binary instructions. As mentioned at the beginning, the official statement is that “Wasm is designed as a portable compilation target for programming languages.”

So unless you want to show off, you can forget about writing this assembly-like text by hand. Many of you studied assembly language in college. It took me a long time just to work out how to implement 1 + 1 = 2 in it. The situation is like writing TS today and handing it to Webpack to compile: JS is the compilation target of TS.

Another key word is portability, which greatly widens WebAssembly’s application scenarios and flexibility. In fact, Wasm code can run in any environment that implements the WebAssembly virtual machine.

High-level languages that compile to Wasm

Language support for WebAssembly keeps growing. Currently, the following languages support it:

  • C/C++: good support via Emscripten or other minimal LLVM-based toolchains (ready for production use).

  • Rust: WebAssembly is an officially supported target with a very active community around it.

  • Go: now supports WebAssembly as an official but experimental target.

  • C#: experimental support via Blazor, which currently requires embedding the .NET runtime in Wasm. Blazor was recently released in preview form and is officially treated by Microsoft as an experimental technology.

  • D: the “betterC” subset of D can be compiled to WebAssembly using LDC (the LLVM-based compiler).

  • TypeScript: experimental but powerful, via AssemblyScript.

  • Java: via TeaVM or Bytecoder.

  • Haxe: support was just announced.

  • Kotlin: Kotlin/Native 0.4 has experimental WebAssembly support, and TeaVM can also be used.

  • Python: Pyodide is a port of Python to WebAssembly that includes the core packages of the scientific Python stack (NumPy, Pandas, Matplotlib).

  • PHP: experimental, but with working prototypes.

  • Perl: WebPerl is a port of the Perl binary to WebAssembly, allowing you to run Perl scripts on the Web.

  • Scala: using the Emscripten compiler.

  • Ruby: through the run.rb project.

  • Swift: uses SwiftWasm and is currently under development.

AssemblyScript and Rust are the two I most recommend learning. AssemblyScript is a language for Wasm built on TypeScript, so it is friendly to anyone familiar with TS.

Rust comes from the same family as Wasm (much like Flutter and Dart, although Rust has far more uses than Dart). It is a very young language, yet it has been voted the most loved language among programmers on Stack Overflow for five consecutive years. It has one big advantage: the Wasm it compiles to is extremely small, which matters a great deal for browsers that have to load Wasm modules over HTTPS before running them.

Of course, for programmers in other language ecosystems, try whatever language you’re familiar with. After all, the hardest part is taking the first step.

Application scenarios for Wasm

In the browser

  • Better execution for languages and toolkits that are compiled to run on the Web platform.

  • Photo/video editing.

  • Games:

    • Casual games that need to start quickly.

    • AAA games with heavy assets.

    • Game portals (agency/original game platforms).

  • P2P applications (games, real-time co-editing)

  • Music player (streaming, caching)

  • Image recognition

  • Live video

  • VR and augmented reality

  • CAD software

  • Scientific visualization and simulation

  • Interactive educational software and news articles.

  • Platform simulation/emulation (ARC, DOSBox, QEMU, MAME…).

  • Language compiler/virtual machine.

  • POSIX user-space environment that allows porting of existing POSIX applications.

  • Developer tools (editors, compilers, debuggers…)

  • Remote desktop.

  • VPN.

  • Encryption tools.

  • Local Web server.

  • Replacements for plug-ins once distributed via NPAPI, running within the Web security model and using Web APIs.

  • Fat clients for enterprise applications (e.g., databases)

Outside the browser

  • Game distribution services (portable, secure).

  • Server-side execution of untrusted code.

  • Server applications.

  • Hybrid native apps on mobile devices.

  • Symmetric computing across multiple nodes

Several large commercial applications already use it:

  • Google Earth

  • AutoCAD

  • The Unity and Unreal game engines

  • Figma (a UI design tool popular abroad that is fast becoming a replacement for Sketch)

In addition, Wasm is sandboxed by design, which led a founder of Docker to remark that if Wasm had been launched ten years earlier, he would not have needed to create Docker.

On the server side, check out Second State, which uses Wasm for server-side development.