What is the article about?

Learn about the past and present of WebAssembly, and how this effort to make the Web capable of far more has played out across the Web and Node.js.

In this article you will learn about native WebAssembly, AssemblyScript, the Emscripten compiler, and how to debug WebAssembly programs in a browser.

Finally, it looks into the future of WebAssembly and lists some exciting technology directions.

This article is intended as both a quick start and an in-depth walkthrough for those who are interested in WebAssembly but haven't had the time to explore it. Hopefully it will serve as an interesting guide on your way to learning WebAssembly.

This article also attempts to answer some of the questions from the previous shared article: Getting Started with WebAssembly: How to use it with C projects

How do I compile complex CMake projects into WebAssembly?
What is a common set of best practices for compiling complex CMake projects into WebAssembly?
How do I debug CMake projects?

Why is WebAssembly needed?

The Achilles' heel of dynamic languages

First let’s look at the JS code execution process:

This is the previous ChakraCore engine structure of Microsoft Edge. The JS engine of Microsoft Edge has switched to V8.

The overall process is:

Get the JS source code and hand it to the Parser to generate the AST.
The ByteCode Compiler compiles the AST to ByteCode.
The ByteCode goes into an interpreter, which translates the ByteCode into Machine Code instruction by instruction and then executes it.

In practice, however, much of the code we write can be optimized. For example, if the same function is executed many times, it can be marked as hot and handed to the JIT (just-in-time) Compiler, which generates optimized Machine Code for it. The next time the function is executed, it does not need to go through the parser-compiler-interpreter process and can directly run the prepared Machine Code, greatly improving execution efficiency.

However, this JIT optimization only holds while variable types stay stable, for example when the function being optimized has exactly two parameters, each with a definite type. JavaScript is a dynamically typed language, so during execution the types may change: the function may suddenly receive three arguments, or the first argument may change from an object to an array. This invalidates the JIT result and forces the code back through the parser-compiler-interpreter-execution path, and parsing and compiling are the two most time-consuming steps in the entire code execution process. This is also why the Web struggles to run some high-performance applications, such as large games and video editing, with JavaScript alone.
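As a small illustration of the type instability described above, the sketch below calls one function with several argument shapes. The function and variable names are made up for this example; the point is only that the engine cannot assume a single type for `x` and `y` ahead of time.

```javascript
// One source-level function, several possible argument shapes at run time.
function add(x, y) {
  return x + y;
}

const numeric = add(1, 2);    // number + number: numeric addition
const mixed = add("1", 2);    // string + number: string concatenation
const arrays = add([1], [2]); // arrays are converted to strings first

console.log(numeric, mixed, arrays);
```

Each call takes a different path through the `+` operator, so machine code specialized for numbers cannot be reused for the other two calls.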

Static language optimization

In fact, one of the main reasons JS executes slowly is its dynamic nature, which leads to JIT failure. If we can introduce static features into JS, the JIT can stay effective, which is bound to speed up JS execution. This is where asm.js came in.

asm.js provides only two data types:

32-bit signed integers
64-bit floating point numbers

Other data such as strings, booleans, or objects is stored in memory as numeric values and accessed via TypedArray.

The ArrayBuffer object, TypedArray views, and DataView views are interfaces that let JavaScript manipulate binary data using array syntax; they are collectively referred to as binary arrays. See the ArrayBuffer reference.

Integers and floating point numbers are represented as follows:

var a = 1;
var x = a | 0; // x is a 32-bit integer
var y = +a; // y is a 64-bit floating point number

The function is written as follows:

function add(x, y) {
  x = x | 0;
  y = y | 0;
  return (x + y) | 0;
}

In the function above, both the parameters and the return value are explicitly coerced to a declared type, which is a 32-bit integer.
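To make the effect of these annotations concrete, here is a minimal sketch in plain JavaScript. The helper names `toInt32` and `toDouble` are made up for illustration; the operators themselves are standard JavaScript, and a regular engine runs this code even without asm.js-specific optimizations.

```javascript
// `value | 0` applies ToInt32: truncate toward zero, then wrap modulo 2^32.
function toInt32(value) {
  return value | 0;
}

// Unary plus applies ToNumber, which yields a 64-bit float.
function toDouble(value) {
  return +value;
}

// Integer addition in the asm.js style: coerce the inputs, then coerce the sum.
function add(x, y) {
  x = x | 0;
  y = y | 0;
  return (x + y) | 0;
}

console.log(toInt32(3.7));        // fractional part is dropped
console.log(toInt32(2147483648)); // 2^31 wraps around to the smallest int32
console.log(toDouble("1.5"));
console.log(add(2147483647, 1));  // the sum overflows and wraps as well
```

Note that the final `| 0` in `add` matters: without it, the sum of two 32-bit integers could leave the 32-bit range and the result would no longer be an integer of the declared type.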

In addition, asm.js provides no garbage collection mechanism; all memory operations are controlled by the developer. Memory is read and written directly through a TypedArray:

var buffer = new ArrayBuffer(32768);
var HEAP8 = new Int8Array(buffer);
function compiledCode(ptr) {
  HEAP8[ptr] = 12;
  return HEAP8[ptr + 4];
}

As you can see, asm.js is a strict subset of JavaScript that requires variable types to be fixed and immutable at run time, and it drops JavaScript's garbage collection, so developers must manage memory manually. This allows the JS engine to apply many JIT optimizations to asm.js code; asm.js runs in browsers at roughly 50% of the speed of native machine code.
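The idea of managing memory by hand on top of a typed array can be sketched as follows. This is an illustrative toy, not what asm.js compilers actually emit: the bump allocator `alloc` and the helpers `storeString` and `loadString` are hypothetical names, and the allocator never frees anything.

```javascript
// One linear block of memory, viewed as unsigned bytes.
const buffer = new ArrayBuffer(1024);
const HEAPU8 = new Uint8Array(buffer);

// Hypothetical bump allocator: hands out consecutive regions, never reuses them.
let nextFree = 8; // keep address 0 unused so it can act as a null pointer

function alloc(size) {
  if (nextFree + size > HEAPU8.length) {
    throw new RangeError("out of memory");
  }
  const ptr = nextFree;
  nextFree += size;
  return ptr;
}

// Store a string as raw character codes followed by a 0 terminator.
function storeString(text) {
  const ptr = alloc(text.length + 1);
  for (let i = 0; i < text.length; i++) {
    HEAPU8[ptr + i] = text.charCodeAt(i) & 0xff;
  }
  HEAPU8[ptr + text.length] = 0;
  return ptr;
}

// Read the bytes back until the terminator is reached.
function loadString(ptr) {
  let text = "";
  while (HEAPU8[ptr] !== 0) {
    text += String.fromCharCode(HEAPU8[ptr++]);
  }
  return text;
}

const p = storeString("wasm");
console.log(p, loadString(p));
```

Here a "string" is nothing more than a number (its address) plus bytes in the heap, which is the representation the paragraph above describes for non-numeric data.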

A new approach

However, no matter how static asm.js becomes, and even though it eliminates some time-consuming high-level abstractions (garbage collection, etc.), it is still JavaScript. Its code still has to pass through the parser and compiler, and these two steps are the most time-consuming in code execution.

In pursuit of extreme performance, Web developers at the cutting edge moved beyond JavaScript and created WebAssembly, an assembly-like language that sits very close to Machine Code and skips the parser-compiler steps. WebAssembly is also strongly and statically typed, which allows maximum JIT optimization and lets WebAssembly approach the speed of native code such as C/C++.

Equivalent to the following process:

What is WebAssembly?

To get a sense of where WebAssembly fits into the Web, use this diagram:

WebAssembly (also known as WASM) is a new language format that can run on the Web. It is small, fast, and highly portable. At the low level it sits alongside JavaScript in the Web stack, and it is the fourth Web language recognized by the W3C.

There are several reasons why it is considered to sit at the same level as JavaScript:

It is executed at the same level as JavaScript: by the JS engine, such as Chrome's V8
It can manipulate various Web APIs, just like JavaScript

WASM can also run in Node.js or other WASM runtimes.

WebAssembly text format

WASM is actually a binary format that can be executed directly. To make it easier to display in a text editor or developer tools, WASM also defines an "intermediate" text format, with the extension .wat or .wast. Tools such as WABT then convert the text format into executable binary code, with .wasm as the extension.

Take a look at the module code in WASM text format:

(module
  (func $i (import "imports" "imported_func") (param i32))
  (func (export "exported_func")
    i32.const 42
    call $i
  )
)

The code logic is as follows:

A WASM module is defined first. It imports a function imported_func from a JS module named imports, names it $i, and declares that it takes one i32 parameter
It then exports a function named exported_func, which can be imported and called from a Web app, for example from JS
Inside that function, it pushes the i32 value 42 as the argument and calls the function $i

We convert the above text format to binary via WABT:

Copy the above code into a new file named simple.wat and save it
Compile it using WABT

Once you have WABT installed, run the following command to compile:

wat2wasm simple.wat -o simple.wasm

The text is now converted to binary, but the result cannot be viewed in a text editor. To see the binary, we can add the -v option at compile time to print the content on the command line:

wat2wasm simple.wat -v

The following output is displayed:

As you can see, WebAssembly is code in a binary format. Even though it provides a somewhat more readable text format, that format is hard to use for actual coding, let alone efficient development.
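To connect the text format with the JavaScript side, the sketch below loads the binary form of the simple module shown earlier and calls its export. The byte array was assembled by hand for this example (section by section, following the binary format) rather than copied from a tool's output, and the synchronous `WebAssembly.Module` and `WebAssembly.Instance` constructors are used only to keep the example short; `received` and `wasmModule` are names chosen for illustration.

```javascript
// Binary encoding of the simple module, one section per group of lines.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm", version 1
  // type section: type 0 = (i32) -> (), type 1 = () -> ()
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,
  // import section: "imports"."imported_func", a function of type 0
  0x02, 0x19, 0x01,
  0x07, 0x69, 0x6d, 0x70, 0x6f, 0x72, 0x74, 0x73,
  0x0d, 0x69, 0x6d, 0x70, 0x6f, 0x72, 0x74, 0x65, 0x64, 0x5f, 0x66, 0x75, 0x6e, 0x63,
  0x00, 0x00,
  // function section: one local function of type 1
  0x03, 0x02, 0x01, 0x01,
  // export section: "exported_func" refers to function index 1
  0x07, 0x11, 0x01,
  0x0d, 0x65, 0x78, 0x70, 0x6f, 0x72, 0x74, 0x65, 0x64, 0x5f, 0x66, 0x75, 0x6e, 0x63,
  0x00, 0x01,
  // code section: i32.const 42; call 0; end
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b,
]);

// The import object mirrors the (import "imports" "imported_func") declaration.
const received = [];
const importObject = {
  imports: { imported_func: (value) => received.push(value) },
};

const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule, importObject);

// Calling the export runs "i32.const 42; call $i", which calls back into JS.
instance.exports.exported_func();
console.log(received);
```

In a browser, the asynchronous `WebAssembly.instantiate` or `WebAssembly.instantiateStreaming` functions are usually the better choice for modules of realistic size.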

An attempt at WebAssembly as a programming language

Because the binary and text formats above are not suitable for hand coding, WASM itself is not a practical language for everyday development.

AssemblyScript is a variant of TypeScript that adds WebAssembly types to JavaScript and can be compiled into WebAssembly using Binaryen.

The WebAssembly types are roughly as follows:

i32, u32, i64, and v128
Small integers: i8, u8, etc.
Variable-size integer types: isize, usize, etc.

AssemblyScript is statically compiled by Binaryen, ahead of time, into a strongly typed WebAssembly binary, which is then handed to the JS engine for execution. So although AssemblyScript adds a layer of abstraction, the code that actually runs in production is still WebAssembly, which preserves WebAssembly's performance benefits. AssemblyScript is designed to be much like TypeScript, and provides a set of built-in functions that give direct access to WebAssembly and compiler features.

Built-in functions:

Static type checking:
- function isInteger<T>(value?: T): bool, etc.
Utility functions:
- function sizeof<T>(): usize, etc.
WebAssembly operations:
- Mathematical operations
  - function clz<T>(value: T): T
- Memory operations
  - function load<T>(ptr: usize, immOffset?: usize): T, etc.
- Control flow
  - function select<T>(ifTrue: T, ifFalse: T, condition: bool): T, etc.
- SIMD
- Atomics
- Inline instructions

A standard library is then built on top of this set of built-in functions.

The standard library:

Globals
Array
ArrayBuffer
DataView
Date
Error
Map
Math
Number
Set
String
Symbol
TypedArray

A typical Array is used as follows:

var arr = new Array<string>(10)
// arr[0]; // would cause an error 😢
// initialize
for (let i = 0; i < arr.length; ++i) {
  arr[i] = ""
}
arr[0]; // works correctly 😊

AssemblyScript adds TypeScript-like syntax on top of JavaScript, but in use it keeps the requirements of statically, strongly typed languages such as C/C++. In the example above, accessing an element before it has been initialized causes an error.

There are also some extension libraries, such as process and crypto modeled on Node.js, the JS console, and some memory-related ones such as StaticArray and heap.

With its basic types, built-in functions, standard library, and extension libraries, AssemblyScript covers almost all of the features that JavaScript has, while offering a TypeScript-like syntax and strictly following the conventions of strongly typed static languages.

Because the ES module specification for WebAssembly is still in draft form, AssemblyScript implements modules itself. For example, to export from a module:

// env.ts
export function doSomething(foo: i32): void {
  /* ... function body ... */
}

Import a module:

import { doSomething } from "./env";

An example of a large block of code using a class:

class Animal<T> {
  static ONE: i32 = 1;
  static add(a: i32, b: i32): i32 { return a + b + Animal.ONE; }

  two: i16 = 2; // 6
  instanceSub<T>(a: T, b: T): T { return a - b + <T>Animal.ONE; } // tsc does not allow this
}

export function staticOne(): i32 {
  return Animal.ONE;
}

export function staticAdd(a: i32, b: i32): i32 {
  return Animal.add(a, b);
}

export function instanceTwo(): i32 {
  let animal = new Animal<i32>();
  return animal.two;
}

export function instanceSub(a: f32, b: f32): f32 {
  let animal = new Animal<f32>();
  return animal.instanceSub<f32>(a, b);
}

AssemblyScript opens the door to efficient coding with TS syntax, static strong typing, and easy access to WebAssembly and compiler APIs. The code is compiled into a WASM binary using the Binaryen compiler and then runs with WASM performance.

Thanks to its flexibility and performance, AssemblyScript has built a thriving ecosystem of applications. It is already widely used in blockchain, build tools, editors, simulators, games, graphics editing tools, libraries, IoT, testing tools, and more: www.assemblyscript.org/built-with-…

A stroke of genius: running C/C++ code in a browser

AssemblyScript goes a long way toward fixing WebAssembly's weakness in coding efficiency, but as a new programming language, its biggest disadvantages are its young ecosystem, small developer base, and lack of accumulated libraries.

The designers of WebAssembly clearly had the bigger picture in mind. Since WebAssembly is a binary format, it can serve as a compilation target for other languages. If a compiler can compile an existing, mature language with a large number of developers and a powerful ecosystem into WebAssembly, then years of accumulated work in that language can be reused directly to enrich the WebAssembly ecosystem and run on the Web and Node.js.

Fortunately, Emscripten already exists as an excellent compiler for C/C++.

Emscripten's position in the development pipeline can be illustrated by the following diagram:

C/C++ code (or Rust/Go, etc.) is compiled into WASM, which then runs in the browser (or the Node.js runtime) with the help of JS glue code that provides the WASM runtime. For example, once ffmpeg is compiled for the Web with the Emscripten compiler, audio and video can be transcoded directly in the browser front end.

The JS "glue" code above is necessary because, to compile C/C++ into WASM and execute it in the browser, the operations that C/C++ relies on have to be mapped onto Web APIs. The glue code currently covers some of the more popular C/C++ libraries, such as SDL, OpenGL, and OpenAL, as well as parts of the POSIX API.

The biggest use of WebAssembly today is this approach of compiling C/C++ modules into WASM; well-known examples include large libraries and applications such as Unreal Engine 4 and Unity.

Will WebAssembly replace JavaScript?

The answer is no.

Based on the layers above, the design goals of WASM can be summarized as follows:

Maximize the reuse of existing low-level language ecosystems, such as C/C++ in game development, compiler design, etc.
Achieve near-native performance on the Web, in Node.js, or in other WASM runtimes, which allows browsers to run large games, image editing, and more
Maintain maximum Web compatibility and security
Stay easy to write and debug when hand coding is needed, a goal that AssemblyScript takes a step further

So, with good reason, WebAssembly fits better into this diagram:

WASM bridges the ecosystems of various systems programming languages, further complements the Web development ecosystem, and gives JS a performance boost. It is an important piece of the map that Web development has been missing until now.

Rust Web Framework: github.com/yewstack/ye…

Explore Emscripten in depth

Address: github.com/emscripten-…

All of the demos below can be found in the repository: code.byted.org/huangwei.fp…

Stars: 21.4K

Maintenance: Active

Emscripten is an open source, cross-platform compiler toolchain for compiling C/C++ into WebAssembly. It consists of LLVM, Binaryen, Closure Compiler, and other tools.

The core tool of Emscripten is the Emscripten Compiler Frontend (emcc), which is used to compile C/C++ code in place of native compilers such as GCC or Clang.

In fact, so that almost any portable C/C++ code base can be compiled into WebAssembly and executed on the Web or in Node.js, the Emscripten runtime also maps the standard C/C++ libraries and related APIs onto Web/Node.js APIs. This mapping lives in the compiled JS glue code.

The red part is Emscripten's compilation output, and the green part is the runtime support that Emscripten provides so that C/C++ code can run:

Experience “Hello World” briefly

It is worth noting that WebAssembly-related toolchains are almost always distributed as source code, probably due to the habits of the C/C++ ecosystem.

To complete a simple C/C++ program running on the Web, we first need to install Emscripten's SDK:

# Clone the repository
git clone https://github.com/emscripten-core/emsdk.git
# Go to the repository
cd emsdk
# Install the SDK; we installed version 1.39.18
./emsdk install 1.39.18
# Activate version 1.39.18
./emsdk activate 1.39.18
# Add the corresponding environment variables to the system PATH
source ./emsdk_env.sh
# Run the command to test whether the installation is successful
emcc -v

If the installation is successful, the following output is displayed after the preceding command is executed:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 1.39.18
clang version 11.0.0 (/b/s/w/ir/cache/git/chromium.googlesource.com-external-github.com-llvm-llvm--project 613c4a87ba9bb39d1927402f4dd4c1ef1f9a02f7)
Target: x86_64-apple-darwin21.1.0
Thread model: posix

Let’s prepare the initial code:

mkdir -p webassembly/hello_world
cd webassembly/hello_world && touch main.c

Add the following code to main.c:

#include <stdio.h>

int main() {
  printf("hello, world!\n");
  return 0;
}

Then use emcc to compile the C code. Switch to the webassembly/hello_world directory on the command line and run:

emcc main.c

This command outputs two files: a.out.js and a.out.wasm. The latter is the compiled WASM code, and the former is the JS glue code, which provides the WASM runtime.

You can use Node.js for quick tests:

node a.out.js

It prints "hello, world!", which means we have successfully run C/C++ code in Node.js.

Next, let's try running the code in a Web environment by modifying the compile command as follows:

emcc main.c -o main.html

The command above generates three files:

main.js: the glue code
main.wasm: the WASM code
main.html: loads the glue code, which in turn executes the WASM logic

Emscripten follows certain rules when generating code; for details, see: emscripten.org/docs/compil…

To open this HTML file in a browser, you need to start a local server. When a page is opened directly through the file:// protocol, mainstream browsers do not allow XHR requests; they only work when the page is served over HTTP. So we run the following command to serve the site:

npx serve .

Open localhost:3000/main.html in the browser, and you can see the following result:

The output is also printed in the developer tools:

Try calling C/C++ functions in JS

The last section gave us a taste of how to run C programs on the Web and in Node.js, but we still have a long way to go if we want complex C/C++ applications like Unity to run on the Web. One step on that path is being able to call C/C++ functions from JS.

Let’s create a new function.c file in our directory and add the following code:

#include <stdio.h>
#include <emscripten/emscripten.h>

int main() {
    printf("Hello World\n");
}

EMSCRIPTEN_KEEPALIVE void myFunction(int argc, char ** argv) {
    printf("MyFunction Called\n");
}

It is worth noting that code compiled by Emscripten only calls main by default, and everything else is removed as "dead code" at compile time. So to use the myFunction we defined above, we need to put the EMSCRIPTEN_KEEPALIVE declaration before its definition to make sure the myFunction code is not removed at compile time.

We need to include the emscripten/emscripten.h header file to use the EMSCRIPTEN_KEEPALIVE declaration.

We also need to extend the compile command as follows:

emcc function.c -o function.html -s NO_EXIT_RUNTIME=1 -s "EXTRA_EXPORTED_RUNTIME_METHODS=['ccall']"

Two additional parameters were added above:

-s NO_EXIT_RUNTIME=1 means that after the main function has run, the program does not exit but stays in an executable state, which makes it possible to call myFunction afterwards
-s "EXTRA_EXPORTED_RUNTIME_METHODS=['ccall']" exports the runtime function ccall, which can call C functions from JS
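To give a rough idea of what a helper like ccall does, here is a simplified sketch. It is not Emscripten's implementation: `FakeModule` is a stand-in for the object created by the generated glue code, `miniCcall` is a hypothetical name, and the sketch handles plain numbers only, whereas the real ccall also converts strings and arrays. It relies on the convention that compiled C functions are exposed on the module object with a leading underscore.

```javascript
// Stand-in for the Module object that the emcc-generated glue code creates.
const calls = [];
const FakeModule = {
  _myFunction: () => { calls.push("myFunction"); },
  _add: (a, b) => (a + b) | 0,
};

// Simplified ccall: look up the exported function by name and invoke it.
// Only numeric arguments and return values are supported in this sketch.
function miniCcall(mod, name, returnType, argTypes, args) {
  const fn = mod["_" + name];
  if (typeof fn !== "function") {
    throw new Error("function not exported: " + name);
  }
  const result = fn(...(args || []));
  // A null return type means the C function returns void.
  return returnType === null ? undefined : result;
}

miniCcall(FakeModule, "myFunction", null, null, null);
const sum = miniCcall(FakeModule, "add", "number", ["number", "number"], [2, 3]);
console.log(calls, sum);
```

The lookup step also shows why EMSCRIPTEN_KEEPALIVE matters: a function that was removed as dead code has no entry on the module object, so there is nothing for the helper to call.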

After compiling, we also need to modify the generated function.html file to add our function call logic as follows:

<html>
  <body>
    <!-- Other HTML content -->
    <button class="mybutton">Run myFunction</button>
  </body>
  <!-- Other JS imports -->
  <script>
    document.querySelector(".mybutton").addEventListener("click", function () {
      alert("check console");
      var result = Module.ccall(
        "myFunction", // C function name
        null, // function return type
        null, // function argument types, an array by default
        null // function arguments, an array by default
      );
    });
  </script>
</html>

As you can see, we added a button, and then added a script that registers a click handler for the button. In the callback function, we call myFunction.

Run npx serve . at the command line, open a browser at http://localhost:3000/function.html, and view the results as follows:

When only main is executed:

After clicking the button to execute myFunction:

As you can see, the alert box is displayed first; then, in the console, you can see the result of calling myFunction, which prints "MyFunction Called".

First taste of the Emscripten file system

In C/C++ applications we can use libc stdio APIs such as fopen and fclose to access the file system, but JS runs in a sandbox provided by the browser and cannot access the local file system directly. So to stay compatible with C/C++ programs that access the file system, Emscripten emulates a file system in its JS glue code and exposes it through the same libc stdio API.

Let's create a new program called file.c and add the following code:

#include <stdio.h>

int main() {
  FILE *file = fopen("file.txt", "rb");
  if (!file) {
    printf("cannot open file\n");
    return 1;
  }
  while (!feof(file)) {
    int c = fgetc(file); // int, so that EOF can be told apart from valid bytes
    if (c != EOF) {
      putchar(c);
    }
  }
  fclose(file);
  return 0;
}

In the code above, we first use fopen to open file.txt and then read and print the file's contents character by character. If the file cannot be opened, the program prints an error.

Create a file called file.txt in the directory and add the following contents:

==
This data has been read from a file.
The file is readable as if it were at the same location in the filesystem, including directories, as in the local filesystem where you compiled the source.
==

To compile this program and make sure it runs properly in JS, we need to add the --preload-file parameter at compile time so that the Emscripten runtime loads the file in advance. File access in C/C++ programs is synchronous, whereas JS is asynchronous and event-driven, and on the Web files can only be fetched asynchronously via XHR (Web Workers and Node.js can access files synchronously). The files therefore have to be loaded ahead of time, so that they are ready before the compiled code runs and the C/C++ code can access them directly.
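The preloading idea can be sketched with a tiny in-memory file system. This is only an illustration of the principle, not Emscripten's actual file system code: `preload`, `openFile`, and `getChar` are hypothetical names, and the file content is passed in directly where a real page would first fetch it asynchronously.

```javascript
// Hypothetical in-memory file system: path -> bytes.
const files = new Map();

// In a real page this step would run after an asynchronous fetch has finished,
// and before main() is called.
function preload(path, text) {
  files.set(path, new TextEncoder().encode(text));
}

const EOF = -1;
const openFiles = []; // the array index serves as the file handle

// Synchronous fopen-like call: it only succeeds for preloaded paths.
function openFile(path) {
  const data = files.get(path);
  if (!data) return -1;
  openFiles.push({ data, pos: 0 });
  return openFiles.length - 1;
}

// Synchronous fgetc-like call: returns the next byte, or EOF at the end.
function getChar(handle) {
  const f = openFiles[handle];
  return f.pos < f.data.length ? f.data[f.pos++] : EOF;
}

preload("file.txt", "This data has been read from a file.\n");

// From here on, everything is synchronous, just like the C program expects.
const handle = openFile("file.txt");
let text = "";
for (let c = getChar(handle); c !== EOF; c = getChar(handle)) {
  text += String.fromCharCode(c);
}
console.log(text);
```

Because the bytes are already in memory when the read loop starts, the synchronous C-style API never has to wait for the network.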

Run the following command to compile the code:

emcc file.c -o file.html -s EXIT_RUNTIME=1 --preload-file file.txt

-s EXIT_RUNTIME=1 makes the runtime shut down once main has finished, which flushes the buffered stdio output.

Then we run the local server and visit http://localhost:3000/file.html to view the results:

Try compiling an existing WebP module and using it

With the three examples above, we've seen how basic C/C++ features such as printing, function calls, and file system access can be compiled into WASM and run in JS environments, namely the Web and Node.js. Most C/C++ programs you write yourself can be compiled into WASM by following these examples.

As mentioned earlier, one of the biggest application scenarios for WebAssembly today is to maximize the reuse of existing language ecosystems, such as libraries in the C/C++ ecosystem. These libraries often rely on the C standard library, the operating system, the file system, or other dependencies. The great thing about Emscripten is that it is compatible with most of these dependencies and is usable enough in practice, albeit with some limitations.

Simple test

Let's take a look at how to compile an existing, complex, and widely used C module, libwebp, into WASM so that it can be used on the Web. The source code of libwebp is written in C and can be found on GitHub, along with some of its API documentation.

To get the code ready, run the following command in our directory:

git clone https://github.com/webmproject/libwebp

To quickly test whether libwebp is hooked up correctly, we can write a simple C function that calls libwebp's version function and check that the version is returned correctly.

Create a webp.c file in the directory and add the following:

#include "emscripten.h"
#include "src/webp/encode.h"

EMSCRIPTEN_KEEPALIVE int version() {
  return WebPGetEncoderVersion();
}

WebPGetEncoderVersion is the libwebp function that returns the current version. We get access to it by including the src/webp/encode.h header file. At compile time, we need to tell the compiler where the libwebp header files are and pass all the C files of the libwebp library to the compiler.

Let's run the following compile command:

emcc -O3 -s WASM=1 -s EXTRA_EXPORTED_RUNTIME_METHODS='["cwrap"]' \
  -I libwebp \
  webp.c \
  libwebp/src/{dec,dsp,demux,enc,mux,utils}/*.c

The above commands mainly do the following:

-I libwebp tells the compiler where the libwebp library's header files are
libwebp/src/{dec,dsp,demux,enc,mux,utils}/*.c passes the required C files to the compiler. Here all C files in the dec, dsp, demux, enc, mux, and utils directories are passed, which avoids the hassle of listing the required files one by one and lets the compiler identify and filter out the files that are not used
webp.c is the C function we wrote, which calls WebPGetEncoderVersion to get the library version
-O3 enables level 3 optimization at compile time, including function inlining, dead code removal, various code size optimizations, and so on
-s WASM=1 is actually the default and makes the compiler output xx.out.wasm. It is set here mainly as a reminder that, for runtimes that do not support WASM, you can set -s WASM=0 to output equivalent JS code instead of WASM
EXTRA_EXPORTED_RUNTIME_METHODS='["cwrap"]' exports the runtime function cwrap which, similar to ccall, allows C functions to be called from JS

The command above outputs a.out.js and a.out.wasm. We also need to create an HTML document that uses the output script. Create a new webp.html and add the following:

<html>
  <head></head>
  <body></body>
  <script src="./a.out.js"></script>
  <script>
    Module.onRuntimeInitialized = async _ => {
      const api = {
        version: Module.cwrap('version', 'number', []),
      };
      console.log(api.version());
    };
  </script>
</html>

Note that we usually perform our WASM-related operations inside the Module.onRuntimeInitialized callback. The WASM code needs some time between being loaded and becoming available, and the onRuntimeInitialized callback guarantees that the WASM code has been loaded and is ready to use.

Then we can run npx serve . and visit http://localhost:3000/webp.html to view the results:

You can see that the console prints the version 66049.

libwebp encodes the version a.b.c as the hexadecimal number 0xaabbcc, using one byte per component. For example, v0.6.1 is encoded as 0x000601 in hexadecimal, which is 1537 in decimal. The value printed here is 66049 in decimal, or 0x010201 in hexadecimal, indicating that the current version is v1.2.1.
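The decoding described above can be written as a small helper. The function name `decodeWebPVersion` is made up for this example; it simply splits the packed number into its three bytes.

```javascript
// Split a packed libwebp version number into major.minor.revision,
// one byte per component.
function decodeWebPVersion(packed) {
  const major = (packed >> 16) & 0xff;
  const minor = (packed >> 8) & 0xff;
  const revision = packed & 0xff;
  return `${major}.${minor}.${revision}`;
}

console.log(decodeWebPVersion(66049));    // the value printed in the console above
console.log(decodeWebPVersion(0x000601)); // the v0.6.1 example
```

In the page, the same helper could be applied to the return value of api.version() to print a readable version string.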

Get the image in JavaScript and run it in WASM

Having verified that libwebp has been successfully compiled into WASM and can be used from JavaScript, by calling the encoder's WebPGetEncoderVersion function to get the version number, we'll move on to a more complicated operation: converting image formats using libwebp's encoding API.

libwebp's encoding API expects a byte array in RGB, RGBA, BGR, or BGRA format. Fortunately, the Canvas API provides the CanvasRenderingContext2D.getImageData method, which returns an ImageData object whose data property is a Uint8ClampedArray containing the image data in RGBA format.
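To make the RGBA layout concrete, the sketch below builds a tiny 2x2 image by hand in the same layout that ImageData uses: 4 bytes per pixel, stored row by row. The pixel values and the helper `pixelAt` are invented for illustration, so the example runs without a canvas.

```javascript
// A 2x2 image laid out like ImageData.data: R, G, B, A for each pixel, row by row.
const width = 2;
const height = 2;
const data = new Uint8ClampedArray([
  255, 0, 0, 255,   0, 255, 0, 255,    // row 0: red, green
  0, 0, 255, 255,   255, 255, 255, 0,  // row 1: blue, fully transparent white
]);

// Each row occupies width * 4 bytes.
const bytesPerRow = width * 4;

// Look up the four channel values of the pixel at column x, row y.
function pixelAt(x, y) {
  const offset = y * bytesPerRow + x * 4;
  return {
    r: data[offset],
    g: data[offset + 1],
    b: data[offset + 2],
    a: data[offset + 3],
  };
}

console.log(pixelAt(0, 1), bytesPerRow, data.length === bytesPerRow * height);
```

The total size, width * height * 4 bytes, is the amount of memory that has to be reserved on the WASM side to hold such an image.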

First we need to write a JavaScript function that loads the image, and add it to the HTML file created in the previous step:

<script src="./a.out.js"></script>
<script>
  Module.onRuntimeInitialized = async _ => {
    const api = {
      version: Module.cwrap('version', 'number', []),
    };
    console.log(api.version());
  };

  async function loadImage(src) {
    // Load the image
    const imgBlob = await fetch(src).then(resp => resp.blob());
    const img = await createImageBitmap(imgBlob);
    // Make the canvas the same size as the image
    const canvas = document.createElement('canvas');
    canvas.width = img.width;
    canvas.height = img.height;
    // Draw the image to the canvas
    const ctx = canvas.getContext('2d');
    ctx.drawImage(img, 0, 0);
    return ctx.getImageData(0, 0, img.width, img.height);
  }
</script>

Now all that remains is to copy the image data from JavaScript to WASM. To do this, we need to expose two additional functions in webp.c:

A function that allocates memory for the image inside WASM
A function that frees that memory

Modify webp.c as follows:

#include <stdlib.h> // malloc, free

EMSCRIPTEN_KEEPALIVE
uint8_t* create_buffer(int width, int height) {
  return malloc(width * height * 4 * sizeof(uint8_t));
}

EMSCRIPTEN_KEEPALIVE
void destroy_buffer(uint8_t* p) {
  free(p);
}

create_buffer allocates memory for an RGBA image, that is, 4 bytes per pixel, each of type uint8_t, and malloc() returns a pointer to the start of that memory. When this pointer is returned to JavaScript, it is treated as a plain number. Once the C functions have been exposed to JavaScript via cwrap, you can use this number to find where the memory for the image data starts.

We added additional code to the HTML file as follows:

<script src="./a.out.js"></script>
<script>
  Module.onRuntimeInitialized = async _ => {
    const api = {
      version: Module.cwrap('version', 'number', []),
      create_buffer: Module.cwrap('create_buffer', 'number', ['number', 'number']),
      destroy_buffer: Module.cwrap('destroy_buffer', '', ['number']),
      encode: Module.cwrap("encode", "", ["number", "number", "number", "number"]),
      free_result: Module.cwrap("free_result", "", ["number"]),
      get_result_pointer: Module.cwrap("get_result_pointer", "number", []),
      get_result_size: Module.cwrap("get_result_size", "number", []),
    };

    const image = await loadImage('./image.jpg');
    const p = api.create_buffer(image.width, image.height);
    Module.HEAP8.set(image.data, p);
    // ... call encoder ...
    api.destroy_buffer(p);
  };

  async function loadImage(src) {
    // Load the image
    const imgBlob = await fetch(src).then(resp => resp.blob());
    const img = await createImageBitmap(imgBlob);
    // Make the canvas the same size as the image
    const canvas = document.createElement('canvas');
    canvas.width = img.width;
    canvas.height = img.height;
    // Draw the image to the canvas
    const ctx = canvas.getContext('2d');
    ctx.drawImage(img, 0, 0);
    return ctx.getImageData(0, 0, img.width, img.height);
  }
</script>

As you can see, in addition to wrapping the create_buffer and destroy_buffer functions we added earlier, the code also wraps several functions used for encoding, which we'll cover later. The code first loads the image image.jpg, then calls the C function to allocate memory for the image data and gets back the corresponding pointer. It then uses Module.HEAP8, the view of the WebAssembly memory, to write the image data starting at that pointer, and finally frees the allocated memory.
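The copy step can be tried in isolation with a plain WebAssembly.Memory object. In this sketch the memory, the address 1024, and the single-pixel image are all stand-ins chosen for illustration; in the real page the memory belongs to the compiled module and the address comes from create_buffer.

```javascript
// Stand-in for the module's linear memory: one 64 KiB page.
const memory = new WebAssembly.Memory({ initial: 1 });
const HEAPU8 = new Uint8Array(memory.buffer);
const HEAP8 = new Int8Array(memory.buffer);

// Stand-in for the pointer returned by create_buffer; in the real module,
// malloc chooses this address.
const p = 1024;

// Pixel data as it would come from getImageData (a single RGBA pixel here).
const imageData = new Uint8ClampedArray([10, 20, 30, 255]);

// Copy the pixels into linear memory, starting at the pointer.
HEAPU8.set(imageData, p);

// Both views look at the same bytes; only the interpretation differs.
console.log(HEAPU8[p + 3], HEAP8[p + 3]);
```

Writing through the signed HEAP8 view, as the page above does, stores the same bit patterns: the byte 255 simply reads back as -1 through the signed view, while the C code sees it as the uint8_t value 255.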

Encoding the image

Now that the image data has been loaded into WASM memory, we can call the libwebp encoder to do the encoding. According to the WebP documentation, the WebPEncodeRGBA function does this job. It receives a pointer to the image data, the image dimensions, and the stride, which is the number of bytes to advance from one row to the next; here that is the width times 4 bytes (RGBA). It also takes a quality parameter ranging from 0 to 100. During encoding, WebPEncodeRGBA allocates a block of memory for the output data, and we need to call WebPFree to release this memory once we are done with it.

We open the webp.c file and add the following code to handle the encoding:

int result[2];

EMSCRIPTEN_KEEPALIVE
void encode(uint8_t* img_in, int width, int height, float quality) {
  uint8_t* img_out;
  size_t size;

  size = WebPEncodeRGBA(img_in, width, height, width * 4, quality, &img_out);

  result[0] = (int)img_out;
  result[1] = size;
}

EMSCRIPTEN_KEEPALIVE
void free_result(uint8_t* result) {
  WebPFree(result);
}

EMSCRIPTEN_KEEPALIVE
int get_result_pointer() {
  return result[0];
}

EMSCRIPTEN_KEEPALIVE
int get_result_size() {
  return result[1];
}

The result of the above WebPEncodeRGBA function is to allocate a block of memory for the output data and the size of the returned memory. Array because C function cannot be used as the return value (unless we need to carry on the dynamic memory allocation), so we use a global static array to obtain the results returned, it may not be very normative writing C code, at the same time it requires wasm pointer to 32 – bit, but for the sake of simplicity we can tolerate this kind of practice for the time being.

Now that the C-side logic has been written, the JavaScript side can call the encoding function, get the pointer to the encoded image data and its size, copy this data out of Wasm memory into its own buffer, and then release the memory that Wasm allocated while processing the image. Let's open the HTML file and implement the logic described above:

<script src="./a.out.js"></script>
<script>
  Module.onRuntimeInitialized = async _ => {
    const api = {
      version: Module.cwrap('version', 'number', []),
      create_buffer: Module.cwrap('create_buffer', 'number', ['number', 'number']),
      destroy_buffer: Module.cwrap('destroy_buffer', '', ['number']),
      encode: Module.cwrap("encode", "", ["number", "number", "number", "number"]),
      free_result: Module.cwrap("free_result", "", ["number"]),
      get_result_pointer: Module.cwrap("get_result_pointer", "number", []),
      get_result_size: Module.cwrap("get_result_size", "number", []),
    };

    const image = await loadImage('./image.jpg');
    const p = api.create_buffer(image.width, image.height);
    Module.HEAP8.set(image.data, p);
    api.encode(p, image.width, image.height, 100);
    const resultPointer = api.get_result_pointer();
    const resultSize = api.get_result_size();
    // View onto the encoder output inside Wasm memory
    const resultView = new Uint8Array(Module.HEAP8.buffer, resultPointer, resultSize);
    // Copy the output out of Wasm memory before freeing it
    const result = new Uint8Array(resultView);
    api.free_result(resultPointer);
    api.destroy_buffer(p);
  };

  async function loadImage(src) {
    // Load the image
    const imgBlob = await fetch(src).then(resp => resp.blob());
    const img = await createImageBitmap(imgBlob);
    // Create a canvas with the same size as the image
    const canvas = document.createElement('canvas');
    canvas.width = img.width;
    canvas.height = img.height;
    // Draw the image to canvas
    const ctx = canvas.getContext('2d');
    ctx.drawImage(img, 0, 0);
    return ctx.getImageData(0, 0, img.width, img.height);
  }
</script>

In the above code we load a local image.jpg using the loadImage function. You need to place an image in the emcc output directory, which is also the directory of our HTML file.

Note: new Uint8Array(someBuffer) creates a new view onto the same block of memory, while new Uint8Array(someTypedArray) copies the data of someTypedArray into a new array. Make sure to operate on the copied data, so that the original memory is not modified and the data remains valid after the Wasm memory is freed.
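The difference between a view and a copy can be checked without any Wasm module. In this small sketch a plain ArrayBuffer stands in for the Wasm heap, which is an assumption made purely for illustration.

```javascript
const memory = new ArrayBuffer(8);         // stands in for the Wasm heap
const heap = new Uint8Array(memory);
heap.set([1, 2, 3, 4], 0);

const view = new Uint8Array(memory, 0, 4); // shares the same memory
const copy = new Uint8Array(view);         // independent copy of the bytes

heap[0] = 99; // simulate the memory being reused after it was freed

console.log(view[0]); // 99
console.log(copy[0]); // 1
```

This is why the example copies resultView into result before calling free_result: the copy survives even if the encoder's output buffer is later overwritten.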

Wasm does not automatically expand memory for large images. If the default memory allocation does not hold the input and output image data, you may get the following error:

However, the image used in our example is small, so simply adding the -s ALLOW_MEMORY_GROWTH=1 flag at compile time is enough to get rid of the error:

emcc -O3 -s WASM=1 -s EXTRA_EXPORTED_RUNTIME_METHODS='["cwrap"]' \
    -I libwebp \
    webp.c \
    libwebp/src/{dec,dsp,demux,enc,mux,utils}/*.c \
    -s ALLOW_MEMORY_GROWTH=1
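What memory growth means on the JavaScript side can be sketched with the standard WebAssembly.Memory API alone. The page counts below are arbitrary values chosen for illustration and are not what Emscripten uses.

```javascript
const PAGE_SIZE = 64 * 1024; // a WebAssembly page is 64 KiB

const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const before = new Uint8Array(memory.buffer);
const previousPages = memory.grow(1); // returns the page count before growing
const after = new Uint8Array(memory.buffer);

console.log(previousPages);     // 1
console.log(before.byteLength); // 0, the old buffer was detached
console.log(after.byteLength);  // 131072
```

Growing the memory detaches the old ArrayBuffer, so any typed array created earlier becomes empty. When memory growth is enabled, it is therefore safer to read Module.HEAP8 freshly each time than to cache a view across calls that may allocate.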

Running the above command again produces Wasm code with the encoding function added, along with the corresponding JavaScript glue code. When we open the HTML file, it can now encode a JPG file into WebP format. To prove this, we can display the image on the web page by modifying the HTML file and adding the following code:

<script>
  // ...
  api.encode(p, image.width, image.height, 100);
  const resultPointer = api.get_result_pointer();
  const resultSize = api.get_result_size();
  const resultView = new Uint8Array(Module.HEAP8.buffer, resultPointer, resultSize);
  const result = new Uint8Array(resultView);

  // Wrap the encoded bytes in a Blob and display it in an <img> element
  const blob = new Blob([result], { type: 'image/webp' });
  const blobURL = URL.createObjectURL(blob);
  const img = document.createElement('img');
  img.src = blobURL;
  document.body.appendChild(img);

  api.free_result(resultPointer);
  api.destroy_buffer(p);
</script>

Then refresh your browser and you should see something like this:

By downloading this file locally, you can see that its format is converted to WebP:

We successfully compiled the existing libwebp C library into WASM, and converted JPG images into WebP format and displayed them on the Web interface. Using WASM to handle computation-intensive transcoding operations can greatly improve the performance of Web pages. This is one of the main advantages WebAssembly brings.
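The output format can also be checked programmatically rather than by downloading the file. A WebP file is a RIFF container whose bytes 0 to 3 spell "RIFF" and whose bytes 8 to 11 spell "WEBP". The sketch below tests for that signature; looksLikeWebP is a hypothetical helper name, and the sample header is hand-written rather than real encoder output.

```javascript
// Hypothetical helper: checks the RIFF/WEBP container signature.
function looksLikeWebP(bytes) {
  if (bytes.length < 12) return false;
  const tag = (offset) =>
    String.fromCharCode(...bytes.subarray(offset, offset + 4));
  return tag(0) === 'RIFF' && tag(8) === 'WEBP';
}

const header = Uint8Array.from([
  0x52, 0x49, 0x46, 0x46, // "RIFF"
  0x1a, 0x00, 0x00, 0x00, // file size (little-endian), arbitrary here
  0x57, 0x45, 0x42, 0x50, // "WEBP"
]);
console.log(looksLikeWebP(header)); // true
```

In the page above, the same check could be applied to the result array before it is wrapped in a Blob.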

How to compile FFmpeg to WebAssembly?

Boy, we've only just covered 1+1, and we're already solving quadratic equations. 🌚

In the previous example we successfully compiled an existing C module into WebAssembly. However, many larger projects depend on the C standard library, the operating system, the file system, or other dependencies, and rely on tools such as Autoconf/Automake to generate system-specific code before compilation.

So you’ll often see libraries go through the following steps before they can be used:

./configure  # handle pre-build dependencies
make         # compile with gcc and generate object files

Emscripten provides emconfigure and emmake to wrap these commands and inject the appropriate parameters, smoothing over projects with such pre-build dependencies. If you use emcc to process a project with many pre-build dependencies, the commands look like this:

emconfigure ./configure  # replace gcc with emcc for the configure step
emmake make              # or emmake make -j4: produces Wasm object files instead of traditional C object files
emcc xxx.o               # compile the object files generated by make into a Wasm file + JS glue code

Next we'll show how to handle projects that rely on tools such as Autoconf/Automake to generate system-specific code by actually compiling FFmpeg.

In practice, compiling FFmpeg depends on the specific FFmpeg version, Emscripten version, operating system environment, and so on, so the following build only applies under specific conditions. It is mainly meant to provide an approach and debugging methods for compiling FFmpeg in general.

Prepare directory

This time we create a WebAssembly directory, which will hold the FFmpeg source code and the source code of the x264 encoder, and clone emsdk into it:

git clone https://github.com/emscripten-core/emsdk.git
# enter the repository
cd emsdk
# fetch the latest code (not needed right after a fresh clone)
git pull

Compilation step

When compiling most complex C/C++ libraries using Emscripten, there are three main steps:

Use emconfigure to run the project's configure file, switching the C/C++ compiler from gcc/g++ to emcc/em++
Use emmake make to build the C/C++ project and generate Wasm object (.o) files
Call emcc with the compiled .o object files to output the final Wasm and JS glue code

Install specific dependencies

Note: we already have a version of Emscripten installed at the beginning of this step, just to highlight the version.

To make sure the FFmpeg build works, we need to pin specific versions. The versions of everything we depend on are described in detail below.

First, install version 1.39.18 of the Emscripten compiler. Go into the emsdk project cloned earlier and run the following commands:

./emsdk install 1.39.18
./emsdk activate 1.39.18
source ./emsdk_env.sh

Run the following command on the CLI to check whether the switchover is successful:

emcc -v  # outputs 1.39.18

Download the FFmpeg source code (branch n4.3.1) into the same directory as emsdk:

git clone --depth 1 --branch n4.3.1 https://github.com/FFmpeg/FFmpeg

Use emconfigure to process configure files

Process the configure file with the following script:

export CFLAGS="-s USE_PTHREADS -O3"
export LDFLAGS="$CFLAGS -s INITIAL_MEMORY=33554432" # 33554432 bytes = 32 MB
CONFIG_ARGS=(
  --target-os=none         # set to none to remove OS-specific dependencies
  --arch=x86_32            # select the x86_32 architecture
  --enable-cross-compile   # enable cross-compilation
  --disable-x86asm         # disable x86 asm
  --disable-inline-asm     # disable inline asm
  --disable-stripping      # disable stripping
  --disable-doc            # disable documentation
  --extra-cflags="$CFLAGS"
  --extra-cxxflags="$CFLAGS"
  --extra-ldflags="$LDFLAGS"
  --nm="llvm-nm"           # use the LLVM nm tool
  --ar=emar
  --ranlib=emranlib
  --cc=emcc                # replace gcc with emcc
  --cxx=em++               # replace g++ with em++
  --objcc=emcc
  --dep-cc=emcc
)
emconfigure ./configure "${CONFIG_ARGS[@]}"

The above script does a few things:

USE_PTHREADS enables pthreads support
-O3 optimizes the code at compile time, which also shrinks the output, typically from 30 MB to 15 MB
INITIAL_MEMORY is set to 33554432 (32 MB), mainly because Emscripten can take up 19 MB, so we set a larger value to avoid running out of memory during compilation
emconfigure then runs the configure file, replacing the gcc compiler with emcc and applying the settings needed to work around build errors that may be encountered, and finally generates the configuration files for the build
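The memory values passed to INITIAL_MEMORY are plain byte counts, which are easy to misread. The sketch below converts them into MiB and into 64 KiB WebAssembly pages; describeMemory is a hypothetical helper name used only for this illustration.

```javascript
const MiB = 1024 * 1024;
const WASM_PAGE = 64 * 1024; // WebAssembly memory is sized in 64 KiB pages

// Hypothetical helper: express a byte count in MiB and in Wasm pages.
function describeMemory(bytes) {
  return { mib: bytes / MiB, pages: bytes / WASM_PAGE };
}

console.log(describeMemory(33554432));  // { mib: 32, pages: 512 }
console.log(describeMemory(268435456)); // { mib: 256, pages: 4096 }
```

The second value is the larger setting used later in this article's x264-enabled emcc script.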

Use emmake make to build dependencies

Now that you have the configuration files in place, you need to build the actual dependencies using emmake by running the following command from the command line:

# build the final ffmpeg.wasm file
emmake make -j4

Through the above compilation, the following four files are generated:

ffmpeg

ffmpeg_g

ffmpeg_g.wasm

ffmpeg_g.worker.js

The first two are JS files, the third is the Wasm module, and the fourth handles running the related logic in a web worker. Ideally the build should produce three files. To get that customized output, we need to run a custom emcc command.

Compile output using emcc

Create a wasm folder in the FFmpeg directory to hold the built files, and customize the build output as follows:

mkdir -p wasm/dist
# -s USE_SDL=2 uses SDL2; -s INVOKE_RUN=0 prevents main() from running at startup
emcc \
  -I. -I./fftools \
  -Llibavcodec -Llibavdevice -Llibavfilter -Llibavformat -Llibavresample -Llibavutil -Llibpostproc -Llibswscale -Llibswresample \
  -Qunused-arguments \
  -o wasm/dist/ffmpeg-core.js fftools/ffmpeg_opt.c fftools/ffmpeg_filter.c fftools/ffmpeg_hw.c fftools/cmdutils.c fftools/ffmpeg.c \
  -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm \
  -O3 \
  -s USE_SDL=2 \
  -s USE_PTHREADS=1 \
  -s PROXY_TO_PTHREAD=1 \
  -s INVOKE_RUN=0 \
  -s EXPORTED_FUNCTIONS="[_main, _proxy_main]" \
  -s EXTRA_EXPORTED_RUNTIME_METHODS="[FS, cwrap, setValue, writeAsciiToMemory]" \
  -s INITIAL_MEMORY=33554432

The above script has the following major improvements:

-s PROXY_TO_PTHREAD=1 runs main() in a pthread instead of the browser/UI main thread, so that the program stays responsive
-o wasm/dist/ffmpeg-core.js renames the output JS file from ffmpeg to ffmpeg-core.js, with ffmpeg-core.wasm and ffmpeg-core.worker.js output alongside it
-s EXPORTED_FUNCTIONS="[_main, _proxy_main]" exports the main function from the FFmpeg C file; proxy_main is the proxy for main created by PROXY_TO_PTHREAD, exported for external use
-s EXTRA_EXPORTED_RUNTIME_METHODS="[FS, cwrap, setValue, writeAsciiToMemory]" exports runtime helper functions used to wrap C functions, work with the file system, and manipulate pointers

The following three files are output through the above compilation command:

ffmpeg-core.js

ffmpeg-core.wasm

ffmpeg-core.worker.js

Use the compiled FFmpeg Wasm module

Create an ffmpeg.js file in the wasm directory and write the following code:

const Module = require('./dist/ffmpeg-core.js');

Module.onRuntimeInitialized = () => {
  const ffmpeg = Module.cwrap('proxy_main', 'number', ['number', 'number']);
};

Then run the above code with the following command:

node --experimental-wasm-threads --experimental-wasm-bulk-memory ffmpeg.js

The above code is explained as follows:

onRuntimeInitialized holds the logic to execute after the WebAssembly module has loaded. All of our related logic needs to go in this function
cwrap wraps the proxy_main function exported from the C file (fftools/ffmpeg.c). The function's signature is int main(int argc, char **argv). The int argc maps to a JavaScript number, and char **argv is a C pointer to an array of pointers to the actual arguments, which can also be mapped to number
To run ffmpeg -hide_banner from the command line, our code needs to call main(2, ["./ffmpeg", "-hide_banner"]). So how do we pass an array of strings? This question breaks down into two parts:
- We need to convert JavaScript strings into character arrays in C
- We need to convert an array in JavaScript to an array of Pointers in C

The first part is easy, because Emscripten provides a helper function, writeAsciiToMemory, to do this:

const str = "FFmpeg.wasm";
// Allocate one extra byte for the trailing 0 that marks the end of the string
const buf = Module._malloc(str.length + 1);
Module.writeAsciiToMemory(str, buf);

The second part is a bit harder. We need to create a C array of 32-bit pointers. We can use setValue to help us fill this array:

const ptrs = [123, 3455];
const buf = Module._malloc(ptrs.length * Uint32Array.BYTES_PER_ELEMENT);
ptrs.forEach((p, idx) => {
  Module.setValue(buf + (Uint32Array.BYTES_PER_ELEMENT * idx), p, 'i32');
});

Putting the above code together, we get a program that can interact with FFMPEG:

const Module = require('./dist/ffmpeg-core');

Module.onRuntimeInitialized = () => {
  const ffmpeg = Module.cwrap('proxy_main', 'number', ['number', 'number']);
  const args = ['ffmpeg', '-hide_banner'];
  const argsPtr = Module._malloc(args.length * Uint32Array.BYTES_PER_ELEMENT);
  args.forEach((s, idx) => {
    const buf = Module._malloc(s.length + 1);
    Module.writeAsciiToMemory(s, buf);
    Module.setValue(argsPtr + (Uint32Array.BYTES_PER_ELEMENT * idx), buf, 'i32');
  });
  ffmpeg(args.length, argsPtr);
};
Then run the program with the same command:

node --experimental-wasm-threads --experimental-wasm-bulk-memory ffmpeg.js

The results of the above run are as follows:

You can see that we compiled and ran ffmpeg 🎉 successfully.

Process the Emscripten file system

Emscripten has a built-in virtual file system to support standard file reads and writes in C, so we need to write media files into this file system before passing them to ffmpeg.wasm.

You can click here to see more about the file system APIs.

FS.writeFile() and FS.readFile() are the two functions in the FS module that accomplish this. All data read from or written to the file system must be of the Uint8Array type in JavaScript, so it is necessary to convert data to this type before consuming it.
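Because input data can arrive as an ArrayBuffer (from FileReader), a Node Buffer (from fs.readFileSync), or another typed array, a small normalizing helper can be handy. This is a sketch under that assumption; toUint8Array is a hypothetical name and is not provided by Emscripten.

```javascript
// Hypothetical helper: normalize common binary inputs to a Uint8Array.
function toUint8Array(input) {
  if (input instanceof ArrayBuffer) return new Uint8Array(input);
  if (ArrayBuffer.isView(input)) {
    // Covers Node's Buffer and other typed arrays; reuses the same bytes.
    return new Uint8Array(input.buffer, input.byteOffset, input.byteLength);
  }
  if (typeof input === 'string') return new TextEncoder().encode(input);
  throw new TypeError('Unsupported input type');
}

console.log(toUint8Array(new ArrayBuffer(4)).length); // 4
console.log(toUint8Array('hi').length);               // 2
```

The result could then be passed to a call such as Module.FS.writeFile(name, toUint8Array(data)).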

We'll read the video file named flame.avi with fs.readFileSync() and then write it to the Emscripten file system using FS.writeFile().

const fs = require('fs');
const Module = require('./dist/ffmpeg-core');

Module.onRuntimeInitialized = () => {
  const data = Uint8Array.from(fs.readFileSync('./flame.avi'));
  Module.FS.writeFile('flame.avi', data);

  const ffmpeg = Module.cwrap('proxy_main', 'number', ['number', 'number']);
  const args = ['ffmpeg', '-hide_banner'];
  const argsPtr = Module._malloc(args.length * Uint32Array.BYTES_PER_ELEMENT);
  args.forEach((s, idx) => {
    const buf = Module._malloc(s.length + 1);
    Module.writeAsciiToMemory(s, buf);
    Module.setValue(argsPtr + (Uint32Array.BYTES_PER_ELEMENT * idx), buf, 'i32');
  });
  ffmpeg(args.length, argsPtr);
};

Transcode the video using ffmpeg.wasm

Now that we can save the video file to the Emscripten file system, it's time to actually transcode the video using the compiled FFmpeg.

We modify the code as follows:

const fs = require('fs');
const Module = require('./dist/ffmpeg-core');

Module.onRuntimeInitialized = () => {
  const data = Uint8Array.from(fs.readFileSync('./flame.avi'));
  Module.FS.writeFile('flame.avi', data);

  const ffmpeg = Module.cwrap('proxy_main', 'number', ['number', 'number']);
  const args = ['ffmpeg', '-hide_banner', '-report', '-i', 'flame.avi', 'flame.mp4'];
  const argsPtr = Module._malloc(args.length * Uint32Array.BYTES_PER_ELEMENT);
  args.forEach((s, idx) => {
    const buf = Module._malloc(s.length + 1);
    Module.writeAsciiToMemory(s, buf);
    Module.setValue(argsPtr + (Uint32Array.BYTES_PER_ELEMENT * idx), buf, 'i32');
  });
  ffmpeg(args.length, argsPtr);

  // Poll the Emscripten file system until the log reports that decoding finished
  const timer = setInterval(() => {
    const logFileName = Module.FS.readdir('.').find(name => name.endsWith('.log'));
    if (typeof logFileName !== 'undefined') {
      const log = String.fromCharCode.apply(null, Module.FS.readFile(logFileName));
      if (log.includes("frames successfully decoded")) {
        clearInterval(timer);
        const output = Module.FS.readFile('flame.mp4');
        fs.writeFileSync('flame.mp4', output);
      }
    }
  }, 500);
};

In the above code, we added a timer. Because FFmpeg transcodes the video asynchronously, we need to poll the Emscripten file system for the log file and check whether it contains the marker indicating that transcoding has finished. Once the log file exists and contains that marker, we use Module.FS.readFile() to read the transcoded video from the Emscripten file system, and then write it to the local file system via fs.writeFileSync(). In the end we receive the following result:
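The completion check inside the timer can be isolated into a small function. The sketch below uses TextDecoder instead of String.fromCharCode.apply, since apply can throw a RangeError when a log grows to many thousands of bytes. decodeLog and isTranscodeDone are hypothetical helper names, and the sample log line is made up for the demonstration.

```javascript
const DONE_MARKER = 'frames successfully decoded'; // marker used in the code above

// Hypothetical helpers: decode log bytes and look for the completion marker.
function decodeLog(bytes) {
  return new TextDecoder('utf-8').decode(bytes);
}

function isTranscodeDone(logBytes) {
  return decodeLog(logBytes).includes(DONE_MARKER);
}

// Made-up sample line, encoded to bytes the way FS.readFile would return them.
const sample = new TextEncoder().encode('120 frames successfully decoded, 0 decoding errors\n');
console.log(isTranscodeDone(sample)); // true
```

Inside the timer callback, the condition could then read isTranscodeDone(Module.FS.readFile(logFileName)).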

Use FFMPEG to transcode the video in your browser and play it

In the previous step, we successfully transcoded AVI to MP4 using the compiled FFmpeg on the Node side. Next, we will transcode the video using FFmpeg in the browser and play it there.

The FFmpeg we compiled can transcode AVI files to MP4, but MP4 files transcoded with the default encoder cannot be played directly in the browser, because the browser does not support that encoding. So we need to use the libx264 encoder to encode MP4 files in a format the browser can play.

Download the x264 encoder source code:

curl -OL https://download.videolan.org/pub/videolan/x264/snapshots/x264-snapshot-20170226-2245-stable.tar.bz2
tar xvfj x264-snapshot-20170226-2245-stable.tar.bz2

Go to the x264 folder and create a build-x264.sh file and add the following contents:

#!/bin/bash -x

ROOT=$PWD
BUILD_DIR=$ROOT/build

cd $ROOT/x264-snapshot-20170226-2245-stable
ARGS=(
  --prefix=$BUILD_DIR
  --host=i686-gnu                     # use i686 gnu
  --enable-static                     # enable building static library
  --disable-cli                       # disable cli tools
  --disable-asm                       # disable asm optimization
  --extra-cflags="-s USE_PTHREADS=1"  # pass this flag to use pthreads
)
emconfigure ./configure "${ARGS[@]}"

emmake make install-lib-static -j4

cd -

Note that you need to run the following command in the WebAssembly directory to build x264:

bash x264-snapshot-20170226-2245-stable/build-x264.sh

With the x264 encoder built, we can enable x264 in the FFmpeg build script. This time we create a Bash script in the FFmpeg folder to do the build. Create build.sh as follows:

#!/bin/bash -x

emcc -v

ROOT=$PWD
BUILD_DIR=$ROOT/build

cd $ROOT/FFmpeg

CFLAGS="-s USE_PTHREADS -I$BUILD_DIR/include"
LDFLAGS="$CFLAGS -L$BUILD_DIR/lib -s INITIAL_MEMORY=33554432" # 33554432 bytes = 32 MB

CONFIG_ARGS=(
  --target-os=none        # use none to prevent any os specific configurations
  --arch=x86_32           # use x86_32 to achieve minimal architectural optimization
  --enable-cross-compile  # enable cross compile
  --disable-x86asm        # disable x86 asm
  --disable-inline-asm    # disable inline asm
  --disable-stripping
  --disable-programs      # disable programs build (incl. ffplay, ffprobe & ffmpeg)
  --disable-doc           # disable doc
  --enable-gpl            # required by x264
  --enable-libx264        # enable x264
  --extra-cflags="$CFLAGS"
  --extra-cxxflags="$CFLAGS"
  --extra-ldflags="$LDFLAGS"
  --nm="llvm-nm"
  --ar=emar
  --ranlib=emranlib
  --cc=emcc
  --cxx=em++
  --objcc=emcc
  --dep-cc=emcc
)
emconfigure ./configure "${CONFIG_ARGS[@]}"

# build ffmpeg.wasm
emmake make -j4

cd -

For the above compilation script, run the following command in the WebAssembly directory to process the configuration file and compile the file:

bash FFmpeg/build.sh

Then create the build-with-emcc.sh script to customize the build output:

ROOT=$PWD
BUILD_DIR=$ROOT/build

cd FFmpeg
ARGS=(
  -I. -I./fftools -I$BUILD_DIR/include
  -Llibavcodec -Llibavdevice -Llibavfilter -Llibavformat -Llibavresample -Llibavutil -Llibpostproc -Llibswscale -Llibswresample -L$BUILD_DIR/lib
  -Qunused-arguments
  # add -lpostproc and -lx264 to the line below
  -o wasm/dist/ffmpeg-core.js fftools/ffmpeg_opt.c fftools/ffmpeg_filter.c fftools/ffmpeg_hw.c fftools/cmdutils.c fftools/ffmpeg.c
  -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lpostproc -lm -lx264 -pthread
  -O3                                           # optimize code with performance first
  -s USE_SDL=2                                  # use SDL2
  -s USE_PTHREADS=1                             # enable pthreads support
  -s PROXY_TO_PTHREAD=1                         # detach main() from browser/UI main thread
  -s INVOKE_RUN=0                               # do not run main() at startup
  -s EXPORTED_FUNCTIONS="[_main, _proxy_main]"  # export main and proxy_main funcs
  -s EXTRA_EXPORTED_RUNTIME_METHODS="[FS, cwrap, setValue, writeAsciiToMemory]"  # export preamble funcs
  -s INITIAL_MEMORY=268435456                   # 268435456 bytes = 256 MB
)
emcc "${ARGS[@]}"

cd -

Then run the script to receive the object file compiled in the previous step and compile it into WASM and JS glue code:

bash FFmpeg/build-with-emcc.sh

Using FFmpeg transcoding in practice

We will create a Web page and then provide a button to upload the video file and play the uploaded video file. Although it is not possible to play avi format video files directly on the Web, we can use FFMPEG transcoding to play them.

Create the index.html file in the wasm folder in the FFmpeg directory and add the following:

<html>
  <head>
    <style>
      html, body { margin: 0; width: 100%; height: 100% }
      body { display: flex; flex-direction: column; align-items: center; }
    </style>
  </head>
  <body>
    <h3> </h3>
    <video id="output-video" controls></video><br/>
    <input type="file" id="uploader">
    <p id="message">Loading the ffmpeg script takes 5 seconds</p>
    <script type="text/javascript">
      const readFromBlobOrFile = (blob) => (
        new Promise((resolve, reject) => {
          const fileReader = new FileReader();
          fileReader.onload = () => {
            resolve(fileReader.result);
          };
          fileReader.onerror = ({ target: { error: { code } } }) => {
            reject(Error(`File could not be read! Code=${code}`));
          };
          fileReader.readAsArrayBuffer(blob);
        })
      );

      const message = document.getElementById('message');
      const transcode = async ({ target: { files } }) => {
        const { name } = files[0];
        message.innerHTML = 'Writing the file to the Emscripten file system';
        const data = await readFromBlobOrFile(files[0]);
        Module.FS.writeFile(name, new Uint8Array(data));

        const ffmpeg = Module.cwrap('proxy_main', 'number', ['number', 'number']);
        const args = ['ffmpeg', '-hide_banner', '-nostdin', '-report', '-i', name, 'out.mp4'];
        const argsPtr = Module._malloc(args.length * Uint32Array.BYTES_PER_ELEMENT);
        args.forEach((s, idx) => {
          const buf = Module._malloc(s.length + 1);
          Module.writeAsciiToMemory(s, buf);
          Module.setValue(argsPtr + (Uint32Array.BYTES_PER_ELEMENT * idx), buf, 'i32');
        });

        message.innerHTML = 'Start transcoding';
        ffmpeg(args.length, argsPtr);

        const timer = setInterval(() => {
          const logFileName = Module.FS.readdir('.').find(name => name.endsWith('.log'));
          if (typeof logFileName !== 'undefined') {
            const log = String.fromCharCode.apply(null, Module.FS.readFile(logFileName));
            if (log.includes("frames successfully decoded")) {
              clearInterval(timer);
              message.innerHTML = 'Transcoding complete';
              const out = Module.FS.readFile('out.mp4');
              const video = document.getElementById('output-video');
              video.src = URL.createObjectURL(new Blob([out.buffer], { type: 'video/mp4' }));
            }
          }
        }, 500);
      };
      document.getElementById('uploader').addEventListener('change', transcode);
    </script>
    <script type="text/javascript" src="./dist/ffmpeg-core.js"></script>
  </body>
</html>

Open the above web page to run.

How do I debug WebAssembly code?

The original way to debug WebAssembly

Chrome Developer Tools already supports WebAssembly debugging. Although there are some limitations, you can step through the WebAssembly text format one instruction at a time and view raw stack traces, as shown in the following figure:

This approach works well for WebAssembly modules that have no other dependencies, because the debugging scope is small. However, for complex applications, such as those written in C/C++ where one module depends on many others and the source code looks very different from the compiled WebAssembly text format, this debugging method is not intuitive: you can only understand the code by guessing, and complex assembly code is hard for most people to read.

More intuitive debugging method

Modern JavaScript projects usually have a compilation step during development: code is written in ES6 and compiled to ES5 or lower to run. Debugging such code involves the concept of a source map. A source map maps positions in the compiled code back to the corresponding positions in the source code, making client-side code more readable and easier to debug without a significant impact on performance.
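As background on what a JavaScript source map actually stores: its mappings field encodes position offsets as Base64 VLQ numbers. The decoder below is a minimal sketch of that encoding for illustration only. decodeVlq is a hypothetical name, and real tooling would normally use an existing source map library rather than hand-written code.

```javascript
const BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decode one Base64 VLQ segment into a list of signed integers.
function decodeVlq(segment) {
  const values = [];
  let value = 0;
  let shift = 0;
  for (const ch of segment) {
    const digit = BASE64.indexOf(ch);
    if (digit < 0) throw new Error(`Invalid Base64 character: ${ch}`);
    value += (digit & 31) << shift; // low 5 bits carry data
    if (digit & 32) {               // continuation bit: more digits follow
      shift += 5;
    } else {
      const negative = value & 1;   // lowest bit of the result is the sign
      value >>= 1;
      values.push(negative ? -value : value);
      value = 0;
      shift = 0;
    }
  }
  return values;
}

console.log(decodeVlq('AACA')); // [ 0, 0, 1, 0 ]
console.log(decodeVlq('gB'));   // [ 16 ]
```

In a mappings segment, the four numbers are the generated column, the source file index, the source line, and the source column, each relative to the previous segment.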

Emscripten, the compiler from C/C++ to WebAssembly, can inject debugging information into the code at compile time and generate the corresponding source mapping information. You can then debug C/C++ code in Chrome DevTools by installing the C/C++ DevTools Support browser extension written by the Chrome team.

The idea is that, at compile time, Emscripten generates debug information in DWARF, a common debug format used by most compilers. C/C++ DevTools Support parses the DWARF information and provides source mapping information to Chrome DevTools, enabling developers to debug C/C++ code in Chrome DevTools version 89 and later.

Debug simple C applications

Because DWARF debug information makes it possible to resolve variable names, pretty-print typed values, evaluate expressions in the source language, and so on, let's actually write a simple C program, compile it to WebAssembly, run it in a browser, and see what debugging looks like.

First let's go to the WebAssembly directory we created earlier, activate the emcc-related commands, and check that activation worked:

cd emsdk && source emsdk_env.sh
emcc --version  # emcc (Emscripten gcc/clang-like replacement) 1.39.18 (a3beeb0d6c9825bd1757d03677e817d819949a77)

Next create a temp folder in WebAssembly, then create a temp.c file, fill it with the following content and save it:

#include <stdlib.h>

void assert_less(int x, int y) {
  if (x >= y) {
    abort();
  }
}

int main() {
  assert_less(10, 20);
  assert_less(30, 20);
}

The above code throws an exception and terminates the program if x >= y when assert_less executes.

Switch the terminal to the temp directory and run the emcc command to compile:

emcc -g temp.c -o temp.html

The above command adds the -g argument to the normal compilation form, telling Emscripten to inject DWARF debugging information into the code at compile time.
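DWARF data in a Wasm binary lives in custom sections with names such as .debug_info, so the standard WebAssembly.Module.customSections API can tell whether a module carries it. The sketch below builds two tiny hand-made modules in memory to demonstrate the check. wasmWithCustomSection and hasDwarf are hypothetical helper names, and the one-byte payloads are placeholders, not real DWARF data.

```javascript
// Build a tiny Wasm binary that contains only one custom section.
function wasmWithCustomSection(name, payload) {
  const nameBytes = new TextEncoder().encode(name);
  // Section body: name length byte + name + payload
  const sectionSize = 1 + nameBytes.length + payload.length;
  return Uint8Array.from([
    0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
    0x01, 0x00, 0x00, 0x00, // version 1
    0x00,                   // section id 0 = custom section
    sectionSize,            // fits in one LEB128 byte while below 128
    nameBytes.length,
    ...nameBytes,
    ...payload,
  ]);
}

// Hypothetical helper: true if the module has a .debug_info custom section.
function hasDwarf(module) {
  return WebAssembly.Module.customSections(module, '.debug_info').length > 0;
}

const withDebug = new WebAssembly.Module(wasmWithCustomSection('.debug_info', [0x00]));
const withoutDebug = new WebAssembly.Module(wasmWithCustomSection('notes', [0x00]));
console.log(hasDwarf(withDebug), hasDwarf(withoutDebug)); // true false
```

For a real build, the same hasDwarf check could be run on a module created from the bytes of temp.wasm, once with and once without the -g flag.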

Now you can start an HTTP server, for example with npx serve ., then go to localhost:5000/temp.html to see it run.

Make sure that you have installed the Chrome extension (chrome.google.com/webstore/de…) and that Chrome DevTools is updated to version 89 or later.

To see the effects of debugging, you need to set up a few things.

Open the WebAssembly debug option in Chrome Devtools

When you’re done, a Reload blue button appears at the top of the toolbar to Reload the configuration. Just click on it.

Set debugging options and pause where exceptions are encountered

Refresh the browser, and you will find that execution has paused in temp.js, the JS glue code generated by Emscripten. Go down the call stack and you can see temp.c and locate where the exception was thrown:

Chrome DevTools shows the C code paused at abort(), and the values in the current scope can be inspected just as when debugging JS:

You can view the values of x and y as shown above. You can also display a value by hovering the mouse over x.

View complex type values

In fact, Chrome DevTools can show not only values of simple types in the original C/C++ code, such as numbers and strings, but also more complex ones, such as structs, arrays, and classes. Let's take another example to show this.

To demonstrate this, we use an example that draws the Mandelbrot set in C++. Again, create the mandelbrot folder in the WebAssembly directory, then add the mandelbrot.cc file and fill it with the following content:

#include <SDL2/SDL.h>
#include <complex>

int main() {
  // Initialize SDL.
  int width = 600, height = 600;
  SDL_Init(SDL_INIT_VIDEO);
  SDL_Window* window;
  SDL_Renderer* renderer;
  SDL_CreateWindowAndRenderer(width, height, SDL_WINDOW_OPENGL, &window,
                              &renderer);

  // Generate a palette with random colors.
  enum { MAX_ITER_COUNT = 256 };
  SDL_Color palette[MAX_ITER_COUNT];
  srand(time(0));
  for (int i = 0; i < MAX_ITER_COUNT; ++i) {
    palette[i] = {
        .r = (uint8_t)rand(),
        .g = (uint8_t)rand(),
        .b = (uint8_t)rand(),
        .a = 255,
    };
  }

  // Calculate and draw the Mandelbrot set.
  std::complex<double> center(0.5, 0.5);
  double scale = 4.0;
  for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
      std::complex<double> point((double)x / width, (double)y / height);
      std::complex<double> c = (point - center) * scale;
      std::complex<double> z(0, 0);
      int i = 0;
      for (; i < MAX_ITER_COUNT - 1; i++) {
        z = z * z + c;
        if (abs(z) > 2.0)
          break;
      }
      SDL_Color color = palette[i];
      SDL_SetRenderDrawColor(renderer, color.r, color.g, color.b, color.a);
      SDL_RenderDrawPoint(renderer, x, y);
    }
  }

  // Render everything we've drawn to the canvas.
  SDL_RenderPresent(renderer);

  // SDL_Quit();
}

The above code is only about 50 lines, but it uses two external APIs: the SDL library for graphics and complex numbers from the C++ standard library, which makes our code a bit more complicated. Let's compile the code to see how Chrome DevTools debugs it.
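The inner loop is the part worth watching in the debugger. Its logic can be restated in a few lines of JavaScript, shown here purely as an illustration of what the C++ computes for one point. escapeIndex is a hypothetical name, and complex numbers are split into real and imaginary parts by hand.

```javascript
const MAX_ITER_COUNT = 256;

// Mirrors the inner loop of the C++ code: iterate z = z*z + c until |z| > 2.
function escapeIndex(cRe, cIm) {
  let zRe = 0;
  let zIm = 0;
  let i = 0;
  for (; i < MAX_ITER_COUNT - 1; i++) {
    const nextRe = zRe * zRe - zIm * zIm + cRe;
    const nextIm = 2 * zRe * zIm + cIm;
    zRe = nextRe;
    zIm = nextIm;
    if (Math.hypot(zRe, zIm) > 2.0) break;
  }
  return i; // used as the index into the palette
}

console.log(escapeIndex(0, 0)); // 255, never escapes
console.log(escapeIndex(3, 0)); // 0, escapes on the first iteration
```

The returned value corresponds to the variable i that selects palette[i] in the C++ code.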

Tell the Emscripten compiler to include debugging information by passing the -g flag at compile time, and ask Emscripten to link in the SDL2 library and allow arbitrarily sized memory at run time:

emcc -g mandelbrot.cc -o mandelbrot.html \
     -s USE_SDL=2 \
     -s ALLOW_MEMORY_GROWTH=1

Use the same npx serve . command to start a local web server, then go to http://localhost:5000/mandelbrot.html to see the following:

Open the developer tools and search for the mandelbrot.cc file. You can see the following:

We can set a breakpoint on the palette assignment statement inside the first for loop and then refresh the page. Execution pauses at our breakpoint, and the Scope pane on the right shows something interesting.

Using the Scope panel

We can look at complex types such as center and palette, and expand them to see the specific values inside:

View it directly in the program

At the same time, move the mouse over a variable such as palette to see the type of the value:

Use in the console

In the console, you can also get values by entering variable names, and still view complex types:

Expressions involving complex types can also be evaluated:

Use the Watch function

We can also use the Watch feature in the debug panel: add i from the for loop to the watch list, then resume execution to see how i changes:

More complex step debugging

We can also use the other stepping tools: step over, step in, step out, step, and so on. For example, using step over, we execute two steps forward:

You can see the variable value for the current step, as well as the corresponding value in the Scope pane.

Debug against third-party libraries that are not source compiled

Before, we only compiled the mandelbrot.cc file, and asked Emscripten to provide us with the built-in SDL-related library during compilation. Since the SDL library was not compiled from source, we did not bring debugging information. So we can only debug by looking at C++ code in mandelbrot.cc, and we can only debug by looking at webassembly-related code for sdl-related content.

For example, we break at line 41, the SDL_SetRenderDrawColor call, and use step in to enter the function:

It will be in the following form:

We are back to the raw WebAssembly debugging view. This is unavoidable: during development we may use a variety of third-party libraries, and we cannot guarantee that each one is compiled from source with DWARF debugging information. For the most part we have no control over how third-party libraries are built, and sometimes we also run into problems in production, where no debugging information is available.

There is no easy way around this, but the developer tools have improved the raw debugging experience: all of the code is now shown in a single WebAssembly file, in this case mandelbrot.wasm, so we no longer have to guess which source entry a piece of code belongs to.

New naming generation strategy

In the previous debug panel, WebAssembly items were identified only by numeric indexes, and functions had no names. Without the necessary type information it was hard to track down a specific value: pointers were presented as integers, and you could not tell what was stored behind those integers.

The new naming scheme follows the conventions of other disassembly tools. It uses the contents of the WebAssembly name section and the import/export paths. As you can see, the debugging panel now displays function names:

Even when no naming information is available, a name like $func123 is generated from the item’s type and index, which greatly improves stack traces and disassembly.
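To make the fallback naming concrete, here is a minimal C++ sketch of how a display name could be chosen. It is not DevTools code: the FunctionInfo struct, the displayName function, and the priority order (name section, then export name, then index) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical description of one function in a module; not a DevTools type.
struct FunctionInfo {
    std::size_t index;                       // position in the function index space
    std::optional<std::string> nameSection;  // entry from the name section, if any
    std::optional<std::string> exportName;   // export (or import) name, if any
};

// Pick a display name: name section first, then export name, then $func<index>.
// The priority order is an assumption made for this sketch.
std::string displayName(const FunctionInfo& fn) {
    if (fn.nameSection) return "$" + *fn.nameSection;
    if (fn.exportName) return "$" + *fn.exportName;
    return "$func" + std::to_string(fn.index);
}
```

A function with index 123 and no other naming information would be shown as $func123, which matches the form described above.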

Viewing the Memory Panel

If you want to debug the program’s memory usage at this point, you can expand Module.memories.$env.memory in the Scope pane while paused in WebAssembly code, but this only shows a few individual bytes. There is no way to tell what those bytes represent in another format, such as ASCII. Chrome DevTools also provides a more powerful way to view memory. Right-click env.memory and select Reveal in Memory Inspector panel:

Or click the small icon next to env.memory:

This opens the memory panel:

From the memory panel, you can view WebAssembly memory in hexadecimal or ASCII format, navigate to specific memory addresses, and interpret the selected data in various formats, for example the ASCII character e for hexadecimal 65.
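The interpretations offered by the memory panel can be reproduced by hand. The following C++ sketch shows one way to render a byte as hex or ASCII and to read a 32-bit integer from raw bytes; the function names toHex, toAscii, and readU32LE are hypothetical and are not part of any DevTools or Emscripten API.

```cpp
#include <cstdint>
#include <string>

// Format one byte as two lowercase hex digits, e.g. 0x65 -> "65".
std::string toHex(std::uint8_t byte) {
    static const char digits[] = "0123456789abcdef";
    return std::string{digits[byte >> 4], digits[byte & 0x0f]};
}

// Show a byte as ASCII, using '.' for non-printable values.
char toAscii(std::uint8_t byte) {
    return (byte >= 0x20 && byte <= 0x7e) ? static_cast<char>(byte) : '.';
}

// Read a 32-bit unsigned integer stored in little-endian order,
// the byte order WebAssembly linear memory uses.
std::uint32_t readU32LE(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}
```

For instance, toAscii(0x65) yields the character e, and the bytes 78 56 34 12 read as the integer 0x12345678 because the least significant byte comes first.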

Perform a performance analysis of WebAssembly code

Because we inject a lot of debugging information at compile time, the code we run is unoptimized and verbose, and it runs very slowly. If you want to evaluate the performance of your program, you therefore cannot rely on APIs such as performance.now or console.time, because the numbers they report in this setup usually do not reflect real-world behavior.

If you need to analyze the performance of your code, use the Performance panel provided by the developer tools instead. It runs the code at full speed and gives a clear breakdown of how long the different functions take to execute:

As you can see, typical timings such as the 161 ms point, or the LCP and FCP at 461 ms, are performance indicators that reflect real-world behavior.

Alternatively, keep the developer tools closed while the page loads, so that no debugging information is processed and the run stays realistic. Once the page has loaded, open the console to see the metrics.

Debug on different machines

When building in Docker, in a virtual machine, or on a remote build server, the source file paths used during the build may differ from the paths on your local file system. In that case the developer tools list the files in the Sources panel at runtime but cannot load their contents.

To solve this problem, set up a path mapping in the options of the C/C++ DevTools Support extension installed earlier. Click the extension’s “Options”:

Then add a path mapping. In old/path, enter the path the source files had when they were built, and in new/path, enter the path where the files now exist on the local file system:

This mapping works much like the equivalent features in C++ debuggers, such as GDB’s set substitute-path and LLDB’s target.source-map. When the developer tools look for a source file, they consult the configured path mappings: if the file cannot be loaded from the original source path, they try the mapped path, and if that also fails, loading fails.
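The lookup described above can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the extension’s actual code: PathMapping, resolveSource, and the canLoad callback are hypothetical names, and canLoad stands in for whatever file access the real tool performs.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// One mapping entry: a build-time prefix and its local replacement.
struct PathMapping {
    std::string oldPrefix;
    std::string newPrefix;
};

// Return the first path that can be loaded: the original path, then the
// path rewritten by each matching mapping. `canLoad` is a hypothetical
// callback standing in for the real file access.
std::optional<std::string> resolveSource(
    const std::string& path,
    const std::vector<PathMapping>& mappings,
    const std::function<bool(const std::string&)>& canLoad) {
    if (canLoad(path)) return path;
    for (const auto& m : mappings) {
        // Only rewrite paths that start with the old prefix.
        if (path.compare(0, m.oldPrefix.size(), m.oldPrefix) == 0) {
            std::string candidate = m.newPrefix + path.substr(m.oldPrefix.size());
            if (canLoad(candidate)) return candidate;
        }
    }
    return std::nullopt;  // neither the original nor any mapped path loads
}
```

Passing the file check in as a callback keeps the sketch independent of any particular file system, so the substitution logic can be exercised with made-up paths.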

Debug optimized builds

Debugging code that was optimized at build time can be a less than ideal experience: in an optimized build functions are inlined, and the compiler may reorder code or remove dead code, all of which can confuse the debugger.

At present, the developer tools support most of the debugging experience for optimized code, with the exception of function inlining. To reduce the impact of the missing inlining support, it is recommended to pass the -fno-inline flag when compiling, which disables function inlining in optimized builds (those built with an -O flag such as -O3). The developer tools are expected to address this in the future. The script for compiling the simple C program mentioned earlier then looks like this:

emcc -g temp.c -o temp.html \
     -O3 -fno-inline
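To see why inlining matters when stepping, consider the small C++ sketch below. It is a hypothetical example, not code from the article’s projects: square and sumOfSquares are made-up names chosen to show a helper that an optimizer would typically fold into its caller.

```cpp
#include <cstdint>

// A small helper that an optimizing build would normally inline into its caller.
static std::int64_t square(std::int64_t x) {
    return x * x;
}

// With -O3 alone, square() may disappear as a separate frame, so "step in"
// has nothing to enter; with -fno-inline it remains a real call.
std::int64_t sumOfSquares(std::int64_t n) {
    std::int64_t total = 0;
    for (std::int64_t i = 1; i <= n; ++i) {
        total += square(i);  // a breakpoint here can step into square()
    }
    return total;
}
```

Compiling such a file with both -O3 and -fno-inline should keep square as its own function in the generated module, at the cost of a slightly slower call in the loop.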

Store debugging information separately

Debugging information contains details of the code, defined types, variables, functions, function scopes, and file locations, anything that is useful to the debugger, so it is often larger than the source code.

To speed up compilation and loading of WebAssembly modules, you can split debugging information into separate WebAssembly files at compile time and load them separately. To split separate files, you can add -gseparate-dwarf at compile time:

emcc -g temp.c -o temp.html \
     -gseparate-dwarf=temp.debug.wasm

After doing this, the compiled main application code only stores the file name of temp.debug.wasm, and when the code loads, the plug-in locates the debug file and loads it into the developer tools.

Suppose we want an optimized build with the debugging information split out, and we want to load a local copy of the debug file later when we need to debug. In that case we have to override the stored location of the debug file so that the extension can find it, which the following command does:

emcc -g temp.c -o temp.html \
     -O3 -fno-inline \
     -gseparate-dwarf=temp.debug.wasm \
     -s SEPARATE_DWARF_URL=file://[path of temp.debug.wasm on the local file system]

Debug FFmpeg code in a browser

So far we have taken an in-depth look at how to debug C/C++ code built with Emscripten in a browser. We covered a simple example with no dependencies and an example that relies on the C++ standard library and SDL, and we looked at what the debugging tools can do and where their limits are. Let’s use what we’ve learned to debug FFmpeg code in a browser.

Build with debug information

We only need to modify the build-with-emcc.sh script from the previous article and add the -g flag:

ROOT=$PWD
BUILD_DIR=$ROOT/build

cd ffmpeg-4.3.2-3
ARGS=(
  -g  # include debugging information
  -I. -I./fftools -I$BUILD_DIR/include
  -Llibavcodec -Llibavdevice -Llibavfilter -Llibavformat -Llibavresample -Llibavutil -Llibpostproc -Llibswscale -Llibswresample -L$BUILD_DIR/lib
  -Qunused-arguments
  -o wasm/dist/ffmpeg-core.js fftools/ffmpeg_opt.c fftools/ffmpeg_filter.c fftools/ffmpeg_hw.c fftools/cmdutils.c fftools/ffmpeg.c
  -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lpostproc -lm -lx264 -pthread
  -O3                                           # Optimize code with performance first
  -s USE_SDL=2                                  # use SDL2
  -s USE_PTHREADS=1                             # enable pthreads support
  -s PROXY_TO_PTHREAD=1                         # detach main() from browser/UI main thread
  -s INVOKE_RUN=0                               # not to run the main() in the beginning
  -s EXPORTED_FUNCTIONS="[_main, _proxy_main]"  # export main and proxy_main funcs
  -s EXTRA_EXPORTED_RUNTIME_METHODS="[FS, cwrap, setValue, writeAsciiToMemory]"  # export preamble funcs
  -s INITIAL_MEMORY=268435456                   # 268435456 bytes = 256 MB
)
emcc "${ARGS[@]}"

cd -

Then carry out the remaining steps as before, start the server with node server.js, and open http://localhost:8080/ to see the result:

In the Sources panel we can search for the ffmpeg.c file that was built, and set a breakpoint at line 4865, in the nb_output loop:

Then upload a video in AVI format to the web page and pause at the breakpoint:

As you can see, we can still view variable values by mouse-over in the program, in the Scope panel on the right, and in the console as before.

Similarly, we can use step over, step in, step out, step, and the other stepping operations, watch the value of a variable, or inspect memory at that point.

As you can see from this article, you can debug C/C++ projects of any size in a browser and use most of the features currently provided by developer tools.

On the future of WebAssembly

This article covers only some of the main application scenarios for WebAssembly today. Its high performance, light weight, and cross-platform nature let us run languages such as C/C++ on the Web and run desktop applications inside the Web container.

What this article did not cover is WASI, a standardized system interface that allows WebAssembly to run on any system. As WebAssembly performance improves, WASI can provide a practical way to run any code on any platform, much as Docker does, but without being tied to an operating system. As one of the founders of Docker put it:

“If WASM+WASI had been around in 2008, there would have been no need to create Docker. WASM on servers is the future of computing, the standardized system interface we’ve all been waiting for.”

Another interesting thing is that WASM client development frameworks like Yew may become as popular in the future as React/Vue/Angular.

WASM’s package management tool, WAPM, may become the preferred way to share packages between frameworks in different languages thanks to the cross-platform nature of WASM.

At the same time, WebAssembly is standardized through the W3C and is sponsored and jointly maintained by major vendors, including Microsoft, Google, and Mozilla, so it is likely to have a very promising future.

Q & A

Answering questions…

How do I compile complex CMake projects into WebAssembly?
How do you explore a common set of best practices when compiling complex CMake projects into WebAssembly?
How to Debug with CMake projects?

Question:

The volume of compiled code

❤️ Thank you

That is all for this article. I hope it helps you.

We are from the front-end department of ByteDance, responsible for the front-end development of all ByteDance education products.
