File system - C/C++ programming for WASM open source book

Cross-platform C/C++ programs often use libc/libcxx synchronous file access functions such as fopen(), fread(), and fwrite(). When it comes to file systems, there are significant differences between common JavaScript programs and C/C++ native programs.

JavaScript programs running in a browser cannot access the local file system directly;
In JavaScript, fetching remote data, whether through AJAX or fetch(), is an asynchronous operation.

Emscripten provides a set of virtual file systems that are compatible with the libc/libcxx synchronous file access functions.

3.3.1 Emscripten Virtual file system architecture

The Emscripten virtual file system architecture is shown below:

Tips: The asynchronous file system API is a set of functions declared in emscripten.h. It can only be used in the Emscripten environment and does not conform to the "compile-target-insensitive" philosophy of this book.

At the lowest level, Emscripten provides three file systems:

MEMFS: Memory file system. Its data is stored entirely in memory, so anything the program writes is lost when the page is refreshed or the program is reloaded.
NODEFS: Node.js file system. It can access the local file system and store data persistently, but works only in the Node.js environment.
IDBFS: IndexedDB file system. It is built on the browser's IndexedDB and can store data persistently, but works only in a browser environment.

The Emscripten synchronous file system API wraps the above three file systems in the JavaScript object FS, which libc/libcxx file access functions such as fopen(), fread(), and fwrite() call into.

In terms of call syntax, the C/C++ code is the same as when generating native code, but be aware that each underlying file system has its own characteristics, which lead to differences in business logic. The Emscripten virtual file system covers enough material to fill a book of its own. Due to space limitations, this section briefly introduces the MEMFS-based packaged file system, while NODEFS and IDBFS are covered only by simple examples without further elaboration.
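Because the call syntax matches native code, a write/read round trip can be written once against the standard library and compiled for either target. The following is a minimal sketch under that assumption; the helper names write_text() and read_text() are hypothetical and belong neither to the book nor to Emscripten.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `text` to `path`, replacing existing content.
   Returns 0 on success, -1 on failure. */
int write_text(const char* path, const char* text) {
    assert(path != NULL && text != NULL);
    FILE* fp = fopen(path, "w");
    if (fp == NULL) return -1;
    int ok = (fputs(text, fp) != EOF);
    if (fclose(fp) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: read at most size-1 bytes from `path` into `buf`
   and NUL-terminate the result.
   Returns the number of bytes read, or -1 if the file cannot be opened. */
long read_text(const char* path, char* buf, size_t size) {
    assert(path != NULL && buf != NULL && size > 0);
    FILE* fp = fopen(path, "r");
    if (fp == NULL) return -1;
    size_t n = fread(buf, 1, size - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return (long)n;
}
```

The same two functions should behave identically in a native build and on top of MEMFS; the difference is only where the bytes end up, since under MEMFS the file disappears when the page is reloaded.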

3.3.2 `MEMFS`/Packaged file system

Files need to be packaged before they can be imported into MEMFS. Packaging can be done on the emcc command line or with the separate file packaging tool, file_packager.py.

There are two packaging modes: embed and preload. In embed mode, file data is converted to JavaScript code. In preload mode, in addition to the .js file, a .data file with the same name is generated, which contains the binary data of all the files; the generated .js file contains the glue code for downloading and loading the .data package.

Tips: Embed mode requires a textual encoding of the data, which produces larger packages than preload mode, so use preload unless the total size of the files to be packed is very small.
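To get a feel for why a textual encoding is larger than raw binary, the sketch below counts the characters needed to write a byte buffer as a decimal array literal such as [72,105,10]. This encoding is an assumption chosen purely for illustration; what emcc actually emits in embed mode depends on its version, and decimal_literal_size() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of characters needed to write `n` bytes as
   a decimal array literal like "[72,105,10]". The encoding is an
   illustrative assumption, not necessarily what emcc generates. */
size_t decimal_literal_size(const unsigned char* data, size_t n) {
    assert(data != NULL || n == 0);
    size_t chars = 2;                       /* '[' and ']' */
    for (size_t i = 0; i < n; i++) {
        /* one to three decimal digits per byte */
        chars += data[i] >= 100 ? 3 : data[i] >= 10 ? 2 : 1;
        if (i + 1 < n) chars += 1;          /* separating comma */
    }
    return chars;
}
```

For the three bytes of "Hi\n" this gives 11 characters against 3 bytes in a .data file, which is the kind of overhead the tip above warns about.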

When using the emcc command, the --preload-file argument packs a specified file or folder in preload mode, while the --embed-file argument packs a specified file or folder in embed mode.

For example, the directory containing the C code file packfile.cc also contains a text file named hello.txt. Run the following command in that directory:

emcc packfile.cc -o packfile.js --preload-file hello.txt

packfile.js and packfile.data will be generated, with hello.txt packaged in packfile.data. The C code reads the contents of hello.txt and prints them:

//packfile.cc
#include <stdio.h>

int main() {
	FILE* fp = fopen("hello.txt", "rt");
	if (fp) {
		int c;  // int, not char, so that EOF stays distinct from data bytes
		while ((c = fgetc(fp)) != EOF) {
			putchar(c);
		}
		fclose(fp);
	}
	return 0;
}

The console will output the contents of hello.txt.

The --preload-file parameter can package not only a single file but also an entire directory. For example, the directory containing the C code file packdir.cc also contains a folder named dat_dir, with the following structure:

|--packdir.cc
|--dat_dir
   |--t1.txt
   |--t2.txt
   |--sub_dir
      |--t3.txt

Run the following command in the packdir.cc directory:

emcc packdir.cc -o packdir.js --preload-file dat_dir

The package file packdir.data is generated, which contains all the contents of dat_dir. The C code is as follows:

//packdir.cc
#include <stdio.h>

// Print the contents of the file fname to the console.
void read_fs(const char* fname) {
	FILE* fp = fopen(fname, "rt");
	if (fp) {
		int c;  // int, not char, so that EOF stays distinct from data bytes
		while ((c = fgetc(fp)) != EOF) {
			putchar(c);
		}
		fclose(fp);
	}
}

// Create t3.txt in the current directory of the virtual file system.
void write_fs() {
	FILE* fp = fopen("t3.txt", "wt");
	if (fp) {
		fprintf(fp, "This is t3.txt.\n");
		fclose(fp);
	}
}

int main() {
	read_fs("dat_dir/t1.txt");
	read_fs("dat_dir/t2.txt");
	read_fs("dat_dir/sub_dir/t3.txt");
	write_fs();
	read_fs("t3.txt");
	return 0;
}

The console will output the contents of t1.txt, t2.txt, and sub_dir/t3.txt, followed by the line written to the new t3.txt.

Emscripten uses the Unix-style directory separator "/", and from the point of view of C/C++ code, the packaged files are loaded under the current path. Once the package is loaded, files and folders can be created and data written, but the written data lives in JavaScript-managed memory, and all writes are lost when the page is refreshed.
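Creating a folder and writing a file into it can be sketched as follows. The sketch assumes the POSIX mkdir() call, which Emscripten's libc also provides; write_in_dir() is a hypothetical helper name, and under MEMFS everything it writes is gone after a page refresh.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create directory `dir` if it does not exist yet,
   then write `text` to the file `dir`/`name`.
   Returns 0 on success, -1 on failure. */
int write_in_dir(const char* dir, const char* name, const char* text) {
    assert(dir != NULL && name != NULL && text != NULL);
    if (mkdir(dir, 0777) != 0 && errno != EEXIST) return -1;

    /* Build the path with the Unix-style "/" separator. */
    char path[256];
    int len = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (len < 0 || (size_t)len >= sizeof path) return -1;  /* path too long */

    FILE* fp = fopen(path, "w");
    if (fp == NULL) return -1;
    int ok = (fputs(text, fp) != EOF);
    if (fclose(fp) != 0) ok = 0;
    return ok ? 0 : -1;
}
```

A call such as write_in_dir("out", "note.txt", "saved\n") uses a relative path, so the folder is created next to the packaged files under the current path.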

The Python script file_packager.py, found in the tools/ directory of the Emscripten installation, can perform file packaging on its own. For example, the following command preloads the dat_dir directory into fp.data and fp.js:

python emsdk/1.38.11/tools/file_packager.py fp.data --preload dat_dir --js-output=fp.js

When an external file package is used, the -s FORCE_FILESYSTEM=1 parameter must be added when compiling the main program to force the file system to be enabled, for example:

emcc packdir.cc -o packdir_sep.js -s FORCE_FILESYSTEM=1

In the web page, the external file package's .js file must be included first, followed by the main program's .js file:

//packdir_sep.html
	<script src="fp.js"></script>
	<script src="packdir_sep.js"></script>

The console output of the above example is unchanged.

Info: Although the package is downloaded asynchronously, Emscripten ensures that file system initialization is complete by the time the runtime is ready, so it is safe to use the file system in the Module.onRuntimeInitialized callback.

3.3.3 `NODEFS`

Here is an example using NODEFS:

//nodefs.cc
#include <stdio.h>
#include <emscripten.h>

// Mount the current local directory at /data in the virtual file system.
void setup_nodefs() {
	EM_ASM(
		FS.mkdir('/data');
		FS.mount(NODEFS, {root:'.'}, '/data');
	);
}

int main() {
	setup_nodefs();

	FILE* fp = fopen("/data/nodefs_data.txt", "r+t");
	if (fp == NULL) fp = fopen("/data/nodefs_data.txt", "w+t");
	int count = 0;
	if (fp) {
		fscanf(fp, "%d", &count);
		count++;
		fseek(fp, 0, SEEK_SET);
		fprintf(fp, "%d", count);
		fclose(fp);
		printf("count:%d\n", count);
	}
	else {
		printf("fopen failed.\n");
	}
	return 0;
}

Note that setup_nodefs() uses the EM_ASM macro to run the JavaScript that mounts NODEFS: FS.mkdir('/data') creates the /data directory in the virtual file system, and FS.mount(NODEFS, {root:'.'}, '/data') mounts the current local directory at that directory. Each time main() runs, it opens /data/nodefs_data.txt, which corresponds to nodefs_data.txt in the current local directory, reads an integer from it, increments it, and writes it back. Compile the above code with emcc (recent Emscripten releases also require adding -lnodefs.js at link time):

emcc nodefs.cc -o nodefs.js

Run nodefs.js several times with Node, and the output is as follows:

> node nodefs.js
count:2
> node nodefs.js
count:3
> node nodefs.js
count:4
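The counter logic itself does not depend on NODEFS, so the same source can also be built natively if the mount step and the path prefix are guarded by the __EMSCRIPTEN__ macro. The following is a minimal sketch of that idea; DATA_ROOT, setup_data_root(), and bump_counter() are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stdio.h>
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

/* Hypothetical prefix for persistent files: the NODEFS mount point when
   built with Emscripten, the current directory in a native build. */
#ifdef __EMSCRIPTEN__
#define DATA_ROOT "/data"
#else
#define DATA_ROOT "."
#endif

/* Mount the local directory at /data (does nothing in a native build). */
void setup_data_root(void) {
#ifdef __EMSCRIPTEN__
    EM_ASM(
        FS.mkdir('/data');
        FS.mount(NODEFS, {root: '.'}, '/data');
    );
#endif
}

/* Read the integer stored in DATA_ROOT/`name`, add one, and write it back.
   Returns the new value, or -1 if the file cannot be opened. */
int bump_counter(const char* name) {
    assert(name != NULL);
    char path[256];
    int len = snprintf(path, sizeof path, "%s/%s", DATA_ROOT, name);
    if (len < 0 || (size_t)len >= sizeof path) return -1;

    FILE* fp = fopen(path, "r+");
    if (fp == NULL) fp = fopen(path, "w+");   /* first run: create the file */
    if (fp == NULL) return -1;

    int count = 0;
    if (fscanf(fp, "%d", &count) != 1) count = 0;  /* new or empty file */
    count++;
    fseek(fp, 0, SEEK_SET);
    fprintf(fp, "%d", count);
    fclose(fp);
    return count;
}
```

Calling setup_data_root() once and then bump_counter("nodefs_data.txt") on every run should give the same increasing count as the example above, whether the program is run natively or through Node.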

3.3.4 `IDBFS`

Here is an example using IDBFS:

#include <stdio.h>
#include <emscripten.h>

// EM_PORT_API is assumed to be the export macro defined elsewhere in the book.

// Write the in-memory changes back to IndexedDB.
void sync_idbfs() {
	EM_ASM(
		FS.syncfs(function (err) {});
	);
}

EM_PORT_API(void) test() {
	FILE* fp = fopen("/data/nodefs_data.txt", "r+t");
	if (fp == NULL) fp = fopen("/data/nodefs_data.txt", "w+t");
	int count = 0;
	if (fp) {
		fscanf(fp, "%d", &count);
		count++;
		fseek(fp, 0, SEEK_SET);
		fprintf(fp, "%d", count);
		fclose(fp);
		printf("count:%d\n", count);
		sync_idbfs();
	}
	else {
		printf("fopen failed.\n");
	}
}

int main() {
	// Mount IDBFS, load the persisted data, then run test() in the callback.
	EM_ASM(
		FS.mkdir('/data');
		FS.mount(IDBFS, {}, '/data');
		FS.syncfs(true, function (err) {
			assert(!err);
			ccall('test', 'v');
		});
	);
	return 0;
}

Similar to NODEFS, IDBFS is mounted through the FS.mount() method. At runtime, IDBFS still keeps the virtual file system in memory, but FS.syncfs() can synchronize the in-memory data with IndexedDB in either direction to make it persistent. FS.syncfs() is an asynchronous operation, so in the example above, the test() function that reads and writes the file must be called from the FS.syncfs() callback. After each page refresh, the count printed to the console increases by 1.
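The direction of FS.syncfs() is chosen by its first argument: true loads the persisted data from IndexedDB into memory, while false (or omitting the argument) writes the in-memory data out to IndexedDB. The sketch below wraps the write-out direction and reports a failure instead of ignoring it. It is an illustration under the assumption that IDBFS is already mounted; request_persist() is a hypothetical name, and in a native build it is a no-op.

```c
#include <assert.h>
#include <stdio.h>
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

/* Hypothetical helper: ask IDBFS to copy the in-memory files to IndexedDB.
   Returns 1 if a sync was requested, or 0 in a native build, where the
   C library already writes to the real disk. The sync itself completes
   asynchronously, after this function has returned. */
int request_persist(void) {
#ifdef __EMSCRIPTEN__
    /* No populate argument: memory is written out to IndexedDB. */
    EM_ASM(
        FS.syncfs(function (err) {
            if (err) console.error('syncfs failed', err);
        });
    );
    return 1;
#else
    return 0;
#endif
}
```

Because the call only schedules the synchronization, code that needs the data to be safely stored has to continue from the JavaScript callback rather than from the line following request_persist().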
