• Writing fast and safe native Node.js modules with Rust
  • Originally by Peter Czibik
  • The Nuggets Translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: LeopPro
  • Proofread by Serendipity96

Write fast and safe native Node.js modules using Rust

Content summary – use Rust instead of C++ to develop native Node.js modules!

RisingStack had a tough time last year: we pushed Node.js as fast as it would go, yet our server costs still went through the roof. To improve the performance (and cost) of our application, we decided to completely rewrite it and migrate the system to a different infrastructure – a lot of work, no doubt, and not something I'll cover here. Only later did I realize that a native module might have been all we needed!

At the time, we didn't realize there was a better way to solve our performance problem. Just a few weeks ago I learned that native modules can be implemented in Rust instead of C++. It turned out to be a great choice, thanks to the safety and ease of use Rust provides.

In this Rust tutorial, I will walk you through writing an advanced, fast, and secure native module.

Node.js server performance issues

Our problems came to light in late 2016, when we were working on Trace, our Node.js monitoring product, which merged with Keymetrics in October 2017. Like many other tech startups at the time, we deployed our services on Heroku to save on infrastructure costs and maintenance. We were building a microservices architecture, which meant our services communicated with each other over HTTP(S).

Here's the tricky part: we wanted secure communication between the services, but Heroku didn't offer private networking, so we had to roll our own solution. We looked at a few authentication schemes and settled on HTTP signatures.

To explain briefly: HTTP signatures are based on asymmetric cryptography. To create an HTTP signature, you take parts of the request – the URL, the headers, the body – and sign them with your private key. You then give your public key to whoever receives your signed requests, and they use it to verify the signature.
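
As a rough sketch of what the signing step looks like – this is not the code we ran in production, it uses the openssl crate purely for illustration, and the signing-string layout is simplified:

extern crate openssl;

use openssl::hash::MessageDigest;
use openssl::pkey::PKey;
use openssl::sign::Signer;

// Build a simplified signing string from the request parts, then sign it
// with the private key. Header selection and encoding are left out here.
fn sign_request(private_key_pem: &[u8], method: &str, path: &str, date: &str) -> Vec<u8> {
    let signing_string = format!("(request-target): {} {}\ndate: {}",
                                 method.to_lowercase(), path, date);

    let key = PKey::private_key_from_pem(private_key_pem).unwrap();
    let mut signer = Signer::new(MessageDigest::sha256(), &key).unwrap();
    signer.update(signing_string.as_bytes()).unwrap();
    signer.sign_to_vec().unwrap()
}

The receiving side rebuilds the same signing string from the incoming request and verifies it with the public key.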

Over time, we noticed that CPU utilization was maxed out in most of our HTTP-handling processes. There was an obvious suspect – if you do cryptography on every request, that's what you expect.

However, after some serious profiling with v8-profiler, we found that the culprit wasn't the cryptography at all! It was URL parsing that consumed the most CPU time. Why? Because to validate a request's signature, the URL has to be parsed.

To solve this problem, we decided to leave Heroku (for this and other reasons) and, instead of optimizing our URL parsing, we built a Google Cloud infrastructure with Kubernetes and an internal private network.

What prompted me to write this story? A few weeks ago I realized that we could have optimized URL parsing another way: by writing a native library in Rust.

Writing native modules – time for Rust

It shouldn’t be that hard to write native code, right?

At RisingStack, we adhere to the principle that if you want to do a good job, you must first sharpen your tools. We often investigate better ways to build software components and, where necessary, use C++ to write native modules.

Shameless plug: I've also blogged about my journey learning to write native Node.js modules. Go take a look!

Until now, I thought C++ was the right choice for writing fast and efficient software in most business scenarios. Now, however, we have modern tools (in this case Rust) that let us write more efficient, safer, and faster code with less effort than ever before.

Let's go back to the original question: is it really difficult to parse a URL? A URL is made up of a protocol, a host, query parameters…



(Image: the structure of a URL, from the Node.js documentation)

This looks really complicated. When I read through the URL Standard, I realized I didn’t want to implement it myself, so I started looking for alternatives.

I'm sure I'm not the only one who needs to parse URLs. Browsers have presumably solved this already, so I looked up Chromium's solution: the google-url library. While it would be easy to call that implementation from Node.js via N-API, there are a few reasons why I didn't:

  • Updating: when I just copy and paste code from the web, I immediately get an uneasy feeling. People have been doing that for a long time, and there are plenty of reasons why it ends badly… There is simply no good way to keep a big chunk of vendored code up to date in a code base.
  • Security: someone without extensive C++ experience cannot verify that the code is correct, yet we would have to run it on our servers. C++ has a steep learning curve and takes a long time to master.
  • Safety: we have all heard stories about exploitable C++ code, and I prefer to avoid reusing C++ code that I cannot audit myself. Using a well-maintained open-source module gives me enough confidence that I don't have to worry about the safety of my code.

So I would rather use a language that is easier to work with, has a simple update mechanism, and offers modern tooling: Rust!

A quick word about Rust

Rust allows us to write fast and efficient code.

All Rust projects are managed by Cargo – the NPM of Rust. Cargo can install project dependencies and has a registry containing all the packages you need to use.

I found a library – rust-url – that we can use in our example. Many thanks to the Servo team for their work.
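
To get a feel for what the crate provides, here is a minimal sketch (the URL is just an example):

extern crate url;

use url::Url;

fn main() {
    // Parse an example URL and inspect its parts.
    let parsed = Url::parse("https://example.com/path?foo=bar").unwrap();

    assert_eq!(parsed.scheme(), "https");
    assert_eq!(parsed.host_str(), Some("example.com"));
    assert_eq!(parsed.path(), "/path");
    assert_eq!(parsed.query(), Some("foo=bar"));
}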

We will also use Rust FFI. I wrote a blog post about using Rust FFI with Node.js two years ago; a lot has changed in the Rust ecosystem since then.

Now that we have a working library (rust-url), let's try compiling it!

How do I compile a Rust application?

As the rustup.rs guide explains, you could call the rustc compiler directly, but for now all we need to care about is cargo. I won't go into detail about how it works, but if you're interested, check out our previous Rust post.

Create a new Rust project

Creating a new Rust project is as simple as cargo new --lib <project name>.

You can view the full code in my repository at github.com/peteyy/rust…

To reference the Rust library, we simply list it as a dependency in Cargo.toml.

[package]
name = "ffi"
version = "1.0.0"
authors = ["Peter Czibik <[email protected]>"]

[dependencies]
url = "1.6"

Rust has no npm install-style command for adding dependencies – you have to add them to Cargo.toml by hand. The cargo-edit crate, however, provides a similar cargo add subcommand.

rust-url is itself distributed as a crate. crates.io is the registry where Rust developers around the world publish and search for crates.

Rust FFI

To call Rust from Node.js, we can use FFI. FFI stands for Foreign Function Interface: a mechanism by which a program written in one programming language can call routines or use services written in another.

To link our library, we need to add two more things to Cargo.toml:

[lib]
crate-type = ["dylib"]

[dependencies]
libc = "0.2"
url = "1.6"

Note: our library is a dynamically linked library with a .dylib file extension, loaded at run time rather than at compile time.

We also added libc as a dependency to the project; it is the standard library for ANSI C.

The libc crate is a Rust library with native bindings to the types and functions commonly found on various systems, including libc. It lets us use C types in Rust code, which we must do for any values we want to receive in, or return from, our Rust functions.
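
As a trivial, self-contained illustration (not part of the URL parser we're building), this is what an FFI-friendly function signature looks like with libc's C types:

extern crate libc;

// A toy example: both the arguments and the return value are C types
// provided by the libc crate, so a C (or Node.js FFI) caller can use it.
#[no_mangle]
pub extern "C" fn add(a: libc::c_int, b: libc::c_int) -> libc::c_int {
    a + b
}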

Our code is fairly simple – I use the extern crate keyword to pull in the url and libc crates. The function is marked pub extern so it can be exposed externally via FFI. It takes a c_char pointer, which represents the string coming from Node.js.

We need to mark the conversion as unsafe. A code block marked with the unsafe keyword may call unsafe functions or dereference raw pointers from otherwise safe code.
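
As a small aside, one way to keep the unsafe surface minimal is to guard the conversion behind a null check. This is just a sketch and not part of the module below:

extern crate libc;

use std::ffi::CStr;

// A defensive variant of the pointer conversion: check for null first and
// keep the unsafe block as small as possible. The caller must still pass a
// valid, NUL-terminated C string.
fn c_str_to_str<'a>(ptr: *const libc::c_char) -> Option<&'a str> {
    if ptr.is_null() {
        return None;
    }
    unsafe { CStr::from_ptr(ptr) }.to_str().ok()
}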

Rust uses the Option<T> type to represent a value that may be absent – much like a value in JavaScript can be null or undefined. You can (and should) explicitly check every value that might be empty. There are several ways to do that in Rust, but here I'll use the simplest one, unwrap(), which simply throws an error (a panic, in Rust terms) if the value is empty.
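
For example, instead of calling unwrap() we could handle a missing query string explicitly. A minimal sketch using the same url crate:

extern crate url;

use url::Url;

fn main() {
    let parsed_url = Url::parse("https://example.com/path").unwrap();

    // query() returns an Option<&str>: match on it instead of unwrap(),
    // which would panic here because this URL has no query string.
    match parsed_url.query() {
        Some(query) => println!("query string: {}", query),
        None => println!("no query string"),
    }
}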

Once the URL is parsed, we need to convert the result to a CString before passing it back to JavaScript.

extern crate libc;
extern crate url;

use std::ffi::{CStr, CString};
use url::Url;

#[no_mangle]
pub extern "C" fn get_query(arg1: *const libc::c_char) -> *const libc::c_char {
    // Wrap the raw C string pointer we received over FFI.
    let s1 = unsafe { CStr::from_ptr(arg1) };

    // Convert it to a Rust &str (panics if it is not valid UTF-8).
    let str1 = s1.to_str().unwrap();

    // Parse the URL (panics if it is invalid).
    let parsed_url = Url::parse(str1).unwrap();

    // Hand the query string back to the caller as a raw C string.
    CString::new(parsed_url.query().unwrap().as_bytes()).unwrap().into_raw()
}

To compile the Rust code, run cargo build --release. Before compiling, make sure you have added the url library to your Cargo.toml dependencies!

Now we can use the ffi Node.js package to create a module that calls the Rust code.

const path = require('path');
const ffi = require('ffi');

const library_name = path.resolve(__dirname, './target/release/libffi');
const api = ffi.Library(library_name, {
  get_query: ['string', ['string']]
});

module.exports = { getQuery: api.get_query };

The cargo build --release command produces the dynamic library at target/release/lib*.dylib, where * is your library name.

Nice – we now have Rust code that we can call from Node.js! As you may have noticed, though, we had to do a lot of type conversion, which adds overhead to our function calls. There should be a better way to integrate our code with JavaScript.

Meet Neon

Rust bindings for writing safe and fast native Node.js modules.

Neon lets us use JavaScript types directly in our Rust code. To create a new Neon project, use its command line tool: install it with npm install neon-cli --global.

Running neon new <project name> creates a new Neon project with no configuration required.

Once the Neon project is created, we’ll rewrite the above code as follows:

#[macro_use]
extern crate neon;

extern crate url;

use url::Url;
use neon::vm::{Call, JsResult};
use neon::js::{JsString, JsObject};

fn get_query(call: Call) -> JsResult<JsString> {
    let scope = call.scope;
    let url = call.arguments.require(scope, 0)?.check::<JsString>()?.value();

    let parsed_url = Url::parse(&url).unwrap();

    Ok(JsString::new(scope, parsed_url.query().unwrap()).unwrap())
}

register_module!(m, {
    m.export("getQuery", get_query)
});

In the code above, the new types JsString, Call, and JsResult wrap JavaScript types, letting us hook into the JavaScript VM and execute JavaScript code. The scope ties the variables we create to the current JavaScript scope, so they can be reclaimed by the garbage collector.

This is very similar to how I explained writing native Node.js modules in C++ in my previous blog post.

Note that the #[macro_use] attribute lets us use the register_module! macro, which creates the module much like module.exports does in Node.js.

The only tricky part is the access to the parameters:

let url = call.arguments.require(scope, 0)?.check::<JsString>()?.value();

We have to accept arguments of any type (just like any JavaScript function does), and we cannot know how many arguments were passed, which is why we have to check that the first one exists.

Other than that, we can drop most of the serialization work and just use JS types directly.

Now let’s try to run it!

If you downloaded my sample code earlier, go to the ffi folder and run cargo build --release, then go to the neon folder and run neon build (with neon-cli installed globally).

Once you're ready, you can use the faker.js library to generate a fresh list of URLs.

Run node generateUrls.js, which creates a urls.json file in your folder for the test program to parse. Then run the benchmark with node urlParser.js. If everything succeeds, you should see the results printed out.

The test parses 100 randomly generated URLs, in a single run. If you want a heavier benchmark, increase the number of iterations (tryCount in urlParser.js) or the number of generated URLs (in urlGenerator.js).

The clear winner of the benchmark is the Rust Neon version, but as the array length grows, V8 gets more and more opportunities to optimize, and the results get close. Eventually the JavaScript implementation even surpasses the Rust Neon one.

This is just a simple example, and of course there is still a lot to learn in this area.

Later on, we could further optimize the computation by taking advantage of concurrency, something that crates like Rayon provide.
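
As a rough sketch of what that could look like (not part of the benchmark above), Rayon's parallel iterators let us parse a batch of URLs across all available cores:

extern crate rayon;
extern crate url;

use rayon::prelude::*;
use url::Url;

// Parse a batch of URLs in parallel and collect their query strings.
fn queries(urls: &[String]) -> Vec<Option<String>> {
    urls.par_iter()
        .map(|raw| {
            Url::parse(raw)
                .ok()
                .and_then(|u| u.query().map(String::from))
        })
        .collect()
}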

Implement the Rust module in Node.js

I hope you've learned alongside me how to implement Rust modules in Node.js, and that you can now benefit from these new tools. To be clear: while this approach solves real problems (and is fun), it is not a silver bullet for every performance issue.

Keep in mind that in certain scenarios, Rust can simply be the more convenient solution.

If you want to see my talk on this topic at the Rust Hungary Seminar, click here!

If you have any questions or comments, please leave them below and I’ll get back to you here!


The Nuggets Translation Project is a community that translates high-quality technical articles from around the internet and shares them on Juejin (Nuggets). The content covers Android, iOS, front-end, back-end, blockchain, product, design, artificial intelligence, and other fields. If you want to see more high-quality translations, please follow the Nuggets Translation Project, its official Weibo, and its Zhihu column.