Background

One of our projects uses Electron to build a desktop application that runs on both Windows and Mac. Because the core logic is written in JavaScript, it is very easy for an attacker to unpack the application, modify the logic to bypass commercial restrictions, repackage it, and redistribute the cracked version.

We already use digital signatures, but that is not enough. To really solve the problem, in addition to moving all commercial logic to the server side, we need to harden the code against unpacking, tampering, repackaging, and redistribution.

Comparison of approaches

Mainstream approaches

  • Uglify / Obfuscator
    • Introduction: Reduce the readability of the JS code as much as possible by uglifying and obfuscating it.
    • Features: easy to unpack, easy to read, easy to tamper with, easy to repackage
    • Advantages: Simple to adopt.
    • Disadvantages: Code formatters and deobfuscation tools can restore the code to some degree. Uglifying renames variables, which may break code. Obfuscation restructures the code, which has a large impact on performance and may also cause execution failures.
  • Native encryption
    • Introduction: The bundle produced by Webpack is encrypted with XOR or AES, wrapped in a Node Addon, and decrypted by JS at runtime.
    • Features: unpacking has a cost, easy to read, easy to tamper with, easy to repackage
    • Advantages: Offers some protection and deters novices.
    • Disadvantages: Unpacking is very easy for an attacker familiar with Node and Electron. If the application supports DevTools, the source code can be read directly from DevTools and then redistributed. If it does not, simply copying the Node Addon into a DevTools-enabled Electron still exposes the source code.
  • ASAR encryption
    • Introduction: Encrypt the Electron ASAR file and modify the Electron source code so that it decrypts the ASAR file before reading and running it.
    • Features: difficult to unpack, easy to read, easy to tamper with, easy to repackage
    • Advantages: Strong protection that stops many attackers.
    • Disadvantages: The up-front cost of rebuilding Electron is high. An attacker can still obtain the source code and repackage it for distribution by forcing the inspect port open, by enabling DevTools in the application, or by dumping memory.
  • V8 bytecode
    • Introduction: The vm module in the Node standard library can generate cached data from a Script object. The cached data can be understood as V8 bytecode, and distributing only that bytecode protects the source code.
    • Features: easy to unpack, difficult to read, difficult to tamper with, easy to repackage
    • Advantages: The generated bytecode is nearly unreadable and hard to tamper with, and the source code is not shipped.
    • Disadvantages: Intrusive to the build process, with no convenient off-the-shelf solution. Strings and similar constants can still be read from the bytecode, and they can be tampered with.

Introducing the approach

About V8 bytecode

v8.dev/blog/code-c…

Read more:

  • bytenode/bytenode
  • Understand V8 bytecode “translation”
  • How to protect Node.js source code by bytecode

As we understand it, V8 bytecode is a serialized form of JavaScript after it has been parsed and compiled by the V8 engine, and it is commonly used for performance optimization in the browser. So running our code from V8 bytecode not only protects the code but also improves performance.

We will not go deep into V8 bytecode here; the articles above provide background on the technology and on implementing code protection with V8 bytecode.

Limitations of V8 bytecode

Limitations in code protection

V8 bytecode does not protect strings. If we write database keys or similar information in JS code, the string contents can still be seen by simply reading the V8 bytecode file as text. A simple workaround is to store such keys in binary rather than string form.
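To make the string limitation concrete, here is a minimal sketch that compiles a script with Node's vm module and searches the resulting cache for a string literal. The source text and the key name are hypothetical, and the exact serialization is an internal detail of V8, so treat the result as an observation rather than a guarantee.

```javascript
const vm = require('vm');

// Hypothetical source containing a hard-coded secret.
const source = 'const dbKey = "my-database-key"; dbKey.length;';

// Compile the script and serialize V8's code cache (the "bytecode").
const cache = new vm.Script(source).createCachedData();

// Search the raw bytes for the ASCII form of the secret.
const leaked = cache.includes('my-database-key');
console.log('secret visible in bytecode:', leaked);
```

Running a tool such as strings over a bytecode file gives an attacker the same information without writing any code.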

In addition, the binaries produced by the scheme above can still be redistributed after small direct modifications, for example setting isVip to true, or disabling auto-update by changing the update URL to a bogus address. To guard against this, we want to add more protection on top of this layer to make cracking more expensive.

Impact on builds

The V8 bytecode format depends on the V8 version and environment, and the bytecode output differs across versions and environments. Electron has two kinds of process, Browser (main) and Renderer. Although they share the same V8 version, their bytecode differs because of different injected methods and different runtime environments: bytecode generated in the Browser process does not run in the Renderer process, and vice versa. Likewise, bytecode generated in Node.js does not run in Electron. Therefore, we need to build the Browser process code in the Browser process and the Renderer process code in the Renderer process.

Impact on debugging and support for Sourcemap

Source maps are affected: the code passed to vm.Script is replaced by placeholder dummy code, and filename no longer takes effect, so locating code during debugging is impacted to some degree.

Impact on code size

For JS files of only a few lines, compiling to bytecode greatly increases the file size. If a project contains a large number of small JavaScript files, its total size can increase significantly. For JS bundles of a few megabytes, however, the size increase is negligible.
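The fixed overhead is easy to observe. This small sketch compares the size of a deliberately tiny, hypothetical source file with the size of its code cache; the exact numbers vary by V8 version, so none are quoted here.

```javascript
const vm = require('vm');

// A deliberately tiny, hypothetical source file.
const tinySource = 'var x = 1 + 1;';
const sourceBytes = Buffer.byteLength(tinySource);
const bytecodeBytes = new vm.Script(tinySource).createCachedData().length;

console.log(`source: ${sourceBytes} bytes, bytecode: ${bytecodeBytes} bytes`);
```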

One step further – obfuscate and run via Node Addon

Given these limitations, we embed the V8 bytecode in a Node Addon and run the embedded bytecode from inside the addon. This protects the constants in the bytecode and also hides the whole bytecode scheme inside the addon.

Using N-API

To avoid rebuilding the addon for each runtime version, the Node Addon should be built on N-API. For its specific advantages, see: Node-API | Node.js v15.14.0 Documentation

Use Rust and Neon Bindings

Choosing Rust is simply a technology preference. Compared with C++, Rust is memory safe, has an easier build toolchain, and has strong cross-platform support, so we chose Rust to implement the Node Addon.

Rust also has the include_bytes! macro, which embeds a binary file directly into the dynamic library at compile time. This is much simpler than a C++ solution, which would require code generation.

Rust cannot be used to write Node Addons directly, though; it needs Neon Bindings. Neon Bindings is a Rust wrapper around the Node APIs that hides them in its underlying implementation and exposes easy-to-use Rust APIs to developers. (N-API support in neon-bindings/neon is not yet complete; for progress see Quest: N-API Support · Issue #444 · neon-bindings/neon.)

Implementation

The implementation mainly involves reworking the build tool pipeline. For the specific build process, please refer to the diagram:

Compiling bytecode

In most Electron applications, whether Webpack or another bundler is used, two or more bundles are generated: one for the main process and one or more for renderer processes. The actual file names depend on the application; here we assume the bundler produces main.js and renderer.js.

After the bundles are built, both need to be compiled to bytecode. Because the bundles run in the Electron environment, the bytecode must be generated in the Electron environment. For the main process bundle, bytecode can be generated directly in the main process; for the renderer bundle, we need to create a new browser window and generate the bytecode inside it. We create two JS files:

electron-main.js

// This file can be run directly with electron.

const fs = require('fs');
const path = require('path');
const rimraf = require('rimraf');
const { BrowserWindow, app } = require('electron');
const { compile } = require('./bytecode');

async function main() {
  // Input directory holding the JS bundles to be compiled
  const inputPath = path.resolve(__dirname, 'input');
  // Output directory for the bytecode, e.g. main.js -> main.bin
  const outputPath = path.resolve(__dirname, 'output');

  // Clean up and recreate the output directory
  rimraf.sync(outputPath);
  fs.mkdirSync(outputPath);

  // Read the raw js and generate the bytecode
  const code = fs.readFileSync(path.resolve(inputPath, 'main.js'));
  fs.writeFileSync(path.resolve(outputPath, 'main.bin'), compile(code));

  // Launches a browser window for compiling the bytecode of the renderer process
  await launchRenderer();
}


async function launchRenderer() {
  await app.whenReady();

  const win = new BrowserWindow({
    webPreferences: {
      // We use preload to execute JS in the renderer, so we don't need an HTML file.
      preload: path.resolve(__dirname, './electron-renderer.js'),
      enableRemoteModule: true,
      nodeIntegration: true,
    },
  });
  win.loadURL('about:blank');
  win.show();
}

main();

electron-renderer.js

// This file runs in the browser window created by electron-main.js.

const fs = require('fs')
const path = require('path')
const { remote } = require('electron')
const { compile } = require('./bytecode');

async function main() {
  const inputPath = path.resolve(__dirname, 'input')
  const outputPath = path.resolve(__dirname, 'output')

  const code = fs.readFileSync(path.resolve(inputPath, 'renderer.js'))
  fs.writeFileSync(path.resolve(outputPath, `renderer.bin`), compile(code));
}

// Close the browser window to inform the main process that compilation is complete
main().then(() => remote.getCurrentWindow().close())

Next we need to implement bytecode.js, which is the logic for compiling bytecodes:

bytecode.js

const vm = require('vm');
const v8 = require('v8');

// These two parameters are very important to ensure that the bytecode can be run.
v8.setFlagsFromString('--no-lazy');
v8.setFlagsFromString('--no-flush-bytecode');

function encode(buf) {
  // Obfuscation logic goes here, e.g. XOR. Buffer elements are bytes, so only
  // the low 8 bits of the key (12345 & 0xff === 0x39) actually take effect.
  return buf.map((b) => b ^ 12345);
}

exports.compile = function compile(code) {
  const script = new vm.Script(code);
  const raw = script.createCachedData();
  return encode(raw);
};

About obfuscation: To avoid slowing application startup, we do not recommend using AES. Even with AES, the bytecode build artifacts can still be recovered in various ways (memory dump, hooks, etc.). The purpose of obfuscating the bytecode is to raise the cost of decryption and to prevent constants from being extracted directly from the Node Addon binary.
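Because XOR is its own inverse, the same routine can serve the build step and the loader. The following is a minimal sketch with a hypothetical single-byte key and a stand-in payload; it illustrates the round trip only and is not meant as real encryption.

```javascript
// Hypothetical single-byte key; any value in 1..255 works.
const XOR_KEY = 0x39;

// XOR is its own inverse, so one function both encodes and decodes.
function xorBytes(buf, key = XOR_KEY) {
  return Buffer.from(buf.map((b) => b ^ key));
}

// Stand-in payload; in the real pipeline this would be the code cache.
const original = Buffer.from('bytecode bytes would go here');
const encoded = xorBytes(original);
const decoded = xorBytes(encoded);

console.log('encoded differs from original:', !encoded.equals(original));
console.log('round trip restores original:', decoded.equals(original));
```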

With the files above in place, running electron ./electron-main.js compiles main.js and renderer.js from the input folder to bytecode. The output is written to the output folder.

A visible BrowserWindow is created during compilation. If you do not want it to be visible, set show: false in the options passed when creating the BrowserWindow and remove the win.show() call.

Packaging into a Native Addon

We use Rust to develop the Node Addon.

Some of the later steps run JS logic directly from Rust, which involves requiring Node modules, constructing objects, and similar operations. For these, see the Neon Bindings documentation: Introduction | Neon.

Requiring Node modules

The require method does not exist in the Global object. Instead, it exists in the scope of module code execution.

(function (exports, require, module, __filename, __dirname) {
  /* Module file code */
});

module, require, exports, __filename, and __dirname are all exposed to a module as local variables, so the global object does not hold them.
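A small sketch can confirm that require is just a parameter of the wrapper function. It uses Module.wrap, a long-standing helper on Node's module object that applies the wrapper shown above, and calls the wrapped function with stand-in arguments. The empty objects and path strings passed in are placeholders for illustration.

```javascript
const Module = require('module');
const vm = require('vm');

// Wrap a one-line module body in Node's module wrapper function.
const wrapped = Module.wrap('return typeof require;');
const moduleFn = vm.runInThisContext(wrapped);

// Parameters: (exports, require, module, __filename, __dirname).
// With a real require passed in, the body sees a function.
const withRequire = moduleFn({}, require, {}, 'stub.js', '.');
// With nothing passed in that slot, the body sees undefined.
const withoutRequire = moduleFn({}, undefined, {}, 'stub.js', '.');

console.log(withRequire, withoutRequire);
```

This is why the addon needs the module object handed to it explicitly: native code runs outside any wrapper function.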

Therefore, the require method cannot be obtained directly inside the Node Addon, so the JS side must pass its module object to the addon when invoking it. Rust can then require other modules only by calling that module object's require method:

require("./loader.node").load({
  type: "main",
  module, // Pass through the module object of the current module
})

The code above directly replaces the original contents of main.js. On the Rust side, we implement a helper method to make require calls convenient:

fn node_require(&mut self, id: &str) -> NeonResult<Handle<'a, JsObject>> {
    let require_fn: Handle<JsFunction> = self.js_get(self.module, "require")?;
    let require_args = vec![self.cx.string(id)];
    let result = require_fn
        .call(&mut self.cx, self.module, require_args)?
        .downcast_or_throw(&mut self.cx)?;
    Ok(result)
}

Embedding and retrieval of bytecode

After bytecode compilation is complete, we use JS to generate the following Rust code, so that Rust embeds the compiled bytecode in the dynamic library and can read it directly:

pub fn get_module_main() -> &'static [u8] {
    include_bytes!("[...]./output/main.bin")
}

pub fn get_module_renderer() -> &'static [u8] {
    include_bytes!("[...]./output/renderer.bin")
}

To read the bytecode in Rust, match on the type field that JS passes when calling into the Node Addon, then call the corresponding function that returns the binary data:

enum LoaderProcessType {
    Main,
    Renderer,
}

let process_type = match process_type_str.value(&mut cx).as_str() {
    "main" => LoaderProcessType::Main,
    "renderer" => LoaderProcessType::Renderer,
    _ => panic!("ERROR"),
};

match process_type {
    LoaderProcessType::Main => gen_main::get_module(),
    LoaderProcessType::Renderer => gen_renderer::get_module(),
};

Fix Code generation and replacement

During initialization, we first need to generate the Fix Code. The Fix Code is 4 bytes of binary data that is in fact the V8 flags hash. V8 checks this hash before running bytecode, and a mismatch causes cachedDataRejected. For the bytecode to work in the current environment, we need the V8 flags hash of the current environment.

We obtain the Fix Code by calling the vm module from Rust to compile a meaningless piece of code:

fn init_fix_code(&mut self) -> NeonResult<()> {
    let vm = self.node_require("vm")?;
    let vm_script: Handle<JsFunction> = self.js_get(vm, "Script")?;
    let code = self.cx.string("\"\"");
    let script = vm_script.construct(&mut self.cx, vec![code])?;
    let cache: Handle<JsBuffer> =
        self.js_invoke(script, "createCachedData", Vec::<Handle<JsValue>>::new())?;
    let buf: Vec<u8> = self.buf_to_vec(cache)?;
    self.fix_code = Some(buf);
    Ok(())
}

Then replace bytes 12-16 of the bytecode to be run with the corresponding 4 bytes of the Fix Code just obtained:

data[12..16].clone_from_slice(&fix_code[12..16]);
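For readers who prefer to prototype this step in JS before porting it to Rust, the same patch can be sketched with Node's Buffer API. The function names are hypothetical, and the offsets 12 to 16 are taken from the article; they are an internal V8 header layout and should be treated as an assumption that may change between V8 versions.

```javascript
const vm = require('vm');

// Assumed header layout: bytes 12..16 hold the V8 flags hash.
const FLAGS_HASH_START = 12;
const FLAGS_HASH_END = 16;

// Compile a meaningless script in the current process to obtain a header
// whose flags hash matches the current environment.
function currentHeader() {
  return new vm.Script('""').createCachedData();
}

// Overwrite the flags hash of `bytecode` in place with the reference one.
function applyFixCode(bytecode, reference = currentHeader()) {
  reference.copy(bytecode, FLAGS_HASH_START, FLAGS_HASH_START, FLAGS_HASH_END);
  return bytecode;
}

// Usage with a stand-in buffer instead of real bytecode.
const fake = Buffer.alloc(32, 0xff);
const ref = currentHeader();
applyFixCode(fake, ref);
console.log(fake.subarray(12, 16).equals(ref.subarray(12, 16)));
```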

Dummy source code generation

Next, parse bytes 8-12 of the bytecode in Rust to get the source hash and work out the code length. An arbitrary string of equal length is then generated as dummy source code to satisfy V8's source length check.

let mut len = 0usize;
for (i, b) in (&data[8..12]).iter().enumerate() {
    len += *b as usize * 256usize.pow(i as u32)
}

self.eval(&format!(r#"'"' + "\u200b".repeat({}) + '"'"#, len - 2))?;

The dummy source is generated by calling eval because converting a Rust string to a JsString is expensive, so generating it directly in JS is more efficient. The eval implementation essentially calls the vm module's runInThisContext method.
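The length calculation and the dummy string can also be sketched in plain JS. This assumes, as the Rust loop above does, that bytes 8 to 12 of the header store the source length as a little-endian 32-bit integer; the helper names and the sample source are hypothetical.

```javascript
const vm = require('vm');

// Assumption: bytes 8..12 hold the source length, little-endian.
function sourceLength(bytecode) {
  return bytecode.readUInt32LE(8);
}

// Build a quoted string literal that is exactly `length` characters long:
// two quote characters plus zero-width spaces as filler.
function dummySource(length) {
  return '"' + '\u200b'.repeat(length - 2) + '"';
}

const realSource = 'console.log("hello from bytecode");';
const cache = new vm.Script(realSource).createCachedData();
const dummy = dummySource(sourceLength(cache));

console.log('lengths match:', dummy.length === realSource.length);
```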

Deobfuscation

Before running the bytecode, we need to XOR it again to undo the obfuscation:

// 0x39 is the low byte of 12345, the only part of the key that affects a u8.
buf.into_iter().map(|b| b ^ 0x39).collect()

Run bytecode

Next, it’s time to run bytecode.

First, for the previously generated bytecode to run properly, some V8 flags must be set so that they match the configuration of the compilation environment:

fn configure_v8(&mut self) -> NeonResult<()> {
    let v8 = self.node_require("v8")?;
    let set_flag: Handle<JsFunction> = self.js_get(v8, "setFlagsFromString")?;

    let args1 = vec![self.cx.string("--no-lazy")];
    set_flag.call(&mut self.cx, v8, args1)?;

    let args2 = vec![self.cx.string("--no-flush-bytecode")];
    set_flag.call(&mut self.cx, v8, args2)?;

    Ok(())
}

Then we call the vm module from Rust to run the bytecode. In effect, Rust executes the following piece of JS logic:

const vm = require('vm');

// vm.Script is a class, so it must be constructed with `new`.
const script = new vm.Script(dummyCode, {
  cachedData, // This is the bytecode
  filename,
  lineOffset: 0,
  displayErrors: true
});
script.runInThisContext({
  filename,
  lineOffset: 0,
  columnOffset: 0,
  displayErrors: true
});
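The whole load path can be tried end to end in a single Node process. The sketch below compiles a hypothetical one-line script, discards its source, and runs it from the cache with a same-length placeholder. The sample script contains no nested functions, so the --no-lazy and --no-flush-bytecode flags are left out here; real bundles would need them as described above.

```javascript
const vm = require('vm');

// Build step: compile a script and keep only its code cache.
const realSource = '[1, 2, 3].map(String).join("-");';
const cachedData = new vm.Script(realSource).createCachedData();

// Load step: the real source is gone, so use a placeholder string literal
// of the same length to satisfy V8's source length check.
const dummyCode = '"' + '\u200b'.repeat(realSource.length - 2) + '"';
const script = new vm.Script(dummyCode, {
  cachedData,
  filename: 'main.js', // hypothetical file name
});

if (script.cachedDataRejected) {
  throw new Error('bytecode rejected by V8');
}

// The result comes from the cached bytecode, not from the placeholder.
const result = script.runInThisContext({ displayErrors: true });
console.log(result);
```

Checking cachedDataRejected before running matters: if V8 rejects the cache, it silently compiles the placeholder instead, and the script would evaluate to a string of zero-width spaces.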

How it runs

Finally, our build artifacts have the following directory structure:

dist
├─ loader.node  - Node Addon containing all the obfuscated bytecode data; basically unreadable.
├─ main.js      - Entry for the main process; loads loader.node with type "main".
├─ renderer.js  - Entry for the renderer process; loads loader.node with type "renderer".
└─ index.html   - HTML file for loading renderer.js.

When running the application, main.js is the entry point, and the complete running process is as follows:

loader.node stores all the bytecode data and contains the logic to load the bytecode. Both main.js and renderer.js require loader.node directly and pass in the type argument to specify which bytecode to load.

Common questions

  • What is the impact on the build process?
    • The main impact is an extra layer of bytecode compilation and Node Addon compilation inserted after the bundle build and before Electron Builder packaging.
  • What is the impact on build performance?
    • Starting the Electron process and a BrowserWindow for bytecode compilation takes about 2s. For a bundle of around 10 MB, generating the bytecode takes only around 150ms thanks to V8's fast JavaScript parsing. Finally, the bytecode is packaged into the Node Addon; since Rust builds are slow, this may take 5-10 seconds.
    • Overall, this scheme lengthens the build by 10-20s. On CI/CD, where the Cargo cache is lost, it can add about a minute because cargo has to download dependencies.
  • What is the impact on code organization and writing?
    • The only effect found so far is that Function.prototype.toString() does not work: the source code is not distributed with the bytecode, so a function's source is unavailable.
  • Does it affect application performance?
    • Code execution performance is unaffected. Initialization time improved by about 30% (in our application, with a bundle of around 10 MB, it dropped from around 550ms to around 370ms).
  • What is the impact on application size?
    • For bundles of only a few hundred kilobytes, the bytecode is noticeably larger; for bundles of 2 MB or more, there is no significant size difference.
  • How strong is the code protection?
    • At the time of writing, there are no publicly available tools that decompile V8 bytecode, so this scheme is fairly reliable. However, given the nature of bytecode, developing a decompiler is not difficult. At some point V8 bytecode will likely be as easy to decompile as Java/C#, so we should keep exploring other code protection methods.
    • For that reason, the bytecode is additionally obfuscated through the Node Addon layer, hiding the code-loading logic on top of bytecode protection. This makes the package harder to understand and also makes tampering and redistribution harder.

We are hiring

We are the front-end team for ByteDance's interactive entertainment music products. We work across mainstream front-end technologies, including cross-platform, admin console, and desktop development, and the team has a strong technical culture. Applications are welcome: job.toutiao.com/s/eB5sw3x