This article was written by David Mark Clements and Matteo Collina and reviewed by Franziska Hinkelmann and Benedikt Meurer from the V8 team. The article was first published on nearForm’s Blog.

How V8 Turbofan's performance characteristics affect the way we optimize

Since its inception, Node.js has relied on the V8 JavaScript engine to provide an environment for executing code in a programming language we know and love. The V8 JavaScript engine is a virtual machine written by Google for Chrome. From the beginning, V8's main goal was to make JavaScript run faster, or at least faster than its competitors. This is not an easy goal for a highly dynamic, weakly typed language. This article covers the evolution of V8 and JavaScript engine performance.

The core of the V8 engine achieves high-speed JavaScript execution using a just-in-time (JIT) compiler, a dynamic compiler that optimizes code at run time. V8's first JIT compiler was named Crankshaft.

Viewed from the outside, JavaScript execution speed has seemed unpredictable since the 1990s, making it difficult for JavaScript users to fully understand why their code runs slowly.

In recent years, Matteo Collina and I have focused on figuring out how to write high-performance Node.js code, which, of course, means knowing which approaches make V8 execute our code quickly and which make it slow.

Now that the V8 team has shipped a new JIT compiler, Turbofan, it's time to challenge all of our assumptions about performance.

From well-known de-optimization patterns ("V8 killers" — pieces of code that prevent optimization, a term that no longer makes much sense in the Turbofan context) to Matteo's lesser-known findings about Crankshaft performance, we'll look at how performance varies across V8 versions with a series of micro-benchmarks.

Of course, before optimizing for V8's logical paths, we should first focus on API design, algorithms, and data structures. These micro-benchmarks are indicators of how JavaScript performance is changing across Node versions. We can use these metrics to adjust our general coding style and improve our application's performance after the usual optimization techniques have been applied.

We will look at the performance of these micro-metrics using V8 versions 5.1, 5.8, 5.9, 6.0, and 6.1.

To put these versions in context: Node 6 ships V8 5.1 with the Crankshaft JIT compiler, while Node 8.0 through 8.2 ship V8 5.8 with a mix of the Crankshaft and Turbofan JIT compilers.

As of this writing, V8 5.9 or 6.0 is expected in Node 8.3 (or possibly Node 8.4). V8 6.1 is the latest version, integrated in the experimental node-v8 repository at github.com/nodejs/node… In other words, V8 6.1 will land in some future version of Node.

Let's walk through our micro-benchmarks, and at the end we'll discuss what their results mean for the future.

The try/catch problem

One of the best-known de-optimization patterns is the use of try/catch blocks.

In this benchmark, we compare four scenarios:

  • A function that contains a try/catch
  • A function that does not contain a try/catch
  • Calling a function inside a try block
  • Calling the function directly

Code address: github.com/davidmarkcl…
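As a rough sketch (the function names here are ours, not taken from the benchmark code), the four cases look something like this:

```javascript
// Scenario 1: a function that contains a try/catch
function sumTryCatch (n) {
  try {
    let total = 0
    for (let i = 0; i < n; i++) total += i
    return total
  } catch (err) {
    return -1
  }
}

// Scenario 2: the same function without a try/catch
function sumPlain (n) {
  let total = 0
  for (let i = 0; i < n; i++) total += i
  return total
}

// Scenario 3: calling a function from inside a try block
function callInsideTry (n) {
  try {
    return sumPlain(n)
  } catch (err) {
    return -1
  }
}

// Scenario 4: calling the function directly
function callDirect (n) {
  return sumPlain(n)
}
```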

We can see that in Node 6 (V8 5.1) the performance cost of try/catch is real, but in Node 8.0-8.2 (V8 5.8) the impact is significantly reduced.

Also note that calling a function inside a try block is much slower than calling it outside of the try block — this holds in both Node 6 (V8 5.1) and Node 8.0-8.2 (V8 5.8).

However, for Node 8.3+, the performance problems of calling functions from within the “try” block are negligible.

But don't get too comfortable. While researching material for our performance workshops, Matteo and I found a performance bug where a fairly specific combination of circumstances can trigger an infinite optimization/de-optimization loop in Turbofan (which would qualify as a "killer" — a pattern that destroys performance).

Removing a property from an object

For years, delete has been off-limits to anyone who wants to write high-performance JavaScript (at least on the hot code paths where performance matters).

The problem with delete boils down to how V8 handles the dynamic nature of JavaScript objects and their (potentially dynamic) prototype chains, both of which make property lookup more complex at the implementation level.

V8's technique for high-performance object properties is to create a class at the C++ layer based on the "shape" of the object, where the shape is essentially the set of keys and values (including the keys and values of the prototype chain). These are known as "hidden classes." However, this is an optimization applied to objects at run time; when the shape of an object is uncertain, V8 has another mode for property retrieval: hash table lookup. Hash table lookup is significantly slower. Historically, when we deleted a key from an object, subsequent property access became a hash table lookup. This is why we avoided delete and instead set the property to undefined. For most purposes the two approaches are equivalent, except that setting to undefined can be problematic when checking whether the property exists. However, since JSON.stringify does not include undefined values in its output (undefined is not a valid value in the JSON specification), setting to undefined is usually good enough when preparing an object for serialization.

Now let's see whether the new Turbofan implementation solves the delete problem.

In this microbenchmark, we compared two scenarios:

  • Serializing an object after setting a property to undefined
  • Serializing an object after using delete to remove a property

Code address github.com/davidmarkcl…
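A minimal illustration of the two approaches (the example objects are our own) shows why serialization output matches while existence checks differ:

```javascript
const viaUndefined = { a: 1, b: 2 }
viaUndefined.b = undefined       // property still exists, value is undefined

const viaDelete = { a: 1, b: 2 }
delete viaDelete.b               // property is gone entirely

// JSON.stringify omits undefined values, so both serialize identically...
JSON.stringify(viaUndefined)     // '{"a":1}'
JSON.stringify(viaDelete)        // '{"a":1}'

// ...but existence checks differ:
'b' in viaUndefined              // true
'b' in viaDelete                 // false
```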

In V8 6.1 (not yet used in any Node release), deleting a property from an object is very fast — even faster than setting it to undefined. This is good news, because it means we can now use delete, and it is the faster option.

Leaking and arrayifying arguments

Ordinary JavaScript functions (with the exception of arrow functions, which have no arguments object) have access to an implicit, array-like arguments object.

To use array methods or most array behaviors on it, the indexed properties of the arguments object must be copied into an array. In the past, JavaScript programmers have tended to equate less code with faster code. While this rule of thumb brings a payload-size benefit for browser-side code, on the server side code size matters far less than execution speed. So the tersest way of converting the arguments object into an array became very popular: Array.prototype.slice.call(arguments). This invokes the Array slice method, passing the arguments object as the function's this context; slice sees an array-like object and acts accordingly. That is, it takes the entire array-like arguments object and copies it into an array.

However, when a function's implicit arguments object is exposed outside the function's context (for example, when it is returned from the function or passed to another function, as in the Array.prototype.slice.call(arguments) case), this has traditionally caused a performance hit. Now it's time to challenge that assumption.

The next micro-benchmark measures two related topics across our four V8 versions: the cost of leaking arguments, and the cost of copying arguments into an array (which is then exposed from the function scope instead of the arguments object).

The details are as follows:

  • Exposing the arguments object to another function, with no array conversion
  • Making a copy of the arguments object with the Array.prototype.slice trick
  • Using a for loop to copy each property into a new array
  • Using the ECMAScript 2015 spread/rest syntax to assign the inputs to an array

Code access address: github.com/davidmarkcl…
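The four cases can be sketched like this (doSomething and the function names are our own placeholders, not the benchmark's):

```javascript
function doSomething (args) {
  return args.length
}

// 1. Leak the arguments object to another function, no conversion
function leakArguments () {
  return doSomething(arguments)
}

// 2. Copy arguments into an array with Array.prototype.slice
function sliceArguments () {
  const args = Array.prototype.slice.call(arguments)
  return doSomething(args)
}

// 3. Copy each property with a for loop into a pre-allocated array
function loopArguments () {
  const args = new Array(arguments.length)
  for (let i = 0; i < arguments.length; i++) args[i] = arguments[i]
  return doSomething(args)
}

// 4. Gather the inputs into a real array with ES2015 rest syntax
function spreadArguments (...args) {
  return doSomething(args)
}
```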

Let’s take a look at the same data, graphically highlighting the change in performance characteristics:

To sum up: if we want to work with the input arguments as an array in our handlers (which, in my experience, is fairly common), then in Node 8.3 and later we should use the spread/rest syntax. In Node 8.2 and earlier, we should use a for loop to copy the keys from arguments into a new (pre-allocated) array (see the benchmark code for details).

Also, in Node 8.3+ we no longer incur a performance penalty for exposing arguments to other functions, so there may be further performance advantages in passing the array-like arguments object around directly where a full array isn't needed.

Partial application (currying) and bind

Partial application (or currying) is a way in which we can capture state inside a nested closure.

For example:

function add (a, b) {
  return a + b
}
const add10 = function (n) {
  return add(10, n)
}
console.log(add10(20))

Here the first argument of add, a, is fixed to 10 by add10.

Since ECMAScript 5, the bind method has provided a terser way to write partial application:

function add (a, b) {
  return a + b
}
const add10 = add.bind(null, 10)
console.log(add10(20))

However, we generally avoid bind because it is significantly slower than using a closure.

This benchmark measures the difference between using bind and using closures for partial application, versus calling the function directly, across our target V8 versions.

Here are our four cases:

  • A normal function partially applying another function
  • An arrow function partially applying another function
  • A function created with bind that partially applies another function
  • Calling the function directly

Code access address: github.com/davidmarkcl…
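The four cases can be sketched as follows (names are ours):

```javascript
function add (a, b) {
  return a + b
}

// 1. A normal function partially applying add via a closure
const viaFunction = function (n) {
  return add(10, n)
}

// 2. An arrow function partially applying add
const viaArrow = (n) => add(10, n)

// 3. bind creating the partially applied function
const viaBind = add.bind(null, 10)

// 4. Direct call: add(10, 20)
```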

The line chart of the benchmark results clearly shows convergence in the newer V8 releases. Interestingly, partial application using an arrow function is much faster than using a normal function (at least in our micro-benchmark) — in fact, it is almost as fast as a direct call. In V8 5.1 (Node 6) and 5.8 (Node 8.0-8.2), bind is very slow, and the arrow-function approach is clearly the fastest option. However, bind has sped up since V8 5.9 (Node 8.3+) and is the fastest of all in V8 6.1 (a future Node).

Across all versions, the arrow function is the closest thing to a direct call. In the later versions, bind is about as fast as the arrow function, and both are currently faster than using a normal function. As a caveat, however, we probably need to test more types of partial application, with data structures of varying sizes, to get a fuller picture.

The number of characters in a function

The size of a function, including its name, whitespace, and even comments, affects whether or not V8 will inline it. Yes: adding comments to a function can cause a performance drop in the region of 10%. Will this change with Turbofan? Let's find out.

In this benchmark we consider three scenarios:

  • Calling a small function
  • A small function run inline, padded with comments
  • Executing a large function padded with comments

Code access address: github.com/davidmarkcl…
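To make the idea of "manual inlining" concrete, here is a small sketch of our own (not the benchmark code):

```javascript
function isOdd (n) {
  return n % 2 !== 0
}

// Calling the small function
function sumOddCall (arr) {
  let total = 0
  for (let i = 0; i < arr.length; i++) {
    if (isOdd(arr[i])) total += arr[i]
  }
  return total
}

// Manually inlined: the body of isOdd written at the call site
function sumOddInline (arr) {
  let total = 0
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] % 2 !== 0) total += arr[i]
  }
  return total
}
```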

In V8 5.1 (Node 6), the small-function and inline cases run at the same speed. This neatly illustrates how inlining works: when we call the small function, it is as if V8 writes the contents of that function into the place where it is called. So when we write the function's contents out by hand (even padded with additional comments) — that is, when we manually inline it — performance is the same. Again, in V8 5.1 (Node 6) we can see that calling a function padded with comments beyond a certain size results in slower execution.

In Node 8.0-8.2 (V8 5.8), the situation is much the same, except that the cost of calling the small function has increased significantly; this is probably due to Crankshaft and Turbofan operating at the same time — one function may be in Crankshaft while another is in Turbofan, which breaks the ability to inline (i.e., clusters of sequentially inlined functions have to jump between the two compilers).

In 5.9 and later (Node 8.3+), size added by irrelevant characters such as whitespace or comments has no bearing on function performance. This is because Turbofan uses the function's AST (abstract syntax tree) to determine function size, rather than counting characters as Crankshaft did. Instead of counting bytes, it considers the function's actual instructions, so from V8 5.9 (Node 8.3+) whitespace, variable-name length, function signatures, and comments are no longer a factor in whether a function is inlined.

Notably, we once again see overall function performance degrade in the newer versions.

The conclusion remains: write small functions. For now, we should still avoid excessive comments (and even whitespace) inside functions. Also, if you want the absolute fastest speed, manual inlining (removing the call) is consistently the fastest approach. Of course, this has to be balanced against the fact that beyond a certain size (of actual executable code) a function will not be inlined, so copying code from other functions into your own can cause performance problems. In other words, manual inlining is an optional optimization; in most cases it is best to let the compiler handle inlining.

32-bit and 64-bit integers

As we all know, JavaScript has only one number type: Number.

However, V8 is implemented in C++, so an underlying numeric type must be chosen for each JavaScript number.

In the case of whole numbers (that is, when we specify a number with no decimal point in JS), V8 assumes that all numbers fit in 32 bits — until they don't. This seems a fair choice, since in many cases a number is in the 0-65535 range. If a JavaScript (whole) number exceeds 65535, the JIT compiler must dynamically change the underlying type of the number to 64-bit — which may also rule out other optimizations.

This benchmark looks at the following three situations:

  • A function that only handles numbers within the 32-bit range
  • A function that handles a mix of numbers within and beyond 32-bit capacity
  • A function that only handles numbers exceeding 32-bit capacity

Code access address: github.com/davidmarkcl…
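An illustrative sketch of our own (not the benchmark code) showing the three ranges, plus the string-ID advice from below:

```javascript
// A simple summing function whose speed depends on the numbers it sees
function sum (arr) {
  let total = 0
  for (let i = 0; i < arr.length; i++) total += arr[i]
  return total
}

const small = [1, 2, 3]                    // stays comfortably within 32 bits
const mixed = [1, 2, 2 ** 40]              // crosses into 64-bit territory
const large = [2 ** 40, 2 ** 41, 2 ** 42]  // always beyond 32-bit capacity

// Long numeric IDs are better kept as strings:
const id = '9007199254740993'
```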

The figure shows that this observation holds true, whether in Node 6 (V8 5.1), Node 8 (V8 5.8), or even future versions of Node. Operating on whole numbers greater than 65535 causes the function to run at between half and two-thirds of the speed. So, if you have long numeric IDs — put them in strings.

It is also worth noting that operations on numbers within the 32-bit range speed up significantly between Node 6 (V8 5.1) and Node 8.1/8.2 (V8 5.8), but become noticeably slower in Node 8.3+ (V8 5.9+). Since larger numbers are unaffected, the slowdown likely relates to the handling of the (32-bit) numbers themselves, rather than to function calls or loop speed (which the benchmark code also exercises).

Iterating over objects

Taking all the values (or properties) of an object and doing something with them is a common task, and there are many ways to approach it. Let's find out which is fastest across our V8 (and Node) versions.

The benchmark measures four conditions across all V8 versions:

  • Using a for-in loop with a hasOwnProperty check to get all the values of an object
  • Using Object.keys with the Array reduce method, accessing the object's property values inside the iterator function passed to reduce
  • Using Object.keys with the Array reduce method, where the iterator passed to reduce is an arrow function
  • Looping over the array returned by Object.keys with a for loop, accessing the object's property values inside the loop

We also added three additional cases for V8 5.8, 5.9, 6.0, and 6.1:

  • Using Object.values with the Array reduce method to iterate over the values
  • Using Object.values with the Array reduce method, where the function passed to reduce is an arrow function
  • Looping over the array returned by Object.values with a for loop

We don't benchmark these cases in V8 5.1 (Node 6) because it does not support the native ECMAScript 2015 Object.values method.

Code access address: github.com/davidmarkcl…
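Sketches of the main approaches (the function names and sample object are ours):

```javascript
const obj = { a: 1, b: 2, c: 3 }

// 1. for-in with a hasOwnProperty check
function sumForIn (obj) {
  let total = 0
  for (const key in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) total += obj[key]
  }
  return total
}

// 2. Object.keys with reduce
function sumKeysReduce (obj) {
  return Object.keys(obj).reduce(function (total, key) {
    return total + obj[key]
  }, 0)
}

// 4. for loop over the array returned by Object.keys
function sumKeysFor (obj) {
  const keys = Object.keys(obj)
  let total = 0
  for (let i = 0; i < keys.length; i++) total += obj[keys[i]]
  return total
}

// Object.values variant (requires V8 5.8+ / Node 8+)
function sumValuesFor (obj) {
  const values = Object.values(obj)
  let total = 0
  for (let i = 0; i < values.length; i++) total += values[i]
  return total
}
```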

Using for-in in Node 6 (V8 5.1) and Node 8.0-8.2 (V8 5.8) is by far the fastest way to loop over all the keys of an object and access its values inside the loop: around 40 million operations per second, five times faster than the nearest approach, Object.keys, at around 8 million.

In V8 6.0 (Node 8.3), for-in drops to a quarter of its speed in previous versions, but remains faster than any other approach.

In V8 6.1 (a future Node release), Object.keys speeds up and becomes faster than for-in, but nowhere near as fast as for-in was in V8 5.1 and 5.8 (Node 6, Node 8.0-8.2).

A driving principle behind Turbofan seems to be optimizing for intuitive coding behavior — that is, optimizing for whatever is most ergonomic for the developer.

Getting the values directly with Object.values is slower than using Object.keys and looking the values up on the object. On top of that, procedural loops remain faster than functional programming. So we may have to do more work when it comes to iterating over objects.

Also, for those of us who have been using for-in for its performance, it will be a painful moment when it loses its speed advantage and nothing equally fast replaces it.

Creating objects

We’re always creating objects, so this is a good point to measure.

We will look at the following three cases:

  • Create objects using object literals
  • Create objects using ES2015 Class
  • Use constructor functions to create objects

Code repository address: github.com/davidmarkcl…
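The three creation styles can be sketched like this (names are ours):

```javascript
// 1. Object literal
function makeLiteral () {
  return { x: 1, y: 2 }
}

// 2. ES2015 class
class PointClass {
  constructor () {
    this.x = 1
    this.y = 2
  }
}

// 3. Constructor function
function PointCtor () {
  this.x = 1
  this.y = 2
}
```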

In Node 6 (V8 5.1), all methods are about the same speed.

In Node 8.0-8.2 (V8 5.8), creating instances with ECMAScript 2015 classes is less than half the speed of using object literals or a constructor function. Watch out for this.

This is still true in V8 5.9.

Then in V8 6.0 (probably in Node 8.3, or possibly Node 8.4) and 6.1 (not currently in any Node release), object creation becomes incredibly fast: over 500 million ops/sec!

We can see that objects created by a constructor are slightly slower. So our best bet for future-friendly fast code is to always prefer object literals. This suits us well, since we recommend returning object literals from functions (rather than using classes or constructors) as a general best practice anyway.

Polymorphic vs monomorphic functions

When we always pass arguments of the same type to a function (for example, we always pass a string), we are using the function monomorphically. Some functions are written to be polymorphic — the same parameter can take values of different hidden classes — so a function might handle a string, an array, or an object with a particular hidden class, and act accordingly. This can make for nicer interfaces in some cases, but it has a negative impact on performance.

Let's see how the monomorphic and polymorphic cases look in our benchmark.

Let’s look at five scenarios:

  • A function whose argument may be an object literal or a string
  • A function whose argument may be a constructor instance or a string
  • A function whose argument is only ever a string
  • A function whose argument is only ever an object literal
  • A function whose argument is only ever a constructor instance

Code repository address: github.com/davidmarkcl…
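A minimal sketch of the idea (function and property names are our own):

```javascript
// A function whose argument can be a string or an object
function label (input) {
  if (typeof input === 'string') return 'str:' + input
  return 'obj:' + input.name
}

// Polymorphic usage: the same call site sees different hidden classes
label('a')
label({ name: 'b' })

// Monomorphic usage: the call site only ever sees strings
label('a')
label('b')
```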

The data in our chart conclusively shows that monomorphic functions outperform polymorphic functions across all the V8 versions tested.

The much larger performance gap between monomorphic and polymorphic functions in V8 6.1 (a future Node version) reinforces the point further. However, it is worth noting that this is based on the node-v8 branch, which uses a nightly build of V8 — it may not hold in the final V8 6.1.

If we are writing code that needs to be fast and the function will be called many times, we should avoid polymorphism. On the other hand, for functions that are only called once or twice — say, instantiation/setup functions — a polymorphic API is acceptable.

The debugger keyword

Finally, let's talk about the debugger keyword.

Be sure to remove debugger statements from your code: a stray debugger statement can wreck performance.

Let’s look at two cases:

  • A function with a debugger statement inside
  • A function without a debugger statement inside

Code repository address: github.com/davidmarkcl…
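The two cases are as simple as they sound (a sketch of our own; note that a debugger statement is a no-op unless an inspector is attached, yet it still carries a cost in these benchmarks):

```javascript
// A function containing a debugger statement
function withDebugger () {
  debugger
  return 1 + 1
}

// The same function without the debugger statement
function withoutDebugger () {
  return 1 + 1
}
```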

In all the V8 versions tested, the mere presence of the debugger keyword has a terrible impact on performance.

Note that the line for the without-debugger case also declines noticeably across successive V8 versions — more on this in the summary.

A real-world benchmark: logger comparison

In addition to our micro-benchmarks, we can see the holistic effect of the V8 versions using a benchmark of the most popular Node.js loggers, put together by Matteo.

Code address: the Pino repository.

The following bar chart shows the performance of the most popular Node.js loggers on V8 5.9 (Crankshaft):

Here’s the same benchmark using V8 6.1 (Turbofan) :

While all of the logger benchmarks are roughly 2x faster, the Winston logger gains the most from the new Turbofan JIT compiler. This seems to echo the speed convergence we saw across approaches in the micro-benchmarks: the approaches that were slower under Crankshaft are significantly faster under Turbofan, while the approaches that were fastest under Crankshaft tend to become slightly slower. Winston, the slowest, likely uses techniques that were slow in Crankshaft but are fast in Turbofan, while Pino was specifically optimized to use the fastest Crankshaft techniques — so Pino speeds up too, but to a much smaller extent.

Conclusion

Several of the benchmarks show that cases which are slow in V8 5.1, 5.8, and 5.9 become faster once Turbofan is fully enabled in V8 6.0 and 6.1, while cases that were fast become slower — with the two often converging toward a middle ground.

Much of this comes down to the cost of making function calls in Turbofan (V8 6.0 and later). The idea behind Turbofan is to optimize for the common case and eliminate the most commonly used "V8 killers." This yields a net performance gain for browser (Chrome) and server (Node) applications. The trade-off seems to be (at least initially) a speed decrease for the cases that were previously the most optimized. Our logger benchmark comparison suggests that Turbofan improves performance across the board, even for very different codebases (e.g. Winston vs Pino).

If you've been keeping an eye on JavaScript performance for a while and adapting your coding behavior to the quirks of the underlying engine, it's nearly time to unlearn a few techniques. If you've focused on best practices and writing generally good JavaScript, then well done: thanks to the V8 team's tireless efforts, a performance reward is coming.

For all the source code and another copy of this article, visit github.com/davidmarkcl… For the raw data behind this article, please visit: docs.google.com/spreadsheet…