Johnhax, the senior he Shijun I follow, recently published such a microblog:

The original weibo

Although this micro blog did not attract a wide range of attention and discussion, but as a new person, I fell into thinking. What magic did the V8 engine do to achieve the ultimate optimization?

In this article, we will analyze the secrets behind these optimizations. Hope that the god to give the correct and guidance, at the same time to the reader inspired and help.

What the hell are we talking about?

The ECMAScript 2015 language standard specification introduces several new data structures: Such as Maps and Sets (of course, there are similar to WeakMaps and WeakSets, etc.), and these newly introduced data structures have a common feature, that is, they can be iterated according to the same newly introduced iteration protocol. This means you can use for… The of loop or extension operator operates. For a simple example of Sets:

const s = new Set([1, 2, 3, 4]); console.log(... s); // 1, 2, 3, 4for(const x of s) console.log(x); // 1/2/3/4Copy the code

Also for Maps:

const m = new Map([[1, "1"], [2, "2"], [3, "3"], [4, "4"]]); console.log(... m); / / (2) [1,"1"[2] (2),"2"[3] (2),"3"[4] (2),"4"] console.log(... m.keys()); // 1 2 3 4 console.log(... m.values()); // 1, 2, 3, 4for (const [x, y] of m) console.log(x, "is", y); / / 1"is" "1"/ / 2"is" "2"/ / 3"is" "3"/ / 4"is" "4"Copy the code

These two simple examples illustrate the most basic usage. Interested readers can refer to the ECMA specifications: Map Iterator Objects and Set Iterator Objects.

Unfortunately, these traversable data structures are not optimized in the V8 engine. In other words, the implementation details are poorly optimized. Including ECMAScript member Brian Terlson, who complained on Twitter that he was having annoying performance issues implementing a regular engine using Sets.

So, it’s time to optimize them! But before we start tuning, we need to thoroughly understand one question: what is the real reason why these data structures are processing slowly? Where are the performance bottlenecks? To understand this, we need to understand how iterators work in the underlying implementation.

Dig deep into the subject

To do this, let’s start with a simple for… Of cycle.

ES6 uses C++, Java, C#, and Python to introduce for… The of loop, as a unified way to traverse all data structures.

One of the simplest uses:

function sum(iterable) {
  let x = 0;
  for (const y of iterable) x += y;
  return x;
}Copy the code

Compiling this code using Babel gives us:

function sum(iterable) {
  var x = 0;
  var _iteratorNormalCompletion = true;
  var _didIteratorError = false;
  var _iteratorError = undefined;

  try {
    for(var _iterator = iterable[Symbol.iterator](), _step; ! (_iteratorNormalCompletion = (_step = _iterator.next()).done); _iteratorNormalCompletion =true) {
      var y = _step.value;
      x += y;
    }
  } catch (err) {
    _didIteratorError = true;
    _iteratorError = err;
  } finally {
    try {
      if(! _iteratorNormalCompletion && _iterator.return) { _iterator.return(); } } finally {if(_didIteratorError) { throw _iteratorError; }}}return x;
}Copy the code

One fact that needs to be told is that modern JavaScript engines are essentially the same as Babel for… The degrammaticalization of of is the same, only there are some differences in specific details.

Where this fact comes from:

All modern JavaScript engines essentially perform the same desugaring that Babel does here to implement for-of (details vary)

The above quote is from Benedikt Meurer, a Google engineer who is a core member of V8.

But the code compiled by Babel above is not easy to read. Don’t worry, I’ve cut out the annoying exception handling and kept only the core logic for you to study:

function sum(iterable) {
  var x = 0;
  var iterator = iterable[Symbol.iterator]();
  for (;;) {
    var iterResult = iterator.next();
    if (iterResult.done) break;
    x += iterResult.value;
  }
  return x;
}Copy the code

The preliminary knowledge needed to understand this code needs to be clear for… Of and symbol. iterator methods:

A data structure that deploys the Symbol. Iterator attribute is considered to have an iterator interface. The of loop iterates through its members. That is to say, for… Inside the of loop is the symbol. iterator method of the data structure.

for… The scope of the of loop includes arrays, Set and Map structures, some array-like objects (such as Arguments objects, DOM NodeList objects), Generator objects, and strings.

A closer look at the previous code shows that the key to iterator performance optimization is to ensure that iterator.next(), which is called multiple times in the loop, is optimized to the maximum and, ideally, to avoid allocating memory to iterResult entirely. Several ways to achieve this are the use of advanced compilation processing techniques such as Store-Load Propagation, escape Analysis, and Scalar replacement of aggregates.

At the same time, optimized compilation also completely eliminates the assignment of calls to the iterator itself — iterable[symbol.iterator] — and operates directly on the iterator’s backing-store.

In fact, such techniques and concepts are used in the optimization of array and string iterators. Specific implementation documents can be found here.

The key reason for the performance problem of the Set iterator is that it is implemented in a mixed JavaScript and C++ environment. For example, let’s look at the implementation of %SetIteratorPrototype%.next() :

function SetIteratorNextJS() {
  if(! IS_SET_ITERATOR(this)) { throw %make_type_error(kIncompatibleMethodReceiver,'Set Iterator.prototype.next', this);
  }

  var value_array = [UNDEFINED, UNDEFINED];
  var result = %_CreateIterResultObject(value_array, false);
  switch (%SetIteratorNext(this, value_array)) {
    case 0:
      result.value = UNDEFINED;
      result.done = true;
      break;
    case ITERATOR_KIND_VALUES:
      result.value = value_array[0];
      break;
    case ITERATOR_KIND_ENTRIES:
      value_array[1] = value_array[0];
      break;
  }

  return result;
}Copy the code

This code actually does a few things step-by-step:

  • Definition allocates a value_array array, initialized with two undefined items;
  • {value: value_array, done: false};
  • Call the C++ %SetIteratorNext runtime function, passing this and value_array of the iterator as arguments.
  • Assign result correctly according to the C++ %SetIteratorNext runtime function

What are the consequences of this? At each step of the walk we are constantly assigning two objects: value_array and result. Neither V8 TurboFan nor Crankshaft (V8’s optimized compiler) could eliminate this operation. To make matters worse, we had to switch between JavaScript and C++ at every step of the walk. Here is a snapshot of the underlying flow of a simple SUM function:

set-iterator-next-js-20170714.png

In V8 execution, you are always in two states (actually more) : executing C++ code and executing JavaScript. The transition between these two states is expensive. The transition from JavaScript to C++ is done using a so-called CEntryStub, which triggers the Runtime_Something function (in this case, Runtime_SetIteratorNext) specified in C++. So, whether you can avoid this conversion, and avoid the allocation of value_array and result objects, again determines performance.

The latest %SetIteratorPrototype%.next() implementation hits the nail on the head and does this “critical” trick. The code we want to execute becomes hot before it is called, and TurboFan is finally heat optimized. With the so-called CodeStubAssembler, Baseline Implementation, access is now fully implemented at the JavaScript level. So we only need to call C++ to do garbage collection (when available memory runs out) and exception handling.

Traversal based on the callback function

In the traversal protocol, JavaScript also provides set.prototype. forEach and map.prototype. forEach methods to receive a callback function. This also relies on the processing logic of C++, which requires us to convert not only JavaScript to C++, but also callback functions to JavaScript. This works as shown below:

set-iterator-foreach-20170714.png

Therefore, the CodeStubAssembler approach above can only handle simple non-callback function scenarios. True full optimization, including turbofanization of the forEach method, requires some magic to be developed.

Optimization lifting Benchmark

We use the following benchmark code to measure the degree of optimization:

const s = new Set;
const m = new Map;
for (let i = 0; i < 1e7; ++i) {
  m.set(i, i);
  s.add(i);
}

function SetForOf() {
  for (const x of s) {}
}

function SetForOfEntries() {
  for (const x of s.entries()) {}
}

function SetForEach() {
  s.forEach(function(key, key, set) {});
}

function MapForOf() {
  for (const x of m) {}
}

function MapForOfKeys() {
  for (const x of m.keys()) {}
}

function MapForOfValues() {
  for (const x of m.values()) {}
}

function MapForEach() {
  m.forEach(function(val, key, map) {});
}

const TESTS = [
    SetForOf,
    SetForOfEntries,
    SetForEach,
    MapForOf,
    MapForOfKeys,
    MapForOfValues,
    MapForEach
];

// Warmup.
for (const test of TESTS) {
  for (let i = 0; i < 10; ++i) test(a); } // Actual tests.for (const test of TESTS) {
  console.time(test.name);
  test(a); console.timeEnd(test.name); }Copy the code

A comparison of Chrome60 and Chrome61 shows the following ICONS:

improvements-20170714.png

It can be seen that although there is a significant improvement, we still get some unsatisfactory results. This is especially true with SetForOfEntries and MapForOf. But this will be dealt with in the longer term.

conclusion

This article is just a broad overview of the traverser performance bottlenecks and existing solutions. Through he Lao’s microblog, I explored a problem and finally found V8 core member Benedikt Meurer’s Article Faster Collection Iterators for reference and translation. A solid computer background, knowledge of how the V8 engine works, and compilation principles are required to fully understand the original article.

Because I have little talent and knowledge, I have been in the industry for less than two years, not to mention the background of the computer major, while learning from others, it is inevitable that there are deviations and omissions in understanding. For more information on V8 engines, check out @JustJavac. For more information on compilation, check out @Yubrook and his JS Conf talk on compile-time optimization in Front-end Engineering. After all, we pay attention to the front-end big V, Internet celebrity is not to watch the fun, watching B, but hope to learn more knowledge from the predecessors, get more inspiration.

Finally, with the rapid growth of ECMAScript, every front-end developer probably feels like they’re constantly learning, even on the run. But beyond learning new features and syntax, understanding deeper content is the key to getting ahead.

I don’t know what the sea is either,

Standing barefoot on the sand,

Eagerly awaiting the dawn.

Happy Coding!

PS: Author Github warehouse and Zhihu Q&A link welcome all forms of communication.