• Maybe you don’t need Rust and WASM to speed up your JS — Part 2
  • Original author: Vyacheslav Egorov
  • The Nuggets translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: geniusq1981
  • Proofread by D-Kylin and Leviding

The following is part 2 of this series. If you haven’t read part 1 yet, please start with Perhaps you don’t need Rust and WASM to improve JS execution efficiency – Part 1.

I have tried three different methods to decode Base64 VLQ segments.
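All three rely on base64VLQ.decode to unpack individual numbers. For context, here is a minimal sketch of standard source-map VLQ decoding, assuming the usual scheme of 6-bit base64 digits with a continuation bit and the sign in the lowest bit of the result; the real implementation lives in the library’s base64-vlq module, and the (aStr, index, out) signature below simply mirrors how decode is called in the snippets that follow:

const BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decodes one VLQ value from aStr starting at aIndex, writing the decoded
// value and the index of the next character into aOutParam.
function vlqDecode(aStr, aIndex, aOutParam) {
    let result = 0, shift = 0, continuation;
    do {
        const digit = BASE64.indexOf(aStr.charAt(aIndex++));
        continuation = digit & 0x20;         // the 6th bit marks continuation
        result += (digit & 0x1f) << shift;   // the lower 5 bits carry data
        shift += 5;
    } while (continuation);
    // The lowest bit of the unpacked value is the sign bit.
    aOutParam.value = (result & 1) ? -(result >>> 1) : (result >>> 1);
    aOutParam.rest = aIndex;
}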

The first is decodeCached, which is exactly the same as the default implementation used by Source-Map — I’ve listed it above:

function decodeCached(aStr) {
    var length = aStr.length;
    var cachedSegments = {};
    var end, str, segment, value, temp = {value: 0, rest: 0};
    const decode = base64VLQ.decode;

    var index = 0;
    while (index < length) {
        // Because each offset is encoded relative to the previous one,
        // many segments often have the same encoding. We can exploit this
        // fact by caching the parsed variable length fields of each segment,
        // allowing us to avoid a second parse if we encounter the same
        // segment again.
        for (end = index; end < length; end++) {
            if (_charIsMappingSeparator(aStr, end)) {
                break;
            }
        }
        str = aStr.slice(index, end);

        segment = cachedSegments[str];
        if (segment) {
            index += str.length;
        } else {
            segment = [];
            while (index < end) {
                decode(aStr, index, temp);
                value = temp.value;
                index = temp.rest;
                segment.push(value);
            }

            if (segment.length === 2) {
                throw new Error('Found a source, but no line and column');
            }

            if (segment.length === 3) {
                throw new Error('Found a source and line, but no column');
            }

            cachedSegments[str] = segment;
        }

        index++;
    }
}

The second is decodeNoCaching. It is essentially decodeCached without the caching: each segment is decoded individually, and segments are stored in an Int32Array instead of a plain Array.

function decodeNoCaching(aStr) {
    var length = aStr.length;
    var end, temp = {value: 0, rest: 0};
    const decode = base64VLQ.decode;

    var index = 0, value;
    var segment = new Int32Array(5);
    var segmentLength = 0;
    while (index < length) {
        segmentLength = 0;
        while (!_charIsMappingSeparator(aStr, index)) {
            decode(aStr, index, temp);
            value = temp.value;
            index = temp.rest;
            if (segmentLength >= 5) {
                throw new Error('Too many segments');
            }
            segment[segmentLength++] = value;
        }

        if (segmentLength === 2) {
            throw new Error('Found a source, but no line and column');
        }

        if (segmentLength === 3) {
            throw new Error('Found a source and line, but no column');
        }

        index++;
    }
}

Finally, the third is decodeNoCachingNoString, which avoids dealing with JavaScript strings altogether by converting the string into a UTF-8 encoded Uint8Array. This optimization is motivated by the fact that JS VMs are more likely to compile array loads down to plain memory accesses; because the string representations a JS VM has to support are layered and complex, optimizing String.prototype.charCodeAt to the same degree is much harder.

I compared two variants: one that encodes the string to UTF-8 on the fly, and one that operates on a pre-encoded string. With the latter "optimized" variant, I wanted to estimate how much we could gain by skipping the string-to-array conversion entirely, which would be possible by loading the source map into an array buffer and parsing it directly from that buffer, rather than converting it to a string first.

let encoder = new TextEncoder();
function decodeNoCachingNoString(aStr) {
    decodeNoCachingNoStringPreEncoded(encoder.encode(aStr));
}

function decodeNoCachingNoStringPreEncoded(arr) {
    var length = arr.length;
    var end, temp = {value: 0, rest: 0};
    const decode2 = base64VLQ.decode2;

    var index = 0, value;
    var segment = new Int32Array(5);
    var segmentLength = 0;
    while (index < length) {
        segmentLength = 0;
        while (arr[index] != 59 && arr[index] != 44) {  // 59 === ';', 44 === ','
            decode2(arr, index, temp);
            value = temp.value;
            index = temp.rest;
            if (segmentLength < 5) {
                segment[segmentLength++] = value;
            }
        }

        if (segmentLength === 2) {
            throw new Error('Found a source, but no line and column');
        }

        if (segmentLength === 3) {
            throw new Error('Found a source and line, but no column');
        }

        index++;
    }
}

Here are my benchmark results from running Chrome Dev 66.0.3343.3 (V8 6.6.189) and Firefox Nightly 60.0a1:

A few points to note:

  • On both V8 and SpiderMonkey, the version that uses caching is slower than the others: its performance degrades sharply as the number of cached segments grows, while the cache-free versions do not suffer from this;
  • On SpiderMonkey, converting the string into a typed array and parsing that pays off, whereas on V8 direct character access is fast enough, so arrays pay off only if the string-to-array conversion can be moved out of the measured baseline (for example, if you load your data into a typed array to begin with).

I suspected that the V8 team had not invested in charCodeAt performance in recent years; I distinctly recall that Crankshaft made no effort to specialize charCodeAt for the string representation at a given call site, instead expanding it into a large chunk of code that handled every string representation, which made loading characters from a string slower than loading elements from a typed array.

I browsed the V8 issue tracker and found the following issues:

  • Issue 6391: StringCharCodeAt slower than Crankshaft;
  • Issue 7092: High overhead of String.prototype.charCodeAt in typescript test;
  • Issue 7326: Performance degradation when looping across character codes of a string;

Comments on these issues reference V8 releases from late January 2018, indicating that charCodeAt performance improvements were being actively pursued. Out of curiosity, I decided to re-run my benchmark in Chrome Beta and compare it against Chrome Dev.

The comparison shows that all of those V8 commits bore fruit: charCodeAt performance improved dramatically between V8 6.5.254.21 and 6.6.189. Comparing the "no cache" and "using array" lines, we can see that on the older V8 charCodeAt performed much worse, so simply converting the string to a Uint8Array to speed up access paid off, while on the newer V8 doing that conversion inside the parser no longer brings any benefit.

However, if you can use arrays from the start, without performing the conversion, there is still a performance benefit. Why is that? To answer this question, I ran the following code in V8:

function foo(str, i) {
    return str.charCodeAt(i);
}

let str = "fisk";

foo(str, 0);
foo(str, 0);
foo(str, 0);
%OptimizeFunctionOnNextCall(foo);
foo(str, 0);
╰─$ out.gn/x64.release/d8 --allow-natives-syntax --print-opt-code --code-comments x.js

This command produced a huge assembly listing, which confirmed my suspicion: V8's charCodeAt still does not specialize for the particular string representation at a call site. This weakness seems to stem from this code in V8, and it explains why array access is faster than charCodeAt on strings.

Improving parsing

Based on these findings, we can remove the cache of parsed segments from the source-map parsing code and measure the impact.

As our benchmark predicted, caching is detrimental to overall performance: removing it greatly improves parsing time.

Optimizing sorting – algorithmic improvements

Now that we’ve improved parsing performance, let’s look at sorting again.

There are two arrays being sorted:

  1. originalMappings, sorted with the compareByOriginalPositions comparator;
  2. generatedMappings, sorted with the compareByGeneratedPositionsDeflated comparator.

Optimizing the originalMappings sort

I looked at the first comparator, compareByOriginalPositions:

function compareByOriginalPositions(mappingA, mappingB, onlyCompareOriginal) {
    var cmp = strcmp(mappingA.source, mappingB.source);
    if (cmp !== 0) {
        return cmp;
    }

    cmp = mappingA.originalLine - mappingB.originalLine;
    if (cmp !== 0) {
        return cmp;
    }

    cmp = mappingA.originalColumn - mappingB.originalColumn;
    if (cmp !== 0 || onlyCompareOriginal) {
        return cmp;
    }

    cmp = mappingA.generatedColumn - mappingB.generatedColumn;
    if (cmp !== 0) {
        return cmp;
    }

    cmp = mappingA.generatedLine - mappingB.generatedLine;
    if (cmp !== 0) {
        return cmp;
    }

    return strcmp(mappingA.name, mappingB.name);
}

I noticed that mappings are sorted first by the source component and only then by the other components; source specifies which source file a mapping came from. The obvious idea is that instead of mixing mappings from different source files in one large originalMappings array, we can turn originalMappings into a collection of arrays, where originalMappings[i] contains all mappings from the i-th source file. We can then bucket mappings into the appropriate originalMappings[i] array as we parse them, and finally sort the individual, smaller arrays.

This is in fact a bucket sort: https://en.wikipedia.org/wiki/Bucket_sort.

This is what we did in the parsing loop:

if (typeof mapping.originalLine === 'number') {
    // This code used to just do: originalMappings.push(mapping).
    // Now it sorts original mappings already by source during parsing.
    let currentSource = mapping.source;
    while (originalMappings.length <= currentSource) {
        originalMappings.push(null);
    }
    if (originalMappings[currentSource] === null) {
        originalMappings[currentSource] = [];
    }
    originalMappings[currentSource].push(mapping);
}

After that:

var startSortOriginal = Date.now();
// The code used to sort the whole array:
//     quickSort(originalMappings, util.compareByOriginalPositions);
for (var i = 0; i < originalMappings.length; i++) {
    if (originalMappings[i] != null) {
        quickSort(originalMappings[i], util.compareByOriginalPositionsNoSource);
    }
}
var endSortOriginal = Date.now();

The compareByOriginalPositionsNoSource comparator is almost identical to compareByOriginalPositions, except that it no longer compares the source component: the way we constructed the originalMappings[i] arrays guarantees that all mappings in one array have equal source.
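For reference, a sketch of what that comparator can look like, assuming it simply mirrors compareByOriginalPositions with the source comparison dropped (the memory-based variant actually used later in this article appears below):

function compareByOriginalPositionsNoSource(mappingA, mappingB, onlyCompareOriginal) {
    var cmp = mappingA.originalLine - mappingB.originalLine;
    if (cmp !== 0) {
        return cmp;
    }

    cmp = mappingA.originalColumn - mappingB.originalColumn;
    if (cmp !== 0 || onlyCompareOriginal) {
        return cmp;
    }

    cmp = mappingA.generatedColumn - mappingB.generatedColumn;
    if (cmp !== 0) {
        return cmp;
    }

    cmp = mappingA.generatedLine - mappingB.generatedLine;
    if (cmp !== 0) {
        return cmp;
    }

    return strcmp(mappingA.name, mappingB.name);
}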

This algorithmic change improved sorting speed on both V8 and SpiderMonkey, and also improved parsing speed on V8.

The parsing speedup comes from how the originalMappings arrays are produced: generating a single huge originalMappings array is more expensive than generating multiple smaller originalMappings[i] arrays. However, this is just my guess, without any rigorous analysis behind it.

Optimizing the generatedMappings sort

Now let's look at generatedMappings and the compareByGeneratedPositionsDeflated comparator.

function compareByGeneratedPositionsDeflated(mappingA, mappingB, onlyCompareGenerated) {
    var cmp = mappingA.generatedLine - mappingB.generatedLine;
    if (cmp !== 0) {
        return cmp;
    }

    cmp = mappingA.generatedColumn - mappingB.generatedColumn;
    if (cmp !== 0 || onlyCompareGenerated) {
        return cmp;
    }

    cmp = strcmp(mappingA.source, mappingB.source);
    if (cmp !== 0) {
        return cmp;
    }

    cmp = mappingA.originalLine - mappingB.originalLine;
    if (cmp !== 0) {
        return cmp;
    }

    cmp = mappingA.originalColumn - mappingB.originalColumn;
    if (cmp !== 0) {
        return cmp;
    }

    return strcmp(mappingA.name, mappingB.name);
}

Here the comparator looks at generatedLine first. There are generally far more distinct generated lines than there are source files, so it does not make sense to split generatedMappings into per-line arrays.

However, when I looked at the parsing code, I noticed the following:

while (index < length) {
    if (aStr.charAt(index) === ';') {
        generatedLine++;
        // ...
    } else if (aStr.charAt(index) === ',') {
        // ...
    } else {
        mapping = new Mapping();
        mapping.generatedLine = generatedLine;
        // ...
    }
}

This is the only place in the code where generatedLine is modified, which means generatedLine grows monotonically; in other words, the generatedMappings array is already ordered by generatedLine, so there is no point in sorting the entire array. Instead, we can sort the smaller subarrays that share a generatedLine. Let's change the code as follows:

let subarrayStart = 0;
while (index < length) {
    if (aStr.charAt(index) === ';') {
        generatedLine++;
        // ...

        // Sort subarray [subarrayStart, generatedMappings.length].
        sortGenerated(generatedMappings, subarrayStart);
        subarrayStart = generatedMappings.length;
    } else if (aStr.charAt(index) === ',') {
        // ...
    } else {
        mapping = new Mapping();
        mapping.generatedLine = generatedLine;

        // ...
    }
}
// Sort the tail.
sortGenerated(generatedMappings, subarrayStart);

Instead of using quicksort on the subarrays, I decided to use insertion sort for the small ones, similar to the hybrid strategy some VMs use for Array.prototype.sort.

Note: insertion sort is faster than quicksort when the input array is already sorted, and it turns out that the mappings used for benchmarking are in fact sorted. If we expect generatedMappings to almost always be sorted after parsing, it would be even more efficient to simply check whether it is already sorted before attempting to sort it.
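As an aside, a linear pre-check along those lines could skip the sort entirely when the input arrives ordered; here is a minimal sketch of that idea (my illustration, not part of the actual patch; the real sortGenerated used in the patch follows below):

function sortGeneratedIfNeeded(array, start) {
    // One linear pass; fall back to the real sort on the first inversion.
    for (let i = start + 1; i < array.length; i++) {
        if (compareGenerated(array[i - 1], array[i]) > 0) {
            sortGenerated(array, start);
            return;
        }
    }
    // Already sorted: nothing to do.
}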

const compareGenerated = util.compareByGeneratedPositionsDeflatedNoLine;

function sortGenerated(array, start) {
    let l = array.length;
    let n = array.length - start;
    if (n <= 1) {
        return;
    } else if (n == 2) {
        let a = array[start];
        let b = array[start + 1];
        if (compareGenerated(a, b) > 0) {
            array[start] = b;
            array[start + 1] = a;
        }
    } else if (n < 20) {
        for (let i = start; i < l; i++) {
            for (let j = i; j > start; j--) {
                let a = array[j - 1];
                let b = array[j];
                if (compareGenerated(a, b) <= 0) {
                    break;
                }
                array[j - 1] = b;
                array[j] = a;
            }
        }
    } else {
        quickSort(array, compareGenerated, start);
    }
}

This produces the following results:

Sorting time drops dramatically, while parsing time increases slightly; this is because the code now sorts generatedMappings as part of the parsing loop, which makes our parse/sort breakdown slightly blurry. Let's compare the total improvement (parsing and sorting together).

Total improvement time

It is now clear that we have greatly improved overall map parsing performance.

Is there anything else we can do to improve performance?

Yes, there is: we can take a page out of the asm.js/WASM playbook while staying entirely within JavaScript, instead of switching to Rust.

Optimizing parsing – reducing GC pressure

We are allocating thousands of Mapping objects, which puts considerable pressure on the GC, even though we do not really need these objects: we can pack them into a typed array instead. Here is how I did it.

A few years ago I was very excited about the Typed Objects proposal, which would allow JavaScript programmers to define structs and arrays of structs and lots of other amazing, very convenient things. Unfortunately, the champions behind the proposal moved on to other work, leaving us to either build such things ourselves or write them in C++.

First, I changed Mapping from a plain object into a wrapper that points into a typed array holding all of our mappings.

function Mapping(memory) {
    this._memory = memory;
    this.pointer = 0;
}
Mapping.prototype = {
    get generatedLine () {
    return this._memory[this.pointer + 0];
    },
    get generatedColumn () {
    return this._memory[this.pointer + 1];
    },
    get source () {
    return this._memory[this.pointer + 2];
    },
    get originalLine () {
    return this._memory[this.pointer + 3];
    },
    get originalColumn () {
    return this._memory[this.pointer + 4];
    },
    get name () {
    return this._memory[this.pointer + 5];
    },
    set generatedLine (value) {
    this._memory[this.pointer + 0] = value;
    },
    set generatedColumn (value) {
    this._memory[this.pointer + 1] = value;
    },
    set source (value) {
    this._memory[this.pointer + 2] = value;
    },
    set originalLine (value) {
    this._memory[this.pointer + 3] = value;
    },
    set originalColumn (value) {
    this._memory[this.pointer + 4] = value;
    },
    set name (value) {
    this._memory[this.pointer + 5] = value;
    }
};
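To make the pointer mechanics concrete, here is a small usage illustration of my own (not from the patch): a single wrapper is re-pointed at any record by setting pointer, since each mapping occupies six consecutive Int32 slots:

// Hypothetical illustration of how one wrapper object is reused.
let memory = new Int32Array(6 * 100);  // room for 100 mappings
let mapping = new Mapping(memory);

mapping.pointer = 6 * 42;       // point the wrapper at the 43rd record...
mapping.generatedLine = 10;     // ...then read and write its fields
mapping.generatedColumn = 4;
console.log(mapping.generatedLine);  // 10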

I then adjusted the parsing and sorting code as follows:

BasicSourceMapConsumer.prototype._parseMappings = function (aStr, aSourceRoot) {
    // Allocate a 4 MB memory buffer. This can be proportional to aStr size to
    // save memory for smaller mappings.
    this._memory = new Int32Array(1 * 1024 * 1024);
    this._allocationFinger = 0;
    let mapping = new Mapping(this._memory);
    // ...
    while (index < length) {
        if (aStr.charAt(index) === ';') {

            // All code that could previously access mappings directly now needs to
            // access them indirectly through memory.
            sortGenerated(this._memory, generatedMappings, previousGeneratedLineStart);
        } else {
            this._allocateMapping(mapping);

            // ...

            // Arrays of mappings now store "pointers" instead of actual mappings.
            generatedMappings.push(mapping.pointer);
            if (segmentLength > 1) {
                // ...
                originalMappings[currentSource].push(mapping.pointer);
            }
        }
    }

    // ...

    for (var i = 0; i < originalMappings.length; i++) {
        if (originalMappings[i] != null) {
            quickSort(this._memory, originalMappings[i],
                      util.compareByOriginalPositionsNoSource);
        }
    }
};

BasicSourceMapConsumer.prototype._allocateMapping = function (mapping) {
    let start = this._allocationFinger;
    let end = start + 6;
    if (end > this._memory.length) {  // Do we need to grow the memory buffer?
        let memory = new Int32Array(this._memory.length * 2);
        memory.set(this._memory);
        this._memory = memory;
    }
    this._allocationFinger = end;
    let memory = this._memory;
    mapping._memory = memory;
    mapping.pointer = start;
    mapping.name = 0x7fffffff;    // Instead of null use INT32_MAX.
    mapping.source = 0x7fffffff;  // Instead of null use INT32_MAX.
};

exports.compareByOriginalPositionsNoSource =
    function (memory, mappingA, mappingB, onlyCompareOriginal) {
    var cmp = memory[mappingA + 3] - memory[mappingB + 3];  // originalLine
    if (cmp !== 0) {
        return cmp;
    }

    cmp = memory[mappingA + 4] - memory[mappingB + 4];  // originalColumn
    if (cmp !== 0 || onlyCompareOriginal) {
        return cmp;
    }

    cmp = memory[mappingA + 1] - memory[mappingB + 1];  // generatedColumn
    if (cmp !== 0) {
        return cmp;
    }

    cmp = memory[mappingA + 0] - memory[mappingB + 0];  // generatedLine
    if (cmp !== 0) {
        return cmp;
    }

    return memory[mappingA + 5] - memory[mappingB + 5];  // name
};

As you can see, readability suffers considerably. Ideally, I would allocate a temporary Mapping object whenever I need to work with the fields of an individual record. However, that style of code would depend heavily on the VM's ability to eliminate those temporary wrapper allocations via _Allocation Sinking_, _Scalar Replacement_, or other similar optimizations. Unfortunately, SpiderMonkey did not handle such code well in my experiments, so I opted for the more verbose and error-prone style.

This almost-manual approach to memory management is rare in JS, which is why I think it is worth mentioning that the "oxidized" source-map also requires users to manually manage its lifecycle, to ensure that the WASM resources are released.
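For comparison, this is roughly what that manual lifecycle management looks like with the oxidized source-map 0.7.x API, sketched from its documented usage rather than taken from this article:

const { SourceMapConsumer } = require('source-map');

async function query(rawSourceMap) {
    const consumer = await new SourceMapConsumer(rawSourceMap);
    try {
        // ... query the consumer ...
    } finally {
        // Without this call the WASM-side resources are leaked.
        consumer.destroy();
    }
}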

Re-running the benchmark confirmed that alleviating GC pressure produces a nice improvement.

Interestingly, on SpiderMonkey, this approach improved both parsing and sorting, which was a surprise to me.

SpiderMonkey performance cliff

While working with this code, I also discovered a puzzling performance cliff in SpiderMonkey: when I increased the size of the preallocated memory buffer from 4 MB to 64 MB to measure the cost of reallocation, the benchmark showed a sudden drop in performance after the 7th iteration.

This looked like some sort of polymorphism, but I could not immediately figure out how changing the size of an array could lead to polymorphic behavior.

Puzzled, I tracked down a SpiderMonkey hacker, Jan de Mooij, who quickly identified the culprit: an asm.js-related optimization from 2012. He then removed it from SpiderMonkey so that nobody would ever run into it again.

Optimizing parsing – using Uint8Array instead of strings

Finally, by parsing from a Uint8Array instead of a string, we can squeeze out one more small improvement.

This improvement requires rewriting source-map to parse the map directly from typed arrays, rather than using JavaScript strings and JSON.parse. I have not attempted such a rewrite, but I do not expect any problems with it.
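To give a flavor of what such a rewrite could look like, here is a sketch of a possible entry point; everything in it is my own illustration rather than the library's API. It relies on the fact that the base64 VLQ alphabet contains no quotes or backslashes, so the value of the "mappings" key can be located with a plain byte scan (assuming the usual tool-generated JSON without whitespace around the key):

// Locate the byte range of the "mappings" value inside the raw JSON bytes.
function findMappingsRange(bytes) {
    const key = new TextEncoder().encode('"mappings":"');
    outer: for (let i = 0; i <= bytes.length - key.length; i++) {
        for (let j = 0; j < key.length; j++) {
            if (bytes[i + j] !== key[j]) continue outer;
        }
        const start = i + key.length;
        let end = start;
        while (bytes[end] !== 34) end++;  // 34 === '"'
        return { start, end };
    }
    throw new Error('no "mappings" key found');
}

async function consumeFromResponse(response) {
    const bytes = new Uint8Array(await response.arrayBuffer());
    const { start, end } = findMappingsRange(bytes);
    // Feed the raw bytes straight into the byte-based decoder shown earlier,
    // never materializing the mappings as a JavaScript string.
    decodeNoCachingNoStringPreEncoded(bytes.subarray(start, end));
}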

Overall improvement over the baseline

Here is where we started:

$ d8 bench-shell-bindings.js
...
[Stats samples: 5, total: 24050 ms, mean: 4810 ms, stddev: 155.91063145276527 ms]
$ sm bench-shell-bindings.js
...
[Stats samples: 7, total: 22925 ms, mean: 3275 ms, stddev: 269.5999093306804 ms]

And here is where we finished:

$ d8 bench-shell-bindings.js
...
[Stats samples: 22, total: 25158 ms, mean: 1143.5454545454545 ms, stddev: … ms]
$ sm bench-shell-bindings.js
...
[Stats samples: 31, total: 25247 ms, mean: 814.4193548387096 ms, stddev: 5.591064299397745 ms]

That’s a four-fold improvement in performance!

It is worth noting that we are still sorting all of the originalMappings arrays eagerly, even though this is not strictly required. Only two operations use originalMappings:

  • allGeneratedPositionsFor, which returns all generated positions for a given original line;
  • eachMapping(..., ORIGINAL_ORDER), which iterates over all mappings in original order.

If we assume that allGeneratedPositionsFor is the most common operation, and that we typically search in only a handful of the originalMappings[i] arrays, then we could further improve parsing time by sorting each originalMappings[i] array lazily, the first time we actually need to search it.
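A minimal sketch of that lazy approach (again my own illustration of the idea, not code from the patch):

// Sort each originalMappings[i] bucket only the first time it is queried.
const sortedBuckets = new Set();

function getOriginalMappingsFor(sourceIndex) {
    const bucket = originalMappings[sourceIndex];
    if (bucket != null && !sortedBuckets.has(sourceIndex)) {
        quickSort(bucket, util.compareByOriginalPositionsNoSource);
        sortedBuckets.add(sourceIndex);
    }
    return bucket;
}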

Finally, the "V8 Jan 19" and "V8 Feb 19" results correspond to builds with and without untrusted code mitigations, respectively.

Comparison with the oxidized source-map version

Since this post was published on February 19, I have received a few requests to compare my modified source-map against the mainline oxidized source-map, which uses Rust and WASM.

A quick look at the Rust source of parse_mappings shows that the Rust version does not sort the original mappings; it only builds and sorts the equivalent of generatedMappings. To match this behavior, I adjusted my JS version by commenting out the sorting of the originalMappings[i] arrays.

Here are the comparison results for parsing alone (which also includes sorting generatedMappings), and for parsing plus iterating over all generatedMappings.

Note that this comparison is a bit misleading, because the Rust version does not optimize the generatedMappings sorting the way my JS version does.

So I would not claim that "we have successfully reached the level of the Rust+WASM version." However, at this level of performance difference, it may be worth re-evaluating whether using Rust in source-map is really worth it.

Updated (February 27, 2018)

source-map author Nick Fitzgerald has ported the algorithm described in this article to the Rust+WASM version. Here is the comparative performance chart for parsing and iteration:

As you can see, the WASM+Rust version is now about 15% faster on SpiderMonkey, and about the same speed on V8.

Learnings

For JavaScript developers

Profilers are your friends

Profiling and performance tracing, in their various forms, are the best way to achieve high performance. They let you locate the hot spots in your code and reveal what the runtime is doing with it. For this reason, do not shy away from low-level profiling tools like perf: "friendly" tools may not be telling you the whole story, because they hide the lowest level.

Different performance problems call for different approaches to profiling and to visualizing the collected results. Make sure you are familiar with the range of available tools.

Algorithms are important

Being able to reason about your code in terms of abstract complexity is an important skill. Which is faster: quicksorting one array of 100,000 elements, or quicksorting 3,333 subarrays of 30 elements each?

The math points the way (100,000 × log 100,000 is roughly three times larger than 3,333 × (30 × log 30)), and the larger the data, the more important being able to do this math usually becomes.
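A quick back-of-the-envelope check of that claim (base-2 logarithms, constant factors ignored):

// Comparison counts for one big quicksort vs. many small ones, up to constants.
const big   = 100000 * Math.log2(100000);   // ≈ 1,660,964
const small = 3333 * (30 * Math.log2(30));  // ≈ 490,640
console.log((big / small).toFixed(1));      // ≈ 3.4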

In addition to understanding logarithms, you need some common sense and the ability to evaluate how your code will be used in the average and worst cases: which operations are frequent, how the cost of expensive operations can be amortized, and what the penalty for expensive amortized operations might be.

VMs have problems too. Talk to the developers!

Do not hesitate to discuss strange performance issues with the VM developers. Not everything can be solved by changing your own code. As a Russian proverb goes, "it is not gods who bake the pots." VM developers are human, they make mistakes too, and they are also pretty good at fixing problems once those are pinned down. One email, chat message, or DM may save you days of debugging through foreign C++ code.

The virtual machine still needs a little help

Sometimes you also need to write some low-level code or understand some low-level implementation details to help extract the last shred of JavaScript performance.

One might wish for better language-level tools to do this; whether we will get them remains to be seen.

For language implementers/designers

Smart optimizations must be diagnosable

If your runtime has built-in smart optimizations, you need to provide a straightforward tool to diagnose when those optimizations fail and to give developers actionable feedback.

For a language like JavaScript, this means at the very least that profilers should be able to attribute costs to individual operations and tell you whether the VM optimized them well or poorly, and why.

This sort of introspection must not require building a custom patched version of the VM that dumps piles of unreadable debug output; it should be available whenever you open your debugging tools.

Language and optimization must be friends

Finally, as a language designer, you should try to anticipate which features your language is missing that would make it easier to write well-performing code. Are users in the market for manually laying out and managing memory? I am sure they are. If most users of your language end up writing poorly performing code, the situation can only be improved by adding language features or by other means, for example through more sophisticated optimizations, or by asking users to rewrite their code in Rust.

Here is a general rule for language design: if you add new features to your language, make sure they have reasonable performance and are easy to understand and diagnose. Aim for optimizations that cover the whole language, rather than leaving infrequently used, non-core features to perform poorly.

Afterword

The optimizations we found in this article fall roughly into three parts:

  1. Algorithmic improvements;
  2. Optimizations completely independent of the VM and potentially of the language;
  3. Optimizations specific to V8.

No matter what programming language you use, you need to think about algorithmic performance. Using a bad algorithm is easier to notice in an inherently "slower" programming language, but merely switching to a "faster" language while keeping the same algorithm may mask the problem without addressing its root cause. A large portion of this article was devoted to this category:

  • Sorting subarrays is better than sorting one large array;
  • Weighing the pros and cons of caching.

The second category is monomorphization. Performance degradation due to polymorphism is not unique to V8, nor is it a JS-specific problem. You can apply monomorphization across implementations and even across languages; some languages (Rust, in fact) essentially do it for you inside the compiler.
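To illustrate with a generic example of my own (not from the article): a call site that only ever sees one object shape stays monomorphic, while feeding it several shapes makes it polymorphic and forces the VM to dispatch on the shape at every property load:

// Monomorphic: every call sees the same object shape {x, y}.
function len2(p) { return p.x * p.x + p.y * p.y; }
for (let i = 0; i < 1e6; i++) len2({x: i, y: i});

// Polymorphic: the same call site now sees three different shapes
// (property order matters to the VM), so each property access must
// first check which shape it is dealing with.
const shapes = [{x: 1, y: 2}, {y: 2, x: 1, z: 3}, {x: 1, y: 2, w: 4}];
for (let i = 0; i < 1e6; i++) len2(shapes[i % 3]);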

The last and most controversial category is arguments adaptation.

Finally, there is the optimization that changes the representation of mappings (packing individual objects into a single typed array), which spans all three categories above. It is informed by the limitations and costs of garbage-collected systems, and by the specific optimizations JS VMs do and do not perform.

So… why did I choose this title? Because I firmly believe that the problems in the third category will be fixed over time, while the techniques in the other categories can be applied in any programming language.

Obviously, every developer and every team is free to choose between spending N hours analyzing, reading, and thinking about their JavaScript code, or M hours rewriting their stuff in X.

But: (a) everyone needs to be fully aware that this choice exists; and (b) language designers and implementers should work together to make such choices less and less necessary, that is, to improve language features and tooling so that "category 3" optimizations are rarely needed.

If you find any mistakes in this translation or other areas that need improvement, you are welcome to revise it and submit a PR to the Nuggets Translation Project, for which you can earn reward points. The permanent link at the beginning of this article is the Markdown link to this article on GitHub.


The Nuggets Translation Project is a community that translates high-quality technical articles from the Internet, sharing English articles on Nuggets. Its content covers Android, iOS, front-end, back-end, blockchain, product, design, artificial intelligence, and other fields. For more high-quality translations, please follow the Nuggets Translation Project, its official Weibo account, and its Zhihu column.