How JavaScript works: memory management + how to handle 4 common memory leaks

Original article: How JavaScript works: memory management + how to handle 4 common memory leaks

A few weeks ago we started a series of blog posts aimed at digging deeper into JavaScript and how it actually works. We think that knowing the building blocks of JavaScript and how they fit together helps you write better code and applications.

The first article in this series focuses on providing an overview of the engine, runtime, and call stack. The second article takes an in-depth look at the internal implementation of Google’s V8 engine and offers some suggestions for writing better JavaScript code.

In this third article, we’ll discuss another critical topic that more and more developers neglect as everyday programming languages grow more mature and complex: memory management. We’ll also share a few guidelines for handling memory leaks in JavaScript that we follow at SessionStack, where we must make sure that SessionStack neither leaks memory nor increases the memory consumption of the web application it is integrated into.

Overview

Languages such as C have low-level memory-management primitives such as malloc() and free(), which developers use to explicitly allocate memory from the operating system and free it back to the operating system.

In contrast, JavaScript allocates memory when things (objects, strings, and so on) are created and “automatically” frees it when they are no longer used, a process called garbage collection. This seemingly “automatic” freeing of resources is a source of confusion and gives JavaScript (and other high-level language) developers the false impression that they can choose not to care about memory management. This is a big mistake.

Even when working with high-level languages, developers should understand memory management (at least at a basic level). Sometimes there are issues with automatic memory management (such as bugs or implementation limitations in garbage collectors) that developers have to understand in order to handle them properly (or to find a suitable workaround with minimal trade-offs and code debt).

Memory life cycle

Regardless of which programming language is used, the memory life cycle is almost always the same: memory is allocated, it is used, and it is released when it is no longer needed.

Here’s an overview of what happens at each step of the cycle:

Allocate memory – Memory is allocated by the operating system, which allows your program to use it. In low-level languages (e.g. C), this is an explicit operation that you as a developer should handle. In high-level languages, it is taken care of for you.
Use memory – This is the stage where your program actually makes use of the previously allocated memory. Read and write operations take place as you use the allocated variables in your code.
Release memory – Now is the time to release the memory you no longer need so that it becomes free and available again. As with allocation, this is an explicit operation in low-level languages.
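The three steps can be sketched in JavaScript itself. This is only an illustration: the variable names are made up for the example, and the "release" step is merely a hint, since JavaScript exposes no free() and the engine decides when the memory is actually reclaimed.

```javascript
// 1. Allocate: the engine reserves memory for the array and its elements.
let samples = [1, 2, 3];

// 2. Use: read from and write to the allocated memory.
samples.push(4);
const total = samples.reduce((sum, value) => sum + value, 0);

// 3. Release: there is no free() in JavaScript. Dropping the last
//    reference only makes the array eligible for garbage collection.
samples = null;

console.log(total); // 10
```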

For a quick look at the concepts of call stacks and memory heaps, read our first article on this topic.

What is memory?

Before jumping straight into memory in JavaScript, we’ll briefly discuss what memory is and how it works.

At the hardware level, computer memory consists of a large number of flip-flops. Each flip-flop contains a few transistors and is capable of storing one bit. Individual flip-flops are addressable by a unique identifier, so we can read and overwrite them. Thus, conceptually, we can think of the entire computer memory as one giant array of bits that we can read and write.

Since, as humans, we are not that good at doing all of our thinking and arithmetic in bits, we organize them into larger groups, which together can represent numbers. Eight bits are called one byte. Beyond bytes, there are words (which are sometimes 16 bits, sometimes 32 bits).

A lot of things are stored in memory:

All variables and other data used by all programs.
Program code, including the operating system.

Compilers and operating systems work together to take care of most of the memory management for you, but we recommend that you take a look at what’s going on under the hood.

When you compile your code, the compiler can examine primitive data types and calculate ahead of time how much memory they will need. The required amount is then allocated to the program in the call stack space. The space in which these variables are allocated is called the stack space because, as functions get called, their memory gets added on top of the existing memory. As they terminate, they are removed in last-in, first-out (LIFO) order. For example, consider the following declarations:

int n; // 4 bytes
int x[4]; // array of 4 elements, each 4 bytes
double m; // 8 bytes

The compiler can immediately see that this code requires 4+4*4+8=28 bytes.
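The same arithmetic can be reproduced in JavaScript with typed arrays, whose element sizes are fixed by the language. This is a sketch under the assumption that a C int maps to a 32-bit integer and a C double to a 64-bit float; the constant names are invented for the example.

```javascript
// Element sizes are fixed for typed arrays, unlike C's platform-dependent types.
const INT_BYTES = Int32Array.BYTES_PER_ELEMENT;      // 4
const DOUBLE_BYTES = Float64Array.BYTES_PER_ELEMENT; // 8

// int n; int x[4]; double m;
const frameBytes = INT_BYTES + 4 * INT_BYTES + DOUBLE_BYTES;

// Reserve exactly that many bytes, as the compiler would on the stack.
const frame = new ArrayBuffer(frameBytes);

console.log(frameBytes, frame.byteLength); // 28 28
```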

This is how it works with the current sizes of integers and doubles. About 20 years ago, integers were typically 2 bytes and doubles 4 bytes. Your code should never depend on the current sizes of the basic data types.

The compiler will insert code to interact with the operating system and request the number of bytes needed to store the variable on the stack.

In the example above, the compiler knows the exact memory address of each variable. In fact, whenever we write to the variable n, this gets translated internally into something like “memory address 4127963”.

Note that if we tried to access x[4] here, we would access the data associated with m. That is because we would be accessing an element of the array that does not exist: it lies 4 bytes beyond the last actually allocated element, x[3], so we may end up reading (or overwriting) some of m’s bits. This would almost certainly have very undesirable consequences for the rest of the program.
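To make the overlap concrete, the layout can be simulated in JavaScript with a DataView over a 28-byte buffer. This is a simplified model, assuming the variables sit back to back with no padding (real compilers may align or reorder them); the offsets and the readX helper are invented for the illustration.

```javascript
// Simulated stack frame: n at bytes 0-3, x at bytes 4-19, m at bytes 20-27.
const stack = new DataView(new ArrayBuffer(28));
const X_OFFSET = 4;  // start of int x[4]
const M_OFFSET = 20; // start of double m

stack.setFloat64(M_OFFSET, 1.0); // m = 1.0

// Unchecked indexing, as in C: the address of x[i] is X_OFFSET + 4 * i.
function readX(index) {
  return stack.getInt32(X_OFFSET + 4 * index);
}

console.log(readX(3));       // 0, the last real element (never written)
console.log(readX(4) !== 0); // true, these bytes actually belong to m
```

JavaScript's own arrays and typed arrays are bounds-checked, so this kind of overlap only shows up here because the example computes the byte offsets by hand.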

When functions call other functions, each gets its own chunk of the stack when it is called. That chunk holds all of its local variables, as well as a program counter that records where execution was. When the function finishes, its memory block becomes available for other purposes again.

Dynamic allocation

Unfortunately, things aren’t so simple when we don’t know how much memory a variable requires at compile time. Suppose we want to do something like this:

int n = readInput(); // reads input from the user
...
// create an array with "n" elements

At compile time, the compiler has no way of knowing how much memory the array requires, because it depends on the user-supplied values.

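Therefore, it cannot allocate room for the variable on the stack. Instead, our program needs to explicitly ask the operating system for the right amount of space at run time. This memory is assigned from the heap space. To summarize the differences: static allocation requires the size to be known at compile time and places data on the stack in last-in, first-out order, whereas dynamic allocation happens at run time, works for sizes unknown at compile time, and places data on the heap.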

To fully understand how dynamic memory allocation works, we would need to spend more time on pointers, which is beyond the scope of this article. If you’re interested in learning more, let us know in the comments, and we can go into more detail about pointers in a future post.
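Returning to the array whose size is only known at run time: in JavaScript every array is allocated dynamically, so the same situation needs no special handling. In the sketch below, readInput is a hypothetical stand-in for real user input.

```javascript
// Hypothetical stand-in for a value that is only known at run time.
function readInput() {
  return 5;
}

const elementCount = readInput();

// The engine allocates the backing storage on the heap at run time.
// A Float64Array holds elementCount zero-initialized 8-byte numbers.
const values = new Float64Array(elementCount);

console.log(values.length, values.byteLength); // 5 40
```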

Allocation in JavaScript

Now we’ll explain how the first step (allocating memory) works in JavaScript.

JavaScript relieves developers of the responsibility of handling memory allocations: it allocates memory by itself when values are declared.

var n = 374; // allocates memory for a number
var s = 'sessionstack'; // allocates memory for a string
var o = {a: 1, b: null}; // allocates memory for an object and its contained values
var a = [1, null, 'str']; // (like object) allocates memory for the array and its contained values

function f(a) {
  return a + 3;
} // allocates a function (which is a callable object)

// function expressions also allocate an object
someElement.addEventListener('click', function() {
  someElement.style.backgroundColor = 'blue';
}, false);

Some function calls also produce object allocation:

var d = new Date(); // allocates a Date object
var e = document.createElement('div'); // allocates a DOM element

Methods can allocate new values or objects:

var s1 = 'sessionstack';
var s2 = s1.substr(0, 3); // s2 is a new string
// Since strings are immutable,
// JavaScript may decide to not allocate memory,
// but just store the [0, 3] range.

var a1 = ['str1', 'str2'];
var a2 = ['str3', 'str4'];
var a3 = a1.concat(a2);
// new array with 4 elements being
// the concatenation of a1 and a2 elements

Using memory in JavaScript

Using allocated memory in JavaScript basically means reading and writing in it.

This can be done by reading or writing the value of a variable or an object property, or even by passing an argument to a function.

Free up memory when it is no longer needed

Most memory management problems occur at this stage.

The hardest task here is figuring out when the allocated memory is no longer needed. It often requires the developer to determine where in the program a piece of memory is no longer needed and to free it.

High-level programming languages embed software called the garbage collector, whose job is to track memory allocation and usage in order to discover when allocated memory is no longer needed, and to automatically free it in that case.

Unfortunately, this process is only an approximation, because the general problem of knowing whether some piece of memory is still needed is undecidable (it cannot be solved by an algorithm).

Most garbage collectors work by collecting memory that can no longer be accessed, for example when all variables pointing to it have gone out of scope. That is, however, an under-approximation of the set of memory that can be collected, because at any point a memory location may still have a variable in scope pointing to it and yet never be accessed again.

Garbage collection

The inability to determine whether certain memory is “no longer needed” limits the general approach to garbage collection. This section will explain the concepts and limitations necessary to understand the major garbage collection algorithms.

Memory references

One of the main concepts garbage collection algorithms rely on is references.

In the context of memory management, an object is said to reference another object if the former has access to the latter (either implicitly or explicitly). For instance, a JavaScript object has a reference to its prototype (implicit reference) and to its property values (explicit references).

In this context, the idea of an “object” is extended to something broader than regular JavaScript objects and also includes function scopes (or the global lexical scope).

Lexical scope specifies how variable names in nested functions are resolved: the inner function contains the scope of the parent function, even if the parent function has returned.
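A small sketch of that rule follows. The names makeCounter and increment are invented for the illustration: the inner function keeps using a variable from its parent's scope after the parent has returned, which also means that scope stays referenced (and therefore in memory) for as long as the inner function is reachable.

```javascript
function makeCounter() {
  let count = 0; // lives in makeCounter's scope

  // The inner function holds a reference to the enclosing scope.
  return function increment() {
    count += 1;
    return count;
  };
}

const next = makeCounter(); // makeCounter has already returned here

console.log(next()); // 1
console.log(next()); // 2
```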

Reference counting garbage collection

This is the simplest garbage collection algorithm. An object is considered “garbage-collectible” if there are zero references pointing to it.

Take a look at this code:

var o1 = {
  o2: {
    x: 1
  }
};
// 2 objects are created.
// 'o2' is referenced by the 'o1' object as one of its properties.
// None can be garbage-collected.

var o3 = o1; // the 'o3' variable is the second thing that
             // has a reference to the object pointed to by 'o1'.

o1 = 1;      // now, the object that was originally in 'o1' has a
             // single reference, embodied by the 'o3' variable.

var o4 = o3.o2; // reference to the 'o2' property of the object.
                // This object now has 2 references: one as
                // a property, the other as the 'o4' variable.

o3 = '374'; // The object that was originally in 'o1' now has zero
            // references to it.
            // It can be garbage-collected.
            // However, what was its 'o2' property is still
            // referenced by the 'o4' variable, so it cannot be
            // freed.

o4 = null; // What was the 'o2' property of the object originally in
           // 'o1' now has zero references to it.
           // It can be garbage-collected.

Cycles create problems

Reference counting has a limitation when it comes to cycles. In the following example, two objects are created that reference one another, thus creating a cycle. They go out of scope after the function call returns, so they are effectively useless and could be freed. The reference-counting algorithm, however, considers that since each of the two objects is referenced at least once, neither can be garbage-collected.

function f() {
  var o1 = {};
  var o2 = {};
  o1.p = o2; // o1 references o2
  o2.p = o1; // o2 references o1. This creates a cycle.
}

f();

Mark-and-sweep algorithm

To decide whether an object is still needed, this algorithm determines whether the object is reachable.

The mark-and-sweep algorithm goes through these three steps:

Roots: In general, roots are global variables that are referenced in the code. In JavaScript, for example, a global variable that can act as a root is the window object. The identical object in Node.js is called global. The garbage collector builds a complete list of all roots.
The algorithm then inspects all roots and their children and marks them as active (meaning they are not garbage). Anything that a root cannot reach is marked as garbage.
Finally, the garbage collector frees all pieces of memory that are not marked as active and returns that memory to the operating system.
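The three steps can be sketched on a toy heap. This is a deliberately simplified model, not how any real engine is implemented: the heap is a Map from made-up object ids to the ids they reference, and "freeing" is just deleting the entry.

```javascript
// Toy heap: each "object" id maps to the ids it references.
const toyHeap = new Map([
  ['window', ['a']],
  ['a', ['b']],
  ['b', []],
  ['orphan', ['b']], // references b, but nothing reachable references it
]);

function markAndSweep(heap, roots) {
  // Steps 1 and 2: start from the roots and mark everything reachable.
  const marked = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const id = pending.pop();
    if (marked.has(id)) continue;
    marked.add(id);
    for (const child of heap.get(id) || []) pending.push(child);
  }

  // Step 3: sweep everything that was not marked.
  const freed = [];
  for (const id of [...heap.keys()]) {
    if (!marked.has(id)) {
      heap.delete(id);
      freed.push(id);
    }
  }
  return freed;
}

console.log(markAndSweep(toyHeap, ['window'])); // [ 'orphan' ]
```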

This algorithm is better than the previous one, since “an object has zero references” implies that the object is unreachable. The opposite is not true, as we saw with cycles.

As of 2012, all modern browsers ship a mark-and-sweep garbage collector. All improvements made in the field of JavaScript garbage collection over the last years (generational, incremental, concurrent, and parallel garbage collection) are implementation improvements of this algorithm (mark-and-sweep); they are neither improvements to the garbage collection algorithm itself nor to its goal of deciding whether an object is reachable.

A separate article covers tracing garbage collection in greater detail, including mark-and-sweep and its optimizations.

Cycles are not a problem anymore

In the cycle example above, after the function call returns, the two objects are no longer referenced by anything reachable from the global object. Consequently, the garbage collector finds them unreachable.

Even though the objects still reference each other, they are not reachable from the root.
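The difference between the two strategies can be checked on a toy model of that cycle. The sketch below is an illustration only: the graph object and both helper functions are invented, and each entry simply lists which other entries it references once f() has returned.

```javascript
// State after f() has returned: the global object references neither o1 nor o2.
const graph = {
  global: [],
  o1: ['o2'], // o1.p = o2
  o2: ['o1'], // o2.p = o1
};

// Reference counting: count incoming references for every object.
function countReferences(objects) {
  const counts = {};
  for (const id of Object.keys(objects)) counts[id] = 0;
  for (const targets of Object.values(objects)) {
    for (const target of targets) counts[target] += 1;
  }
  return counts;
}

// Reachability: collect everything that can be reached from the root.
function reachableFrom(objects, root) {
  const seen = new Set([root]);
  const pending = [root];
  while (pending.length > 0) {
    for (const target of objects[pending.pop()]) {
      if (!seen.has(target)) {
        seen.add(target);
        pending.push(target);
      }
    }
  }
  return seen;
}

const counts = countReferences(graph);
const live = reachableFrom(graph, 'global');

console.log(counts.o1, counts.o2);           // 1 1 (never zero, so never freed)
console.log(live.has('o1'), live.has('o2')); // false false (both get swept)
```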

The counterintuitive behavior of the garbage collector

Although garbage collectors are convenient, they come with their own set of trade-offs. One of them is non-determinism: garbage collection is unpredictable, and you can never know for sure when a collection will be performed. This means that in some cases a program uses more memory than it actually requires. In other cases, short pauses may be noticeable in particularly sensitive applications.

Although non-determinism means you cannot be certain when a collection will be performed, most GC implementations share a common pattern: collection passes happen during allocation. If no allocations are performed, most GCs stay idle. Consider the following scenario:

A sizable set of allocations is performed.
Most of these elements (or all of them) are marked as unreachable (suppose we null a reference pointing to a cache we no longer need).
No further allocations are performed.

In this scenario, most GCs will not run any further collection passes. In other words, even though there are unreachable objects available for collection, the collector does not notice them. These are not strictly leaks, but they still result in higher-than-usual memory usage.

What is a memory leak?

As the name implies, a memory leak is a piece of memory that an application used in the past and no longer needs, but that has not been returned to the operating system or the pool of free memory.

Programming languages favor different ways of managing memory. However, whether a certain piece of memory is still in use is actually an undecidable problem. In other words, only developers can make it clear whether a piece of memory can be returned to the operating system.

Certain programming languages provide features that help developers do this. Others expect developers to state explicitly when a piece of memory is unused. Wikipedia has good articles on manual and automatic memory management.

The four common types of JavaScript leaks

1: Global variables

JavaScript handles undeclared variables in an interesting way: when a value is assigned to an undeclared variable in non-strict code, a new variable gets created on the global object. In a browser, the global object is window, which means that:

function foo(arg) {
    bar = "some text";
}

is equivalent to:

function foo(arg) {
    window.bar = "some text";
}

Let’s say the purpose of bar is only to reference a variable within the function foo. If you don’t declare it with var, however, a redundant global variable is created. In the case above, this won’t cause much harm, but you can surely imagine a more damaging scenario.

You can also accidentally create a global variable using this:

function foo() {
    this.var1 = "potential accidental global";
}
// Foo called on its own, this points to the global object (window)
// rather than being undefined.
foo();

You can avoid all this by adding 'use strict'; at the beginning of your JavaScript file. This switches on a much stricter mode of parsing JavaScript, which prevents the unexpected creation of global variables.
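A minimal sketch of what strict mode changes is shown below. Here the directive is applied to a single function rather than a whole file, and the names strictFoo and undeclaredBar are made up for the example; the assignment that would silently create a global in sloppy mode throws a ReferenceError instead.

```javascript
function strictFoo() {
  'use strict';
  try {
    // No var/let/const: in strict mode this assignment throws
    // instead of creating a property on the global object.
    undeclaredBar = 'some text';
    return 'created a global';
  } catch (error) {
    return error.name;
  }
}

console.log(strictFoo()); // ReferenceError
```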

Accidental globals are certainly an issue, but more often than not your code will be cluttered with explicit global variables, which by definition cannot be collected by the garbage collector. Special attention should be paid to global variables used to temporarily store and process large amounts of information. If you must use a global variable to store data, make sure to assign null to it or reassign it once you are done with it.

2: Forgotten timers or callbacks

Let’s look at an example of setInterval, which is often used in JavaScript.

Libraries that provide observers and other instruments that accept callbacks usually make sure that all references to those callbacks become unreachable once their instances are unreachable too. Still, code like the following is not hard to find:

var serverData = loadData();
setInterval(function() {
    var renderer = document.getElementById('renderer');
    if (renderer) {
        renderer.innerHTML = JSON.stringify(serverData);
    }
}, 5000); // This will be executed every ~5 seconds.

The code above shows the consequences of referring to nodes or data that are no longer needed.

The renderer element may be replaced or removed at some point, which makes the whole block encapsulated by the interval handler redundant. When that happens, neither the handler nor its dependencies can be collected, because the interval would have to be stopped first (remember, it is still active). It follows that serverData, which stores and processes the data, will not be collected either.
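One way to avoid this, sketched below, is to keep the timer id and call clearInterval as soon as the target is gone. The helper name startRendering and its parameters are hypothetical; getTarget stands in for something like a document.getElementById lookup.

```javascript
// Hypothetical helper: renders `data` into the target on every tick and
// stops itself once the target no longer exists.
function startRendering(getTarget, data, intervalMs) {
  const timerId = setInterval(function () {
    const target = getTarget();
    if (!target) {
      // Stopping the interval releases the handler and, with it, `data`.
      clearInterval(timerId);
      return;
    }
    target.innerHTML = JSON.stringify(data);
  }, intervalMs);

  // Let the caller stop the timer explicitly as well.
  return function stop() {
    clearInterval(timerId);
  };
}
```

In a browser this could be called as startRendering(function () { return document.getElementById('renderer'); }, serverData, 5000), keeping the returned stop function around for explicit cleanup.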

When using observers, you need to make sure you remove them with an explicit call once you are done with them (either because the observer is no longer needed or because the object will become unreachable).

Fortunately, most modern browsers do this for us: they automatically collect observer handlers once the observed object becomes unreachable, even if you forgot to remove the listener. In the past, some browsers could not handle these cases (good old IE6).

However, it is best practice to remove the observer when the object becomes obsolete. Consider the following example:

var element = document.getElementById('launch-button');
var counter = 0;
function onClick(event) {
   counter++;
   element.innerHTML = 'text ' + counter;
}
element.addEventListener('click', onClick);
// Do stuff
element.removeEventListener('click', onClick);
element.parentNode.removeChild(element);
// Now when element goes out of scope,
// both element and onClick will be collected even in old browsers
// that don't handle cycles well.

You no longer need to call removeEventListener before making a node unreachable, because modern browsers ship garbage collectors that can detect these cycles and handle them appropriately.

If you use the jQuery APIs (other libraries and frameworks support this too), you can also have the listeners removed before a node is made obsolete. The library then makes sure there are no memory leaks, even when the application runs in older browser versions.

3: Closures

A key aspect of JavaScript development is closures: an inner function has access to the variables of the outer (enclosing) function. Due to implementation details of the JavaScript runtime, it is possible to leak memory in the following way:

var theThing = null;
var replaceThing = function () {
  var originalThing = theThing;
  var unused = function () {
    if (originalThing) // a reference to 'originalThing'
      console.log("hi");
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log("message");
    }
  };
};
setInterval(replaceThing, 1000);

Once replaceThing is called, theThing gets a new object that consists of a big string and a new closure (someMethod). Yet originalThing is referenced by a closure held by the unused variable (originalThing is the value theThing had after the previous call to replaceThing). The thing to remember is that once a scope is created for closures in the same parent scope, that scope is shared among them.

In this case, the scope created for the closure someMethod is shared with unused, and unused holds a reference to originalThing. Even though unused is never called, someMethod can be used through theThing outside the scope of replaceThing (for example, somewhere globally). Because someMethod shares the closure scope with unused, the reference that unused holds to originalThing forces it to stay active (the whole scope is shared between the two closures). This prevents it from being collected.

All of this can result in a considerable memory leak. You can expect to see a spike in memory usage when the snippet above is run over and over again, and its size won’t shrink when the garbage collector runs. A linked list of closures is created (its root being the theThing variable in this case), and each closure scope carries an indirect reference to a big string.

The Meteor team found this problem, and they have a great article that describes it in detail.
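A commonly suggested workaround for this pattern is to clear originalThing at the end of replaceThing, so the shared scope no longer points at the previous object. The sketch below applies that change to the snippet from this section (rewritten with let and const, and without the setInterval call so that it can be run once); treat it as an illustration, not as the only possible fix.

```javascript
let theThing = null;

function replaceThing() {
  let originalThing = theThing;

  // Never called, but it still shares its scope with someMethod.
  const unused = function () {
    if (originalThing) console.log('hi');
  };

  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log('message');
    },
  };

  // Break the chain: the shared scope stops referencing the previous
  // object, which can then be collected.
  originalThing = null;
}

replaceThing();
replaceThing();
console.log(theThing.longStr.length); // 999999
```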

4: Out of DOM references

Another situation arises when developers store DOM nodes inside data structures. Suppose you want to rapidly update the contents of several rows in a table. If you store a reference to each DOM row in a dictionary or an array, there are two references to the same DOM element: one in the DOM tree and one in the dictionary. If you decide to get rid of these rows, remember to make both references unreachable.

var elements = {
    button: document.getElementById('button'),
    image: document.getElementById('image')
};

function doStuff() {
    elements.image.src = 'http://example.com/image_name.png';
}

function removeImage() {
    // The image is a direct child of the body element.
    document.body.removeChild(document.getElementById('image'));
    // At this point, we still have a reference to #image in the
    // global elements object. In other words, the image element is
    // still in memory and cannot be collected by the GC.
}
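A hedged sketch of the cleanup this calls for is shown below: remove the node from the tree and drop the entry from the registry in the same place. The helper name removeTracked is invented for the example, and it only assumes that nodes expose parentNode and removeChild, as DOM nodes do.

```javascript
// Hypothetical helper: detaches a tracked node and forgets it.
function removeTracked(registry, key) {
  const node = registry[key];
  if (!node) return false;

  // Reference 1: the DOM tree.
  if (node.parentNode) node.parentNode.removeChild(node);

  // Reference 2: our own data structure.
  delete registry[key];
  return true;
}
```

With the elements object from the example above, removeTracked(elements, 'image') would leave no reference to the image behind, provided nothing else holds one.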

There is an additional consideration when it comes to references to inner or leaf nodes of a DOM tree. If you keep a reference to a table cell (a td tag) in your code and then decide to remove the table from the DOM while keeping the reference to that particular cell, you can expect a major memory leak. You might think that the garbage collector would free everything but that cell. That is not the case, however. Because the cell is a child node of the table, and children keep references to their parents, this single reference to the cell holds the whole table in memory.