Someone asked me a deep copy of the questions, I answered him as usual, he suddenly asked me if the data quantity is big, the stack for what to do, no sense I didn’t force at the moment, what critical stack 😢 😢 😢, looked at him the king of the contempt, I’m going to study, and found that the stack and deep copy of the problem, before wrote a article about shallow and deep copy, At that time, I wrote some shallow, but NOW I want to understand it further.

1. Why does the phenomenon of deep and shallow copying occur

To explain this, we first need to understand the storage of basic computers.

The stack memory stores local variables, so anything that is defined in a method is a local variable, and the stack stores a single variable, and once the variable is released, there is no. The heap memory stores arrays and objects, and everything that's new is in the heap, and the heap is all entities, and the entities are used to encapsulate data, and they encapsulate multiple data, and if one data disappears, the entity doesn't disappear, it's still available, so the heap is not freed at any time.Copy the code

Here is a brief description of the stack and heap difference. As we all know, a shallow copy is a copy of an object. When the object is in the heap, a reference address is generated, and the reference address is associated with a variable in the stack.

When a reference data type is copied to another variable, the reference address of the reference data type is actually pointed to the new variable. When the new variable changes the object, the value of the original variable will also change because it shares the same reference address.

2. Problems with implementing the deep copy method

2.1 JSON. Parse (JSON. Stringify ())

The first question mark that comes to mind is why is it possible to implement deep copy?

Json.stringify () converts the object to a JSON string, disconnects the object from the reference address, and deserializes it with json.parse (). Form a new object. Serialization is all about storage and transfer.

As mentioned in the previous article, json.parse (json.stringify ()), although capable of implementing a deep copy, was only mentioned as invalid for function. Later, based on the knowledge accumulate, it is found that it is not only function.

let obj = {
  a: 1.c: function () {},
  d: Symbol(),
  e: undefined.f: null.g: "111"};console.log(JSON.parse(JSON.stringify(obj))); // {a: 1, f: null, g: "111"}
Copy the code

Not only function is filtered, Symbol and undefined are also filtered.

let obj = {
  a: new Date(),
  reg: new RegExp(""),
  err: new Error("error message"),
  nan: NaN.isInfinite: 1.7976931348623157 e10308.minusInfinity: -1.7976931348623157 e10308};console.log(JSON.parse(JSON.stringify(obj))); // {a: "2021-04-21t01:38:11.267z ", err: {}, isInfinite: null, minusInfinity: null, nan: null, reg: {}}
Copy the code

Date objects are converted to strings, RegExp and Error objects are converted to empty objects, and NaN, Infinity, and -infinity are converted to null. I didn’t understand why this was until I looked up json.stringify () in MDN, which clearly lists: json.stringify () converts the value to the corresponding JSON format:

  1. If there is a toJSON() method, that method defines what value will be serialized.
  2. Properties of non-array objects are not guaranteed to appear in a serialized string in a particular order.
  3. Boolean, numeric, and string wrapper objects are automatically converted to their original values during serialization.
  4. Undefined, any function, and symbol values are either ignored during serialization (when they occur in non-array object property values) or are converted to NULL (when they occur in arrays). Function/undefined returns undefined when converted separately, such as json.stringify (function(){}) or json.stringify (undefined).
  5. Executing this method on objects that contain circular references (objects that refer to each other in an infinite loop) throws an error.
  6. All properties with symbol as the attribute key are completely ignored, even if they are mandatory in the replacer parameter.
  7. Date The Date calls toJSON() to convert it to a string (the same as date.toisoString ()) and is therefore treated as a string.
  8. NaN and Infinity values and null are treated as null.
  9. Other types of objects, including Map/Set/WeakMap/WeakSet, serialize only enumerable properties.

So the nature of json.parse (json.stringify ()) is entirely due to the serialization of objects by json.stringify ().

2.2 the recursive

At this stage for the implementation of deep copy method, recursion is the most reliable, but recursion itself a lot of problems, when the amount of data is large, recursive hierarchy is very deep, the biggest problem recursion stack burst appeared.

So what is a stack burst and why does it happen?

In OS/browser /… Before each call to the function, the current running environment will be saved, so that the function call is completed, to restore the scene. This data is usually stored in a stack. In general, this space is not infinite when implemented. As a result, the maximum depth at which functions can be called is limited. Because of the nature of the recursive function, if the boundary check is flawed, it may lead to the storage capacity of the stack being exceeded, commonly known as an “overflow.”

So how to solve the stack burst problem, has become the most important recursive optimization

In my previous post, deep copy and Shallow Copy, I wrote a simple implementation of deep copy recursion, but did not check for the stack burst problem.

First, we create a large amount of data to check whether the previous write deep copy recursive algorithm will overflow.

/ * * * *@param Deep {number} The depth of the object *@param IO {number} how much data each layer has *@returns data* /
const createData = (deep, breadth) = > {
  let data = {};
  let temp = data;
  for (let i = 0; i < deep; i++) {
    temp = temp["data"] = {};
    for (let j = 0; j < breadth; j++) { temp[j] = j; }}return data;
};
Copy the code

Through the verification

deepCopy(createData(1000)); // ok
deepCopy(createData(10000)); // Maximum call stack size exceeded
deepCopy(createData(10.100000)); // ok breadth will not overflow
Copy the code

Let’s see if json.parse (json.stringify ()) will explode the stack.

JSON.parse(JSON.stringify(createData(1000))); // ok
JSON.parse(JSON.stringify(createData(10000))); // Maximum call stack size exceeded
JSON.parse(JSON.stringify(createData(10.100000))); // ok breadth will not overflow
Copy the code

This is consistent with the results of our own encapsulated code, so you can infer that json.parse (json.stringify ()) is also implemented recursively inside

Resolving circular references

Circular reference, in which an object’s property indirectly or directly references itself, causes recursion into an infinite loop and stack overflow.

To solve the circular reference problem, we can open up an additional storage space to store the mapping between the current object and the copied object. When the current object needs to be copied, first go to the storage space to find whether the object has been copied, if so, directly return, if not, continue to copy.

This storage space needs to be able to store data in the form of key-value, and the key can be a reference type. We can choose the data structure Map:

  1. Check whether the map has cloned objects
  2. Yes. – Straight back
  3. None – Stores the current object as key and the cloned object as value
  4. Continue to clone
function clone(target, map = new Map(a)) {
  if (typeof target === 'object') {
    let cloneTarget = Array.isArray(target) ? [] : {};
    if (map.get(target)) {
      return map.get(target);
    }
    map.set(target, cloneTarget);
    for (const key in target) {
      cloneTarget[key] = clone(target[key], map);
    }
    return cloneTarget;
  } else {
    returntarget; }};Copy the code

Use WeakMap to optimize again

The difference between the map and WeakMap can see my articles before take you relearn ES6 | Set and map, WeakMap one of the biggest differences and map is WeakMap is a weak reference.

So what is a weak reference, and what are the benefits of a weak reference, let’s explain first.

In computer programming, a weak reference, as opposed to a strong reference, is a reference that does not guarantee that the object it references will not be collected by the garbage collector. An object that is referenced only by weak references is considered inaccessible (or weakly accessible) and may therefore be reclaimed at any time. Some languages with garbage collection, such as Java, C#, Python, Perl, Lisp, etc., support weak references to varying degrees.

The benefit of weak references is that they can be reclaimed at any time, improving performance. Maps can be manually released to improve performance, but in some cases, they cannot be manually cleared.

Compatible with other data types

The deep copy of the above implementation is compatible only with objects and arrays, but there are many more reference data types. First we determine if it is a reference datatype and if it is function and null.

const isObject = (obj) = > {
  const type = typeof obj;
  returnobj ! = =null && (type === "object" || type === "function");
};
Copy the code

Then determine the data type we want to change the typeOf to Object. The prototype. The toSting. Call to explain in detail () to see I wrote “the road to relearn JS” JS foundation types and reference types

const getType = (obj) = > {
  return Object.prototype.toString.call(obj);
};
Copy the code

There are two types of data structure types: continuable data types and non-continuable data types.

The data type that you can continue to iterate over

The arrays and objects we wrote above are the data types that we can continue to iterate over, in addition to sets and maps.

We need to get the initialization data:

const getInit = (obj) = > {
  const Con = obj.constructor;
  return new Con();
};
Copy the code

Add set and map to deep-copy functions:

const forEach = (array, iteratee) = > {
  let index = -1;
  const length = array.length;
  while (++index < length) {
    iteratee(array[index], index);
  }
  return array;
};

const cloneDeep = (target, map = new WeakMap(a)) = > {
  // Clone the original type
  if(! isObject(target)) {return target;
  }

  / / initialization
  const type = getType(target);
  let cloneTarget;
  if(["[object Map]"."[object Set]"."[object Array]"."[object Object]"."[object Arguments]",
    ].includes(type)
  ) {
    cloneTarget = getInit(target, type);
  }

  // Prevent circular references
  if (map.get(target)) {
    return map.get(target);
  }
  map.set(target, cloneTarget);

  / / clone set
  if (type === "[object Set]") {
    target.forEach((value) = > {
      cloneTarget.add(clone(value, map));
    });
    return cloneTarget;
  }

  / / clone map
  if (type === "[object Map]") {
    target.forEach((value, key) = > {
      cloneTarget.set(key, clone(value, map));
    });
    return cloneTarget;
  }

  // Clone objects and arrays
  const keys = type === "[object Array]" ? undefined : Object.keys(target);
  forEach(keys || target, (value, key) = > {
    if (keys) {
      key = value;
    }
    cloneTarget[key] = cloneDeep(target[key], map);
  });

  return cloneTarget;
};
Copy the code
A data type that cannot be traversed

Other data types are called non-traversable data types.

Bool, Number, String, String, Date, Error we can create a new object directly using the constructor and the original data:

const cloneOtherType = (targe, type) = > {
  const Con = targe.constructor;
  switch (type) {
    case boolTag:
    case numberTag:
    case stringTag:
    case errorTag:
    case dateTag:
      return new Con(targe);
    case regexpTag:
      return cloneReg(targe);
    case symbolTag:
      return cloneSymbol(targe);
    default:
      return null; }};// Clone regular
const cloneReg = (target) = > {
  const reFlags = /\w*$/;
  const result = new targe.constructor(targe.source, reFlags.exec(targe));
  result.lastIndex = targe.lastIndex;
  return result;
};

/ / clone symbol
const cloneSymbol = (targe) = > {
  return Object(Symbol.prototype.valueOf.call(targe));
};
Copy the code
function

There are two types of functions, one is ordinary function, the other is arrow function. The difference between ordinary function and arrow function is that arrow function has no prototype.

const cloneFunction = (func) = > {
  const bodyReg = / (? <={)(.|\n)+(? =})/m;
  const paramReg = / (? < = \ () + (? =\)\s+{)/;
  const funcString = func.toString();
  if (func.prototype) {
    // Normal function
    const param = paramReg.exec(funcString);
    const body = bodyReg.exec(funcString);
    if (body) {
      if (param) {
        const paramArr = param[0].split(",");
        return new Function(... paramArr, body[0]);
      } else {
        return new Function(body[0]); }}else {
      return null; }}else {
    return eval(funcString); }};Copy the code

Complete code can see ConardLi great god make ConardLi/ConardLi lot. IO

Stack overflow test for optimized code:

cloneDeep(createData(1000)); // ok
cloneDeep(createData(10000)); // Maximum call stack size exceeded
cloneDeep(createData(10.100000)); // ok breadth will not overflow
Copy the code

Although recursion has been optimized to solve the problem of circular reference and the use of weak reference to a greater extent, it still hasn’t solved the pain point of recursion, the problem of stack explosion.

Solve recursive stack explosions

Generally speaking, there are two ways to solve the recursion stack burst. The first way is to use the usual tail call, and the second way is to change the recursion into a loop.

The tail call

In ruan Yifeng’s final call Optimization article, he wrote:

The concept of a tail-call is simple and can be explained in one sentence: the last step of a function is to call another function. The tail call is different from other calls because of its special call location. As we know, a function call forms a "call record" in memory, also known as a "call frame", which holds information such as the call location and internal variables. If function B is called inside function A, then A call record of function B is formed above the call record of function A. B's call record will not disappear until B finishes running and returns the result to A. If function B also calls function C internally, there is also a call stack for C, and so on. All of these records form a "call stack". The final call is the last step of the function operation, so there is no need to keep the call record of the outer function, because the call location, internal variables and other information will not be used, as long as the call record of the inner function directly, instead of the call record of the outer function. Tail call optimization, in which only inner functions are called. If all functions are tailor-called, it is perfectly possible to have only one call record per execution, which is a big memory saverCopy the code

In order to let you understand, simple implementation of the end of the call:

1,function f(x) {
  return g(x);
}

2,function f(x) {
  return g(x) + 1;
}

3,function f(x) {
  let y = g(x);
  return y;
}
Copy the code

Only the first of the above functions is a tail call. The other two are identical in form, but they all have other operations after the call to g, so they are not tail calls.

Next we will optimize our recursion by using end-of-call optimization.

I am not only, the tail recursive optimization is only as follows, I ask you to help give directions:

let deepCopy = (obj, index = 0, new_obj = obj instanceof Array ? [] : {}) = > {
  if (typeofobj ! = ="object") return;
  // new_obj = result ? result : new_obj;
  let keys = Object.keys(obj);
  if (index === keys.length) {
    return new_obj;
  } else {
    if (obj.hasOwnProperty(keys[index])) {
      if (typeof obj[keys[index]] === "object") {
        new_obj[keys[index]] = new_obj;
        return deepCopy(obj[keys[index]], 0, new_obj);
      } else {
        new_obj[keys[index]] = obj[keys[index]];
        return deepCopy(obj, index + 1, new_obj); }}}};Copy the code

Change traversal to loop

Since recursion has so many problems, instead of recursively implementing a deep copy, we can implement a loop instead, which makes it much harder to implement.

Here recommended to view @jsmini/ Clone source code, standing on the shoulders of giants.

Let’s take an example of a deep-level object and see how recursion can be made into a loop.

let obj = {
  a: 1.b: {
    a: 2.b: {
      c: {
        a: 4,},d: {
        a: 6,},},},};Copy the code

Speaking of loops, we are going to create a stack, break out of the loop when the stack is empty, and consider several aspects, how the parent key is stored, and how the child key under that parent key is stored.

const cloneForce = (x) = > {
  let result = {};
  let stack = [
    {
      parent: result,
      key: undefined.data: x,
    },
  ];
  while (stack.length) {
    let node = stack.pop();
    let { parent, key, data } = node;
    let res = parent;
    if (typeofkey ! = ="undefined") {
      res = parent[key] = {};
    }
    for (let key in data) {
      if (data.hasOwnProperty(key)) {
        if (typeof data[key] === "object") {
          stack.push({
            parent: res,
            key,
            data: data[key],
          });
        } else{ res[key] = data[key]; }}}}return result;
};
Copy the code

Test to see if it blows the stack:

cloneForce(createData(1000)); // ok
cloneForce(createData(10000)); // ok
cloneForce(createData(10.100000)); // ok breadth will not overflow
Copy the code

Ok, perfect solution to the stack explosion problem

Test the circular reference:

const target = {
  field1: 1.field2: undefined.field3: {
    child: "child",},field4: [2.4.8]}; target.target = target; cloneForce(target);// The page crashes
Copy the code

Next, let’s solve the circular reference problem:

Based on previous experience, we can use WeakMap to solve circular references.

const cloneForceWeakMap = (x, map = new WeakMap(a)) = > {
  let result;
  if (getType(x) === "[object Object]") {
    result = {};
  } else if (getType(x) === "[object Array]") {
    result = [];
  }
  let stack = [
    {
      parent: result,
      key: undefined.data: x,
    },
  ];
  while (stack.length) {
    let node = stack.pop();
    let { parent, key, data } = node;
    let res = parent;
    if (typeofkey ! = ="undefined") {
      res = parent[key] = getType(data) === "[object Object]" ? {} : [];
    }
    if (map.get(data)) {
      parent[key] = map.get(data);
      continue;
    }
    map.set(data, res);
    if (getType(data) === "[object Object]") {
      for (let key in data) {
        if (data.hasOwnProperty(key)) {
          if (typeof data[key] === "object") {
            stack.push({
              parent: res,
              key,
              data: data[key],
            });
          } else{ res[key] = data[key]; }}}}else if (getType(data) === "[object Array]") {
      for (let i = 0; i < data.length; i++) {
        if (typeof data[i] === "object") {
          // Next loop
          stack.push({
            parent: res,
            key: i,
            data: data[i],
          });
        } else{ res[i] = data[i]; }}}}return result;
};
Copy the code

The above code is only compatible with arrays and objects, the rest can see jSmini/Clone yan Haimirror big guy write source code, is not as good as.

Performance tests for each deep copy

We will test and compare the performance of the above three deep copies (excluding tail recursion for the time being) in terms of breadth and depth, respectively, to see which is more advantageous.

The depth of the

Several depth values were fixed, 500, 1000, 1500, 2000, 2500 and 3000, and their running times were compared.

const runTime = (fn) = > {
  let startTime = new Date().getTime();
  fn();
  let endTime = new Date().getTime();
  return endTime - startTime;
};
Copy the code

CloneJSON as JSON. Parse (JSON. Stringify ())

cloneJSON cloneDeep cloneForceWeakMap
500 1 1 1
1000 1 1 1
1500 2 2 2
2000 3 3 2
2500 4 4 2
3000 6 Stack overflow 3

From the runtime point of view, cloneForceWeakMap is far stronger than the former two, cloneDeep to the depth of 3000 there is a stack burst, this is not as good as cloneJSON.

In addition to the running time, we can also view the number of times the deep copy has been executed in a given time, based on the above, we set the depth to 500, 1000, 1500, 2000, 2500.

const runTime = (fn, time) = > {
  var stime = Date.now();
  var count = 0;
  while (Date.now() - stime < time) {
    fn();
    count++;
  }

  return count;
};
Copy the code

Let’s set the time to 2s and see how many times each deep copy is executed.

cloneJSON cloneDeep cloneForceWeakMap
500 8953 29498 34116
1000 3136 14883 17365
1500 1592 8720 10408
2000 987 7053 8132
2500 652 5770 7082

According to the figure above, it is obvious that in terms of performance cloneForceWeakMap > cloneDeep > cloneJSON

breadth

According to the above method, we set the depth constant at 500 and the breadth at 100, 1000, 10000, 100000

cloneJSON cloneDeep cloneForceWeakMap
100 238 446 408
1000 22 46 43
10000 2 5 5
100000 0 1 1

As can be seen from the above diagram, in the breadth of cloneDeep > cloneForceWeakMap > cloneJSON, in the breadth of a large case cloneDeep and cloneForceWeakMap are about the same.

conclusion

cloneJSON cloneDeep cloneForceWeakMap
A circular reference Does not support support support
Burst stack will will Don’t
The difficulty easy medium difficult

When the depth is very shallow and only objects and arrays, cloneJSON can be used. When the data volume is large, cloneDeep is used. When the data volume is large, cloneForceWeakMap is used.

Thank you for reading, tail recursion to achieve deep copy, in the end of writing this article still did not want to come out, but also please all the gods to help, thanks again.

reference

  1. The ultimate search for deep copy (unknown to 99% of the population)
  2. How to write a deep copy of an amazing interviewer?

After the language

I think it’s ok, please give me a thumbs up when I leave, and we can study and discuss together! You can also follow my blog and I hope you can give me a Star on Github, you will definitely find a problem, almost all my user names are related to tomatoes, because I really like to eat tomatoes ❤️!! Want to follow the car not lost guy also hope to follow the public number front old tomato. After attention, I pull you into the front end of the advanced group, we communicate together, learn together, also can get free information!! I am a primary school student in the field of programming, your encouragement is the motivation for me to keep moving forward, 😄 hope to refueling together.