Moment For Technology

Analysis of ImmutableJS persistent data structure implementation

Posted on Aug. 4, 2022, 7:23 p.m. by Mandy Gray-Dunn
Category: The front end Tag: The front end Immutable.js

preface

The internal implementation uses a new data structure that has been rewritten. As long as the data structure is changed, new references will be returned. At the same time, the shared structure of the objects before and after is changed to solve the requirements of data replication and caching


use

const { Map } = require("immutable");
const map1 = Map({ a: 1.b: 2.c: 3 });
const map2 = map1.set("b".50);
map1.get("b") + " vs. " + map2.get("b"); // 2 vs. 50
Copy the code

Use is very simple unspecified details, mainly said how to use in actual combat


In actual combat

Match the React

Component -- react. PureComponent(function Component react. memo)-- react. PureComponent+ImmutableJS

The common Component

Before writing an SCU, once the parent component is updated, the child component must be updated, regardless of the props passed in, or even if no props were passed in to begin with, which can result in a serious performance waste.

React.PureComponent(React.memo)

For the above performance waste, the react.purecompoent component can be used to use the built-in SCU, which uses shallow contrast

class CounterButton extends React.PureComponent {
  constructor(props) {
    super(props);
    this.state = { count: 1 };
  }
  render() {
    return (
      button
        color={this.props.color}
        onClick={()=  this.setState((state) = ({ count: state.count + 1 }))}
      
        Count: {this.state.count}
      /button); }}Copy the code

Of course, once the deeper properties are changed, the component will not be aware of them and will not update, thus causing errors. (You can also write the SCU by hand, which is more intelligent, but writing the specific SCU will consume manpower.)

React.PureComponent+ImmutableJS

Is it possible to have components update precisely, update data extremely quickly, and return a new top-level reference without having to do a deep comparison every time?

Yes, it is the famous ImmutableJS, it not only meets the above advantages but also gives you efficient caching, simple time travel, before this to achieve time travel requires deep copy of data.

Keep in mind that this is only valid in the context of the larger concept of one-way data flow, such as Vue's two-way binding is done directly with proxy or recursive defineProperty, and still updates accurately (but proxy is not efficient).

Redux+Immutable

Speaking of state management, how could we do without our Redux, where the reducer needs to be clean and return a new reference every time, as follows

function todoApp(state = initialState, action) {
  switch (action.type) {
    case SET_VISIBILITY_FILTER:
      return Object.assign({}, state, {
        visibilityFilter: action.filter,
      });
    case ADD_TODO:
      return Object.assign({}, state, {
        todos: [
          ...state.todos,
          {
            text: action.text,
            completed: false,}]});default:
      returnstate; }}Copy the code

Immutable syntax is a way of saying that it is not possible to write such a syntax by mutating it Immutable, which returns a new top-level reference


The body of the

Immutable is the way Immutable is used in my mind, but it is not the purpose of this article. The main purpose of this article is to understand how to implement such a magical data structure. As a reader, Immutable is not a bad way to use it.

The prefix tree

A data structure that is Immutable is not a mutable form, but a data structure that is not mutable. It is a data structure that is Immutable.

Trie is an efficient information re***Trie***val data structure. Using Trie, search complexities can be brought to optimal limit (key length). If we store keys in binary search tree, a well balanced BST will need time proportional to M * log N, where M is maximum string length and N is number of keys in tree. Using Trie, we can search the key in O(M) time. However the penalty is on Trie storage requirements

Simply put, a prefix tree is a multi-fork tree where the path from the root to a node constitutes a word. It doesn't have to go to the leaf node, although it does in the example diagram, but the end of the node depends on whether the node has the isEnd flag, which is a custom, for prefix trees, the main operations are insert and search,insert It is to insert a word into the tree, the specific operation is to insert a letter,search is to find out whether there is a word in the tree, click me fiercely to learn more, the advantage is fast query.

The List and Map

List is implemented using an index prefix tree. The index is generated using a Bitmap, or Bitmap. Specifically,Vector Trie uses leaf nodes to store information and other nodes to store indexes

// Constants describing the size of trie nodes.
export const SHIFT = 5; // Resulted in best performance after ______?
export const SIZE = 1  SHIFT;
export const MASK = SIZE - 1;
Copy the code

Three prefix tree - related constants are defined here

The SHIFT constant defines an index mapped to each SHIFT bit, which is defined as 5 bits, i.e. index [0,31].

The SIZE constant defines the length of each tree node index array as SIZE, where 2^5 = 32, matching the SHIFT bit-mapped index

The MASK constant defines the MASK used for the operation after the shift, which is 11111

How do we map it?

Let's look at the pseudo code

public class BitTrie {
  public static final int BITS = 5,
                          WIDTH = 1  BITS, / / 2 ^ 5 = 32
                          MASK = WIDTH - 1; // 31, or 0x1f

  // Array of objects. Can itself contain an array of objects.
  Object[] root;
  // BITS times (the depth of this trie minus one).
  int shift;
  public Object lookup(int key) {
    Object[] node = this.root;
    // perform branching on internal nodes here
    for (int level = this.shift; level  0; level -= BITS) {
      node = (Object[]) node[(key  level)  MASK];
      // If node may not exist, check if it is null here
    }
    // Last element is the value we want to lookup, return it.
    returnnode[key  MASK]; }}Copy the code

I'm looking for the Java version because Java code is easier to read

public Object lookup(int key) {
    Object[] node = this.root;
    // perform branching on internal nodes here
    for (int level = this.shift; level  0; level -= BITS) {
      node = (Object[]) node[(key  level)  MASK];
      // If node may not exist, check if it is null here
    }
    // Last element is the value we want to lookup, return it.
    return node[key  MASK];
 }
Copy the code

Note that the lookup method, by constantly shifting, intercepting a high index into the next level, Immutable source code is done by recursion, detail Bit Partitioning, I understand

// The source code has been deleted
function updateVNode(node, ownerID, level, index, value, didAlter) {
  const idx = (index  level)  MASK;
  const nodeHas = node  idx  node.array.length;
  if(! nodeHas  value ===undefined) {
    return node;
  }
  let newNode;
  if (level  0) {
    const lowerNode = node  node.array[idx];

    / / recursion
    const newLowerNode = updateVNode(
      lowerNode,
      ownerID,
      level - SHIFT, // Update level, since index is constant, so update level to intercept different bits
      index,
      value,
      didAlter
    );
    if (newLowerNode === lowerNode) {
      return node;
    }
    newNode = editableVNode(node, ownerID);
    newNode.array[idx] = newLowerNode;
    returnnewNode; }}Copy the code

Take a chestnut

Get operation

Take list.get (141) as an example, where SHIFT is defined as 2 for simplicity

  1. 141 converts to binary10001101
  2. Starting from the root node, each SHIFT=2 bits is used as the index of a layer
  3. The first index is binary10= 2, find the current node index array index = 2, get the next level of the index node reference
  4. The second index is binary00= 0, find the current node index array index = 0, get the next level of the index node reference
  5. The second index is binary11= 3, find the current node index array index = 3, get the next level of the index node reference
  6. The second index is binary01= 1, find the current node index array index = 1, obtain the result, return the result

Set operations

The set operation is similar to the GET operation, except that in order to keep the data persistent, a new top-level reference needs to be returned, so nodes along the way are copied as we index down from the root node, and the final node is found and replaced

List.set(4,"beef")

  1. Similar to get, binary 4, take SHIFT bit as index
  2. Copy along nodes
  3. Find the leaf node where the data is stored, make a copy and modify the data
  4. Return to new ·

Map 采用的是 Hash Trie

This is basically the same as Vector Trie, except that the key is hash mapped to the integer index

function hash(key: any) :Number {
  //hash key
}
Copy the code

The hash source is as follows

export function hash(o) {
  switch (typeof o) {
    case "boolean":
      // The hash values for built-in constants are a 1 value for each 5-byte
      // shift region expect for the first, which encodes the value. This
      // reduces the odds of a hash collision for these common values.
      return o ? 0x42108421 : 0x42108420;
    case "number":
      return hashNumber(o);
    case "string":
      return o.length  STRING_HASH_CACHE_MIN_STRLEN
        ? cachedHashString(o)
        : hashString(o);
    case "object":
    case "function":
      if (o === null) {
        return 0x42108422;
      }
      if (typeof o.hashCode === "function") {
        // Drop any high bits from accidentally long hash codes.
        return smi(o.hashCode(o));
      }
      if(o.valueOf ! == defaultValueOf typeof o.valueOf === "function") {
        o = o.valueOf(o);
      }
      return hashJSObj(o);
    case "undefined":
      return 0x42108423;
    default:
      if (typeof o.toString === "function") {
        return hashString(o.toString());
      }
      throw new Error("Value type " + typeof o + " cannot be hashed."); }}Copy the code

Take a chestnut

map.get("R")

  1. Hash the key to get the binary01010010
  2. Take 2 bits from back to front as the index to get the node at the next level
  3. Repeat 2 until the key storage location is found

Some friends might ask, why do I hash from back to front? I can't hash from front to back, because I don't know how many digits to hash, I can only hash from back to front


The end of the

In fact, the core concept of persistent data first appeared in a paper, such as GO language has a similar implementation,Immutable author also self-mockery is copycat, but React with a tiger to add power is also obvious, my level is limited, so this paper does not GO too deep research, such as Sep,Record and other data structures have not been analyzed, errors are unavoidable, please correct.


reference

immutable-js

Functional Go: Implementation of Vector Trie

wiki hash_array_mapped_trie

Understanding Clojure's Persistent Vectors, pt. 2

Use immutable to optimize React

React performance optimizations (1) When PureComponent encounters ImmutableJS

Reconciliation in React detailed explanation

React.js Conf 2015 - Immutable Data and React

What is the advantage of trie tree? What are his weaknesses? What is his efficiency in big O?

Search
About
mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.