By mastering the characteristics of different data structures, you can use appropriate data structures to deal with different problems and achieve twice the result with half the effort.

So this time we’ll go into detail about the characteristics of various data structures, and hopefully you’ll get the hang of it.

Intensive reading

An array of

Arrays are very common because they are contiguous chunks of memory and therefore can be accessed directly by subscript, with a look-up efficiency of O(1).

However, the efficiency of array insertion and deletion is low, only O(n), because in order to maintain the continuity of the array, some operations must be performed on the array after insertion or deletion: for example, when the KTH element is inserted, the following element needs to be moved back; To delete the KTH element, move the next element forward.

The list

Linked lists were invented to solve the problem of arrays, improving insertion and deletion efficiency at the expense of search efficiency.

The insertion and deletion efficiency of the linked list is O(1), because the insertion and deletion can be completed as long as the corresponding elements are disconnected and reconnected, and there is no need to care about other nodes.

The corresponding search efficiency is low because the storage space is not continuous, so it cannot be searched directly by subscript like an array. Instead, it needs to be searched continuously by pointer, so the search efficiency is O(n).

By the way, linked lists can be modified to bidirectional lists by adding the.prev attribute, or binary trees (.left. right) or multi-tree (N.next) can be defined by defining two.next.

The stack

A stack is a first-in, last-out structure that can be simulated using arrays.

const stack: number[] = []

/ / into the stack
stack.push(1)
/ / out of the stack
stack.pop()
Copy the code

The heap

Heap is a special kind of complete binary tree, which is divided into big top heap and small top heap.

The big top heap means that the binary root node is the largest number and the small top heap means that the binary root node is the smallest number. For the sake of illustration, the following uses the big top heap as an example, and the small top heap does the opposite.

In the big top heap, any node is larger than its leaf node, so the root node is the largest node. The advantage of this data structure is that the maximum value can be found with O(1) efficiency (the small top heap finds the minimum value), because stack[0] is the root node.

Here is a brief mention of the binary tree and array structure mapping, because using arrays to manipulate binary numbers has advantages both in operation and space: The first item stores the total number of nodes. For a node with subscript K, the subscript of its parent node is floor(K / 2), and the subscripts of its children are K * 2 and K * 2 + 1 respectively, so the location of the parent and child can be quickly located.

By using this feature, the efficiency of insertion and deletion can reach O(logn), because the order of other nodes can be adjusted by moving up and down. For a complete binary tree with N nodes, the depth of the tree is logn.

Hash table

Hash tables are called maps. Different maps are implemented in different ways. Common hash tables include HashMap, TreeMap, HashSet, and TreeSet.

Map and Set are similar in implementation, so Map is used as an example.

First, calculate the ASCII code value of the character to be stored, and then locate the subscript of an array according to methods such as remainder. The same subscript may correspond to multiple values, so the subscript may correspond to a linked list. Further search according to the linked list is called the zipper method.

If the number of stored values exceeds a certain number, the query efficiency of linked list will be reduced, and it may be upgraded to red-black tree storage. In short, the efficiency of adding, deleting and searching such a list is O(1), but the disadvantage is that its content is disordered.

To keep the content ordered, you can use a tree structure called a HashTree, which degrades the time complexity to O(logn), but has the advantage that the content can be ordered.

Tree & binary search tree

Binary search trees are a special kind of binary tree, and there are more complicated ones like red-black trees, but I won’t go into that, just binary search trees.

The binary search tree satisfies the requirement that for any node, all nodes of left < root < all nodes of right. Note that this is all nodes, so all cases need to be considered recursively when judging.

The advantage of binary search trees is that the time complexity of access, search, insert, and delete is O(logn), because any operation can be done in binary mode. But in the worst case, it will be reduced to O(n), because after many operations, the binary search tree may not be balanced, and eventually degenerate into a linked list, which becomes the time complexity of the linked list.

A better solution is AVL tree, red black tree, such as JAVA, C++ standard library implementation of binary search tree are red black tree.

The dictionary tree

Dictionary trees are mostly used in word search scenarios. As long as a single beginning is given, several recommendation words can be found quickly.

In the example above, type “O” to quickly find the words “OK” and “OL” at the end. Note that each node should have a property isEndOfWord indicating whether it is a complete word so far: For example, go and good are both complete words, but goo is not. Therefore, the second O and the fourth D are marked with isEndOfWord indicating that a complete word has been found. The mark of the leaf node can also be omitted.

Check and set

And lookups are used to solve the gang problem, or the island problem, where multiple elements belong to a set. Therefore, the data structure of Union and Find can be written as a class, providing the two most basic methods Union and Find.

Where union can put any two elements in a set, find can find which root set any element belongs to.

And find the set using an array of data structures, only with the following special meaning, set subscript k:

  • nums[k]Represents the collection to which it belongs, ifnums[k] === kThat means it’s the root of the set.

To count the number of sets, count the number of sets that satisfy the nums[k] === k condition, just as to count the number of gangs, count the number of bosses.

And look up the set implementation is different, the data will be subtle different, efficient and look up the set when inserting, will recursively point the value of the element as far as possible to the root, so that the search judgment calculation is faster, but even if the point is not the root of the boss, you can also find the root of the boss by recursive way.

Bloom filter

Bloom Filter is just a Filter that can exclude unhit data much faster than other algorithms, but the unexcluded data may not actually exist, so further queries are needed.

How does the Bloom filter do this? It’s binary.

As shown in the figure above, we first stored a and B data, converted them into binary and changed the corresponding to 1. Then when we query a or B again, the result must exist because the mapping relationship is the same.

When c is queried, one of the items is 0, indicating that C must not exist. However, when querying D, both of them are found to be 1, but actually D does not exist, which is the reason for the error.

Bloom filter is widely used in bitcoin and distributed systems. For example, when bitcoin is queried whether the transaction is on a node, bloom filter is first used to block the transaction to quickly skip unnecessary searches. Distributed system computing, such as Map Reduce, also uses Bloom filter to quickly filter out calculations that are not on a node.

conclusion

Finally, the average and worst time complexity diagrams of “access, query, insert and delete” of each data structure are given:

This chart is from Bigocheatsheet, and you can access it directly by clicking the link.

Once you’ve learned these basic data structures, hopefully you’ll be able to combine them to solve real problems, and realize that no single data structure can do everything, otherwise there wouldn’t be so many data structures to learn, just one data structure that can do everything.

For the combination of data structures, I’ll give two examples:

The first example is how to query the maximum or minimum value of a stack in O(1) average time complexity. In this case, one stack is not enough, and another stack B is needed to support it. In this case, the first number of stack B is the largest or smallest value in the current stack. The query efficiency is O(1), and it needs to be updated only when the stack is out, so the overall average time complexity is O(1).

The second example is how to improve the efficiency of linked list search. By combining hash table with linked list, we can quickly locate the position of any value in the linked list by using hash table in the way of space for time, so that the time complexity of insertion, deletion and query can be O(1) with the sacrifice of space doubling. Although the hash table can achieve this time complexity, the hash table is unordered; Although HashTree is ordered, the time complexity is O(logn), so the order and time complexity can only be achieved by combining HashMap and linked list, but at the expense of spatial complexity.

Including the bloom filter is not used alone, it is just a firewall, with high efficiency to block some illegal data, but not blocked is not necessarily legal, need to further query.

Therefore, I hope you can understand the characteristics, limitations and combinations of data structures, and BELIEVE that you can flexibly use different data structures in actual scenarios to achieve the optimal solution of the current business scenario.

React Server Component · Issue #312 · dT-fe /weekly

If you’d like to participate in the discussion, pleaseClick here to, with a new theme every week, released on weekends or Mondays. Front end Intensive Reading – Helps you filter the right content.

Pay attention to the front end of intensive reading wechat public account

Copyright Notice: Freely reproduced – Non-commercial – Non-derivative – Remain signed (Creative Commons 3.0 License)