Tree structure is a very important data structure. We can solve the sorting problem by balancing binary trees to represent the syntactic structure of the source program. Trees can also represent databases or file systems. And many containers have a tree structure at the bottom.

Here are some tree nouns:

  • Node: represents the data elements in the tree, A,B… H is the node.
  • Degree of a node: the number of subtrees a node has. Degree of B is 2.
  • Degree of tree: The maximum degree of each node in the tree, where the degree of the tree is 2.
  • Leaf Node: a Node whose degree is 0. D,H,F, and G are leaf nodes.
  • Child: node left and right nodes. In the figure, nodes D and E are children of node B.
  • Parent: The upper nodes of a node are called the Parent of the node. Node B is the parent of D and E.
  • Ancestor: All nodes on the branch from the root to the node. A is the ancestor of all other nodes.
  • Descendant: Any node in a subtree rooted in a node. All nodes other than A are descendants of A.
  • Brother: A child of the same parents. Nodes B and C are brothers, D and F are brothers,F and G are brothers.
  • Cousins: Nodes with different parents at the same level. E and F are Cousins to each other.
  • Level of Node: The number of branches along the path from the root to a Node in the tree is called the Level of that Node. The level of A is 4.
  • Depth of Tree: Indicates the maximum number of levels of nodes in a Tree. The depth of the tree here is 4.

Only binary trees (where there are only two children on a node) are explained here. Of course, trees don’t have to have only two children, as we’ll see below with unset and Trie, but we can convert most trees into binary trees to make it easier to understand the concept.

Before we start with binary trees, let’s look at some of the concepts in binary trees:

Full binary tree: A binary tree with depth K and 2^ K-1 nodes is called full binary tree

Complete binary tree: a complete binary tree is a perfect binary tree from the root node to the penultimate layer. The last layer can be partially filled and the leaves are aligned to the left.

Balanced binary tree: The depth difference between the left and right subtrees of each node is no more than 1

Absolutely balanced tree: Only the last layer is leaf nodes and satisfies the characteristics of binary search trees

Binary search tree

If there are an array, given a number (the number in the array) and asked to find the number in the array subscript position, we will have to search for, one by one to traverse the time complexity is O (n), in order to accelerate the search speed, if we get the array is sorted, we can use the binary search, So it’s order logn, which is fast. And the idea of half search is to use the idea of binary search.

Let’s take a look at the node construction of binary search tree:

class Node{ int data; // Node left; // Left child Node right; // Right child}Copy the code

Binary search tree definition

All nodes in the left subtree of a node are smaller than this node, and all nodes in the right subtree of a node are larger than this node.

Binary search tree operation

insert

Binary search tree insertion is simple:

  • Check whether there is a root node. No current node is used as the root node
  • If it is larger than the current node, it is compared to its right node, and if it is smaller than the current node, it is compared to its left node
  • Insert if you compare to a leaf node
  • Repeat Step 2 until step 3 is complete

The average time complexity of inserts is order logn.

delete

Deleting is a little trickier, and there are three types of nodes to be deleted

  • No children: simply traverse to node deletion
  • Have a child: delete this node and move the child of this node to the location of this node
  • There are two children: delete this node, so that the left-most node X of the next child of this node (the right-most node of the left child can also be) moves to this position, then X can only have the right child, and then delete the original position of X when there is only one child.

The average time complexity of deletions is O(logn)

To find the

The search and insert methods are similar, but I won’t go into them here

The average time complexity of lookups is order logn.

DFS and BFS

If you want to traverse a binary tree, there are two ways, depth-first traversal and breadth-first traversal, and depth-first traversal can be divided into pre-order, middle-order and post-order traversal.

  • Depth-first traversal (DFS) : The idea of depth-first traversal is to traverse nodes from each branch first
    • Front-order traversal: node – left child – right child
    • Middle order traversal: left child – node – right child
    • Post-order traversal: left child – right child – node
  • Breadth-first traversal (BFS) : Traversal starts at each level

Let’s take a look at the diagram to see how the traversal looks in detail:

  • Prior traversal: ABCDEF
  • Middle order traversal: CBDAEF
  • Sequential traversal: CDBFEA
  • Breadth-first traversal: ABECDF

The role of binary search trees

  • It takes order n time to delete an array, order n time to find it.
  • For binary search trees it takes order logn to add, delete, and find.

The heap

The definition of the heap

A heap is a complete binary tree, and a node is greater than (or less than) its left and right children

Look at the node construction of the heap

class Node{ int data; // Node left; // Left child Node right; // Right child}Copy the code

The operation of the heap

For heaps he pays more attention to putting the biggest first and maintaining the nature of the heap. Here, since the heap is a full binary tree, you can use arrays to hold elements in the heap.

The following uses the minimum heap, where each node is smaller than its child node:

So for a node, we can easily find the parent nodes and left and right nodes of this node.

Int getParent(int index){return(index + 1) / 2 - 1; } // getLeftChild int getLeftChild(int index){returnindex * 2 + 1; } // getRightChild int getRightChild(int index){return index * 2 + 2;
}
Copy the code
  • Insert, this step is called sift up of the heap
    • Put the element at the end of the array
    • Compare the size of the element to that of its parent node, and if the node is smaller, swap places with it
    • Repeat step 2 until the end node is larger than the parent node
  • To remove the heap, this step is called SIFT down of the heap
    • Removes the element with index 0 from the array and places the last element in the array at index 0
    • Mark the node with the current position 0 as x, find the seat of the smaller element in the child to the left and right of X as Y, compare x with Y, if x is less than y, then x and Y switch positions
    • Repeat step 2 until the leaves are greater than or equal to y

The role of the heap

The heap can implement priority queues, and there are also applications of priority queues in real life, such as putting the elderly and children first to the front of the queue.

expand

D fork heap

Perfect D-tree, minimum root. So if you think of it as a binary heap, this d could be 2,3,4

The index of

The heap we discussed above compares the data of each node, which can be time-consuming if the data is a large data structure. In this case, we can use the index heap to store the location of the data element with an index array on the basis of the original heap, that is, the index heap contains two arrays

The binomial heap

A binomial heap is a collection of binomial trees. A binomial tree is also a tree structure. The KTH tree of a binomial tree has 2k nodes and a height of K. Nodes of depth D have a total of nodes. A binomial heap consists of a set of binomial trees.

There are Fibonacci heaps, Pairing heaps, etc.

Line segment tree

A line segment tree is also a balanced binary tree, but its nodes are somewhat different from those of a balanced binary tree. Each node can be regarded as consisting of an array. If the array of a node is [L,r] and its left child and its right child values are the array space of [L, Mid = l+(r-l)/2 mid = l+(r-L)/2 mid = l+(r-L)/2

Take a look at the key fields of the segment tree:

Private int[] tree; private int[] tree; private int[] tree;Copy the code

The role of line segment trees

Line segment trees can solve some algorithmic problems such as

  • Color an interval (to cover the original color), m times to the random interval, ask how many colors can be seen in the interval [I,j]
  • Find the total number of objects in some space region
  • Give n numbers, n<=1000000, and m operations, each of which modifies the value of a contiguous interval [a,b]

If the structure of line segment tree has a convenient idea to solve the above problems

It is assumed that we want to solve the value of [1, 7] us directly to obtain the tree tree [4] [8] + + tree [2] +, namely [1, 1] + [2, 3] + [4, 7] get the value of [1, 7]. So if you take 0 minus 7 and you take a binary search tree, and you add up from 1 to 7 that’s 7 times O logn.

Line tree operations

Segment trees are mainly for query and modification. We do not discuss additions and deletions.

The query

Here’s a quick look at what to do if you need to query an interval. (Once you understand interval queries, it’s easier to query individual elements.)

For example, if I want to query the value of [1,7], I will return x:

  • The left interval to be searched is larger than the left interval of the node. Start the search from the left subtree [0,3] and divide [1,7] into [1,3] and [4,7].
  • Search from left subtree [0,3], [1,3] and [0,3] are not equal, the left interval to search is larger than the left interval of the node, search from the left interval of the node [0,1], and divide [1,3] into [1,1] and [2,3].
  • Start search from left subtree [0,1], because it is left subtree, [1,1] and [0,1] are not equal, as above is divided into [1,1] and [2,3], so [0,1] should search right subtree [1,1].
  • In this case, [1,1] and [1,1] are equal, and return the corresponding value of [1,1]
  • Then add the previously returned values with [2,3] and [4,7] left in the previous step respectively to get the final value (because [2,3] and [4,7] can be directly found in the node, there is no need to divide).

Value is the tree of [2, 3] [4], [4, 7] value is tree [2], [1, 1] value is tree [8]

The general flow is as described above. It may seem a little obscure at first, but using recursion is pretty clear when combined with the code

Modify the

The operation of modification is the same as that of query, but it is necessary to modify the nodes traversed, which is not described here

In fact, any problem that can be solved with a line tree can be solved

Trie tree dictionary

The dictionary tree is not a binary tree, as its name suggests. First look at its node structure (the root node contains no characters, and the c of the root node is empty).

class Node{ char c; // The character Node[] next; // Next node of the current node Boolean end; // Check if the letter is an ending}Copy the code

We can find that each node of the dictionary tree consists of several Pointers to the next node. The overall structure of the dictionary tree is shown as follows:

The role of the dictionary tree

As you can see in the graph above, when traversed in depth-first middle order, he traverses every leaf node with a single word, such as and, as, at, cn, com.

In the old phone book system, if we use binary search tree to save the information of each contact, let us assume that our address book has 10 million contact information, then we query the information of this contact by name is very time-consuming O(logn). Tree and if we use a dictionary, for example, we have to query a name and contact, we find a root node in the next, find n in a node, then find b in n nodes (if there is full of English letters next only 26 cases), then we only related to the length of the word of time complexity O (K), K is the word length. So the advantage of the dictionary tree in this case is obvious. Of course, the design of dictionary trees is the classic idea of trading space for time.

Dictionary tree operations

The operations of dictionary tree mainly include adding, deleting and searching

Before we begin, explain the use of end. For example, if we have an AS and an ass(smirk.jpg), how can we tell if the word “ass” is just “ass” or if there is also the word “as”? We need an identifier for each node to indicate whether the word is a complete word here. For example, “ass”, the last “s” of “ass” should be true, because the “S” at the end of “ass” is the end of “ass”. If there is an as word, then the first S must be true to indicate that as ends with an S.

increase

The logic for adding is simple, such as adding a geek

  • Locate the head node and check if there is g in next. If not, construct node G (all newly constructed nodes end is false), add to next of current node, and move to node G
  • Similarly determine e, construct and locate to the next node, then E, then K
  • When moving to k, the word is inserted at the end, and end is set to true

delete

For example, in the image above we want to delete the word an

  • A is reached through the head node, and n is reached. If next of n is not empty, change end of n to false
  • If next of n is empty, delete n, go back to the last node and delete n in next, then check whether next is empty, if not, end. If it is empty, delete a, end.

To find the

For example, now we want to find and

  • Find A through the next of the root node, there is, and the current node transposes node A
  • Determine the next of node A to find node N, exists, and the current node transposes node N
  • The current node transposes the node D. At this point, the word to be searched reaches the end, and then determine whether the end of D is true. If true, it means found, otherwise, it means not found.

Check and set

Unjoint set is a tree structure, also called “disjoint set”, which maintains a group of disjoint dynamic sets. Each set is identified by a representative, which is a member of the set. Usually, the root is chosen as the representative.

And look up the role of the set

Can solve connection problems, network node connection status, path problems.

For example, let’s say we have a number of towns, and there might be roads connecting different towns. Consider these towns as nodes and determine which towns are accessible to each other.

And look up the structure of the set

So if we look at the picture on the left, before the merger, we can say that D is connected to C, g is connected to E, and C is not connected to F.

If we want to connect between town B and town F, then we cannot connect b and F directly, but should add the root node of F under the root node of D, so here the root node of F is E, and the root node of B is A, that is, a becomes the parent node of E directly, as shown in the figure.

We can see that the logical structure of the set is a tree structure, but each node has only its own parent node.

And look up the operation of the set

Since a set lookup is primarily used to solve pathfinding problems, the corresponding set lookup requires an operation, isConnect, to determine whether two nodes are connected. If the two nodes can then be connected, the union operation is required to connect the two nodes

Check whether nodes are connected

boolean isConnect(Node p, Node q)

Based on the above analysis, we know that whether the nodes are connected or not is mainly compared with the root node, so we can judge whether the two nodes are the same as long as we get the root node of P and the root node of Q

Node merge

void unionElements(Node p, Node q)

As in the beginning of the analysis, we can first find the root node of p node, and then find the root node of Q node, so that the parents of the root node of Q node point to the root node of P node.

expand

Path to the compression

It’s possible that all the nodes are in one chain, and we can make the whole tree just two levels. As shown in figure

The next chapter will summarize AVL, red black trees, and B+ trees