Writing in the front

All the lists, stacks, queues and so on that we talked about are linear, one-to-one relationships. A tree is a data structure with a one-to-many relationship. For example, we often say that a class diagram of Wuhan, Hubei province, and Changsha, Hunan province, is similar to an inverted tree.

What is a tree

A tree is a data structure, consisting of n nodes with a finite set of hierarchical relationships.

Basic terms for trees

Node: Each data element in the tree is A node (A, B…)

Degree of A node: Number of subtrees of A node (A’s subtrees are B and C)

Tree degree: The maximum degree of all nodes in A tree (tree A and tree C are both 2 degrees)

Leaf node: node (D, E, F) whose degree is 0

Node hierarchy: the tree starts from the root, the first layer where the root resides, and the second layer where the child nodes of the root reside (A first layer, BC second layer).

Ordered tree and disordered tree: Subtrees have left and right order (for example, the left is less than the right), and vice versa

The characteristics of the tree

  1. The subtrees don’t intersect
  2. In addition to the root node, each node has only one parent
  3. A tree of N nodes has only n-1 edges

What is a binary tree

An ordered tree in which the degree of nodes in the tree does not exceed 2.

Features of binary trees:

  1. In a binary tree, the number of nodes at layer I is at most 2i-1

  2. A binary tree of depth K has at most 2k-1 nodes

  3. For any binary tree, the number of terminal nodes (the number of leaf nodes) is N0, and the number of nodes with degree 2 is n2, then n0=n2+1,

    That is, node n0, whose degree is 0, is always one more than node n2, whose degree is 2.

Full binary tree

In a binary tree, the degree of every node except the leaf node is 2, then this binary tree is a full binary tree.

A full binary tree with n nodes has a depth of log2(n+1).

Perfect binary tree

In a binary tree, only the bottom two levels of nodes can have a degree less than two, and the bottom two levels of leaves are concentrated in some positions to the left. Such binary trees are called complete binary trees. Leaf nodes can only appear in the lowest level and the second level, and the lowest level of leaf nodes are concentrated in the left part of the tree. Obviously, a full binary tree must be a complete binary tree, and a complete binary tree is not necessarily a full binary tree.

Huffman tree (optimal binary tree)

Given a binary tree as follows:

Path: A path from one node to another in a tree, called a path. See the path from the root node to A in the figure above.

Path length: In a path, the length of the path increases by 1 for each node passed. In the preceding figure, the path from the root node to node C is 3 in length.

Node weight: Each node is assigned a new value. For example, a has the right 7 and B has the right 5.

Node weight path strength: the product of the length of the path from the root node to the node and the weight of the node. For example, the weighted path length of B is 2 x 5=10.

Tree weighted path length: sum of weighted path lengths of all leaf nodes in the tree. Usually referred to as “WPL”. The weighted path length of the tree shown in the figure is: WPL = 7 * 1 + 5 * 2 + 2 * 3 + 4 * 3

What is a Huffman tree

Construct a binary Tree (each node is a leaf node and has its own weight), and the weighted path length of the Tree reaches the minimum, which is called the optimal binary Tree, also known as Huffman Tree.

Coding problem

Scenario: Given a string of 58 characters, consisting of the following seven characters: A, B, C, D, E, F, and G. These seven characters occur in different frequencies. How to encode these seven characters?

If standard equal-length ASCII encoding is used: 58 × 8 = 464 bits

Code with a binary tree

If 0 and 1 are used to represent the left and right branches, the following tree can be obtained by taking out the four characters with the highest frequency above:

It can be seen from the figure above that there are three possible cases for the characters with the encoding of 0100 to be represented. Therefore, the distribution of the above nodes is ambiguous. How to avoid ambiguity? You just need to make every node a leaf node.

Huffman tree diagram structure

In the above way, we combine the above characters in two in order of frequency, and all of them are leaf nodes. The final structure diagram is as follows:

Therefore, we find that each node is a leaf node, and the final encoding of the character is:

Therefore, the encoding length is:

10×3 + 15×2 + 12×2 + 3×5 + 4×4 + 13×2 + 1×5 = 146 bits

The code structure

Public class HuffmanTree {public static class Node<E> {public static class Node<E>... E data; // int weight; // leftChild Node leftChild; // You child Node rightChild; public Node(E data, int weight) { this.data = data; this.weight = weight; } public String toString() { return "Node[" + weight + ",data=" + data + "]"; }} public static Node createHuffmanTree(List<Node> nodeList) {while (nodelist.size () > 1) sort(nodeList); Nodelist.get (0); nodelist.get (0); nodelist.get (0); Node right = nodeList.get(1); Node<Node> parent = new Node<>(null, left.weight + right.weight); Parent.leftchild = left; parent.rightChild = right; // Remove the smallest node nodelist.remove (0); // Remove the second smallest nodelist.remove (0); Nodelist.add (parent); } // Return nodelist.get (0); } /** * @param nodeList */ public static void sort(List<Node> nodeList) {if (nodelist.size () <= 1) {return; } for (int i = 0; i < nodeList.size(); i++) { for (int j = 0; j < nodeList.size() - 1; If (nodelist.get (j + 1).weight < nodelist.get (j).weight) {Node temp = nodelist.get (j + 1); nodeList.set(j + 1, nodeList.get(j)); nodeList.set(j, temp); }}}} /** * print the Huffman tree from left to right * Public static void printTree(Node root) {if (root.leftChild! = null) {system.out.println (" leftChild: "+ root.leftchild); printTree(root.leftChild); } if (root.rightChild ! = null) {system.out.println (" rightChild: "+ root.rightchild); printTree(root.rightChild); Public static void main(String[] args) {List<Node> nodes = new ArrayList<Node>(); // Add nodes to the list of nodes. Add (new Node("a", 10)); nodes.add(new Node("b", 15)); nodes.add(new Node("c", 12)); nodes.add(new Node("d", 3)); nodes.add(new Node("e", 4)); nodes.add(new Node("f", 13)); nodes.add(new Node("g", 1)); Node root = createHuffmanTree(nodes); printTree(root); }}Copy the code

The test results

Left sub-node: Node[25,data=null] Left sub-node: Node[12,data=c] Right sub-node: Node[13,data=f] Right sub-node: Node[33,data=null] Left sub-node: Node[15,data=b] Right sub-node: Node[15,data=b] Node[18,data=null] Left child Node: Node[8,data=null] Left child Node: Node[4,data=e] Right child Node: Node[4,data=null] Left child Node: Node[1,data=g] Right child Node: Node[8,data=null] Left child Node: Node[4,data=null] Right child Node: Node[1,data=g] Node[3,data=d] Right child Node: Node[10,data=a]Copy the code

conclusion

This chapter is mainly to have a basic concept of the tree, understand the common characteristics of the tree, will continue to talk about binary sorting tree, binary balanced tree, data structure is relatively boring, but stick to it will have a harvest, whether it is to the algorithm or to see the source code has a lot of improvement.

reference

  • Data structure tree, tree storage structure detailed