1. Construction Principle

Assuming n weights, the constructed Huffman tree has n leaf nodes. N weights are set as W1, W2… , wn, then the construction rule of Huffman tree is: (1) W1, W2… , wn is regarded as a forest with N trees (each tree has only one node); (2) Select the two trees with the lowest root weights in the forest and merge them as the left and right sub-trees of a new tree, and the root weights of the new tree are the sum of the left and right sub-root weights of the tree; (3) Delete the two selected trees from the forest, and add the new tree to the forest; (4) Repeat steps (2) and (3) until there is only one tree left in the forest, which is the Huffman tree obtained.

Obviously, for n weights, constructing Huffman tree requires n-1 merging times, and the total number of tree nodes formed is 2n-1.

2 Code implementation

According to the construction principle of Huffman Tree, for the convenience of implementation, we use an array to store each node, named Tree;

2.1 Node Structure

A node has the following structures: weight: weight parent: position of the parent node in the array (used to determine whether the node has participated in the merge) lChild, rchild: position of the left and right child nodes in the array code: the code assigned to the node

// Node structure
struct HNode{
	float weight;  / / has a weight
	int parent;  // The parent node determines whether a node has been added to the Huffman tree
	int lchild, rchild;  // Left child, right child
	string code;  // Record the node encoding
	HNode() {
		weight = 0;
		parent = lchild = rchild = - 1;
		code = "";
	}
	HNode(float w) {
		weight = w;
		parent = lchild = rchild = - 1;
		code = ""; }};Copy the code

2.2 Key member function analysis

2.2.1 Constructors

To determine whether a node has been added to the Huffman tree, use the value of the parent field. The initial value of parent is -1. When a node is added to the tree, the value of the parent field is the subscript of the parent node in the array.

When constructing Huffman Tree, the leaves with n weights are stored in the first n components of the array Haftree, and then the two subtrees are continuously merged into one subtree, and the root nodes of the new subtree are sequentially stored after the first n components of the array Tree.

Hafman algorithm is described in pseudocode as follows: 1. Array hafTree is initialized, and parents and left and right children of all array elements are set to -1. 2. Set the weights of the first n elements of array HafTree to the given weights; 3. N-1 merge: 3.1 Select two root nodes with the lowest weights in the binary tree set, whose subscripts are i1 and i2 respectively; 3.2 Merge binary trees I1 and I2 into a new binary tree K.

The constructor input is an array of weights and an array length

	// Construct the Huffman tree, enter the weight array and the array length n
	void Create(float* a, int n) {
		TreeSize = 2 * n - 1;  // Determine the number of nodes of Huffman tree
		Tree = new HNode[TreeSize];
		// The weights are stored in the first n values of the Huffman tree array
		for (int i = 0; i < n; i++)
		{
			Tree[i].weight = a[i];
		}

		// Start n-1 merge operation to form a Huffman tree
		int s1, s2;
		int nextPos = (TreeSize + 1) / 2; // Record an insertion position, starting with n
		for (int i =0; i < (TreeSize- 1) /2; i++) {
			SelectTwoMin(nextPos, s1, s2); // Find the two points in the array with the least weight
			// Assign weights to children of new nodes in the Tree
			Tree[nextPos].lchild = s1;
			Tree[nextPos].rchild = s2;
			Tree[nextPos].weight = Tree[s1].weight + Tree[s2].weight;
			// Assign a parent to both nodes (meaning both nodes have been added to the Huffman tree)
			Tree[s1].parent = Tree[s2].parent = nextPos;
			nextPos++;  // Move the insertion position back one bit}}Copy the code

2.2.2 Find the two smallest points

First, find the first two uncombined points (parent=-1), and compare the weights with the uncombined points to determine the final minimum two points, and ensure that the lower weight in the front;

	// Find the two points in the array with the least weight
	void SelectTwoMin(int nextPos,int &s1, int &s2) 
	{
		// find the first node not added to the Huffman tree, marked s1
		int i=0;
		while(Tree[i].parent ! =- 1) {
			i++;
		}
		// Find the second node that has not been added to the Huffman tree starting from the last digit of s2
		s1 = i;
		int j = i + 1;
		while(Tree[j].parent ! =- 1) {
			j++;
		}
		s2 = j;
		// Adjust the weight of s1 and S2 to ensure that the weight of the corresponding point of S1 is smaller, so that when looking for smaller points, only s2 can be compared
		KeepOrder(s1, s2);

		// Start with the last bit of s2 and compare with s1 and s2 to determine the two points in the array with the least weight
		for (int k = j + 1; k <nextPos; k++) 
		{
			if (Tree[k].parent == - 1) {
				if (Tree[k].weight < Tree[s2].weight) {
					s2 = k;
					// Find points less than s2, and compare them with s1 to determine the order of sizeKeepOrder(s1, s2); }}}}// Ensure that the weight of the node corresponding to n1 is less than n2
	void KeepOrder(int& n1,int& n2) {
		if (Tree[n1].weight > Tree[n2].weight)
		{
			floattmp = n1; n1 = n2; n2 = tmp; }}Copy the code

2.2.3 Code and output the constructed Huffman tree

Hierarchical traversal is used here, with a queue; The code of the child node is the parent’s code plus ‘1’/’0′ (depending on whether the left child is the parent’s right child, which is ‘1’); Output only the encoding of the leaf node. It can also output the first n node codes of Tree array directly after the coding is completed.

	// Encode and output the nodes in the tree
	void PrintCoder(a) 
	{
		cout << "index weight code" << endl;
		queue<int> codeQueue;  // Save only the number
		int tmp;
		codeQueue.push(TreeSize - 1);
		while(! codeQueue.empty()) { tmp = codeQueue.front(); codeQueue.pop();if (Tree[tmp].lchild ==- 1 && Tree[tmp].rchild == - 1) {  
				// Direct output of leaf nodes
				cout << setw(5) << tmp << setw(6) << Tree[tmp].weight << setw(10) << Tree[tmp].code << endl;	
			}
			else {// Because Huffman trees have only N0 and N2
				codeQueue.push(Tree[tmp].lchild);
				Tree[Tree[tmp].lchild].code = Tree[tmp].code + "1";  // Encode the left child

				codeQueue.push(Tree[tmp].rchild);
				Tree[Tree[tmp].rchild].code = Tree[tmp].code + "0"; // Encode the right child}}}};Copy the code

2.2.4 Output all information of all nodes of the Huffman tree (except coding);

	// Outputs the completed Huffman tree
	void PrintTree(a) {
		cout << "index weight parent lchild rchild" << endl;
		for (int i = 0; i < TreeSize; i++) {
			cout << setw(5) << i;
			cout << setw(6) << Tree[i].weight;
			cout << setw(6) << Tree[i].parent;
			cout << setw(6) << Tree[i].lchild;
			cout << setw(6) << Tree[i].rchild;
			cout << endl; }}Copy the code

2.2 Complete Code

#pragma once
#include <iostream>
#include <iomanip>
#include <Queue>
#include <string>
using namespace std;
// Get the length of the array
template <class T>
int getArrayLen(T& array)
{
	return (sizeof(array) / sizeof(array[0]));
}

// Node structure
struct HNode{
	float weight;  / / has a weight
	int parent;  // The parent node determines whether a node has been added to the Huffman tree
	int lchild, rchild;  // Left child, right child
	string code;  // Record the node encoding
	HNode() {
		weight = 0;
		parent = lchild = rchild = - 1;
		code = "";
	}
	HNode(float w) {
		weight = w;
		parent = lchild = rchild = - 1;
		code = ""; }};class HuffmanTree {
	HNode* Tree;  // Huffman tree node array header
	int TreeSize; TreeSize= n(number of characters to encode) * 2-1
public:
	HuffmanTree() {
		Tree = NULL;
		TreeSize = 0;
	}
	~HuffmanTree() {
		delete[] Tree;
	}

	// Construct the Huffman tree, enter the weight array and the array length n
	void Create(float* a, int n) {
		TreeSize = 2 * n - 1;  // Determine the number of nodes of Huffman tree
		Tree = new HNode[TreeSize];
		// The weights are stored in the first n values of the Huffman tree array
		for (int i = 0; i < n; i++)
		{
			Tree[i].weight = a[i];
		}

		// Start n-1 merge operation to form a Huffman tree
		int s1, s2;
		int nextPos = (TreeSize + 1) / 2; // Record an insertion position, starting with n
		for (int i =0; i < (TreeSize- 1) /2; i++) {
			SelectTwoMin(nextPos, s1, s2); // Find the two points in the array with the least weight
			// Assign weights to children of new nodes in the Tree
			Tree[nextPos].lchild = s1;
			Tree[nextPos].rchild = s2;
			Tree[nextPos].weight = Tree[s1].weight + Tree[s2].weight;
			// Assign a parent to both nodes (meaning both nodes have been added to the Huffman tree)
			Tree[s1].parent = Tree[s2].parent = nextPos;
			nextPos++;  // Move the insertion position back one bit}}// Find the two points in the array with the least weight
	void SelectTwoMin(int nextPos,int &s1, int &s2) 
	{
		// find the first node not added to the Huffman tree, marked s1
		int i=0;
		while(Tree[i].parent ! =- 1) {
			i++;
		}
		// Find the second node that has not been added to the Huffman tree starting from the last digit of s2
		s1 = i;
		int j = i + 1;
		while(Tree[j].parent ! =- 1) {
			j++;
		}
		s2 = j;
		// Adjust the weight of s1 and S2 to ensure that the weight of the corresponding point of S1 is smaller, so that when looking for smaller points, only s2 can be compared
		KeepOrder(s1, s2);

		// Start with the last bit of s2 and compare with s1 and s2 to determine the two points in the array with the least weight
		for (int k = j + 1; k <nextPos; k++) 
		{
			if (Tree[k].parent == - 1) {
				if (Tree[k].weight < Tree[s2].weight) {
					s2 = k;
					// Find points less than s2, and compare them with s1 to determine the order of sizeKeepOrder(s1, s2); }}}}// Ensure that the weight of the node corresponding to n1 is less than n2
	void KeepOrder(int& n1,int& n2) {
		if (Tree[n1].weight > Tree[n2].weight)
		{
			floattmp = n1; n1 = n2; n2 = tmp; }}// Outputs the completed Huffman tree
	void PrintTree(a) {
		cout << "index weight parent lchild rchild" << endl;
		for (int i = 0; i < TreeSize; i++) {
			cout << setw(5) << i;
			cout << setw(6) << Tree[i].weight;
			cout << setw(6) << Tree[i].parent;
			cout << setw(6) << Tree[i].lchild;
			cout << setw(6) << Tree[i].rchild;
			cout << endl; }}// Encode and output the nodes in the tree
	void PrintCoder(a) 
	{
		cout << "index weight code" << endl;
		queue<int> codeQueue;  // Save only the number
		int tmp;
		codeQueue.push(TreeSize - 1);
		while(! codeQueue.empty()) { tmp = codeQueue.front(); codeQueue.pop();if (Tree[tmp].lchild ==- 1 && Tree[tmp].rchild == - 1) {  
				// Direct output of leaf nodes
				cout << setw(5) << tmp << setw(6) << Tree[tmp].weight << setw(10) << Tree[tmp].code << endl;	
			}
			else {// Because Huffman trees have only N0 and N2
				codeQueue.push(Tree[tmp].lchild);
				Tree[Tree[tmp].lchild].code = Tree[tmp].code + "1";  // Encode the left child

				codeQueue.push(Tree[tmp].rchild);
				Tree[Tree[tmp].rchild].code = Tree[tmp].code + "0"; // Encode the right child}}}};Copy the code

3 test code and output

    /// Huffman tree
    HuffmanTree hTree;
	float x[] = { 5.29.7.8.14.23.3.11 };
	hTree.Create(x, getArrayLen(x));// Avoid the error of entering the wrong array length
	hTree.PrintTree();
	hTree.PrintCoder();
Copy the code

Correct output:

4 Reference Materials

1. The Huffman tree algorithm and c + + implementation: www.cnblogs.com/smile233/p/… 2. Baidu Encyclopedia · Huffman tree: baike.baidu.com/item/ Huffman tree /2… 3. Data structure: Huffman tree (Huffman tree) principle and C++ implementation: blog.csdn.net/weixin_4142…