The concept is introduced


Consider the definition of the word “relative” : “one who is related by blood or marriage.” You and your girlfriend’s family are not relatives per se. Once you get married, the two families become a family. Your family, including you, automatically become relatives with your girlfriend and her family. And check the set is a tree data structure, to deal with some merging disjoint set and query, “marriage” is in fact in the above example and check set merge Let’s demo and check the set under normal operations, we default to create six elements, the six elements we can as a six mutually disjoint sets



















Implementation method


Initialize (make_set)

We can regard the parallel search set as a forest composed of many trees. The nodes connected in each tree represent the same set. The root node of the parent in the tree is regarded as the representative of the set. Initially, we used a parent array to store the subscripts of all nodes. Since each set is disjoint by default, we made each node’s parent point to itself, resulting in a forest of N self-rooted trees.




The initialization structure of the parent array looks like this:




The code to initialize and look up the collection:

void make_set() {for(int i=0; i<N; i++){ parent[i] = i; // The parent of I points to itself}}Copy the code
Union (Union)

Union(a,b) will combine the set of a with the set of B. In the realization of data structure, only the root node of B points to the root node of A, or the root node of A points to the root node of B, and the former is used by default in this paper. Suppose we now want to combine 0,2,4 as a set, 1,3 as a set, and 5 as a set alone, then the process of operation may be as follows:

To combine 0 and 2, point 2 to 0.



















And then, if we want to continue the Union(5,3), we can get the root of the tree at node 3, 1, and make it point to 5. However, the height of such a tree is higher than that of a tree with 5 pointing to 1. With the increase of the size of the parallel search set, the tree will have many unnecessary heights, which will lead to more time-consuming query of the parallel search set.

In order to make the overall height of the merged tree relatively lower, we merge the shorter tree into the taller tree at each merge, and this optimization will show up later in the code.

Finally, if we want Union(2,3), the default is to point the root of “3” “1” to the root of “2” “0” since both are at the same tree height.







Before implementing the Union function, we add a rank[N] array to record the height. By default, all ranks are set to 0, and the rank values change as the Union sets are merged. Here is the code for Union:

void union_set(int a,int b) {
    if (a==b) return; // same int root_a = find_root(a); Int root_b = find_root(b); // find the root of bif (root_a == root_b)  return; Int higher_root = rank[root_a]>rank[root_b]? root_a:root_b; Int lower_root = rank[root_a]<rank[root_b]? root_a:root_b; // Select the lower treeifParent [root_b] = root_a; (higher_root == lower_root) {parent[root_b] = root_a; //root_b.parent points to root_a (default operation) ++rank[root_a]; // Height +1}else{ parent[lower_root] = higher_root; // Shorter trees point to taller trees and do not change the overall height}}Copy the code
Find

Querying the collection of an element is simple, and since the parent array records the parent of each element, we just need to recursively backtrack.




After find_root(5) is executed, 0 is traced upward along the red line, and after find_root(2) is executed, 0 is also found, indicating that 5 and 2 belong to the same set. However, after find_root(7) is executed, 6 is found along the red line, so 7 and element 5 and 2 do not belong to the same set. The implementation code is given below:

int find_root(int node) {
    if (parent[node] == node) return node;
    return find_root(parent[node]);
}
Copy the code
Path Compression

The find_root(int node) function returns the representation of ——, the root of the tree in which the element is located. In this process, we can point the current element directly to the root, reducing the height of the tree, so that the query speed is improved. If find_root(3) is executed, the tree structure will change to the following structure:




The code implementation changes are minimal:

int find_root(int node) {
    if (parent[node] == node) returnnode; parent[node] = find_root(parent[node]); // point to the rootreturn parent[node];
}
Copy the code
Complete code + previous examples

The code and logic of the parallel collection are very lean and in my opinion very elegant data structure.

#include<iostream>
#include <stdio.h>
#define N 10000+10

int parent[N];
int rank[N];
void make_set() {for(int i=0; i<N; i++){ parent[i] = i; rank[i] = 0; } } int find_root(int node) {if (parent[node] == node) return node;
    parent[node] = find_root(parent[node]);
    return parent[node];
}
void union_set(int a,int b) {
    if (a==b) return; // same int root_a = find_root(a); Int root_b = find_root(b); // find the root of bif (root_a == root_b)  return; Int higher_root = rank[root_a]>rank[root_b]? root_a:root_b; Int lower_root = rank[root_a]<rank[root_b]? root_a:root_b; // Select the lower treeifParent [root_b] = root_a; (higher_root == lower_root) {parent[root_b] = root_a; //root_b.parent points to root_a (default operation) ++rank[root_a]; // Height +1}else{ parent[lower_root] = higher_root; // Shorter trees point to taller trees and do not change the overall height}} intmain(){
    make_set();
    union_set(0, 2);
    union_set(2, 4);
    union_set(1, 3);
    union_set(5, 3);
    union_set(2, 3);
    find_root(3);
    find_root(5);
    for(int i=0; i<6; i++) {printf("%d ",parent[i]); // output all points to}}Copy the code

Refer to the article


Union-find Algorithms Wikipedia

Recommended topics

Pat-1118 Birds in Forest– code


This article has been synced to personal blog: Elegant Data Structures – and looked up collections