This is the first day of my participation in the August More Text challenge

And look up set concepts

A disjoint-set is a data structure that literally means “join”, “search”, “set” and can simply be understood as: “merge”, “find”, “set”.

  • A set is a concept that we learned in high school, roughly understood as putting a pair of elements in the same place.
  • A merge, or union, means to join two sets together, for example: {1, 2, 3, 4} ∪ = {5, 6} {6} \ {1, 2, 3, 4 \} \ cup \ {5, 6 \} = \ {1, 2, 3, 4, 5, 6 \} {1, 2, 3, 4} ∪ = {5, 6} {6}
  • Find. What do I need to find? The lookup here means to find out if the element is in the collection and determine which collection the element is in

The data structure is:

  • You can merge collections
  • You can find out which collection the element is in
  • And lookup sets maintain a bunch of sets

What problem can be solved by searching sets simultaneously

According to the concept of look-up set, it can find whether an element is in a set or not, and can also merge two sets. It can solve some problems of path connection or node ownership. Such as:

  • Between the two lans originally independent of each other, now hope that the two Lans can be connected, and according to what rules connected, can use and look up the set of merge operation.
  • Check whether two hosts belong to the same LAN. You can also use and look up sets to determine. Some problems of node ownership can be solved by using and searching the set

And look up the implementation of the set

To solve the problem of network connection and network ownership, the following data structure is implemented based on the above example

  • First we define six nodes [1,2,3,4,5,6] representing six hosts
  • Connect the four hosts [1, 2, 3, 4] as shown in the following figure, named AS LAN A, where node 1 is the root node
  • Similarly, two nodes [5, 6] are also connected, named AS LAN B, where node 5 is the root node

  • So, hosts 1, 2, 3, and 4 belong to the same network, and hosts 5 and 6 belong to the same network and we put these two connected hosts in two sets. A = A = {1, 2, 3, 4}, {1, 2, 3, 4 \} A = {1, 2, 3, 4}, B = B = {5, 6} \ {5, 6 \} B = {5, 6} as shown

When the two Lans are set up, several questions are now being asked:

  • 1. Given the network connections of all hosts (for example, LAN A in the figure above: hosts 3 and 2 are interconnected, which can represent [3, 2], and 3 is the parent node), how many networks are needed to calculate?
  • 2. Check whether host 2 and host 4 reside on the same LAN.
  • 3. If host 4 and host 6 are not on the same LAN, how can they be connected?

Through these nodes step by step to achieve and search the set of data structure

Calculates the parent of each node (initialization)

Let’s start with the first question.

  • Let’s write out all the network connections. They are:[0, 2] [2, 1], [2, 3], [4, 5]Where: The first element of the array is the parent node and the second element is the child node.
  • Step 2: Use a one-dimensional array to record the parent nodes of all elements
  • When the parent nodes of all elements are recorded, the question of how many Lans there are is shifted to how many elements in the array do not have a parent node, since there is no parent node on the root node

Problem analysis finished, began to use code to achieve ~

// Number of hosts
const COMPUTER_NUM = 6; 
// Host connection
const connectLine = [
    // The first element of the array is the parent node, and the second element is the child node
    [0.2], [2.1], [2.3], [4.5]].// Initialize the array of record parent nodes and set the parent node of all data hosts to -1
// return [-1,-1,...,-1]
function init() {
    const arr = [];
    for (let i = 0; i < COMPUTER_NUM; i++) {
        arr.push(-1);
    }
    return arr;
}

// Record the parent of each node
function fillParentNode(parentArr, connectLine) {
    connectLine.forEach(item= > {
        parentArr[item[1]] = item[0]; })}// Find how many parent nodes there are
function findParentNum(parentArr) {
    return parentArr.filter(index= > index === -1).length;
}

/ / the parent node
const parentArr = init();
// Populates the parent node
fillParentNode(parentArr, connectLine);
console.log(parentArr); // [-1, 2, 0, 2, -1, 4]

// Query how many parent nodes there are
const parentNum = findParentNum(parentArr);
console.log(`there are ${ parentNum } networks`); // there are 2 networks
Copy the code

The default value is -1, and the parent value of all hosts is recorded in the array. Finally, the parent value is -1, which is the number of lans. Since then, the first problem has been solved.

Find the root node of both elements (find the node)

Continue to look at the second problem: determine whether host 2 and host 4 are in the same LAN, then look back and forth at the above figure, to determine whether the two hosts are in the same LAN, the problem can be translated into: whether the root nodes of the two hosts are the same, if they are in the same LAN.

Now that the solution has been found, let’s implement it in code and continue writing from the code already implemented above.

// Implement a method to find the root node
// @param {number} child Needs to find the element of the root node
// @param {array} parentArr element joins the relational array
// return number
function findRoot(child, parentArr) {
    let result = child;
    while(parentArr[result] ! = = -1) {
        result = parentArr[result]
    }
    return result;
}
// parentArr = [-1, 2, 0, 2, -1, 4]
const tag_2 = findRoot(1, parentArr);
const tag_4 = findRoot(3, parentArr);

console.log('Two hosts${ tag_2 === tag_4 ? 'in' : 'not' }In the same network); // The two hosts are on the same network

Copy the code

Stop and return by looking up the parent node until you find the root node that defines the element. If the root nodes of two hosts are the same, they are in the same set. Otherwise, they are not in the same set. The second problem was solved. The method we defined is now capable of initialization and sorting.

Two network connections (collection merge)

Now that we’ve solved the lookup problem, let’s look at the third problem:

  • If host 4 and Host 6 are not on the same LAN, how can the two networks be connected?

There are many ways to connect the two networks, and any host connected to the two networks will become a LAN. But we need to find the most efficient way to connect them, and if I connect them as shown below.

In the figure, the root node of LAN B is connected to the 4 nodes of LAN A. Although the two networks are connected, the height of the network increases (the height of the tree increases). The higher the height of the tree, the slower the search. This method of connection is not desirable in terms of efficiency.

To make the tree efficient, reduce the height of the tree. So you can connect from the root of both trees, as shown below

Now that you’ve implemented the method to find the root element from the second question, you just need to implement a method to combine the two root elements.

// Merge methods
// @param {number} x Host x
// @param {number} y Host y
// @param {parentArr} parentArr parent node
// return boolean
function unionSet(x, y, parentArr) {
    // Find the root node of both hosts
    const x_root = findRoot(x, parentArr);
    const y_root = findRoot(y, parentArr);
    
    // If two root nodes are the same, they are in the same network and do not need to be merged
    if (x_root === y_root) {
        return false;
    } else {
        parentArr[x_root] = y_root
        return true;
    }
}

unionSet(3.5, parentArr); // true
console.log(parentArr) // [4, 2, 0, 2, -1, 4]

Copy the code

At this point, the two networks have been successfully merged.

Optimized merge (path compression)

As you can see from the code, although it is connected from the root node, its time complexity is O(n) and performance may not be good when there are multiple networks that need to be connected. The main functions for searching sets are union and find. In the search method, we only need to find the root node of the element, and do not need to care about the parent node of the element, so we only need to remember the root node of the node when recording the node relationship, as shown in the following figure:

Now modify the code:

Merge method modification

// Merge methods
// @param {number} x Host x
// @param {number} y Host y
// @param {parentArr} parentArr parent node
// @param {rankArr} tree height array
// return boolean
function unionSet(x, y, parentArr, rankArr) {
    // Find the root node of both hosts
    const x_root = findRoot(x, parentArr);
    const y_root = findRoot(y, parentArr);
    
    // If two root nodes are the same, they are in the same network and do not need to be merged
    if (x_root === y_root) {
        return false;
    } else {
        // Merge height from small to large
        if (rankArr[ x_root ] > rankArr[ y_root ]) {
            parentArr[ y_root ] = x_root;
        } else if (rankArr[ x_root ] < rankArr[ y_root ]) {
            parentArr[ x_root ] = y_root;
        } else {
            // Randomly merge with the same height, and add 1 to the merged depth
            parentArr[ x_root ] = y_root;
            rankArr[y_root]++;
        }
        return true; }}Copy the code

The initialization method is modified

// Initialize the array of record parent nodes and set the parent node of all data hosts to -1
// return [-1,-1,...,-1]
function init() {
    const arr = [];
    const rankArr = [];
    for (let i = 0; i < COMPUTER_NUM; i++) {
        arr.push(-1);
        rankArr.push(0);
    }
    return [arr, rankArr];
}
Copy the code

Records method changes on the parent node

// Record the parent of each node
function fillParentNode(parentArr, rankArr, connectLine) {
    connectLine.forEach(item= > {
        unionSet(item[1], item[0], parentArr, rankArr)
    })
}
Copy the code

That’s it. Now let’s try it out

// Call merge method to merge the set of hosts 4 and 6
unionSet(3.5, parentArr, rankArr); // true
console.log(parentArr); // [4, 0, 0, 0, -1, 4]
Copy the code

The combined network is shown as follows:

Obviously, the height of the tree is reduced compared to optimization. And the time is order A (n).

The complete code


// Initialize the array of record parent nodes and set the parent node of all data hosts to -1
// return [-1,-1,...,-1]
function init() {
    const arr = [];
    const rankArr = [];
    for (let i = 0; i < COMPUTER_NUM; i++) {
        arr.push(-1);
        rankArr.push(0);
    }
    return [arr, rankArr];
}

// Record the parent of each node
function fillParentNode(parentArr, rankArr, connectLine) {
    connectLine.forEach(item= > {
        unionSet(item[1], item[0], parentArr, rankArr)
    })
}

// Find how many parent nodes
function findParentNum(parentArr) {
    return parentArr.filter(index= > index === -1).length;
}

// Implement a method to find the root node
// @param {number} child Needs to find the element of the parent node
// @param {array} parentArr element joins the relational array
// return number
function findRoot(child, parentArr) {
    let result = child;
    while(parentArr[result] ! = = -1) {
        result = parentArr[result]
    }
    return result;
}

// Merge methods
// @param {number} x Host x
// @param {number} y Host y
// @param {parentArr} parentArr parent node
// @param {rankArr} tree height array
// return boolean
function unionSet(x, y, parentArr, rankArr) {
    // Find the root node of both hosts
    const x_root = findRoot(x, parentArr);
    const y_root = findRoot(y, parentArr);
    // If two root nodes are the same, they are in the same network and do not need to be merged
    if (x_root === y_root) {
        return false;
    } else {
        // Merge height from small to large
        if (rankArr[ x_root ] > rankArr[ y_root ]) {
            parentArr[ y_root ] = x_root;
        } else if (rankArr[ y_root ] > rankArr[ x_root ]) {
            parentArr[ x_root ] = y_root;
        } else {
            // Randomly merge with the same height, and add 1 to the merged depth
            parentArr[ x_root ] = y_root;
            rankArr[y_root]++;
        }

        return true; }}/*
 * 测试
 */
// Number of hosts
const COMPUTER_NUM = 6; 
/ / the parent node
const [ parentArr, rankArr ] = init();
// Populates the parent node
fillParentNode(parentArr, rankArr, connectLine);

unionSet(3.5, parentArr, rankArr)
Copy the code

summary

This paper mainly introduces the concept, function and implementation of parallel search set. It is a data structure that can find out whether two nodes are in the same set.

And search set of main methods: merge and search

And the key to looking up sets is

  • You can merge collections
  • You can find out which collection the element is in
  • And lookup sets maintain a bunch of sets

Please point out any inaccuracies or errors in the comments section

reference

  1. Disjoint-set
  2. What can I do with the collection?