Zero basis and search set algorithm

And check collection is my summer vacation from the master there to learn a move, think is really too subtle design. A class of problems that I could not solve before can be solved in such a simple and efficient way. Sorry party for not sharing. -Blair: Oh, my god, it’s none of my business. Do I know you well?

Take a look at an example, Hangdian 1232 unblocked project \

First you are given a map of several towns, each of which can be regarded as points, and then you are shown which pairs of towns are directly connected by roads. The last thing to solve is the connectivity of the whole picture. For example, you are given two random points and asked to determine whether they are connected or not, or you are asked how many connected branches there are in the whole picture, which is divided into independent pieces. For example, to solve the problem of unimpeded engineering, how many roads need to be built is essentially to find how many connected branches. If it is a connected branch, it means that the points on the whole graph are connected, and there is no need to build roads; If there are two connected branches, just build another road, pick a point from each branch, connect them, then all the points are connected; If there are 3 connected branches, then only 2 more roads…

This is illustrated with the following set of data input data

4, 2, 1, 3, 4, 3

The first row tells you that there are 4 points and 2 paths. The next two lines tell you that there is a path between 1 and 3, and there is a path between 4 and 3. So the whole picture is divided into one-three-four and two parts. As long as there is another way to connect 2 to any other point, the smooth project is realized, then the output result of this group of data is 1. Ok, now program to implement this function, there are hundreds of towns, I don’t know how many roads, and there may be a loop. What is to be done?

I also won’t ah before, since used and check collection, hi, the effect is really good! My whole family uses it!

The lookup set consists of an integer array and two functions. The array pre[] records what the leading point of each point is. The function find is to find and join is to join.

int pre[1000 ];

int find(int x) // Find the root node

{

int r=x;

while ( pre[r ] ! = r ) //Return the root node r

r=pre[r ];

int i=x , j ;

while( i ! = r ) //Path to the compression

{

*j = pre[ i ]; *// Record the value of the temporary variable j before changing the parent

**pre[ i ]= r ; **// Change the parent node to root

i=j;

}

return r ;

}

Void join(int x,int y) void join(int x,int y)

// If they are connected, do not care. // If they are not connected, merge them together.

{

int fx=find(x),fy=find(y);

if(fx! =fy)

pre[fx ]=fy;

}

To explain and look up how sets work, I’ll take a more loving example. In other words, there are thousands of heroes scattered on rivers and lakes. They had no proper profession, and walked about all day with their swords on their backs, and when they met a stranger, they were bound to fight. But one advantage of the warriors is loyalty, never hit their friends. And they believe that “a friend of a friend is a friend of mine.” As long as they can be connected through friendship, no matter how many twists and turns they take, they are considered one of us. In this way, the rivers and lakes form a community, connected by the friendship relationship between the two. People who are not in the same community, however, can not be connected through friendship, so it is safe to fight to death. But how do two people who don’t know each other know if they belong in a group of friends?

We can choose a famous person from each circle of friends as the representative of the circle, so that each circle can be named like “Zidane friends team”, “Ronaldo friends Team”… By checking with each other whether their captain is the same person, the two men can determine whether they are friends or enemies.

But there is still a problem. Chivalrous men only know who their immediate friends are. Many of them don’t know the captain at all. Are you the captain?” In this way, the captain is embarrassed, inefficient and potentially trapped in an infinite loop. So the captain gave the order to regroup. All members of the team implement the hierarchical system, forming a tree structure, my captain is the root node, the following are the second-level team members, third-level team members. Everyone just needs to remember who their boss is. When it comes to judging friends and enemies, you just have to go up one level to the top, and you can find out who the captain is in a short time. Since all we care about is whether or not two people are connected, it doesn’t matter how they are connected, what the internal structure of each circle is, or even who the captain is. So we can let the captain regroup at will, as long as we don’t mess up our friendship. Thus, the clan came into being.

I3.6. Cn/CVBNM / 6 f/ec…

Let’s look at and look up the implementation of the set. int pre[1000]; This array is a record of who each warrior is superior to. The heroes are numbered from 1 or 0 (depending on the meaning of the question). If pre[15]=3, it means that the superior of No. 15 is No. 3. If a man’s boss is himself, that means he’s in charge. End of search. For example, Ouyang Feng, his superior was himself. Everyone only thinks of their superiors. For example, Hu Qingniu only knew that his superior was Yang Zuozuo. Who is Zhang Wuji? Don’t know! The only way to find out who’s in charge is to go through the ranks. The find function is used to find the owner, and the meaning is clear enough (path compression algorithms will be discussed later).

Int find(int x) //

{

int r=x; // delegate r to the manager

while (pre[r ]! =r) // If r’s superior is not r himself (i.e., he is not the master of the found warrior)

r=pre[r ] ; // R then goes to his superior until he finds the boss.

return r ; // ~~~ ~

}

Consider the join function, which joins a line between two points so that all points on the two plates can communicate with each other. That’s easy to do on a graph, just draw a line. But now we are using and lookup set to describe the situation in wulin, there is only one pre[] array, how to implement? For example, suppose the situation in wulin is as shown in the picture. Virtual bamboo monk and Zhou Zhi if MM is my favorite two characters, their ultimate boss are respectively xuanci abbot and extinction teacher too, that is obviously two camps. I didn’t want them to fight each other, so I said to them, “You two, be good friends.” They agreed on my behalf. This agreement was no small matter, and the entire Shaolin and Emei faction could not fight. How could such a major change be realized and how much would be changed? In fact, it was very simple. I said to The abbot xuanci, “Master, please change your superior to the exterminator. In this way, the ultimate boss of all the original staff of the two groups is teacher Tai, and that is also playing a ball! Anyway, all we care about is the connectivity, not the internal structure.” Xuan Ci 1 listen to affirmation fire big: “I depend, with what be I become her under hand ah, how not turn over? I protest!” The protest is invalid. It was arranged by god. Anyway, it’s the same who joins, so I just randomly assigned one. Okay, so that makes sense, right?

Void join(int x,int y) // I want to make friends with Zhou Zhi Ruo

{

int fx=find(x),fy=find(y); // The boss of virtual bamboo is Xuanci, the boss of Zhi Re MM is extinction

if(fx! =fy) // Xuanci and Extermination are obviously different people

pre[fx ]=fy; // The abbot had no choice but to become a subordinate of the teacher

}

Let’s look at the path compression algorithm. The process of establishing a faction is to use the join function to connect two people, and who should be under whom is completely random. I have no idea what the final tree structure will look like. So the efficiency of the search will be relatively low. The ideal situation is that everyone’s immediate superior is the master, there are two levels of structure, you only need to search once to find the master. Even if you can’t do it completely, you’d better get as close as you can. This gives rise to the path compression algorithm. Consider this scenario: two unknown heroes meet and want to know if they can punch each other. So hurriedly call his superior: “are you in charge?” The superior say: “I am not ah, my superior is who who who, you ask him to see.” All the way to ask, the original two final boss is east factory Cao father-in-law. “Ah ah, it is ji ji, Xi Li xi Li, in the lower third battalion six groups of white gourd children!” “Nice to meet you, fairy dog Tail flower, unit 18, Lower ninth Camp!” Two people happily hand in hand to drink. “And so on and so on, two students please stay, there are things not finished!” I called to them. “Oh, and do path compression.” Two people wake up. The white-faced calabash baby called his superior six group leader: “Group leader, I have checked, and the boss of his apprentices is Father Cao. It is better for us to meet and worship in cao’s father-in-law’s hand, so as to save the level is too low and find the palm hemp ring later.” “Oh, fair enough.” White face gourd baby then call the three battalion commander that visited just now… Fairy dog Tail flower did the same thing. In this way, all the characters involved in the query are gathered under the direct leadership of Cao’s father-in-law. Each query has been optimized, so the number of layers of the whole door tree will be maintained at a relatively low level. Path compression code, understand very good, do not understand it does not matter, directly copy on the line. Anyway, that’s what it does.

I3.6. Cn/CVBNM / 60/98…

hdu1232

1 #include<iostream> 2 using namespace std; 3 int pre[1050]; 4 bool t[1050]; 5 int Find(int x) 6 {7 int r=x; 8 while(r! =pre[r]) 9 r=pre[r]; 10 11 int i=x,j; 12 while(pre[i]! =r) 13 { 14 j=pre[i]; 15 pre[i]=r; 16 i=j; 17 } 18 return r; 19 } 20 void mix(int x,int y) 21 { 22 int fx=Find(x),fy=Find(y); 23 if(fx! =fy) 24 { 25 pre[fy]=fx; 26 } 27 } 28 int main() 29 { 30 int N,M,a,b,i,j,ans; 31 while(scanf("%d%d",&N,&M)&&N) 32 { 33 for(i=1; i<=N; Pre [I]= I; 35 36 for(i=1; i<=M; 37 {38 scanf("%d%d",&a,&b); 39 mix(a,b); 40 } 41 memset(t,0,sizeof(t)); 42 for(i=1; i<=N; T [Find(I)]=1; 45 } 46 for(ans=0,i=1; i<=N; i++) 47 if(t[i]) 48 ans++; 49 50 printf("%d\n",ans-1); 51 52 } 53 return 0; 54}Copy the code

#include

using namespace STD; int pre[1000]; int find(int x) { int r=x; while (pre[r ]! =r) r=pre[r ]; int i=x; int j; while(i! =r) { j=pre[i ]; pre[i ]=r; i=j; } return r; } int main() { int n,m,p1,p2,i,total,f1,f2; While (scanf(“%d”,&n) &&n) // read n, if n is 0, end {// At first there are n towns with no roads // Then there are n-1 roads to connect them together. Total =n-1; For (I =1; for(I =1; i<=n; i++) { pre[i ]=i; If (“%d”,&m); While (m–) {scanf(“%d %d”,&p1,&p2); while(“%d %d”,&p1,&p2); f1=find(p1); f2=find(p2); // The number of branches is reduced by 1, and the number of roads to build is reduced by 1. =f2) { pre[f2 ]=f1; total–; Printf (“%d\n”,total); printf(“%d\n”,total); } return 0; }

About dynamic connectivity

Let’s look at a graph to see what dynamic connectivity is:

Suppose we input a set of pairs of integers (4, 3) (3, 8) and so on in the figure above. Each pair of integers represents that the two points/sites are connected. As data continues to be input, the connectivity of the entire graph will change, as can be clearly seen from the figure above. Meanwhile, points/sites that are already connected are directly ignored, such as (8, 9) in the figure above.

Application scenarios of dynamic connectivity:

Network connection judgment:

If two integers in each pair represent one network node, the pair is used to indicate that the two nodes need to be connected. After establishing a dynamic connection graph for all pairs, the need for wiring can be reduced as little as possible because the two nodes that are already connected will be directly ignored.

Variable names are isotropic (similar to the concept of Pointers) :

In a program, multiple references can be declared to refer to the same object. In this case, we can determine which references actually refer to the same object by establishing a dynamic connection graph between the declared references and the actual objects.

Modeling the problem:

When modeling a problem, we should try to think clearly about what the problem is to be solved. Because the data structures and algorithms chosen in the model obviously vary from problem to problem, for the dynamic connectivity scenario, the questions we need to solve might be:

Give two nodes and judge whether they are connected. If they are connected, there is no need to give a specific path
Give two nodes, judge whether they are connected, if connected, need to give the specific path

In terms of the above two problems, although the difference of specific path can only be given, this difference leads to different selection algorithms. This paper mainly introduces the first case, that is, the union-find algorithm with specific path is not required, while the algorithm based on DFS can be used in the second case.

Modeling ideas:

The simplest and intuitive assumption is that for all connected nodes, we can assume that they belong to one group, so disconnected nodes must belong to different groups. With the Pair input, we need to first determine whether the two input nodes are connected. How do you tell? According to the above assumption, we can determine which group they belong to, and then see if the two groups are the same. If they are the same, then the two nodes are connected, and vice versa. For simplicity, we represent all nodes as integers, that is, integers from 0 to n-1 for N nodes. Before processing the input Pair, each node must be isolated, that is, they belong to different groups. This relationship can be represented by an array. The index of the array is the integer representation of the node, and the corresponding value is the group number of the node. This array can be initialized as:

for(int i = 0; i < size; i++)  
    id[i] = i;   
Copy the code

For node I, its group number is also I.

After initialization, there are several possible operations on the dynamically connected graph:

Example Query the group to which a node belongs

The value at the corresponding position in the array is the group number

Check whether two nodes belong to the same group

Get the group numbers of the two nodes respectively, and then judge whether the group numbers are equal

Join two nodes so that they belong to the same group

Obtain the group numbers of the two nodes respectively. If the group numbers are the same, the operation ends. If not, change the group number of one node into the group number of another node

Gets the number of groups

Initialize to the number of nodes, and then decrement by 1 after each successful connection

API

We can design the API accordingly:

Note that the nodes are represented by integers. If you need to represent nodes with other data types, such as strings, you can use a hash table to map a String to the Integer type you need here.

Connected calls find twice for two arguments, and Union calls find twice before it can actually execute the union. Therefore, we need to design the implementation of find as efficiently as possible. Hence the following quick-find implementation.

The Quick – Find algorithm:

1 public class UF 2 { 3 private int[] id; // access to component id (site indexed) 4 private int count; // number of components 5 public UF(int N) 6 { 7 // Initialize component id array. 8 count = N; 9 id = new int[N]; 10 for (int i = 0; i < N; i++) 11 id[i] = i; 12 } 13 public int count() 14 { return count; } 15 public boolean connected(int p, int q) 16 { return find(p) == find(q); } 17 public int find(int p) 18 { return id[p]; } 19 public void union(int p, int q) 20 {21 public void union(int p, int q) 20 {21 int pID = find(p); 23 int qID = find(q); If (pID == qID) return; 24 // If (pID == qID) return; 27 for (int I = 0; i < id.length; i++) 28 if (id[i] == pID) id[i] = qID; 29 count--; 30}} 31Copy the code

For example, if the Pair (5, 9) is entered, find the group number is not the same, and then pass through the union to change the group number from 1 to 8. Of course, changing from 8 to 1 is also possible, just make sure you use one rule for all operations.

The code above the find method is very efficient, because just need an array read operations will be able to find the node set, but the problem, in the case of need to add a new path, for no modification is involved, because not sure which nodes group number need to be modified, so we must to traverse the entire array, Find the need to modify the node, one by one to modify, that every time to add a new path of complexity is the linear relationship, if you want to add the number of new path is M, the number of nodes is N, then last time complexity is MN, is clearly a square order of complexity, for large-scale data, square order algorithm is a problem, in this case, The key to solving this problem is to increase the efficiency of the union method so that it does not need to traverse the entire array.

The Quick – Union algorithm:

Consider, why does the above solution result in “affecting the whole body”? Because the group numbers to which each node belongs are recorded individually and individually, there is no way to organize them in a better way, and when it comes to modification, there is no other way to inform and modify one by one. So now the question becomes, how to organize the nodes in a better way, there are many ways to organize, but the most intuitive is to organize the nodes with the same group number together, think about the data structure we learned, what kind of data structure can organize some nodes together? You see linked lists, graphs, trees, things like that. But which structure is the most efficient for finding and modifying? Trees, of course, so consider how the relationship between nodes and groups can be represented as a tree.

If you don’t change the underlying data structure, that is, the representation using arrays. Nodes can be organized in parent-link mode. For example, id[p] is the number of the parent node of p, and if P is the root node, id[p] is p, so eventually after several lookups, a node can always find its root node. The node whose id is [root] = root is the root node of the group. So in dealing with a pair of time, will be the first to find a pair of every node in the group number (i.e., they are in the tree root node of the serial number), if belong to different group, is one of the root node of the parent node is set to another root node, equivalent to a separate tree programming another independent subtree of the tree. The intuitive process is shown below. But here comes the problem.

The only implementation differences from the previous quick-find methods are Find and union:

1 private int the find (int) p {2/3 / looking for p root node of the node group, the root node has the nature of the id (root) = 4 while root (p! = id[p]) p = id[p]; 5 return p; 6 } 7 public void union(int p, int q) 8 { 9 // Give p and q the same root. 10 int pRoot = find(p); 11 int qRoot = find(q); 12 if (pRoot == qRoot) 13 return; 14 id[pRoot] = qRoot; // Change a tree (a group) to a subtree (a group). 16}Copy the code

The tree data structure is prone to extreme cases, because in the process of tree construction, the final shape of the tree depends heavily on the nature of the input data itself, such as whether the data is sorted, whether the data is randomly distributed and so on. For example, if the input data is ordered, the constructed BST degrades to a linked list. In our problem, there are also extreme cases, as shown in the figure below.

To overcome this problem, BST can evolve into red-black trees or AVL trees and so on.

However, in the application scenario we are considering, each pair of nodes is not comparable. So something else needs to be done. If you don’t have any ideas, it might be helpful to look at the code. Consider the implementation of the Union method in the Quick-Union algorithm:

1 public void union(int p, int q) 2 { 3 // Give p and q the same root. 4 int pRoot = find(p); 5 int qRoot = find(q); 6 if (pRoot == qRoot) 7 return; 8 id[pRoot] = qRoot; // Change a tree (i.e. a group) to a subtree (i.e. a group). 10}Copy the code

Id [pRoot] = qRoot Since this is also “hard coding”, the implementation is based on the convention that the tree in which P is located will always be a subtree of the tree in which Q is located, thus achieving the fusion of two independent trees. Is such an arrangement always reasonable? Obviously not. For example, if the size of the tree p is in is much larger than that of q, the combination of P and Q would create a very incongruous “deformed tree” with one light end and one heavy end. \

So we should consider the size of the tree before deciding whether to call:

Id [pRoot] = qRoot or ID [qRoot] = pRoot

That is, small trees are always merged as subtrees with large trees. This will keep the whole tree as balanced as possible.

So the question now becomes: How do you determine the size of a tree?

Let’s go back to the original situation, where each node initially belongs to a separate group and is initialized with the following code:

for (int i = 0; i < N; i++) id[i] = i; // The group number of each node is the sequence number of the nodeCopy the code

Similarly, in the initial case, each group has a size of 1. Since there is only one node, we can maintain the size of each group with an additional array, which is also straightforward to initialize:

for (int i = 0; i < N; i++) sz[i] = 1; // Each group starts with a size of 1Copy the code

In the process of merging, the size of the two trees to be merged will be determined first, and then merge according to the idea in the above figure, and the code is implemented:

1 public void union(int p, int q) 2 { 3 int i = find(p); 4 int j = find(q); 5 if (i == j) return; If (sz[I] < sz[j]) {id[I] = j; if (sz[I] < sz[j]) {id[I] = j; sz[j] += sz[i]; } 8 else { id[j] = i; sz[i] += sz[j]; } 9 count--; 10}Copy the code

Quick Union and Weighted Quick Union:

You can see that after deciding how to merge the two trees using the SZ array, the height of the resulting tree is greatly reduced. This makes sense because any operation in the Quick-Union algorithm inevitably calls the find method, whose efficiency depends on the height of the tree. As the height of the tree decreases, the efficiency of the find method increases, and thus the efficiency of the overall Quick-Union algorithm.

In fact, the figure above can also give us some enlightenment, that is, for the Quick-Union algorithm, the ideal situation of node organization should be a very flat tree, and all children nodes should be at height of 1, that is, all children are directly connected to the root node. This organizational structure ensures the maximum efficiency of find operations.

So how do you construct this ideal structure?

During the execution of find, don’t you need to do a while loop to find the root node? Wouldn’t it be nice to store all passing intermediate nodes in an array and, after the while loop ends, point the parent of those intermediate nodes to the root node? This approach is problematic, however, because the frequency of find operations causes intermediate arrays to be generated frequently, and the allocation destruction time increases accordingly. So is there a better way? There is, that is, the parent node of the node points to the grandfather node of the node, which is very clever, very convenient and effective. It is equivalent to finding the root node at the same time, the path is compressed, making the whole tree structure flat. The corresponding implementation is as follows, and really only one line of code needs to be added:

1 private int find(int p) 2 { 3 while (p ! Id [p] = id[id[p]]; 7 p = id[p]; 8 } 9 return p; 10}Copy the code

At this point, the dynamic connection-related union-find algorithm is basically introduced, from Quick-Find, which is easy to think of, to Quick-Union, which is relatively complex but more efficient, and then to the several improvements of Quick-Union, so that the efficiency of our algorithm continues to improve.

The time complexity of these algorithms is shown as follows:


Algorithm	Constructor	Union	Find
Quick-Find	N	N	1
Quick-Union	N	Tree height	Tree height
Weighted Quick-Union	N	lgN	lgN
Weighted Quick-Union With Path Compression	N	Very near to 1 (amortized)	Very near to 1 (amortized)

For large-scale data processing, it is not appropriate to use square-order algorithm, such as simple and intuitive Quick-find algorithm. By finding more characteristics of the problem, we Find the appropriate data structure, and then make targeted improvement, so as to obtain the Quick-Union algorithm and a variety of improved algorithms. Finally, the complexity of the algorithm is reduced to almost linear complexity.

If the function needed is not only to detect whether two nodes are connected, but also to obtain a specific path when connected, then another algorithm, such as DFS or BFS, is needed.

Related Posts

It took me 10 hours to write this K8S architecture analysis

First half of 2021 fitness summary

Summary of Computer Network 1