1. An overview of the

1.1 Composition of the Figure

The graph consists of two sets: the point set V, the edge set E, and the binary is defined as G = (V, E).

1.2 Directed and undirected graphs

Graphs are divided into directed graphs and undirected graphs.

  • Undirected graphs: Where each edge is bidirectional, (x, y) and (y, x) give the same result.
  • Directed graph: Each edge is unidirectional,

    and

    indicate different results.
    ,>
    ,>

1.3 In degree and out degree

Degrees are divided into inward and outward degrees.

  • Degree of entry: the sum of the number of times a vertex is the terminus.
  • Outdegree: The sum of the degrees at which a vertex is the starting point.

Note: The vertices of an undirected graph have the same in and out degrees.

2. Storage mode

Common graphs can be stored in two ways:

  • Adjacency list
  • Adjacency matrix

2.1 adjacency table

Suppose there are x nodes in the point set V, then we need to apply for a one-dimensional array of capacity x, in which each element represents each node in the point set V. Starting from each element, a linked list is used to hold all adjacent nodes of the corresponding node.

2.2 Adjacency matrix

Suppose there are x nodes in the point set V, then we need to apply for a two-dimensional array of capacity x^2, each element of the array represents each edge (weight) in the edge set E. Starting from each element, the x-coordinate and y-coordinate of the element are the two vertices of the edge.

2.3 Comparison of the two methods

  • The adjacency list and adjacency matrix can represent both directed and undirected graphs.
  • The adjacency list considers the graph in terms of nodes.
  • The adjacency matrix considers the graph in terms of edges.
  • When the graph is sparse and has many vertices, that is, the graph structure is large, the adjacence list is more suitable to be chosen as the memory structure.
  • When the graph is dense and the vertices are few, or there is no need to record the weights of the edges in the graph, it is more appropriate to use the adjacence matrix as the storage structure.

3. Code definition of the figure

The following code describes all the information about the graph structure, implements all the graph structures, and is a general structure template for graphs.

According to different problem requirements, some data items may not be used, so you do not need to fill in when creating the graph, just maintain the initial value.

Figure 3.1 structure

public class Graph {
    // Set key -- number of points,value -- actual point
    HashMap<Integer, Node> nodes;
    / / set
    HashSet<Edge> edges;

    public Graph(a) {
        nodes = new HashMap<Integer, Node>();
        edges = newHashSet<Edge>(); }}Copy the code

Note: In the code above, a HashMap is used to store the set of points. In the actual brushing process, to further shorten the constant time, you can actually replace the HashMap with an array structure, and the subscripts of the array elements can act as the keys in the HashMap. Although the time required to add, delete, and check a HashMap is O(1), it is not as fast as array addressing.

3.2 structure

public class Node {
    // Node data item
    int value;
    // The input degree of the node
    int in;
    // The output of the node
    int out;
    // Direct neighbor node of the edge connection emanating from the current point
    ArrayList<Node> nexts;
    // The edge that diverges from the current point
    ArrayList<Edge> edges;

    public Node(int value) {
        this.value = value;
        this.in = 0;
        this.out = 0;
        nexts = new ArrayList<>();
        edges = newArrayList<>(); }}Copy the code

3.3 edge structure

public class Edge {
    / / right side
    int weight;
    // The starting node of the edge
    Node from;
    // Change the target node
    Node to;
    
    public Edge(int weight, Node from, Node to) {
        this.weight = weight;
        this.from = from;
        this.to = to; }}Copy the code

Note: This structure represents directed edges; an undirected edge can be represented by two directed edges.

4. Figure transformation

The algorithms for graphs are not very difficult, but the difficulty is that there are many ways to store representations of the same graph. This means that the same algorithm, if you choose a different way to store the graph, will be written differently.

Therefore, we can choose a preferred graph storage structure, such as the adjacencies matrix, and then use the adjacencies matrix to implement all the algorithms for graph operations. When encountering problems that use other data structures to store graphs, the information of the graphs can be extracted and transformed into adjacencies matrix for storage. Then solve the problem by calling the relevant algorithm interface that you have written in advance and implemented using the adjacencies matrix, so that you do not need to implement the algorithm again on the data structure used in the problem.

In this article, I only recommend storing the diagrams that I use frequently, as shown in 3. I’ve already implemented most of the algorithms using the way I store the graphs, and you can use my template to extend the new graph algorithms yourself.

For example: Now the title uses the following method to store a graph, how to translate the graph into my storage method to represent?

Code:

public static Graph createGraph(int[][] matrix) {
    Graph graph = new Graph();
    for (int[] edgeVals : matrix) {
        int weight = edgeVals[0];
        int fromVal = edgeVals[1];
        int toVal = edgeVals[2];
        // The first node is added to the set of points
        if(! graph.nodes.containsKey(fromVal)) { graph.nodes.put(fromVal,new Node(fromVal));
        }
        if(! graph.nodes.containsKey(toVal)) { graph.nodes.put(toVal,new Node(toVal));
        }
        Node from = graph.nodes.get(fromVal);
        Node to = graph.nodes.get(toVal);
        // Build edges and add edge sets
        Edge edge = new Edge(weight, from, to);
        graph.edges.add(edge);
        // Update the node information
        from.out++;
        to.in++;
        from.nexts.add(to);
        from.edges.add(edge);
    }
    return graph;
}
Copy the code

5. Traversal of the graph

The difference between a walk of a graph and a walk of a binary tree is that a binary tree cannot produce rings, whereas a graph may. Therefore, in the traversal, you must add a mechanism to determine the occurrence of the loop, otherwise the traversal will go into an infinite loop.

5.1 Depth-first traversal

Steps:

  1. Prepare a stack.
  2. The source node presses the stack first and prints.
  3. Iterative fixed process:
    1. The top stack element, cur, hits the stack.
    2. Traversal the cur’s direct neighbor nodes. If the unregistered direct neighbor nodes are traversed, the CUR presses the stack first, the direct neighbor node presses the stack later, prints and registers the direct neighbor node, and the traversal ends.
    3. End the iteration until the stack is empty.

Note: Since each node’s immediate neighbors may be stored in a different order, depth-first traversal can have multiple results.

Code:

public static void depthFirstSearch(Node node) {
    if (node == null) {
        return ;
    }
    Stack<Node> stack = new Stack<>();
    // Secondary tables, pressing nodes are registered into the table, so as to detect whether the ring is present
    HashSet<Node> set = new HashSet<>();
    stack.push(node);
    set.add(node);
    // Print the source node separately
    System.out.println(node.value);
    while(! stack.isEmpty()) { Node cur = stack.pop();for (Node neighbor : cur.nexts) {
            if (set.contains(neighbor)) {
                continue;
            }
            // Push the current node and the immediate neighbor node back on the stack
            stack.push(cur);
            stack.push(neighbor);
            set.add(neighbor);
            System.out.println(neighbor.value);
            // As long as there is a new direct neighbor node pressing the stack, the direct neighbor node just pressing the stack is processed, and the other direct neighbor nodes waiting for the next time the stack is processed
            break; }}}Copy the code

5.2 Width first traversal

Steps:

  1. Prepare a queue.
  2. The source node is queued first.
  3. Iterative fixed process:
    1. Queue leader node cur out of the queue.
    2. Print a cur.
    3. Traversal the cur direct neighbor nodes. All unregistered direct neighbor nodes are queued and registered.
    4. Until the team is empty, end the iteration.

Note: Since each node’s immediate neighbors may be stored in a different order, width-first traversal can have multiple results.

Code:

public static void breadthFirstSearch(Node node) {
    if (node == null) {
        return ;
    }
    Queue<Node> queue = new LinkedList<>();
    // Secondary table. Nodes that are queued for the first time are registered into the secondary table to detect whether a ring is present
    HashSet<Node> set = new HashSet<>();
    queue.add(node);
    set.add(node);
    while(! queue.isEmpty()) { Node cur = queue.poll(); System.out.println(cur.value);// Traverse all the direct neighbors of the node
        for (Node neighbor : cur.nexts) {
            // Check whether a node has been queued
            if (set.contains(neighbor)) {
                continue; } queue.add(neighbor); set.add(neighbor); }}}Copy the code

6. Topological sorting algorithm

6.1 define

Topological sorting of a Directed Acyclic Graph G is to arrange all vertices in G into a linear sequence, such that any pair of vertices u and V in the Graph appear before V in the linear sequence if edge

∈ E(G). In general, such linear sequences are called Topological Order sequences, or Topological sequences for short. In simple terms, a partial order in a set leads to a total order in that set. This operation is called topological sorting.
,>

6.2 Application Scenarios

The most common application scenario of topological sorting algorithm is the compilation order of dependency packages introduced in projects.

For the project to compile successfully, packages A and B must be compiled first. For package A to compile successfully, package C must be compiled first. For package B to compile successfully, package D must be compiled. C package must compile D package and E package, and D package must compile E package.

In the project compilation, must not appear loop compilation. Any project that traces configuration files and finds circular compilation is bound to report an error. Thus, from package E to the project, a directed acyclic graph is formed.

Once you have the dependency map, how do you determine the build order? Use topological sorting.

6.3 Sorting Process

In a directed acyclic graph, there must be one or more nodes with an input degree of 0. Each time, a node with an input degree of 0 is randomly selected and put into the topological sequence. Then, the node and its influence are removed from the graph, and the next selection is made until the graph is empty.

Because nodes with indegree 0 are randomly selected each time, a graph may have multiple reasonable topological sequences.

6.4 code

public static List<Node> potologicalSort(Graph graph) {
    // Record the input degree of each node
    HashMap<Node, Integer> map = new HashMap<>();
    // A secondary container for each node whose input degree is 0
    Queue<Node> queue = new LinkedList<>();
    // Topological sequence
    ArrayList<Node> order = new ArrayList<>();
    // Record the input degree of all nodes and queue the first batch of nodes whose input degree is 0
    for (Node node : graph.nodes.values()) {
        map.put(node, node.in);
        if (node.in == 0) { queue.add(node); }}// The queue head node exits the queue and enters the topology sequence, and enqueues all the direct neighbor nodes whose entry degree is 0
    while(! queue.isEmpty()) { Node cur = queue.poll(); order.add(cur);for (Node node : cur.nexts) {
            // Adjust the input of the direct neighbor node
            map.put(node, map.get(node) - 1);
            if (map.get(node) == 0) { queue.add(node); }}}return order;
}
Copy the code

7. Generate the minimum spanning tree

7.1 Minimum Spanning tree

7.1.1 Application Scope

Only undirected graphs can generate minimum spanning trees, so both Kruskal algorithm and Prim algorithm target undirected graphs.

7.1.2 concept

Minimum spanning tree is a data structure that deletes several edges from a rightfully undirected graph where all nodes are connected and ensures the connectivity of all nodes and the minimum weight of all edges.

7.2 Kruskal algorithm

7.2.1 process

From the point of view of edges, the weight of all edges is sorted, and the edge with the smallest option value is added to the graph. When you add an edge, if you form a ring, then you don’t add that edge. Until all edges have tried to join, the graph structure is the minimum spanning tree.

The only difficulty with this algorithm is, how do you add an edge, and then tell if the graph generates a ring? This uses a set query structure: look up sets.

7.2.2 Querying and analyzing sets

When an edge is added to the graph, it is determined whether the from and to nodes of the edge exist in the same set. If they exist, it indicates that the node is a ring. If not, the collections where the FROM and to nodes reside are merged.

The process of implementing the minimum spanning tree by bringing the set lookup mechanism into the figure above:

  1. Set state when no edges are added: {A}, {B}, {C}, {D}.

  2. {A}, {B, C}, {D}, {B, C}, {D}

  3. When (A, B) is added, A and B are not in the same set, set condition: {A, B, C}, {D}

  4. {A, B, C, D}; {A, B, C, D};

  5. {A, B, C, D} {A, B, C, D} {A, B, C, D}}

  6. {A, B, C, D} {A, B, C, D} {A, B, C, D}

  7. {A, B, C, D} {A, B, C, D} {A, B, C, D}

This algorithm can be implemented as long as we implement the set query and set merge mechanisms described above.

7.2.3 And query set implementation

This code does not implement a real set lookup, but uses HashMap and ArrayList to simulate a simple version of set lookup.

Although the simple version of parallel set lookup has the function of parallel set lookup, the efficiency of operation is far from that of parallel set lookup, and the time complexity of the following two functions of parallel set lookup is O(1).

public class UnionFind {

    // Store and query the collection to which each node in the set belongs
    public static HashMap<Node, List<Node>> collectionMap = new HashMap<>();

    public UnionFind(List<Node> nodes) {
        // Build a separate collection for each node
        for (Node node : nodes) {
            List<Node> collection = newArrayList<>(); collection.add(node); collectionMap.put(node, collection); }}// Determine whether two nodes are in the same set
    public boolean isSameCollection(Node from, Node to) {
        List<Node> fromCollection = collectionMap.get(from);
        List<Node> toCollection = collectionMap.get(to);
        return fromCollection == toCollection;
    }

    // Merge the set where the two nodes reside
    public void union(Node from, Node to) {
        List<Node> fromCollection = collectionMap.get(from);
        List<Node> toCollection = collectionMap.get(to);
        // move all the collections fromCollection to toCollection and redirect them
        for(Node node : fromCollection) { toCollection.add(node); collectionMap.put(node, toCollection); }}}Copy the code

7.2.3 code

The Edge comparator, which ranks edges by comparing their weight.

public class EdgeComparator implements Comparator<Edge> {
    @Override
    public int compare(Edge edge1, Edge edge2) {
        returnedge1.weight - edge2.weight; }}Copy the code

Algorithm code:

public static Set<Edge> kruskal(Graph graph) {
    // Store the collection of the smallest generated tree edges
    Set<Edge> result = new HashSet<>();
    // All nodes build and look up the set
    UnionFind unionFind = new UnionFind((List<Node>) graph.nodes.values());
    // Build a small root heap and define sorting by Edge weight
    PriorityQueue<Edge> smallRootHeap = new PriorityQueue<>(new EdgeComparator());
    // Sort all edges in the graph by placing them in a small root heap
    smallRootHeap.addAll(graph.edges);
    // Enter the graph from small to large
    while(! smallRootHeap.isEmpty()) { Edge edge = smallRootHeap.poll();// If two vertices of an edge are in the same set, they cannot be added to the result set
        if (unionFind.isSameCollection(edge.from, edge.to)) {
            continue;
        }
        unionFind.union(edge.from, edge.to);
        result.add(edge);
    }
    return result;
}
Copy the code

7.3 Prim algorithm

7.3.1 process

From the point of view, select any node in the graph as the starting node and put that node into the set. Each time, the edge with the smallest weight connected from the inside of the set to the outside of the set and its corresponding nodes outside the set are added to the set. When all nodes are added to the set, the set is the minimum spanning tree.

7.3.2 code

public static HashSet<Edge> prim(Graph graph) {
    // Store the collection of the smallest generated tree edges
    HashSet<Edge> result = new HashSet<>();
    // Stores the set of minimum spanning tree nodes. The two sets are stored synchronously, simulating a set containing both nodes and edges
    HashSet<Node> set = new HashSet<>();
    // Build a small root heap and define sorting by Edge weight
    PriorityQueue<Edge> smallRootHeap = new PriorityQueue<>(new EdgeComparator());
    Node begin = null;
    // Select a random element from the point set as the starting node
    for (Node node : graph.nodes.values()) {
        begin = node;
        break;
    }
    if(begin ! =null) {
        // The starting node enters the collection first
        set.add(begin);
        smallRootHeap.addAll(begin.edges);
        while(! smallRootHeap.isEmpty()) { Edge edge = smallRootHeap.poll(); Node to = edge.to;// Check whether the object is included in the collection. If not, add it to the collection
            if(! set.contains(to)) { set.add(to); result.add(edge);// Add all edges of the new node to the small root heapsmallRootHeap.addAll(to.edges); }}}return result;
}
Copy the code

Note:

This code causes duplicate edges to go into the small root heap, because there may be two nodes in the graph that share an edge. However, the judgment of repeated nodes is added, so even if there are repeated edges, it will not affect the final result, but the number of judgment nodes is increased.

7.4 summarize

Why does The Kruskal algorithm need to use the complex query structure of merge set to query and merge sets, but the Prim algorithm does not?

When constructing the minimum spanning tree, Kruskal algorithm may have the problem of connecting two locally connected large parts together, because its connection order is not connected node by node like Prim algorithm.

The diagram below:

8. Dijkstra algorithm

8.1 illustrates

Dijkstra algorithm is a cell shortest path algorithm, which solves the question: what is the shortest distance from a given point to other nodes?

Scope of application:

  • Undirected graph

  • Edges that cannot have a negative value

Note:

For self node, the shortest path is 0; For unreachable nodes, the shortest path is infinity.

8.2 process

Each time, select the node in the table that is the shortest distance from the specified node, and see if the edge from that node makes some records on the table fewer. After each node is observed, the current node is locked. When all nodes are locked, the shortest path of the cell can be obtained.

8.3 code

public static HashMap<Node, Integer> dijkstra(Node target) {
    // Stores the distance between each node and the target node
    HashMap<Node, Integer> distanceMap = new HashMap<>();
    // Store the locked node
    HashSet<Node> lockedNodes = new HashSet<>();
    // The target node is 0 away from itself
    distanceMap.put(target, 0);
    // The first minimum distance node is target
    Node minNode = getUnlockedMinNode(distanceMap, lockedNodes);
    // Iterate through all nodes in the graph and stop when all nodes are locked
    while(minNode ! =null) {
        for (Edge edge : minNode.edges) {
            Node to = edge.to;
            // Determine whether to is the first time to enter the table
            if(! distanceMap.containsKey(to)) {// If yes, the initial value is: current minimum distance between TO and target = current minimum distance between minNode and target + the weight of minNode's edge connected to to
                distanceMap.put(to, distanceMap.get(minNode) + edge.weight);
            } else {
                // If not, the update table is smaller than the current minimum distance from to to targetdistanceMap.put(to, Math.min(distanceMap.get(to), distanceMap.get(minNode) + edge.weight)); }}// The current node is locked
        lockedNodes.add(minNode);
        // Find the next minimum distance node
        minNode = getUnlockedMinNode(distanceMap, lockedNodes);
    }
    return distanceMap;
}

// Find the node that is not locked and is the smallest distance from the target node
public static Node getUnlockedMinNode(HashMap<Node, Integer> distanceMap, HashSet<Node> lockedNodes) {
    Node minNode = null;
    int minDistance = Integer.MAX_VALUE;
    for (Map.Entry<Node, Integer> entry : distanceMap.entrySet()) {
        Node cur = entry.getKey();
        int distance = entry.getValue();
        if (!lockedNodes.contains(cur) && distance < minDistance) {
            minDistance = distance;
            minNode = cur;
        }
    }
    return minNode;
}
Copy the code

8.4 summarize

Why can’t Dijkstra’s graph have negative edges?

Because if you have negative edges, then at some point, the records that have been locked may not be correct.

Such as:

When selecting node D, because (C, D) = -30, the current shortest distance from A to D is -27, and the current shortest distance from A to C is 3. When we update the table, we see that A can go from D to C so that the distance is minus 30 minus 27 is minus 57, which is less than 3. However, 3 is the record that has been locked, and the shortest path from A to C has been determined to be 3 according to the normal logic of the algorithm, so that the record that has been locked is incorrect.

8.5 Optimization Scheme

8.5.1 instructions

Dijkstra algorithm can also use the heap to optimize. The optimization point is that the classical Dijkstra algorithm finds the node with the smallest distance from the target by traversal, while the optimized algorithm uses the small root heap to remove the node with the smallest distance from the target and then pops it directly from the top of the heap. The problem is that the system heap and the data in the heap are constrained. If the data in the system heap suddenly changes, the system has to readjust the whole heap structure. The system heap adjusts the heap through global scanning, and the cost is very high.

If you were to apply the heap to Dijkstra’s optimization, then if the current minimum distance node was D, when the heap popped out of D, it might still be C in the heap because one of the edges of D has updated itself. At this point, we need to manually build the heap, and when a node in the heap updates data, we do a heapInsert or heapify operation on that node. Such heap adjustments are local and cheap.

Why can’t the system heap be partially tuned? Because if you modify the data of a node in the system heap, the system automatically local adjustment, then the system heap is not a strict black box. The system heap just provides you with a number of nodes of the identified data to form a large root heap or a small root heap, and then pops you a node at the top of the heap. There is no operation where you change the data of nodes in the original system heap, and the heap automatically adjusts locally.

8.5.2 code

This optimization is especially important to master because it teaches you how to implement the heap yourself manually in situations where the system heap is not available.

Pile of code:

public class DistanceHeap {

    // Node data
    private Node[] data;
    // The distance between each node and the root node
    private HashMap<Node, Integer> distanceMap;
    // The index of each node in the data, the index of the node out of the heap is -1
    private HashMap<Node, Integer> indexMap;
    // The capacity of the heap
    private int size;

    public DistanceHeap(int size) {
        this.data = new Node[size];
        this.distanceMap = new HashMap<>();
        this.indexMap = new HashMap<>();
        this.size = size;
    }

    // Check whether the current heap is empty
    public boolean isEmpty(a) {
        return size == 0;
    }

    // Switch between two nodes in the heap
    private void swap(int i, int j) {
        // The exchange of indices in indexMap
        indexMap.put(this.data[i], j);
        indexMap.put(this.data[j], i);
        // Swap in the heap
        Node tmp = this.data[i];
        this.data[i] = this.data[j];
        this.data[j] = tmp;
    }

    // Check whether a node has entered the heap
    private boolean isEnteredHeap(Node node) {
        return indexMap.containsKey(node);
    }

    // Determine whether a node is currently in the heap
    private boolean isInHeap(Node node) {
        returnisEnteredHeap(node) && indexMap.get(node) ! = -1;
    }

    public void addOrUpdateOrIgnore(Node node, int distance) {
        // If it is already in the heap, determine whether the data needs to be updated
        if (isInHeap(node)) {
            distanceMap.put(node, Math.min(distanceMap.get(node), distance));
            heapify(indexMap.get(node), size);
        }
        // Determine if it is the first time to enter the heap, and if so, initialize the data
        if (!isEnteredHeap(node)) {
            data[size] = node;
            indexMap.put(node, size);
            distanceMap.put(node, distance);
            heapInsert(size ++);
        }
    }

    // Information class provided to Dijkstra
    static class NodeRecord {
        / / the node
        Node node;
        // The distance between the node and the target node
        int distance;

        public NodeRecord(Node node, Integer distance) {
            this.node = node;
            this.distance = distance; }}// Pop the top node of the heap
    public NodeRecord poll(a) {
        // Top node information
        NodeRecord nodeRecord = new NodeRecord(data[0], distanceMap.get(0));
        // Switch nodes at the top and bottom of the heap
        swap(0, -- size);
        // Delete the information about the top node of the heap
        distanceMap.remove(data[size]);
        indexMap.put(data[size], -1);
        data[size] = null;
        // Readjust the heap
        heapify(0, size);
        return nodeRecord;
    }

    // Adjust the heap upward by the distance from the node to the root node
    private void heapInsert(int index) {
        while (distanceMap.get(data[index]) < distanceMap.get(data[(index - 1) / 2])) {
            swap(index, (index - 1) / 2);
            index = (index - 1) / 2; }}// Adjust the heap downward by the distance from the node to the root node
    private void heapify(int index, int heapSize) {
        int leftChildIndex = index * 2 + 1;
        while (leftChildIndex <= heapSize) {
            int leastIndex = leftChildIndex + 1 <= heapSize && distanceMap.get(data[leftChildIndex + 1]) < distanceMap.get(data[leftChildIndex])
                    ? leftChildIndex + 1 : leftChildIndex;
            leastIndex = distanceMap.get(data[leastIndex]) < distanceMap.get(data[index]) ? leastIndex : index;
            if (leastIndex == index) {
                break;
            }
            swap(index, leastIndex);
            index = leastIndex;
            leftChildIndex = index * 2 + 1; }}}Copy the code

Dijkstra code after optimization:

public static Map<Node, Integer> dijkstra(Node target, int size) {
    HashMap<Node, Integer> distanceMap = new HashMap<>();
    DistanceHeap distanceHeap = new DistanceHeap(size);
    // Put the root node into the heap first
    distanceHeap.addOrUpdateOrIgnore(target, 0);
    while(! distanceHeap.isEmpty()) {// Get the top node data
        DistanceHeap.NodeRecord curRecord = distanceHeap.poll();
        Node cur = curRecord.node;
        int distance = curRecord.distance;
        // Add all direct neighbors at the top of the heap to the heap or update their data or ignore them
        for (Edge edge : cur.edges) {
            distanceHeap.addOrUpdateOrIgnore(edge.to, edge.weight + distance);
        }
        distanceMap.put(cur, distance);
    }
    return distanceMap;
}
Copy the code