This article is based on an analysis of Myers

In the reference article, I have a certain understanding of Myers difference algorithm, but there are still many problems in the process of code implementation, and the author’s explanation of the source code is relatively few, so I focus on an analysis of the implementation process.

An overview of the

I divided the implementation of the code into three parts:

Draw edit map

Generate the snake

Back in the snake

Draw edit map

The following figure is the edit diagram in the original paper. We need to save the corresponding circled points in a similar way to the table below

In this table, the relationship between line K and the number of steps D is shown

Specific code implementation can refer to the pseudo code in the paper

 private static void myers(String str1, String str2) {
        char[] arr1 = str1.toCharArray();
        char[] arr2 = str2.toCharArray();
        int m = arr1.length;
        int n = arr2.length;
        // We need to get a graph
        // The ordinate represents the number of steps, the ordinate represents the value of k, and the value of the ordinate represents the point that can be reached under the number of steps d
        // define a map to store the horizontal data in our graph. Key is the value of k, and value is the x coordinate, which is actually the number of steps d
        // the value of each k in the kposition can be derived from the previous k-1 or k+1
        Map<Integer,Integer> kPositions = new HashMap<>();
        // Define a set of kPositions that store each step
        List<Map<Integer,Integer>> dkPositions = new ArrayList<>();
        // The maximum number of steps that can be taken
        int maxd = n + m;
        // Step by step
        kPositions.put(1.0);

        dloop: for(int d=0; d<=maxd; d++){ Map<Integer,Integer> kPositionsTemp =new HashMap<>();
            // Each step will fall on the k line, and the value of k is -d, d
            boolean breakFlag = false;
            for(intk = -d; k<=d; k+=2) {int x;
                //k==-d means I'm going to go down, I'm going to go up, I'm going to go up, I'm going to go up, I'm going to go up, I'm going to go up, I'm going to go up
                if(k==-d){
                    x = kPositions.get(k+1);
                    //k==d means to go to the right, which means that this step removes a character, so the x-coordinate is x+1 of the previous one
                }else if(k==d){
                    x= kPositions.get(k-1) +1;
                    // These are the two boundary cases,

                    // In the following case, the previous step of the current step has two values, k-1 and k+1, we follow the rule of deleting first and adding later
                }else{
                    //k = x-y , x = k+y , y = x- k
                    // We decide between two alternatives
                    // How to find x?
                    // We need to judge the direction between our current position and the last position
                    // How to judge?
                    // To go from k-1 to k means to go to the right, so x needs to +1, k+1 to k means to go down, so x does not change, so do we want to come from k-1 or k+1?
                    // We need to pick a point that goes furthest. I think I'm coming from the far point. The condition for judging the far point is that whoever has a large x goes farther.
                    // If kPositions. Get (k-1) is big, that means you're going from k-1 to k, which means x needs to be +1, and vice versa, that means you're going from kPositions. Get (k+1) to k, which means x doesn't change
                    x= kPositions.get(k-1)<kPositions.get(k+1)? kPositions.get(k+1):kPositions.get(k-1) +1;
                }
                // Now that we have taken one step, we can calculate the y-coordinate of this step
                int y = x-k;
                // In this case we also need to determine whether there will be a slash, i.e., equality. If equality means walking a slash, we can continue walking until the steps are not equal
                while( y<n&&x<m && arr1[x]==arr2[y]){
                    x=x+1;
                    y=y+1;
                }
                kPositions.put(k,x);
                kPositionsTemp.put(k,x);
                // If x, y is greater than or equal to m, n, it means that we have found the end point and there is no need to go any further

                if (x>=m&&y>=n){
                    dkPositions.add(kPositionsTemp);
                    breakFlag = true;

                    break dloop;
                }
            }
            dkPositions.add(kPositionsTemp);
            if (breakFlag) break ;
        }
          // Now that you have drawn the entire edit diagram, you need to use this edit diagram to generate snakes
        / / generated snakes
        Stack<Snake> snakes = generateSnake(dkPositions, arr1, arr2);
        / / back the snake
        printSnake(snakes,arr1,arr2);
    }
Copy the code

Generate the snakes

Snakes need to be generated after drawing the editing graph above. Here, I define snake as an object that contains the x and y coordinates of the point where the current step is located and the direction of the next step

 @Data
    @ToString
    @AllArgsConstructor
    static class Snake{
        private int x;
        private int y;
        private Boolean right;
    }
Copy the code

Once you have the definition of this object, you need to generate all the Snakes


    private static Stack<Snake> generateSnake(List<Map<Integer, Integer>> dkPositions, char[] arr1, char[] arr2) {
        Stack<Snake> snakes = new Stack<>();
        // Define the starting position as the end, since we are backtracking from the end
        int currentX = arr1.length;
        int currentY = arr2.length;
        snakes.push(new Snake(currentX,currentY,null));
        // Start from the end and go back
        for(int d=dkPositions.size()-1; d>0; d--){// Get the possible points of the previous step
            Map<Integer, Integer> preKPositions = dkPositions.get(d-1);
            // Calculate the k line corresponding to the current point
            int k=currentX-currentY;
            // The previous point can only be k+1 or k-1
            // Take a step down to get the previous point of the current point (from k+1 to k, k becomes smaller)
            // This is a step down from the previous point to get to the current point.
            Integer prex1 = preKPositions.getOrDefault(k + 1, -1);
            // Take a step to the right to get the point before the current point
            Integer prex2 = preKPositions.getOrDefault(k - 1, -1);
            if (prex1==null&&prex2==null) {
                return snakes;
            }
            // To determine from which point, first determine if the previous point to the right step to the current point
            booleanright = prex2! =null;
            int y1 ;
            int y2 ;
            // Compute the x and y coordinates of the previous point
            int prey =right? prex2-k+1: prex1-k-1;
            intprex = right? prex2:prex1;// The above x and y coordinates are based on prex1 and prex2, where one of them is empty
            // If both are not empty we need to recalculate
            if(prex1! =null&&prex2! =null) {// Compute the y coordinates of both
                y1 = prex1-k-1;
                y2 = prex2-k+1;
                // Calculate the difference between the current x coordinate and the last x coordinate
                int diff1 = currentX-prex1;
                int diff2 = currentX-prex2;
                // We tend to come from the nearest point
                // If it is close to prex2, i.e. diff2 is small, then it is obtained by taking a step to the right of the previous point; otherwise, it is obtained by going down
                right = diff2<diff1;
                // What if these two x's are the same? So let's keep comparing y
                if(diff1==diff2){
                    // If the difference between prex2 and y (y2) is smaller, it is also to the right, and vice versa
                    right = Math.abs(y2-currentY)<Math.abs(y1-currentY);
                }
                // Finally we have the x and y coordinates of the previous pointprey = right? y2:y1; prex = right? prex2:prex1; }// Construct a Snake object from the previous point we obtained and push it
            snakes.push(new Snake(prex,prey,right));
            // The current coordinate becomes the previous coordinate
            currentX = prex;
            currentY = prey;

        }
        return snakes;
    }

Copy the code

Backtrace output difference

It’s easy to get the difference information when you have Snake, just loop out of the stack


    private static void printSnack(Stack<Snake> snakes, char[] arr1, char[] arr2) {

        int x = 0;
        int y = 0;
        // Start at the origin (do not ignore the first n identical characters)
        while(x<arr1.length&&y<arr2.length&&arr1[x]==arr2[y]){ System.out.println(arr2[y]); x++; y++; }// If it is right, go one step to the right (x+1).
        // Then look for slashes (i.e. equal elements)
        // So if right is FALSE it goes down, and so does the rest of the code
        // If right is empty, it means the end
        while(! snakes.isEmpty()){ Snake snack = snakes.pop(); Boolean right = snack.right; x = snack.x; y = snack.y;// If right is empty, it is the last point
            if (right==null) return;
            if (right){
                System.out.println("-"+arr1[x]);
                x++;
                while(x<arr1.length&&y<arr2.length&&arr1[x]==arr2[y]){
                    System.out.println(arr1[x]);
                    x++;y++;
                }
            }else{
                System.out.println("+"+arr2[y]);
                y++;
                while(x<arr1.length&&y<arr2.length&&arr1[x]==arr2[y]){ System.out.println(arr2[y]); x++; y++; }}}}Copy the code

Final output:

A
-B
-C
-A
+S
D
F
+C
+V
+F
E
D
-S
-F
G
+D
+C
Copy the code