The content is introduced

Introduction to Hill sort

Bubble sequencing was studied in 1956. But the sorting speed is relatively slow, bubble sort time complexity is O(n^2), however, in the following a long time, although people invent a variety of sorting algorithms (such as the previous selection sort and insertion sort), but the time complexity is 0(n^2), time complexity seems to be unable to exceed 0(n^2). At this point, the computer world was full of “sorting algorithms can’t break O(n^2)”. Finally, one day, when a scientist published a new sorting algorithm that went beyond 0(n^2), several sorting algorithms that could go beyond 0(n^2) followed, and the time complexity of inner sorting algorithm increased to 0(nlogn). “Can’t go beyond zero (n^2)” is history.

Hill sort, named after its designer Donald Shell, was published in 1959. Hill sort is also an insertion sort, which is a more efficient version of the improved simple insertion sort, also known as reduced incremental sort. Before this, the time complexity of the sorting algorithm is basically 0(n^2), Hill sort algorithm is one of the first algorithms to break through this time complexity.

The efficiency of insert sort mentioned above is very high when the data is nearly ordered. Only a few insert operations are needed to complete the sorting work. In this case, direct insert is very efficient. When data is relatively small, the advantage of direct insertion is also more obvious. But the problem is that the two conditions themselves are too harsh, the actual development of little data or basic order are special cases.

But don’t worry, the conditions do not exist, we can create the conditions to do it. Shell’sSort, also known as Diminishing Increment Sort, is a more efficient and improved version of the direct insert Sort algorithm. The method is named after D.L.Shell, who proposed it in 1959. So insert sort is order n^2, but if it’s already ordered, you can end it early, and you can make it more efficient and we know that insert sort is very efficient in nearly ordered arrays, so how do you get the number of records that you want to sort to be small and nearly ordered? It is easy to think of grouping records that would otherwise have a large number of records. At this time, the number of records to be sorted in each sub-sequence is relatively small. Then, direct insertion sort is carried out respectively in these sub-sequences. When the whole sequence is basically ordered, note that it is only basically ordered, then direct insertion sort is carried out for all records.

Some students will be confused. For example, we now have a sequence {9, 1, 5, 8, 3, 7, 4, 6, 2} and divide it into three groups: {9, 1, 5}, {8, 3, 7}, {4, 6, 2}, even if they are sorted separately, become {1, 5, 9}, {3, 7, 8}, {2,4,6} and then combine them into {1, 5, 9, 3, 7, 8, 2,4,6}, this sequence is still disorderly. It’s not basically ordered, so if you want to sort it, let’s do it all over again and just insert it, it doesn’t work. It should be emphasized that the so-called basic order means that the small keywords are basically in front, the large ones are basically in the back, and the medium ones are basically in the middle, such as {2, 1, 3, 6, 4, 7, 5, 8, 9}, which can be called basic order. But {1, 5, 9, 3, 7, 8, 2, 4, 6} with a 9 in the third place and a 2 in the third place from the bottom are not fundamentally ordered. In fact, the problem is here, the purpose of splitting the records to be sorted is to reduce the number of records to be sorted, and make the whole sequence to the basic order. And the method of sorting the groups individually like the above can not meet our requirements. Therefore, we need to adopt the strategy of jump splitting: the records with a certain “increment” apart are divided into one sub-sequence, so as to ensure that the results obtained by direct insertion sorting in the sub-sequence are basically ordered rather than locally ordered.

Hill’s idea of ordering

Hill sort, which groups elements in pairs by increment (gap), and sorts each group using the direct insertion sort algorithm; The increments (gap) gradually decrease, when the gap is reduced to 1, the whole data is exactly divided into groups, the last insertion sort, the entire array is ordered.

Hill sort animation demo

In general, there is no special requirement for sorting algorithms to sort in ascending order, small first, big last. The array consists of {7, 3, 1, 9, 5, 4, 2, 8, 6}.

First time: Gap = 9/2 = 4 animation:

Second time :gap = 4/2 = 2 animation:

Third time :gap = 2/2 = 1 animation:

Hill sequencing analysis

Hill sort, which divides elements into N groups by increments (gaps) and sorts each group using the direct insertion sort algorithm. The increments decrease gradually, and when the increments decrease to 1, the whole data is exactly grouped, and finally an insertion sort is performed.

The first time :gap = 9/2 = 4, the space between each element and the following element is 4, the element is divided into 4 groups. Elements 7, 5 and 6 are divided into a group, elements 3 and 4 are divided into a group, elements 1 and 2 are divided into a group, and elements 9 and 8 are divided into a group. The effect is shown as follows:

The second time :gap = 4/2 = 2, the space between each element and the following element is 2, the element is divided into 2 groups.

Third time :gap = 2/2 = 1, the space between each element and the following element is 1, the element is divided into 1 group.

Hill sort code

So just to review our analysis, when we had 9 elements, the gap was 4, 2, 1.

The value of gap varies with the number of elements, and a loop is needed to control the value of gap. When gap is taken, insertion sort is used to sort each group, and nested loops are used. Be careful where you start when doing a small insertion sort. Let’s review the insertion sort we learned earlier. The first element 6 doesn’t need to be moved, and the second element 5 only needs to be compared and inserted.

After hill sort is split into multiple groups, each group uses insertion sort, which also compares inserts from the second data of each group, as shown in the figure below:

As you can see from the figure above, the first group compares from index 4, the second group compares from index 5, the third group compares from index 6, and the fourth group compares from index 7.

The hill sort code is as follows:

public class ShellSortTest2 {

    public static void main(String[] args) {

        int[] arr = new int[] {7.3.1.9.5.4.2.8.6};

        shellSort(arr);

    }



    public static void shellSort(int[] arr) {

        // Incremental gap, through the gap to divide the different sub-sequence,gap continuously reduced, divided more sub-sequence

        for (int gap = arr.length / 2; gap > 0; gap /= 2) { // get different gap, gap = 4, 2, 1

            System.out.println("gap: " + gap);



            for (int i = gap; i < arr.length; i++) { // Use insert sort for each group

                for (int j = i; j-gap >= 0; j -= gap) {

                    if (arr[j-gap] > arr[j]) {

                        swap(arr, j-gap, j);

                        System.out.print("\t exchange:" + arr[j-gap] + "And" + arr[j]);

                        System.out.println(Arrays.toString(arr));

                    }

                }

            }

        }

    }



    public static void swap(int[] arr, int start, int end) {

        int temp = arr[start];

        arr[start] = arr[end];

        arr[end] = temp;

    }

}

Copy the code

Execution effect:

gap: 4

Exchange:5and7[5.3.1.9.7.4.2.8.6]

Exchange:8and9[5.3.1.8.7.4.2.9.6]

Exchange:6and7[5.3.1.8.6.4.2.9.7]

gap: 2

Exchange:1and5[1.3.5.8.6.4.2.9.7]

Exchange:4and8[1.3.5.4.6.8.2.9.7]

Exchange:2and6[1.3.5.4.2.8.6.9.7]

Exchange:2and5[1.3.2.4.5.8.6.9.7]

gap: 1

Exchange:2and3[1.2.3.4.5.8.6.9.7]

Exchange:6and8[1.2.3.4.5.6.8.9.7]

Exchange:7and9[1.2.3.4.5.6.8.7.9]

Exchange:7and8[1.2.3.4.5.6.7.8.9]

Copy the code

Incremental and complexity of Hill sort

Through the above animation and code, I believe that some of you understand that the key of Hill sorting is not randomly grouped and sorted separately, but to form a sub-sequence of records separated by a certain “increment” to achieve leap-forward movement, so as to improve the efficiency of sorting. Here the choice of “increment” is critical. Int gap = arr. Length / 2; But what kind of increment should be the best is still a mathematical problem, so far no one has found a best sequence of increments.

So far, no one has been able to theoretically analyze the efficiency of Hill sorting except in some special cases. There are various experimentally based estimates that estimate its time complexity from 0(N^3/2) to 0(N^7/6).

Hill sort common incremental sequences:

  1. Shell delta sequence, the recursive formula of Shell delta sequence is:

  2. The general formula of Hibbard incremental sequence is:

  3. The general formula of Knuth incremental sequence is:

  4. The recursive formula of Gonnet incremental sequence is:

  5. The general formula of Sedgewick increment sequence is:

Note that the last increment in the sequence must be equal to 1, and the last sort is a normal insertion sort. The invention of Hill sorting algorithm, so that we finally break through the slow sorting, more efficient sorting algorithm has emerged.

conclusion

Hill sort is a more efficient and improved version of the direct insert sort algorithm. Group elements by increments (gaps), sort each group using a direct insert sort algorithm; The increments decrease gradually, and when the increments decrease to 1, the whole data is exactly grouped, and finally an insertion sort is performed.


———- End ———-


Original articles and animation production is really not easy, your thumbs up is the biggest support!

For more articles, please pay attention to wechat public number: Cousin Animation learning programming