Many of Hill’s videos and texts don’t seem to make much sense of what’s going on. Hill sort, often understand, after a long time to look at the code, and a little confused.

The first thing to understand about Hill sort is to understand insertion sort. Hill sort itself is an improvement on insert sort, because when most elements of an array are sorted from large to small, using insert sort to sort from small to large will reduce the efficiency (need to do many swaps), in this case, hill sort will improve the efficiency of sorting.

Insertion Sort

If arr = [6, 5, 4, 3, 2, 1], we need to sort the array by insertion sort, then the process is as follows: we can divide the array into ordered and unordered parts, while sorting, we constantly insert the first element of the unordered part into the ordered part. [6] [5, 4, 3, 2, 1] Insert 5 as the second element of the array (index = 1). When inserting, compare the elements of the ordered part one by one in the order from back to front. If the value is smaller than that of the ordered element, swap. If the value is larger than the ordered element, the comparison is stopped. Insert element 4 (index = 2) into the ordered part of [5, 6]. So if we go backwards, four swaps with six, then five. [4, 5, 6] [3, 2, 1] insert element 3 (index = 3) into [4, 5, 6], and swap positions with 6, 5, and 4 successively. .

Here is the C code implementation (it doesn’t matter which language you use, because the code will almost always look the same) :

// Variable names are easy for code to understand
void insertionSort(int arr[], int size) 
{
    int i, j, temp;

    for(i = 1; i < size; i++)
        for(j = i- 1; j >= 0 && arr[j] > arr[j+1]; j--) {
            temp = arr[j];
            arr[j] = arr[j+1];
            arr[j+1] = temp; }}Copy the code

Hill Sort Shell Sort

Hill sort is the continuous insertion sort of the array according to a certain gap value. Different Hill sorting algorithms may use different gap values. The gap values of the classical Hill algorithm are N/2, N/4,…… Until the gap is 1, where N is the length of the array.

Process of understanding

As mentioned earlier, when most elements of an array are sorted from largest to smallest, many swaps are needed to sort from smallest to largest using insert sort, as in the example [6, 5, 4, 3, 2, 1] above. In this case, hill sort will first process the array to order most elements from smallest to largest, and then use insert sort to process the last, which can greatly reduce swap and improve sorting efficiency. [61, 109, 149, 111, 34, 2, 24, 119, 122, 27] First, there are 10 elements in the array (size = 10).

In the first round, size/2 was used as the gap. We can understand it as dividing the original array into small arrays according to the interval, and sorting the small arrays by insertion. Here gap = 5, small array is [61, 2], [109, 24], [149, 119], [111, 122], [34, 27]. Insert sort small arrays: [2, 61], [24, 109], [119, 149], [111, 122], [27, 34]. We do not change the array, so the sorted array is [2, 24, 119, 111, 27, 61, 109, 149, 122, 34]. As can be seen, after a sorting, some elements with larger values in the original array are moved to the back of the array, which greatly reduces the number of swap insertions.

In the second round, gap = gap/2 = 2. Continue to divide the original array into smaller arrays, and insert sort the smaller arrays. [2, 119, 27, 109, 122], [24, 111, 61, 149, 34], [24, 34, 61, 111, 149, 149] So the original array is [2, 24, 27, 34, 109, 61, 119, 111, 122, 149]

In the third round, gap = gap/2 = 1. So this is the last round, and it’s just sort the array by insertion after the second round.

Code implementation

When the array length is n, we’re going to do lg(n) round sort. Lg (n) round sort, which is a for loop, makes sense. So what does the sorting code look like for each round in the for loop?

Round 1: Gap = 5, the large array is divided into 5 smaller arrays ([61, 2], [109, 24], [149, 119], [111, 122], [34, 27]). I’m going to do five insertion lines to go from index 5 to 9.

Round 2: Gap = 2, the large array is split into two smaller arrays ([2, 119, 27, 109, 122], [24, 111, 61, 149, 34]). I’m going to do eight insertion lines to go from index 2 to 9. We know that when you sort an array by insertion, you compare each insert to the previous ordered element, but if it’s larger than the ordered element, you stop comparing and you finish inserting. The element with serial number 2 is first compared with 0. The value of serial number 2 is larger than that of serial number 0. Swap is not required to complete the insertion. The serial number 4 is compared with the serial number 2 first. Since it is smaller than 2, swap is required, and then compare with the serial number 0. Compare the serial number 6 with serial number 4, swap, and then compare with serial number 2, stop and complete the insert; The number 8 is first compared to number 6, because it is larger than 6, it stops comparing directly.

# include <stdio.h>

void shellsort(int arr[], int n) {
    int gap, i, j, temp;

    for(gap = n/2; gap > 0; gap /= 2)
        for(i = gap; i < n; i++)
            for(j = i - gap; j >= 0&& arr[j] > arr[j+gap]; j -= gap) { temp = arr[j]; arr[j] = arr[j+gap]; arr[j+gap] = temp; }}int main(a) {
    int arr[] = {61.109.149.111.34.2.24.119.122.27}; 
    int size = sizeof arr / sizeof arr[0];
    shellsort(arr, size); 

    for(int i = 0; i < size; i++) {
        printf("i: %d\n", arr[i]); 
    }
    return 0; 
}
Copy the code