# Insert sort: simple insert, hill sort

Posted on Oct. 11, 2023, 11:26 a.m. by 成雅慧
Category: The back-end Tag: The back-end

This is the fourth day of my participation in the August Text Challenge.More challenges in August

So now we're ready for our sorting correlation algorithm. I believe that whether systematic learning or no systematic learning algorithm friends will have heard of many very famous sorting algorithm, of course, our entry today is not directly from the most common algorithm, but in accordance with a certain rule of an introduction.

First, the sorting algorithm we will introduce is the sorting algorithm for insertion types. As the name suggests, insertion sort is to "insert" one or more unordered records into an ordered sequence. Typical examples are simple insertion sort and Hill sort.

## Simple insertion sort

Simple insert sort, also called direct insert sort. Let's look at the code first, and then explain the next step.

``````function InsertSort(\$arr)
{
\$n = count(\$arr);
for (\$i = 1; \$i  \$n; \$i{+ +)// Start the loop, starting with the second element, subscript 1
\$tmp = \$arr[\$i]; // Fetch the first element of the unsorted sequence
for (\$j = \$i; \$j  0  \$arr[\$j - 1]  \$tmp; \$j-) {// Judge from the current index forward, if the previous element is larger than the current element
\$arr[\$j] = \$arr[\$j - 1]; // Move the elements in turn
}
// Put the element in the appropriate position
\$arr[\$j] = \$tmp;
}
echo implode(', '.\$arr), PHP_EOL;
}

InsertSort(\$numbers);

// 49, 38, 65, 97, 76, 13, 27, 49
// 13, 27, 38, 49, 49, 65, 76, 97
Copy the code``````

It's not a lot of code, but it's actually pretty easy to understand. Let's take the first two numbers of the test data to illustrate this briefly.

First, the first loop starts at 1, which means that the first unordered sequence element fetched is TMP = arr[1], which is currently TMP = 38.

The current loop is to determine whether the element j-1 is larger than the current TMP element. If so, enter the loop body, arr[1] = arr[0]. So far, both ARR [0] and ARR [1] are now 49. The whole sequence is 49,49,65...

Let arr[0] = \$TMP = 38. (Loop time j--). The entire sequence is 38,49,65...

In the following image, we can see more clearly the process of completing the sequence.

As you can see from the above steps, simple insertion sort is the process of starting from one side and gradually ordering the previous data. As can be seen from the code, it is continuously decreasing j in the internal loop, comparing with the previous ordered sequence. When it finds its proper position, it puts the data into this position.

From the code and our analysis, the time complexity of simple insertion sort is O(n2). At the same time, it's stable sort. What is stable sort? Those of you who are careful should have noticed that in our test code, there are two identical numbers, namely 49. Stable means that the position of the same data does not change before and after sorting, and the 49 in front is still in front of the 49 behind. That's sort stability.

In addition, simple insertion sort is more suitable for the case where the initial records are basically ordered. When the initial records are out of order and n is large, the time complexity of this algorithm will be relatively high and it is not suitable to use.

## Hill sorting

Simple insertion sort makes sense, but what's hill sort? Don't worry, we don't know much from the name, because the sequence is named after its discoverer. Hill sort is actually an insertion sort algorithm.

As mentioned above, simple insertion sort is suitable for the case of basic order, and Hill sort is to improve the efficiency of simple insertion sort. Its main purpose is to reduce the size of n of sorting and form the basic ordered format of data through several sorting.

For this algorithm, we can't do the code first, but let's look at the graph.

Does that make sense? We are actually grouping the data in increments. For example, in this diagram, we sort the data in increments of 5 for the first time and 3 for the second time. So the third time, it's incremented by 1, and it's just a normal simple insertion sort. And we'll see that in our code in a moment.

Let's do a detailed analysis of the three sorts in incremental iteration order:

1) In the first iteration, we set the grouping increment to 5. At this time, there are three groups of data respectively, namely 49 and 13, 38 and 27, 65 and 49. Then we do a simple insertion sort for these three groups of data, and the array result is 13, 27, 49, 97, 76, 49, 38, 65.

2) In the second iteration, the grouping increment is 3, and then it is divided into two groups with three data in each group, namely 13, 97 and 38 as one group, and 27, 76 and 65 as the other group. A simple insertion sort between the two sets of data results in 13, 27, 49, 38, 65, 49, 97, 76.

3) In fact, after grouping and sorting twice, we can see that the array is basically ordered. The last thing is to do simple insertion sort again with a block increment of 1. In a nutshell, this last step is just plain old simple insert sort.

After the step-by-step explanation is not a lot clearer, again, hill sort is actually a large range of insertion sort by group, and finally narrowed down to a simple insertion sort with only 1 increment. Let's look at the code again:

``````function ShellSort(\$arr)
{
\$n = count(\$arr);
\$sedgewick = [5.3.1];

// The initial increment cannot exceed the length of the sequence to be sorted
for (\$si = 0; \$sedgewick[\$si]  =\$n; \$si+ +);// Start the grouping loop by 5, 3, and 1
for (\$d = \$sedgewick[\$si]; \$d  0; \$d = \$sedgewick[+ +\$si]) {
// Get the current number of packets
for (\$p = \$d; \$p  \$n; \$p{+ +)\$tmp = \$arr[\$p];
// Insert sort starts, within the current group
for (\$i = \$p; \$i =\$d  \$arr[\$i - \$d]  \$tmp; \$i- =\$d) {
\$arr[\$i] = \$arr[\$i - \$d];
}
\$arr[\$i] = \$tmp; }}echo implode(', '.\$arr), PHP_EOL;
}
ShellSort(\$numbers);
Copy the code``````

The code looks like it has three for loops. Where does it improve efficiency? In fact, the efficiency of Hill sorting is really limited, it is actually through the first few times of grouping to make the data basic order. In the grouped state, the number of data comparisons does not reach the level of N. When the last simple sort is done, the whole data is basically ordered, in which case the number of swaps is significantly reduced, so its time complexity can ideally be reduced to O(log2n)2.

## conclusion

How about a sort of starter meal? I don't think we're just gonna pick up street bubbling and spewing. Just because you're not famous doesn't mean you won't be able to use it. For example, when I was interviewing for a job there was a company that stated on the interview questions that bubbling and quicklining were not allowed. At this time, I believe that simple insertion sort intuitive and easy to understand the characteristics of the interview will help us overcome the difficulties of oh!

Test code:

Github.com/zhangyue050...

Reference Documents:

Examples in this paper are from The second edition of Data Structures, Weimin Yan

Data Structures, 2nd edition, Chen yue

Search