Quicksort attribute

The last article introduced bubble sort and its optimization. This introduction of the quick sort is bubble sort evolved from the algorithm, much more efficient than bubble sort.

Quicksort is fast because it uses divide-and-conquer. Although it is also through constant comparison and movement to achieve the sort, but it is implemented, increases the distance of comparison and distance of movement. Bubble sort is just adjacent comparisons and swaps.

The idea of quicksort is to divide the records to be sorted into two independent parts through a sort. If the keywords of one part of the records are smaller than those of the other part, the two parts of the records can be sorted separately to achieve the purpose of ordering the whole sequence.

Not feeling the benefit literally, let’s take a look at the basic quicksort algorithm with an example that assumes the current array elements are: 5, 1, 9, 3, 7, 4, 8, 6, 2.

Basic quicksort algorithm

Initial state: 5, 1, 9, 3, 7, 4, 8, 6, 2

Select 5 as a base element, and then move the hight subscript from right to left to compare the base element with the element with the hight subscript.

If the base element is large, move hight one bit to the left; If the base element is small, the element is swapped.

In moving the hight index to the left, we want to find what is smaller than the base element and then swap.

After the swap, the left is shifted to the right to find the one larger than the base element, which is swapped.

Video animation

Animation address

Code

Result

Exchanged [2, 1, 9, 3, 7, 4, 8, 6, 5] exchanged [2, 1, 5, 3, 7, 4, 8, 6, 9] in exchange [2, 1, 4, 3, 7, 5, 8, 6, 9] in exchange [2, 1, 4, 3, 5, 7, 8, 6, 9] calculate the pivot value 5 and the corresponding subscript 4 exchange [1, 2, 3, 3, 5, 7, 8, 6, 9] calculate the pivot value 2 and the corresponding subscript 1 exchange [1, 2, 3, 4, 5, 7, 8, 6, [1, 2, 3, 4, 5, 6, 8, 7, 9] calculate the pivot value 7 and the corresponding subscript 6 calculate the pivot value 8 and the corresponding subscript 7

Optimize the selection of pivots

As a good example, suppose the array elements are 9,8,7,6,5,4,3,2,1.

So when you do the hight shift to the left you swap the first one, 1,8,7,6,5,4,3,2,9. Well, it seems to be pretty fast, but one by one you do unnecessary calculations when you move low to the right, and none of the elements are larger than the pivot value. As with bubble sort, this trip made eight comparisons, reaching the worst time complexity O(n^2). It’s the opposite of quick O(nlongn).

So what’s a better way to pick the pivot value?

I read on the Internet that it’s a random number to use as the base element. Well, it seems like a good idea, but the worst case scenario is still possible. Each pivot value can be a maximum or a minimum. If it’s a huge amount of data and the first one randomly selects the maximum number, the program card is half dead, only to kill it and run again?

To improve the case, take the middle of the three numbers. What does that mean?

It is to take three key numbers in a group of numbers, and use the middle number as the pivot. Generally, you can take the left end, right end and the middle three numbers, or you can choose them randomly. So you might say, what if all three numbers are the smallest or all three numbers are the largest?

Yes, there is still an overhead in time, which does not prove that a good pivot value has been selected. What if you take the middle of nine?

It’s not just taking nine random numbers and sorting them in half and taking the middle number.

It takes three samples from the array, three numbers at a time, one middle number from each sample, and one middle number from each of the three centers as a pivot. If you take one extreme you don’t care, but if you take three samples you’re going to get three extremes which is obviously very small. This method increases the probability of choosing a good pivot.

Optimize unnecessary exchanges

Go back to the basic quicksort algorithm and review the video animation above. We can see that there is unnecessary movement going on.

We end up with the pivot value of the turn, with the large number to its right and the small number to its left. But this pivot value fits the criteria and goes where it shouldn’t go every time. I think the place in front of it do not move, such as a trip to the end of their own to go to the place, reduce the consumption of time.

Video animation

Animation address

Code

Result

In exchange [5, 1, 2, 3, 7, 4, 8, 6, 9] in exchange [5, 1, 2, 3, 4, 7, 8, 6, 9] in exchange [4, 1, 2, 3, 5, 7, 8, 6, 9] Calculate the pivot value 5 and the corresponding subscript 4 exchange [3, 1, 2, 4, 5, 7, 8, 6, 9] [1, 2, 3, 4, 5, 7, 8, 6, 9] [1, 2, 3, 4, 5, 7, 8, 6, 9] [1, 2, 3, 4, 5, 7, 6, 8, 9] 5, 6, 7, 8, 9] figure out the pivot value 7 and the corresponding subscript 6 figure out the pivot value 8 and the corresponding subscript 7

Optimize recursive operations

We all know that recursion has some effect on performance, and there are two recursive operations at the end of quickSort. If the sequence to be sorted is extremely unbalanced, the depth of the recursion is close to the height of n (without the advantage of dichotomy). This is also the worst time complexity O(n^2), not O(nlogn) in equilibrium.

Slow time is fine, but the stack size is also limited. Each recursive operation consumes a certain amount of stack space. The more arguments a function has, the more space each recursive call consumes.

If you can reduce recursion, the performance is greatly improved.

So how do you optimize recursive operations?

Look at the following code.

Code

Result

Exchanged [2, 1, 9, 3, 7, 4, 8, 6, 5] exchanged [2, 1, 5, 3, 7, 4, 8, 6, 9] in exchange [2, 1, 4, 3, 7, 5, 8, 6, 9] in exchange [2, 1, 4, 3, 5, 7, 8, 6, 9] calculate the pivot value 5 and the corresponding subscript 4 exchange [1, 2, 3, 3, 5, 7, 8, 6, 9] calculate the pivot value 2 and the corresponding subscript 1 exchange [1, 2, 3, 4, 5, 7, 8, 6, [1, 2, 3, 4, 5, 6, 8, 7, 9] calculate the pivot value 7 and the corresponding subscript 6 calculate the pivot value 8 and the corresponding subscript 7

The execution result is the same as the previous two results. That’s a good way to do it. We change if to while, and after one recursion, low is no longer useful, so we assign pivot + 1 to low as the next argument. And as you can see, it’s all the same.

Therefore, using an iterative rather than recursive approach can reduce the stack depth and improve overall performance.