warning

This article is suitable for the sorting algorithm do not understand the novice students to watch, the big guy can directly ignore. The length is longer to allow for continuity. Brother are watching the need about an hour, but from introduction to fully understand may need 10 hours (ha ha ha, in my own experience to calculate), so your brother can collect down first, synchronous update in the making, in this article, reference to the implementation of all the algorithms in this address, a sorting algorithm can spare some time to understand every day.

Sort and their type

Our data is mostly unsorted, which means we need a way to sort it. We do this by comparing different elements to each other and improving the ranking of one element. In most cases, without comparison, we cannot decide which parts need to be sorted. After the comparison, we also need to swap the elements so that we can reorder them. A good sorting algorithm is characterized by minimal comparison and exchange. In addition, there are non-comparison based sorts, which do not require comparison of data to sort. We will introduce these algorithms to you in this article. Here are some sorting algorithms we will discuss in this article:

  • Bubble sort
  • Insertion sort
  • Selection sort
  • Quick sort
  • Merge sort
  • Bucket sort

The above sorting can be grouped and classified according to different criteria. For example, simple sorting, efficient sorting, distribution sorting and so on. We will now explore the implementation and complexity analysis of each order, as well as their advantages and disadvantages.

Time and space complexity and stability

Let’s first look at the temporal and spatial complexity and stability of the various sorting algorithms mentioned in this article. You can click here to learn more.

Bubble sort

Bubble sort is one of the most discussed sorting algorithms in the programming world, and the first algorithm that most developers learn about sorting. Bubble sort is a comparison-based sorting algorithm, which is regarded as one of the least efficient sorting algorithms. Bubble sort always requires the maximum number of comparisons, and the average and worst complexity are the same.

In bubble sort, each pending item is compared to the remaining items and swapped as needed. Here is the pseudocode for bubble sort.

procedure bubbleSort(A: list of sortable items)
n = length(A)
for i = 0 to n inclusive do
 for j = 0 to n - 1 inclusive do
    if A[j] > A[j + 1] then
        swap(A[j + 1], A[j])
    end if
  end for
end for
end procedure
Copy the code

As we saw in the pseudocode above, we first run an outer loop to ensure that we iterate over each number, and an inner loop to ensure that once we point to an item, we compare that number to other items in the data set. The following figure shows a single iteration of sorting an item in the list. Assuming that our data include the following items: 20,45,93,67,10,97,52,88,33,92. The first iteration will be the following:

The item with the background color shows the two items we are comparing. As we can see, the first iteration of the outer loop results in the largest item being stored at the top of the list. Then continue until we have traversed each item in the list. Now let’s implement the bubble sort algorithm in PHP.

We can use A PHP array to represent an unsorted list of numbers. Since arrays have both indexes and values, we can easily iterate over each item by location and swap them to the appropriate location.

function bubbleSort(&$arr) : void
{
	$swapped = false;
	for ($i = 0, $c = count($arr); $i < $c; $i{+ +)for ($j = 0; $j < $c - 1; $j{+ +)if ($arr[$j + 1] < $arr[$j]) {
				list($arr[$j].$arr[$j + 1]) = array($arr[$j+ 1].$arr[$j]); }}}}Copy the code

Complexity analysis of bubble sort

For the first pass, in the worst case, we have to do n-1 comparisons and swaps. For the second traversal, in the worst case, we need n minus 2 comparison and interchange. So, if we write it step by step, then we will see: Complexity = n-1 + n-2 +….. Plus 2 plus 1 is n times n minus 1 over 2, which is O(n2). Thus, the complexity of bubble sort is O(n2). Assigning temporary variables, swapping, traversing inner loops, etc., takes some constant time, but we can ignore them because they are invariant. Here is a list of the time complexity of bubble sort for best, average, and worst cases:

best time complexity Ω (n)
worst time complexity O(n2)
average time complexity Θ (n2)
space complexity (worst case) O(1)

Although the time complexity of bubble sort is O(n2), we can use some improvements to reduce the number of data comparisons and exchanges during the sorting process. The best time is O(n) because we need at least one inner loop to be sure that the data is sorted.

Improvements to bubble sort

One of the most important aspects of bubble sort is that for each iteration in the outer loop, there is at least one exchange. If there is no swap, the list is sorted. We can use it to improve our pseudocode

procedure bubbleSort(A: list of sortable items)
    n = length(A)
    for i = 1 to n inclusive do
        swapped = false
        for j = i to n - 1 inclusive do
            if A[j] > A[j + 1] then
                swap(A[j], A[j + 1])
                swapped = true
            endif
        end for
        if swapped = false
            break
        endif
    end for
end procedure
    
Copy the code

As we can see, we now have a flag set to false for each iteration, and we expect that in the internal iteration, the flag will be set to true. If the flag is still false after the inner loop completes, then we can break the outer loop.

function bubbleSort(&$arr) : void
{
	for ($i = 0, $c = count($arr); $i < $c; $i++) {
		$swapped = false;
		for ($j = 0; $j < $c - 1; $j++) {
			if ($arr[$j + 1] < $arr[$j]) {
				list($arr[$j], $arr[$j + 1]) = array($arr[$j + 1], $arr[$j]);
				$swapped = true; }}if(! $swapped)break; // No exchange occurred, the algorithm ends}}Copy the code

We also find that in the first iteration, the largest item is placed on the right side of the array. In the second loop, the second largest item will be second to the right of the array. We can imagine that after each iteration, the i-th unit already stores the sorted items, and there is no need to access the index and do comparisons. Therefore, we can reduce the number of iterations and reduce comparisons from internal iterations. This is our second improved pseudocode

procedure bubbleSort(A: list of sortable items)
    n = length(A)
    for i = 1 to n inclusive do
        swapped = false
        for j = 1 to n - i - 1 inclusive do
            if A[j] > A[j + 1] then
                swap(A[j], A[j + 1])
                swapped = true
            endif
        end for
        if swapped = false
            break
        end if
    end for
end procedure
   
Copy the code

The following is the PHP implementation

function bubbleSort(&$arr) : void
{
	
	for ($i = 0, $c = count($arr); $i < $c; $i++) {
        $swapped = false;
		for ($j = 0; $j < $c - $i - 1; $j++) {
			if ($arr[$j + 1] < $arr[$j]) {
				list($arr[$j], $arr[$j + 1]) = array($arr[$j + 1], $arr[$j]);
				$swapped = true;
			}

			if(! $swapped)break; // No exchange occurred, the algorithm ends}}}Copy the code

Let’s look at the inner loop in the code. The only difference is$c $c $c; Other parts are the same as the first improvement. Therefore, for 20, 45, 93, 67, 10, 97, 52, 88, 33, 92, we can reasonably assume that after the first iteration, the top number 97 will not be considered for the second iteration comparison. The same applies to 93 and will not be considered for the third iteration.

When we look at the previous picture, the question that should immediately come to mind is “Isn’t 92 already sorted? Do we need to compare all the numbers again? Yes, that’s a good question. So we’ve done the last swap in the inner loop and we know where the next array has been sorted. Therefore, we can set a bound for the next loop, with pseudocode like this:

procedure bubbleSort(A: list of sortable items)
    n = length(A)
    bound = n - 1
    for i = 1 to n inclusive do
        swapped = false
        bound = 0
        for j = 1 to bound inclusive do
            if A[j] > A[j + 1] then
                swap(A[j], A[j + 1])
                swapped = true
                newbound = j
            end if
        end for
        bound = newbound
        if swapped = false
            break
        endif
    end for
end procedure
   
Copy the code

Here, we set the bounds after each inner loop completes, and make sure we don’t iterate unnecessarily. Here is the PHP code:

function bubbleSort(&$arr) : void
{
	$swapped = false;
    $bound = count($arr) - 1;
	for ($i = 0, $c = count($arr); $i < $c; $i++) {
		for ($j = 0; $j < $bound; $j++) {
			if ($arr[$j + 1] < $arr[$j]) {
				list($arr[$j], $arr[$j + 1]) = array($arr[$j + 1], $arr[$j]);
				$swapped = true;
				$newBound = $j;
			}
		}
		$bound = $newBound;
		if(! $swapped)break; // No exchange occurred, the algorithm ends}}Copy the code

Selection sort

Selection sort is another comparison-based sorting algorithm that is similar to bubble sort. The big difference is that it requires less swapping than bubble sort. In selection sort, we first find the smallest/largest item in the array and place it first. If we sort in descending order, then what we’re going to get out of the array is the maximum. For ascending, we get the minimum. In the second iteration, we will find the second maximum or minimum value of the array and place it in the second place. Until we get each number in the right place. This is called selection sort, and the pseudo-code for selection sort is as follows:

procedure  selectionSort( A : list of sortable items)
    n = length(A)
    for i = 1 to n inclusive do
        min  =  i
        for j = i + 1 to n inclusive do
            if  A[j] < A[min] then
                min = j 
            end if
        end  for

        ifmin ! = i swap(a[i], a[min]) endif
    end  for
end procedure
Copy the code

Looking at the algorithm above, we can see that after the first iteration in the outer loop, the first minimum term is stored in the first place. In the first iteration, we select the first item and then find the minimum from the remaining items (from 2 to N). Let’s assume that the first item is the minimum. We find another minimum, and we will mark its location until we have scanned the rest of the list and found the new minimum minimum. If we don’t find the minimum, then our hypothesis is correct, this is indeed the minimum. The diagram below:

As we saw in the previous figure, we start with the first item in the list. We then find the minimum value of 10 from the rest of the array. At the end of the first iteration, we swapped values for only two places (marked with arrows). So, at the end of the first iteration, we get the minimum in the array. We then point to the next number, 45, and start to find the next smallest item to the right of its position, and we find 20 from the remaining items (as shown by the two arrows). At the end of the second iteration, we swap the value of the second position with the newly found minimum position from the rest of the list. This operation continues to the last element, and at the end of the process, we have a sorted list. Here is the IMPLEMENTATION of the PHP code.

function selectionSort(&$arr)
{
	$count = count($arr);

	// Repeat the number of elements -1
	for ($j = 0; $j <= $count - 1; $j++) {
		// Set the first unsorted element to the minimum value
		$min = $arr[$j];
		// Iterates over each unsorted element
		for ($i = $j + 1; $i < $count; $i++) {
			// If this value is less than the minimum
			if ($arr[$i] < $min) {
				// Set this element to the minimum value
				$min = $arr[$i];
				// Set the minimum value to the element's position$minPos = $i; }}// The inner loop ends by swapping the minimum value with the unsorted element
		list($arr[$j], $arr[$minPos]) = [$min, $arr[$j]]; }}Copy the code

Select the complexity of the sort

Selection sort also looks like bubble sort, it has two for loops, from 0 to n. The difference between bubble sort and selection sort is that, in the worst case, selection sort makes the maximum number of swaps n minus 1, whereas bubble sort can require n by n swaps. In selection sorting, the best case, worst case, and average case have similar time complexity.

best time complexity Ω (n2)
worst time complexity O(n2)
average time complexity Θ (n2)
space complexity (worst case) O(1)

Insertion sort

So far, we’ve seen two comparison-based sorting algorithms. Now, we’ll explore another sorting algorithm, insertion sort. Compared to the other two sorting algorithms you just saw, it has the simplest implementation. If the number of items is small, insertion sort is better than bubble sort and selection sort. If the data set is large, like bubble sort, it becomes inefficient. Insertion sort works by inserting numbers into the correct place in the sorted list. It starts with the second item in the array and determines if the item is smaller than the current value. If so, it moves the items and stores the smaller items in their correct locations. Then, it moves on to the next item, and the same principle continues until the entire array is sorted.

procedure insertionSort(A: list of sortable items)
    n length(A)
    for i=1 to n inclusive do
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key do
            A[j+1] = A[j]
            j--
        end while
        A[j + 1] = key
    end for
end procedure
Copy the code

If we have the following array, the elements are 20, 45, 93, 67, 10, 97, 52, 88, 33, 92. Let’s start with the second item 45. Now we’ll start with the first item to the left of 45, and then go to the beginning of the array to see if there are any values on the left greater than 45. Since there is only 20, there is no need to insert, currently the two terms (20, 45) are sorted. Now we move the pointer to 93, and we start from it again, and the comparison starts at 45, and since 45 is not greater than 93, we stop. Now, the first three items (20, 45, 93) are sorted. Next, for 67, we start with the left-hand side of the number. The first number on the left is 93, which is larger, so you have to move one position. Let’s move from 93 to 67. Then, we move to the next item to its left, 45. 45 is less than 67 and no further comparison is required. Now, let’s move 93 to 67, and then let’s insert 67 to 93. Continue until the entire array is sorted. The following figure illustrates the until full sort process using insertion sort at each step.

function insertionSort(array &$arr)
{
	$len = count($arr);
	for ($i = 1; $i < $len; $i++) {
		$key = $arr[$i];
		$j = $i - 1;

		while ($j >= 0 && $arr[$j] > $key) {
			$arr[$j + 1] = $arr[$j];
			$j--;
		}
		$arr[$j + 1] = $key; }}Copy the code

The complexity of insertion sort

Insertion sort has a time complexity similar to bubble sort. The difference with bubble sort is that the number of exchanges is much lower than bubble sort.

best time complexity Ω (n)
worst time complexity O(n2)
average time complexity Θ (n2)
space complexity (worst case) O(1)

The idea of divide and conquer in sorting

So far, we’ve looked at some sorting algorithms that sort the full list each time. We have to deal with a larger set of numbers at a time. We can solve this problem by trying to make the data set smaller. The idea of divide and conquer has helped us a lot. In this way, we divide a problem into two or more subproblems or sets, and then combine all the results of the subproblems to get the final result. This is known as the divide and conquer method, and the divide and conquer method allows us to effectively solve the sorting problem and reduce the complexity of the algorithm. Two of the most popular sorting algorithms are merge sort and quicksort, which use dime-and-conquer algorithms to sort data and are therefore considered the best sorting algorithms.

Merge sort

As we already know, merge sort uses divide and conquer to solve the sorting problem, and we use two procedures to solve this problem. The first is to divide the problem set into problems small enough to be easily solved, and then combine the results. We’re going to do the divide-and-conquer part recursively. The figure below shows how the divide-and-conquer approach can be used.

Based on the previous image, we can now begin to prepare our code, which will have two parts.

Merge sort * core: merge of two ordered subsequences (functionMerge) * time complexity O(nlogn) * space complexity O(n) * John von Neumann * is second only to quicksort. It is a stable sorting algorithm, which is generally used to sort the whole unordered, but the sequence of the relative ordered subterms * is generally not used for internal (memory) sorting, and is generally used for external sorting */function mergeSort($arr)
{
    $lenght = count($arr); 
    if ($lenght= = 1)return $arr;
    $mid = (int)($lenght/ 2); // Split the array to be sorted in half$left = mergeSort(array_slice($arr, 0, $mid));
    $right = mergeSort(array_slice($arr.$mid));

    return merge($left.$right);
}

function merge(array $left, array $right) {// Initialize the two Pointers$leftIndex = $rightIndex = 0;
    $leftLength = count($left);
    $rightLength = count($right); // Temporary space$combine= []; // Compare the elements of the two Pointerswhile ($leftIndex < $leftLength && $rightIndex < $rightLength) {// If the left element is greater than the right element, put the right element in a separate array and move the right pointer backwardif ($left[$leftIndex] > $right[$rightIndex]) {
            $combine[] = $right[$rightIndex];
            $rightIndex+ +; }else{// If the right element is greater than the left element, put the left element in a separate array and move the left pointer backward$combine[] = $left[$leftIndex];
            $leftIndex+ +; }} // Put all the values from the right array into the returned array, and then put the values from the left array into the returned arraywhile ($leftIndex < $leftLength) {
        $combine[] = $left[$leftIndex];
        $leftIndex+ +; } // Put all of the left array into the returned array, and then put the values of the right array into the returned arraywhile ($rightIndex < $rightLength) {
        $combine[] = $right[$rightIndex];
        $rightIndex+ +; }return $combine;
}
Copy the code

We partition the array until it reaches a size of one. We then start to merge the results using the merge function. In the merge function, we have an array to store the result of the merge. Because of this, merge sort is actually more spatially complex than the other algorithms we’ve seen so far.

The complexity of merge sort

Since merge sort follows a divide and conquer approach, we have to solve these two complex problems. For an array of n sizes, we first need to split the array into two parts and then merge them to get an array of n sizes. We can look at the diagram below

The time required to solve each layer is cn, so if there are l layers, then the total time complexity will be ln. Because there’s logn plus 1, so it’s going to be cn logn plus 1. We remove the constant order and the linear order, and the end result is that the time complexity is O(nlog2n).

best time complexity Ω (nlogn)
worst time complexity O(nlogn)
average time complexity Θ (nlogn)
space complexity (worst case) O(n)

Quick sort

Quicksort is an improvement on bubble sort. Its basic idea is: by a trip to sort to sort data split into separate two parts, one part of all the data are smaller than the other part of all data, and then of the two parts data respectively according to the method of quick sort, the entire sorting process can be recursive, to achieve the whole data into an orderly sequence.

function qSort(array &$arr, int $p, int $r)
{
	if ($p < $r) {
		$q = partition($arr, $p, $r);
		qSort($arr, $p, $q);
		qSort($arr, $q + 1, $r); }}function partition(array &$arr, int $p, int $r)
{
	$pivot = $arr[$p];
	$i = $p - 1;
	$j = $r + 1;

	while (true) {
		do {
			$i++;
		} while ($arr[$i] < $pivot);

		do {
			$j--;
		} while ($arr[$j] > $pivot);

		if ($i < $j) {
			list($arr[$i], $arr[$j]) = [$arr[$j], $arr[$i]];
		} else {
			return$j; }}}Copy the code

The complexity of quicksort

In the worst case, quicksort has the same time complexity as bubble sort, and the choice of pivot is important. Here is the complexity analysis of quicksort.

best time complexity Ω (nlogn)
worst time complexity O(n2)
average time complexity Θ (nlogn)
space complexity (worst case) O(logn)

For quicksort optimization, interested old iron can click here to view.

Bucket sort

Bucket sort, or so-called box sort, works by dividing an array into a finite number of buckets. Each bucket is sorted separately (it is possible to use another sorting algorithm or continue using bucket sort recursively).

/** * Bucket sort * is not a comparison-based sort * T(N, M) = O(M + N) N is the number of sorted data and M is the number of data values * Cardinal sort needs to be considered when M >> N */

function bucketSort(array &$data)
{
    $bucketLen = max($data) - min($data) + 1;
    $bucket = array_fill(0, $bucketLen, []);

    for ($i = 0; $i < count($data); $i++) {
        array_push($bucket[$data[$i] - min($data)], $data[$i]);
    }

    $k = 0;

    for ($i = 0; $i < $bucketLen; $i++) {
        $currentBucketLen = count($bucket[$i]);

        for ($j = 0; $j < $currentBucketLen; $j++) { $data[$k] = $bucket[$i][$j]; $k++; }}}Copy the code

The PHP implementation of radix sort, interested students can also visit this page to view.

The complexity of quicksort

The time complexity of bucket sort is better than other comparison-based sorting algorithms. Here is the complexity of bucket sort

best time complexity Ω (n + k)
worst time complexity O(n2)
average time complexity Θ (n + k)
space complexity (worst case) O(n)

PHP’s built-in sorting algorithm

PHP has a rich library of predefined functions, including different sorting functions. It has different functions to sort items in an array, you can choose to sort by value or by key/index. We can also keep the array values associated with their respective keys when sorting. Here is a summary of these functions

The function name function
sort() Arrange arrays in ascending order. The value/key association is not reserved
rsort() Sort arrays in reverse/descending order. The index/key association is not reserved
asort() Sort arrays while maintaining index associations
arsort() Sort arrays in reverse and maintain index associations
ksort() Sort arrays by keyword. It’s the key to keeping the data relevant. This is useful for associative arrays
krsort() Sort the array keys in order
natsort() Use a natural order algorithm to sort the array and preserve the value/key association
natcasesort() Use a case-insensitive “natural order” algorithm to sort the array and preserve the value/key association.
usort() Use user-defined comparison functions to sort arrays by value, and do not maintain value/key associations. The second argument is the callable function for the comparison
uksort() The array is sorted using user-defined comparison function keys, and the value/key association is not maintained. The second argument is the callable function for the comparison
uasort() Use user-defined comparison functions to sort arrays by value and maintain value/key associations. The second argument is the callable function for the comparison

Constants under sort(), rsort(), ksort(), krsort(), asort(), and arsort() can be used

  • SORT_REGULAR – Normal comparison unit (does not change type)
  • SORT_NUMERIC – the unit is compared as a number
  • SORT_STRING – Units are compared as strings
  • SORT_LOCALE_STRING – Compares cells as strings based on the current locale setting, which can be changed using setlocale().
  • SORT_NATURAL – and natsort() are similar to sorting strings in “natural order” for each cell. New in PHP 5.4.0.
  • SORT_FLAG_CASE – Can be merged with SORT_STRING OR SORT_NATURAL (OR bit operation) to sort strings case-insensitive.

Complete content

All the algorithms referenced in this article are implemented at this address, and the main content is to summarize the underlying data structures and algorithms using PHP syntax. Welcome to collect old iron ~