Sorting algorithm


Bubble sort

Principle

Compare the keys of adjacent records pairwise and swap them whenever they are out of order. There are two bubbling directions: one bubbles smaller elements toward the front, the other bubbles larger elements toward the back.

Performance

The time complexity is $O(N^2)$ and the space complexity is $O(1)$. The sort is stable. The number of comparisons is independent of the initial sequence, but the number of swaps depends on it.

Optimization

If the initial sequence is already sorted, bubble sort still performs $O(N^2)$ comparisons but never swaps. We can exploit this with a flag: when a full pass makes no swap, the sequence is sorted and we can stop early. For an already-sorted input this brings the cost down to a single pass, though the worst-case complexity is unchanged in order of magnitude.

Code

public static void bubbleSort(int[] arr) {
    for (int i = 0; i < arr.length - 1; i++) {
        boolean swapped = false;
        for (int j = arr.length - 1; j > i; j--) {
            if (arr[j] < arr[j - 1]) {
                int tmp = arr[j];
                arr[j] = arr[j - 1];
                arr[j - 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped) return; // no swap in this pass: already sorted
    }
}

Insertion sort

Principle

Take the next element from the unsorted part of the sequence and insert it into its proper position in the already-sorted prefix.

Performance

The time complexity is $O(N^2)$ and the space complexity is $O(1)$. The sort is stable, and both the number of comparisons and the number of moves depend on the initial sequence.

Optimization

On each pass, direct insertion sort scans linearly backward for the insertion position. This can be optimized by locating the position with binary search instead: binary (half-interval) insertion sort.

The average performance is better because the number of comparisons drops to $O(N\log N)$: inserting the $i$-th element always takes $\lfloor \log_2 i \rfloor + 1$ comparisons, independent of the initial sequence. The element moves are unchanged, however, so the overall time complexity is still $O(N^2)$. The sort remains stable.

Code

public static void insertSort(int[] arr) {
    for (int i = 1; i < arr.length; i++) {
        for (int j = i; j > 0; j--) {
            if (arr[j] < arr[j - 1]) {
                int tmp = arr[j];
                arr[j] = arr[j - 1];
                arr[j - 1] = tmp;
            } else {
                break; // the prefix before j is sorted, stop early
            }
        }
    }
}
public static void insertBinarySort(int[] arr) {
    for (int i = 1; i < arr.length; i++) {
        if (arr[i] < arr[i - 1]) {
            int temp = arr[i];
            int low = 0, high = i - 1, mid;
            // binary search for the insertion position in arr[0..i-1]
            while (low <= high) {
                mid = (low + high) / 2;
                if (temp < arr[mid]) {
                    high = mid - 1;
                } else {
                    low = mid + 1;
                }
            }
            // shift elements right and insert
            for (int j = i; j > low; j--) {
                arr[j] = arr[j - 1];
            }
            arr[low] = temp;
        }
    }
}

Shell sort

Principle

Shell sort is an improved insertion sort, built on two properties of insertion sort:

  • Insertion sort is very efficient on almost-sorted data, approaching linear time.
  • But insertion sort moves data only one position per step, which is inefficient otherwise.

So the idea of Shell sort is:

  • Choose a suitable gap < n as the interval and split the elements into gap subsequences: all elements whose indices are gap apart belong to the same subsequence. Then straight-insertion-sort each subsequence.
  • Shrink the gap, e.g. gap = ceil(gap/2), and repeat the partitioning and sorting.
  • Finally, when gap = 1, all elements lie in one sequence and a last insertion sort finishes the job.

Performance

At the start the gap is large, each subsequence has few elements, and sorting them is fast, which overcomes the weakness of direct insertion sort. Later, as the gap shrinks, the subsequences grow, but by then most elements are roughly in order, so the algorithm inherits the strength of direct insertion sort and runs at nearly linear speed.

Code

public static void shellSort(int[] arr) {
    int gap = arr.length / 2;
    while (gap > 0) {
        // straight insertion sort on each of the gap subsequences
        for (int i = 0; i < gap; i++) {
            for (int j = i + gap; j < arr.length; j += gap) {
                if (arr[j] < arr[j - gap]) {
                    int temp = arr[j];
                    int k = j - gap;
                    while (k >= 0 && arr[k] > temp) {
                        arr[k + gap] = arr[k];
                        k -= gap;
                    }
                    arr[k + gap] = temp;
                }
            }
        }
        gap /= 2; // shrink the gap
    }
}
public static void shellSort2(int[] arr) {
    for (int gap = arr.length / 2; gap > 0; gap /= 2) {
        // gapped insertion sort: each element is inserted within its own subsequence
        for (int i = gap; i < arr.length; i++) {
            int temp = arr[i];
            int j;
            for (j = i; j >= gap && temp < arr[j - gap]; j -= gap) {
                arr[j] = arr[j - gap];
            }
            arr[j] = temp;
        }
    }
}

Selection sort

Principle

On each pass, find the minimum of the unsorted part, remember its position, and place it at the end of the sorted part.

Performance

The time complexity is $O(N^2)$ and the space complexity is $O(1)$. The sort is unstable (the long-distance swap of the minimum can carry an element past an equal one). Each pass fixes the final position of one element, and the number of comparisons is independent of the initial sequence.

code

public static void selectSort(int[] arr) {
    int len = arr.length;
    // Select one minimum at a time
    for (int i = 0; i < len - 1; i++) {   // only n-1 passes are needed
        int min = i;
        for (int j = i + 1; j < len; j++) {
            if (arr[min] > arr[j]) {
                min = j;
            }
        }
        if (min != i) {
            int temp = arr[i];
            arr[i] = arr[min];
            arr[min] = temp;
        }
    }
}

Quick sort

Principle

Divide and conquer:

  • Divide: choose a pivot element and partition A[p…r] into A[p…pivotpos-1] and A[pivotpos+1…r], so that elements to the left are smaller than the pivot and elements to the right are larger.
  • Conquer: recursively sort the two partitions.
  • Combine: the pivot already separates the two subarrays in place, so no merge step is needed.

Performance

The average time complexity of quick sort is $O(N\log N)$ with $O(\log N)$ space for recursion, but the worst case is $O(N^2)$ time and $O(N)$ space. The sort is unstable; each partition fixes the final position of one element (the pivot), and the running time depends on the initial sequence.

Code

    public static void swap(int i, int j, int[] arr) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }

    public static void sortQuick(int[] quickArray) {
        int[] arr = quickArray;
        quickSort(0, arr.length - 1, arr);
    }
    
    public static void quickSort(int start, int end, int[] arr) {
        if (start < end) {
            int pivot = arr[start];
            int left = start;
            int right = end;
            while (left != right) {
                // from the right, find an element smaller than the pivot
                while (arr[right] >= pivot && left < right) right--;
                // from the left, find an element larger than the pivot
                while (arr[left] <= pivot && left < right) left++;
                swap(left, right, arr);
            }
            arr[start] = arr[left];
            arr[left] = pivot;
            quickSort(start, left - 1, arr);
            quickSort(left + 1, end, arr);
        }
    }

Merge sort

Principle

First consider merging two sorted arrays: compare the front elements of the two arrays, take the smaller one, and advance that array's pointer; repeat until one array is exhausted, then copy the remainder of the other.

Now add recursive decomposition: split the array into a left half and a right half. If each half is internally sorted, the two halves can be merged as above. How do we make each half sorted? Split it again, until each piece holds a single element and is trivially sorted; then merge adjacent pieces back up.

Performance

The time complexity is always $O(N\log N)$ and the space complexity is always $O(N)$. The running time is independent of the initial sequence, and the sort is stable.

Code

    public static void mergeSort(int[] arr) {
        mergeSortDiv(arr, 0, arr.length - 1);
    }

    public static int[] mergeSortDiv(int[] arr, int low, int high) {
        int mid = (low + high) / 2;
        if (low < high) {
            // sort the left half
            mergeSortDiv(arr, low, mid);
            // sort the right half
            mergeSortDiv(arr, mid + 1, high);
            // merge the two sorted halves
            merge(arr, low, mid, high);
        }
        return arr;
    }

    public static void merge(int[] nums, int low, int mid, int high) {
        int[] temp = new int[high - low + 1];
        int i = low;      // left pointer
        int j = mid + 1;  // right pointer
        int k = 0;
        // Move the smaller front element into the new array first
        while (i <= mid && j <= high) {
            if (nums[i] <= nums[j]) {   // <= keeps the sort stable
                temp[k++] = nums[i++];
            } else {
                temp[k++] = nums[j++];
            }
        }
        // Copy the remaining left elements
        while (i <= mid) {
            temp[k++] = nums[i++];
        }
        // Copy the remaining right elements
        while (j <= high) {
            temp[k++] = nums[j++];
        }
        // Copy temp back into nums
        for (int k2 = 0; k2 < temp.length; k2++) {
            nums[k2 + low] = temp[k2];
        }
    }

Heap sort

Principle

Heap sort comes up frequently in top-K problems. It is implemented on a binary heap, which is essentially a one-dimensional array viewed as a nearly complete binary tree.

A binary heap has the following properties:

  • The key of a parent node is always greater than or equal to (max-heap) or less than or equal to (min-heap) the keys of its children.
  • The left and right subtrees of every node are themselves binary heaps (of the same kind, max or min).

Steps:

  1. Build the max-heap (Build_Max_Heap): for array indices 0 to n-1, every node with index ≥ n/2 is a leaf and is trivially a max-heap on its own. So it suffices to sift down the nodes from index n/2 - 1 down to 0; by the time a node is processed, its left and right subtrees are already max-heaps.

  2. Heap sort (HeapSort): the heap lives inside the array, and after building the max-heap the array itself is not yet sorted. First swap heap[0] with heap[n-1] and re-adjust heap[0…n-2] into a max-heap; then swap heap[0] with heap[n-2] and re-adjust heap[0…n-3]; repeat until heap[0] and heap[1] are swapped. Each round moves the current maximum into the growing sorted suffix, so the whole array ends up sorted.

  3. Max-heap adjustment (Max_Heapify): the sift-down routine that both steps above call; it pushes a node down until neither child is larger than it.

Performance

The time complexity is $O(N\log N)$ and the space complexity is $O(1)$, since the sort works in place on the input array without allocating new space. The algorithm is unstable and independent of the initial sequence.

Usage scenarios

When you only need the maximum or minimum values, as in priority queues or job scheduling.

Code

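The original links out to a separate implementation. As a minimal in-place sketch of the three steps above (Build_Max_Heap, HeapSort, Max_Heapify), with method names of my own choosing:

```java
public static void heapSort(int[] arr) {
    int n = arr.length;
    // Build_Max_Heap: sift down every non-leaf node, from n/2 - 1 down to 0
    for (int i = n / 2 - 1; i >= 0; i--) {
        maxHeapify(arr, i, n);
    }
    // HeapSort: swap the maximum to the end, shrink the heap, re-adjust
    for (int end = n - 1; end > 0; end--) {
        int tmp = arr[0]; arr[0] = arr[end]; arr[end] = tmp;
        maxHeapify(arr, 0, end);
    }
}

// Max_Heapify: sift arr[i] down within arr[0..size-1]
private static void maxHeapify(int[] arr, int i, int size) {
    while (true) {
        int left = 2 * i + 1, right = 2 * i + 2, largest = i;
        if (left < size && arr[left] > arr[largest]) largest = left;
        if (right < size && arr[right] > arr[largest]) largest = right;
        if (largest == i) return;   // both children are no larger, done
        int tmp = arr[i]; arr[i] = arr[largest]; arr[largest] = tmp;
        i = largest;                // continue sifting down
    }
}
```

The sketch sifts iteratively rather than recursively; the behavior is the same.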

Counting sort

Principle

First count the occurrences of each value, then compute each element's absolute (final) position in the output from the running totals of the counts, and finally move each element of the input into the sorted array at that position.

Performance

The time complexity is $O(N+K)$ and the space complexity is $O(N+K)$, where $K$ is the range of the key values. The sort is stable, independent of the initial sequence, and needs no comparisons.
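A minimal sketch of this principle, assuming non-negative integer keys with a known maximum k (the method name is illustrative):

```java
public static int[] countingSort(int[] arr, int k) {
    int[] count = new int[k + 1];
    // count the occurrences of each value
    for (int x : arr) count[x]++;
    // running totals: count[v] is now one past the final position of value v
    for (int i = 1; i <= k; i++) count[i] += count[i - 1];
    int[] out = new int[arr.length];
    // place elements backward so equal keys keep their order (stability)
    for (int i = arr.length - 1; i >= 0; i--) {
        out[--count[arr[i]]] = arr[i];
    }
    return out;
}
```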

Bucket sort

Principle

  • Divide the value range of the input evenly into M independent buckets.
  • Distribute the N input elements into the buckets.
  • Sort the elements inside each bucket.
  • Concatenating the buckets in order then yields the sorted sequence.

Performance

The time complexity is $O(N+C)$, where $C = O(M \cdot (N/M)\log(N/M)) = O(N\log N - N\log M)$ is the cost of sorting inside the buckets; the space complexity is $O(N+M)$. The sort is stable and independent of the initial sequence.

Usage scenarios

The idea resembles separate chaining (open hashing) in a hash table: colliding elements land in the same bucket. Bucket sort fits when the data is evenly distributed over its range, or when grouping by value range matters.
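A minimal sketch of these steps for integers, assuming m buckets of equal width over the value range and using Collections.sort inside each bucket (the names are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public static void bucketSort(int[] arr, int m) {
    if (arr.length == 0) return;
    int min = arr[0], max = arr[0];
    for (int x : arr) { min = Math.min(min, x); max = Math.max(max, x); }
    List<List<Integer>> buckets = new ArrayList<>();
    for (int i = 0; i < m; i++) buckets.add(new ArrayList<>());
    // bucket index proportional to the value's position within [min, max]
    long range = (long) max - min + 1;
    for (int x : arr) {
        int idx = (int) ((long) (x - min) * m / range);
        buckets.get(idx).add(x);
    }
    int k = 0;
    for (List<Integer> bucket : buckets) {
        Collections.sort(bucket);          // sort within each bucket
        for (int x : bucket) arr[k++] = x; // concatenate back in order
    }
}
```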

Radix sort

Principle

If each key consists of d keywords (for integers, the digits), we can sort keyword by keyword, in one of two orders:

  • MSD: sort on the most significant keyword first; counting sort can be used per keyword.
  • LSD: sort on the least significant keyword first; bucket (or counting) sort can be used per keyword.

Performance

The time complexity is $O(d(N+K))$ and the space complexity is $O(N+K)$, where $d$ is the number of keywords and $K$ the radix. With a stable per-keyword sort, as in LSD, radix sort is itself stable.
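A minimal LSD sketch for non-negative base-10 integers, running a stable counting sort on each digit as described above (the method name is illustrative):

```java
public static void radixSort(int[] arr) {
    int max = 0;
    for (int x : arr) max = Math.max(max, x);
    int[] out = new int[arr.length];
    // one stable counting-sort pass per digit, least significant first
    for (int exp = 1; max / exp > 0; exp *= 10) {
        int[] count = new int[10];
        for (int x : arr) count[(x / exp) % 10]++;
        for (int d = 1; d < 10; d++) count[d] += count[d - 1];
        // backward placement keeps equal digits in their current order
        for (int i = arr.length - 1; i >= 0; i--) {
            out[--count[(arr[i] / exp) % 10]] = arr[i];
        }
        System.arraycopy(out, 0, arr, 0, arr.length);
    }
}
```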

Conclusion

