This post was originally posted on my blog

Preface

This series includes ten classic sorting algorithms.

  • The language used is Java
  • The structure is: define an abstract class Sort that implements common operations such as swapping and comparison. For example, to swap two values you just pass in their indices. All concrete sort classes inherit from the abstract Sort class, so we can focus on the algorithm itself.
public abstract class Sort<T extends Comparable<T>> {
	protected T[] array;
	
	protected abstract void sort();
	
	/*
	 * Return value == 0: array[i1] == array[i2]
	 * Return value < 0:  array[i1] < array[i2]
	 * Return value > 0:  array[i1] > array[i2]
	 */
	protected int cmp(int i1, int i2) {
		return array[i1].compareTo(array[i2]);
	}
	
	protected int cmp(T v1, T v2) {
		return v1.compareTo(v2);
	}
	
	protected void swap(int i1, int i2) {
		T tmp = array[i1];
		array[i1] = array[i2];
		array[i2] = tmp;
	}
}

  • Bubble, selection, insertion, merge, Shell, and heap sort are all comparison-based sorts

    • Their average time complexity is at best O(n log n)
  • Counting sort, bucket sort, and radix sort are not comparison-based sorts

    • They trade space for time, so their average time complexity can be lower than O(n log n)

What is counting sort?

  • Counting sort is a stable, linear-time sorting algorithm proposed by Harold H. Seward in 1954. It uses an extra array C, where the i-th element is the number of elements in array A whose value equals i. The elements of A are then placed into their correct positions according to array C.
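
For instance, take a small array A with values in the range 0 to 5 (the concrete numbers here are just for illustration):

	A: [2, 5, 3, 0, 2, 3, 0, 3]
	C: [2, 0, 2, 3, 0, 1]    // C[i] = how many elements of A equal i

For plain integers, reading C back out in order already yields the sorted sequence.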

Algorithm idea

When the input consists of n integers in the range 0 to k, counting sort runs in O(n + k) time. It is not a comparison sort, so it is not bound by the Ω(n log n) lower bound on comparison sorting and can be faster than any comparison-based algorithm when k is not too large.

Core idea: count the number of occurrences of each integer in the sequence, then use those counts to deduce each integer's index in the sorted sequence.

  • Since the length of the counting array C depends on the range of values in the array to be sorted (equal to the maximum minus the minimum, plus 1), counting sort costs a lot of time and memory when that range is large. For example, counting sort works well for sorting numbers between 0 and 100, but it is not suitable for sorting names alphabetically. However, counting sort can be used as a subroutine of radix sort to sort arrays with a large range of values more efficiently.

  • Intuitively, suppose there are 10 people of different ages, and 8 of them are counted as younger than A; then A ranks ninth by age. Applying the same reasoning to everyone else gives each person's position and therefore the sorted order. Repeated ages require special treatment (to ensure stability), which is why the target array is populated in reverse and each value's count is decremented by one as it is placed. The steps of the algorithm are as follows (a small worked example follows the steps):

    • Find the largest and smallest elements in the array to be sorted
    • Count the number of occurrences of each element with value i, storing it in the i-th entry of array C
    • Accumulate the counts (starting from the first element of C, add each count to the one before it)
    • Populate the target array in reverse: place each element i at index C[i] - 1 of the new array, subtracting 1 from C[i] every time an element is placed
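
Continuing the small example from above (again, the numbers are only illustrative), the four steps look like this:

	input:        [2, 5, 3, 0, 2, 3, 0, 3]    // min = 0, max = 5
	counts:       [2, 0, 2, 3, 0, 1]          // occurrences of 0..5
	accumulated:  [2, 2, 4, 7, 7, 8]          // counts[i] += counts[i - 1]
	reverse fill: the rightmost 3 goes to newArray[--counts[3]], i.e. index 6
	output:       [0, 0, 2, 2, 3, 3, 3, 5]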

Algorithm stability

  • Counting sort is a stable sort algorithm.

Is it an in-place sort?

  • What is an in-place algorithm?
    • It does not depend on extra resources (or only on a few), relying on the output to overwrite the input
    • An algorithm with O(1) space complexity can be considered in-place
  • An algorithm that is not in-place is called not-in-place or out-of-place
  • Counting sort is not in-place

Space-time complexity

  • Best, worst, and average time complexity: O(n + k)

  • Space complexity: O(n + k)

Code

Start with the simplest implementation: first find the maximum value and use it to create the counting array, then iterate over the input to count the occurrences of each element, and finally write the values back in order.

public class CountingSort extends Sort<Integer> {

	@Override
	protected void sort() {
		// Find the maximum value
		int max = array[0];
		for (int i = 1; i < array.length; i++) {
			if (array[i] > max) {
				max = array[i];
			}
		} // O(n)
		
		// Count the number of occurrences of each integer
		int[] counts = new int[max + 1];
		for (int i = 0; i < array.length; i++) {
			counts[array[i]]++;
		} // O(n)
		
		// Write each value back in ascending order, as many times as it occurred
		int index = 0;
		for (int i = 0; i < counts.length; i++) {
			while (counts[i]-- > 0) {
				array[index++] = i;
			}
		} // O(n + k)
	}
}

Optimization

There are some problems with the above code

  • It cannot sort negative integers
  • It wastes memory
  • It is not a stable sort
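
To make the first two problems concrete, here is a rough illustration (the inputs below are made up):

	Integer[] a = {-3, 1};         // counts[array[i]]++ becomes counts[-3]++ -> ArrayIndexOutOfBoundsException
	Integer[] b = {99998, 100000}; // new int[max + 1] allocates 100,001 counters for just two values

The stability issue comes from the last loop: it rebuilds the values from the counts instead of moving the original elements, so equal elements are not guaranteed to keep their original relative order. For plain integers this is invisible, but it matters when counting sort is used as a subroutine of radix sort.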

Optimization idea:

  • Assuming the minimum value in the array is min and the maximum is max, we map min to index 0
    • For example, if the array contains -2, 4, and 5, then -2 maps to index 0, 4 maps to index 6, and 5 maps to index 7
  • Make the counts array accumulate all of the preceding counts (a prefix sum); this gives each element's position in the ordered sequence

The optimized code is

protected void sort2() {
	// Find the maximum and minimum values
	int max = array[0];
	int min = array[0];
	for (int i = 1; i < array.length; i++) {
		if (array[i] > max) {
			max = array[i];
		}
		if (array[i] < min) {
			min = array[i];
		}
	} // O(n)
	
	// Count the number of occurrences of each integer
	int[] counts = new int[max - min + 1];
	for (int i = 0; i < array.length; i++) {
		counts[array[i] - min]++;
	} // O(n)
	
	// Accumulate the counts
	for (int i = 1; i < counts.length; i++) {
		counts[i] += counts[i - 1];
	} // O(k)
	
	// Traverse the input from right to left and place each element into its
	// final position in newArray (this is what keeps the sort stable)
	int[] newArray = new int[array.length];
	for (int i = array.length - 1; i >= 0; i--) {
		int index = array[i] - min;
		newArray[--counts[index]] = array[i];
	} // O(n)
	
	// Copy the ordered newArray back into array
	for (int i = 0; i < newArray.length; i++) {
		array[i] = newArray[i];
	} // O(n)
}

Results

Test data:

Randomly generate 20,000 integers between 1 and 80,000 for testing:

Integer[] array = Integers.random(20000, 1, 80000);
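
The benchmarking harness itself isn't shown in this post. A minimal sketch of how such a timing comparison might be driven is below; Integers.random comes from the series' own utility code, while the way a Sort instance is invoked here (a public sort(array) entry point) is an assumption made purely for illustration, and the real harness also reports comparison counts, swap counts, and stability.

	Integer[] data = Integers.random(20000, 1, 80000);
	Sort<Integer> sort = new CountingSort();
	long begin = System.currentTimeMillis();
	sort.sort(data); // hypothetical public method that stores the array and calls the protected sort()
	System.out.println("CountingSort took " + (System.currentTimeMillis() - begin) + "ms");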

The results are as follows

[CountingSort] Stability: true  Time: 0.005s (5ms)  Comparisons: 0  Swaps: 0

[QuickSort] Stability: false  Time: 0.009s (9ms)  Comparisons: 332,700  Swaps: 13,200

[MergeSort] Stability: true  Time: 0.010s (10ms)  Comparisons: 261,000  Swaps: 0

[ShellSort] Stability: false  Time: 0.016s (16ms)  Comparisons: 433,400  Swaps: 0

[HeapSort] Stability: false  Time: 0.023s (23ms)  Comparisons: 510,900  Swaps: 20,000

Stability: true  Time: 0.514s (514ms)  Comparisons: 200 million  Swaps: 200 million

[InsertionSort1] Stability: true  Time: 1.362s (1362ms)  Comparisons: 99,351,500  Swaps: 99,331,500

[BubbleSort] Stability: true  Time: 2.529s (2529ms)  Comparisons: 200 million  Swaps: 99,331,500

You can see that in this case counting sort is even faster than quick sort and merge sort.

Code address:

  • The code is available on GitHub