This is the 15th day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021

Introduction of linear sorting algorithm

  1. Linear sort algorithm includes bucket sort, counting sort and radix sort.
  2. The time complexity of linear sorting algorithm isO(n).
  3. These three sorting algorithms do not involve the comparison between elements, and are non-comparison based sorting algorithms.
  4. The requirements for sorting data are very strict, so it is important to master the applicable scenarios of these three sorting algorithms.

Bucket sort

2.1 Algorithm Principle

  1. The data to be sorted is sorted into several ordered buckets, and the data in each bucket is sorted separately.
  2. After the bucket is sorted, the data in each bucket is taken out in sequence to form an orderly sequence.

2.2 Conditions of use

  1. The data to be sorted needs to be easily divided into m buckets, and buckets have a natural size order.
  2. Data is evenly distributed across buckets.

2.3 Application Scenarios

  1. Bucket sort is suitable for external sort.
  2. External sort means that the data is stored on an external disk and there is a large amount of data, but the memory is limited and the entire data cannot be loaded into memory.

2.4 Application Cases

Requirement Description:

There is 10GB of order data, sorted by order amount (assuming positive integers) but limited memory, only a few hundred MB

Solution:

  • Scan the file to see the data range of the order amount, such as 1 yuan to 100,000 yuan, then divide 100 barrels.
  • The first bucket stores orders between 1 and 1000 yuan, the second bucket stores orders between 1001 and 2000 yuan, and so on.
  • Each bucket corresponds to a file and is numbered and named according to the size of the bucket (00,01,02… , 99).
  • Put 100 small files in memory one by one and sort them by quicksort.
  • After all the files are sorted, you only need to read each small file in ascending order and write it to the large file.

Note: If a single file cannot be loaded into the memory, proceed with the preceding procedure for the file.

(Counting sort)

Algorithm principle

  1. Counting is really just a special case of bucket sorting.
  2. If the range of n data to be sorted is not large, for example, the maximum value is K, it is divided into K buckets
  3. The data values in each bucket are the same, eliminating the time spent sorting buckets.

Case analysis

Requirement Description:

Rank candidates according to their scores.

Solution:

  • Suppose there are only eight examinees with scores between 0 and 5, and the scores are stored in an arrayA[8] = [2,5,3,0,2,3,0,3].
  • Use an array of size 6C[6]Represents the bucket, and the subscript corresponds to the fraction, i.e0,1,2,3,4,5.
  • C[6]It stores the number of examinees, which can be obtained by traversing one examinee scoreC[6] = [2, 0, 2, 3, 1].
  • Sum the C[6] array sequentiallyC[6]= 2,2,4,7,7,8.c[k]The store isLess than or equal toNumber of examinees with k.
  • An array ofR[8] = [0,0,2,2,3,3,3,5]Store candidates’ rankings. So how do you getR[8]?
    • From the back to the former, in turn, scanning array. A, such as scanning and 3, can from the array subscript 3 value 7 C, that is to say, so far, myself included, the examinee has seven score less than or equal to 3, that is to say, 3 is the seventh element of the array R (that is, the location of the array subscript is 6 in the R). When 3 is placed in R, there are six elements less than or equal to 3 left, and C[3] is subtracted by 1 to become 6.
    • Similarly, when a second candidate with a score of 3 is scanned, it is placed at the position of the sixth element in array R (the subscript 5 position). After scanning array A, the data in array R is sorted in order of fractions.

Conditions of use

  1. It can only be used in scenarios with small data range. If the data range K is much larger than the data n to be sorted, it is not suitable for counting sort.
  2. Counting sort can only sort non-negative integers. Other types need to be converted to non-negative integers without changing their relative sizes.
  3. For example, if the test score is accurate to the last decimal place, you need to multiply all the scores by 10 to round them up.

4. Radix sort

Algorithm principle (take sorting 100,000 mobile phone numbers as an example to illustrate)

  1. Compare the size of two mobile phone numbers A and B. If a is already larger than B in the first few digits, the next few digits need not be looked at.
  2. Using the idea of a stable sorting algorithm, you can sort the phone number by the last bit, then reorder it by the second to last, and so on, finally by the first bit.
  3. After 11 sorting, the phone number became in order.
  4. Each sort has a small range of ordered data and can be done using bucket sort or counting sort.

Conditions of use

  1. Data can be divided into independent “bits” for comparison;
  2. There is a progressive relationship between bits. If the high position of data A is larger than that of data B, there is no need to compare the remaining positions.
  3. The data range of each bit should not be too large, and linear sorting should be possible, otherwise the time complexity of radix sorting cannot reach O(n).