Sorting algorithm

Sorting algorithms are among the first algorithms most of us meet when we start programming, usually alongside searching. Because sorting is a building block of many other algorithms, a wide variety of sorting algorithms have been developed to improve performance. Sorting algorithms are mainly compared by time complexity, space complexity, and stability. We analyze the common ones below.

Stability

A sorting algorithm is stable if equal elements keep their relative order. For example, given [1,2,3,4,1], we have a[0]=a[4] with a[0] before a[4]. If a[0] is still in front of a[4] after sorting, the sort is stable; otherwise it is an unstable sorting algorithm.
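To make stability concrete, here is a minimal demonstration (the class name and the two-character tags are made up for illustration). In the JDK, Arrays.sort on an object array is stable, so the two elements whose key is '1' keep their original relative order:

```java
import java.util.Arrays;
import java.util.Comparator;

public class StabilityDemo {
    public static void main(String[] args) {
        // "1a" and "1e" share the sort key '1'; the letter only tags which is which.
        String[] items = {"1a", "2b", "3c", "4d", "1e"};
        // Arrays.sort on objects is a stable sort, so after sorting by the
        // first character, "1a" must still appear before "1e".
        Arrays.sort(items, Comparator.comparing((String s) -> s.charAt(0)));
        System.out.println(Arrays.toString(items)); // [1a, 1e, 2b, 3c, 4d]
    }
}
```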

Bubble sort

Basic principle

Bubble sort is a relatively simple sorting algorithm. Its basic principle is to repeatedly walk through the array, compare each pair of adjacent elements, and swap them if they are out of order, until a pass completes with nothing left to swap. Bubble sort is a stable sorting algorithm. It works as follows:

  1. Compare two adjacent elements and swap them if they are out of order: for ascending order, move the larger one behind; for descending order, the smaller one.
  2. Do the same for each pair of adjacent elements. After one full pass, the last element is the largest.
  3. Repeat the passes over the not-yet-sorted prefix until there are no more pairs to compare, at which point the sorting is complete.

code

public static void bubbleSort(int[] arr){
	for(int i=0; i<arr.length; i++){
		for(int j=0; j<arr.length-i-1; j++){
			if(arr[j] > arr[j+1]){
				swap(arr, j, j+1);
			}
		}
	}
}

Complexity

If the sequence is already in ascending order, a single pass with no swaps suffices (provided an early-exit flag is used), so the best-case time complexity is O(n). In the average and worst cases about n passes are needed and the time complexity is O(n^2). No auxiliary space is used in the whole process, so the space complexity is O(1).
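The O(n) best case relies on detecting a swap-free pass. A sketch of bubble sort with such a flag (the class name is illustrative, and the swap is inlined rather than using a swap helper):

```java
public class BubbleSortOptimized {
    // Bubble sort with an early-exit flag: if a full pass makes no swap,
    // the array is already sorted and we can stop, giving the O(n) best case.
    public static void bubbleSort(int[] arr) {
        for (int i = 0; i < arr.length - 1; i++) {
            boolean swapped = false;
            for (int j = 0; j < arr.length - i - 1; j++) {
                if (arr[j] > arr[j + 1]) {
                    int t = arr[j]; arr[j] = arr[j + 1]; arr[j + 1] = t;
                    swapped = true;
                }
            }
            if (!swapped) break; // no swap in this pass: already sorted
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 2, 8};
        bubbleSort(a);
        System.out.println(java.util.Arrays.toString(a)); // [1, 2, 4, 5, 8]
    }
}
```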

Selection sort

Selection sort is a very simple sorting algorithm. It selects the smallest (or largest) element from the elements to be sorted and stores it at the starting position, then continues to find the smallest (or largest) among the remaining unsorted elements and places it after the sorted part. This repeats until all elements are sorted. Selection sort is an unstable sorting method.

The selection sort algorithm works as follows:

  1. Scan the entire data sequence to find the smallest (or largest) number, and put that number in the first position.
  2. Repeat the operation on the rest of the sequence, excluding the already-sorted positions.

code

public static void selectionSort(int[] arr){
	for(int i=0; i<arr.length; i++){
		// After this pass the smallest remaining number is at index i
		for(int j=i+1; j<arr.length; j++){
			if(arr[i] > arr[j]){
				swap(arr, i, j);
			}
		}
	}
}

Complexity

If the data is already ordered, 0 swaps are made; in the worst case n-1 swaps are needed. The number of comparisons is fixed at about n^2/2 regardless of input, so the time complexity is O(n^2) and the space complexity is O(1).

Direct insertion sort

Insertion sort is a relatively simple sorting method. It divides the array into two parts: a sorted part and an unsorted part. At first only the first number counts as sorted. Each step takes one number from the unsorted part, compares it with the sorted data, finds the position where everything to the left is no larger and everything to the right is larger, and inserts the number there. Once every element has been processed, the array is in order.

The insertion sort algorithm works as follows:

  1. Treat the first number as the sorted part. Take the second number and compare it with the first: if the first is larger, swap them; otherwise leave them. The first two numbers now form the sorted part.
  2. Take the next number from the unsorted part and repeat the step above to find its position. While scanning from back to front, repeatedly shift sorted elements one place to the right to make room for the new element.
  3. Repeat until all data has been processed, at which point the array is in order.

code

public static void insertSort(int[] nums){
	int i,j;
	for(i=1; i<nums.length; i++){
		int temp = nums[i];
		// Shift larger sorted elements back one position
		for(j=i-1; j>=0&&temp<nums[j]; j--){
			nums[j+1] = nums[j];
		}
		nums[j+1] = temp;
	}
}

Complexity

The best case for insertion sort on n elements is a sequence that is already in order; then only n-1 comparisons are needed. The worst case is a sequence in reverse order, which requires n(n-1)/2 comparisons. On average, insertion sort is O(n^2), so it is not suitable for sorting large amounts of data. But insertion sort works well when the amount of data is small or the input elements are known to be roughly in order.

Insertion sort with sentry

In insertion sort, each inner-loop step performs two checks, j>=0 && temp<nums[j]: one to avoid going out of bounds and one to compare the data. For reverse-ordered input this nearly doubles the number of tests. A sentinel can eliminate the extra comparison.

code

public static void insertWithSentinelSort(int[] nums){
	int i,j;
	// Index 0 is reserved as the sentinel slot; the data lives in nums[1..length-1]
	for(i=2; i<nums.length; i++){
		nums[0] = nums[i]; // copy the current element into the sentinel slot
		// The loop now only compares data; nums[0] stops it before j goes out of bounds
		for(j=i-1; nums[j]>nums[0]; j--){
			nums[j+1] = nums[j];
		}
		nums[j+1] = nums[0];
	}
}

The advantage of the sentinel is that it reduces the number of tests in the inner loop, improving the efficiency of the algorithm.

Shell sort

Shell sort is a more efficient, improved version of insertion sort. Shell sort is an unstable sorting algorithm.

Shell sort groups the records by a certain gap between indices and sorts each group with direct insertion sort. As the gap decreases, each group contains more and more keys, and when the gap is 1 the pass is exactly an insertion sort. By then the whole sequence is already nearly ordered, and insertion sort is very efficient on almost-ordered data, approaching linear time. So the overall efficiency of Shell sort is high.

Shell sort steps:

  1. Choose a gap and group the data by that gap
  2. Sort each group
  3. Shrink the gap (usually halving it, or following a predefined gap sequence) and repeat steps 1-2
  4. Stop after the pass with gap 1; the array is then sorted

code

public static void shellSort(int[] nums){
	int size = nums.length/2;
	int i,j;
	while(size>=1){
		// Gap insertion sort: each group of elements size apart is insertion-sorted
		for(i=size; i<nums.length; i++){
			int temp = nums[i];
			for(j=i-size; j>=0&&nums[j]>temp; j-=size){
				nums[j+size] = nums[j];
			}
			nums[j+size] = temp;
		}
		size/=2;
	}
}

Complexity

The time complexity analysis of Shell sort is complicated because it depends directly on the chosen gap sequence. There is no universally agreed best gap sequence; the only requirement is that the final gap is 1. Depending on the gaps chosen, the time complexity of Shell sort ranges from about O(n^1.3) to O(n^2).
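As a sketch of an alternative gap sequence, here is Shell sort using Knuth's gaps 1, 4, 13, 40, ... (h = 3h + 1) instead of halving; the class and method names are illustrative:

```java
public class ShellSortKnuth {
    // Shell sort with the Knuth gap sequence (h = 3h + 1), a common
    // alternative to simply halving the gap each round.
    public static void shellSort(int[] nums) {
        int h = 1;
        while (h < nums.length / 3) h = 3 * h + 1; // largest gap below n/3
        while (h >= 1) {
            // gap-h insertion sort over the whole array
            for (int i = h; i < nums.length; i++) {
                int temp = nums[i];
                int j = i - h;
                while (j >= 0 && nums[j] > temp) {
                    nums[j + h] = nums[j];
                    j -= h;
                }
                nums[j + h] = temp;
            }
            h /= 3;
        }
    }

    public static void main(String[] args) {
        int[] a = {9, 8, 3, 7, 5, 6, 4, 1};
        shellSort(a);
        System.out.println(java.util.Arrays.toString(a)); // [1, 3, 4, 5, 6, 7, 8, 9]
    }
}
```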

Quick sort

Quicksort is an improvement on bubble sort.

The basic steps of quicksort:

  1. Select a number from the sequence as the pivot (usually the first one) and partition once: place the numbers smaller than the pivot in front of it, and the numbers larger than it behind it.
  2. The above operation divides the sequence into two separate parts and performs the above operation recursively until the sequence can no longer be divided.
  3. It’s ordered after the last sort.

code

public static void quickSort(int[] nums, int low, int high){
	if(low<high){
		int partation = partition(nums, low, high);
		quickSort(nums, low, partation-1);
		quickSort(nums, partation+1, high);
	}
}
// Partition around nums[low] and return the pivot's final index
public static int partition(int[] nums, int low, int high){
	int pivot = nums[low];
	while(low<high){
		while(low<high&&nums[high]>=pivot){
			high--;
		}
		nums[low] = nums[high];
		while(low<high&&nums[low]<=pivot){
			low++;
		}
		nums[high]=nums[low];
	}
	nums[low] = pivot;
	return low;
}

Complexity

Time complexity

In the best case, partition splits evenly every time. Sorting n keys then requires a recursion depth of log2(n)+1, i.e. only about log2(n) levels of recursion, and the time complexity is O(nlogn).

The worst case occurs when the sequence is already ordered (for example, in reverse). Each partition then yields only one subsequence, with just one fewer record than before, and quicksort degenerates toward bubble sort. The time complexity is O(n^2).

The average complexity of quicksort is O(nlogn); the proof is lengthy, so it is omitted here.

Space complexity

In the implementation above, quicksort uses only a fixed amount of extra space per call, but each of the nested recursive calls must store that fixed amount of information on the stack. In the best case at most O(logn) nested recursive calls are needed, so O(logn) space is required. In the worst case there are O(n) nested recursive calls, requiring O(n) space.
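A common refinement that keeps the stack at O(logn) even in the worst case is to recurse only into the smaller partition and handle the larger one iteratively. A sketch (the class name is illustrative; the partition follows the same fill-from-both-ends scheme used earlier):

```java
public class QuickSortBoundedStack {
    // Recursing only into the smaller partition and looping on the larger one
    // bounds the recursion depth to O(log n) even on adversarial input.
    public static void quickSort(int[] nums, int low, int high) {
        while (low < high) {
            int p = partition(nums, low, high);
            if (p - low < high - p) {
                quickSort(nums, low, p - 1); // smaller side: recurse
                low = p + 1;                 // larger side: iterate
            } else {
                quickSort(nums, p + 1, high);
                high = p - 1;
            }
        }
    }

    private static int partition(int[] nums, int low, int high) {
        int pivot = nums[low];
        while (low < high) {
            while (low < high && nums[high] >= pivot) high--;
            nums[low] = nums[high];
            while (low < high && nums[low] <= pivot) low++;
            nums[high] = nums[low];
        }
        nums[low] = pivot;
        return low;
    }

    public static void main(String[] args) {
        int[] a = {3, 7, 1, 9, 5, 2, 8};
        quickSort(a, 0, a.length - 1);
        System.out.println(java.util.Arrays.toString(a)); // [1, 2, 3, 5, 7, 8, 9]
    }
}
```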

Merge sort

Merging combines two or more ordered sequences into one ordered sequence.

Merge sort steps:

  1. Allocate a space as long as the sequence to be sorted; this space stores the merged sequence
  2. Set two pointers, one at the start of each of the two sorted subsequences
  3. Compare the elements the two pointers point to, move the smaller one into the merge space, and advance that pointer to the next position
  4. Repeat step 3 until one pointer reaches the end of its subsequence
  5. Copy all remaining elements of the other subsequence directly to the end of the merged sequence

code

public static void mergeSort(int[] nums, int[] temp, int left, int right){
	if(left<right){
		int mid = (left+right)/2;
		mergeSort(nums, temp,left,mid);
		mergeSort(nums, temp,mid+1,right);
		merge(nums,temp, mid, left, right);
	}
}
public static void merge(int[] nums, int[] temp, int mid, int left, int right){
	int i=left,j=mid+1,k=0;
	while(i<=mid&&j<=right){
		if(nums[i]<=nums[j]){ // <= keeps equal elements in order, so the sort stays stable
			temp[k++] = nums[i++];
		} else{
			temp[k++] = nums[j++];
		}
	}
	while(i<=mid){
		temp[k++] = nums[i++];
	}
	while(j<=right){
		temp[k++] = nums[j++];
	}
	// Copy the merged run in temp back into the original array.
	// This copy-back is required because each merge reads its runs from nums.
	// For example, sorting 4 1 2 3: the halves become 1 4 and 2 3; without
	// the copy-back, the final merge would still see 4 1 2 3 and go wrong.
	k=0;
	while(left<=right){
		nums[left++] = temp[k++];
	}
}

Complexity

Merge sort is an efficient and stable algorithm. But you need extra space.

Merge sort splits and merges hierarchically: the first split divides the sequence into two parts, the next splits each of those again, and the overall pattern of splits and merges forms a binary tree. Its average time complexity is O(nlogn). Because it needs an auxiliary array of length n, its space complexity is O(n).

Sort large amount of data

The code shown above is known as 2-way merge sort; its core idea is to merge two adjacent ordered runs in an array into one ordered run. In practice, plain 2-way merge sort on in-memory arrays is rarely used on its own.

However, the merging idea is used in many scenarios, especially for sorting very large data sets. For example, suppose a large amount of data is stored in a text file and must be sorted, but memory cannot hold it all at once. We can split the large file into several small files, sort the data of each small file separately, and then merge the sorted pieces. The per-chunk sorting can also use multiple threads and similar means to improve efficiency.
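The merge phase of such an external sort can be sketched as a k-way merge with a min-heap. In the sketch below (names are illustrative), each List<Integer> stands in for one already-sorted chunk file; a real implementation would stream records from disk instead:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class KWayMerge {
    // Merge k sorted "chunks" by repeatedly pulling the smallest head element.
    public static List<Integer> merge(List<List<Integer>> chunks) {
        // heap entries: {value, chunkIndex, positionInChunk}
        PriorityQueue<int[]> heap =
                new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[0]));
        for (int c = 0; c < chunks.size(); c++) {
            if (!chunks.get(c).isEmpty()) {
                heap.add(new int[]{chunks.get(c).get(0), c, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            out.add(e[0]);
            int next = e[2] + 1; // advance within the chunk the element came from
            if (next < chunks.get(e[1]).size()) {
                heap.add(new int[]{chunks.get(e[1]).get(next), e[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> chunks =
                List.of(List.of(1, 4, 7), List.of(2, 5), List.of(3, 6, 8));
        System.out.println(merge(chunks)); // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```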

TIMSort

In the JDK, the Arrays utility class provides improved sorting methods: for object arrays, Arrays.sort used merge sort in JDK 1.6 and before, and TimSort since 1.7.
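A quick usage sketch: an Integer[] goes through the object-array path (TimSort on JDK 1.7+), while an int[] takes the primitive path (a dual-pivot quicksort since 1.7):

```java
import java.util.Arrays;

public class ArraysSortDemo {
    public static void main(String[] args) {
        // Object arrays use the stable TimSort path.
        Integer[] boxed = {5, 3, 1, 4, 2};
        Arrays.sort(boxed);
        System.out.println(Arrays.toString(boxed)); // [1, 2, 3, 4, 5]

        // Primitive arrays use a dual-pivot quicksort instead
        // (stability is irrelevant for primitives).
        int[] primitive = {5, 3, 1, 4, 2};
        Arrays.sort(primitive);
        System.out.println(Arrays.toString(primitive)); // [1, 2, 3, 4, 5]
    }
}
```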

TimSort is a hybrid sorting algorithm derived from merge sort and insertion sort, designed to perform well on the kinds of partially ordered data common in the real world. Its basic working process is:

  1. Scan the array to find monotonically ascending runs and strictly descending runs, and reverse the strictly descending ones;
  2. Define a minimum run length; runs shorter than this are extended to it using insertion sort;
  3. Repeatedly merge adjacent runs, avoiding merges between runs of very different lengths, until the whole array is sorted. The merge-selection strategy guarantees O(nlogn) time complexity.
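Step 1 above can be sketched as a run-detection helper, loosely modeled on TimSort's internal countRunAndMakeAscending (the code below is illustrative, not the JDK's actual implementation):

```java
import java.util.Arrays;

public class RunDetection {
    // Find the run starting at `lo`: an ascending run is kept as-is,
    // a strictly descending run is reversed in place.
    // Returns the index one past the end of the run.
    public static int countRunAndMakeAscending(int[] a, int lo) {
        int hi = lo + 1;
        if (hi == a.length) return hi;
        if (a[hi] < a[lo]) {                 // strictly descending run
            while (hi + 1 < a.length && a[hi + 1] < a[hi]) hi++;
            for (int i = lo, j = hi; i < j; i++, j--) { // reverse it
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        } else {                             // ascending run (allows equal keys)
            while (hi + 1 < a.length && a[hi + 1] >= a[hi]) hi++;
        }
        return hi + 1;
    }

    public static void main(String[] args) {
        int[] a = {5, 4, 3, 9, 1};
        int end = countRunAndMakeAscending(a, 0);
        System.out.println(end);                // 3
        System.out.println(Arrays.toString(a)); // [3, 4, 5, 9, 1]
    }
}
```

Only strictly descending runs are reversed; allowing equal keys in a descending run and reversing it would break stability.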
