# preface

There are many sorting in life, such as queuing for exercise, the short one in the front, the high one in the back, and then playing poker and inserting cards.

# Sorting concept

Sorting: the so-called sorting is the operation of arranging a string of records incrementally or decrementally according to the size of one or some keywords.

Stability: if there are multiple records with the same keyword in the record sequence to be sorted, the relative order of these records remains unchanged after sorting, that is, in the original sequence, r[i]=r[j], and r[i] is before r[j], while in the sorted sequence, r[i] is still before r[j], then this sorting algorithm is said to be stable; Otherwise, it is called unstable. (pay attention to the bold part)

Internal sorting: sorting in which all data elements are placed in memory.

External sorting: there are too many data elements to be placed in memory at the same time. According to the requirements of the sorting process, the sorting of data cannot be moved between internal and external memory.

# Common sorting in life

- Ranking of total school scores

- Taobao shopping order

# Insert sort

Introduction: the insertion sort is the same as the playing cards we often play in our life. The following is the appearance after sorting out the order. We know that this is a flush. If you get a new card 9 in your hand, how will you insert it?

Basic idea: insert a record (card) into an ordered sequence that has been arranged, so as to obtain a new ordered sequence with the number of records increased by 1.

## 1. Insert sorting directly

When inserting the I (I > = 1) th element, the preceding array[0],array[1],..., array[i-1] have been arranged in order. At this time, compare the sorting code of array[i] with the sorting code of array[i-1],array[i-2],... To find the insertion position, that is, insert array[i], and move the element order at the original position backward.

void InsertSort(int* a, int n) { // Note the number of i-1 before the number of cycles n-1 for (int i = 0; i < n - 1; i++) { int end = i; // The first i-1 numbers have been sorted int tmp = a[end + 1]; //Save the ith number, gap = 1, and insert the sort directly while (end >= 0) { if (tmp < a[end]) { a[end + 1] = a[end];//Exchange when less than --end;//-1 fallback } else break; } a[end + 1] = tmp; //If it is swapped, the empty position is given to the newly inserted number } }

Insert sort summary directly:

- The closer the element set is to order, the higher the time efficiency of the direct insertion sorting algorithm
- The time complexity is o(n^2), the worst time complexity is o(n^2), and the best time complexity is o(n)

Worst case: 2 +... + n = (n+2)(n-1) / 2, o(n^2)

The best case: in order, go directly to the process, which is o(n) - The space complexity is o(1) without any auxiliary space
- It is a stable sorting algorithm (the order of the same values does not change)

## 2. Hill sort

First, divide the whole record sequence to be sorted into several subsequences for direct insertion sorting. The specific algorithm description is as follows:

- Select an incremental sequence t1, t2,..., tgap, where ti > TJ, tgap=1;
- Sort the sequences by gap times according to the number of incremental sequences;
- For each sorting, the sequence to be sorted is divided into several subsequences with length m according to the corresponding increment ti, and each sub table is directly inserted and sorted. Only when the increment factor is 1, the whole sequence is treated as a table, and the table length is the length of the whole sequence.

The dynamic diagram process is as follows:

void ShellSort(int* a, int n) { //Sort in gap units int gap = n; while (gap > 1) { gap = gap / 3 + 1;//Generally, this is faster. Specific readers can read the reference materials or measure the speed themselves //gap = gap / 2; // Number of transposition n-gap for (int i = 0; i < n - gap; i++) { int end = i; // The first i-1 numbers have been arranged int tmp = a[end + gap]; //Save the ith number, gap = 1 is the direct insertion sort while (end >= 0) { if (tmp < a[end]) { a[end + gap] = a[end];//Less than exchange end -= gap;//-1 fallback } else break; } a[end + gap] = tmp; //If it is swapped, the empty position is given to the newly inserted number } } }

Hill sort summary:

- Hill sort is an optimization of direct insertion sort.
- When gap > 1, the array is pre sorted. The purpose is to make the array closer to order. When gap == 1, the array is close to order, which will be very fast. In this way, the overall optimization effect can be achieved. After we implement it, we can compare the performance test.
- The time complexity of hill sorting is not easy to calculate, because there are many gap value methods, which makes it difficult to calculate. Therefore, the time complexity of hill sorting given in many books is not fixed (readers can look through books)

The value here is taken according to the method proposed by Knuth, and Knuth has conducted a large number of test statistics. For the time being, we will calculate according to: n^1.25~ 1.6n^1.25 - Stability: unstable (the order of the same values changes before and after sorting)

# Select sort

Basic idea: select the smallest (or largest) element from the data elements to be sorted each time and store it at the beginning of the sequence until all the data elements to be sorted are finished.

## 1. Direct selection sort

- Select the data element with the largest (smallest) key in the element set array[i] – array[n-1]
- If it is not the last (first) element in the group, it is exchanged with the last (first) element in the group
- In the remaining array[i] – array[n-2] (array[i+1] – array[n-1]) set, repeat the above steps until there is 1 element left in the set

void SelectSort(int* a, int n) { int begin = 0, end = n - 1; while (begin < end) { int maxi = begin, mini = begin; for (int i = begin; i <= end; i++) { if (a[i] > a[maxi]) maxi = i;//Each time the index of the maximum value is locked, it is ready for exchange if (a[i] < a[mini]) mini = i;//Every time the index of the minimum value is locked, it is ready for exchange } Swap(&a[mini], &a[begin]); // Put the minimum value first if (begin == maxi)//begin may be equal to maxi, which needs to be checked maxi = mini; Swap(&a[maxi], &a[end]); // Put the maximum value behind ++begin;//Select the maximum and minimum values, narrow the range and continue sorting --end; } }

begin == maxi, as shown in the figure below. The code has been checked here. You should pay attention to it

Insert sort summary directly:

- It is very easy to understand the direct selection sorting thinking, but the efficiency is not very good. It is rarely used in practice
- Time complexity: O(N^2)
- Space complexity: O(1)
- Stability: unstable (the order of the same values changes before and after sorting)

## 2. Heap sort

Heap sort refers to a sort algorithm designed by using the data structure of heap tree (heap). It is a kind of selective sort. It selects data through heap. It should be noted that a large heap should be built in ascending order and a small heap should be built in descending order. (here, readers need to know the idea of heap deletion, that is, downward adjustment method)

//Adjust down to void AdjustDwon(int* a, int n, int root) { int child = 2 * root + 1; while (child < n) { //Judge whether the right child exists and is larger than the left child if (child + 1 < n && a[child] < a[child + 1]) { ++child;//Adjust child comparison to right child } // Build a lot in ascending order, find an exchange larger than the parent node, and continue to find the left and right children if (a[child] > a[root]) { Swap(&a[child], &a[root]);//Exchange father and child root = child;//Equal to the child becoming a new father child = 2 * root + 1;//Find the next child } else { break;//Less than direct termination, no adjustment is required } } } void HeapSort(int* a, int n) { //Times, only find n-1-1 / 2 times, one size of the left and right children for (int i = (n - 1 - 1) / 2; i >= 0; --i) { AdjustDwon(a, n, i);//Adjust to a large number first } int end = n - 1; //Adjust downward to replace the first node with the last node //Adjust downward and sort while (end > 0) { Swap(&a[0], &a[end]);//Place maximum at the end AdjustDwon(a, end, 0);//Continue to adjust to a large number. After cycling in turn, it will become orderly --end;//-1 } }

Summary of characteristics of direct selection sorting:

- Heap sorting uses heap to select numbers, which is much more efficient.
- Time complexity: O(N*logN), tree height logN, number of cycles N
- Space complexity: O(1)
- Stability: unstable (the order of the same values changes before and after sorting)

# Exchange sort

Basic idea: the so-called exchange is to exchange the positions of the two records in the sequence according to the comparison results of the key values of the two records in the sequence. The characteristics of exchange sorting are: move the records with larger key values to the tail of the sequence and the records with smaller key values to the front of the sequence.

## 1. Bubble sorting

Basic idea:

- Compare two elements. If the first element is larger than the second, exchange them
- Compare the two adjacent numbers until the last pair, and the maximum value is at the back
- The comparison times are - 1. After repeating the above steps, the sorting effect can be achieved

void BubbleSort(int* a, int n) { for (int i = n; i > 0; --i)// Cycle number control { int exchange = 0; // Judge whether it is orderly for (int j = 1; j < i; j++) { if (a[j - 1] > a[j]) // Compare { Swap(&a[j - 1], &a[j]);// Greater than exchange exchange = 1;// Determine whether a comparison has been made } } // If the next trip is for exchange, it is orderly and optimal at this time if (exchange == 0) break; } }

Summary of bubble sorting characteristics:

- Bubble sort is a sort that is very easy to understand
- Time complexity: O(N^2)
- Space complexity: O(1)
- Stability: stable

Note: the existence of exchange is to judge whether these elements have been directly compared during a trip. If there is no comparison, it is in order and does not have to cycle. At this time, the time complexity is the best, which is o(n).

## 2. Quick sort

Basic idea: any element in the element sequence to be sorted is taken as the reference value, and the set to be sorted is divided into two subsequences according to the sorting code. All elements in the left subsequence are less than the reference value, and all elements in the right subsequence are greater than the reference value, and then the process is repeated in the left and right subsequences until all elements are arranged in corresponding positions.

void QuickSort(int* a, int left, int right) { if (left < right) { int keyi = PartSort3(a, left, right);// Divide left and right QuickSort(a, left, keyi - 1); //Left subsequence QuickSort(a, keyi + 1, right); // Right subsequence } }

Tip: the above is the main framework for fast sorting recursive implementation. It is found that it is very similar to the binary tree preorder traversal rules. When writing the recursive framework, students can think about the binary tree preorder traversal rules and write them quickly. The postorder only needs to analyze how to divide the data in the interval according to the benchmark value.

### 1. hoare version

Basic idea: the hoare method is to find the large one on the right and the small one on the left, and then exchange the two values when they are found, and then exchange the values of the left position and the keyi position. Finally, return the index "left" and recurse in turn to complete the sorting.

int PartSort1(int* a, int left, int right) { //If it is ordered, the time complexity is o(n^2), and the efficiency is relatively low, so it is necessary to take three numbers int mid = GetMidIndex(a, left, right); Swap(&a[left], &a[mid]);//Put the median in the left position int keyi = left;//The benchmark value needs to be a median or the sorting speed is very slow while (left < right) { //Go left and find the small one while (left < right && a[right] >= a[keyi]) --right; //Go right and find the big one while (left < right && a[left] <= a[keyi]) ++left; //Find exchange Swap(&a[left], &a[right]); } Swap(&a[keyi], &a[left]); // Meet and exchange again return left; }

- In the hoare version, we have three data fetching. Why is there three data fetching (taking the middle number)?

Three data fetching is to prompt the sorting speed. If a sequence is ordered, such as 1 2 3 4 5 6

1. If we don't perform triple counting, the benchmark value of the first pass is 1. Take the hoare version as an example. At this time, we look for the large one on the right and the small one on the left. First, we go from the right to the small one on the left, but there is no smaller than 1. Then, we need to go n times for 6 and n-1 times for 5, which is n^2 in turn. At this time, the time complexity is o(n^2), and the efficiency is very slow.

2. If the benchmark value is 3, look for the large one on the right and the small one on the left. It takes four times to go to the right and two times to go to the left. The value of left found at this time is the index of the middle number 3. Return at this time, and then continue recursion, which will return to the previous traversal of the binary tree. The time complexity is o(n*logn). Compared with not taking three numbers, High efficiency

### 2. Excavation method

Basic idea: first, take three numbers, find the middle value as the key value, and the pit hole is the index of the key value. Then, sort, find the value * * (a[hole] = a[right]) to the pit index, assign the index of the value to the pit hole (hole = right), and then find the large * * to the right, and find the value * * (a [hole] = a) to the pit index [left]), the index also gives the pit location hole (hole = left), and finally the key value is assigned to the value at the pit location (a[hole] = key) * *. After returning the hole, it can recurse successively until it is in order

int PartSort2(int* a, int left, int right) { int mid = GetMidIndex(a, left, right); Swap(&a[left], &a[mid]); // Exchange in three data fetches int key = a[left]; //key value int hole = left; //Pit position while (left < right) { while (left < right && a[right] >= key) { --right; } a[hole] = a[right]; //Change the value to the pit position hole = right; //Pit index replacement while (left < right && a[left] <= key) { ++left; } a[hole] = a[left]; hole = left; } a[hole] = key; return hole; //Return of pit position taken }

It can be seen that the idea of pit digging method is similar to that of hoare method. They all find a value as the benchmark value, then find the small one on the left and the large one on the right, and finally achieve order.

### 3. Front and back pointer method

Basic idea: during initialization, the prev pointer points to the beginning of the sequence, the cur pointer points to the next position of the prev pointer, cur goes left to find the value less than the key, finds the stop, prev goes left, then exchanges the values at cur and prev, the loop ends, exchanges the values at prev and key, returns prev, and finally continues to recurse the left and right sequence to achieve order.

// Quick sort before and after pointer method int PartSort3(int* a, int left, int right) { int keyi = left;//The value at the beginning of the sequence as the key int prev = left;//Start points to the beginning of the sequence int cur = prev + 1; // Start pointing to the next position after prev while (cur <= right)//Continue execution only when the cycle conditions are met { // Go left to find a value less than keyi, and prev+1 should be less than cur if (a[cur] < a[keyi] && ++prev <= cur) { Swap(&a[cur], &a[prev]);//exchange } ++cur;//Keep going } Swap(&a[keyi], &a[prev]);//Exchange values at prev and key return prev; }

The front and back pointer method is more difficult to understand than the hoar version, but it can also achieve sorting in the end. It is roughly the same. Readers should master all of them according to what they like.

### Non recursive framework (stack implementation)

// Non recursive implementation of quick sort void QuickSortNonR(int* a, int left, int right) { Stack s; StackInit(&s); StackPush(&s, right);// Press the stack first on the right and press the stack on the left StackPush(&s, left); while (!StackEmpty(&s)) { int begin = StackTop(&s); StackPop(&s); int end = StackTop(&s); StackPop(&s); int keyi = PartSort1(a, begin, end);//Get the key value first if (keyi + 1 < end) { //Sort the sequence on the right StackPush(&s, end); StackPush(&s, keyi + 1); } if (keyi - 1 > begin) { //Sort the left sequence StackPush(&s, keyi - 1); StackPush(&s, begin); } } StackDestroy(&s);// Remember to destroy the stack }

Summary:

- The overall comprehensive performance and usage scenarios of quick sort are relatively good, so we dare to call it quick sort
- Time complexity: O(N*logN)
- Space complexity: O(logN)
- Stability: unstable

# Merge sort

Basic idea: merge sort is an effective sorting algorithm based on merge operation. The algorithm adopts divide and conquer A very typical application of. Merge the ordered subsequences to obtain a completely ordered sequence; that is, order each subsequence first, and then order the subsequence segments. If two ordered tables are merged into one ordered table, it is called two-way merging. The core steps of merging sorting:

The whole sequence is divided into small sequences. The difference between sorting and fast sorting is that merging is sorting from inside to outside, and fast sorting is sorting from outside to inside. Therefore, the non recursive method behind merging sorting cannot use stack for sorting (very troublesome), which is simpler than using loop.

Recursion:

void _MergeSort(int* a, int left, int right, int* tmp) { if (left >= right) return; //Stop recursion when the interval becomes 11 int mid = left + (right - left) / 2; // Intermediate interval _MergeSort(a, left, mid, tmp); // The first interval is recursively divided into multiple cells _MergeSort(a, mid + 1, right, tmp); // Post interval // Interval index range int begin1 = left, end1 = mid; // Front interval index int begin2 = mid + 1, end2 = right; // Back inter zone index int index = left; //The index is used to sort while (begin1 <= end1 && begin2 <= end2) { //Load small into tmp array if (a[begin1] <= a[begin2]) { tmp[index] = a[begin1];//New array tmp packed ordered values index++; begin1++; } else { tmp[index] = a[begin2]; index++; begin2++; } } //If the previous interval is not traversed, continue to insert while (begin1 <= end1) { tmp[index++] = a[begin1++]; } //If the post interval is not traversed, continue to insert while (begin2 <= end2) { tmp[index++] = a[begin2++]; } //The interval of the first call to the function is 0 -- n-1, not 0 -- n. note for (int i = left; i <= right; ++i) { a[i] = tmp[i]; //Copy to original array } } //Recursion in void MergeSort(int* a, int n) { int* tmp = (int*)malloc(sizeof(int) * n);//Open up a piece of memory assert(tmp != NULL);//Check whether the development is successful _MergeSort(a, 0, n - 1, tmp); free(tmp);//Remember to release }

Non recursive:

//Here is the same as above, except that it does not enter the inner layer recursively void _MergeSortNonR(int* a, int left, int right, int* tmp) { if (left >= right) return; int mid = left + (right - left) / 2; // Interval index range int begin1 = left, end1 = mid; // Front interval index int begin2 = mid + 1, end2 = right; // Back inter zone index int index = left; //The index is used to sort while (begin1 <= end1 && begin2 <= end2) { if (a[begin1] <= a[begin2]) { tmp[index] = a[begin1];//New array tmp packed ordered values index++; begin1++; } else { tmp[index] = a[begin2]; index++; begin2++; } } while (begin1 <= end1) { tmp[index++] = a[begin1++]; } while (begin2 <= end2) { tmp[index++] = a[begin2++]; } //The interval of the first call to the function is 0 -- n-1, not 0 -- n. note for (int i = left; i <= right; ++i) { a[i] = tmp[i]; //Copy to original array } } // Non recursive implementation of merge sort void MergeSortNonR(int* a, int n) { int* tmp = (int*)malloc(sizeof(int) * n); //Temporary array assert(tmp != NULL);//Check whether the development fails int gap = 1; //Interval range control while (gap < n) { for (int i = 0; i < n; i += gap) { //Left and right range control int left = i, right = 2 * gap + i - 1; // Note that the range of right needs to be checked if (right >= n) right = n - 1;//Range up to n - 1 _MergeSort(a, left, right, tmp); } gap *= 2; } free(tmp); //Release the dynamic space opened up }

Summary of characteristics of merge sort:

- The disadvantage of merging is that it requires O(N) space complexity. The thinking of merging sorting is more to solve the problem of external sorting in the disk.
- Time complexity: O(N*logN), properties of binary tree
- Space complexity: O(N)
- Stability: stable

# Count sort

Basic idea: counting sorting, also known as pigeon nest principle, is a deformation application of hash direct addressing method. Operation steps:

- Count the number of occurrences of the same element
- According to the statistical results, the sequence is recovered into the original sequence

The basic idea is very simple, but we need to do some processing here. See the figure for details

// Count sort void CountSort(int* a, int n) { int min = a[0]; int max = a[0]; for (int i = 1; i < n; ++i) { if (a[i] < min) min = a[i]; // Count the range from the minimum value to the maximum value, and open up a range space to avoid waste if (a[i] > max) max = a[i]; } int range = max - min + 1; //Number of spaces opened //int* count = malloc(sizeof(int) * range); int* count = calloc(range, sizeof(int)); //It is best to use calloc, which is automatically initialized to 0 assert(count != NULL); for (int i = 0; i < n; i++) { count[a[i] - min]++;// Counts the number of occurrences of the element } int i = 0; // Load the elements that appear into the array in turn for (int j = 0; j < range; j++) { while (count[j]--) { a[i++] = j + min; } } free(count); //Remember to release the open space }

# Complexity and stability analysis of sorting algorithm

The time complexity needs to be explained. You can refer to the following pictures to summarize it and gain more (remember: it is forbidden to remember the time complexity and space complexity)