I am doing competition programming with Java
, but the worst calculation amount of the function ʻArrays.sort that sorts the array of primitive type provided in the standard library of
Java` is $ O (N ^ 2). I heard that it is $, and I implemented my own sort function because it was bad (however, if it is not primitive, it seems that Merge Sort of $ O (N \ log N) $ is used at worst. Stable This is also subject to change as it is only guaranteed to be a sort).
(Addition 2020/04/14 15:27) It seems that Java 14 has been modified so that the worst computational complexity is $ O (N \ log N) $ as follows. However, it seems that it will take some time before Java 14 can be used in competition pros, so self-made sorting Looks good to have. [Java 14 Arrays.sort (int [])](https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/Arrays.html#sort #sort ( Quoted from int []))
The sorting algorithm is a Dual-Pivot Quicksort by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch. This algorithm offers O(n log(n)) performance on all data sets, and is typically faster than traditional (one-pivot) Quicksort implementations.
(End of postscript)
Also, I wanted to beat ʻArrays.sort` if I wrote it anyway, so I implemented a multi-pattern sort algorithm with a device for speeding up and compared the speed.
This time, we implemented and measured the following three types of sorting algorithms, which make them fight against ʻArrays.sort`.
The array size used for the test was $ N = 10 ^ 5, 10 ^ 6, 10 ^ 7 $, and the average value of multiple execution times was measured. Due to the execution time, $ N = 10 ^ 5 $ So we can only measure $ 1000 $ times, $ 10 ^ 6 $ for $ 100 $ times, $ 10 ^ 7 $ for $ 10 $ times. The test cases are java.util.Random
'snextInt ()
andnextLong (). It was generated using
, and in order to eliminate the difference due to input, the generated array was copied and each sort was executed and measured for the array with the same contents.
It is known that the processing for sorted arrays is explosive as a feature of ʻArrays.sort`, but I felt it was non-essential, so this time I aimed to win in a random case.
Merge Sort is an algorithm that briefly describes: The detailed explanation is omitted because many people have already written it in an easy-to-understand manner.
Step 1. Divide the array into the first half and the second half. Step 2. Sort the first and second half of the array with a recursive call. Step 3. Merge the sorted first half and second half well to sort the whole thing.
Regarding the complexity of this algorithm, we can see that the recurrence formula is $ O (N \ log N) $ when the size of the array is $ N $. The lower limit of the complexity of sorting based on comparison is $ O. It is known to be (N \ log N) $, which means it is fast on order.
However, when I actually wrote and operated this algorithm, the speed was far below that of ʻArrays.sort` as shown below. The implementation of this Merge Sort is the recursive version of Merge Sort + shown later. Just delete the Insertion Sort part from the Insertion Sort code and add the termination condition, which is almost the same, so omit it.
int array sort | |||
---|---|---|---|
Arrays.sort (Standard library) | 5.979 | 73.396 | 891.750 |
MergeSort (Recursion) | 9.373 | 117.921 | 1238.865 |
One of the causes is the behavior when the array is divided into small pieces. Generally, if the order is improved, the calculation and the way of holding the data become complicated, so the overhead is the order for the small size. When it comes to Merge Sort, the orderly inferior Insertion Sort (insertion sort, computational complexity $ O (N ^ 2) $) is faster when it comes to merge sort. ..
Therefore, set a boundary value for the array size, and if an array with a size smaller than that boundary value is passed, rewrite Merge Sort to execute Insertion Sort. The code with this rewrite is shown below. In the code below, all ranges are treated as half-open intervals.
RecursionalMergeSort.java
//Boundary value to switch from Merge Sort to Insertion Sort
private static final int INSERTION_SORT_THRESHOLD = 60;
//Functions visible from the outside
public static final void sort(int[] a, int from, int to) {
sort(a, from, to, new int[a.length >> 1]);
}
// Merge Sort +Processing body of Insertion Sort
//work is a work area for saving the original array when merging the first half and the second half..
//work is reused to save memory usage.
private static final void sort(int[] a, int begin, int end, int[] work) {
//Performs an insertion sort if the length of the given array is less than or equal to the boundary value
if (end - begin <= INSERTION_SORT_THRESHOLD) {
insertionSort(a, begin, end);
return;
}
//Subscript in the middle of the array.
// >>1 represents dividing by 2.
int mid = (begin + end) >> 1;
//Recursively sort the first half of the array.
sort(a, begin, mid, work);
//Recursively sort the second half of the array.
sort(a, mid, end, work);
//First half array length
int len = mid - begin;
//Copy the first half of the array to the work area.
//Copying an array is a System rather than a for statement.arraycopy works faster
System.arraycopy(a, begin, work, 0, len);
//i is the subscript of the storage location.
//wi is work(=First half of the array)Indicates which subscript you are looking at.
//ti represents which subscript in the second half of the array you are looking at.
//Subscripts in the second half of the array start with mid.
for (int i = begin, wi = 0, ti = mid;; i++) {
if (ti == end) {
//If you have seen all the second half,Copy the elements remaining in the work area
System.arraycopy(work, wi, a, i, len - wi);
//There are no more elements to merge, so exit the loop
break;
} else if (work[wi] > a[ti]) {
//Compare the element you are looking at in the first half with the element you are looking at in the second half.
//Store the smaller one,Shift the subscript back.
a[i] = a[ti++];
} else {
//Same as above.
a[i] = work[wi++];
//If you finish looking at the first half,The rest of the elements in the second half are already in the array
if (wi == len) {
break;
}
}
}
}
//Insertion Sort called when the array size is small.
private static final void insertionSort(int[] a, int from, int to) {
//Array a is from<= x <When arranging in ascending order in the range of i,
// a[i]Find where to insert by shifting the element back.
for (int i = from + 1; i < to; i++) {
//Value to insert
int tmp = a[i];
// a[i-1]If it is above, there is no need to search
if (a[i - 1] > tmp) {
//Insertion subscript
int j = i;
//The first time always satisfies the while conditional statement, so do-whil
do {
//Move the element back by one.
a[j] = a[j - 1];
//Insertion destination is earlier
j--;
//"Reach the beginning" or "a"[i]Discover the following elements "
} while (j > from && a[j - 1] > tmp);
// a[i]Put in the place where it should be inserted.
a[j] = tmp;
}
}
}
The comparison results are as follows. (Units are milliseconds, using the same test data)
int array sort | |||
---|---|---|---|
Arrays.sort (Standard library) | 5.979 | 73.396 | 891.750 |
MergeSort (Recursion) | 9.373 | 117.921 | 1238.865 |
MergeSort (Recursion) + Insertion Sort | 6.557 | 82.228 | 991.833 |
It's still not as fast as ʻArrays.sort, but it's been accelerated to a practically acceptable speed. However, our goal is to implement a faster sort than ʻArrays.sort
, so it's even faster.
A recursive call may be the cause of the slowness. It is said that rewriting the recursive call to for
or while
will improve the speed. Therefore, the next step is to implement Merge Sort that does not use recursion.
In the recursive version of Merge Sort, it was implemented with an image (= top-down) descending from top to bottom (from larger to smaller), but in the non-recursive version, it is implemented in the opposite direction (= bottom-up). That is, we first decompose the arrays into smaller sizes and then sort the arrays. We will call this sorted subcolumn a block. Then we will call this block in the same way as in the previous implementation. By merging two by two and increasing the size, the entire column is finally sorted. The implementation is shown below, but the code and comments common to the recursive version are omitted as appropriate.
NonRecursionalMergeSort.java
//Because it is non-recursive,The work area array is generated internally(You don't have to pass it as an argument).
public static final void sort(int[] a, int begin, int end) {
//First of all,Perform Insertion Sort for each small block.
//this time,Use the boundary value defined in the previous implementation as the smallest unit of block
//i is the first subscript of the block
for (int i = begin;;) {
//j is the subscript at the end of the block(However,Since it is a half-open interval, j is not included.)
int j = i + INSERTION_SORT_THRESHOLD;
//The array length is not always divisible by the size of the block.
//So,Determine if the block is wide enough.
if (j < end) {//If you can get enough width
//Execute Insertion Sort
insertionSort(a, i, j);
} else {//If you don't have enough width(=When you reach the end of the array)
//Run Insertion Sort from i to end.
insertionSort(a, i, end);
//I've reached the end so I'm out of the loop.
break;
}
//The starting point of the next block is j. (i,Note that j was a half-open section)
i = j;
}
//The length of the range of arrays to sort.
int len = end - begin;
//Work area.Same as before,Used to save the first half of the array.
//The boundary is not always in the middle because the block division produces fractions..
//Therefore,The area to be secured becomes wider.
int[] work = new int[len];
//block is the size of the block.Every time you merge, the size doubles,
//Update formula is block<<=1.
for (int block = INSERTION_SORT_THRESHOLD; block <= len; block <<= 1) {
//Because I merge two blocks at a time,Calculate the size of two blocks.
int twoBlocks = block << 1;
//from is a subscript that points to the beginning of the first of the two blocks to merge.
//Therefore,In the update formula, add the size of two blocks.
//max is the maximum value that can be taken from(-1).
//There is no need to merge if the fraction is less than one interval.,
// max = end -The length of one block is subtracted as a block.
for (int from = begin, max = end - block; from < max; from += twoBlocks) {
//Boundary between two blocks(Point to the beginning of the second block).
int mid = from + block;
//End of second block.Take min, paying attention to the end of the interval.
int to = Math.min(from + twoBlocks, end);
//The merge of the two blocks is the same as the recursive version, so the explanation is omitted..
System.arraycopy(a, from, work, 0, block);
for (int i = from, wi = 0, ti = mid;; i++) {
if (ti == to) {
System.arraycopy(work, wi, a, i, block - wi);
break;
} else if (work[wi] > a[ti]) {
a[i] = a[ti++];
} else {
a[i] = work[wi++];
if (wi == block) {
break;
}
}
}
}
}
}
//The insertionSort method is the same as the recursive version, so it is omitted..
The comparison result is as follows.
int array sort | |||
---|---|---|---|
Arrays.sort (Standard library) | 5.979 | 73.396 | 891.750 |
MergeSort (Recursion) | 9.373 | 117.921 | 1238.865 |
MergeSort (Recursion) + Insertion Sort | 6.557 | 82.228 | 991.833 |
MergeSort (Non-recursive) + Insertion Sort | 3.926 | 43.144 | 477.579 |
It seems that it was possible to speed up considerably just by rewriting it non-recursively. It is superior to ʻArrays.sort`.
However, there are actually faster algorithms for sorting arrays that have ʻintor
long` as elements.
Here, we implement an algorithm called Radix Sort. The amount of calculation of Radix Sort is $ O (k \ ast N) with the word length of scale notation (for example, $ 10 $ base notation) as $ k $. ) $.
Radix Sort is not a "comparison" sorting algorithm. What it means is that when an element can be represented in some radix notation, it only focuses on the value of each digit without a direct comparison. You can sort by. ʻIntand
long are represented by bit columns, and this bit column is nothing but radix sort. Therefore, columns consisting of ʻint
and long
are sorted by Radix Sort. It will be sorted. If you look at the implementation shown below, you can see that there is no comparison of the two elements of the array even though it is a sorting algorithm.
Specifically, the algorithm consists of the following k Steps, where k is the bit length. By performing each Step with $ O (N) $, the total $ O (k \ ast N) $ becomes Can be achieved.
Step 1. Prepare a bucket to put the element whose lowest bit is 0 and a bucket to put the element whose lowest bit is 1, and store the elements of the sequence. Step 2. Do the same operation for the second bit from the bottom as in Step 1. However, for any two elements that go into the same bucket, the order in which they are stored should not break the order of the buckets in step 1. Must be something. Step i. Sort for the i-th bit from the bottom in the same way. Step k. Sort for the kth bit from the bottom in the same way.
I find it difficult to understand the "however, ..." in Step 2, so I'll follow the flow with an example. In the following, a = [010, 011, 100, 101,000] is sorted according to the above procedure. Since the word length is 3, the sorting will be completed in Steps 1 to 3 below.
Step 1. Let a = [010, 011, 100, 101, 000]-> bucket = [[010, 100, 000], [011, 101]]. Return this to a and a = [010, 100, 000] , 011, 101] and updated.
Step 2. Let a = [010, 100, 000, 011, 101]-> bucket = [[100, 000, 101], [010, 011]]. Return this to a and a = [010, 100, 000] , 011, 101] and updated. At this time, [100, 000, 101] put in bucket 0 are put in the same order as a before processing. The same applies to bucket 1.
Step 3. Set a = [100, 000, 101, 010, 011]-> bucket = [[000, 010, 011], [100, 101]] and set this back to a, a = [000, 010, 011 Updated with 100, 101]. A is certainly sorted.
The above is the general flow of Radix Sort, but there is one big problem. It is the treatment of negative numbers. In many languages such as Java
, negative integers are represented using 2's complement representation, so If the above algorithm is applied as it is, a negative number with the most significant bit of 1 will be sorted as being larger than a positive number (in other words, sorting will be performed if it is regarded as ʻunsigned). Also, the magnitude relation of negative numbers in
signed and the magnitude relation of ʻunsigned
are just the opposite. Therefore, consider that the correct sort sequence can be obtained by following the steps below.
Step 1. Perform Radix Sort in descending order (as ʻunsigned). Step 2. Sorted in descending order by ʻunsigned
, so the negative number before and the positive number after it are fixed. Also, if you look at the magnitude relationship as signed
, the negative number and the positive number are positive. The numbers are arranged in descending order. Therefore, if only the negative number column is inverted and only the positive number column is inverted, the sorted column with signed
can be obtained.
Now you can write a Radix Sort that also supports negative numbers. However, in the implementation shown below, 8 bits are grouped together as one place to shorten the word length. In other words, we are doing Radix Sort using $ 2 ^ 8 = 256 $ radix sort notation.
RadixSort.java
//bucket size.
private static final int BUCKET_SIZE = 256;
//int is 32bit,Word length in 256 base is 32/ 8 = 4.
private static final int INT_RECURSION = 4;
//Lower 8 bits(Single digit in 256 base)Mask for extracting
private static final int MASK = 0xff;
//Functions visible from the outside
public static final void sort(int[] a, int from, int to) {
sort(a, from, to, 0, new int[to - from]);
}
//Bucket is represented by a one-dimensional array to save memory.
//l means that it is now Step l. (However, 0-indexed l= 0, 1, 2, 3)
private static final void sort(int[] a, final int from, final int to, final int l, final int[] bucket) {
//Since the mask is done after shifting to the right,Calculate the right shift amount.
final int shift = l << 3;
//Manage the number of pieces in each bucket.
//I want the starting point of each bucket, so I have it as a cumulative sum.
final int[] cnt = new int[BUCKET_SIZE + 1];
//Manage how many are placed in each bucket.
final int[] put = new int[BUCKET_SIZE];
for (int i = from; i < to; i++) {
// a[i] >>>Shift the desired 8 bits to the lowest level.
// (a[i] >>> shift) &MASK tells you which bucket to enter.
//However,The subscript will be shifted by one as it will be accumulated later..
cnt[((a[i] >>> shift) & MASK) + 1]++;
}
//If you take the accumulation, cnt[i]Stores the start point of the i-th bucket in.
for (int i = 0; i < BUCKET_SIZE; i++) {
cnt[i + 1] += cnt[i];
}
for (int i = from; i < to; i++) {
//Calculate which bucket to enter in the same way as before.
int bi = (a[i] >>> shift) & MASK;
//Stored in bith bucket.
//Do not overwrite what you have already placed, put[bi]Increase
bucket[cnt[bi] + put[bi]++] = a[i];
}
//I want to sort in descending order,Scan bucket in reverse order
//idx represents the next place to be stored on the array a.
for (int i = BUCKET_SIZE - 1, idx = from; i >= 0; i--) {
//The starting point of the i-th bucket
int begin = cnt[i];
//i-th bucket size.Calculated using the cumulative sum of cnt.
int len = cnt[i + 1] - begin;
//Copy the contents of the bucket back into the original array.
System.arraycopy(bucket, begin, a, idx, len);
//The next place to store is len ahead
idx += len;
}
//The next step is Step l+ 1.
final int nxtL = l + 1;
//Is there still a place left?
if (nxtL < INT_RECURSION) {
//To the next step.bucket is reused
sort(a, from, to, nxtL, bucket);
// i =If you have reached here with 0,Descending sort by unsigned is finished
if (l == 0) {
//Variables to reuse.
int lft, rgt;
//Negative number in the first half,Since there is a positive number in the second half, binary search for the boundary.
//Find a negative number. O(log N).
lft = from - 1; rgt = to;
while (rgt - lft > 1) {
int mid = (lft + rgt) >> 1;
if (a[mid] < 0) {
lft = mid;
} else {
rgt = mid;
}
}
//Negative number is the value of rgt.
final int negative = rgt;
//Subsequence of array a[from:negative-1](Closedsection)To reverse.
//Negative numbers are arranged in descending order in this section.
lft = from; rgt = negative - 1;
while (rgt > lft) {
int tmp = a[lft]; a[lft] = a[rgt]; a[rgt] = tmp;
lft++; rgt--;
}
//Subsequence of array a[negative:to-1](Closedsection)To reverse.
//Positive numbers are arranged in descending order in this section.
lft = negative; rgt = to - 1;
while (rgt > lft) {
int tmp = a[lft]; a[lft] = a[rgt]; a[rgt] = tmp;
lft++; rgt--;
}
}
}
}
The comparison result is as follows.
int array sort | |||
---|---|---|---|
Arrays.sort (Standard library) | 5.979 | 73.396 | 891.750 |
MergeSort (Recursion) | 9.373 | 117.921 | 1238.865 |
MergeSort (Recursion) + Insertion Sort | 6.557 | 82.228 | 991.833 |
MergeSort (Non-recursive) + Insertion Sort | 3.926 | 43.144 | 477.579 |
RadixSort | 1.000 | 11.124 | 133.713 |
It's an overwhelming victory. However, one thing to be aware of is that sorting the long
array takes a lot of time because the order is proportional to the word length. Actually, it is measured by the long
array. The result is that the performance of Radix Sort is degraded as shown below.
long array sort | |||
---|---|---|---|
Arrays.sort (Standard library) | 6.300 | 67.841 | 751.904 |
MergeSort (Recursion) | 9.110 | 111.099 | 1340.971 |
MergeSort (Recursion) + Insertion Sort | 6.594 | 76.131 | 922.294 |
MergeSort (Non-recursive) + Insertion Sort | 4.108 | 43.538 | 484.530 |
RadixSort | 2.684 | 29.359 | 339.683 |
Still, it is by far the fastest of all.
I think I was able to beat ʻArrays.sort` brilliantly. I don't think Radix Sort was fooled, but I'm very happy that I also won Merge Sort.
Recommended Posts