Sorting Algorithm Shootout
Since I did my Sort Optimization post, I’ve been keeping an eye on things that happen in the sorting world. Recently an article popped up on Reddit about someone wanting to replace the JDK sorting algorithm with a Dual Pivot Quick Sort. This lead to the discovery that Tim Sort would be replacing Merge Sort in the JDK starting with version 7. This probably got some attention because of the OpenJDK project. It’s nice to see that allowing more developers to work on different areas of the JDK. First I’ll do a quick overview of the algorithms, then show some benchmarks. All algorithms are written in Java.
JDK 6 Sort
The JDK6 implements a fairly standard Merge Sort. It will switch to an insertion sort at a specific depth.
This is the implementation of quicksort I outlined in the earlier blog post. It performed admirably at the time, but how will it hold up against tougher competition. It’s pretty much an iterative quicksort, that short-circuits to a shell sort if it’s going too deep.
Original QSort Post:
This is an optimized in place variation of a merge sort. Tim Peters developed this sorting algorithm for the Python programming language. It is in use by Python and will be used by Java starting with JDK 7. It takes advantage of partially sorted parts of the list.
Dual Pivot Quick sort
This is a newcomer to the sorting table. Developed by Vladimir Yaroslavskiy for the inclusion into the Java language. The premise is the same as quick sort, only it will choose two pivot points rather than one. He did a full writeup detailing the algorithm, and its benefits. I did modify it to take the comparable interface, and Vladimir explicitly said this was not the intended target of the algorithm. He has stated it is designed to work directly with primitive types. I don’t see how doing an int comparison vs a Integer.compareTo() would be different, as long as they are used uniformly between all algorithms. Since my sorting algorithm works with comparable, as does Tim Sort, I chose to convert this algorithm to use the Comparable interface also.
These tables show the speedup relative to JDK 6 with and without warm up.
Here is the original text data if you’re interested in that. These are in simple table format. The columns store the runtime in seconds for each algorithm. The number in parenthesis is the speedup relative to JDK 6.
Tim Sort is definitely the way to go if you’re interested in a stable sorting algorithm. I was pretty amazed when I first looked at the results with how well it actually it did. It really takes advantage of any presorted parts of the lists. Overall, I’d say my optimized quicksort does fairly well, but maybe it could do better. I may have to look into that again.