<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" ><channel><title>Blog::Quibb &#187; shootout</title> <atom:link href="http://blog.quibb.org/tag/shootout/feed/" rel="self" type="application/rss+xml" /><link>http://blog.quibb.org</link> <description>Software development and more.</description> <lastBuildDate>Mon, 21 Nov 2011 05:12:26 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>Sort Optimization (Part 2) with JDK 6 vs JDK 7</title><link>http://blog.quibb.org/2009/12/sort-optimization-part-2-with-jdk-6-vs-jdk-7/</link> <comments>http://blog.quibb.org/2009/12/sort-optimization-part-2-with-jdk-6-vs-jdk-7/#comments</comments> <pubDate>Wed, 23 Dec 2009 15:00:28 +0000</pubDate> <dc:creator>Joe</dc:creator> <category><![CDATA[Java]]></category> <category><![CDATA[benchmarks]]></category> <category><![CDATA[shootout]]></category> <category><![CDATA[sorting]]></category><guid isPermaLink="false">http://blog.quibb.org/?p=94</guid> <description><![CDATA[In part 1, I went over my first foray into the world of sorting algorithms.  Since then, I&#8217;ve had some other ideas on how to improve my quicksort implementation.  One idea that I had while originally working on the sorting algorithm, was to rework the partition function to take into account duplicate elements.  I had [...]]]></description> <content:encoded><![CDATA[<p>In <a href="http://blog.quibb.org/2008/11/sort-optimization/">part 1</a>, I went over my first foray into the world of sorting algorithms.  
Since then, I&#8217;ve had some other ideas on how to improve my quicksort implementation.  One idea I had while originally working on the sorting algorithm was to rework the partition function to take duplicate elements into account.  I had a few different working implementations, but all of them came with a severe performance penalty.  I finally figured out a way to get performance close to the previous algorithm.</p><p>The partition function needs to perform as few swaps as possible.  Moving towards the center from both ends and only swapping when both elements are out of order is the best approach I&#8217;ve found so far.  Elements equal to the pivot are swapped to the beginning of the partition area as they are found.  At the end, a pass moves them to their correct location in the final list.  Instead of returning one index, the partition function now returns two: the minimum and maximum indices of the range that holds the pivot value.</p><p>Another area where I was able to get some performance gain was getting rid of the shell sort from the first algorithm.  While it was there to make sure the quicksort did not recurse too deeply, in practice the shell sort doesn&#8217;t run.</p><p><strong>Results</strong></p><p>Here are the results of JDK 6 MergeSort, <a href="http://en.wikipedia.org/wiki/Timsort">Tim Sort</a>, <a href="http://blog.quibb.org/2008/11/sort-optimization/">QSort</a>, QSortv2, and Dual Pivot sort 2 benchmarked on the same set of files.  Overall, the new version doesn&#8217;t outperform the old version, but I thought it was worth posting my findings.  On most data sets with duplicates it does perform better.  I also ran these benchmarks on OpenJDK 7 because I was curious as to how they would compare to one another.</p><p>It&#8217;s important to note that the tables show the speedup relative to the Java implementation on the given JDK.  
The graphs show the average runtimes for each algorithm.  I chose average runtime because it shows the performance difference between Sun&#8217;s JDK 6 and OpenJDK 7 build 73.<br /><center></p><table><tbody><tr><td><p><div id="attachment_105" class="wp-caption alignnone" style="width: 122px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/JDK6nowarm.png"><img class="size-medium wp-image-105" title="Sun JDK 6 without Warmup" src="http://blog.quibb.org/wp-content/uploads/2009/12/JDK6nowarm-112x300.png" alt="Sun JDK 6 without Warmup" width="112" height="300" /></a><p class="wp-caption-text">Sun JDK 6 without Warmup</p></div></td><td><p><div id="attachment_106" class="wp-caption alignnone" style="width: 106px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/JDK7nowarm.png"><img class="size-medium wp-image-106 " title="OpenJDK 7 without Warmup" src="http://blog.quibb.org/wp-content/uploads/2009/12/JDK7nowarm-96x300.png" alt="OpenJDK 7 without Warmup" width="96" height="300" /></a><p class="wp-caption-text">OpenJDK 7 without Warmup</p></div></td></tr></tbody></table><p></center><br /><center></p><table><tbody><tr><td><p><div id="attachment_107" class="wp-caption alignnone" style="width: 174px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/JDK61000warm.png"><img class="size-medium wp-image-107" title="Sun JDK 6 1000 Warmup Iterations" src="http://blog.quibb.org/wp-content/uploads/2009/12/JDK61000warm-164x300.png" alt="Sun JDK 6 1000 Warmup Iterations" width="164" height="300" /></a><p class="wp-caption-text">Sun JDK 6 1000 Warmup Iterations</p></div></td><td><p><div id="attachment_108" class="wp-caption alignnone" style="width: 151px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/JDK71000warm.png"><img class="size-medium wp-image-108" title="OpenJDK 7 1000 Warmup Iterations" src="http://blog.quibb.org/wp-content/uploads/2009/12/JDK71000warm-141x300.png" alt="OpenJDK 7 1000 Warmup Iterations" width="141" 
height="300" /></a><p class="wp-caption-text">OpenJDK 7 1000 Warmup Iterations</p></div></td></tr></tbody></table><p></center><br /><div id="attachment_109" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/nowarmup.png"><img class="size-medium wp-image-109  " title="Sun JDK 6 vs OpenJDK 7 without Warmup" src="http://blog.quibb.org/wp-content/uploads/2009/12/nowarmup-300x249.png" alt="JDK 6 vs JDK 7 with No Warmup" width="300" height="249" /></a><p class="wp-caption-text">Sun JDK 6 vs OpenJDK 7 without Warmup</p></div></p><div id="attachment_104" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/1000warm.png"><img class="size-medium wp-image-104 " title="Sun JDK 6 vs OpenJDK 7 1000 Iterations of Warmup" src="http://blog.quibb.org/wp-content/uploads/2009/12/1000warm-300x248.png" alt="Sun JDK 6 vs OpenJDK 7 1000 Warmup Iterations" width="300" height="248" /></a><p class="wp-caption-text">Sun JDK 6 vs OpenJDK 7 1000 Iterations of Warmup</p></div><p><strong>Conclusions</strong></p><p>Overall, the new version of the QSort implementation doesn&#8217;t improve greatly over the previous implementation.  While it didn&#8217;t work out to be the performance improvement I was looking for, I think the last graph, with 1000 iterations of warmup for each algorithm, is the most interesting.  The QSort v2 implementation apparently doesn&#8217;t get handled any better by OpenJDK 7.  The partition function is larger after my changes, so perhaps it didn&#8217;t JIT very well.  What is interesting is the boost that Tim Sort saw with the change of JDKs.  Running these benchmarks made me realize that upgrading my Java runtime could improve the performance of all my Java applications.  
It will be interesting to see if the performance carries over to NetBeans and Eclipse; I expect it will.</p> ]]></content:encoded> <wfw:commentRss>http://blog.quibb.org/2009/12/sort-optimization-part-2-with-jdk-6-vs-jdk-7/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Sorting Algorithm Shootout</title><link>http://blog.quibb.org/2009/10/sorting-algorithm-shootout/</link> <comments>http://blog.quibb.org/2009/10/sorting-algorithm-shootout/#comments</comments> <pubDate>Thu, 08 Oct 2009 14:00:55 +0000</pubDate> <dc:creator>Joe</dc:creator> <category><![CDATA[Java]]></category> <category><![CDATA[shootout]]></category> <category><![CDATA[sorting]]></category><guid isPermaLink="false">http://blog.quibb.org/?p=75</guid> <description><![CDATA[Since I did my Sort Optimization post, I&#8217;ve been keeping an eye on things that happen in the sorting world.  Recently an article popped up on Reddit about someone wanting to replace the JDK sorting algorithm with a Dual Pivot Quick Sort.  This led to the discovery that Tim Sort would be replacing Merge Sort [...]]]></description> <content:encoded><![CDATA[<p>Since I did my <a href="http://blog.quibb.org/2008/11/sort-optimization/">Sort Optimization</a> post, I&#8217;ve been keeping an eye on things that happen in the sorting world.  Recently an article popped up on Reddit about someone wanting to replace the JDK sorting algorithm with a Dual Pivot Quick Sort.  This led to the discovery that Tim Sort would be replacing Merge Sort in the JDK starting with version 7.  This probably got some attention because of the <a href="http://openjdk.java.net/">OpenJDK</a> project.  It&#8217;s nice to see that project allowing more developers to work on different areas of the JDK.  First I&#8217;ll do a quick overview of the algorithms, then show some benchmarks.  All algorithms are written in Java.</p><p><strong>JDK 6 Sort</strong></p><p>JDK 6 implements a fairly standard Merge Sort.  
It switches to an insertion sort once the sub-arrays are small enough.</p><p><strong>QSort</strong></p><p>This is the implementation of quicksort I outlined in the earlier blog post.  It performed admirably at the time, but how will it hold up against tougher competition?  It&#8217;s pretty much an iterative quicksort that short-circuits to a shell sort if it&#8217;s going too deep.</p><p>Original QSort Post:<br /> <a href="http://blog.quibb.org/2008/11/sort-optimization/">Sort Optimization</a></p><p><strong>Tim Sort</strong></p><p>This is an optimized, adaptive variation of merge sort.  Tim Peters developed this sorting algorithm for the Python programming language.  It is in use by Python and will be used by Java starting with JDK 7.  It takes advantage of partially sorted parts of the list.</p><p>Available here:<br /> <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6804124">http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6804124</a><br /> <a href="http://hg.openjdk.java.net/jdk7/tl/jdk/rev/bfd7abda8f79">http://hg.openjdk.java.net/jdk7/tl/jdk/rev/bfd7abda8f79</a></p><p><strong>Dual Pivot Quick Sort</strong></p><p>This is a newcomer to the sorting table.  It was developed by Vladimir Yaroslavskiy for inclusion in Java.  The premise is the same as quick sort, only it chooses two pivot points rather than one.  He did a full writeup detailing the algorithm and its benefits.  I did modify it to use the Comparable interface, although Vladimir explicitly said this was not the intended target of the algorithm.  He has stated it is designed to work directly with primitive types.  I don&#8217;t see how doing an int comparison versus an Integer.compareTo() would be different, as long as they are used uniformly between all algorithms.  
Since my sorting algorithm works with Comparable, as does Tim Sort, I chose to convert this algorithm to use the Comparable interface as well.</p><p>Available here:<br /> <a href="http://article.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628">http://article.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628</a></p><p><strong>Results</strong></p><p>These tables show the speedup relative to JDK 6 with and without warm up.</p><div id="attachment_83" class="wp-caption alignnone" style="width: 123px"><a href="http://blog.quibb.org/wp-content/uploads/2009/10/nowarm_server.png"><img class="size-medium wp-image-83" title="nowarm_server" src="http://blog.quibb.org/wp-content/uploads/2009/10/nowarm_server-113x300.png" alt="nowarm_server" width="113" height="300" /></a><p class="wp-caption-text">sorting algorithm speedup without warm up</p></div><div id="attachment_84" class="wp-caption alignnone" style="width: 175px"><a href="http://blog.quibb.org/wp-content/uploads/2009/10/10000warm_server.png"><img class="size-medium wp-image-84" title="10000warm_server" src="http://blog.quibb.org/wp-content/uploads/2009/10/10000warm_server-165x300.png" alt="sorting algorithm comparison with warmup" width="165" height="300" /></a><p class="wp-caption-text">sorting algorithm speedup with warm up</p></div><p>Here is the original text data if you&#8217;re interested in that.  These are in simple table format.  The columns store the runtime in seconds for each algorithm.  The number in parentheses is the speedup relative to JDK 6.</p><p><a href="http://blog.quibb.org/wp-content/uploads/2009/10/results_nowarm_server.txt">Without Warmup</a></p><p><a href="http://blog.quibb.org/wp-content/uploads/2009/10/results_10000warm_server.txt">With Warmup</a></p><p>Tim Sort is definitely the way to go if you&#8217;re interested in a stable sorting algorithm.  I was pretty amazed when I first looked at the results with how well it actually did.  
It really takes advantage of any presorted parts of the lists.  Overall, I&#8217;d say my optimized quicksort does fairly well, but maybe it could do better.  I may have to look into that again.</p> ]]></content:encoded> <wfw:commentRss>http://blog.quibb.org/2009/10/sorting-algorithm-shootout/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Ruby Shootout: Fasta</title><link>http://blog.quibb.org/2008/11/ruby-shootout-fasta/</link> <comments>http://blog.quibb.org/2008/11/ruby-shootout-fasta/#comments</comments> <pubDate>Sun, 02 Nov 2008 13:39:34 +0000</pubDate> <dc:creator>Joe</dc:creator> <category><![CDATA[Ruby]]></category> <category><![CDATA[benchmarks]]></category> <category><![CDATA[optimization]]></category> <category><![CDATA[shootout]]></category><guid isPermaLink="false">http://blog.quibb.org/?p=3</guid> <description><![CDATA[Lately, I&#8217;ve been looking at the shootout Ruby benchmarks. I&#8217;d gotten into a habit of checking them every few months, rooting for my favorite languages, without really understanding why some didn&#8217;t have the greatest showing there. When the people running it upgraded their hardware, it seemed as though Ruby fell off the list. While [...]]]></description> <content:encoded><![CDATA[<p>Lately, I&#8217;ve been looking at the <a href="http://shootout.alioth.debian.org/">shootout</a> Ruby benchmarks.  I&#8217;d gotten into a habit of checking them every few months, rooting for my favorite languages, without really understanding why some didn&#8217;t have the greatest showing there.  When the people running it upgraded their hardware, it seemed as though Ruby fell off the list.  While Ruby is a slow language, it deserves its spot (even if it is at the bottom).  I looked at the submissions and saw a post saying that if more Ruby benchmarks were updated, it would be included again.</p><p>I&#8217;ve since updated 3 of the benchmarks; I sped up reverse-complement to be 2x faster than the previous version.  
I&#8217;m not a Ruby expert by any means, but I was able to get a speedup.  I don&#8217;t know if the other languages are in the same state, but they may be&#8230;</p><p>Anyway, the Fasta benchmark is the one that I most recently updated.  It took me a while to get a decent speedup.  It turned out that Array#find is slower than Array#each with a break statement.  Making this switch is what got the biggest speedup out of the program.</p><p>I wrote a benchmark showing the difference in runtime:</p><div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'benchmark'</span>
&nbsp;
N = <span style="color:#006666;">300</span>
table = <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">0</span>..<span style="color:#006666;">300</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">to_a</span>
<span style="color:#CC00FF; font-weight:bold;">Benchmark</span>.<span style="color:#9900CC;">bmbm</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">8</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>x<span style="color:#006600; font-weight:bold;">|</span>
  x.<span style="color:#9900CC;">report</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Array.find&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span>
    N.<span style="color:#9900CC;">times</span> <span style="color:#9966CC; font-weight:bold;">do</span>
      table.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>find_num<span style="color:#006600; font-weight:bold;">|</span>
        table.<span style="color:#9900CC;">find</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>num<span style="color:#006600; font-weight:bold;">|</span>
          num <span style="color:#006600; font-weight:bold;">&gt;</span> find_num
        <span style="color:#9966CC; font-weight:bold;">end</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#006600; font-weight:bold;">&#125;</span>
&nbsp;
  x.<span style="color:#9900CC;">report</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Array.each&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span>
    N.<span style="color:#9900CC;">times</span> <span style="color:#9966CC; font-weight:bold;">do</span>
      table.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>find_num<span style="color:#006600; font-weight:bold;">|</span>
        output = <span style="color:#0000FF; font-weight:bold;">nil</span>
        table.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>num<span style="color:#006600; font-weight:bold;">|</span>
          <span style="color:#9966CC; font-weight:bold;">if</span> num <span style="color:#006600; font-weight:bold;">&gt;</span> find_num
            output = num
            <span style="color:#9966CC; font-weight:bold;">break</span>
          <span style="color:#9966CC; font-weight:bold;">end</span>
        <span style="color:#9966CC; font-weight:bold;">end</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#006600; font-weight:bold;">&#125;</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>The results:</p><pre>Ruby 1.8.6       user     system      total        real
Array.find  12.120000   0.020000  12.140000 ( 14.647705)
Array.each   8.150000   0.030000   8.180000 (  9.734463)

JRuby 1.1.4      user     system      total        real
Array.find   6.738000   0.000000   6.738000 (  6.738407)
Array.each   5.365000   0.000000   5.365000 (  5.364320)</pre><p>Now, I&#8217;m running this on an admittedly dated machine.  Again, I&#8217;m nowhere near a Ruby expert, so if someone sees a way to make the benchmark fairer, let me know and I&#8217;ll update it.  I can&#8217;t say anything about how this would run in YARV when that is released.</p><p>Here is a link to the Ruby Fasta benchmark on the Language Shootout:</p><p><a title="Ruby Shootout: Fasta" href="http://shootout.alioth.debian.org/u32/benchmark.php?test=fasta&amp;lang=ruby&amp;id=1">Ruby Shootout: Fasta</a></p> ]]></content:encoded> <wfw:commentRss>http://blog.quibb.org/2008/11/ruby-shootout-fasta/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> </channel> </rss>
