<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blog::Quibb</title>
	<atom:link href="http://blog.quibb.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.quibb.org</link>
	<description>Software development and more.</description>
	<lastBuildDate>Wed, 10 Mar 2010 03:20:15 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>JSR-166: The Java fork/join Framework</title>
		<link>http://blog.quibb.org/2010/03/jsr-166-the-java-forkjoin-framework/</link>
		<comments>http://blog.quibb.org/2010/03/jsr-166-the-java-forkjoin-framework/#comments</comments>
		<pubDate>Wed, 10 Mar 2010 02:53:22 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[sorting]]></category>
		<category><![CDATA[threading]]></category>

		<guid isPermaLink="false">http://blog.quibb.org/?p=153</guid>
		<description><![CDATA[JSR-166 is the set of concurrency utilities that was included in Java 5. The fork/join framework was part of that work but didn&#8217;t make it into Java 5. After all this time, it is finally making it into JDK 7. What surprised me about the framework is that it is so [...]]]></description>
			<content:encoded><![CDATA[<p><a title="JSR-166" href="http://jcp.org/en/jsr/detail?id=166">JSR-166</a> is the set of concurrency utilities that was included in Java 5. The fork/join framework was part of that work but didn&#8217;t make it into Java 5. After all this time, it is finally making it into JDK 7. What surprised me about the framework is how easy it is to use.</p>
<p>The fork/join framework is designed to make divide-and-conquer algorithms easy to parallelize. More specifically, it targets recursive algorithms in which each call branches into a few sub-calls that each process a roughly equal part of the data set. The typical setup is to create a new class that extends either <a title="RecursiveAction" href="http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166ydocs/jsr166y/RecursiveAction.html">RecursiveAction</a> (for tasks that return nothing) or <a title="RecursiveTask" href="http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166ydocs/jsr166y/RecursiveTask.html">RecursiveTask</a> (for tasks that return a result). The parameters that were passed to the recursive function become member variables of the new class, and the recursive calls to the function itself are replaced by a call to invokeAll(&#8230;) on new task instances.</p>
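<p>As a rough illustration of that setup, here is a minimal sketch that squares every element of an array in place. The class and method names (SquareAll, SquareTask, squareAll) and the threshold value are made up for this example, and the imports assume the JDK 7 package names.</p>

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical example class; none of these names come from the fork/join docs.
class SquareAll {

    private static final ForkJoinPool POOL = new ForkJoinPool();

    // Assumed cutoff: at or below this range size, do the work directly.
    static final int THRESHOLD = 1000;

    // The arguments of the recursive function (data, lo, hi) become fields.
    static class SquareTask extends RecursiveAction {
        final long[] data;
        final int lo, hi; // inclusive bounds

        SquareTask(long[] data, int lo, int hi) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo < THRESHOLD) {
                for (int i = lo; i <= hi; i++) {
                    data[i] = data[i] * data[i];
                }
                return;
            }
            int m = lo + (hi - lo) / 2;
            // Where a sequential version would recurse twice, hand both halves to invokeAll.
            invokeAll(new SquareTask(data, lo, m), new SquareTask(data, m + 1, hi));
        }
    }

    // Entry point: wraps the whole array in one task and waits for it to finish.
    static void squareAll(long[] data) {
        POOL.invoke(new SquareTask(data, 0, data.length - 1));
    }
}
```

<p>Calling SquareAll.squareAll(values) returns only after every subtask has completed, because invoke and invokeAll both wait for the tasks they start.</p>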
<p>In writing this post, I kept going back and forth on whether to use Fibonacci numbers as an example or something with more meat to it. The computation done by each recursive call of a Fibonacci algorithm is too small to matter, and there are much better non-parallel algorithms for Fibonacci numbers anyway. In the end, I decided on a merge sort. It is used as the example in the fork/join documentation, but this will be a more complete example showing both the sequential algorithm and the changes made for the parallel version. You&#8217;ll see that it&#8217;s not that hard.</p>
<p>First, here is the source code for a typical MergeSort:</p>
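<p>To make the Fibonacci point concrete, here is a small sketch (the class name Fib and both method names are made up for illustration). The recursive version does a single addition per call, which is very little work to wrap in a task, while a plain loop gets the same answer with n additions in total.</p>

```java
// Hypothetical helper class, used only to illustrate the point above.
class Fib {

    // Naive recursion: each call performs one addition and nothing else.
    static long fibRecursive(int n) {
        return n < 2 ? n : fibRecursive(n - 1) + fibRecursive(n - 2);
    }

    // Sequential loop: n additions, no recursion, nothing left to parallelize.
    static long fibIterative(int n) {
        long prev = 0, curr = 1;
        for (int i = 0; i < n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return prev;
    }
}
```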

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MergeSort <span style="color: #009900;">&#123;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> SIZE_THRESHOLD <span style="color: #339933;">=</span> <span style="color: #cc66cc;">16</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> sort<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        sort<span style="color: #009900;">&#40;</span>a, <span style="color: #cc66cc;">0</span>, a.<span style="color: #006633;">length</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> sort<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a, <span style="color: #000066; font-weight: bold;">int</span> lo, <span style="color: #000066; font-weight: bold;">int</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>hi <span style="color: #339933;">-</span> lo <span style="color: #339933;">&lt;</span> SIZE_THRESHOLD<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            insertionsort<span style="color: #009900;">&#40;</span>a, lo, hi<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
        <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> tmp <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>hi <span style="color: #339933;">-</span> lo<span style="color: #009900;">&#41;</span> <span style="color: #339933;">/</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
        mergeSort<span style="color: #009900;">&#40;</span>a, tmp, lo, hi<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> mergeSort<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a, <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> tmp, <span style="color: #000066; font-weight: bold;">int</span> lo, <span style="color: #000066; font-weight: bold;">int</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>hi <span style="color: #339933;">-</span> lo <span style="color: #339933;">&lt;</span> SIZE_THRESHOLD<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            insertionsort<span style="color: #009900;">&#40;</span>a, lo, hi<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
        <span style="color: #000066; font-weight: bold;">int</span> m <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>lo <span style="color: #339933;">+</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #339933;">/</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
        mergeSort<span style="color: #009900;">&#40;</span>a, tmp, lo, m<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        mergeSort<span style="color: #009900;">&#40;</span>a, tmp, m <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span>, hi<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        merge<span style="color: #009900;">&#40;</span>a, tmp, lo, m, hi<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> merge<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a, <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> b, <span style="color: #000066; font-weight: bold;">int</span> lo, <span style="color: #000066; font-weight: bold;">int</span> m, <span style="color: #000066; font-weight: bold;">int</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>a<span style="color: #009900;">&#91;</span>m<span style="color: #009900;">&#93;</span>.<span style="color: #006633;">compareTo</span><span style="color: #009900;">&#40;</span>a<span style="color: #009900;">&#91;</span>m<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span>
            <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span>a, lo, b, <span style="color: #cc66cc;">0</span>, m<span style="color: #339933;">-</span>lo<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">int</span> j <span style="color: #339933;">=</span> m<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">int</span> k <span style="color: #339933;">=</span> lo<span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">// copy back next-greatest element at each time</span>
        <span style="color: #000000; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span>k <span style="color: #339933;">&lt;</span> j <span style="color: #339933;">&amp;&amp;</span> j <span style="color: #339933;">&lt;=</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>b<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span>.<span style="color: #006633;">compareTo</span><span style="color: #009900;">&#40;</span>a<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                a<span style="color: #009900;">&#91;</span>k<span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> b<span style="color: #009900;">&#91;</span>i<span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
                a<span style="color: #009900;">&#91;</span>k<span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> a<span style="color: #009900;">&#91;</span>j<span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">// copy back remaining elements of first half (if any)</span>
        <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span>b, i, a, k, j<span style="color: #339933;">-</span>k<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> insertionsort<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a, <span style="color: #000066; font-weight: bold;">int</span> lo, <span style="color: #000066; font-weight: bold;">int</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> lo<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;=</span> hi<span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000066; font-weight: bold;">int</span> j <span style="color: #339933;">=</span> i<span style="color: #339933;">;</span>
            <span style="color: #003399;">Comparable</span> t <span style="color: #339933;">=</span> a<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span>j <span style="color: #339933;">&gt;</span> lo <span style="color: #339933;">&amp;&amp;</span> t.<span style="color: #006633;">compareTo</span><span style="color: #009900;">&#40;</span>a<span style="color: #009900;">&#91;</span>j <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                a<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> a<span style="color: #009900;">&#91;</span>j <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                <span style="color: #339933;">--</span>j<span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
            a<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> t<span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Now here is the code for the parallel version of  MergeSort:</p>

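<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// JDK 7 package names; the jsr166y preview jar uses the jsr166y package instead</span>
<span style="color: #000000; font-weight: bold;">import</span> java.util.concurrent.ForkJoinPool<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> java.util.concurrent.RecursiveAction<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> ParallelMergeSort <span style="color: #009900;">&#123;</span>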
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> ForkJoinPool threadPool <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ForkJoinPool<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> SIZE_THRESHOLD <span style="color: #339933;">=</span> <span style="color: #cc66cc;">16</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> sort<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        sort<span style="color: #009900;">&#40;</span>a, <span style="color: #cc66cc;">0</span>, a.<span style="color: #006633;">length</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> sort<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a, <span style="color: #000066; font-weight: bold;">int</span> lo, <span style="color: #000066; font-weight: bold;">int</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>hi <span style="color: #339933;">-</span> lo <span style="color: #339933;">&lt;</span> SIZE_THRESHOLD<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            insertionsort<span style="color: #009900;">&#40;</span>a, lo, hi<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
        <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> tmp <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span>a.<span style="color: #006633;">length</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
        threadPool.<span style="color: #006633;">invoke</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> SortTask<span style="color: #009900;">&#40;</span>a, tmp, lo, hi<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #008000; font-style: italic; font-weight: bold;">/**
     * This class replaces the recursive function that was
     * previously here.
     */</span>
    <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">class</span> SortTask <span style="color: #000000; font-weight: bold;">extends</span> RecursiveAction <span style="color: #009900;">&#123;</span>
        <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a<span style="color: #339933;">;</span>
        <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> tmp<span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">int</span> lo, hi<span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">public</span> SortTask<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a, <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> tmp, <span style="color: #000066; font-weight: bold;">int</span> lo, <span style="color: #000066; font-weight: bold;">int</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">a</span> <span style="color: #339933;">=</span> a<span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">lo</span> <span style="color: #339933;">=</span> lo<span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">hi</span> <span style="color: #339933;">=</span> hi<span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">tmp</span> <span style="color: #339933;">=</span> tmp<span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
        @Override
        <span style="color: #000000; font-weight: bold;">protected</span> <span style="color: #000066; font-weight: bold;">void</span> compute<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>hi <span style="color: #339933;">-</span> lo <span style="color: #339933;">&lt;</span> SIZE_THRESHOLD<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                insertionsort<span style="color: #009900;">&#40;</span>a, lo, hi<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
&nbsp;
            <span style="color: #000066; font-weight: bold;">int</span> m <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>lo <span style="color: #339933;">+</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #339933;">/</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
            <span style="color: #666666; font-style: italic;">// the two recursive calls are replaced by a call to invokeAll</span>
            invokeAll<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> SortTask<span style="color: #009900;">&#40;</span>a, tmp, lo, m<span style="color: #009900;">&#41;</span>, <span style="color: #000000; font-weight: bold;">new</span> SortTask<span style="color: #009900;">&#40;</span>a, tmp, m<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span>, hi<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            merge<span style="color: #009900;">&#40;</span>a, tmp, lo, m, hi<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> merge<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a, <span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> b, <span style="color: #000066; font-weight: bold;">int</span> lo, <span style="color: #000066; font-weight: bold;">int</span> m, <span style="color: #000066; font-weight: bold;">int</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>a<span style="color: #009900;">&#91;</span>m<span style="color: #009900;">&#93;</span>.<span style="color: #006633;">compareTo</span><span style="color: #009900;">&#40;</span>a<span style="color: #009900;">&#91;</span>m<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span>
            <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span>a, lo, b, lo, m<span style="color: #339933;">-</span>lo<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> lo<span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">int</span> j <span style="color: #339933;">=</span> m<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">int</span> k <span style="color: #339933;">=</span> lo<span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">// copy back next-greatest element at each time</span>
        <span style="color: #000000; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span>k <span style="color: #339933;">&lt;</span> j <span style="color: #339933;">&amp;&amp;</span> j <span style="color: #339933;">&lt;=</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>b<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span>.<span style="color: #006633;">compareTo</span><span style="color: #009900;">&#40;</span>a<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                a<span style="color: #009900;">&#91;</span>k<span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> b<span style="color: #009900;">&#91;</span>i<span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
                a<span style="color: #009900;">&#91;</span>k<span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> a<span style="color: #009900;">&#91;</span>j<span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">// copy back remaining elements of first half (if any)</span>
        <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span>b, i, a, k, j<span style="color: #339933;">-</span>k<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> insertionsort<span style="color: #009900;">&#40;</span><span style="color: #003399;">Comparable</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> a, <span style="color: #000066; font-weight: bold;">int</span> lo, <span style="color: #000066; font-weight: bold;">int</span> hi<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> lo<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;=</span> hi<span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000066; font-weight: bold;">int</span> j <span style="color: #339933;">=</span> i<span style="color: #339933;">;</span>
            <span style="color: #003399;">Comparable</span> t <span style="color: #339933;">=</span> a<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span>j <span style="color: #339933;">&gt;</span> lo <span style="color: #339933;">&amp;&amp;</span> t.<span style="color: #006633;">compareTo</span><span style="color: #009900;">&#40;</span>a<span style="color: #009900;">&#91;</span>j <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                a<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> a<span style="color: #009900;">&#91;</span>j <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                <span style="color: #339933;">--</span>j<span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
            a<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> t<span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>As you can see, the majority of the algorithm has remained intact.  As stated above, a new class is created that extends RecursiveAction, and the parameters of the function are passed into that class during creation.  One thing to note is that the earlier version allocated secondary storage only half the size of the original array.  Now temporary storage the full length of the array is allocated.  This keeps different threads from needing the same area of the temporary array at the same time.</p>
<p>Some changes to the algorithm may be needed, but the framework definitely makes it easier to move to parallel processing.  One other thing to note is the presence of the <a title="ForkJoinPool" href="http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166ydocs/jsr166y/ForkJoinPool.html">ForkJoinPool</a>.  The default constructor sets the level of parallelism to the number of available processors.</p>
<p>I have a quad core CPU, so the ForkJoinPool will use up to four worker threads if necessary.  That said, I&#8217;ve seen cases where only two threads were spawned because more than that was not necessary for the given task.  The ForkJoinPool spawns more threads as needed rather than starting right at the maximum.</p>
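<p>To check what the pool picks on a particular machine, a small sketch along these lines can help.  It assumes the JDK 7 package name java.util.concurrent (with the jsr166y jar on Java 6 the import would be jsr166y.ForkJoinPool instead), and PoolSizeDemo is a made-up class name used only for illustration.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">import java.util.concurrent.ForkJoinPool;

// Hypothetical demo class: reports the parallelism a ForkJoinPool was built with.
final class PoolSizeDemo {

    // Parallelism chosen by the no-argument constructor.
    static int defaultParallelism() {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.getParallelism();
        } finally {
            pool.shutdown();
        }
    }

    // Parallelism when the caller asks for an explicit number of workers.
    static int explicitParallelism(int workers) {
        ForkJoinPool pool = new ForkJoinPool(workers);
        try {
            return pool.getParallelism();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("available processors: " + Runtime.getRuntime().availableProcessors());
        System.out.println("default parallelism:  " + defaultParallelism());
        System.out.println("explicit parallelism: " + explicitParallelism(2));
    }
}</pre></div></div>

<p>Keep in mind that getParallelism() reports the target level, not the number of threads that have actually been started.  getPoolSize() reports the latter, which would be consistent with seeing only two threads on a four core machine.</p>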
<p>A complete API for the fork/join framework can be found at the Concurrency <a title="JSR-166 Interest Site" href="http://gee.cs.oswego.edu/dl/concurrency-interest/">JSR-166 Interest Site</a>.  All that is needed to use it on Java 6 is the jsr166y package.</p>
<p>Some other algorithms suited for parallelism that I&#8217;ve been thinking about are graph searching algorithms such as depth first and breadth first search.  Whether they run on a tree or a general graph determines how much the underlying data structure will need to change to support the parallelism.  I plan to look at making a parallel version of the quicksort algorithm using this framework.  Most divide and conquer algorithms can be adapted fairly easily to be multi-threaded using this method, but remember that for a performance benefit to be seen, the task must be sufficiently large.</p>
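<p>The point about task size can be made concrete with a sequential cutoff.  The sketch below is not taken from the merge sort above; DoubleAction and its THRESHOLD of 10,000 elements are assumptions chosen for illustration, and a sensible cutoff has to be found by measuring.  It again assumes the JDK 7 package name java.util.concurrent.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical example: doubles every element of an array in parallel.
final class DoubleAction extends RecursiveAction {
    // Assumed cutoff; at or below this size the work is done sequentially.
    static final int THRESHOLD = 10000;

    private final long[] data;
    private final int lo;   // inclusive
    private final int hi;   // exclusive

    DoubleAction(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo &lt;= THRESHOLD) {
            // Small enough: forking would likely cost more than it saves.
            for (int i = lo; i &lt; hi; i++) {
                data[i] *= 2;
            }
            return;
        }
        int mid = lo + (hi - lo) / 2;
        // Run both halves, possibly on different worker threads, and wait for both.
        invokeAll(new DoubleAction(data, lo, mid), new DoubleAction(data, mid, hi));
    }

    // Convenience entry point: creates a pool, runs the task, and shuts the pool down.
    static void doubleAll(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            pool.invoke(new DoubleAction(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }
}</pre></div></div>

<p>Each half works on its own index range, so no two tasks touch the same element.  With an array smaller than the cutoff the whole job runs on a single thread, which is the case where the framework brings no benefit.</p>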
]]></content:encoded>
			<wfw:commentRss>http://blog.quibb.org/2010/03/jsr-166-the-java-forkjoin-framework/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Setting up a VirtualBox LAMP Server</title>
		<link>http://blog.quibb.org/2010/02/setting-up-a-virtualbox-lamp-server/</link>
		<comments>http://blog.quibb.org/2010/02/setting-up-a-virtualbox-lamp-server/#comments</comments>
		<pubDate>Wed, 17 Feb 2010 03:20:38 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[lamp]]></category>
		<category><![CDATA[virtualbox]]></category>

		<guid isPermaLink="false">http://blog.quibb.org/?p=144</guid>
		<description><![CDATA[Introduction
I recently decided to play around with web development a little bit.  Not being familiar with setting up a web server, I decided to setup a VirtualBox LAMP server.  Since I couldn&#8217;t find a good guide that went through all the steps of setting up a VirtualBox LAMP Server in one place, I [...]]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>I recently decided to play around with web development a little bit.  Not being familiar with setting up a web server, I decided to set up a <a title="VirtualBox" href="http://www.virtualbox.org/">VirtualBox</a> <a title="LAMP" href="http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29">LAMP</a> server.  Since I couldn&#8217;t find a good guide that went through all the steps of setting up a VirtualBox LAMP Server in one place, I decided to write about my experience.  I wanted a LAMP Server that I could access from any machine on my local network.  In retrospect, it isn&#8217;t very hard to do, but having all the information in one place is nice.</p>
<h2>Installing VirtualBox</h2>
<p>Start by installing VirtualBox.  The open source edition (OSE) should be good enough for the purposes of this guide.  I installed the full edition on <a title="openSuSE 11.2" href="http://www.opensuse.org/en/">openSuSE 11.2</a> and ran into some issues.  The issue I had was solved by running <strong>sudo chmod +x /usr/lib/virtualbox/VirtualBox</strong> and then removing the <strong>/tmp/*vbox*</strong> files.  Generally speaking, it&#8217;s fairly easy to install VirtualBox on Windows.  When creating a new virtual machine, I allocated 512MB of RAM and 12GB of HD space.</p>
<p>For additional Linux troubleshooting look here (It applies to more than just OpenSuSE): <a title=" Sun’s VirtualBox on OpenSUSE 11.1" href="http://ryan.rawswift.com/2009/01/27/suns-virtualbox-on-opensuse-111/">Sun’s VirtualBox on OpenSUSE 11.1</a></p>
<h2>Installing LAMP</h2>
<p>I chose Ubuntu for my LAMP server largely because there are many documents on how to set up a LAMP server on top of Ubuntu.  I&#8217;ll do a quick overview here, and provide a link to <a title="setting up a LAMP server on Ubuntu (Hardy Heron)" href="http://www.ubuntugeek.com/ubuntu-804-hardy-heron-lamp-server-setup.html">setting up a LAMP server on Ubuntu (Hardy Heron)</a>.  I liked this guide more than the guide for Jaunty because this one tells you to install the <a title="OpenSSH" href="http://www.openssh.com/">OpenSSH</a> server, and being able to administer the VM remotely is a good idea.</p>
<p>Start by downloading the <a title="Ubuntu Server Edition" href="http://www.ubuntu.com/products/whatIsubuntu/serveredition">Ubuntu Server Edition</a>.  I tried downloading and installing Hardy Heron, which is the latest Long Term Support (LTS) release, but I kept getting a Kernel Panic when trying to boot in VirtualBox 3.1.2.  It may have been the combination of VirtualBox version and Ubuntu version.  Eventually, I ended up going with Karmic Koala.  The installation process is almost identical to the Hardy Heron installation, and it provides both the LAMP and OpenSSH options that the guide suggests.</p>
<h2>Network Configuration</h2>
<p>Click on your newly created virtual machine, and open the settings dialog.  Then click the <em>Network</em> settings area.  For Adapter 1, make sure that the <em>Enable Network Adapter</em> checkbox is checked.  Adapter 1 is attached to <em>NAT</em> by default; switch it to <em>Bridged Adapter</em> so the virtual machine looks like a regular PC to the rest of your network.  It will acquire an IP address from your router, like a normal computer.  If you only want your host computer to be able to access it, the <em>Host-only Adapter</em> option seems like an appropriate choice, but I did not use or test this option.  After changing the setting, start up the virtual machine.  To get the IP of the virtual machine, use the <strong>ifconfig</strong> command.  If you point your browser at that IP, you should see the Apache welcome page.</p>
<h3>Static IP</h3>
<p>Setting a static IP address for the virtual machine is a good idea so you can always access the same IP address.  These are the instructions for setting a static IP in Ubuntu.</p>
<p>Edit the /etc/network/interfaces file using vim or nano:</p>
<pre>sudo [your_editor] /etc/network/interfaces</pre>
<p>Find this line:</p>
<pre>iface eth0 inet dhcp</pre>
<p>Change it to:</p>
<pre>iface eth0 inet static
address 192.168.1.99
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
gateway 192.168.1.1</pre>
<p>These settings worked well for my Linksys router.  By default, the DHCP service the router provides hands out IP addresses starting at 192.168.1.100, so 192.168.1.99 falls outside of that range.  The Linksys router defaults to using 192.168.1.1 for its own IP, which is why the gateway is set to that.</p>
<h2>Pure-FTPd FTP Server</h2>
<p>The last step is setting up an FTP Server on it so you can easily transfer files.  For this I chose <a title="Pure-FTPd" href="http://www.pureftpd.org/project/pure-ftpd">Pure-FTPd</a> because the project prides itself on being easy to configure.  It largely worked right out of the box without any configuration.</p>
<p>To install it:</p>
<pre>sudo apt-get install pure-ftpd</pre>
<p>Some Pure-FTPd configuration:</p>
<pre>CD to the configuration directory located here:
/etc/pure-ftpd/conf

Set display dot files to on (so you can see your .htaccess file):
echo yes | sudo tee DisplayDotFiles

Restart Pure-FTPd:
sudo /etc/init.d/pure-ftpd restart

Get your user connected to the /var/www directory:
CD to your home folder and create a symbolic link to /var/www
ln -s /var/www www

Change ownership of /var/www to your user, so you can write to this directory:
sudo chown -R [your_user] /var/www

Change to 755 permissions
chmod -R 755 /var/www</pre>
<p>You should now be able to connect to the FTP server from anywhere on your network by pointing your FTP client at: 192.168.1.99 (or any IP you may have chosen).  It should have no problem running PHP files.</p>
<p>If any part of this short guide was confusing or didn&#8217;t work, leave a comment so I can look into it and update the guide.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.quibb.org/2010/02/setting-up-a-virtualbox-lamp-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NLTK Regular Expression Parser (RegexpParser)</title>
		<link>http://blog.quibb.org/2010/01/nltk-regular-expression-parser-regexpparser/</link>
		<comments>http://blog.quibb.org/2010/01/nltk-regular-expression-parser-regexpparser/#comments</comments>
		<pubDate>Wed, 27 Jan 2010 13:53:35 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[nlp]]></category>
		<category><![CDATA[nltk]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://blog.quibb.org/?p=122</guid>
		<description><![CDATA[The Natural Language Toolkit (NLTK) provides a variety of tools for dealing with natural language.  One such tool is the Regular Expression Parser.  If you&#8217;re familiar with regular expressions, it can be a useful tool in natural language processing.
Background Information
You must first be familiar with regular expressions to be able to fully  utilize the [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.nltk.org/">Natural Language Toolkit (NLTK)</a> provides a variety of tools for dealing with natural language.  One such tool is the Regular Expression Parser.  If you&#8217;re familiar with regular expressions, it can be a useful tool in <a title="natural language processing" href="http://en.wikipedia.org/wiki/Natural_language_processing">natural language processing</a>.</p>
<h2>Background Information</h2>
<p>You must first be familiar with regular expressions to be able to fully  utilize the <a title="RegexpParser" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.chunk.regexp.RegexpParser-class.html">RegexpParser</a>/<a title="RegexpChunkParser" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.chunk.regexp.RegexpChunkParser-class.html">RegexpChunkParser</a>.  If you need to learn about regular expressions, here is a site with an abundance of information to get you started: <a title="Regular Expressions" href="http://www.regular-expressions.info">http://www.regular-expressions.info</a>.  It is also necessary to know how to use a tagger, and what the tags mean.  A <a title="tagger" href="http://en.wikipedia.org/wiki/Part-of-speech_tagging">tagger</a> is a tool that marks each word in a sentence with its part of speech.  Here is a small comparison I did of python taggers: <a title="NLTK vs MontyLingua Part of Speech Taggers" href="http://blog.quibb.org/2009/03/nltk-vs-montylingua-part-of-speech-taggers/">NLTK vs MontyLingua Part of Speech Taggers</a>.  The NLTK RegexpParser works by running regular expressions on top of the part of speech tags added by a tagger.  The <a title="Brown Corpus Tags" href="http://en.wikipedia.org/wiki/Brown_Corpus#Part-of-speech_tags_used">Brown Corpus tags</a> will be the tags used throughout the rest of this post, and are commonly used by taggers in general.  On a side note, the RegexpParser can be used with either the NLTK or <a title="MontyLingua" href="http://en.wikipedia.org/wiki/MontyLingua">MontyLingua</a> tagger.</p>
<h2>Basic RegexpParser Usage</h2>
<p>Let me start by going over the &#8220;how to&#8221; provided in the NLTK documentation.  The source of this information is here: <a title="NLTK  RegexParser HowTo" href="http://nltk.googlecode.com/svn/trunk/doc/howto/chunk.html">NLTK  RegexParser HowTo</a>.  The documentation goes through how you could use the RegexpParser/RegexpChunkParser to do a traditional parse of a sentence.</p>
<p>The RegexpParser/RegexpChunkParser works by defining rules for grouping different words together.  A simple example would be: &#8220;NP: {&lt;DT&gt;? &lt;JJ&gt;* &lt;NN&gt;*}&#8221;.  This defines a rule for grouping words into a noun phrase.  It will group an optional determiner (usually an article), then zero or more adjectives, followed by zero or more nouns.  In the how to, they go over prepositions and creating prepositional phrases from a preposition and a noun phrase.  It&#8217;s important to note that chunks defined by earlier rules can be used in later ones.  Also, the regular expression syntax can occur within the tags or apply to the tags themselves.</p>
<p>Here is the example from the NLTK website:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #dc143c;">parser</span> = RegexpParser<span style="color: black;">&#40;</span><span style="color: #483d8b;">''</span><span style="color: #483d8b;">'
    NP: {&lt;DT&gt;? &lt;JJ&gt;* &lt;NN&gt;*} # NP
    P: {&lt;IN&gt;}           # Preposition
    V: {&lt;V.*&gt;}          # Verb
    PP: {&lt;P&gt; &lt;NP&gt;}      # PP -&gt; P NP
    VP: {&lt;V&gt; &lt;NP|PP&gt;*}  # VP -&gt; V (NP|PP)*
    '</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span></pre></div></div>
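
<p>To make the idea of running a regular expression over tags more concrete, here is a rough sketch in Java using java.util.regex.  It only illustrates the concept; it is not how NLTK implements its chunker, and the class TagPatternDemo and its methods are hypothetical.  It also requires at least one noun (&lt;NN&gt;+ rather than &lt;NN&gt;*) so that the pattern never matches an empty string.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the idea only; this is not NLTK code.
final class TagPatternDemo {

    // Joins tags into one string such as "&lt;DT&gt;&lt;JJ&gt;&lt;NN&gt;" so a regular expression can scan it.
    static String encode(String[] tags) {
        StringBuilder sb = new StringBuilder();
        for (String tag : tags) {
            sb.append('&lt;').append(tag).append('&gt;');
        }
        return sb.toString();
    }

    // Returns every stretch of tags matching: optional DT, any number of JJ, one or more NN.
    static List&lt;String&gt; nounPhrases(String[] tags) {
        Pattern np = Pattern.compile("(&lt;DT&gt;)?(&lt;JJ&gt;)*(&lt;NN&gt;)+");
        Matcher m = np.matcher(encode(tags));
        List&lt;String&gt; found = new ArrayList&lt;String&gt;();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}</pre></div></div>

<p>Called with the tags DT, JJ, NN, VBZ, IN, DT, NN, the nounPhrases method finds two stretches: the first three tags and the last two.  The words themselves never enter the match, only their tags.</p>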

<h2>Alternative RegexpParser Usage</h2>
<p>I call this an alternate usage because it can be used to find patterns that aren&#8217;t necessarily related to grammatical phrases in English.  It can be used to find any pattern in a sentence.  Let me start by showing the regular expression grammar from my program.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">grammar = <span style="color: #483d8b;">&quot;&quot;&quot;
	NP:   {&lt;PRP&gt;?&lt;JJ.*&gt;*&lt;NN.*&gt;+}
	CP:   {&lt;JJR|JJS&gt;}
	VERB: {&lt;VB.*&gt;}
	THAN: {&lt;IN&gt;}
	COMP: {&lt;DT&gt;?&lt;NP&gt;&lt;RB&gt;?&lt;VERB&gt;&lt;DT&gt;?&lt;CP&gt;&lt;THAN&gt;&lt;DT&gt;?&lt;NP&gt;}
	&quot;&quot;&quot;</span>
<span style="color: #008000;">self</span>.<span style="color: black;">chunker</span> = RegexpParser<span style="color: black;">&#40;</span>grammar<span style="color: black;">&#41;</span></pre></div></div>

<p>I was using it to look for a specific pattern in a sentence.  The first part, NP, is looking for a noun phrase.  The &lt;PRP&gt;? is there because of a bug found in the tagger I was using.  It was marking An with a capital &#8216;A&#8217; as a PRP (Pronoun) rather than a DT (Determiner/Article).  I found another workaround for the bug, but left the PRP in there to catch anything that might have slipped through.</p>
<p>Then it moves on to the CP, which is the comparison word.  JJR tagged words are comparative adjectives, such as bigger, smaller, and larger.  JJS tagged words are superlative adjectives, such as biggest, smallest, and largest.</p>
<p>The next two are simply the VERB and the word THAN.  The VERB could be a compound verb, so there could be one or more verbs present.  The IN tag denotes a preposition.  In this case, I was looking specifically for the word than.</p>
<p>The last line is COMP.  This is the regular expression that puts it all together.  This was looking for a size comparison of two objects.  It might be easier to look at the output of this part of the expression than trying to explain it piece by piece.  The only tag not explained above is RB, which is an adverb.</p>
<p>Here is the parse for the sentence &#8220;Everyone knows an elephant is larger than a dog.&#8221;:</p>
<pre>
(S
  (NP everyone/NN)
  (VERB knows/VBZ)
  (COMP
    an/DT
    (NP elephant/NN)
    (VERB is/VBZ)
    (CP larger/JJR)
    (THAN than/IN)
    a/DT
    (NP dog/NN))
  ./.)
</pre>
<p>The output is a simple tree, which makes for easy data extraction.  It&#8217;s easy to see there are many possibilities that open up when looking for patterns in English text.  May this help you in your data mining endeavors.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.quibb.org/2010/01/nltk-regular-expression-parser-regexpparser/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Sort Optimization (Part 2) with JDK 6 vs JDK 7</title>
		<link>http://blog.quibb.org/2009/12/sort-optimization-part-2-with-jdk-6-vs-jdk-7/</link>
		<comments>http://blog.quibb.org/2009/12/sort-optimization-part-2-with-jdk-6-vs-jdk-7/#comments</comments>
		<pubDate>Wed, 23 Dec 2009 15:00:28 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[shootout]]></category>
		<category><![CDATA[sorting]]></category>

		<guid isPermaLink="false">http://blog.quibb.org/?p=94</guid>
		<description><![CDATA[In part 1, I went over my first foray into the world of sorting algorithms.  Since then, I&#8217;ve had some other ideas on how to improve my quicksort implementation.  One idea that I had while originally working on the sorting algorithm, was to rework the partition function to take into account duplicate elements.  I had [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://blog.quibb.org/2008/11/sort-optimization/">part 1</a>, I went over my first foray into the world of sorting algorithms.  Since then, I&#8217;ve had some other ideas on how to improve my quicksort implementation.  One idea that I had while originally working on the sorting algorithm, was to rework the partition function to take into account duplicate elements.  I had a few different working implementations, but all of them came with severe performance penalty.  I finally figured out a way to get performance close to the previous algorithm.</p>
<p>The partition function needs to perform the minimal number of swaps possible.  So moving towards the center from both ends and only swapping when both are out of order is the best approach I&#8217;ve found so far.  When grouping duplicate elements, they are swapped to the beginning of the partition area as they are found.  Then at the end, a pass is run to move them to their correct location in the final list.  Then instead of returning one number from the partition function, it returns two.  It returns the minimum and maximum indices on the range that has the pivot value.</p>
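<p>The scheme described above (parking duplicates at the front and moving them in a final pass) isn&#8217;t reproduced here.  As a simpler stand-in that shows the same two-index return value, here is a sketch of the classic three-way (&#8220;Dutch national flag&#8221;) partition.  ThreeWayPartition is a hypothetical class, it uses the first element as the pivot purely for brevity, and it will generally perform more swaps than the approach described above.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">// Hypothetical helper, not the QSortv2 code itself.
final class ThreeWayPartition {

    // Partitions a[lo..hi] (both inclusive) around the value a[lo].
    // Returns {lt, gt} such that afterwards:
    //   a[lo..lt-1] is less than the pivot,
    //   a[lt..gt]   equals the pivot,
    //   a[gt+1..hi] is greater than the pivot.
    static &lt;T extends Comparable&lt;? super T&gt;&gt; int[] partition(T[] a, int lo, int hi) {
        T pivot = a[lo];
        int lt = lo;      // next slot for an element smaller than the pivot
        int i = lo + 1;   // next element to examine
        int gt = hi;      // next slot for an element larger than the pivot
        while (i &lt;= gt) {
            int cmp = a[i].compareTo(pivot);
            if (cmp &lt; 0) {
                swap(a, lt++, i++);
            } else if (cmp &gt; 0) {
                swap(a, i, gt--);
            } else {
                i++;
            }
        }
        return new int[] { lt, gt };
    }

    private static void swap(Object[] a, int i, int j) {
        Object t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}</pre></div></div>

<p>A quicksort built on this only needs to recurse into the ranges below lt and above gt, so a long stretch of equal keys is never partitioned again.</p>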
<p>Another area that I was able to get some performance gain out of was getting rid of the shell sort form the first algorithm.  While that was there to make sure the quicksort did not recurse too deeply, in practice the shell sort algorithm doesn&#8217;t run.</p>
<p><strong>Results</strong></p>
<p>Here are the results of JDK 6 MergeSort, <a href="http://en.wikipedia.org/wiki/Timsort">Tim Sort</a>, <a href="http://blog.quibb.org/2008/11/sort-optimization/">QSort</a>, QSortv2, and Dual Pivot sort 2 benchmarked on the same set of files.  Overall, the new version doesn&#8217;t outperform the old version, but I thought it was worth posting my findings.  On most data sets with duplicates it does perform better.  I ran these benchmarks on OpenJDK 7 because I was curious as to how they would compare to one another.</p>
<p>It&#8217;s important to note that the tables show speedup relative to the Java implementation on the given JDK.  The graphs show the average runtimes for each algorithm.  I used the average runtime there because it shows the performance difference between Sun&#8217;s JDK 6 and OpenJDK 7 build 73.<br />
<center></p>
<table>
<tbody>
<tr>
<td>
<div id="attachment_105" class="wp-caption alignnone" style="width: 122px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/JDK6nowarm.png"><img class="size-medium wp-image-105" title="Sun JDK 6 without Warmup" src="http://blog.quibb.org/wp-content/uploads/2009/12/JDK6nowarm-112x300.png" alt="Sun JDK 6 without Warmup" width="112" height="300" /></a><p class="wp-caption-text">Sun JDK 6 without Warmup</p></div></td>
<td>
<p><div id="attachment_106" class="wp-caption alignnone" style="width: 106px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/JDK7nowarm.png"><img class="size-medium wp-image-106 " title="OpenJDK 7 without Warmup" src="http://blog.quibb.org/wp-content/uploads/2009/12/JDK7nowarm-96x300.png" alt="Sun JDK 7 without Warmup" width="96" height="300" /></a><p class="wp-caption-text">OpenJDK 7 without Warmup</p></div></td>
</tr>
</tbody>
</table>
<p></center><br />
<center></p>
<table>
<tbody>
<tr>
<td>
<p><div id="attachment_107" class="wp-caption alignnone" style="width: 174px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/JDK61000warm.png"><img class="size-medium wp-image-107" title="Sun JDK 6 1000 Warmup Iterations" src="http://blog.quibb.org/wp-content/uploads/2009/12/JDK61000warm-164x300.png" alt="Sun JDK 6 1000 Warmup Iterations" width="164" height="300" /></a><p class="wp-caption-text">Sun JDK 6 1000 Warmup Iterations</p></div></td>
<td>
<p><div id="attachment_108" class="wp-caption alignnone" style="width: 151px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/JDK71000warm.png"><img class="size-medium wp-image-108" title="OpenJDK 7 1000 Warmup Iterations" src="http://blog.quibb.org/wp-content/uploads/2009/12/JDK71000warm-141x300.png" alt="OpenJDK 7 1000 Warmup Iterations" width="141" height="300" /></a><p class="wp-caption-text">OpenJDK 7 1000 Warmup Iterations</p></div></td>
</tr>
</tbody>
</table>
<p></center><br />
<div id="attachment_109" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/nowarmup.png"><img class="size-medium wp-image-109  " title="Sun JDK 6 vs OpenJDK 7 without Warmup" src="http://blog.quibb.org/wp-content/uploads/2009/12/nowarmup-300x249.png" alt="JDK 6 vs JDK 7 with No Warmup" width="300" height="249" /></a><p class="wp-caption-text">Sun JDK 6 vs OpenJDK 7 without Warmup</p></div>
<div id="attachment_104" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.quibb.org/wp-content/uploads/2009/12/1000warm.png"><img class="size-medium wp-image-104 " title="Sun JDK 6 vs OpenJDK 7 1000 Iterations of Warmup" src="http://blog.quibb.org/wp-content/uploads/2009/12/1000warm-300x248.png" alt="Sun JDK 6 vs OpenJDK 7 1000 Warmup Iterations" width="300" height="248" /></a><p class="wp-caption-text">Sun JDK 6 vs OpenJDK 7 1000 Iterations of Warmup</p></div>
<p><strong>Conclusions</strong></p>
<p>Overall, the new version of the QSort implementation doesn&#8217;t improve greatly over the previous implementation.  While it didn&#8217;t work out to be the performance improvement I was looking for, I think the last graph, with 1000 iterations of warmup for each algorithm, is the most interesting.  The QSort v2 implementation apparently doesn&#8217;t get handled any better by OpenJDK 7.  The partition function is larger after my changes, so perhaps it didn&#8217;t JIT very well.  What is interesting is the boost that Tim Sort saw with the change of JDKs.  Running these benchmarks made me realize that upgrading my Java Runtime should improve the performance of my Java applications.  It will be interesting to see if the performance carries over to NetBeans and Eclipse; I expect it will.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.quibb.org/2009/12/sort-optimization-part-2-with-jdk-6-vs-jdk-7/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sorting Algorithm Shootout</title>
		<link>http://blog.quibb.org/2009/10/sorting-algorithm-shootout/</link>
		<comments>http://blog.quibb.org/2009/10/sorting-algorithm-shootout/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 14:00:55 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[shootout]]></category>
		<category><![CDATA[sorting]]></category>

		<guid isPermaLink="false">http://blog.quibb.org/?p=75</guid>
		<description><![CDATA[ Since I did my Sort Optimization post, I&#8217;ve been keeping an eye on things that happen in the sorting world.  Recently an article popped up on Reddit about someone wanting to replace the JDK sorting algorithm with a Dual Pivot Quick Sort.  This lead to the discovery that Tim Sort would be replacing Merge [...]]]></description>
			<content:encoded><![CDATA[<p><strong> </strong>Since I did my <a href="http://blog.quibb.org/2008/11/sort-optimization/">Sort Optimization</a> post, I&#8217;ve been keeping an eye on things that happen in the sorting world.  Recently an article popped up on Reddit about someone wanting to replace the JDK sorting algorithm with a Dual Pivot Quick Sort.  This lead to the discovery that Tim Sort would be replacing Merge Sort in the JDK starting with version 7.  This probably got some attention because of the <a href="http://openjdk.java.net/">OpenJDK</a> project.  It&#8217;s nice to see that allowing more developers to work on different areas of the JDK.  First I&#8217;ll do a quick overview of the algorithms, then show some benchmarks.  All algorithms are written in Java.</p>
<p><strong>JDK 6 Sort</strong></p>
<p>JDK 6 implements a fairly standard merge sort.  It switches to an insertion sort once a subarray is small enough (fewer than seven elements).</p>
<p><strong>QSort</strong></p>
<p>This is the implementation of quicksort I outlined in the earlier blog post.  It performed admirably at the time, but how will it hold up against tougher competition?  It&#8217;s pretty much an iterative quicksort that short-circuits to a shell sort if it&#8217;s going too deep.</p>
<p>Original QSort Post:<br />
<a href="http://blog.quibb.org/2008/11/sort-optimization/">Sort Optimization</a></p>
<p><strong>Tim Sort</strong></p>
<p>This is an optimized, stable variation of merge sort.  It is not strictly in place, but it needs temporary storage for at most half of the array.  Tim Peters developed this sorting algorithm for the Python programming language.  It is in use by Python and will be used by Java starting with JDK 7.  It takes advantage of partially sorted parts of the list.</p>
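<p>To give a feel for how presorted stretches (usually called runs) can be detected, here is a simplified sketch of that one step.  It is not the JDK or Python code; RunFinder and leadingRun are hypothetical names, and everything Tim Sort does with the runs afterwards is left out.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">// Hypothetical helper showing run detection only.
final class RunFinder {

    // Returns the length of the run that starts at a[lo], looking no further than a[hi - 1].
    // A run is either non-descending or strictly descending; a descending run is
    // reversed in place so the caller always ends up with an ascending run.
    static &lt;T extends Comparable&lt;? super T&gt;&gt; int leadingRun(T[] a, int lo, int hi) {
        if (hi - lo &lt; 2) {
            return hi - lo;   // zero or one element is trivially a run
        }
        int end = lo + 1;
        if (a[end].compareTo(a[lo]) &lt; 0) {
            // Strictly descending: extend the run, then reverse it.
            end++;
            while (end &lt; hi &amp;&amp; a[end].compareTo(a[end - 1]) &lt; 0) {
                end++;
            }
            for (int i = lo, j = end - 1; i &lt; j; i++, j--) {
                T t = a[i];
                a[i] = a[j];
                a[j] = t;
            }
        } else {
            // Non-descending: just extend the run.
            end++;
            while (end &lt; hi &amp;&amp; a[end].compareTo(a[end - 1]) &gt;= 0) {
                end++;
            }
        }
        return end - lo;
    }
}</pre></div></div>

<p>Only strictly descending runs are reversed, so equal elements never change their relative order, which is what a stable sort needs.  On already sorted input the first run covers the whole array.</p>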
<p>Available here:<br />
<a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6804124">http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6804124</a><br />
<a href="http://hg.openjdk.java.net/jdk7/tl/jdk/rev/bfd7abda8f79">http://hg.openjdk.java.net/jdk7/tl/jdk/rev/bfd7abda8f79</a><br />
<strong><br />
Dual Pivot Quick sort</strong></p>
<p>This is a newcomer to the sorting table.  It was developed by Vladimir Yaroslavskiy for inclusion in the JDK.  The premise is the same as quick sort, only it chooses two pivot points rather than one.  He did a full writeup detailing the algorithm and its benefits.  I modified it to take the Comparable interface, although Vladimir explicitly said this was not the intended target of the algorithm; he has stated it is designed to work directly with primitive types.  I don&#8217;t see how doing an int comparison vs an Integer.compareTo() would make a difference, as long as one or the other is used uniformly across all algorithms.  Since my sorting algorithm works with Comparable, as does Tim Sort, I chose to convert this algorithm to use the Comparable interface as well.</p>
<p>Available here:<br />
<a href="http://article.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628">http://article.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628</a></p>
<p><strong>Results</strong></p>
<p>These tables show the speedup relative to JDK 6 with and without warm up.</p>
<div id="attachment_83" class="wp-caption alignnone" style="width: 123px"><a href="http://blog.quibb.org/wp-content/uploads/2009/10/nowarm_server.png"><img class="size-medium wp-image-83" title="nowarm_server" src="http://blog.quibb.org/wp-content/uploads/2009/10/nowarm_server-113x300.png" alt="nowarm_server" width="113" height="300" /></a><p class="wp-caption-text">sorting algorithm speedup without warm up</p></div>
<div id="attachment_84" class="wp-caption alignnone" style="width: 175px"><a href="http://blog.quibb.org/wp-content/uploads/2009/10/10000warm_server.png"><img class="size-medium wp-image-84" title="10000warm_server" src="http://blog.quibb.org/wp-content/uploads/2009/10/10000warm_server-165x300.png" alt="sorting algorithm comparison with warmup" width="165" height="300" /></a><p class="wp-caption-text">sorting algorithm speedup with warm up</p></div>
<p>Here is the original text data if you&#8217;re interested in that.  These are in simple table format.  The columns store the runtime in seconds for each algorithm.  The number in parentheses is the speedup relative to JDK 6.</p>
<p><a href="http://blog.quibb.org/wp-content/uploads/2009/10/results_nowarm_server.txt">Without Warmup</a></p>
<p><a href="http://blog.quibb.org/wp-content/uploads/2009/10/results_10000warm_server.txt">With Warmup</a></p>
<p>Tim Sort is definitely the way to go if you&#8217;re interested in a stable sorting algorithm.  I was pretty amazed when I first looked at the results at how well it actually did.  It really takes advantage of any presorted parts of the lists.  Overall, I&#8217;d say my optimized quicksort does fairly well, but maybe it could do better.  I may have to look into that again.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.quibb.org/2009/10/sorting-algorithm-shootout/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
