Friday, December 17, 2010

Parallel Quicksort in ParaSail and "slicing"

Quicksort is a classic "divide-and-conquer" algorithm, and lends itself naturally to parallelization.  However, unlike some of the other algorithms we have discussed in past postings, it updates a data structure rather than merely walking it.  This means that we need a way to safely partition the array to be sorted, so that the ParaSail compiler knows the threads are manipulating non-overlapping parts of the array and can be certain that no race conditions are possible.

In several languages there is the notion of an array slice.  In particular, in Fortran and Ada there is built-in syntax for selecting a contiguous part of an array.  In newer versions of Fortran this can be done on multi-dimensional arrays.  In Ada slicing is limited to single-dimensional arrays.  In both Fortran and Ada, array slices are updatable, that is they effectively represent a view of the underlying array.  APL (not surprisingly!) has all of these capabilities and more.  In many other languages there are operations for selecting a slice of a string (and in some cases arrays), but these are generally not updatable views, and hence cannot be used on the left-hand side of an assignment nor be passed as an in out parameter.  In some cases there are explicit "replace-slice" operations which do allow for updating a slice of an array.
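The copy-versus-view distinction can be made concrete in Python (used here only for contrast; it is not one of the languages above): a list slice is a copy, while a memoryview over a bytearray behaves more like a Fortran/Ada updatable view.

```python
# A list slice is a *copy*: writing through it does not affect the original.
data = [3, 1, 4, 1, 5]
copy = data[1:4]
copy[0] = 99
assert data[1] == 1            # original untouched
assert copy == [99, 4, 1]

# A memoryview over a bytearray is an *updatable view*, closer to
# Fortran/Ada slice semantics: writes are visible in the underlying object.
buf = bytearray(b"hello")
view = memoryview(buf)[1:4]
view[0] = ord("E")
assert buf == bytearray(b"hEllo")
```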

In any case, in ParaSail, because we expect the compiler to catch race conditions at compile time, operations that return a view of part of an object need to have well-defined properties so the compiler can correctly determine whether two different views have any overlap (i.e., are potentially aliased).  The simplest solution seems to be to recognize a "slicing" operator, similar to the "indexing" operator.  A "slicing" operator would take an indexable container, plus one or more operands at least one of which is an interval or a set, and return a view of the container representing the subset of components identified by the operand(s).  For example, this might be an interface for a One_Dim_Array with both "indexing" and "slicing" operators:
interface One_Dim_Array
  <Element_Type is Assignable<>;
   Index_Type is Discrete<>> is

    function Bounds(A : One_Dim_Array) -> Interval<Index_Type>; 

    function Create(Bounds : Interval<Index_Type>)
      -> One_Dim_Array {Bounds(Create) == Bounds};

    operator "indexing"(
       A : ref One_Dim_Array;
       Index : Index_Type {Index in Bounds(A)})
     -> ref Element_Type;

    operator "slicing"(
       A : ref One_Dim_Array;
       Slice : Interval<Index_Type> {Is_Subset(Slice, Bounds(A))})
     -> ref Result: One_Dim_Array {Bounds(Result) == Slice};
end interface One_Dim_Array; 

As with the "indexing" operator, the "slicing" operator would be invoked by using A[...] notation, but where one or more of the operands inside the [] would be intervals or sets.  The ParaSail compiler would impose requirements on implementations of "indexing" and "slicing" operators.  For a given "indexing" operator, if the operands are different in two different calls, then the resulting elements must be distinct.  For a given "slicing" operator, if the operands are non-overlapping sets or intervals, then the resulting slices must be non-overlapping.   Note that the implementor of the "slicing" operator can decide whether the bounds of the result correspond to those of the selecting interval, or slide back to always starting at zero or one.  In the above example, the bounds correspond to the original indices of the selected elements, as indicated by the postcondition on the "slicing" operator.  In an interface designed for string manipulation, it might be more convenient for the slice to have bounds starting at zero or one.
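As a rough illustration of this contract, here is a Python sketch (with invented names; this is not a ParaSail API) of a slicing operation that returns a view sharing storage with the parent, whose bounds equal the selecting interval rather than sliding back to zero:

```python
class OneDimArray:
    """Hypothetical 1-D array with explicit bounds, for illustration only."""
    def __init__(self, storage, first, last):
        self.storage = storage              # shared underlying list
        self.first, self.last = first, last

    def __getitem__(self, index):           # the "indexing" operator
        assert self.first <= index <= self.last, "index out of bounds"
        return self.storage[index]

    def __setitem__(self, index, value):
        assert self.first <= index <= self.last, "index out of bounds"
        self.storage[index] = value

    def slice(self, lo, hi):                # the "slicing" operator
        assert self.first <= lo and hi <= self.last, "slice out of bounds"
        # A view over the same storage; its bounds equal the selecting
        # interval, mirroring the postcondition Bounds(Result) == Slice.
        return OneDimArray(self.storage, lo, hi)

a = OneDimArray([0] * 10, first=0, last=9)
s = a.slice(3, 5)
s[3] = 42            # the slice keeps the original indices: 3, not 0
assert a[3] == 42    # writes through the view reach the parent
```

Two slices produced by non-overlapping intervals share no indices, which is exactly the property the compiler would rely on.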

Note that if there are multiple "indexing" operators, or multiple "slicing" operators for the same type, then they are presumed to be creating potentially overlapping elements or views, as they presumably represent different indexing schemes into the same data structure.  For example, a hash-table might be indexable both by the key and by some unique internal index.  That is, there is no guarantee that A["key"] and A[35] are unaliased just because "key" and 35 are distinct values, presuming these represent invocations of different "indexing" operators defined for the same indexable container.
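The hash-table case can be sketched in Python (hypothetical names, not a real library): two different indexing schemes over the same container can reach the same element even when their operands are distinct.

```python
# Hypothetical container indexable both by key and by internal index.
class Table:
    def __init__(self):
        self.cells = []            # storage; internal index = position
        self.by_key = {}           # key -> internal index

    def add(self, key, value):
        self.by_key[key] = len(self.cells)
        self.cells.append(value)

    def get_by_key(self, key):     # first "indexing" scheme
        return self.cells[self.by_key[key]]

    def get_by_index(self, idx):   # second "indexing" scheme
        return self.cells[idx]

t = Table()
t.add("key", ["payload"])
# Distinct operands ("key" vs 0) under *different* indexing schemes
# can still refer to the same cell:
assert t.get_by_key("key") is t.get_by_index(0)
```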

Note that in all languages, you can achieve the equivalent of slicing by passing in the entire array along with the bounds of the sub-array of interest.  However, this would not allow the compiler to determine as easily, by looking only at the call sites, whether two different calls are manipulating non-overlapping slices.  A goal with ParaSail is that the compiler can identify any possible race conditions by looking only at call sites and the associated pre- and postconditions.  This argues for having explicit ways of indicating in a precondition or postcondition what parts of an object are read, what parts are updated, and what parts are not referenced at all.  But this probably comes back to being able to specify slices of a container, if only in annotations. 

In any case, given the above notion of "slicing" operators, we are now in a position to write our parallel Quicksort:
interface Sorting<One_Dim_Array<>> is
    procedure Quicksort(Arr : ref var One_Dim_Array;
       function Before(Left, Right : One_Dim_Array::Element_Type) 
         -> Boolean is "<");
          // Sort Arr according to the sorting function "Before" which returns
          // True if Left must appear before Right in the sorted order.
          // Before returns False if Left = Right.
end interface Sorting;

class Sorting is
    procedure Quicksort(Arr : ref var One_Dim_Array;
       function Before(Left, Right : One_Dim_Array::Element_Type) 
         -> Boolean is "<")
        // Handle short arrays directly.  Partition longer arrays.
        case Length(Arr) of
          [0..1] => return;
          [2] => 
               if Before(Arr[Arr.Last], Arr[Arr.First]) then
                   // Swap elements
                   Arr[Arr.First] :=: Arr[Arr.Last];
               end if;
          [..] =>
               // Partition array
               const Mid := Arr[Arr.First + Length(Arr)/2];
               var Left : Index_Type := Arr.First;
               var Right : Index_Type := Arr.Last;
               until Left > Right loop
                  var New_Left : Index_Type := Right+1;
                  var New_Right : Index_Type := Left-1;
                  block
                    // Find item in left half to swap
                    for I in Left .. Right forward loop
                        if not Before(Arr[I], Mid) then
                            // Found an item that can go into right partition
                            New_Left := I;
                            if Before(Mid, Arr[I]) then
                                // Found an item that *must* go into right part
                                exit loop;
                            end if;
                        end if;
                    end loop;
                  ||
                    // Find item in right half to swap
                    for J in Left .. Right reverse loop
                        if not Before(Mid, Arr[J]) then
                            // Found an item that can go into left partition
                            New_Right := J;
                            if Before(Arr[J], Mid) then
                                // Found an item that *must* go into left part
                                exit loop;
                            end if;
                        end if;
                    end loop;
                  end block;
                 if New_Left > New_Right then
                     // Nothing more to swap
                     // Exit loop and recurse on two partitions
                     Left := New_Left;
                     Right := New_Right;
                     exit loop;
                 end if;
                 // Swap items
                 Arr[New_Left] :=: Arr[New_Right];
                 // continue looking for items to swap
                 Left := New_Left + 1;
                 Right := New_Right - 1;
               end loop;
               // At this point, "Right" is right end of left partition
               // and "Left" is left end of right partition
               // and the partitions don't overlap
               // and neither is the whole array
               // and everything in the left partition can precede Mid
               // and everything in the right partition can follow Mid
               // and everything between the partitions is equal to Mid.
               {Left > Right;
                Right < Arr.Last;
                Left > Arr.First;
                (for all I in Arr.First .. Right : not Before(Mid, Arr[I]));
                (for all J in Left .. Arr.Last : not Before(Arr[J], Mid));
                (for all K in Right+1 .. Left-1 : 
                  not Before(Mid, Arr[K]) and not Before(Arr[K], Mid))}
                // Recurse on two halves, in parallel
                block
                    Quicksort(Arr[Arr.First .. Right], Before);
                ||
                    Quicksort(Arr[Left .. Arr.Last], Before);
                end block;
        end case;        
   end procedure Quicksort;
end class Sorting;
We have used slicing as part of the recursion. Because the compiler can prove Left > Right, there is no overlap between the two slices, and hence no race condition.  It turns out that in ParaSail we could have written this without explicit recursion by using the "continue loop with ..." construct. We will save that for a later posting.
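For readers who want to experiment with the partitioning logic outside ParaSail, here is a rough Python transcription (sequential, since Python offers no compiler-checked disjoint slices; the function name and parameters are our own):

```python
def quicksort(arr, lo=0, hi=None, before=lambda x, y: x < y):
    """Sort arr[lo..hi] in place, mirroring the ParaSail partition scheme."""
    if hi is None:
        hi = len(arr) - 1
    if hi - lo < 1:                       # 0 or 1 elements: already sorted
        return
    mid = arr[lo + (hi - lo + 1) // 2]    # pivot value (copied, not an index)
    left, right = lo, hi
    while left <= right:
        new_left, new_right = right + 1, left - 1
        for i in range(left, right + 1):          # forward scan
            if not before(arr[i], mid):           # item can go right
                new_left = i
                if before(mid, arr[i]):           # item *must* go right
                    break
        for j in range(right, left - 1, -1):      # backward scan
            if not before(mid, arr[j]):           # item can go left
                new_right = j
                if before(arr[j], mid):           # item *must* go left
                    break
        if new_left > new_right:                  # nothing more to swap
            left, right = new_left, new_right
            break
        arr[new_left], arr[new_right] = arr[new_right], arr[new_left]
        left, right = new_left + 1, new_right - 1
    quicksort(arr, lo, right, before)             # left partition
    quicksort(arr, left, hi, before)              # right partition

data = [3, 1, 4, 1, 5, 9, 2, 6]
quicksort(data)
assert data == [1, 1, 2, 3, 4, 5, 6, 9]
```

In ParaSail the two recursive calls can run in parallel because the slices are provably disjoint; here they simply run one after the other.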


  1. Just fixed a bug in the partitioning, in case you were puzzled. Made the usual mistake of posting before actually running the code.

  2. To me, partitioning is a more natural operation than slicing, for recursive parallelism. Maybe that's just me, though ;-)

    A "partition" method of a container would take an index set (specified as either a contiguous range or as a more general set of indices), and return two (guaranteed nonoverlapping) slices of the container: the first corresponding to the index set, and the second to the negation of that index set. Laziness covers the case where the latter is infinite.

    Would "partition" make it easier to prove that parallel operations aren't writing to overlapping slices? You could still partition twice and make overlapping slices, but the typical use case is to partition once and recurse. Partition promises nonoverlapping, so the compiler wouldn't have to do work in that case to prove no race conditions.

    Of course, slices are useful in their own right, so it may not matter.
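    A sketch of the suggested "partition" operation, in Python with invented names (and ignoring the lazy/infinite case): given a container and a contiguous index range, return the index set and its complement, disjoint by construction.

```python
def partition(arr, lo, hi):
    """Split the indices of arr into [lo, hi) and everything outside it."""
    assert 0 <= lo <= hi <= len(arr)
    inside = list(range(lo, hi))
    outside = list(range(0, lo)) + list(range(hi, len(arr)))
    return inside, outside

inside, outside = partition([10, 20, 30, 40, 50], 1, 3)
assert inside == [1, 2]
assert outside == [0, 3, 4]
assert set(inside).isdisjoint(outside)   # nonoverlap is guaranteed
```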

  3. I agree that some more general "partitioning" scheme would be valuable. Do you have something specific in mind, or some examples of languages or frameworks that provide this? It would be interesting to see whether generalized partitioning might integrate nicely into the ParaSail model.

    Slices are nice because they are familiar to many programmers, so it seems worthwhile to include something like this generalized slicing capability in any case.

  4. Slices are definitely a more natural construct, and something programmers would want for other purposes. I was just thinking it would be nice to provide syntactic enforcement of disjointness of the two index sets. Of course it may not be hard to do that already, and I can imagine other ways to make disjoint index sets. So I may be inventing a new problem to solve ;-)

    The parallel iterators in Intel's Threading Building Blocks, like blocked_range, work by partitioning the iteration space recursively.

    In general, "Partition" is something people do with the nodes of graphs. Typical output is a set of disjoint sets of indices of nodes, satisfying particular criteria and/or maximizing some objective function. That's sufficiently complicated to be implemented in libraries, of course, but it's the most general kind of partitioning that makes sense to me.

  5. btw I'm not proposing you should do more work ;-) I'm just curious whether having a partitioning construct would help the compiler prove the lack of race conditions.

  6. OT to this post:
    1) Is Parasail going to have an open standard language spec such as Ada or will it be a pseudo-proprietary language such as C#? If the latter I would say don't bother since most embedded SW developers would never end up using it. Don't confuse "free" with "standard". Multiple vendors/providers are always needed to make a market.
    2) Is there a discussion forum?

  7. It is certainly a goal that ParaSail become an open standard. There is no desire to have the language be proprietary in any way.

    This blog is the only discussion forum so far. The URLs "", "", and "" have been acquired with the intent of using them for discussion forums, etc. Haven't quite gotten around to that yet!

    Somewhat amusingly, the junk mailers have already discovered that these new URLs were acquired, and have been sending (physical) junk mail to "Parasail Place". Sounds like a good name for the new web site!

  8. Yahoo or Google have discussion forum areas (groups) ready to set up and use - I would assume that would be much easier - maybe I am just old :) but I find trying to use a blog to understand and exchange technical ideas to be very limiting. I really think a blog is basically a web-based editorial.
    Maybe combining a blog with a discussion forum would lead to more input and feedback. I have been hoping for improved programming languages for embedded systems since Ada started losing favor in the late 1990s - currently stuck using C and C++ (not my choice). Of course we use Matlab, Perl, Ruby and Python for support tasks, so new languages have made inroads there.

  9. Thanks for the suggestion. In the near term comments on this blog can hopefully act somewhat as a forum. Once the first prototype implementation is available for folks to play with, we will make sure there is a forum for feedback and discussion. A Yahoo or Google group might be the best way to do that.

  10. Well it seemed easy enough to create the Google Group now rather than later, so if you want to comment further on ParaSail, you may now do so at: