Friday, December 17, 2010

Parallel Quicksort in ParaSail and "slicing"

Quicksort is a classic "divide-and-conquer" algorithm, and lends itself naturally to parallelization.  However, unlike some of the other algorithms we have discussed in past postings, it is updating rather than merely walking a data structure.  This means that we need a way to safely partition the array to be sorted so that the ParaSail compiler knows that the threads are manipulating non-overlapping parts of the array, so it can be certain there are no race conditions possible.

In several languages there is the notion of an array slice.  In particular, in Fortran and Ada there is built-in syntax for selecting a contiguous part of an array.  In newer versions of Fortran this can be done on multi-dimensional arrays.  In Ada slicing is limited to single-dimensional arrays.  In both Fortran and Ada, array slices are updatable, that is they effectively represent a view of the underlying array.  APL (not surprisingly!) has all of these capabilities and more.  In many other languages there are operations for selecting a slice of a string (and in some cases arrays), but these are generally not updatable views, and hence cannot be used on the left-hand side of an assignment nor be passed as an in out parameter.  In some cases there are explicit "replace-slice" operations which do allow for updating a slice of an array.

In any case, in ParaSail, because we expect the compiler to catch race-conditions at compile-time, operations that return a view of a part of an object need to have well-defined properties so the compiler can correctly determine whether two different views have any overlap (i.e. are potentially aliased).  The simplest solution seems to be to recognize a "slicing" operator, similar to the "indexing" operator.  A "slicing" operator would take an indexable container, and then one or more operands at least one of which is an interval or a set, and return a view of the container that represents the subset of components identified by the operand(s).  For example, this might be an interface for a One_Dim_Array with both "indexing" and "slicing" operators:
interface One_Dim_Array
  <Element_Type is Assignable<>;
   Index_Type is Discrete<>> is

    function Bounds(A : One_Dim_Array) -> Interval<Index_Type>; 

    function Create(Bounds : Interval<Index_type>)
      -> One_Dim_Array {Bounds(Create) == Bounds};

    operator "indexing"(
       A : ref One_Dim_Array;
       Index : Index_Type {Index in Bounds(A)})
     -> ref Element_Type;

    operator "slicing"(
       A : ref One_Dim_Array;
       Slice : Interval<Index_Type> {Is_Subset(Slice, Bounds(A))})
     -> ref Result: One_Dim_Array {Bounds(Result) == Slice};
    ... 
end interface One_Dim_Array; 

As with the "indexing" operator, the "slicing" operator would be invoked by using A[...] notation, but where one or more of the operands inside the [] would be intervals or sets.  The ParaSail compiler would impose requirements on implementations of "indexing" and "slicing" operators.  For a given "indexing" operator, if the operands are different in two different calls, then the resulting elements must be distinct.  For a given "slicing" operator, if the operands are non-overlapping sets or intervals, then the resulting slices must be non-overlapping.   Note that the implementor of the "slicing" operator can decide whether the bounds of the result correspond to those of the selecting interval, or slide back to always starting at zero or one.  In the above example, the bounds correspond to the original indices of the selected elements, as indicated by the postcondition on the "slicing" operator.  In an interface designed for string manipulation, it might be more convenient for the slice to have bounds starting at zero or one.

Note that if there are multiple "indexing" operators, or multiple "slicing" operators for the same type, then they are presumed to be creating potentially overlapping elements or views, as they presumably represent different indexing schemes into the same data structure.  For example, a hash-table might be indexable both by the key and by some unique internal index.  That is, there is no guarantee that A["key"] and A[35] are unaliased just because "key" and 35 are distinct values, presuming these represent invocations of different "indexing" operators defined for the same indexable container.

Note that in all languages, you can achieve the equivalent of slicing by passing in the entire array and the bounds for the sub-array of interest.  This would however not allow the compiler to determine as easily whether two different calls are manipulating non-overlapping slices by looking only at the call sites.  A goal with ParaSail is that the compiler can identify any possible race-conditions by looking only at call sites and associated pre- and postconditions.  This argues for having explicit ways of indicating in a precondition or postcondition what parts of an object are read, what parts are updated, and what parts are not referenced at all.  Bu this probably comes back to being able to specify slices of a container, if only in annotations. 

In any case, given the above notion of "slicing" operators, we are now in a position to write our parallel Quicksort:
interface Sorting<One_Dim_Array<>> is
    procedure Quicksort(Arr : ref var One_Dim_Array;
       function Before(Left, Right : One_Dim_Array::Element_Type) 
         -> Boolean is "<");
          // Sort Arr according to the sorting function "Before" which returns
          // True if Left must appear before Right in the sorted order.
          // Before returns False if Left = Right.
end interface Sorting;

class Sorting is
  exports
    procedure Quicksort(Arr : ref var One_Dim_Array;
       function Before(Left, Right : One_Dim_Array::Element_Type) 
         -> Boolean is "<")
    is
        // Handle short arrays directly.  Partition longer arrays.
        case Length(Arr) of
          [0..1] => return;
          
          [2] => 
               if Before(Arr[Arr.Last], Arr[Arr.First]) then
                   // Swap elements
                   Arr[Arr.First] :=: Arr[Arr.Last];
               end if;
               return;
               
          [..] =>
               // Partition array
               const Mid := Arr[Arr.First + Length(Arr)/2];
               var Left : Index_Type := Arr.First;
               var Right : Index_Type := Arr.Last;
               until Left > Right loop
                 var New_Left : Index_Type := Right+1;
                 var New_Right : Index_Type := Left-1;
                 block
                   // Find item in left half to swap
                   for I in Left .. Right forward loop
                       if not Before(Arr[I], Mid) then
                           // Found an item that can go into right partitition
                           New_Left := I;
                           if Before(Mid, Arr[I]) then
                               // Found an item that *must* go into right part
                               exit loop;
                           end if;
                       end if;
                   end loop;
                 ||  
                   // Find item in right half to swap
                   for J in Left .. Right reverse loop
                       if not Before(Mid, Arr[J]) then
                           // Found an item that can go into left partitition
                           New_Right := J;
                           if Before(Arr[J], Mid) then
                               // Found an item that *must* go into left part
                               exit loop;
                           end if;
                       end if;
                   end loop;
                 end block;
                 
                 if New_Left > New_Right then
                     // Nothing more to swap
                     // Exit loop and recurse on two partitions
                     Left := New_Left;
                     Right := New_Right;
                     exit loop;
                 end if;
                 
                 // Swap items
                 Arr[New_Left] :=: Arr[New_Right];
                 
                 // continue looking for items to swap
                 Left := New_Left + 1;
                 Right := New_Right - 1;
               end loop;
               
               // At this point, "Right" is right end of left partition
               // and "Left" is left end of right partition
               // and the partitions don't overlap
               // and neither is the whole array
               // and everything in the left partition can precede Mid
               // and everything in the right partition can follow Mid
               // and everything between the partitions is equal to Mid.
               {Left > Right;
                Right < Arr.Last;
                Left > Arr.First;
                (for all I in Arr.First .. Right : not Before(Mid, Arr[I]));
                (for all J in Left .. Arr.Last : not Before(Arr[J], Mid));
                (for all K in Right+1 .. Left-1 : 
                  not Before(Mid, Arr[K]) and not Before(Arr[K], Mid))}
               
               // Recurse on two halves
               block
                   Quicksort(Arr[Arr.First .. Right], Before);
                 ||
                   Quicksort(Arr[Left .. Arr.Last], Before);
               end block;
        end case;        
   end procedure Quicksort;
end class Sorting;
We have used slicing as part of the recursion. Because the compiler can prove Left > Right, there is no overlap between the two slices, and hence no race condition.  It turns out that in ParaSail we could have written this without explicit recursion by using the continue loop with ... construct. We will save that for a later posting.

Friday, December 3, 2010

Deterministic parallelism and aliasing in ParaSail

There is growing interest in parallelization approaches that provide deterministic results.  See for example Deterministic Parallel Java.  In ParaSail, we have a somewhat different model.  Determinism should be provided for computations when it makes sense, but when the program is interactive or part of a real-time system, then clearly the environment is dynamic and time-dependent, and the notion of determinism becomes less relevant, and arguably not even desirable. 

It turns out that it is easy to know when non-determinism might be entering the execution of a ParaSail program.  It occurs only as a result of operations on concurrent objects.  Because ParaSail eliminates all race-conditions associated with non-concurrent objects at compile time, the results of a ParaSail program that manipulates only non-concurrent objects are inherently deterministic.  However, should the programmer decide to use a concurrent object, then the order in which operations on the concurrent object are performed by concurrent threads is unspecified, though the programmer can control it to some extent by specifying dequeue conditions when the operand is marked as queued.  (See blog posting parasail-concurrent-interfaces.html for more information on queued operands.)

Minimizing use of concurrent objects is generally a good idea anyway for performance reasons, as there is inevitably more overhead involved in coordinating access to a shared object, even when using lock-free approaches.  But how can a group of threads work in parallel to build up a large data structure, without the data structure itself being a concurrent object?  The normal answer to such a question is divide and conquer

What does divide and conquer really mean in ParaSail terms?  That means that a data structure needs to be conceptually divided up into non-overlapping parts and passed out to multiple threads for manipulation.  In ParaSail, a function can return a reference to a part of an object passed by reference as an input to the function.  However, the compiler needs to understand the aliasing implications of that, since in ParaSail, the compiler disallows race-conditions, which implies it must understand all aliasing related to non-concurrent objects at compile-time.  If two potentially concurrent computations are both updating parts of the same non-concurrent object, then the compiler must be able to convince itself that the two parts are non-overlapping (i.e. not aliased with one another).

Because there are no (re)assignable pointers in ParaSail, there are no cycles in data structures, so it is clear that Obj.Left and Obj.Right do not overlap.  On the other hand, when dealing with generalized containers, it is more difficult to know whether Container[X] and Container[Y] overlap, and even more complex when we are calling arbitrary functions that happen to return a reference to part of their input parameter, such as Lower_Left(Matrix) and Upper_Right(Matrix).  Clearly the programmer needs some way in a postcondition of such a function to convey the aliasing properties of the result.

ParaSail has a built-in, primitive, Basic_Array interface.  The compiler has built-in knowledge that, given an object BA based on the Basic_Array interface, BA[X] does not alias with BA[Y] if X != Y.  So one way to describe the "part" of a container that is included within the result of calling a function like "Lower_Left" is to describe the set of underlying array indices associated with the elements of the container that can be referenced from the result of Lower_Left.  More generally we can see that there is a set of primitive subcomponents accessible via any reference to a part of a container, and the aliasing problem comes down to proving that the subcomponent set from one reference does not overlap with the subcomponent set from some other reference.  This all implies that ParaSail needs a notation for identifying the set of accessible subcomponents of a reference, and a built-in Subcomponent_Set interface for specifying the values and the relationships between these subcomponent sets.

Monday, November 1, 2010

Type conversion in ParaSail

Allowing for user-defined type conversion is complex, as it is an NxN problem, where you want to allow conversion of N different types to each other, and each pair might involve a different set of operations. ParaSail addresses this problem in the following ways:
  • Allows use of the [[ ... ]] operation, which converts the operand to a universal type.  
    • The user defines which universal type(s) a given type converts to by defining the "to_univ" operator(s).  Once the value has been converted to a universal type, it will be implicitly converted to any other type which defines a "from_univ" operator from that same universal type.  
      • Using a prefix can disambiguate if necessary:  T2::[[T1_Obj]] will convert T1_Obj first to a universal type using T1's "from_univ" operator, and then to T2 using T2's "to_univ" operator.
  •  Allows use of the target type's name as a function name.
    • This will convert between two types that are structurally equivalent (that is, the same module parameters were used in their defining instantiation), but which were distinguished by one or both being defined as new.  For example:
      • type T1 is new Vector<Integer<>> and type T2 is new Vector<Integer<>> define two distinct types because of the use of new, but T2(T1_Obj) and T1(T2_Obj) will convert T1_Obj to type T2, and T2_Obj to type T1, respectively.
    • The target type's name can also convert an object from some source type if the user defines a "convert" operator for either of the two types, and this operator's input and output parameter types match the source and target of the conversion.  Note that this matching might be thanks to the input or output formal being itself parameterized.  For example, the interface Int_Vector below provides a "convert" operator that converts to any other instance of Int_Vector:
            interface Int_Vector<Element is Integer<>> is
               ...
               operator "convert"(X : Int_Vector) 
                 -> Result_Type is
                      Int_Vector<Target_Element is Integer<>>;
            end interface Int_Vector;
These capabilities seem to provide enough flexibility for the user to define the desired explicit conversion functions. Note that the only implicit conversion in ParaSail is from a universal type to a type with an appropriate "from_univ" operator.

A virtual machine for ParaSail with picothread scheduling

As we have worked toward having an initial compiler/interpreter for ParaSail, we have decided to define a ParaSail Virtual Machine (PSVM) which will support the kind of very-light-weight threading structure (picothreading) needed to be able to evaluate small constructs, like parameters in parallel.  We have decided to turn each ParaSail operation into a single PSVM Routine, even if its execution will involve multiple threads executing bits and pieces of the code for the operation. Each PSVM routine is identified by a unique index, the routine index, and is represented by a routine header and a sequence of PSVM instructions.

The ParaSail Virtual Machine instructions use relatively conventional addressing modes to refer to operands in memory. Memory is presumed to be organized into areas, each area being a contiguous sequence of 64-bit words. While executing, a PSVM routine has access to the following areas:
Parameter_Area
An area for input and output parameters. Output parameters, if any, come first, followed by input parameters. Parameters may be a single 64-bit value, or may be a 64-bit pointer to a larger object.
Local_Area
An area for storing local variables, and parameter lists being built up to pass as a Parameter_Area to a called routine. The first word of a local area is a link to the local area of the enclosing block or operation, in the presence of nesting. This sort of link is generally called a static link. The second word of a local area is a link to the associated parameter area.
Type_Area
Each type, which is an instance of a module, is represented by an area containing the actual module parameters, and a table of operations. Each operation in the operation table is represented by a pair: routine index and type area. In many cases the type area for an operation is the same as the enclosing Type_Area, but for inherited operations, the type area for the operation would refer to the type area for the super-class from which the operation was inherited. The actual module parameters are most frequently other types (represented by a pointer to their type area), but can be values, object references, or operations.
Instructions identify a memory location of interest by a pair: memory area and offset. This pair is called an object locator. In addition to the Parameter_Area, Local_Area, and Type_Area, an object locator can specify a memory area pointed-to by a local variable, and the locator offset is the offset within the pointed-to area. The local pointers are called local base registers, and may reside in any of the first 64 words of the local area. Finally, in the presence of nesting, the chain of static links may be followed to find an enclosing local area or enclosing parameter area.

Here is a sampling of ParaSail Virtual Machine instructions:

Move_Word, Store_Int_Literal, Store_String_Literal, Store_Real_Literal, Jump, Conditional_Jump, Call, Return, Parallel_Call, Parallel_Block, Parallel_Wait.

Operands are generally identified with object locators, except for literal operands which are identified either by their actual value, or by an index into a table of literals.

Note that there are no instructions for doing normal arithmetic or logical operations. These are all implemented by calling routines. There are a set of built-in routines for doing the typical set of arithmetic and logical operations, on operands of various sizes.

The more interesting instructions are the Parallel_Call, Parallel_Block, and Parallel_Wait. Parallel_Call is essentially equivalent to Call, where the calling routine computes the input parameters and places them in a parameter area, and then calls the routine. The difference with a Parallel_Call is that the caller also identifies a picothread master and allocates a small area for a picothread control block (pTCB), and the instruction completes without waiting for the call to complete. Instead, the Parallel_Call instruction adds the pTCB to a queue waiting to be serviced by one of the virtual processors (vCPUs) executing the ParaSail program. When the caller thread has finished its own work and wants to wait for the Parallel_Call to complete, it uses the Parallel_Wait instruction, identifying the same picothread master as was specified in the Parallel_Call instruction. This suspends the calling thread until all of the parallel picothreads associated with the picothread master complete.

The Parallel_Block instruction is very similar to the Parallel_Call instruction, except that it identifies instructions that are part of the current routine, rather than calling a separate routine. The execution of these instructions is performed by a separate picothread, which has its own pTCB, and local area. The static link of the local area for the Parallel_Block picothread refers to the local area of the thread invoking the Parallel_Block instruction, allowing the nested picothread to use up-level references to reach the local variables of the enclosing picothread.

The Return instruction is used to complete processing of both a Parallel_Call and a Parallel_Block, and the Parallel_Wait is used to wait for either kind of parallel activity.

We recently completed a prototype implementation of the ParaSail Virtual Machine, including the picothread scheduling. We learned some interesting lessons along the way. Initially, a vCPU that was executing a picothread that performed a Parallel_Wait was itself suspended. That quickly exhausted the number of vCPUs, and led us to start dynamically creating new vCPUs. That caused the overall stack space to grow dramatically, since each vCPU needed its own heavy-weight threading context in the underlying operating system, along with a stack.

At this point, we concluded that a vCPU that executed a Parallel_Wait instruction should service the queue of waiting pTCBs if the picothread master it was waiting for was not yet complete. That significantly reduced the number of vCPUs needed. However, it still didn't seem to be as efficient as it could be. As originally implemented, the queue of waiting pTCBs was first-in, first-out (FIFO). However, after looking at various traces of execution, we realized that it was the last pTCB that was created which was always the first pTCB to be awaited. Hence, we concluded that the pTCB queue should be last-in, first-out (LIFO). That is, a vCPU should preferentially service the most recently queued pTCB when it had cycles to spare, since that would more likely be associated with a picothread master that is being awaited, and by servicing that pTCB first, it will reduce the number of pTCBs that were suspended in a Parallel_Wait instruction. After making this final change, even a heavily recursive algorithm throwing off lots of parallel picothreads was handled efficiently.

Tuesday, September 14, 2010

Notation for "intervals" in ParaSail

It is quite common in a precondition or postcondition to want to indicate that the value of a variable must be within some interval of values (aka "range" of values), such as 1..10, or 0..+inf.  When dealing with real values (e.g. floating point values) as opposed to integer values, it is often desirable to represent an open or half-open interval, where the boundary value is not included in the specified interval.  For example, to specify that X must be in the interval 0.0 .. 10.0, but not including zero itself, the notation "(0.0 .. 10.0]" is sometimes used, where where "(" and ")" represent an open (exclusive) boundary, while "[" and "]" represent a closed (inclusive) boundary.

For ParaSail, because "()" and "[]" are already used in the syntax, we have adopted a different notation for open and half-open intervals:

  0.0 .. 10.0       closed interval  
  0.0 <.. 10.0     half-open on left
  0.0 ..< 10.0     half-open on right
  0.0 <..< 10.0   open on both sides

Hence, one can write in an annotation:

  { X in A <.. B }

as a short-hand for

  { A < X and then X <= B }

with the additional difference that X is only evaluated once (though that will rarely matter in an annotation). 

Like the other relational operators, these interval operators are defined automatically once the "=?" operator has been appropriately defined.

Wednesday, September 1, 2010

Finalization (destructors) and multi-thread exits in ParaSail

Languages such as C++ and Ada that support both finalization (destructors) and exceptions generally face some degree of distributed overhead due to the complex interactions between these two features.  Since exceptions can arise virtually anywhere, if there is finalization to be performed (e.g. destructors to be called) on the way "out" of a scope, there is generally a need for some kind of per-thread cleanup list to be walked as part of propagating exceptions, or the need for numerous implicit exception handlers and a precise indication of exactly how much initialization has been performed (e.g. constructors that have been completed) at any given moment.

As indicated in an earlier post, we have tentatively decided to do without exceptions in ParaSail, but the exit ... with feature can result in potentially numerous scopes being exited in the various other threads which are all being terminated as a result of some thread initiating an exit from a compound statement such as a block or loop.  We have also suggested that ParaSail will support some notion of finalization, if an "end" operator is associated with the type of an object, such that on exiting the scope of the object, the "end" operator will be invoked automatically.  So the question is: must these "end" operator calls be performed on all of the local objects of the threads being terminated as a side effect of an exit by some parallel thread?  Our tentative answer is "no." 

To avoid race conditions between a thread being terminated and the code executing after an exit, we believe that we need to restrict a thread that might be prematurely terminated, so that it can update only objects local to the exitable construct.  The only code of an exitable construct that would be permitted to update objects declared outside the construct would be the code following an "exit ... with" or the code following "end ... with".

For example, here is the exitable construct we used in an earlier post:

const Result : optional Tree_Id;
for T => Root then T.Left || T.Right
  while T not null concurrent loop
    if T.Value == Desired_Value then
        // Found desired node, exit with its identifier
        exit loop with (Result => T.Id);
    end if;
end loop with (Result => null);

In this example, we might have many threads all executing inside the loop concurrently.  To avoid race conditions, we would not allow any of these threads to update objects declared outside the loop, because they might be terminated at any moment, and the update might be disrupted in the middle.  However, we would allow the code following the "exit loop with" to update Result, as well as the code following "end loop with." This is safe because only one of these is ever executed for a given execution of the loop, and once we begin executing such code it won't be disrupted by some other thread of the same loop initiating an exit.

Note that the code following an exit ... with or end ... with might be disrupted by a thread exiting some enclosing construct, but this code would not be allowed to update objects outside that enclosing construct, thereby avoiding the race condition.

Given this rule that code in a compound statement with multiple threads may not update objects declared outside the compound statement, if there is a chance that at least one of those threads might perform an exit ... with, we can simplify the finalization problem.  There is no need to invoke the "end" operator on an object if that operator cannot possibly affect objects that will survive the exit statement.

Thinking more specifically about the "end" operator, and the general lack of access to global variables within an operation, what exactly can an "end" operator do?  The answer is that an "end" operator cannot really do anything unless the object being finalized includes a reference of some sort to an object that will outlive the finalization.  We will talk more in a future posting about what if any restrictions exist on incorporating a reference to one object as part of another object, but for now let us presume that such objects can exist, but they cannot be overwritten by a whole-object assignment statement (i.e. they are not "assignable"). 

What would be the purpose of such an object with an embedded reference?  One use would be to perform a kind of "mark/release" style of resource allocation.  On entry to a scope, a "mark" object could be created with a reference to an enclosing resource manager of some sort.  Inside the scope, allocations of the resource would be mediated by the "mark" object, such that when exiting the scope, the "end" operator applied to the "mark" object could automatically release all of the resources allocated on behalf of objects local to the scope.

Now returning to our original question about whether finalization needs to be performed on all objects local to a thread that is being prematurely terminated -- if we presume that the "end" operators are performing something analogous to a "release" operation, then we can see how we could safely skip all intermediary release operations so long as we perform the ones associated with the actual compound statement being exited.  This also presumes that the only references permitted from an object local to a multi-threaded compound statement with an exit, to an object declared outside the compound statement, are references to concurrent objects that are themselves local to the innermost enclosing multi-threaded-with-exit compound statement.  For example:

var Enclosing_Obj1 : concurrent T := ...
    ...
*Block1*
block
    var Mark1 := Create_Mark(Enclosing_Obj1, ...);
        // Mark1 contains a reference to Enclosing_Obj1
        // and has an "end" operator which performs a release operation.
    for I in 1..10 concurrent loop
          var Mark2 := Create_Mark(Mark1, ...);
              // Here we have an inner mark
          Allocate_Resource(Mark2, ...);
             // Here we allocate some resource using Mark2 to mediate the
             // allocation from Mark1 which in turn is mediating allocation
             // from Enclosing_Obj1.  The marks allow the resources to be 
             // automatically released on block exit as a side effect of
             // finalization via the "end" operator.
          if This_Looks_Good(I, ...) then
               exit block Block1 with (Result => I);
                 // This terminates any other threads inside Block1.
          end if;
    end loop;
end block Block1 with (Result => 0);

Now if some thread is deep within a call on "This_Looks_Good" when some other thread initiates the block exit, the only object that gets explicitly finalized will be Mark1.  The multiple Mark2 objects (one for each thread of the concurrent loop) will not be finalized, as presumably performing an "end" on Mark1 will also release any allocations that were mediated by one of the Mark2 objects.

The bottom line is that when a tree of threads is terminated by an exit from a multi-threaded compound statement, the only finalization that needs to be performed is for the objects immediately local to the compound statement being exited.  Objects local to the various threads being terminated need not be separately finalized.  This avoids the kind of complex interaction that exists between finalization and exception propagation, including avoiding the overhead of maintaining precise initialization information, and avoiding performing a lot of potentially wasteful finalization.

Tuesday, August 24, 2010

No exceptions in ParaSail, but exitable multi-thread constructs

We have been mulling over the idea of exceptions in ParaSail, and have pretty firmly concluded that they aren't worth the trouble.  In a highly parallel language, with lots of threads, exception propagation across threads becomes a significant issue, and that is a nasty area in general.  Also, exceptions can introduce many additional paths into a program, making thorough testing that much harder.  And the whole business of declaring what exceptions might be propagated, and then deciding what to do if some other exception is propagated can create numerous maintenance headaches.

There is a feature in ParaSail as currently designed which provides some of the same capabilities of exceptions, but is particularly suited to parallel programming.  This is the "exit with" statement, which allows a construct to be exited with one or more values specified as results, and at the same time terminating any other threads currently executing within the construct.  For example, here is a loop implementing a parallel search of a tree with the first thread finding the desired node exiting and killing off all of the other threads as part of the "exit ... with" statement:

const Result : optional Tree_Id;
for T => Root then T.Left || T.Right
  while T not null concurrent loop
    if T.Value == Desired_Value then
        // Found desired node, exit with its identifier
        exit loop with (Result => T.Id);
    end if;
end loop with (Result => null);

This declares a Result object of type Tree_Id.  It then walks the tree in parallel, starting at Root and continuing with T.Left and T.Right concurrently.  It continues until it reaches "null" on each branch, or some node is found with its Value component matching the Desired_Value.  The value of Identifier at the end indicates the identifier of the node having the desired Value, or null to indicate that no node was found.  The presence of optional in the declaration for Result indicates that its value might be null.

Supporting this kind of intentional "race" seems important in parallel programming, as many problems are amenable to a divide and conquer approach, but it is important that as soon as a solution is found, no further time is wasted searching other parts of the solution space.  The "end ... with" phrase allows the specification of one or more results if the construct ends normally, as opposed to via an "exit ... with" (in this case, ending normally means all threads reach a null branch in the walk of the tree without finding the desired node).  Effectively the "exit ... with" skips over the "end ... with" phrase.

So how does this all relate to exceptions?  Well given the "exit ... with" capability, one can establish two or more threads, one which monitors for a failure condition, and the others which do the needed computation.  The thread monitoring for a failure condition performs an "exit ... with" if it detects a failure, with the result indicating the nature of the failure, and as a side-effect killing off any remaining computation threads.  If the normal computation succeeds, then an "exit ... with" giving the final result will kill off the monitoring thread.  Note that the "exit ... with" statements must occur textually within the construct being exited, so it is visible whether such a premature exit can occur, unlike an exception which can arise deep within a call tree and be propagated out many levels.

As an example of the kind of failure condition which might be amenable to this kind of monitoring, imagine a resource manager object, which provides up to some fixed maximum of some kind of resource (e.g. storage) to code within a block.  This resource manager (which is presumably of a concurrent type) could be passed down to operations called within the block for their use.  Meanwhile, a separate monitoring thread would be created immediately within the block which would call an operation on the resource manager which would suspend the thread until the resource runs out, at which point it would be awakened with an appropriate indication of the resource exhaustion, and any other information that might be helpful in later diagnosis.  On return from this Wait_For_Exhaustion operation, the monitoring thread would do an "exit block with (Result => Failure, ...)" or equivalent, to indicate that the computation required more resources than were provided.  The code following the block would then be able to take appropriate action.

Monday, August 23, 2010

Initial implementation model for ParaSail types

ParaSail supports object-oriented programming, and as such there needs to be some kind of run-time representation for types to support dispatching calls (aka virtual function calls, method calls, etc.).  Most "normal" objects need no run-time type information associated with them, but an object or parameter of a polymorphic type (such as "Int_Expr+") needs some run-time type information to support dispatching calls on its operations.  In addition, each operation of an interface generally needs an implicit parameter identifying its "enclosing" type, to gain access to the module parameters, given that every module is effectively a "generic" module.

Because ParaSail supports multiple inheritance of interfaces (including ad hoc interface matching -- see blog entry on that topic), a simple table of operations for each type with a compile-time-known slot-number in the table doesn't work as it would if ParaSail only supported single inheritance.  Instead we adopt the notion of a "type view" which consists of a pointer to the overall table of operations of the type, as well as an "slot-number mapping" to provide the particular "view" of the type.  The slot-number mapping is a simple vector of operation slot numbers for the operation table, indexed by the interface slot number of the particular interface through which the type is being "viewed."  For example, presume we have an interface "Assignable" with only two operations, say, Assign and Copy, with interface slot numbers one and two.  Given some type with many operations, where the operation slot numbers of 21 and 33 are for Assign and Copy respectively, then the Assignable "view" of the type would have a slot-number mapping of:

      [1 => 21, 2 => 33]

The actual operation table would be a vector of pairs, each pair being a reference to the type from which the code for the operation was inherited, and the reference to the actual code for the operation.  Hence, the operation table for Type_C could be imagined as something like:

     [1 => (Type_A, Op1_Code),
      2 => (Type_B, Op2_Code),
      ...,
     21 => (Type_A, Assign_Code),
     ...,
     33 => (Type_C, Copy_Code),
     ...]

Here we assume that the code for Op1 and Assign originated with Type_A, the code for Op2 originated with Type_B, and the code for Copy originated with Type_C.

In addition to an operation table, a type would have a table of module actuals, one for each module formal parameter.  Module actuals could be themselves "type views," operations, or (constant) objects.

If an interface declared externally-visible components (const or var), these would be represented for uniformity by operations that take a reference to an object of the interface's type and return a reference to the component.  This allows multiple inheritance of such (at least conceptual) components from one or more implemented interfaces, though the actual offset of the components within the object (or whether they are true components in the case of consts) might very well differ from the corresponding components of the interface.

When an operation is called, in addition to its explicit parameters, it would be passed a reference to the type from which it originated.  This would allow it to gain access to the other operations of its module, as well as to the module actuals.  Because ParaSail allows the type from which a type is extended to be overridden (see blog entry on ParaSail extension, inheritance, and polymorphism), the operation table may vary depending on which type is the base for extension (since the type from which a given operation originates could vary).  Hence an operation doesn't necessarily "know" where one of the other operations of its module originates (presuming the operation is inherited rather than redefined within the module's class directly).  

Because each type has its own operation table with pointers to the code for each operation, it is possible for some of the operations to be specially compiled for that type, rather than simply reusing the "generic" code generated when the module was first compiled.  This allows the code to incorporate information about the module actuals (as an optimization), rather than having to use the information only at run-time which is what the "generic" code would do.  Hence, in the example above, the Copy_Code for Type_C might incorporate directly information about the module actuals or the operation table for Type_C, rather than having to index into the table of module actuals or into the operation table at run-time.

Other implementation models are of course possible, but this one seems to have the advantage of uniformity and flexibility, with constant-time run-time performance for making a dispatching call.

Ad hoc interface matching in ParaSail

When a module is instantiated in ParaSail to form a type, actual parameters must be specified for each module formal parameter.  When the module formal is itself a type, the module actual must be a type that matches the formal appropriately.   For example, given:

interface Set<Element_Type is Assignable<>> is
    function Union(Left, Right : Set) -> Set;
    function Unit_Set(Elem : Element_Type) -> Set;
   ...
end interface Set;

we will want to be able to write:

    type My_Int_Set is Set<My_Integer_Type>;

Now the question is whether My_Integer_Type must be based on an interface that is explicitly indicated as extending or implementing Assignable, or should we simply require that My_Integer_Type has all of the operations expected of a type based on Assignable.  In other words, does an actual type match a formal type only if it extends or implements the specified interface, or is ad hoc matching permitted, where at the point of instantiation a check is made that all of the required operations are present?  A similar question would come up when converting an object of a particular type to a polymorphic type (such as Assignable+).

Our initial instincts were to require that the actual type (or the type being converted) explicitly extend or implement the formal type (or the target type).  However, this will tend to cause a proliferation of interfaces being added to the list of implements for any given module, such as Assignable or Comparable or Numeric or ..., and still the actual subset of operations of interest might not be mentioned explicitly. 

Given the above concern, we now favor allowing ad hoc matching requirements for matching of module actuals to module formals, and when converting from a normal type to a polymorphic type.  There are some potential problems during maintenance if the operations of an interface change, but those problems exist anyway, since any code that makes calls on individual operations defined in an interface may suffer maintenance issues when operations change.  Note, however, that adding operations to the actual type, or removing them from the formal type, don't create maintenance issues.

In general, there are many fewer places where modules are instantiated, or conversions to polymorphic types occur, than there are normal calls on operations, and defining and using (presumably abstract) interfaces such as Assignable or Comparable as module formals, if they capture exactly what operations of interest are needed by the module, could reduce rather than increase maintenance problems when trying to decide whether to change an existing operation of some interface.

So the answer to the above question about instantiating Set with My_Integer_Type is now answered as follows:  My_Integer_Type must provide the operations defined in the interface Assignable, but it need not directly or indirectly extend or implement Assignable.  This generally means that the list of interfaces specified in the implements list is there primarily for documentation, and for imposing a linkage such that if additional operations are added to the implemented interface, the need to define the operations is known at the point of module definition, rather than being deferred until some point of use.  This also implies that when ad hoc matching is used, there is a presumption that the module formal (or target type) is very stable and new operations are not expected to be added to it.  A type like Assignable<> certainly qualifies. 

Eliminating the need for the Visitor Pattern in ParaSail

The "Visitor" pattern is one of the more widely recognized patterns identified by the "Gang of Four" in Design Patterns: Elements of Reusable Object-Oriented Software.  Object-oriented programming makes it relatively painless to add more types into a type hierarchy, but it makes it relatively painful to add more operations to an existing type hierarchy.  The Visitor pattern is intended to allow adding more operations without doing major surgery on the original type hierarchy, but it requires the use of a visitor object, plus an additional operation, often called accept, on each type, which turns around and calls a visit operation on the visitor object that is unique to that particular type in the hierarchy.

Object-oriented frameworks like that of Common Lisp Object System effectively eliminate the need for the visitor pattern by allowing the addition of dispatching operations (aka virtual functions, methods, etc.) after the type hierarchy has been established, with CLOS in particular allowing additional methods to be added to preexisting generic functions.

In ParaSail, we propose to support the definition of additional dispatching operations to be used with polymorphic types in a somewhat more modular way.  The basic idea is that an interface hierarchy can be extended horizontally in addition to vertically.  By this we mean that an additional named bundle of operations may be specified for the root interface of an interface hierarchy, and then this same named bundle may be defined for each descendant of this root interface, such that operations from this new bundle may be "dispatched" to when operating on polymorphic objects, just as can be the operations of the original root interface. 

Here is an example based on a simple interface hierarchy used to represent numeric expressions:

interface Expr <Value_Type is Integer<>> is
    function Evaluate(E : Expr) -> Value_Type;
end interface Expr;

interface Binary extends Expr is
    type Binary_Op_Enum is Enum<[#add, #sub, #mul, #div, ...]>;
    function Create(Op : Binary_Op_Enum; Left, Right : Expr+) -> Binary;
    function Evaluate(B : Binary) -> Value_Type;
end interface Binary;

interface Unary extends Expr is
    type Unary_Op_Enum is Enum <[#plus, #minus, #not, #abs, ...]>;
    function Create(Op : Unary_Op_Enum; Operand : Expr+) -> Unary;
    function Evaluate(U : Unary) -> Value_Type;
end interface Unary;

interface Leaf extends Expr is
    function Create(Value : Value_Type) -> Leaf;
    function Evaluate(L : Leaf) -> Value_Type;
end interface Leaf;

    ...

 interface Calculator<Int_Expr is Expr<>> is
    ...
    procedure Print_Value(Str : Output_Stream; E : Int_Expr+) is
        // E is of the polymorphic type Int_Expr+ 
        // so calls on its operations will dispatch.
        const Result := Evaluate(E);   // dispatching call
        Print(Str, Result);
    end procedure Print_Value;
end interface Calculator;

Now if we decide we want to add some more "dispatching" operations, we could either disrupt the existing hierarchy of interfaces by adding the operation to each interface, or we could add an operation to each interface to support the visitor pattern, and use that for defining additional operations.  For ParaSail, we are proposing a third approach:

interface Expr[#output] is
    function Text_Image(E : Expr) -> Univ_String;
    function XML_Image(E : Expr) -> Univ_String;
    ...
end interface Expr[#output];

interface Binary[#output] is
    function Text_Image(B : Binary) -> Univ_String;
    function XML_Image(B : Binary) -> Univ_String;
    ...
end interface Binary[#output];
...

Here we have defined an interface extension labeled "#output".  We could define additional interfae extensions with other labels.  Now we have effectively added Text_Image and XML_Image as dispatching operations of Expr without a need for a visitor pattern and without disrupting the original Expr interface hierarchy.  These operations are available when we refer to "Expr[#output]" rather than simply "Expr".  If we had another extension labeled "#input," we could gain access to both by referring to "Expr[#input | #output]".

Then we could write:

 interface Calculator<Int_Expr is Expr[#input | #output]<>> is
       // Now we can use operations from the #input and #output extensions
       // of Expr when manipulating polymorphic objects of type Int_Expr+
    ...
    procedure Print_Expr_And_Value(Str : Output_Stream; E : Int_Expr+) is
        // E is of the polymorphic type Int_Expr+ 
        // so calls on its operations (including those from the
        // interface extensions #input and #output) will dispatch.

        const Result := Evaluate(E);   // dispatching call.
        Print(Str, Text_Image(E));      // Now display the expression
        Print(Str, " = ");
        Print(Str, Result);                    // and then display its value.
    end procedure Print_Expr_And_Value;
end interface Calculator;

This approach allows the definition of additional dispatching operations in a modular fashion to an existing type hierarchy, while not disrupting existing users.  It seems to provide some advantages over the Visitor pattern approach as well, as the new operations are not second class citizens requiring the construction of a visitor object, etc.

Monday, July 19, 2010

Pointer-free primitives for ParaSail

Having pretty much settled on eliminating explicit pointers from ParaSail (largely to simplify compile-time alias analysis and race-condition elimination), the question is: what sorts of primitive data structures are needed in ParaSail to allow the user to construct essentially arbitrary abstractions on top of the primitives without using explicit pointers? The ParaSail interface/class module already provides a basic kind of basic record or struct capability.  As mentioned in an earlier posting, we have also suggested the use of the keyword optional to signify that an object might or might not have a value of its specified type, with its default being a null value.  Internally this could be implemented either with an extra bit to signify null-ness, or with an implicit level of indirection with a null pointer used to indicate null-ness (some indirection is clearly necessary for a recursive data structure). 

By using the optional keyword and the basic record capability provided by a class, the user can quite easily construct structures such as linked lists and binary trees.  However, more complex structures will have to end up falling back on a generalized notion of container with pointers replaced by the use of generalized indexing.  For example, a directed graph structure can be represented by a vector of nodes with the edges between the nodes being represented by indices into the vector.  To some extent this is going back to a Fortran array-oriented view of the world.  However, with the arrays being extensible, and the elements of the arrays being arbitrarily complex objects, which themselves are potentially optional and extensible, there is significantly more flexibility available to the programmer than what the original Fortran arrays provided.

As a first stab at what primitive container types are needed, it would seem that a basic array type, indexed by integers, and extensible only by whole-array assignment with a longer or shorter array, might be adequate.  The elements of the array would be of an arbitrary type, including user-defined types.  The whole-array assignment would deal with whatever storage management is necessary to reuse or reclaim the prior value of the object being assigned.  The compiler would attempt to arrange to have the new value be created in the storage region associated with the object to which it will be assigned, so as to avoid having to physically move the new value into that storage region as part of the assignment (see earlier post on region-based storage management).  For example, given:
    var X : Basic_Array<...> := Build_Array(Length => 10);
    ...
    X := Extend_Array(X, New_Length => 20);
the compiler would arrange for the calls on Build_Array and Extend_Array to create their resulting array in a storage region appropriate for X, so that the assignment need not move the created array into that region.  Essentially, as part of being called, a function would receive information about the region in which its result(s) should be allocated.  This information would be propagated as necessary to try to minimize the unnecessary movement of data between storage regions as part of an assignment operation.

It would seem that with this combination of the record-like capability provided by classes, the notion of optional, and something like this Basic_Array type extensible only by whole-array assignment, we have all of the building blocks needed to create arbitrary data structures in ParaSail without the use of pointers.

Sunday, July 18, 2010

Updated (A)Yacc grammar for ParaSail

Here is an update to the ParaSail (A)Yacc grammar posted a few weeks ago.   This one has no shift/reduce or reduce/reduce conflicts.   The "N Queens" example is accepted by this grammar.  Below the grammar is the slightly updated (A)Flex lexer:

--------------------------------------
-- Tentative YACC Grammar for ParaSail
--------------------------------------

-- Single-character delimiters --
%token ',' ';' ':' '.'
%token '+' '-' '*' '/' 
%token '?'
%token '(' ')' '[' ']' '<' '>' '{' '}'
%token '|' 
%token PRIME -- '''

-- Compound delimiters --
%token COMPARE -- "=?"
%token EQ   -- "=="
%token NEQ  -- "!="
%token GEQ  -- ">="
%token LEQ  -- "<="
%token POWER  -- "**"
%token ASSIGN -- ":="
%token SWAP   -- ":=:"
%token DOT_DOT -- ".."
%token DOUBLE_COLON  -- "::"
%token DOUBLE_LEFT_BRACKET  -- "[["
%token DOUBLE_RIGHT_BRACKET -- "]]"
%token REFERS_TO  -- "=>"
%token GIVES    -- "->"
%token IMPLIES    -- "==>"
%token SEQUENCE   -- ";;"
%token PARALLEL   -- "||"
%token PLUS_ASSIGN -- "+="
%token MINUS_ASSIGN -- "-="
%token TIMES_ASSIGN -- "*="
%token DIVIDE_ASSIGN -- "/="
%token POWER_ASSIGN -- "**="
%token CONCAT_ASSIGN -- "|="
%token AND_ASSIGN -- "and="
%token OR_ASSIGN -- "or="
%token XOR_ASSIGN -- "xor="

-- Literals --
%token Char_Literal
%token Enum_Literal
%token Integer_Literal 
%token Real_Literal
%token String_Literal

-- Identifier --
%token Identifier 

-- Reserved words --
%token ABS_kw
%token ABSTRACT_kw
%token ALL_kw
%token AND_kw
%token BLOCK_kw
%token CASE_kw
%token CLASS_kw
%token CONCURRENT_kw
%token CONST_kw
%token CONTINUE_kw
%token EACH_kw
%token ELSE_kw
%token ELSIF_kw
%token END_kw
%token EXIT_kw
%token EXPORTS_kw
%token EXTENDS_kw
%token FOR_kw
%token FORWARD_kw
%token FUNCTION_kw
%token IF_kw
%token IMPLEMENTS_kw
%token IMPORT_kw
%token IN_kw
%token INTERFACE_kw
%token IS_kw
%token LOCKED_kw
%token LOOP_kw
%token MOD_kw
%token MUTABLE_kw
%token NEW_kw
%token NOT_kw
%token NULL_kw
%token OF_kw
%token OPERATOR_kw
%token OPTIONAL_kw
%token OR_kw
%token PROCEDURE_kw
%token QUEUED_kw
%token REF_kw
%token REM_kw
%token RETURN_kw
%token REVERSE_kw
%token SELECT_kw
%token SOME_kw
%token THEN_kw
%token TYPE_kw
%token VAR_kw
%token WHILE_kw
%token WITH_kw
%token XOR_kw

%start module_list

{ 
   subtype yystype is integer;
}


%%

module_list : 
    module
  | module_list module
  ;

module : 
    import_clauses interface_declaration ';' 
  | import_clauses class_definition ';' 
  | import_clauses standalone_operation_definition ';'
  ;

import_clauses : 
  | import_clauses IMPORT_kw qualified_name_list ';'
  ;

qualified_name_list : 
    qualified_name
  | qualified_name_list ',' qualified_name
  ;

interface_declaration : 
   opt_interface_qualifier INTERFACE_kw module_defining_name 
     formals_and_implemented_interfaces
     IS_kw
      interface_element_list
   END_kw INTERFACE_kw module_defining_name ;
   
opt_interface_qualifier : interface_qualifier | ;

interface_qualifier : 
    class_qualifier
  | ABSTRACT_kw opt_class_qualifier
  ;

opt_class_qualifier : class_qualifier | ;

class_qualifier : CONCURRENT_kw ;

standalone_operation_definition : 
    function_definition
  | procedure_definition
  ;

formals : '<' opt_module_formal_list '>' ;

formals_and_implemented_interfaces :
    formals
  | formals implements_list
  | opt_formals EXTENDS_kw interface_name opt_implements_list
  ; 

opt_formals : formals | ;

opt_implements_list : implements_list | ;

implements_list : IMPLEMENTS_kw interface_name_list ;

interface_name_list :
    interface_name
  | interface_name_list ',' interface_name
  ;

interface_name : module_name | module_instantiation ;
   
module_name : qualified_name ;

module_defining_name : qualified_name ;

opt_module_formal_list : module_formal_list | ;

module_formal_list : 
    annotated_module_formal 
  | module_formal_list ';' annotated_module_formal 
  ;

annotated_module_formal : 
    opt_annotation type_formal opt_annotation
  | opt_annotation operation_formal
  | opt_annotation value_formal opt_annotation
  ;

opt_annotation : annotation | ;

type_formal : 
    Identifier IS_kw module_instantiation 
  | module_instantiation 
  ;

operation_formal : operation_declaration ;

value_formal : 
    id_list ':' type_specifier 
  | id_list ':' type_specifier ASSIGN simple_expression  -- to avoid use of '>'
  ;

id_list : 
    Identifier
  | id_list ',' Identifier
  ;

type_name : 
    qualified_name 
  | polymorphic_type_name
  ;

polymorphic_type_name : qualified_name '+' ;

qualified_name : 
    Identifier 
  | qualified_name DOUBLE_COLON Identifier ;


module_instantiation : 
  module_name '<' opt_module_actual_list '>' ;

opt_module_actual_list : module_actual_list | ;

module_actual_list :
    module_actual 
  | module_actual_list ',' module_actual 
  ;

module_actual : 
    simple_type_specifier_or_expression
  | Identifier REFERS_TO simple_type_specifier_or_expression
  ;

-- simple_expression subsumes simple type_name in this rule
simple_type_specifier_or_expression : 
    qualified_name annotation      -- polymorphic type name not allowed here
  | simple_expression              -- simple_expr to avoid problems with '>'
  | module_instantiation
  ;
  
annotated_type_specifier : 
    type_specifier
  | type_specifier annotation
  ;

type_specifier : 
    opt_CONCURRENT_kw type_name 
  | opt_CONCURRENT_kw module_instantiation 
  ;

opt_CONCURRENT_kw : CONCURRENT_kw | ;

interface_element_list : 
  | interface_element_list interface_element ';'
  ;

interface_element : 
    operation_declaration 
  | object_declaration 
  | interface_declaration 
  | type_declaration 
  ;


class_definition :
   opt_class_qualifier CLASS_kw module_defining_name 
      class_formals_and_implemented_interfaces
   IS_kw
      class_element_list
   END_kw CLASS_kw module_defining_name ;

class_formals_and_implemented_interfaces :
    formals_and_implemented_interfaces
  | opt_formals EXTENDS_kw Identifier ':' interface_name opt_implements_list
  ; 


class_element_list : 
    local_class_element_list
  EXPORTS_kw 
    exported_class_element_list ;

local_class_element_list :
  | local_class_element_list local_class_element ';'
  ;

local_class_element : interface_element | exported_class_element ;
  
exported_class_element_list :
  | exported_class_element_list exported_class_element ';'
  ;

exported_class_element : 
    operation_definition 
  | object_definition 
  | class_definition 
  ;
  
   
annotation : '{' annotation_element_list '}' ;

annotation_element_list : 
    annotation_element
  | annotation_element_list ';' annotation_element
  ;

annotation_element : interface_element | condition | quantified_expression ;

condition : expression ;


operation_declaration : 
    function_declaration 
  | procedure_declaration 
  | operator_declaration 
  ;

function_declaration :
  FUNCTION_kw Identifier operation_inputs GIVES operation_outputs ;

procedure_declaration :
  PROCEDURE_kw Identifier operation_inputs ;
  
operator_declaration :
  OPERATOR_kw operator_designator operation_inputs opt_gives_operation_outputs ;

opt_gives_operation_outputs : GIVES operation_outputs | ;
  
operator_designator : String_Literal ;
  
operation_inputs :
    simple_operation_input opt_annotation
  | annotation simple_operation_input opt_annotation
  | '(' opt_annotated_operation_input_list ')' opt_annotation
  ;

simple_operation_input :   -- avoids trailing use of "IS"
    id_list ':' opt_input_modifier simple_operand_type_specifier 
  | input_modifier simple_operand_type_specifier 
  | simple_operand_type_specifier 
  ;

opt_annotated_operation_input_list : annotated_operation_input_list | ;

annotated_operation_input_list : 
    annotated_operation_input
  | annotated_operation_input_list ';' annotated_operation_input
  ;

annotated_operation_input : 
    operation_input opt_annotation
  | annotation operation_input opt_annotation 
  | opt_annotation function_declaration
  | opt_annotation procedure_declaration
  ;

operation_input : 
    id_list ':' opt_input_modifier operand_type_specifier opt_ASSIGN_expression
  | input_modifier operand_type_specifier opt_ASSIGN_expression
  | operand_type_specifier opt_ASSIGN_expression
  ;

opt_input_modifier : input_modifier | ;
  
simple_operand_type_specifier :
    type_name
  | module_instantiation
  ;

operand_type_specifier : 
    simple_operand_type_specifier
  | Identifier IS_kw module_instantiation 
         -- NOTE: Operation can have "type" parameters 
         -- such as "Left_Type is Integer<>"
  ;

input_modifier : 
    output_modifier 
  | QUEUED_kw opt_output_modifier
  | LOCKED_kw opt_output_modifier
  ;


operation_outputs : 
    simple_operation_output opt_annotation
  | annotation simple_operation_output opt_annotation
  | '(' annotated_operation_output_list ')' opt_annotation
  ;

simple_operation_output :   -- avoids trailing use of "IS"
    id_list ':' opt_output_modifier simple_operand_type_specifier
  | output_modifier simple_operand_type_specifier
  | simple_operand_type_specifier
  ;

annotated_operation_output_list :
    annotated_operation_output
  | annotated_operation_output_list ';' annotated_operation_output
  ;

annotated_operation_output : 
    operation_output opt_annotation
  | annotation operation_output opt_annotation 
  ;

operation_output : 
    id_list ':' opt_output_modifier operand_type_specifier
  | output_modifier operand_type_specifier 
  | operand_type_specifier 
  ;

opt_output_modifier : output_modifier | ;

output_modifier :  
    OPTIONAL_kw
  | REF_opt_optional_mutable 
  | REF_opt_optional_mutable VAR_kw 
  | REF_opt_optional_mutable CONST_kw 
  ;

REF_opt_optional_mutable :
    REF_kw
  | REF_kw OPTIONAL_kw
  | REF_kw MUTABLE_kw
  | REF_kw OPTIONAL_kw MUTABLE_kw
  ;

object_declaration : 
    VAR_kw Identifier ':' 
      opt_OPTIONAL_kw opt_MUTABLE_kw annotated_type_specifier 
      opt_ASSIGN_expression
  | CONST_kw Identifier ':' opt_OPTIONAL_kw annotated_type_specifier 
      opt_ASSIGN_expression
  ;

opt_ASSIGN_expression : ASSIGN expression | ;
   
object_definition :
    CONST_kw Identifier ASSIGN expression
  | VAR_kw Identifier ASSIGN expression
  | CONST_kw Identifier REFERS_TO name
  | VAR_kw Identifier REFERS_TO name
  ;

opt_OPTIONAL_kw : OPTIONAL_kw | ;
opt_MUTABLE_kw : MUTABLE_kw | ;

type_declaration : 
    TYPE_kw Identifier IS_kw opt_NEW_kw annotated_type_specifier ;

opt_NEW_kw : NEW_kw | ;

operation_definition : 
    function_definition 
  | procedure_definition 
  | operator_definition 
  ;

function_definition : 
  function_declaration IS_kw 
     statement_list_with_semi 
  END_kw FUNCTION_kw Identifier ;

procedure_definition : 
  procedure_declaration IS_kw 
     statement_list_with_semi 
  END_kw PROCEDURE_kw Identifier ;

operator_definition : 
  operator_declaration IS_kw 
     statement_list_with_semi 
  END_kw OPERATOR_kw Identifier  ;

statement_list_opt_semi :
    statement_sequence
  | statement_list_with_semi
  ;

statement_list_with_semi : 
    statement_sequence_with_semi
  | statement_list_opt_semi PARALLEL statement_sequence_with_semi
  ;

statement_list :                -- "||" is lower precedence than ";" and ";;"
    statement_sequence
  | statement_list_opt_semi PARALLEL statement_sequence
  ;

statement_sequence_with_semi : statement_sequence ';' ;

statement_sequence :
    annotated_statement
  | statement_sequence SEQUENCE annotated_statement
  | statement_sequence_with_semi SEQUENCE annotated_statement
  | statement_sequence_with_semi annotated_statement
  ;
  
annotated_statement : 
    opt_annotation local_declaration  
            -- NOTE: these already allow trailing annotations
  | opt_annotation statement opt_annotation
  ;

statement : 
    local_definition 
  | simple_statement 
  | label compound_statement
  | compound_statement
  -- NOTE: now will use BLOCK instead of: '(' statement_list_opt_semi ')' 
  ;

opt_label : label | ;

simple_statement :
    name assign_operator expression
  | name SWAP name
  | name '(' opt_operation_actual_list ')'
  | '(' operation_actual_list ')' ASSIGN expression  -- multiple assignment
  | RETURN_kw expression
  | RETURN_kw opt_WITH_values 
  | CONTINUE_kw LOOP_kw opt_id opt_WITH_values 
  | EXIT_kw compound_statement_kind opt_id opt_WITH_values
  ;

opt_operation_actual_list : operation_actual_list | ;

opt_WITH_values : WITH_values | ;

WITH_values : WITH_kw '(' operation_actual_list ')' ;

opt_id : Identifier | ;

compound_statement_kind : LOOP_kw | IF_kw | CASE_kw | SELECT_kw | BLOCK_kw ;

local_declaration : 
    operation_declaration 
  | type_declaration 
  | object_declaration
  ;

local_definition :
    object_definition 
  | operation_definition 
  ;

label : '*' Identifier '*' ;

compound_statement :
    if_statement
  | case_statement
  | while_loop_statement
  | for_loop_statement
  | block_statement 
  | select_statement
  ;

if_statement : 
  IF_kw condition THEN_kw 
     statement_list_with_semi
  elsif_list
  opt_else
  END_kw IF_kw opt_id opt_WITH_values ;

elsif_list : 
  | elsif_list elsif_clause
  ;

elsif_clause :
  ELSIF_kw condition THEN_kw
     statement_list_with_semi ;

opt_else : ELSE_kw statement_list_with_semi | ;

case_statement : 
  CASE_kw expression OF_kw
    case_alt_list
    opt_default_alt
  END_kw CASE_kw opt_id opt_WITH_values ;

case_alt_list : 
    case_alt
  | case_alt_list case_alt
  ;

case_alt :
    '[' choice_list ']' REFERS_TO statement_list_with_semi ;

choice_list : 
    choice  
  | choice_list '|' choice 
  ;

choice : term_or_range ;

opt_default_alt : '[' DOT_DOT ']' REFERS_TO statement_list_with_semi | ;

while_loop_statement :
  WHILE_kw condition LOOP_kw
    statement_list_with_semi
  END_kw LOOP_kw opt_id opt_WITH_values ;

for_loop_statement :
  FOR_kw iterator_spec opt_direction LOOP_kw
    statement_list_with_semi
  END_kw LOOP_kw opt_id opt_WITH_values ;

iterator_spec : 
    iterator 
  | '(' iterator_list ')'
  ;

iterator_list :
    iterator
  | iterator_list ';' iterator
  ;

iterator :
    Identifier IN_kw choice_list
  | EACH_kw Identifier OF_kw expression
  | Identifier ASSIGN expression THEN_kw next_value_list WHILE_kw condition
  | Identifier REFERS_TO name THEN_kw next_name_list WHILE_kw condition
  | Identifier ':' type_specifier ASSIGN expression 
  ;

next_value_list : 
    expression 
  | next_value_list PARALLEL expression 
  ;

next_name_list : 
    name 
  | next_name_list PARALLEL name 
  ;

opt_direction : direction | ;

direction : CONCURRENT_kw | FORWARD_kw | REVERSE_kw ;

select_statement :
  SELECT_kw 
     select_alt_list
  END_kw SELECT_kw opt_id opt_WITH_values ;

select_alt_list : 
    select_alt
  | select_alt_list PARALLEL select_alt
  ;

select_alt : 
  '[' statement_list_opt_semi ']' REFERS_TO statement_sequence_with_semi ;

block_statement :
  BLOCK_kw
    statement_list_with_semi
  END_kw BLOCK_kw opt_id opt_WITH_values ;
 
expression :  
    expression_component
  | expression '|' expression_component
  ;

expression_component : -- logical operators are non-associative
    comparison_expression
  | comparison_expression logical_operator comparison_expression
  | comparison_expression '?' expression_component ':' expression_component
  ;

comparison_expression :  -- comparisons are non associative
    simple_expression
  | simple_expression comparison_operator simple_expression
  | simple_expression IN_kw type_or_term_or_range
  | simple_expression NOT_kw IN_kw type_or_term_or_range
  | simple_expression IS_kw NULL_kw
  | simple_expression NOT_kw NULL_kw
  ;

simple_expression : -- used to avoid use of '>' in module instantiation
    term
  | simple_expression adding_operator term
  ;

type_or_term_or_range :
    polymorphic_type_name
  | term_or_range       -- includes simple type name
  ;

term_or_range : 
    term
  | term DOT_DOT term  -- avoid ambiguity on use of '+'
  ;
  
term : 
    factor
  | term multiplying_operator factor
  ;

factor : 
    primary
  | primary power_operator factor  -- right associative
  | unary_operator factor  -- makes unary ops higher precedence 
  ;

primary :
    name
  | Integer_Literal
  | Real_Literal
  | Char_Literal
  | String_Literal
  | Enum_Literal
  | NULL_kw
  | '(' conditional_expression ')'
  | '(' quantified_expression ')'
  | aggregate
  | DOUBLE_LEFT_BRACKET expression DOUBLE_RIGHT_BRACKET 
  ;
  
name :
    qualified_name attribute_list opt_PRIME
  | qualified_name PRIME
  | qualified_name 
  | name '(' opt_operation_actual_list ')'
  | name '[' opt_operation_actual_list ']'
  | name '.' selector
  ;

attribute_list :
    Enum_Literal
  | attribute_list Enum_Literal 
  ;

opt_PRIME : PRIME | ;

operation_actual_list : 
    operation_actual 
  | operation_actual_list ',' operation_actual 
  ;

operation_actual : 
    expression
  | Identifier REFERS_TO expression 
  ;

selector : Identifier ;

unary_operator : '+' | '-' | ABS_kw | NOT_kw ;

adding_operator : '+' | '-' ;

multiplying_operator : '*' | '/' ;

power_operator : POWER ;

assign_operator : 
    ASSIGN | PLUS_ASSIGN | MINUS_ASSIGN | TIMES_ASSIGN | DIVIDE_ASSIGN 
  | CONCAT_ASSIGN | AND_ASSIGN | OR_ASSIGN | XOR_ASSIGN
  ;

comparison_operator : COMPARE | EQ | NEQ | '<' | LEQ | '>' | GEQ ;

logical_operator :
    AND_kw | OR_kw | XOR_kw  
  | AND_kw THEN_kw | OR_kw ELSE_kw | IMPLIES
  ;


aggregate : class_aggregate | container_aggregate ;

class_aggregate : '(' opt_operation_actual_list ')' ;

container_aggregate : '[' opt_container_element_list ']' ;
  
opt_container_element_list : container_element_list | ;

container_element_list : 
    container_element 
  | container_element_list ',' container_element 
  ;

container_element : 
    expression_component
  | choice_list REFERS_TO filtered_expression_stream 
  | DOT_DOT REFERS_TO filtered_expression_stream 
  ;

container_key : expression ;

filtered_expression_stream : 
    expression
  | expression ':' condition
  ;

conditional_expression :
    if_expression
  | case_expression
  ;

if_expression : 
  IF_kw condition THEN_kw 
     expression
  elsif_expr_list
  opt_else_expr ;

elsif_expr_list : 
    elsif_expr_clause
  | elsif_expr_list elsif_expr_clause
  ;

elsif_expr_clause :
  ELSIF_kw condition THEN_kw expression ;

opt_else_expr : ELSE_kw expression | ;

case_expression : 
  CASE_kw expression OF_kw
    case_expr_alt_list ;

case_expr_alt_list : 
    case_expr_alt
  | case_expr_alt_list ';' case_expr_alt
  ;

case_expr_alt : 
    '[' choice_list ']' REFERS_TO expression 
  | '[' DOT_DOT ']' REFERS_TO expression   -- NOTE: must come last
  ;

quantified_expression :
    FOR_kw ALL_or_SOME_kw iterator ':' condition ;

ALL_or_SOME_kw : ALL_kw | SOME_kw ;

%%


package parasail_parser is

    procedure yyparse;

    echo : boolean := false;
    number_of_errors : natural := 0;


end parasail_parser;

with parasail_tokens, parasail_lex_io, parasail_goto, parasail_shift_reduce;
with parasail_lex, text_io;

use  parasail_tokens, parasail_lex_io, parasail_goto, parasail_shift_reduce;
use  parasail_lex, text_io;
package body parasail_parser is

    procedure yyerror(s: in string := "syntax error") is
    begin
 number_of_errors := number_of_errors + 1;
 put("<<< *** ");
 put_line(s);
    end yyerror;


##%procedure_parse

end parasail_parser;


---------------------------------------------------
And here is the (A)Flex lexer:
---------------------------------------------------

%START IDENT Z


GRAPHIC_CHAR  [ !"#$%&'()*+,-./0-9:;<=>?@A-Z\[\\\]^_`a-z{|}~]

STRING_LITERAL  (\"([^\"]|[\\][.])*\")

CHAR_LITERAL    (\'([^\']|[\\][.])\')

IDENTIFIER        [a-zA-Z]([_]?[a-zA-Z0-9])*

  -- The following are used to match all numeric literals.
  -- Note that double underscores are rejected.
DIGIT_SEQUENCE    [0-9]([_]?[0-9])*
HEX_SEQUENCE      [0-9a-fA-F]([_]?[0-9a-fA-F])*
EXPONENT          [Ee][-+]?{DIGIT_SEQUENCE}

%%

  -- ParaSail reserved words
"abs"     {ECHO_L; ENTER(Z); return (ABS_kw);}
"abstract" {ECHO_L; ENTER(Z); return (ABSTRACT_kw);}
"all"  {ECHO_L; ENTER(Z); return (ALL_kw);}
"and"  {ECHO_L; ENTER(Z); return (AND_kw);}
"block"  {ECHO_L; ENTER(Z); return (BLOCK_kw);}
"case"  {ECHO_L; ENTER(Z); return (CASE_kw);}
"class"  {ECHO_L; ENTER(Z); return (CLASS_kw);}
"concurrent" {ECHO_L; ENTER(Z); return (CONCURRENT_kw);}
"const"  {ECHO_L; ENTER(Z); return (CONST_kw);}
"continue" {ECHO_L; ENTER(Z); return (CONTINUE_kw);}
"each"  {ECHO_L; ENTER(Z); return (EACH_kw);}
"else"  {ECHO_L; ENTER(Z); return (ELSE_kw);}
"elsif"  {ECHO_L; ENTER(Z); return (ELSIF_kw);}
"end"  {ECHO_L; ENTER(Z); return (END_kw);}
"exit"  {ECHO_L; ENTER(Z); return (EXIT_kw);}
"exports" {ECHO_L; ENTER(Z); return (EXPORTS_kw);}
"extends" {ECHO_L; ENTER(Z); return (EXTENDS_kw);}
"for"  {ECHO_L; ENTER(Z); return (FOR_kw);}
"forward" {ECHO_L; ENTER(Z); return (FORWARD_kw);}
"function" {ECHO_L; ENTER(Z); return (FUNCTION_kw);}
"if"  {ECHO_L; ENTER(Z); return (IF_kw);}
"implements" {ECHO_L; ENTER(Z); return (IMPLEMENTS_kw);}
"import" {ECHO_L; ENTER(Z); return (IMPORT_kw);}
"in"  {ECHO_L; ENTER(Z); return (IN_kw);}
"interface" {ECHO_L; ENTER(Z); return (INTERFACE_kw);}
"is"  {ECHO_L; ENTER(Z); return (IS_kw);}
"locked" {ECHO_L; ENTER(Z); return (LOCKED_kw);}
"loop"  {ECHO_L; ENTER(Z); return (LOOP_kw);}
"mod"  {ECHO_L; ENTER(Z); return (MOD_kw);}
"mutable" {ECHO_L; ENTER(Z); return (MUTABLE_kw);}
"new"  {ECHO_L; ENTER(Z); return (NEW_kw);}
"not"  {ECHO_L; ENTER(Z); return (NOT_kw);}
"null"  {ECHO_L; ENTER(Z); return (NULL_kw);}
"of"  {ECHO_L; ENTER(Z); return (OF_kw);}
"operator" {ECHO_L; ENTER(Z); return (OPERATOR_kw);}
"optional" {ECHO_L; ENTER(Z); return (OPTIONAL_kw);}
"or"  {ECHO_L; ENTER(Z); return (OR_kw);}
"procedure" {ECHO_L; ENTER(Z); return (PROCEDURE_kw);}
"queued" {ECHO_L; ENTER(Z); return (QUEUED_kw);}
"ref"  {ECHO_L; ENTER(Z); return (REF_kw);}
"rem"  {ECHO_L; ENTER(Z); return (REM_kw);}
"return" {ECHO_L; ENTER(Z); return (RETURN_kw);}
"reverse" {ECHO_L; ENTER(Z); return (REVERSE_kw);}
"select" {ECHO_L; ENTER(Z); return (SELECT_kw);}
"some"  {ECHO_L; ENTER(Z); return (SOME_kw);}
"then"  {ECHO_L; ENTER(Z); return (THEN_kw);}
"type"  {ECHO_L; ENTER(Z); return (TYPE_kw);}
"var"  {ECHO_L; ENTER(Z); return (VAR_kw);}
"while"  {ECHO_L; ENTER(Z); return (WHILE_kw);}
"with"  {ECHO_L; ENTER(Z); return (WITH_kw);}
"xor"  {ECHO_L; ENTER(Z); return (XOR_kw);}

  -- Match all the compound ParaSail delimiters. 
"=?"  {ECHO_L; ENTER(Z); return(COMPARE);}
"=="  {ECHO_L; ENTER(Z); return(EQ);}
"!="  {ECHO_L; ENTER(Z); return(NEQ);}
">="  {ECHO_L; ENTER(Z); return(GEQ);}
"<="  {ECHO_L; ENTER(Z); return(LEQ);}
"**"  {ECHO_L; ENTER(Z); return(POWER);}
":="  {ECHO_L; ENTER(Z); return(ASSIGN);}
":=:"  {ECHO_L; ENTER(Z); return(SWAP);}
".."  {ECHO_L; ENTER(Z); return(DOT_DOT);}
"::"  {ECHO_L; ENTER(Z); return(DOUBLE_COLON);}
"[["  {ECHO_L; ENTER(Z); return(DOUBLE_LEFT_BRACKET);}
"]]"  {ECHO_L; ENTER(Z); return(DOUBLE_RIGHT_BRACKET);}
"=>"  {ECHO_L; ENTER(Z); return(REFERS_TO);}
"->"  {ECHO_L; ENTER(Z); return(GIVES);}
"==>"  {ECHO_L; ENTER(Z); return(IMPLIES);}
";;"  {ECHO_L; ENTER(Z); return(SEQUENCE);}
"||"  {ECHO_L; ENTER(Z); return(PARALLEL);}
"+="  {ECHO_L; ENTER(Z); return(PLUS_ASSIGN);}
"-="  {ECHO_L; ENTER(Z); return(MINUS_ASSIGN);}
"*="  {ECHO_L; ENTER(Z); return(TIMES_ASSIGN);}
"/="  {ECHO_L; ENTER(Z); return(DIVIDE_ASSIGN);}
"**="  {ECHO_L; ENTER(Z); return(POWER_ASSIGN);}
"|="  {ECHO_L; ENTER(Z); return(CONCAT_ASSIGN);}
"and="  {ECHO_L; ENTER(Z); return(AND_ASSIGN);}
"or="  {ECHO_L; ENTER(Z); return(OR_ASSIGN);}
"xor="  {ECHO_L; ENTER(Z); return(XOR_ASSIGN);}

  -- Match all the ParaSail single-character delimiters.
<IDENT>\'  {ECHO_L; ENTER(Z);     return(PRIME);}
"("        {ECHO_L; ENTER(Z);     return('(');}
")"        {ECHO_L; ENTER(IDENT); return(')');}
"["        {ECHO_L; ENTER(Z);     return('[');}
"]"        {ECHO_L; ENTER(IDENT); return(']');}
"<"        {ECHO_L; ENTER(Z);     return('<');}
">"        {ECHO_L; ENTER(Z);     return('>');}
"{"    {ECHO_L; ENTER(Z);   return('{');}
"}"    {ECHO_L; ENTER(Z);   return('}');}
"*"        {ECHO_L; ENTER(Z);     return('*');}
"+"        {ECHO_L; ENTER(Z);     return('+');}
","        {ECHO_L; ENTER(Z);     return(',');}
"-"        {ECHO_L; ENTER(Z);     return('-');}
"."        {ECHO_L; ENTER(Z);     return('.');}
"/"        {ECHO_L; ENTER(Z);     return('/');}
":"        {ECHO_L; ENTER(Z);     return(':');}
";"        {ECHO_L; ENTER(Z);     return(';');}
"|"        {ECHO_L; ENTER(Z);     return('|');}
"?"        {ECHO_L; ENTER(Z);     return('?');}

  -- The following is used to match all valid ParaSail identifiers
  -- except reserved words. Note that leading digits and underscores
  -- are not allowed and that double underscores are not allowed.

{IDENTIFIER}       {ECHO_L; ENTER(IDENT);return(Identifier);}

  -- Enumeration literals
[#]{IDENTIFIER}    {ECHO_L; ENTER(IDENT);return(Enum_Literal);}

  -- Decimal numeric literals
{DIGIT_SEQUENCE}{EXPONENT}?  {
         ECHO_L; ENTER(Z);
         return(Integer_Literal);}

{DIGIT_SEQUENCE}[.]{DIGIT_SEQUENCE}{EXPONENT}?  {
         ECHO_L; ENTER(Z);
         return(Real_Literal);}

  -- Based numeric literals.

{DIGIT_SEQUENCE}[#]{HEX_SEQUENCE}[#]{EXPONENT}? {
         ECHO_L; ENTER(Z);
         return(Integer_Literal);}

{DIGIT_SEQUENCE}[#]{HEX_SEQUENCE}[.]{HEX_SEQUENCE}[#]{EXPONENT}? {
         ECHO_L; ENTER(Z);
         return(Real_Literal);}

"0"[xX]{HEX_SEQUENCE}  {ECHO_L; ENTER(Z); return(Integer_Literal);}
"0"[bB]{DIGIT_SEQUENCE}  {ECHO_L; ENTER(Z); return(Integer_Literal);}

  -- Match all valid character literals.  See Ada LRM 2.6.

<Z>{CHAR_LITERAL}      {ECHO_L; ENTER(Z); return(Char_Literal);}

  -- Match all valid string literals.  See Ada LRM 2.6.

{STRING_LITERAL}                {ECHO_L; ENTER(Z); return(String_Literal);}

"//".*    {ECHO_L;}           -- ignore comments to end-of-line

"--".*    {ECHO_L;}           -- ignore comments to end-of-line

  -- The following matches all whitespace.  Except for vertical tabs.  AFLEX,
  -- ALEX and LEX does not support vertical tabs.  This causes use to fail
  -- when parsing some of the ACVC Btests.

[ \r\t\f]+ {ECHO_L;}        -- ignore spaces,Carriage returns,tabs,form feeds

  -- The following matches all new lines.

[\n]       {ECHO_L; linenum;}

  -- The following matches everything else and prints an error message
  -- indicating that something unexpected was found.

.          {ECHO_L; 
            text_io.put_line("?? lexical error '" & 
       parasail_lex_dfa.yytext & "' ??");
     num_errors := num_errors + 1;}

%%

with parasail_tokens; 
use  parasail_tokens;
use text_io;

package parasail_lex is
  
  lines      : positive := 1;
  num_errors     : natural := 0;
  Trace          : Boolean := False;

  procedure ECHO_L; --local version_of define_string.
  procedure linenum; 

  function yylex return token;

end parasail_lex;

package body parasail_lex is

  procedure ECHO_L is
  --
  -- Local version of the  define string.
  -- 
  begin
-----    if Trace then
-----      TEXT_IO.PUT_Line("Token :" & YYTEXT & ":");
-----    end if;
 text_io.put(yytext);
  end ECHO_L;


  procedure linenum is
    line_number_string : constant string :=
          integer'image ( lines );
  begin
    lines := lines + 1;
    put(line_number_string);
    for i in 1 .. 5 - integer ( line_number_string'length ) loop
      text_io.put(" ");
    end loop;

  end linenum;

##

end parasail_lex;

Saturday, July 17, 2010

N Queens Problem in ParaSail

Here is a (parallel) solution to the "N Queens" problem in ParaSail, where we try to place N queens on an NxN chess board such that none of them can take each other. This takes the idea of using the "continue" statement as a kind of implicit recursion to its natural conclusion.  This presumes you can turn a "normal" data structure like Vector<> into a concurrent data structure by using the keyword "concurrent," which presumably means that locking is used on all operations to support concurrency.  It is debatable whether this use of a "continue" statement to effectively start the next iteration of a loop almost like a recursive call is easier or harder to understand than true recursion. 

interface N_Queens <N : Univ_Integer := 8> is
    // Place N queens on a checkerboard so that none of them can
    // "take" each other.
    type Row is new Integer<1, N>;
     
    function Place_Queens() -> Vector<Vector<Row>> 
      {for all I in Place_Queens#range : Length(Place_Queens[I]) == N};
end interface N_Queens;

class N_Queens <N : Univ_Integer := 8> is
    type Column is new Integer<1, N>;
    type Sum is Vector<Boolean, Index => Integer<2, 2*N>>;
    type Diff is Vector<Boolean, Index => Integer<1-N, N-1>>;
    
exports
    
    function Place_Queens() -> Vector<Vector<Row>> 
      {for all I in Place_Queens#range : Length(Place_Queens[I]) == N} is
      // Place N queens on checkerboard so that none of them can "take" each other
      var Solutions : concurrent Vector<Vector<Row>> := [];
      
      *Outer_Loop*
      for (C : Column := 1; Trial : Vector<Row> := [];
        Diag1 : SSum := [.. => #false]; Diag2 : Diff := [.. => #false]) loop
          // Iterate over the columns
        
          for R in Row concurrent loop
              // Iterate over the rows
              if not Diag1[R + C] and then not Diag2[R - C] then
                  // Found a Row/Column combination that is not on any diagonal
                  // already occupied.
                  if C < N then
                      // Keep going since haven't reached Nth column.
                      continue loop Outer_Loop with (C => C+1, Trial => Trial | R,
                        Diag1 => Diag1 | [(R+C) => #true],
                        Diag2 => Diag2 | [(R-C) => #true]);
                  else
                      // All done, remember trial result.
                      Solutions |= Trial;
                  end if;
              end if;
          end loop;
          
          
      end loop Outer_Loop;
      return Solutions;
      
    end function Place_Queens;
end class N_Queens;

Thursday, June 24, 2010

Some additional examples of ParaSail

Here is a grab-bag of ParaSail examples, with interspersed musings embedded in the comments.  These are acceptable to the ayacc/aflex-based parser posted earlier.

import XXYZ;  // TBD: Does this precede the interface?
              //      Any notion of "use" or "using"
interface Hello_World <Standard_IO<>> is
    procedure Main(IO : ref Standard_IO);
end interface Hello_World;

// TBD: What do enumeration type declarations look like?
//      The first parameter is an id-list type.  The other
//      parameters might be something relating to first/last,
//      'image, etc.

// TBD: What is the comment syntax?  "//" or "--"?  
//      Modula and ML use "(* *)" but ML folks are suggesting
//      the use of "# " as the start of a comment that goes to the EOL.
//      For some reason I find "#" pretty ugly, especially in the
//      middle of a line.
//      C#, C++, and Java all support "//", as well as the
//      partial line '/* ... */'.  That seems error-prone,
//      so sticking with "//" by itself seems simplest.
//      Also "--" means decrement to half the world.

// Syntax:  module::module, type::operation, type::nested_entity,
//          object.component, object.operation [equiv to type::operation(object, ...)]

// TBD: When a non-abstract interface has no module parameters, does that mean
//      the module name can be used directly as a type name?  Or are the "<>"
//      mandatory?  Can we say "interface Univ_Enumeration<> is ..."

// TBD: Should we make aggregates usable when outside the associated class?
//      What about "extension" aggregates?  Operator "aggregate"?

// TBD: Get rid of "subtype" and just use "type Blah is ..." or "type Blah is new ..."
//      "subtype" is a relationship, not a kind of entity.

interface Std <> is

    abstract interface Any<> is
        operator "end" (Obj : ref var Any);
    end interface Any;
    
    abstract interface Assignable<> extends Any is
        operator "copy" Assignable -> Assignable;
    end interface Assignable;

    -- type Univ_Enumeration is ...; // type with all enumerals
    
    type Boolean is Enum <[#false, #true]>; // True or False
    type Ordering is Enum <[#less, #equal, #greater, #unordered]>;

    -- type Univ_Integer is ...; // infinite precision integers with +/- inf
    -- type Univ_Real is ...;    // infinite precision rational numbers 
         // with +/- zero and inf
    -- type Univ_Character is ...;  // ISO 10646, 31-bit character
    -- type Univ_String is ...;  // UTF-32
    
    
end interface Std;

interface Std::Integer<First, Last : Univ_Integer> is
    operator "from_univ"(Univ : Univ_Integer {Univ in First .. Last}) 
      -> Integer;
    operator "to_univ"(Int : Integer) 
      -> Result: Univ_Integer {Result in First .. Last};

    operator "+"(Left, Right : Integer {[[Left]] + [[Right]] in First .. Last}) 
      -> Result : Integer {[[Result]] == [[Left]] + [[Right]]};
end interface Std::Integer;

// [[x]] means "to_univ"(X), as a nod to the double bracket notation used in
// denotational semantics to define the meaning of an expression in a program.


abstract interface Std::Ref<Target_Type is Any<>> 
is
   function Is_Writable(R : Ref) -> Boolean;
   function Exists(R : Ref) -> Boolean;
   function Fetch(R : Ref {Exists(R)}) -> Target_Type;
   procedure Store(R : Ref {Is_Writable(R)}; 
     New_Contents : Target_Type) {Exists(R)};
   procedure Delete(R : Ref {Is_Writable(R)}) {not Exists(R)};
end interface Std::Ref;

class Hello_World <Context<>; Standard_IO<>> is
  exports
    procedure Main(IO : ref Standard_IO) is
       Put_Line(IO.Output, "Hello World");
    end procedure Main;
end class Hello_World;

interface String<Character<>; Index is Integer<>> is
    operator "in"(Lit : Univ_String) -> Boolean;
    operator "from_univ"(Lit : Univ_String {Lit in String}) 
      -> String;
    operator "to_univ"(Str : String) -> Result : Univ_String
      {Result in String};
    operator "~"(Left, Right : String) -> Result : String
      {Length(Result) == Length(Left) + Length(Right)};
end interface String;
      
interface Standard_IO<Context<>; Input_Stream<>; Output_Stream<>> is
    function Output(Context) -> Output_Stream;
    function Input(Context) -> Input_Stream;
end interface Standard_IO;

interface Output_Stream<String<>> is
    procedure Put(Output : ref Output_Stream; Char : String::Character);
    procedure Put(Output : ref Output_Stream; Message : String);
    procedure Put_Line(Output : ref Output_Stream; Message : String);
end interface Output_Stream;

interface Input_Stream<String<>> is
    function Get(Input : ref Input_Stream) -> String::Character;
    function Get_Line(Output : ref Input_Stream) -> String;
end interface Input_Stream;

interface Integer<First, Last : Univ_Integer> is
    operator "in"(Lit : Univ_Integer) -> Result : Boolean 
      {Result == Lit in First .. Last};
    operator "from_univ"(Univ_Integer {Univ_Integer in First .. Last}) 
      -> Integer;
    operator "to_univ"(Integer) -> Result : Univ_Integer 
      {Result in First .. Last};
    operator "+"(Left, Right : Integer 
      {[[Left]] + [[Right]] in First .. Last}) 
      -> Result : Integer
        {[[Result]] == [[Left]] + [[Right]]};
    operator "-"(Left, Right : Integer
      {[[Left]] - [[Right]] in First .. Last}) 
        -> Integer;
    operator "*"(Left, Right : Integer
      {[[Left]] * [[Right]] in First .. Last}) 
        -> Integer;
    operator "/"(Left, Right : Integer
      {[[Left]] / [[Right]] in First .. Last}) 
        -> Integer;
    operator "=?"(Left, Right : Integer) -> Ordering;
end interface Integer;

interface Character<> is
    operator "in"(Lit : Univ_Character) -> Boolean;
    operator "from_univ"(Lit : Univ_Character {Lit in Character}) 
      -> Character;
    operator "to_univ"(Char : Character) 
      -> Result : Univ_Character 
      {Result in Character};
end interface Character;

interface Vector<Component is Assignable<>; 
   First : Univ_Integer := 1;
   Max_Length : Univ_Integer := +inf> is
    function Length(Vec : ref const Vector) -> Univ_Integer 
      {Length in 0..Max_Length};
    function Last(Vec : ref const Vector) -> Univ_Integer 
      {Last == Length(Vec) + First - 1};
    operator "[]"(Vec : ref Vector; Index : Univ_Integer 
      {Index in First..Last(Vec)}) 
      -> ref Component_Type;
    operator "~"(Left, Right : Vector) -> Result : Vector 
      {Length(Left) + Length(Right) == Length(Result)}; 
        // "~" is pronounced "concatenate"
        // or "connect" or "tied to"
        // or "chained to" or "linked to" or ...?
    operator "~"(Left : Vector; Right : Component) -> Result : Vector
      {Length(Left) + 1 == Length(Result)};
    operator "~"(Left : Component; Right : Vector) -> Result : Vector
      {Length(Right) + 1 == Length(Result)};
    operator "~"(Left, Right : Component) -> Result : Vector
      {Length(Result) == 2}; // Do we really need this one?  
                             // Is equivalent to "[Left, Right]"
    operator "null"() -> Result : Vector {Length(Result) == 0};
    procedure Append(Vec : ref mutable Vector; Value : Component)
      {Length(Vec') == Length(Vec) + 1};
    procedure Append(Vec : ref mutable Vector; Other_Vec : Vector)
      {Length(Vec') == Length(Vec) + Length(Other_Vec)}; 
                                        // Vec' means new value
                                        // do we prefer "old(...)"?
                                        // How about post(...) and pre(...)?
                                        // or "new(...)" and "old(...)"?
                                        // Since these are appearing in
                                        // a postcondition, it seems
                                        // redundant to say "post" or "new".
                                        // In any case "new" reads better than post.

    operator ":="(Left : ref var Vector; Right : Vector)
      {Length(Left) == Length(Right)};
      
    operator ":=:"(Left, Right : ref var Vector)
      {Length(Left') == Length(Right) and 
       Length(Right') == Length(Left)};
    operator "+"(Left, Right : Vector {Length(Left) == Length(Right)}) 
      -> Result : Vector
      {Length(Result) == Length(Left)};  // component-wise addition
    
end interface Vector;

interface Stack<Component is Assignable<>; 
  Size_Type is Integer<>> is
    
    function Max_Stack_Size(S : Stack) -> Size_Type 
      {Max_Stack_Size > 0};
    
    function Count(S : Stack) -> Size_Type 
      {Count <= Max_Stack_Size(S)};
    
    function Init(Max : Size_Type {Max > 0}) -> Stack 
      {Max_Stack_Size(Init) == Max and Count(Init) == 0};
    
    function Is_Empty(S : Stack) -> Boolean
      {Is_Empty == (Count(S) == 0)};
    
    function Is_Full(S : Stack) -> Boolean
      {Is_Full == (Count(S) == Mas_Stack_Size(S))};
      
    function Top(S : Stack {not Is_Empty(S)}) -> Component;
    
    procedure Pop(S : ref var Stack {not Is_Empty(S)}) 
      {Count(S') == Count(S) - 1};
      
    procedure Push(S : ref var Stack {not Is_Full(S)}; 
      X : Component) {Count(S') == Count(S) + 1};
      
end interface Stack;