Here is a new revision (rev 1.6) of the alpha 0.2 release of the ParaSail compiler and virtual machine
http://bit.ly/tJyc5E
The big addition here is a new example, the "drinking philosophers" example (drinking_phils.psl). This example is a variant on the classic "dining philosophers" example, and makes heavy use of timers and complex dequeue conditions. A lot of changes under the hood, but no major new features.
This blog will follow the trials and tribulations of designing a new programming language designed to allow productive development of parallel, high-integrity (safety-critical, high-security) software systems. The language is tentatively named "ParaSail" for Parallel, Specification and Implementation Language.
Monday, November 7, 2011
Monday, October 24, 2011
Alpha release 0.2 of ParaSail Compiler and Virtual Machine
We are now making available alpha release 0.2 (rev 1.5) of the ParaSail Compiler and Virtual Machine:
http://bit.ly/rsxQDX
This release implements several important additional features, including queued parameters for operations on concurrent objects, floating-point and fixed-point numbers along with the universal type Univ_Real, and timers supporting delay for a time interval and delay until a specified timer value, or until a specific wall clock time.
This release also includes two different version of the VM, one which simulates multi-threading on a single O/S thread, and another which uses multiple O/S threads to provide true parallelism on a multi-core/multi-processor machine.
Please read the release notes included in the above download for additional information.
-Tuck
http://bit.ly/rsxQDX
This release implements several important additional features, including queued parameters for operations on concurrent objects, floating-point and fixed-point numbers along with the universal type Univ_Real, and timers supporting delay for a time interval and delay until a specified timer value, or until a specific wall clock time.
This release also includes two different version of the VM, one which simulates multi-threading on a single O/S thread, and another which uses multiple O/S threads to provide true parallelism on a multi-core/multi-processor machine.
Please read the release notes included in the above download for additional information.
-Tuck
Friday, October 14, 2011
Upcoming panels, tutorials, and workshops on ParaSail
ParaSail will be discussed at two panels at the upcoming SPLASH/OOPSLA conference in Portland, Oregon, and will be covered in depth in a tutorial and workshop/birds-of-a-feather session at the upcoming SIGAda conference in Denver.
The panels at SPLASH/OOPSLA are:
The panels at SPLASH/OOPSLA are:
- A panel that is part of the Transitioning to Multicore (TMC) workshop, on Sunday Oct. 23rd, at 2:00PM -- see http://tmc.supertriceratops.com/
- A panel on Multicore, Manycore, and Cloud computing -- Is a new programming language paradigm required?, on Wednesday Oct. 26th, at 4:00PM -- see http://splashcon.org/2011/program/229
- A tutorial on Experimenting with ParaSail, which will be held Monday November 7th, from 9:00AM to 12:30PM -- see http://www.sigada.org/conf/sigada2011/tutorials-descriptions.html#MA1
- A workshop/birds-of-a-feather session on ParaSail, which will be on Tuesday, November 8th, from 4:30PM to 5:30PM -- see http://www.sigada.org/conf/sigada2011/workshops.html
Thursday, October 13, 2011
Deadlocking the Virtual Machine
As mentioned in an earlier posting, ParaSail has its own virtual machine which directly supports the notion of picothreads, very light-weight threads which are served by heavier-weight server processes (see http://parasail-programming-language.blogspot.com/2010/11/virtual-machine-for-parasail-with.html). We are now compiling and executing various interesting ParaSail examples, and we recently began to see some apparent deadlocks in the virtual machine implementation.
The actual ParaSail program (N Queens -- see http://parasail-programming-language.blogspot.com/2011/09/n-queens-join-party.html) does not have any inherent deadlock itself. In fact, it only has one concurrent object, a vector of solutions, and the only locking operation used is to add another solution on to the end of the vector. Nevertheless, we are seeing with a true multi-threaded virtual machine implementation, a deadlock in the virtual machine, such that all forward progress stops. Like many situations in concurrent programs (at least pre-ParaSail...), these kinds of problems are timing dependent. In some cases we can get the N-Queens solver for 8 queens to finish, sometimes not. The version using a single-kernel-thread virtual machine, which merely simulates multi-threading, never has a problem.
So what could be causing the deadlock? After lots of debugging, a hypothesis has emerged. We are now in the process of acting on this hypothesis, so we don't know whether it is correct, or whether making a corresponding fix will solve the entire problem. So here is the hypothesis:
As explained in the posting about the virtual machine (actually, in a comment on that posting), we serve the queue of picothreads associated with a particular server process in a LIFO manner. But when a server steals threads from another server, it picks up the oldest picothread, i.e. uses FIFO. This is essentially identical to the approach used in the Cilk thread scheduler (see
http://supertech.csail.mit.edu/papers/steal.pdf). Our hypothesis is that it is the stealing of picothreads which is creating the deadlock.
So how could the stealing of threads create a deadlock? The hypothesized mechanism begins with a server executing a pictothread that gets a lock on an object. This happens when there is a call on an operation with a locked or queued parameter, or when an individual concurrent object (such as the Solutions vector in N-Queens) is passed to an operation where it is not known to be concurrent (such as the "|=" operation in the Vectors module).
In the virtual machine implementation, getting a lock means acquiring a mutex or equivalent. This mutex is automatically released when the operation completes. However, suppose the operation has internal parallelism. In ParaSail, parallelism is everywhere, and any operation of any complexity is likely to have some internal parallelism. The "|=" operation is relatively complex, particularly if the underlying array needs to be expanded before adding on another element to the vector. So there is plenty of opportunity for some internal parallelism. That is manifested in the virtual machine by one or more additional picothreads being created and queued on the same server's thread, and then eventually, the server waits for the parallel picothreads to complete.
As soon as new picothreads show up on the server's queue, other servers may start stealing them. As mentioned above, stealing works FIFO, i.e. oldest first, but if the original server's queue is essentially empty at the moment when a picothread from the locked operation is queued, then it will be the oldest thread on the queue, and so is a candidate for stealing. So let's presume that this new sub-picothread gets stolen. And perhaps that was the only picothread spawned by the original server. When it gets around to waiting for it to complete, it finds it is not yet done, and in fact its own queue is now empty, so it steals from a third server. So now we have our original server thread which is executing a picothread which is waiting for a subthread, going off and stealing work from some random third server's queue. Now let's suppose that the thread on that third server's queue wants to get the original lock. So the original server is now waiting on the lock that it already holds on behalf of some other picothread. A classic tail-chasing deadlock. Pretty soon all of the other servers stack up behind this same lock and everything grinds to a halt.
So how do we avoid this problem? Fundamentally we don't want a server that is holding a lock, to go off and start executing a thread which is not a sub-thread of the original thread which acquired the lock. If each server and each picothread keeps track of the innermost lock it holds, and locks are chained together so that an inner lock points to the next outer lock, if any, held, then we should be able to avoid the problem. Whenever a server goes looking for a thread, if it holds a lock, then the thread it chooses must also hold that same lock. It might also have some additional lock, but it must, somewhere on the chain of locks starting at the innermost one, have the same lock as that held by the server. This argues for having at least two queues per server, one of picothreads holding no locks, and one of picothreads holding at least one lock. Then when a queue is served by a server holding a lock, either FIFO or LIFO, it will only look at the queue of picothreads holding at least one lock, and it will ignore those on that queue that don't have the server's lock somewhere in their chain of locks.
Deadlocks are fun -- no doubt about it...
The actual ParaSail program (N Queens -- see http://parasail-programming-language.blogspot.com/2011/09/n-queens-join-party.html) does not have any inherent deadlock itself. In fact, it only has one concurrent object, a vector of solutions, and the only locking operation used is to add another solution on to the end of the vector. Nevertheless, we are seeing with a true multi-threaded virtual machine implementation, a deadlock in the virtual machine, such that all forward progress stops. Like many situations in concurrent programs (at least pre-ParaSail...), these kinds of problems are timing dependent. In some cases we can get the N-Queens solver for 8 queens to finish, sometimes not. The version using a single-kernel-thread virtual machine, which merely simulates multi-threading, never has a problem.
So what could be causing the deadlock? After lots of debugging, a hypothesis has emerged. We are now in the process of acting on this hypothesis, so we don't know whether it is correct, or whether making a corresponding fix will solve the entire problem. So here is the hypothesis:
As explained in the posting about the virtual machine (actually, in a comment on that posting), we serve the queue of picothreads associated with a particular server process in a LIFO manner. But when a server steals threads from another server, it picks up the oldest picothread, i.e. uses FIFO. This is essentially identical to the approach used in the Cilk thread scheduler (see
http://supertech.csail.mit.edu/papers/steal.pdf). Our hypothesis is that it is the stealing of picothreads which is creating the deadlock.
So how could the stealing of threads create a deadlock? The hypothesized mechanism begins with a server executing a pictothread that gets a lock on an object. This happens when there is a call on an operation with a locked or queued parameter, or when an individual concurrent object (such as the Solutions vector in N-Queens) is passed to an operation where it is not known to be concurrent (such as the "|=" operation in the Vectors module).
In the virtual machine implementation, getting a lock means acquiring a mutex or equivalent. This mutex is automatically released when the operation completes. However, suppose the operation has internal parallelism. In ParaSail, parallelism is everywhere, and any operation of any complexity is likely to have some internal parallelism. The "|=" operation is relatively complex, particularly if the underlying array needs to be expanded before adding on another element to the vector. So there is plenty of opportunity for some internal parallelism. That is manifested in the virtual machine by one or more additional picothreads being created and queued on the same server's thread, and then eventually, the server waits for the parallel picothreads to complete.
As soon as new picothreads show up on the server's queue, other servers may start stealing them. As mentioned above, stealing works FIFO, i.e. oldest first, but if the original server's queue is essentially empty at the moment when a picothread from the locked operation is queued, then it will be the oldest thread on the queue, and so is a candidate for stealing. So let's presume that this new sub-picothread gets stolen. And perhaps that was the only picothread spawned by the original server. When it gets around to waiting for it to complete, it finds it is not yet done, and in fact its own queue is now empty, so it steals from a third server. So now we have our original server thread which is executing a picothread which is waiting for a subthread, going off and stealing work from some random third server's queue. Now let's suppose that the thread on that third server's queue wants to get the original lock. So the original server is now waiting on the lock that it already holds on behalf of some other picothread. A classic tail-chasing deadlock. Pretty soon all of the other servers stack up behind this same lock and everything grinds to a halt.
So how do we avoid this problem? Fundamentally we don't want a server that is holding a lock, to go off and start executing a thread which is not a sub-thread of the original thread which acquired the lock. If each server and each picothread keeps track of the innermost lock it holds, and locks are chained together so that an inner lock points to the next outer lock, if any, held, then we should be able to avoid the problem. Whenever a server goes looking for a thread, if it holds a lock, then the thread it chooses must also hold that same lock. It might also have some additional lock, but it must, somewhere on the chain of locks starting at the innermost one, have the same lock as that held by the server. This argues for having at least two queues per server, one of picothreads holding no locks, and one of picothreads holding at least one lock. Then when a queue is served by a server holding a lock, either FIFO or LIFO, it will only look at the queue of picothreads holding at least one lock, and it will ignore those on that queue that don't have the server's lock somewhere in their chain of locks.
Deadlocks are fun -- no doubt about it...
Monday, October 3, 2011
Alpha release of ParaSail Compiler and Virtual Machine
We are very pleased to announce the first, alpha, release of the ParaSail prototype Compiler and Virtual Machine. This release includes executables for the Mac, Linux, and Windows. Please visit the ParaSail Google Group to download this release:
http://groups.google.com/group/parasail-programming-language
http://groups.google.com/group/parasail-programming-language
Wednesday, September 14, 2011
N Queens Join the Party
We are continuing to work on the ParaSail compiler. Our latest program to compile and run is a parallel but non-recursive version of a solution to the "N Queens" problem. There is an N_Queens module, with a nested Solution_State module used to represent a partial solution. The algorithm starts at column 1 and works toward column N, branching into parallel threads to evaluate each possibility for the next row, and when finding an acceptable one, circling back to the outer loop to march onto the next column. There is a simple Test_N_Queens func at the bottom. Below that is the result of running the program.
Enjoy!
Here is the result of running Test_N_Queens, which is set up to solve the 8-queens problem. It has also been verified to work with 4 queens (2 solutions), 5 queens (10 solutions), and 9 queens (352 solutions). After "quit"ing, a display of threading statistics is provided.
Enjoy!
interface N_Queens <N : Univ_Integer := 8> is // Place N queens on an NxN checkerboard so that none of them can // "take" each other. type Chess_Unit is new Integer_Range<-100 .. 100>; // An integer type sufficiently big to represent values -100 to +100 const Rows : Range<Chess_Unit> := 1..N; const Columns : Range<Chess_Unit> := 1..N; type Row is Chess_Unit {Row in Rows}; // A subrange in 1..N type Column is Chess_Unit {Column in Columns}; // A subrange in 1..N type Solution is Array<optional Column, Row>; // A "solution" is an array of Column's, indexed by "Row." // It indicates in which Column a queen of the given Row is located // An example solution would be: 2 8 6 1 3 5 7 4 // meaning that the queen in row 1 is at column 2, // the queen in row 2 is at column 8, the queen in // row 3 is at column 6, and so on. func Place_Queens() -> Vector<Solution> {for all Sol of Place_Queens => for all Col of Sol => Col not null}; // Produce a vector of solutions, with the requirement // that for each solution, there are non-null column numbers // specified for each row of the checkerboard. end interface N_Queens; class N_Queens is type Sum_Range is Chess_Unit {Sum_Range in 2..2*N}; // Sum_Range is used for diagonals where the row+column is the // same throughout the diagonal. type Diff_Range is Chess_Unit {Diff_Range in (1-N) .. (N-1)}; // Diff_Range is used for diagonals where row-column is the // same throughout the diagonal. type Sum is Countable_Set<Sum_Range>; // This type of set keeps track of which Sum_Range diagonals // have a queen on them already. type Diff is Countable_Set<Diff_Range>; // This type of set keeps track of which Diff_Range diagonals // have a queen on them already. interface Solution_State<> is // We build up a solution state progressively as we move // across the checkerboard, one column at a time. func Initial_State() -> Solution_State; func Is_Acceptable(S : Solution_State; R : Row) -> Boolean; // Is_Acceptable returns True if the next queen could be // place in row R. func Current_Column(S : Solution_State) -> Column; // Current_Column indicates which column we are working on. func Next_State(S : Solution_State; R : Row) -> Solution_State; // Next_State returns a Solution_State produced by // adding a queen at (Current_Column(S), R). func Final_Result(S : Solution_State; R : Row) -> Solution; // Final_Result returns a result produced by adding a queen // at (Columns.Last, R) to a solution with all other columns // placed. end interface Solution_State; class Solution_State is const C : Column; // Current column const Trial : Solution; // Trial solution, some col#s still null const Diag_Sum : Sum; // Set of "sum" diagonals in use const Diag_Diff : Diff; // Set of "diff" diagnoals in use exports func Initial_State() -> Solution_State is return (C => 1, Trial => Create(Rows, null), Diag_Sum => [], Diag_Diff => []); end func Initial_State; func Is_Acceptable(S : Solution_State; R : Row) -> Boolean is // Is_Acceptable returns True if the next queen could be // place in row R. return S.Trial[R] is null and then (R+S.C) not in S.Diag_Sum and then (R-S.C) not in S.Diag_Diff; end func Is_Acceptable; func Current_Column(S : Solution_State) -> Column is // Current_Column indicates which column we are working on. return S.C; end func Current_Column; func Next_State(S : Solution_State; R : Row) -> Solution_State is // Next_State returns a Solution_State produced by // adding a queen at (Current_Column(S), R). return (C => S.C+1, Trial => S.Trial | [R => S.C], Diag_Sum => S.Diag_Sum | (R+S.C), Diag_Diff => S.Diag_Diff | (R-S.C)); end func Next_State; func Final_Result(S : Solution_State; R : Row) -> Solution is // Final_Result returns a result produced by adding a queen // at (Columns.Last, R) to a solution with all other columns // placed. return S.Trial | [R => S.C]; end func Final_Result; end class Solution_State; exports func Place_Queens() -> Vector<Solution> {for all Sol of Place_Queens => for all Col of Sol => Col not null} // Produce a vector of solutions, with the requirement // that for each solution, there are non-null column numbers // specified for each row of the checkerboard. is var Solutions : concurrent Vector<Solution> := []; *Outer_Loop* for State : Solution_State := Initial_State() loop // Iterate over the columns for R in Rows concurrent loop // Iterate over the rows if Is_Acceptable(State, R) then // Found a Row/Column combination that is not on any diagonal // already occupied. if Current_Column(State) < N then // Keep going since haven't reached Nth column. continue loop Outer_Loop with Next_State(State, R); else // All done, remember trial result with last queen placed Solutions |= Final_Result(State, R); end if; end if; end loop; end loop Outer_Loop; return Solutions; end func Place_Queens; end class N_Queens; func Test_N_Queens() is type Eight_Queens is N_Queens<8>; var Results := Eight_Queens::Place_Queens(); Println("Number of results = " | Length(Results)); for I in 1..Length(Results) forward loop const Result := Results[I]; Print("Result #" | I); for J in 1..Length(Result) forward loop Print(" " | [[Result[J]]]); end loop; Print('\n'); end loop; end func Test_N_Queens;
Here is the result of running Test_N_Queens, which is set up to solve the 8-queens problem. It has also been verified to work with 4 queens (2 solutions), 5 queens (10 solutions), and 9 queens (352 solutions). After "quit"ing, a display of threading statistics is provided.
Parsing aaa.psi Parsing n_queens4.psl ---- Beginning semantic analysis ---- 58 trees in library. Done with First pass. Done with Second pass. Done with Pre codegen pass. Done with Code gen. Command to execute: Test_N_Queens Number of results = 92 Result #1 4 2 5 8 6 1 3 7 Result #2 1 7 4 6 8 2 5 3 Result #3 4 2 8 5 7 1 3 6 Result #4 1 5 8 6 3 7 2 4 Result #5 4 1 5 8 6 3 7 2 Result #6 2 7 3 6 8 5 1 4 Result #7 7 1 3 8 6 4 2 5 Result #8 5 2 4 6 8 3 1 7 Result #9 5 7 1 3 8 6 4 2 Result #10 5 1 8 6 3 7 2 4 Result #11 1 6 8 3 7 4 2 5 Result #12 4 2 7 3 6 8 1 5 Result #13 3 1 7 5 8 2 4 6 Result #14 3 6 2 7 5 1 8 4 Result #15 5 1 4 6 8 2 7 3 Result #16 6 4 2 8 5 7 1 3 Result #17 3 6 2 5 8 1 7 4 Result #18 1 7 5 8 2 4 6 3 Result #19 3 7 2 8 5 1 4 6 Result #20 2 6 1 7 4 8 3 5 Result #21 3 7 2 8 6 4 1 5 Result #22 4 6 8 2 7 1 3 5 Result #23 4 1 5 8 2 7 3 6 Result #24 7 4 2 8 6 1 3 5 Result #25 3 6 8 2 4 1 7 5 Result #26 5 1 8 4 2 7 3 6 Result #27 7 4 2 5 8 1 3 6 Result #28 4 7 5 2 6 1 3 8 Result #29 7 3 8 2 5 1 6 4 Result #30 5 7 2 6 3 1 8 4 Result #31 5 7 2 4 8 1 3 6 Result #32 4 7 3 8 2 5 1 6 Result #33 4 7 1 8 5 2 6 3 Result #34 7 3 1 6 8 5 2 4 Result #35 6 3 7 2 8 5 1 4 Result #36 6 1 5 2 8 3 7 4 Result #37 4 8 1 5 7 2 6 3 Result #38 6 3 1 7 5 8 2 4 Result #39 6 3 7 2 4 8 1 5 Result #40 6 4 1 5 8 2 7 3 Result #41 5 7 2 6 3 1 4 8 Result #42 5 7 1 4 2 8 6 3 Result #43 3 5 2 8 1 7 4 6 Result #44 4 6 1 5 2 8 3 7 Result #45 2 7 5 8 1 4 6 3 Result #46 5 3 1 6 8 2 4 7 Result #47 4 2 7 5 1 8 6 3 Result #48 6 3 1 8 5 2 4 7 Result #49 3 8 4 7 1 6 2 5 Result #50 6 8 2 4 1 7 5 3 Result #51 5 3 1 7 2 8 6 4 Result #52 2 5 7 4 1 8 6 3 Result #53 3 6 2 7 1 4 8 5 Result #54 6 3 1 8 4 2 7 5 Result #55 4 2 8 6 1 3 5 7 Result #56 8 3 1 6 2 5 7 4 Result #57 3 5 8 4 1 7 2 6 Result #58 4 8 1 3 6 2 7 5 Result #59 6 3 5 8 1 4 2 7 Result #60 6 3 7 4 1 8 2 5 Result #61 8 4 1 3 6 2 7 5 Result #62 8 2 5 3 1 7 4 6 Result #63 6 3 5 7 1 4 2 8 Result #64 4 8 5 3 1 7 2 6 Result #65 7 2 6 3 1 4 8 5 Result #66 2 4 6 8 3 1 7 5 Result #67 5 2 4 7 3 8 6 1 Result #68 3 5 2 8 6 4 7 1 Result #69 4 6 8 3 1 7 5 2 Result #70 4 7 5 3 1 6 8 2 Result #71 3 6 4 2 8 5 7 1 Result #72 2 6 8 3 1 4 7 5 Result #73 5 7 4 1 3 8 6 2 Result #74 7 5 3 1 6 8 2 4 Result #75 4 2 7 3 6 8 5 1 Result #76 6 4 7 1 8 2 5 3 Result #77 5 2 6 1 7 4 8 3 Result #78 7 2 4 1 8 5 3 6 Result #79 6 2 7 1 3 5 8 4 Result #80 8 2 4 1 7 5 3 6 Result #81 3 6 8 1 5 7 2 4 Result #82 5 2 8 1 4 7 3 6 Result #83 5 3 8 4 7 1 6 2 Result #84 3 6 8 1 4 7 5 2 Result #85 5 8 4 1 7 2 6 3 Result #86 3 6 4 1 8 5 7 2 Result #87 3 5 7 1 4 2 8 6 Result #88 6 2 7 1 4 8 5 3 Result #89 6 4 7 1 3 5 2 8 Result #90 5 8 4 1 3 6 2 7 Result #91 2 5 7 1 3 8 6 4 Result #92 2 8 6 1 3 5 7 4 Command to execute: quit Threading Statistics: Num_Initial_Thread_Servers : 10 Num_Dynamically_Allocated_Thread_Servers : 0 Max_Waiting_Threads (for server task): 74 Max_Active : 118 Max_Active_Masters : 133 Max_Sub_Threads_Per_Master : 87 Max_Waiting_For_Threads (to finish): 37171
Thursday, September 8, 2011
Another ParaSail example that compiles and runs
Here is a version of parallel Quicksort that now compiles and runs through the ParaSail compiler. This is parallel but non-recursive. It uses a parallel pair of continue loop statements rather than recursion. It uses "slicing" on the array. A[..] takes an array and returns a "slice" of the array that contains all elements of the array. After partitioning the array, the code splits into two threads and then continues with a left "slice" and a right "slice" of the array. The "<=>" operator performs a swap operation.
There is a simple Test_Sort test function at the bottom. It uses a very simple random number generator to populate an array and then sort it (twice).
At the very bottom we have the result of running the program.
There is a simple Test_Sort test function at the bottom. It uses a very simple random number generator to populate an array and then sort it (twice).
At the very bottom we have the result of running the program.
interface Sorting<Array_Type is Indexable<Comparable<>, Countable<>>> is func Quicksort(var A : Array_Type); // Sort Arr according to the sorting op "<" which returns // True if Left must appear before Right in the sorted order. // Before returns False if Left = Right. end interface Sorting; class Sorting is exports func Quicksort(var A : Array_Type) is // Handle short arrays directly. Partition longer arrays. for Arr : Slice<Array_Type> => A[..] while Length(Arr) > 1 loop if Length(Arr) == 2 then if Arr[Arr.Last] < Arr[Arr.First] then // Swap elements Arr[Arr.First] <=> Arr[Arr.Last]; end if; else // Partition array const Mid := Arr[Arr.First + Length(Arr)/2]; var Left := Arr.First; var Right := Arr.Last; until Left > Right loop var New_Left := Right+1; var New_Right := Left-1; block // Find item in left half to swap for I in Left .. Right forward loop if not (Arr[I] < Mid) then // Found an item that can go into right partitition New_Left := I; if Mid < Arr[I] then // Found an item that *must* go into right part exit loop; end if; end if; end loop; || // Find item in right half to swap for J in Left .. Right reverse loop if not (Mid < Arr[J]) then // Found an item that can go into left partitition New_Right := J; if Arr[J] < Mid then // Found an item that *must* go into left part exit loop; end if; end if; end loop; end block; if New_Left > New_Right then // Nothing more to swap // Exit loop and recurse on two partitions Left := New_Left; Right := New_Right; exit loop; end if; // Swap items Arr[New_Left] <=> Arr[New_Right]; // continue looking for items to swap Left := New_Left + 1; Right := New_Right - 1; end loop; // At this point, "Right" is right end of left partition // and "Left" is left end of right partition // and the partitions don't overlap // and neither is the whole array // and everything in the left partition can precede Mid // and everything in the right partition can follow Mid // and everything between the partitions is equal to Mid. {Left > Right; Right < Arr.Last; Left > Arr.First; (for all I in Arr.First .. Right => not (Mid < Arr[I])); (for all J in Left .. Arr.Last => not (Arr[J] < Mid)); (for all K in Right+1 .. Left-1 => not (Mid < Arr[K]) and not (Arr[K] < Mid))} then // continue with two halves in parallel continue loop with Arr[Arr.First .. Right]; || continue loop with Arr[Left .. Arr.Last]; end if; end loop; end func Quicksort; end class Sorting; func Random(var Seed : Univ_Integer; Mult, Mod : Univ_Integer) -> Univ_Integer is Seed := Seed * Mult mod Mod; return Seed; end func Random; func Test_Sort(Len : Univ_Integer) is // For Random const Mult := 7**5; const Mod := 2**31 - 1; var Seed := Len; Println("Seed = " | Seed | ", Mult = " | Mult | ", Mod = " | Mod); type My_Sorter is Sorting<Vector<Univ_Integer>>; var Vec : Vector<Univ_Integer> := []; for I in 1..Len loop Vec |= Random(Seed, Mult, Mod) mod 100; end loop; var Vec2 := Vec; Println("Before sort, Vec = "); for I in 1 .. Length(Vec) forward loop Print(" " | Vec[I]); if I < Length(Vec) then Print(","); end if; end loop; Print("\n"); My_Sorter::Quicksort(Vec); Println("After sort, Vec = "); for I in 1 .. Length(Vec) forward loop Print(" " | Vec[I]); if I < Length(Vec) then Print(","); end if; end loop; Print("\n"); My_Sorter::Quicksort(Vec2); Println("After 2nd sort, Vec2 = "); for I in 1 .. Length(Vec2) forward loop Print(" " | Vec2[I]); if I < Length(Vec2) then Print(","); end if; end loop; Print("\n"); end func Test_Sort;Here is the result of compiling and running the program. "aaa.psi" contains the standard ParaSail library of modules. "qsort6.psl" is given above. We are using the compiler interactively here, so we can run stand-alone ParaSail operations. It also can run non-interactively, by giving "-command <operation> <param1> ..." after the list of source file names on the command line. When in interactive mode, it dumps out some virtual machine threading statistics after the user "quit"s. These are at the very bottom. In this example, we have 21 threads active simultaneously at one point during execution.
% pslc aaa.psi qsort6.psl Parsing aaa.psi Parsing qsort6.psl ---- Beginning semantic analysis ---- 52 trees in library. Done with First pass. Done with Second pass. Done with Pre codegen pass. Done with Code gen. Command to execute: Test_Sort 20 Seed = 20, Mult = 16807, Mod = 2147483647 Before sort, Vec = 40, 86, 55, 37, 30, 52, 80, 49, 49, 34, 71, 30, 88, 40, 93, 90, 29, 80, 71, 93 After sort, Vec = 29, 30, 30, 34, 37, 40, 40, 49, 49, 52, 55, 71, 71, 80, 80, 86, 88, 90, 93, 93 After 2nd sort, Vec2 = 29, 30, 30, 34, 37, 40, 40, 49, 49, 52, 55, 71, 71, 80, 80, 86, 88, 90, 93, 93 Command to execute: quit Threading Statistics: Num_Initial_Thread_Servers : 10 Num_Dynamically_Allocated_Thread_Servers : 0 Max_Waiting_Threads (for server task): 6 Max_Active : 21 Max_Active_Masters : 17 Max_Sub_Threads_Per_Master : 6 Max_Waiting_For_Threads (to finish): 113
Wednesday, September 7, 2011
ParaSail compiler reaches a milestone
The ParaSail compiler has reached a bit of a milestone, in that it now can compile and run what might be considered to be "interesting" programs. Annotations are not yet enforced, and explicitly concurrent lock-based objects are not yet there, but many other things are working, including implicit and explicit parallelism, dynamic storage management using expandable/shrinkable objects (rather than pointers and garbage collection), fully-shared generic template instantiations, etc. We are trying to solidify the features that are currently supported so we can make a version of the prototype compiler available for experimentation by others within the next couple of weeks.
Below is one of the "interesting" programs. This one is an implementation of balanced "AA" trees. This doesn't illustrate any explicit parallelism, but does illustrate creating and updating interesting data structures without using explicit pointers. Pointer-free data structures allow the compiler to easily understand any aliasing, since there is no hidden sharing between distinct objects. That allows the compiler to eliminate race conditions at compile-time.
Here is the interface to the AA_Tree module. (As a side-note, the program that "HTML"-ized this code was itself written in ParaSail. I've included it at the bottom of this posting.)
Below is one of the "interesting" programs. This one is an implementation of balanced "AA" trees. This doesn't illustrate any explicit parallelism, but does illustrate creating and updating interesting data structures without using explicit pointers. Pointer-free data structures allow the compiler to easily understand any aliasing, since there is no hidden sharing between distinct objects. That allows the compiler to eliminate race conditions at compile-time.
Here is the interface to the AA_Tree module. (As a side-note, the program that "HTML"-ized this code was itself written in ParaSail. I've included it at the bottom of this posting.)
interface AA_Tree<Element is Comparable<>> is // This module implements a balanced "AA" tree, originally // described by Arne Andersson in the "Proceedings of the Workshop // on Algorithms and Data Structures," pp 60-71, Springer Verlag, 1993. // The following algorithm and descriptions were taken from the // WikiPedia article on AA_Tree: // http://en.wikipedia.org/wiki/AA_tree // Note that various additional checks for a null tree have been added. // Only two operations are needed for maintaining balance in an AA tree. // These operations are called skew and split. Skew is a right rotation // when an insertion or deletion creates a left horizontal link. Split // is a conditional left rotation when an insertion or deletion creates two // horizontal right links, which once again corresponds to two // consecutive red links in red-black trees. op "[]"() -> optional AA_Tree; // Create an empty tree func Insert(var T : optional AA_Tree; X : Element); // input: X, the value to be inserted, and // T, the root of the tree to insert it into. // output: A balanced T' including X. func Delete(var T : optional AA_Tree; X : Element); // input: X, the value to delete, and T, // the root of the tree from which it should be deleted. // output: T', balanced, without the value X. op "in"(X : Element; T : optional AA_Tree) -> Boolean; func Overlapping(T : optional AA_Tree; X : Element) -> optional Element; // input: X, the value to find, and T, // the root of the tree to be searched. // output: the element equal to or "unordered" relative to X. op "|="(var T : optional AA_Tree; X : Element); func First(T : optional AA_Tree) -> optional Element; // Return first (smallest) element in tree func Last(T : optional AA_Tree) -> optional Element; // Return last (greatest) element in tree func Remove_First(var T : optional AA_Tree) -> optional Element; // Remove first (smallest) element in tree func Remove_Last(var T : optional AA_Tree) -> optional Element; // Remove last (greatest) element in tree func Remove_Any(var T : optional AA_Tree) -> optional Element; // Remove some element from tree func Count(T : optional AA_Tree) -> Univ_Integer; // Return a count of the nodes in the tree end interface AA_Tree;Here is the class that provides the implementation for the AA_Tree module:
class AA_Tree is var Left : optional AA_Tree; var Right : optional AA_Tree; var Value : Element; var Level : Univ_Integer := 0; func Node(Value : Element; Level : Univ_Integer; Left : optional AA_Tree; Right : optional AA_Tree) -> AA_Tree is // Create a new tree. return (Left => Left, Right => Right, Value => Value, Level => Level); end func Node; func Is_Leaf(T : optional AA_Tree) -> Boolean is return T not null and then T.Left is null and then T.Right is null; end func Is_Leaf; func Leftmost(ref T : optional AA_Tree) -> ref optional AA_Tree is for L => T loop if L not null and then L.Left not null then // Continue with Left until we reach null continue loop with L.Left; else // Found left-most return L; end if; end loop; end func Leftmost; func Successor(T : optional AA_Tree) -> optional Element is // Return element in tree greater than but closest to T.Value if T.Right not null then return Leftmost(T.Right).Value; else return null; end if; end func Successor; func Rightmost(ref T : optional AA_Tree) -> ref optional AA_Tree is for R => T loop if R not null and then R.Right not null then // Keep following down Right side continue loop with R.Right; else // Found right-most return R; end if; end loop; end func Rightmost; func Predecessor(T : optional AA_Tree) -> optional Element is // Return element in tree less than but closest to T.Value if T.Left not null then return Rightmost(T.Left).Value; else return null; end if; end func Predecessor; func Skew(var T : optional AA_Tree) is // input: T, a node representing an AA tree that needs to be rebalanced. // output: T' Another node representing the rebalanced AA tree. if T not null and then T.Left not null and then T.Left.Level == T.Level then // The current T.Left becomes new root // Exchange value of T.Left with root T.Value <=> T.Left.Value; // Move old root and T.Left.Right over to right side of tree T.Left.Right <=> T.Right; T.Left.Left <=> T.Right; T.Left <=> T.Right; end if; end func Skew; func Split(var T : optional AA_Tree) is // input: T, a node representing an AA tree that needs to be rebalanced. // output: T' Another node representing the rebalanced AA tree. if T not null and then T.Right not null and then T.Right.Right not null and then T.Level == T.Right.Right.Level then // T.Right becomes the new root // Exchange value and level between root and T.Right T.Value <=> T.Right.Value; T.Level <=> T.Right.Level; // Move old root and T.Right.Left to left side of tree T.Left <=> T.Right.Right; T.Right.Left <=> T.Right.Right; T.Left <=> T.Right; // Increment level T.Level += 1; end if; end func Split; func Decrease_Level(var T : optional AA_Tree) is // input: T, a tree for which we want to remove links that skip levels. // output: T with its level decreased. if T is null then return; end if; var Should_Be : Univ_Integer := 1; if T.Left not null then Should_Be := T.Left.Level + 1; end if; if T.Right not null then Should_Be := Min(Should_Be, T.Right.Level + 1); end if; if Should_Be < T.Level then T.Level := Should_Be; if T.Right not null and then Should_Be < T.Right.Level then T.Right.Level := Should_Be; end if; end if; end func Decrease_Level; exports op "[]"() -> optional AA_Tree is // Create an empty tree return null; end op "[]"; // Insertion begins with the normal binary tree search and insertion // procedure. Then, as the call stack unwinds (assuming a recursive // implementation of the search), it's easy to check the validity of the // tree and perform any rotations as necessary. If a horizontal left link // arises, a skew will be performed, and if two horizontal right links // arise, a split will be performed, possibly incrementing the level of the // new root node of the current subtree. Note, in the code as given above, // the increment of T.Level. This makes it necessary to continue checking // the validity of the tree as the modifications bubble up from the leaves. func Insert(var T : optional AA_Tree; X : Element) is // input: X, the value to be inserted, and // T, the root of the tree to insert it into. // output: A balanced T' including X. // Do the normal binary tree insertion procedure. // Set the result of the recursive call to the correct // child in case a new node was created or the // root of the subtree changes. if T is null then // Create a new leaf node with X. T := Node(X, 1, null, null); return; end if; case X =? T.Value of [#less] => Insert(T.Left, X); [#greater] => Insert(T.Right, X); [#equal | #unordered] => // Note that the case of X == T.Value is unspecified. // As given, an insert will have no effect. // The implementor may desire different behavior. return; end case; // Perform skew and then split. // The conditionals that determine whether or // not a rotation will occur or not are inside // of the procedures, as given above. Skew(T); Split(T); end func Insert; // As in most balanced binary trees, the deletion of an internal node can // be turned into the deletion of a leaf node by swapping the internal node // with either its closest predecessor or successor, depending on which are // in the tree or on the implementor's whims. Retrieving a predecessor is // simply a matter of following one left link and then all of the remaining // right links. Similarly, the successor can be found by going right once // and left until a null pointer is found. Because of the AA property of // all nodes of level greater than one having two children, the successor // or predecessor node will be in level 1, making their removal trivial. // // To re-balance a tree, there are a few approaches. The one described by // Andersson in his original paper is the simplest, and it is described // here, although actual implementations may opt for a more optimized // approach. After a removal, the first step to maintaining tree validity // is to lower the level of any nodes whose children are two levels below // them, or who are missing children. Then, the entire level must be skewed // and split. This approach was favored, because when laid down // conceptually, it has three easily understood separate steps: // // Decrease the level, if appropriate. // Skew the level. // Split the level. // // However, we have to skew and split the entire level this time instead of // just a node, complicating our code. func Delete(var T : optional AA_Tree; X : Element) is // input: X, the value to delete, and T, // the root of the tree from which it should be deleted. // output: T', balanced, without the value X. if T is null then // Not in tree -- should we complain? return; end if; case X =? T.Value of [#less] => Delete(T.Left, X); [#greater] => Delete(T.Right, X); [#equal] => // If we're a leaf, easy, otherwise reduce to leaf case. if Is_Leaf(T) then T := null; elsif T.Left is null then // Get successor value and delete it from right tree, // and set root to have that value const Succ := Successor(T); Delete(T.Right, Succ); T.Value := Succ; else // Get predecessor value and delete it from left tree, // and set root to have that value const Pred := Predecessor(T); Delete(T.Left, Pred); T.Value := Pred; end if; [#unordered] => // Not in tree; should we complain? return; end case; // Rebalance the tree. Decrease the level of all nodes in this level if // necessary, and then skew and split all nodes in the new level. if T is null then return; end if; Decrease_Level(T); Skew(T); Skew(T.Right); if T.Right not null then Skew(T.Right.Right); end if; Split(T); Split(T.Right); end func Delete; op "in"(X : Element; T : optional AA_Tree) -> Result : Boolean is for P => T while P not null loop case X =? P.Value of [#less] => continue loop with P.Left; [#greater] => continue loop with P.Right; [#equal] => return #true; [#unordered] => return #false; end case; end loop; return #false; // Not found end op "in"; func First(T : optional AA_Tree) -> optional Element is // Return first (smallest) element in tree if T is null then return null; else return Leftmost(T).Value; end if; end func First; func Last(T : optional AA_Tree) -> optional Element is // Return last (greatest) element in tree if T is null then return null; else return Rightmost(T).Value; end if; end func Last; func Remove_First(var T : optional AA_Tree) -> Result : optional Element is // Remove first (smallest) element in tree Result := First(T); if Result not null then Delete(T, Result); end if; end func Remove_First; func Remove_Last(var T : optional AA_Tree) -> Result : optional Element is // Remove last (greatest) element in tree Result := Last(T); if Result not null then Delete(T, Result); end if; end func Remove_Last; func Remove_Any(var T : optional AA_Tree) -> Result : optional Element is // Remove some element from tree if T is null then return null; end if; Result := T.Value; if Result not null then Delete(T, Result); end if; end func Remove_Any; func Count(T : optional AA_Tree) -> Univ_Integer is // Return a count of the nodes in the tree if T is null then return 0; else return Count(T.Left) + Count(T.Right) + 1; end if; end func Count; op "|="(var T : optional AA_Tree; X : Element) is Insert(T, X); end op "|="; func Overlapping(T : optional AA_Tree; X : Element) -> optional Element is // input: X, the value to find, and T, // the root of the tree to be searched. // output: the element equal to or "unordered" relative to X. if T is null or else T.Value is null then return null; else case X =? T.Value of [#less] => return Overlapping(T.Left, X); [#greater] => return Overlapping(T.Right, X); [#equal | #unordered] => // Close enough return T.Value; end case; end if; end func Overlapping; end class AA_Tree;Here is a small test program for AA_Tree:
func Test_AA_Tree(A : Univ_Integer; B : Univ_Integer; C : Univ_Integer) is type Univ_Tree is AA_Tree<Univ_Integer>; var T : Univ_Tree := []; var X : Univ_Integer := A; Insert(T, A); Println("Count = " | Count(T) | " after insert of " | A); Insert(T, B); Println("Count = " | Count(T) | " after insert of " | B); Insert(T, C); Println("Count = " | Count(T) | " after insert of " | C); Insert(T, A); Println("Count = " | Count(T) | " after another insert of " | A); Println(A | " in T = " | [[A in T]]); Println(B | " in T = " | [[B in T]]); Println(C | " in T = " | [[C in T]]); Println("7 in T = " | [[7 in T]]); for E := Remove_First(T) then Remove_First(T) while E not null loop Println("Remove_First = " | E); end loop; Println("Count after loop : " | Count(T)); for I in 1..10 forward loop Insert(T, I); Println("Count = " | Count(T) | " after insert of " | I); end loop; for L := Remove_Last(T) then Remove_Last(T) while L not null loop Println("Remove_Last = " | L); end loop; Println("Count after loop : " | Count(T)); for J in 1..10 reverse loop Insert(T, J); Println("Count = " | Count(T) | " after insert of " | J); end loop; Println("Count after loop : " | Count(T)); Println("Overlapping(T, 5) = " | Overlapping(T, 5)); for Z := Remove_Any(T) then Remove_Any(T) while Z not null loop Println("Remove_Any = " | Z); end loop; Println("Count after loop : " | Count(T)); for K in 1..10 loop Insert(T, K); Println("Count = " | Count(T) | " after insert of " | K); end loop; for F := Remove_First(T) then Remove_First(T) while F not null loop Println("Remove_First = " | F); end loop; Println("Count after loop : " | Count(T)); end func Test_AA_Tree;Here is the ParaSail program that was used to "HTML"-ize the listings above (as well as the one below!). Not terribly sophisticated, but illustrates some of the character handling:
func Html_Escape(C : Univ_Character) -> Univ_String is // Do single-character escapes case C of ['<'] => return "<"; ['>'] => return ">"; ['&'] => return "&"; ['\\'] => return "\\"; // Prevent "Print" from expanding control chars [..] => return "" | C; // Convert character into a string end case; end func Html_Escape; func Htmlize_Line(Orig : Univ_String) -> Result : Univ_String is // bold lower case words that are not in comments or // character, string, or enum literals Result := ""; var I := 1; const L := Length(Orig); var State : Univ_Enumeration := #other; while I <= L loop var C := Orig[I]; case C of ['\\'] => if State == #string_literal or else State == #char_literal then // skip next character no matter what it is if I < L then Result |= Html_Escape(C); I += 1; C := Orig[I]; end if; end if; ['/'] => if I < L and then Orig[I+1] == '/' then // comment, italicize it Result |= "<i>" | Html_Escape(C); while I < L loop I += 1; Result |= Html_Escape(Orig[I]); end loop; Result |= "</i>"; C := null; end if; ['a' .. 'z'] => if State == #other then // Presume this is a reserved word, so bold it Result |= "<b>" | C; while I < L and then Orig[I+1] in 'a' .. 'z' loop I += 1; Result |= Orig[I]; end loop; Result |= "</b>"; C := null; end if; ['#' | '0'..'9' | 'A' .. 'Z' | '_'] => // Presume this is not a reserved word, pass through as is if State == #other then State := #unreserved; end if; ['"'] => if State == #string_literal then // End of string literal State := #other; elsif State != #char_literal then State := #string_literal; end if; ['\''] => if State == #char_literal then // End of char literal State := #other; elsif State != #string_literal then State := #char_literal; end if; [..] => if State != #string_literal and then State != #char_literal then State := #other; end if; end case; if C not null then Result |= Html_Escape(C); end if; I += 1; end loop; Println(Result); end func Htmlize_Line; func Htmlize() is // Htmlize the standard input, // putting the result on the standard output Println("<pre>"); loop const Line : Univ_String := Readln(); if Line is null then // End of file indicated by "null" exit loop; end if; Htmlize_Line(Line); end loop; Println("</pre>"); end func Htmlize;
Friday, September 2, 2011
ParaSail on Intel Parallel Programming Talk video
The talk about ParaSail at OSCON 2011 generated a nice bit of
media attention, and as a result we were invited to be on the Intel
Software Network "Parallel Programming Talk" video show. The show is
now online at:
http://software.intel.com/en-u s/blogs/2011/08/26/parasail-a- new-programming-language-paral lel-programming-talk-120/
There is a relatively long intro discussion of about 8 minutes, and then we get into the nitty gritty of ParaSail.
http://software.intel.com/en-u
There is a relatively long intro discussion of about 8 minutes, and then we get into the nitty gritty of ParaSail.
Saturday, July 30, 2011
A load-and-null-out operation
As we have continued to write more programs in ParaSail, we have found the pointer-free expandable-object data structuring primitives very interesting to use. A simple tree structure can be defined as:
In most cases you would keep this structure private by declaring it inside the class defining the Binary_Tree module rather than declaring it in the externally-visible interface, but this illustrates the basic structuring technique, using optional to allow recursive structures to be created.
Because there are no pointers, only sub-objects, essentially any arbitrarily complex data structure can be copied using a simple assignment:
This will create a copy of Original_Tree, no matter how complex it is. Similarly, we can expand a tree by assigning a new tree into its Left or Right subtree:
So what did the above two assignments do? Well they made a complete copy of the Original_Tree, and then assigned that copy into the left subtree of the Original_Tree, wiping out whatever was in the left subtree before. But we now have a complete copy of what was in the Original_Tree.Left before in Original_Tree.Left.Left. And Original_Tree.Left.Right is now identical in structure to Original_Tree.Right. We could have done this without explicitly declaring the intermediate "Copy_Of_Tree" object, with the same effect:
The old contents of the left-hand side of an assignment are wiped out, and automatically reclaimed immediately -- no need for the user to write a deep reclaim operation. We don't need to wait for a garbage collector to decide whether it is safe to reclaim the storage associated with the old contents, since there is no sharing between objects in ParaSail. If we overwrite the contents of an object or part of an object, whatever was there before can be reclaimed without danger of creating any dangling references.
Here is another example:
Given this situation, which seems like it might come up fairly frequently when making structural modifications to a data structure (such as rebalancing the tree), or destructively removing elements from a data structure (e.g. "Remove_First(Queue)"), we believe it might be useful to have a notion of load-and-null-out. That is, we want the value of Original.Left_Tree, but after fetching the value, we are "done" with it, so set it to null. This means we may not need to make a copy of the subobject, since it has been carved out of its old enclosing object, with a null left behind.
We considered various notations for representing load-and-null-out, and came up with the notion of a deferred assignment, that is an assignment where before we overwrite the left-hand-side, we fetch its old value and return it. This is reminiscent of the compare-and-swap operation, or fetch-and-add, or the post increment X++ of C (and PDP-11/Vax-11 fame, for those familiar with those older DEC machines). Here is a tentative notation for a deferred assignment:
What the "::= null" operation means is to fetch the value of Original_Tree.Left, and after doing so, assign it to null. One could imagine generalizing this extra ':' notation to support post-increment:
This would mean fetch the value of I, and after doing so, apply the "+= 1" operation to it. It is not obvious whether this generalization is a good idea, or just cute. But the deferred assignment of null seems like an operation that will be quite useful in the pointer-free world of ParaSail, to provide more control when restructuring a complex object.
interface Binary_Tree<Element_Type is Assignable<>> is var Left, Right : optional Binary_Tree; var Contents : Element_Type; end interface Binary_Tree;
In most cases you would keep this structure private by declaring it inside the class defining the Binary_Tree module rather than declaring it in the externally-visible interface, but this illustrates the basic structuring technique, using optional to allow recursive structures to be created.
Because there are no pointers, only sub-objects, essentially any arbitrarily complex data structure can be copied using a simple assignment:
var Copy_Of_Tree : optional Binary_Tree := Original_Tree;
This will create a copy of Original_Tree, no matter how complex it is. Similarly, we can expand a tree by assigning a new tree into its Left or Right subtree:
Original_Tree.Left := Copy_Of_Tree;
So what did the above two assignments do? Well they made a complete copy of the Original_Tree, and then assigned that copy into the left subtree of the Original_Tree, wiping out whatever was in the left subtree before. But we now have a complete copy of what was in the Original_Tree.Left before in Original_Tree.Left.Left. And Original_Tree.Left.Right is now identical in structure to Original_Tree.Right. We could have done this without explicitly declaring the intermediate "Copy_Of_Tree" object, with the same effect:
Original_Tree.Left := Original_Tree;I'm not sure whether what we just did was of any use, but it does give you a feel for the power of expandable objects. The user doesn't have to write explicitly any deep copy operation, since from ParaSail's point of view, these are just subobjects, and ":=" assigns the whole thing, including all of the subobjects.
The old contents of the left-hand side of an assignment are wiped out, and automatically reclaimed immediately -- no need for the user to write a deep reclaim operation. We don't need to wait for a garbage collector to decide whether it is safe to reclaim the storage associated with the old contents, since there is no sharing between objects in ParaSail. If we overwrite the contents of an object or part of an object, whatever was there before can be reclaimed without danger of creating any dangling references.
Here is another example:
Original_Tree := Original_Tree.Left;This wipes out the old contents of Original _Tree and replaces it with a copy of its left subtree. But there seems to be some unnecessary copying going on here. The underlying sequence is:
- Make a copy of Original_Tree.Left
- Reclaim the old contents of Original_Tree
- Overwrite it with the copy of Original_Tree.Left
Given this situation, which seems like it might come up fairly frequently when making structural modifications to a data structure (such as rebalancing the tree), or destructively removing elements from a data structure (e.g. "Remove_First(Queue)"), we believe it might be useful to have a notion of load-and-null-out. That is, we want the value of Original.Left_Tree, but after fetching the value, we are "done" with it, so set it to null. This means we may not need to make a copy of the subobject, since it has been carved out of its old enclosing object, with a null left behind.
We considered various notations for representing load-and-null-out, and came up with the notion of a deferred assignment, that is an assignment where before we overwrite the left-hand-side, we fetch its old value and return it. This is reminiscent of the compare-and-swap operation, or fetch-and-add, or the post increment X++ of C (and PDP-11/Vax-11 fame, for those familiar with those older DEC machines). Here is a tentative notation for a deferred assignment:
Original_Tree := (Original_Tree.Left ::= null);
What the "::= null" operation means is to fetch the value of Original_Tree.Left, and after doing so, assign it to null. One could imagine generalizing this extra ':' notation to support post-increment:
X := A[(I :+= 1)];
This would mean fetch the value of I, and after doing so, apply the "+= 1" operation to it. It is not obvious whether this generalization is a good idea, or just cute. But the deferred assignment of null seems like an operation that will be quite useful in the pointer-free world of ParaSail, to provide more control when restructuring a complex object.
Monday, July 25, 2011
Adapting one interface to implement another
We bumped into an interesting problem in ParaSail recently. We have an abstract module Countable which has the following interface:
Other examples of "countable" types are representations of time, where you can add and subtract some number of clock ticks, or compute the number of clock ticks between two times, even if the "time" type itself isn't necessarily represented as a single integer. And in general any ordered enumeration type (such as Enum<[#monday,#tuesday,#wednesday, ...]>) might want to be considered Countable, so that intervals can be defined and manipulated.
A Countable_Set abstraction would be useful as a way to represent sets of Countable elements, while still efficiently representing contiguous ranges of values such as "1..1000" as part of the set. For example:
More generally, we can see a situation where a useful abstraction, such as a Countable_Set, expects a type parameter such as Countable<>, and the type we have is close but is missing some of the required operations, and we don't really want to add the missing operations for general use because of ambiguity, or they don't quite do the right thing, or whatever. Ideally we would want to make them available for implementing a given interface or interfaces, but not for general use.
Some languages provide mechanisms for addressing problems like these. In Eiffel, as part of inheriting from another class the derived class can rename some of the operations. With that approach, on inheriting these Countable operations we would rename them to something like "Countable_Plus" and "Countable_Minus" and then define them to do whatever we want. In C++ it is possible to disambiguate particular operations by specifying from which base type they are inherited.
In ParaSail we are pursuing somewhat of a different approach. Essentially we are allowing a module to publicly declare that it is implementing various operations that might be needed to be considered "Countable" or "Hashable" or whatever, while not making them available for "general" use. Here is how the ParaSail Integer module solves the Countable conundrum:
abstract interface Countable<> is op "+"(Left : Countable; Right : Univ_Integer) -> Countable; op "+"(Left : Univ_Integer; Right : Countable) -> Countable; op "-"(Left : Countable; Right : Univ_Integer) -> Countable; op "-"(Left, Right : Countable) -> Univ_Integer; op "=?"(Left, Right : Countable) -> Ordering; end interface Countable;This interface is used when defining a "countable interval" such as "X .. Y" where you want to be able to iterate from X up to Y, even though X and Y are not themselves of an integer type. For example, if X and Y are characters, it makes sense to define an interval such as 'a' .. 'z' and there is a desire to be able to go from 'a' to the next character in the interval, so by using the "+" operator from Countable that is easy to do. Just add 1. Similarly, we can iterate through the interval in reverse by subtracting 1. Or we can find out how big is the interval X..Y by computing (Y - X) + 1, where we are using the second "-" operator.
Other examples of "countable" types are representations of time, where you can add and subtract some number of clock ticks, or compute the number of clock ticks between two times, even if the "time" type itself isn't necessarily represented as a single integer. And in general any ordered enumeration type (such as Enum<[#monday,#tuesday,#wednesday, ...]>) might want to be considered Countable, so that intervals can be defined and manipulated.
A Countable_Set abstraction would be useful as a way to represent sets of Countable elements, while still efficiently representing contiguous ranges of values such as "1..1000" as part of the set. For example:
interface Countable_Set<Element_Type is Countable<>> is op ".."(Left, Right : Element_Type) -> Countable_Set; op "|"(Left, Right : Countable_Set) -> Countable_Set; op "in"(Left : Element_Type; Right : Countable_Set) -> Boolean; func Count(CS : Countable_Set) -> Univ_Integer; ... end interface Countable_Set;Implementing the Count function clearly will need to make use of the "+" or "-" Countable operations. But this now brings us to the question, is Integer itself a Countable type? Can we legally write Countable_Set<Integer>? Well if we look at the binary "+" and "-" operators for Integer we see:
interface Integer<...> is op "+"(Left, Right : Integer) -> Integer; op "-"(Left, Right : Integer) -> Integer; ... end interface Integer;These don't actually match the operators expected for a Countable type. And if we were to simply add the missing operators, we would create significant ambiguity, since "X + 1" could either resolve to the existing Integer,Integer->Integer "+" or to the added Countable one, Integer,Univ_Integer->Integer "+". Not ideal.
More generally, we can see a situation where a useful abstraction, such as a Countable_Set, expects a type parameter such as Countable<>, and the type we have is close but is missing some of the required operations, and we don't really want to add the missing operations for general use because of ambiguity, or they don't quite do the right thing, or whatever. Ideally we would want to make them available for implementing a given interface or interfaces, but not for general use.
Some languages provide mechanisms for addressing problems like these. In Eiffel, as part of inheriting from another class the derived class can rename some of the operations. With that approach, on inheriting these Countable operations we would rename them to something like "Countable_Plus" and "Countable_Minus" and then define them to do whatever we want. In C++ it is possible to disambiguate particular operations by specifying from which base type they are inherited.
In ParaSail we are pursuing somewhat of a different approach. Essentially we are allowing a module to publicly declare that it is implementing various operations that might be needed to be considered "Countable" or "Hashable" or whatever, while not making them available for "general" use. Here is how the ParaSail Integer module solves the Countable conundrum:
interface Integer<...> is op "+"(Left, Right : Integer) -> Integer; op "-"(Left, Right : Integer) -> Integer; ... implements for Countable op "+"(Left : Integer; Right : Univ_Integer) -> Integer; op "+"(Left : Univ_Integer; Right : Integer) -> Integer; op "-"(Left : Integer; Right : Univ_Integer) -> Integer; op "-"(Left, Right : Integer) -> Univ_Integer; end interface Integer;You can also omit the "for Countable" part and then the operations are available to implement any other module's interface. But the key thing is that you cannot call these operations directly on an Integer type. You can only invoke them when "viewing" an Integer as a "Countable" object. You can have multiple sections after "implements" each section starting with "for Interface1, Interface2, ..." if there are several different sets of operations that need to be implemented to satisfy the needs for different sets of interfaces. In general situations like this are expected to be rare, but when you bump into one, it is useful to have a way to avoid the ambiguity and still satisfy the needs of the other interface.
Saturday, July 23, 2011
Another slight shift in syntax
As we use ParaSail more, we are making some additional small changes in syntax. The first one has to do with allowing a component to be declared as a reference to another object. The ability to return a short-lived object reference is important for user-defined indexing. We also imagine that certain data structures will want to incorporate references as components. Such data structures will not allow assignment, but nevertheless they can be useful for iterating over some other data structure. For example, one might have a complex tree structure, and then an iterator over the structure which contains a reference to the current item in the tree, and references to the enclosing nodes in the tree, so that given the iterator one can generate a reference to the next element of the tree (presuming some ordering such as depth-first left-to-right).
Currently, object references are recognized by using "=>" rather than ":=" for defining their initial value. However, that is somewhat subtle, and for a component, the initial value can't be specified when the component is declared. Having already used the term "ref" for parameters that act as references, we have chosen to change the syntax for other objects that are references by explicitly using the term "ref" for them as well. For example:
A "ref" by itself means that the reference is a read-write reference only if the object to which it refers is a variable. Explicitly using "ref const" makes the reference read-only even if the object to which it refers is a variable. Explicitly using "ref var" makes the reference read-write, and requires the object to which it refers to be a variable. So the first example above creates a reference to the Ith element of M, providing read-write access only if M is a variable, the second creates a read-only reference to the Right component of T, and the third creates a read-write reference to the Left component of T, and requires that T itself is a variable.
Having adopted the use of "ref", we can now leave off the "=> obj" part and simply specify a type, and in that way declare a component which will be a reference. Here is an example of an iterator that walks over a tree structure. It uses a "ref" component to point to a particular node in the tree:
This Tree_Iterator has a component that is a reference to a node in a tree, an optional Tree_Iterator for the enclosing node in the tree, and an indication of which child the current node is of the enclosing node.
By specifying simply "ref" rather than "ref var" or "ref const" for the Current_Node reference, we are saying that it is a reference to a variable when the enclosing Tree_Iterator object is a variable.
Any type that has a (constructor) function in its interface that takes a "ref" input and produces a new object of the type as an output is presumed to be burying the "ref" somewhere in the object. (Of course the compiler can just "look" in the class and see this, but a client of the module wants to know strictly from looking at the interface.) If the objects of a type incorporate a reference within them, then objects of that type do not permit assignment, since references in ParaSail are always required to be short-lived. The result of such a constructor function therefore can only be used to initialize a new object, be passed as a parameter, or be used to define the next value for a loop variable. The result of such a function cannot be assigned as the new value for an existing object.
Now that we are using "ref" explicitly for objects and components, it makes the syntax used for parameters, where "ref" comes after the parameter name and the ":", seem inconsistent. Hence we are changing the syntax for parameters to more closely match that of objects and components:
For example:
http://groups.google.com/group/parasail-programming-language
Currently, object references are recognized by using "=>" rather than ":=" for defining their initial value. However, that is somewhat subtle, and for a component, the initial value can't be specified when the component is declared. Having already used the term "ref" for parameters that act as references, we have chosen to change the syntax for other objects that are references by explicitly using the term "ref" for them as well. For example:
ref X => M[I]; ref const R => T.Right; ref var L => T.Left;
A "ref" by itself means that the reference is a read-write reference only if the object to which it refers is a variable. Explicitly using "ref const" makes the reference read-only even if the object to which it refers is a variable. Explicitly using "ref var" makes the reference read-write, and requires the object to which it refers to be a variable. So the first example above creates a reference to the Ith element of M, providing read-write access only if M is a variable, the second creates a read-only reference to the Right component of T, and the third creates a read-write reference to the Left component of T, and requires that T itself is a variable.
Having adopted the use of "ref", we can now leave off the "=> obj" part and simply specify a type, and in that way declare a component which will be a reference. Here is an example of an iterator that walks over a tree structure. It uses a "ref" component to point to a particular node in the tree:
class Tree_Iterator is ref Current_Node : Tree_Node; var Enclosing_Node : optional Tree_Iterator; var Which_Child : Child_Index; exports func First_Item(ref Root : Tree_Node) -> Tree_Iterator is // Return leftmost child of Root to start the iteration return Leftmost_Child ((Current_Node => Root, Enclosing_Node => null, Which_Child => 0)); end func First_Item; func Leftmost_Child(It : Tree_Iterator) -> Tree_Iterator is // Return iterator designating leftmost child if Num_Children(It.Current_Node) == 0 then // We are at the leftmost child already return It; else // Recurse down the left side return (Current_Node => It.Current_Node[1], Enclosing_Node => It, Which_Child => 1); end if; end func Leftmost_Child; func Next_Item(It : Tree_Iterator) -> optional Tree_Iterator is // Advance iterator to next item in left-to-right depth-first walk if It.Enclosing_Node is null then // All done return null; elsif It.Which_Child < Num_Children(It.Enclosing_Node.Current_Node) then // Return next child of same enclosing node return Leftmost_Child ((Current_Node => It.Enclosing_Node[It.Which_Child + 1], Enclosing_Node => It.Enclosing_Node, Which_Child => It.Which_Child + 1)); else // Recurse to go to next sibling of enclosing node return Next_Item(It.Enclosing_Node); end if; end func Next_Item; ... end class Tree_Iterator;
This Tree_Iterator has a component that is a reference to a node in a tree, an optional Tree_Iterator for the enclosing node in the tree, and an indication of which child the current node is of the enclosing node.
By specifying simply "ref" rather than "ref var" or "ref const" for the Current_Node reference, we are saying that it is a reference to a variable when the enclosing Tree_Iterator object is a variable.
Any type that has a (constructor) function in its interface that takes a "ref" input and produces a new object of the type as an output is presumed to be burying the "ref" somewhere in the object. (Of course the compiler can just "look" in the class and see this, but a client of the module wants to know strictly from looking at the interface.) If the objects of a type incorporate a reference within them, then objects of that type do not permit assignment, since references in ParaSail are always required to be short-lived. The result of such a constructor function therefore can only be used to initialize a new object, be passed as a parameter, or be used to define the next value for a loop variable. The result of such a function cannot be assigned as the new value for an existing object.
Now that we are using "ref" explicitly for objects and components, it makes the syntax used for parameters, where "ref" comes after the parameter name and the ":", seem inconsistent. Hence we are changing the syntax for parameters to more closely match that of objects and components:
operation_input ::= id ':' type [ := default ] | 'var' id ':' type | 'ref' [ 'var' | 'const' ] id ':' type | 'global' [ 'var' ] id ':' type | [ 'locked' | 'queued' ] [ 'var' ] id ':' type operation_output ::= [ id ':' ] type | 'ref' [ 'var' | 'const' ] [ id ':' ] type
For example:
func Append(var L : List; E : Element); op "indexing"(ref V : Vector; I : Index) -> ref Element; func Get_Next(queued var Q : Queue) -> Element;We will be posting a new draft of the ParaSail reference manual reflecting these changes shortly to the ParaSail google group:
http://groups.google.com/group/parasail-programming-language
Wednesday, July 13, 2011
Everything in a func?
Here is an issue which is trivial semantically, but potentially has a more substantial effect on the look and feel of the language. Currently, reflecting to some extent the Pascal heritage, ParaSail uses the terms "function," "procedure," and "operator" for the various kinds of operations. Functions must have at least one output, procedures may not have any outputs, and operators may have zero or more outputs, but have a name given in quotes and are used for defining operations that connect into special syntax or semantics, such as unary or binary operators, implicit conversions from universal types, etc.
As we go about creating more examples in ParaSail, and compare with various other recent languages, we find the use of these three distinct terms not particularly helpful, and certainly more verbose than most other languages. On the other hand, the terms do communicate something about the appropriate use of the operations.
We are now leaning toward replacing all three terms with the term "func". Our sense is this would give the syntax a somewhat less heavy, and perhaps more modern feel. We would retain the requirement to echo the term at the end of the operation, hence:
The above three "func"s would be "function GCD," "procedure Print," and operator "|" using the current ParaSail syntax. Given that we are already abbreviating "constant" to const and "variable" to var in ParaSail, it doesn't seem inconsistent to adopt a standard abbreviation here for function/procedure/operator, and func is one that appears in a number of languages.
Now this is an issue about which anyone and everyone probably has an opinion...
As we go about creating more examples in ParaSail, and compare with various other recent languages, we find the use of these three distinct terms not particularly helpful, and certainly more verbose than most other languages. On the other hand, the terms do communicate something about the appropriate use of the operations.
We are now leaning toward replacing all three terms with the term "func". Our sense is this would give the syntax a somewhat less heavy, and perhaps more modern feel. We would retain the requirement to echo the term at the end of the operation, hence:
func GCD(X, Y : Integer) -> Integer is case X =? Y of [#less] => return GCD(X, Y mod X); [#equal] => return X; [#greater] => return GCD(X mod Y, Y); end case; end func GCD; func Print(F : ref var File; X : Integer) is if abs X >= 10 then Print(F, X/10); end if; Print(F, '0' + (X rem 10)) end func Print; func "|"(X : Set; E : Elem) -> Set is return Union(X, Make_Set(E)); end func "|";
The above three "func"s would be "function GCD," "procedure Print," and operator "|" using the current ParaSail syntax. Given that we are already abbreviating "constant" to const and "variable" to var in ParaSail, it doesn't seem inconsistent to adopt a standard abbreviation here for function/procedure/operator, and func is one that appears in a number of languages.
Now this is an issue about which anyone and everyone probably has an opinion...
Thursday, July 7, 2011
Implementing concurrent loops in ParaSail
The implementation of the ParaSail compiler and virtual machine (see http://parasail-programming-language.blogspot.com/2010/11/virtual-machine-for-parasail-with.html) are proceeding apace. Basic support is there for interfaces, classes, operators, functions, procedures, module instantiations, object and type declarations, if, case, while, and assignment statements, user-defined indexing, parallel evaluation of expressions, parallel execution of statements on either side of a "||", etc. However, a big gap at the moment is for loops, and in particular concurrent for loops.
For loops are a bit unusual in ParaSail, as they support both a continue loop and an exit loop (i.e. "break") capability in the presence of parallelism. Here is an example using continue loop:
The semantics of saying concurrent loop rather than simply loop is that the next iteration starts before even starting the current iteration. If we had said simply loop in the second example, then it would first check the Root node to see if it has the Desired_Value, and only if it did not would we proceed on to the next iteration(s), at which point we would check both the Left and Right subtrees (presuming both were non-null). Since we said concurrent loop, the iteration on the Left and Right subtrees begins in parallel with executing the loop body on the Root.
The semantics of the exit loop statement are that all other threads within the loop being exited are stopped, and when the thread first reaching the exit loop is the only thread remaining, the assignments specified by the with-values clause are performed. If the loop ends normally without an exit loop, then the assignments specified by the end with-values clause are performed. The net effect is that if the Desired_Value is in the tree, Result ends up with the Id of that node. If it is not found, Result ends up null.
See http://parasail-programming-language.blogspot.com/2010/06/generalized-for-loops-in-parasail.html and http://parasail-programming-language.blogspot.com/2010/06/intentional-race-condition-in-parasail.html for more details about the generalized for loop capabilities.
So given all of the above, what should be the implementation model for a for loop in ParaSail? As mentioned in the blog entry on the ParaSail virtual machine (PSVM -- see link in first paragraph above), the implementation uses pico-threads to represent the potential parallelism in the code. There are PSVM instructions for invoking a separate operation, or a local block, as a separate parallel pico-thread. Currently this is used to support parallel evaluation of expressions or concurrent sequences ("||"), where each operand of a call on an operation, or each side of "||", can be evaluated in its own pico-thread. In fact, the compiler is a bit smarter than that, and it doesn't bother to create a pico-thread if the computation of the operand involves only simple built-in operations (such as basic arithmetic, numeric comparisons, etc.). If it involves an out-of-line call then it becomes a candidate for being turned into a pico-thread (this is clearly an area for tuning of the compiler as more experience is gained).
So it makes sense for the body of a concurrent loop, or a loop that gains parallelism through multiple parallel "next" values or explicit uses of continue loop inside a concurrent sequence, to be executed as a separate pico-thread. The compiler can turn the loop body into a local operation, with the parameter(s) being the loop index (or indices). A continue loop is essentially a (tail) recursive call on this local loop-body operation, with the parameter being the next value specified in the with-values clause. Similarly, if there are explicit "next" values given after then in the loop header, these can be handled using (parallel) calls on the loop-body operation, with the "next" value as the parameter.
Rather than using tail recursion, an alternative is to use a parallel call followed by a return, with the pico-thread for the parallel call being associated with a pico-thread master that controls the entire loop. At the outer level, the whole loop can then be executed by creating the pico-thread master, and then starting the loop off with a parallel call on the loop-body operation, followed by waiting for the pico-thread master to be complete.
The overall flow would be:
When the master is complete:
For loops are a bit unusual in ParaSail, as they support both a continue loop and an exit loop (i.e. "break") capability in the presence of parallelism. Here is an example using continue loop:
for X => Root while X not null loop case X.Kind of [#leaf] => Process(X.Data); [#unary] => Process(X.Data); || continue loop with X => X.Operand; [#binary] => Process(X.Data); || continue loop with X => X.Left; || continue loop with X => X.Right; end case; end loop;Here is an example using exit loop:
const Result : Id_Type; for X => Root then X.Left || X.Right while X not null concurrent loop if X.Value == Desired_Value then exit loop with Result => X.Id; end if; end loop with Result => null;This first example processes an expression tree, and creates parallelism by using an explicit continue loop statement in concert with a concurrent sequence ("||") construct. The second example searches a binary tree and creates parallelism by both the use of concurrent loop and by having two values in the then part of the for loop header, which will result in the execution of the loop splitting into two separate threads at each binary node.
The semantics of saying concurrent loop rather than simply loop is that the next iteration starts before even starting the current iteration. If we had said simply loop in the second example, then it would first check the Root node to see if it has the Desired_Value, and only if it did not would we proceed on to the next iteration(s), at which point we would check both the Left and Right subtrees (presuming both were non-null). Since we said concurrent loop, the iteration on the Left and Right subtrees begins in parallel with executing the loop body on the Root.
The semantics of the exit loop statement are that all other threads within the loop being exited are stopped, and when the thread first reaching the exit loop is the only thread remaining, the assignments specified by the with-values clause are performed. If the loop ends normally without an exit loop, then the assignments specified by the end with-values clause are performed. The net effect is that if the Desired_Value is in the tree, Result ends up with the Id of that node. If it is not found, Result ends up null.
See http://parasail-programming-language.blogspot.com/2010/06/generalized-for-loops-in-parasail.html and http://parasail-programming-language.blogspot.com/2010/06/intentional-race-condition-in-parasail.html for more details about the generalized for loop capabilities.
So given all of the above, what should be the implementation model for a for loop in ParaSail? As mentioned in the blog entry on the ParaSail virtual machine (PSVM -- see link in first paragraph above), the implementation uses pico-threads to represent the potential parallelism in the code. There are PSVM instructions for invoking a separate operation, or a local block, as a separate parallel pico-thread. Currently this is used to support parallel evaluation of expressions or concurrent sequences ("||"), where each operand of a call on an operation, or each side of "||", can be evaluated in its own pico-thread. In fact, the compiler is a bit smarter than that, and it doesn't bother to create a pico-thread if the computation of the operand involves only simple built-in operations (such as basic arithmetic, numeric comparisons, etc.). If it involves an out-of-line call then it becomes a candidate for being turned into a pico-thread (this is clearly an area for tuning of the compiler as more experience is gained).
So it makes sense for the body of a concurrent loop, or a loop that gains parallelism through multiple parallel "next" values or explicit uses of continue loop inside a concurrent sequence, to be executed as a separate pico-thread. The compiler can turn the loop body into a local operation, with the parameter(s) being the loop index (or indices). A continue loop is essentially a (tail) recursive call on this local loop-body operation, with the parameter being the next value specified in the with-values clause. Similarly, if there are explicit "next" values given after then in the loop header, these can be handled using (parallel) calls on the loop-body operation, with the "next" value as the parameter.
Rather than using tail recursion, an alternative is to use a parallel call followed by a return, with the pico-thread for the parallel call being associated with a pico-thread master that controls the entire loop. At the outer level, the whole loop can then be executed by creating the pico-thread master, and then starting the loop off with a parallel call on the loop-body operation, followed by waiting for the pico-thread master to be complete.
The overall flow would be:
- Create pico-thread master.
- Initiate parallel-call on loop-body operation with parameter referring to initial value of loop index, presuming initial value satisfies the while condition (if any).
- Wait for pico-thread master to be complete.
- Determine next value(s) for loop index.
- Initiate parallel call(s) on loop-body operation with parameter referring to next value, presuming value satisfies the while condition (if any).
- Return from the loop-body operation (don't wait for the call(s) to complete).
- Determine next value(s) for loop index.
- Initiate parallel call(s) on loop-body operation with parameter referring to next value, presuming value satisfies the while condition (if any).
- Perform the loop body for the current value.
- Return from the loop-body operation (don't wait for the call(s) to complete).
- Tell pico-thread master of loop to terminate all but the current thread.
- Wait for other threads to terminate.
- Perform with-values assignment(s).
- Return from the loop-body operation.
When the master is complete:
- If there is a with-values clause on the end loop and the master completed normally, as opposed to being terminated early by an exit loop statement:
- Perform the assignment(s) specified by the end loop with-values clause.
Friday, June 3, 2011
Upcoming presentations on ParaSail -- Edinburgh and Portland
This summer we will be presenting on ParaSail at:
- AdaEurope 2011 in Edinburgh, Scotland, Tutorial/Workshop on Experimenting with ParaSail (Monday 20-June-2011, 9:00am - 1:00pm GMT) http://conferences.ncl.ac.uk/adaconnection2011/tutorials/T1.html
- OSCON 2011 in Portland, Introduction to ParaSail (Weds. 27-July-2011 10:40am - 11:20am PDT Room: E144 --NOTE the time change) http://www.oscon.com/oscon2011/public/schedule/detail/18557
Wednesday, May 25, 2011
When null isn't just null -- value representation in ParaSail
The qualifier optional may be applied to any type in ParaSail, effectively producing a type with one additional value, namely the null value. In many languages, null is only used for pointer types, and is often represented as a zero address. In ParaSail, we want to be able to add the null value to any type without presuming that the zero value is available for that purpose, so this means that the representation for null will vary depending on the particular type.
For simple types, coming up with a null value is pretty straightforward. For unsigned integer types, a value one greater than the maximum value could be used for null. For example, given an unsigned type going from 0 to 100, the value 101 could be used to represent null. Similarly, for enumeration types that use an internal unsigned integer representation, such as 0 for #false and 1 for #true, the null value could be represented by one more than the maximum internal unsigned integer code, for example 2 for the null value of an optional Boolean type. For a signed integer type, one less than the minimum value might make sense, particularly on a 2's complement machine where there is one "extra" negative value anyway. So for a 64-bit signed integer, one might use -2**63+1 .. 2**63-1 for "normal" values of the type, and reserve -2**63 for the null value. Most floating point representations include the notion of "Not a Number" (NaN), and some NaN value would make sense for null. Since there are no run-time checks in ParaSail (checking all being done at compile-time), it would be fine to use a "non-signaling" NaN for null.
For more complex types, the representation for null is a bit more interesting. One common kind of type would be a simple "wrapper," that is, a type defined by a class that has exactly one var component. For example:
So our conclusion is that a wrapper type can use the same representation as its component type so long as the component type is not itself marked optional. If the component type is itself marked optional, then the wrapper needs to allow for its "own" representation for null, which might in some cases be accommodated by simply allowing for yet one more value if the component type is "simple," or a level of indirection for a more complex component type.
Now what about more complex types? For example, a type defined by a class with multiple var components:
So back to the initial question -- how will we represent objects with multiple components (or with a single component whose type is marked optional)? And how will we represent null for such objects? One important thing to remember is that (large) ParaSail objects live in regions, and the regions are associated with stack frames. Whenever assigning a value to an object, if the new value can't "fit" in the same space as its current value, then the space for the old value needs to be released and the space for the new value needs to be allocated, in the appropriate region. Since a "ref var" parameter (or subcomponent thereof) might be assigned a new value, it won't necessarily be known at compile-time what region the object being assigned "lives" in. This suggests that every value for a large object must identify its region, so that its space can be released back to that region when it is freed, and so that the new value can be allocated in that region. This same requirement applies even if the object has a null value to begin with. Hence, we are left with the conclusion that a null value for a "large" type needs to identify its region.
Another issue for "large" objects is that we need to know how to reclaim the entire "subtree" of subcomponents when we release the enclosing object. In some cases we will know the type at compile-time, but in the case of polymorphic objects, or in cases where the type of the object is a formal type parameter of the enclosing module, we won't necessarily know. This implies that we may want to have some kind of "type-id" associated with large objects. This then results in the following representation for large objects:
where the Type-Id provides enough information to know which of the components are themselves "large" and hence need to be recursively released. In addition, there would probably be a special null Type-Id, which would allow the "is null" operation to be implemented relatively efficiently by comparing the Type-Id against some kind of standard "Null-Type-Id" value. Each region would only need a single null value, which pointed back to the region, and had the Null-Type-Id. Such null objects would not be reclaimed if they were the "old" value of the left-hand side of an assignment, since there is only one such value per region, and it is shared among all objects in that region currently having a null value.
So to summarize, we now see that null is not a single value, but rather each simple type potentially has its own representation for null, and for "large" types that use a level of indirection, there is a separate null value for each region. So the "is null" operation is not necessarily a simple check for equality with the global null value, but instead would depend on the type, and in particular for "large" objects, would be a check of the Type-Id field of the object to see if it has the Null-Type-Id.
On a final note, when a function returns "large" values, it needs to be "told" in which region to allocate the value(s) being returned. One simple way to do this is for the "large" output parameter(s) to be initialized with a (large) null value by the caller. Since such a null value identifies the region, the function can use that information when allocating the space for the returned value.
For simple types, coming up with a null value is pretty straightforward. For unsigned integer types, a value one greater than the maximum value could be used for null. For example, given an unsigned type going from 0 to 100, the value 101 could be used to represent null. Similarly, for enumeration types that use an internal unsigned integer representation, such as 0 for #false and 1 for #true, the null value could be represented by one more than the maximum internal unsigned integer code, for example 2 for the null value of an optional Boolean type. For a signed integer type, one less than the minimum value might make sense, particularly on a 2's complement machine where there is one "extra" negative value anyway. So for a 64-bit signed integer, one might use -2**63+1 .. 2**63-1 for "normal" values of the type, and reserve -2**63 for the null value. Most floating point representations include the notion of "Not a Number" (NaN), and some NaN value would make sense for null. Since there are no run-time checks in ParaSail (checking all being done at compile-time), it would be fine to use a "non-signaling" NaN for null.
For more complex types, the representation for null is a bit more interesting. One common kind of type would be a simple "wrapper," that is, a type defined by a class that has exactly one var component. For example:
class Box<Contents_Type is Assignable<>> is var Contents : Contents_Type; exports function Create(Value : Contents_Type) -> Box is return (Contents => Value); end function Create; ... end class Box;In this case, it would be nice to have the wrapper type use exactly the same representation as that of the underlying component type (e.g. Contents_Type). This would mean that the null value for the wrapper would be the same as the null value for the component type. This does mean that the component type must not itself be marked as optional, as then there would be no way to distinguish the wrapper being non-null but containing a single null component, from the case where the wrapper itself is null.
So our conclusion is that a wrapper type can use the same representation as its component type so long as the component type is not itself marked optional. If the component type is itself marked optional, then the wrapper needs to allow for its "own" representation for null, which might in some cases be accommodated by simply allowing for yet one more value if the component type is "simple," or a level of indirection for a more complex component type.
Now what about more complex types? For example, a type defined by a class with multiple var components:
class Pair<Element_Type is Assignable<>> is var Left : Element_Type; var Right : Element_Type; exports function Cons(Left, Right : Element_Type) -> Pair is return (Left => Left, Right => Right); end function Cons; ... end class Pair;One obvious representation for a type with multiple components is as a sequence of memory cells long enough to accommodate the representation of each of the components, and then some provision for representing null, which could be by piggy-backing on one of the components if it is not itself allowed to be null, or by adding an additional bit to indicate null-ness. However, in our initial (prototype, ParaSail-Virtual-Machine-based) implementation, we have chosen to represent every object using a single 64-bit memory cell. This basically means that if the value cannot fit in a single 64-bit cell, it will need to use a level of indirection. To simplify further, we won't be doing any "packing" initially, so even if the components are each quite short (such as booleans), we will nevertheless go to an indirect representation. We do anticipate supporting packed arrays, but that would initially be handled by doing explicit shifting and masking, rather than by building in the notion of packing into the ParaSail Virtual Machine instruction set. In the ParaSail Virtual Machine, pretty much everything occupies a single 64-bit word.
So back to the initial question -- how will we represent objects with multiple components (or with a single component whose type is marked optional)? And how will we represent null for such objects? One important thing to remember is that (large) ParaSail objects live in regions, and the regions are associated with stack frames. Whenever assigning a value to an object, if the new value can't "fit" in the same space as its current value, then the space for the old value needs to be released and the space for the new value needs to be allocated, in the appropriate region. Since a "ref var" parameter (or subcomponent thereof) might be assigned a new value, it won't necessarily be known at compile-time what region the object being assigned "lives" in. This suggests that every value for a large object must identify its region, so that its space can be released back to that region when it is freed, and so that the new value can be allocated in that region. This same requirement applies even if the object has a null value to begin with. Hence, we are left with the conclusion that a null value for a "large" type needs to identify its region.
Another issue for "large" objects is that we need to know how to reclaim the entire "subtree" of subcomponents when we release the enclosing object. In some cases we will know the type at compile-time, but in the case of polymorphic objects, or in cases where the type of the object is a formal type parameter of the enclosing module, we won't necessarily know. This implies that we may want to have some kind of "type-id" associated with large objects. This then results in the following representation for large objects:
A 64-bit pointer to: [Region, Type-Id, component1, component2, component3, ...]
where the Type-Id provides enough information to know which of the components are themselves "large" and hence need to be recursively released. In addition, there would probably be a special null Type-Id, which would allow the "is null" operation to be implemented relatively efficiently by comparing the Type-Id against some kind of standard "Null-Type-Id" value. Each region would only need a single null value, which pointed back to the region, and had the Null-Type-Id. Such null objects would not be reclaimed if they were the "old" value of the left-hand side of an assignment, since there is only one such value per region, and it is shared among all objects in that region currently having a null value.
So to summarize, we now see that null is not a single value, but rather each simple type potentially has its own representation for null, and for "large" types that use a level of indirection, there is a separate null value for each region. So the "is null" operation is not necessarily a simple check for equality with the global null value, but instead would depend on the type, and in particular for "large" objects, would be a check of the Type-Id field of the object to see if it has the Null-Type-Id.
On a final note, when a function returns "large" values, it needs to be "told" in which region to allocate the value(s) being returned. One simple way to do this is for the "large" output parameter(s) to be initialized with a (large) null value by the caller. Since such a null value identifies the region, the function can use that information when allocating the space for the returned value.
Monday, May 9, 2011
Updated YACC grammar for ParaSail
Here is a more up-to-date YACC grammar from ParaSail. It uses "--" for comments, but otherwise is pretty vanilla "yacc" syntax.
-------------------------------------- -- YACC Grammar for ParaSail -------------------------------------- -- Single-character delimiters -- %token ',' ';' ':' '.' %token '+' '-' '*' '/' %token '?' %token '(' ')' '[' ']' '<' '>' '' %token '|' %token '=' -- for error recovery only %token PRIME -- ''' -- Compound delimiters -- %token COMPARE -- "=?" %token EQ -- "==" %token NEQ -- "!=" %token GEQ -- ">=" %token LEQ -- "<=" %token POWER -- "**" %token ASSIGN -- ":=" %token SWAP -- ":=:" %token DOT_DOT -- ".." %token OPEN_CLOSED_INTERVAL -- "<.." %token OPEN_INTERVAL -- "<..<" %token CLOSED_OPEN_INTERVAL -- "..<" %token DOUBLE_COLON -- "::" %token REFERS_TO -- "=>" %token GIVES -- "->" %token IMPLIES -- "==>" %token SEQUENCE -- ";;" %token PARALLEL -- "||" %token PLUS_ASSIGN -- "+=" %token MINUS_ASSIGN -- "-=" %token TIMES_ASSIGN -- "*=" %token DIVIDE_ASSIGN -- "/=" %token POWER_ASSIGN -- "**=" %token CONCAT_ASSIGN -- "|=" %token AND_ASSIGN -- "and=" %token OR_ASSIGN -- "or=" %token XOR_ASSIGN -- "xor=" -- Literals -- %token Char_Literal %token Enum_Literal %token Integer_Literal %token Real_Literal %token String_Literal -- Identifier -- %token Identifier -- Reserved words -- %token ABS_kw %token ABSTRACT_kw %token ALL_kw %token AND_kw %token BEGIN_kw -- used for error recovery only %token BLOCK_kw %token CASE_kw %token CLASS_kw %token CONCURRENT_kw %token CONST_kw %token CONTINUE_kw %token EACH_kw %token ELSE_kw %token ELSIF_kw %token END_kw %token EXIT_kw %token EXPORTS_kw %token EXTENDS_kw %token FOR_kw %token FORWARD_kw %token FUNCTION_kw %token GLOBAL_kw %token IF_kw %token IMPLEMENTS_kw %token IMPORT_kw %token IN_kw %token INTERFACE_kw %token IS_kw %token LAMBDA_kw %token LOCKED_kw %token LOOP_kw %token MOD_kw %token MUTABLE_kw %token NEW_kw %token NOT_kw %token NULL_kw %token OF_kw %token OPERATOR_kw %token OPTIONAL_kw %token OR_kw %token PRIVATE_kw %token PROCEDURE_kw %token QUEUED_kw %token REF_kw %token REM_kw %token RETURN_kw %token REVERSE_kw %token SELECT_kw %token SOME_kw %token THEN_kw %token TYPE_kw %token UNTIL_kw %token VAR_kw %token WHILE_kw %token WITH_kw %token XOR_kw %start module_list %% module_list : module | module_list module ; module : import_clauses interface_declaration ';' | import_clauses class_definition ';' | import_clauses standalone_operation_definition ';' | import_clauses error ';' ; import_clauses : | import_clauses IMPORT_kw qualified_name_list ';' ; qualified_name_list : qualified_name | qualified_name_list ',' qualified_name ; interface_declaration : opt_interface_qualifier INTERFACE_kw module_defining_name formals_and_implemented_interfaces IS_kw interface_element_list END_kw opt_INTERFACE_kw module_defining_name ; opt_INTERFACE_kw : INTERFACE_kw | ; opt_interface_qualifier : interface_qualifier | ; interface_qualifier : class_qualifier | ABSTRACT_kw opt_class_qualifier | PRIVATE_kw opt_class_qualifier ; opt_class_qualifier : class_qualifier | ; class_qualifier : CONCURRENT_kw ; standalone_operation_definition : function_definition | procedure_definition | operator_definition | operation_import ; formals : '<' opt_module_formal_list '>' ; formals_and_implemented_interfaces : opt_formals opt_implements_list | opt_formals EXTENDS_kw interface_name opt_implements_list ; opt_formals : formals | ; opt_implements_list : implements_list | ; implements_list : IMPLEMENTS_kw interface_name_list ; interface_name_list : interface_name | interface_name_list ',' interface_name ; interface_name : module_name | module_instantiation ; module_name : qualified_name ; module_defining_name : qualified_name | qualified_name add_on_label ; add_on_label : '[' operation_actual_list ']' ; opt_module_formal_list : module_formal_list | ; module_formal_list : annotated_module_formal | module_formal_list ';' annotated_module_formal ; annotated_module_formal : opt_annotation type_formal opt_annotation | opt_annotation operation_formal | opt_annotation value_formal opt_annotation ; opt_annotation : annotation | ; type_formal : id IS_kw module_instantiation | module_instantiation ; operation_formal : operation_declaration opt_operation_default ; opt_operation_default : IS_kw simple_expression | ; value_formal : id_list ':' opt_output_modifier operand_type_specifier | id_list ':' opt_output_modifier operand_type_specifier ASSIGN_or_equal simple_expression | ref_or_global_modifier operand_type_specifier ; id : Identifier ; id_list : id | id_list ',' id ; type_name : qualified_name | polymorphic_type_name ; polymorphic_type_name : qualified_name '+' ; qualified_name : id_or_string_literal | qualified_name DOUBLE_COLON id_or_string_literal ; id_or_string_literal : id | String_Literal ; module_instantiation : module_name '<' opt_module_actual_list '>' | name '[' opt_operation_actual_list ']' '<' opt_module_actual_list '>' ; opt_add_on_label : add_on_label | ; opt_module_actual_list : module_actual_list | ; module_actual_list : module_actual | module_actual_list ',' module_actual ; module_actual : simple_type_specifier_or_expression | id REFERS_TO simple_type_specifier_or_expression ; -- simple_expression subsumes simple type_name in this rule simple_type_specifier_or_expression : qualified_name annotation | qualified_type_specifier opt_annotation | simple_expression -- simple_expr to avoid problems with '>' | lambda_expression | module_instantiation ; annotated_type_specifier : type_specifier | type_specifier annotation ; type_specifier : basic_type_specifier | qualified_type_specifier ; basic_type_specifier : type_name | module_instantiation | module_instantiation EXTENDS_kw type_specifier ; qualified_type_specifier : type_qualifier type_name | type_qualifier module_instantiation | type_qualifier module_instantiation EXTENDS_kw type_specifier ; type_qualifier : OPTIONAL_kw opt_MUTABLE_kw opt_CONCURRENT_kw | MUTABLE_kw opt_CONCURRENT_kw | CONCURRENT_kw ; opt_CONCURRENT_kw : CONCURRENT_kw | ; interface_element_list : | interface_element_list interface_element ';' | interface_element_list operation_import ';' | interface_element_list error ';' ; interface_element : operation_declaration | object_declaration | interface_declaration | type_declaration ; operation_import : function_declaration IS_kw import_operation | procedure_declaration IS_kw import_operation | operator_declaration IS_kw import_operation ; class_definition : opt_class_qualifier CLASS_kw module_defining_name class_formals_and_implemented_interfaces IS_kw class_element_list END_kw opt_CLASS_kw module_defining_name ; opt_CLASS_kw : CLASS_kw | ; class_formals_and_implemented_interfaces : formals_and_implemented_interfaces | opt_formals EXTENDS_kw id ':' interface_name opt_implements_list ; class_element_list : local_class_element_list EXPORTS_kw exported_class_element_list | local_class_element_list annotation EXPORTS_kw exported_class_element_list | local_class_element_list ; local_class_element_list : | local_class_element_list local_class_element ';' ; local_class_element : interface_element | operation_import | annotated_exported_class_element ; exported_class_element_list : | exported_class_element_list annotated_exported_class_element ';' | exported_class_element_list operation_import ';' | exported_class_element_list interface_element ';' | exported_class_element_list error ';' ; annotated_exported_class_element : | exported_class_element | annotation | annotation exported_class_element ; exported_class_element : operation_definition | object_definition | class_definition ; annotation : '' | annotation '' ; annotation_element_list : annotation_element | annotation_element_list ';' annotation_element | annotation_element_list ';' error ; annotation_element : interface_element | operation_import | condition | quantified_expression | annotation ; condition : expression ; operation_declaration : function_declaration | procedure_declaration | operator_declaration ; function_declaration : opt_ABSTRACT_or_OPTIONAL_kw FUNCTION_kw id_or_string_literal operation_inputs opt_annotation GIVES_or_RETURN_kw operation_outputs opt_annotation ; GIVES_or_RETURN_kw : GIVES | RETURN_kw ; procedure_declaration : opt_ABSTRACT_or_OPTIONAL_kw PROCEDURE_kw id operation_inputs opt_annotation ; operator_declaration : opt_ABSTRACT_or_OPTIONAL_kw OPERATOR_kw operator_designator operation_inputs opt_annotation | opt_ABSTRACT_or_OPTIONAL_kw OPERATOR_kw operator_designator operation_inputs opt_annotation GIVES_or_RETURN_kw operation_outputs opt_annotation ; opt_ABSTRACT_or_OPTIONAL_kw : ABSTRACT_kw | OPTIONAL_kw | ; operator_designator : String_Literal ; operation_inputs : simple_operation_input | '(' opt_annotated_operation_input_list ')' | '(' id_list ',' id ')' | ; simple_operation_input : -- avoids trailing use of "IS" id ':' opt_input_modifier simple_operand_type_specifier | input_modifier simple_operand_type_specifier | simple_operand_type_specifier ; opt_annotated_operation_input_list : annotated_operation_input_list | ; annotated_operation_input_list : annotated_operation_input | annotated_operation_input_list ';' annotated_operation_input ; annotated_operation_input : operation_input opt_annotation | annotation operation_input opt_annotation | function_declaration opt_operation_default | annotation function_declaration opt_operation_default | procedure_declaration opt_operation_default | annotation procedure_declaration opt_operation_default ; operation_input : id_list ':' opt_input_modifier operand_type_specifier opt_ASSIGN_expression | input_modifier operand_type_specifier opt_ASSIGN_expression | operand_type_specifier opt_ASSIGN_expression ; opt_input_modifier : input_modifier | ; simple_operand_type_specifier : type_name | module_instantiation ; operand_type_specifier : simple_operand_type_specifier | id IS_kw module_instantiation ; input_modifier : output_modifier | QUEUED_kw mutable_or_var_or_const | LOCKED_kw mutable_or_var_or_const ; mutable_or_var_or_const : MUTABLE_kw opt_VAR_kw | VAR_kw | CONST_kw ; opt_VAR_kw : VAR_kw | ; operation_outputs : simple_operation_output | annotation simple_operation_output | '(' annotated_operation_output_list ')' | '(' id_list ',' id ')' ; simple_operation_output : -- avoids trailing use of "IS" id ':' opt_output_modifier simple_operand_type_specifier | output_modifier simple_operand_type_specifier | simple_operand_type_specifier ; annotated_operation_output_list : annotated_operation_output | annotated_operation_output_list ';' annotated_operation_output ; annotated_operation_output : operation_output opt_annotation | annotation operation_output opt_annotation ; operation_output : id_list ':' opt_output_modifier operand_type_specifier | output_modifier operand_type_specifier | operand_type_specifier ; opt_output_modifier : output_modifier | ; output_modifier : OPTIONAL_kw | ref_or_global_modifier ; ref_or_global_modifier : REF_or_GLOBAL_opt_optional_mutable | REF_or_GLOBAL_opt_optional_mutable VAR_kw | REF_or_GLOBAL_opt_optional_mutable CONST_kw ; REF_or_GLOBAL_opt_optional_mutable : REF_or_GLOBAL_kw | REF_or_GLOBAL_kw OPTIONAL_kw | REF_or_GLOBAL_kw MUTABLE_kw | REF_or_GLOBAL_kw OPTIONAL_kw MUTABLE_kw ; REF_or_GLOBAL_kw : REF_kw | GLOBAL_kw ; object_declaration : VAR_kw id ':' annotated_type_specifier opt_ASSIGN_expression | CONST_kw id ':' annotated_type_specifier opt_ASSIGN_expression | id ':' annotated_type_specifier opt_ASSIGN_expression ; opt_ASSIGN_expression : ASSIGN_or_equal expression | ; object_definition : CONST_kw id ASSIGN_or_equal expression | VAR_kw id ASSIGN_or_equal expression | CONST_kw id REFERS_TO name | VAR_kw id REFERS_TO name ; opt_OPTIONAL_kw : OPTIONAL_kw | ; opt_MUTABLE_kw : MUTABLE_kw | ; type_declaration : TYPE_kw id IS_kw opt_NEW_kw annotated_type_specifier ; opt_NEW_kw : NEW_kw | ; operation_definition : function_definition | procedure_definition | operator_definition ; function_definition : function_declaration IS_kw opt_queued_clause statement_list_with_semi END_kw opt_FUNCTION_kw id ; opt_FUNCTION_kw : FUNCTION_kw | ; procedure_definition : procedure_declaration IS_kw opt_queued_clause statement_list_with_semi END_kw opt_PROCEDURE_kw id ; opt_PROCEDURE_kw : PROCEDURE_kw | ; operator_definition : operator_declaration IS_kw statement_list_with_semi END_kw opt_OPERATOR_kw operator_designator ; opt_OPERATOR_kw : OPERATOR_kw | ; opt_queued_clause : queued_clause | ; queued_clause : QUEUED_kw WHILE_or_UNTIL_kw condition THEN_kw ; import_operation : IMPORT_kw '(' opt_operation_actual_list ')' ; statement_list_with_semi : parallel_sequence_with_semi | statement_list_no_semi THEN_kw parallel_sequence_with_semi | statement_list_with_semi THEN_kw parallel_sequence_with_semi | statement_list_with_semi use_BEGIN_kw parallel_sequence_with_semi | use_BEGIN_kw parallel_sequence_with_semi ; use_BEGIN_kw : BEGIN_kw ; statement_list_no_semi : parallel_sequence_no_semi | statement_list_no_semi THEN_kw parallel_sequence_no_semi | statement_list_with_semi THEN_kw parallel_sequence_no_semi ; parallel_sequence_with_semi : statement_sequence_with_semi | parallel_sequence_no_semi PARALLEL statement_sequence_with_semi | parallel_sequence_with_semi PARALLEL statement_sequence_with_semi ; parallel_sequence_no_semi : statement_sequence | parallel_sequence_no_semi PARALLEL statement_sequence | parallel_sequence_with_semi PARALLEL statement_sequence ; statement_sequence_opt_semi : statement_sequence_with_semi | statement_sequence ; statement_sequence_with_semi : statement_sequence ';' ; statement_sequence : annotated_statement | statement_sequence SEQUENCE annotated_statement | statement_sequence ';' SEQUENCE annotated_statement | statement_sequence ';' annotated_statement ; annotated_statement : opt_annotation local_declaration | opt_annotation statement opt_annotation | annotation ; statement : local_definition | simple_statement | label compound_statement | compound_statement ; simple_statement : primitive_statement | name equal_as_assign expression | NULL_kw | name '(' opt_operation_actual_list ')' | RETURN_kw expression | RETURN_kw opt_WITH_values | CONTINUE_kw LOOP_kw opt_id opt_WITH_values | EXIT_kw compound_statement_kind opt_id opt_WITH_values ; primitive_statement : name assign_operator_not_divide expression | name DIVIDE_ASSIGN expression | name SWAP name | '(' opt_operation_actual_list ')' ASSIGN expression ; opt_operation_actual_list : operation_actual_list | ; opt_WITH_values : WITH_values | ; WITH_values : WITH_kw operation_actual ; opt_id : id | ; compound_statement_kind : LOOP_kw | IF_kw | CASE_kw | SELECT_kw | BLOCK_kw ; local_declaration : operation_declaration | type_declaration | object_declaration ; local_definition : object_definition | operation_definition ; label : '*' id '*' ; compound_statement : if_statement | case_statement | indefinite_loop_statement | while_until_loop_statement | for_loop_statement | block_statement | select_statement | error ';' ; if_statement : IF_kw condition THEN_kw statement_list_with_semi opt_else_part END_kw IF_kw opt_id opt_WITH_values ; opt_else_part : ELSIF_kw condition THEN_kw statement_list_with_semi opt_else_part | ELSE_kw statement_list_with_semi | ; case_statement : CASE_kw expression OF_kw case_alt_list opt_default_alt END_kw CASE_kw opt_id opt_WITH_values ; case_alt_list : case_alt | case_alt_list case_alt ; case_alt : '[' simple_expression_opt_named ']' REFERS_TO statement_list_with_semi ; simple_expression_opt_named : simple_expression | id ':' simple_expression ; opt_default_alt : '[' dot_dot_opt_named ']' REFERS_TO statement_list_with_semi | ; dot_dot_opt_named : dot_dot_as_interval | id ':' dot_dot_as_interval ; dot_dot_as_interval : DOT_DOT ; indefinite_loop_statement : LOOP_kw statement_list_with_semi END_kw LOOP_kw opt_id opt_WITH_values ; while_until_loop_statement : WHILE_or_UNTIL_kw condition LOOP_kw statement_list_with_semi END_kw LOOP_kw opt_id opt_WITH_values ; WHILE_or_UNTIL_kw : WHILE_kw | UNTIL_kw ; for_loop_statement : FOR_kw iterator_spec opt_direction LOOP_kw statement_list_with_semi END_kw LOOP_kw opt_id opt_WITH_values ; iterator_spec : iterator | '(' iterator_list ')' ; iterator_list : iterator | iterator_list ';' iterator ; iterator : index_set_iterator | EACH_kw element_iterator | initial_next_while_iterator | initial_value_iterator ; index_set_iterator : id opt_COLON_type_specifier IN_kw opt_REVERSE_kw expression ; opt_REVERSE_kw : | REVERSE_kw ; element_iterator : id opt_COLON_type_specifier OF_kw expression | '[' id REFERS_TO id ']' OF_kw expression ; initial_next_while_iterator : id opt_COLON_type_specifier ASSIGN_or_equal expression opt_THEN_next_value_list WHILE_or_UNTIL_kw condition | id opt_COLON_type_specifier REFERS_TO name opt_THEN_next_name_list WHILE_or_UNTIL_kw condition ; opt_THEN_next_value_list : THEN_kw next_value_list | ; opt_THEN_next_name_list : THEN_kw next_name_list | ; initial_value_iterator : id opt_COLON_type_specifier ASSIGN_or_equal expression | id opt_COLON_type_specifier REFERS_TO name ; opt_COLON_type_specifier : ':' type_specifier | ; next_value_list : expression | next_value_list PARALLEL expression ; next_name_list : name | next_name_list PARALLEL name ; opt_direction : direction | ; direction : CONCURRENT_kw | FORWARD_kw | REVERSE_kw ; select_statement : SELECT_kw select_alt_list END_kw SELECT_kw opt_id opt_WITH_values ; select_alt_list : select_alt | select_alt_list PARALLEL select_alt ; select_alt : '[' statement_sequence_opt_semi ']' REFERS_TO statement_sequence_with_semi ; block_statement : BLOCK_kw statement_list_with_semi END_kw opt_BLOCK_kw opt_id opt_WITH_values ; opt_BLOCK_kw : BLOCK_kw | ; expression : expression_no_err | expression_no_err divide_assign_as_not_equal expression_no_err ; divide_assign_as_not_equal : DIVIDE_ASSIGN ; expression_no_err : logical_expression | logical_expression '?' expression ':' expression_no_err | lambda_expression ; lambda_expression : LAMBDA_kw operation_inputs opt_annotation IS_kw simple_expression_or_expr_stmt_seq | LAMBDA_kw operation_inputs opt_annotation GIVES operation_outputs opt_annotation IS_kw simple_expression_or_expr_stmt_seq ; simple_expression_or_expr_stmt_seq : simple_expression | '(' expr_statement_seq ')' ; expr_statement_seq : expr_statement ';' expr_statement | expr_statement_seq ';' expr_statement ; expr_statement : primitive_statement | expression_no_err ; logical_expression : comparison_expression | logical_expression logical_operator comparison_expression ; comparison_expression : -- comparisons are non associative simple_expression | simple_expression comparison_operator simple_expression | adding_expression IN_kw simple_expression | adding_expression NOT_kw IN_kw simple_expression | adding_expression IS_kw NULL_kw | adding_expression NOT_kw NULL_kw ; simple_expression : -- used to avoid use of '>' in module instantiation simple_expression_component | simple_expression '|' simple_expression_component ; simple_expression_component : adding_expression | adding_expression interval_operator adding_expression | adding_expression interval_operator | interval_operator adding_expression | adding_expression '+' ; adding_expression : term | adding_expression adding_operator term ; term : factor | term multiplying_operator factor ; factor : primary | primary power_operator factor | unary_operator factor ; primary : name | literal | '(' conditional_expression ')' | '(' quantified_expression ')' | aggregate ; literal: -- NOTE: See "name" for String_Literal Integer_Literal | Real_Literal | Char_Literal | Enum_Literal | NULL_kw ; name : qualified_name_and_property opt_PRIME | qualified_name DOUBLE_COLON literal | name '(' opt_operation_actual_list ')' | name '[' opt_operation_actual_list ']' | name '.' selector ; qualified_name_and_property : qualified_name | qualified_name_and_property Enum_Literal ; opt_PRIME : PRIME | PRIME Identifier | ; operation_actual_list : operation_actual | operation_actual_list ',' operation_actual ; operation_actual : expression_no_err | id REFERS_TO expression ; selector : id ; unary_operator : '+' | '-' | ABS_kw | NOT_kw ; adding_operator : '+' | '-' ; multiplying_operator : '*' | '/' | MOD_kw | REM_kw ; power_operator : POWER ; assign_operator : assign_operator_not_divide | DIVIDE_ASSIGN ; assign_operator_not_divide : ASSIGN | PLUS_ASSIGN | MINUS_ASSIGN | TIMES_ASSIGN | CONCAT_ASSIGN | AND_ASSIGN | OR_ASSIGN | XOR_ASSIGN ; ASSIGN_or_equal : ASSIGN | equal_as_assign ; equal_as_assign : '=' ; comparison_operator : COMPARE | EQ | NEQ | '<' | LEQ | '>' | GEQ | '<' '<' | '>' '>' | '=' ; logical_operator : AND_kw | OR_kw | XOR_kw | AND_kw THEN_kw | OR_kw ELSE_kw | IMPLIES ; interval_operator : DOT_DOT | OPEN_INTERVAL | CLOSED_OPEN_INTERVAL | OPEN_CLOSED_INTERVAL ; aggregate : class_aggregate | container_aggregate ; class_aggregate : '(' opt_operation_actual_list ')' | '(' name DIVIDE_ASSIGN expression ')' | qualified_name DOUBLE_COLON '(' opt_operation_actual_list ')' ; container_aggregate : '[' opt_container_element_list ']' | qualified_name DOUBLE_COLON '[' opt_container_element_list ']' ; opt_container_element_list : container_element_list | DOT_DOT | ; container_element_list : container_element | container_element_list ',' container_element ; container_element : expression | simple_expression REFERS_TO filtered_expression_stream | DOT_DOT REFERS_TO filtered_expression_stream | FOR_kw iterator REFERS_TO filtered_expression_stream ; filtered_expression_stream : expression | expression ':' condition ; conditional_expression : if_expression | case_expression ; if_expression : IF_kw condition THEN_kw expression opt_else_expr ; opt_else_expr : ELSIF_kw condition THEN_kw expression opt_else_expr | ELSE_kw expression | ; case_expression : CASE_kw expression OF_kw case_expr_alt_list ; case_expr_alt_list : case_expr_alt | case_expr_alt_list ';' case_expr_alt ; case_expr_alt : '[' simple_expression_opt_named ']' REFERS_TO expression | '[' dot_dot_opt_named ']' REFERS_TO expression ; quantified_expression : FOR_kw ALL_or_SOME_kw quantified_iterator REFERS_TO condition_or_quantified_expression ; condition_or_quantified_expression : condition | if_expression | quantified_expression ; ALL_or_SOME_kw : ALL_kw | SOME_kw ; quantified_iterator : index_set_iterator | element_iterator | initial_next_while_iterator ; %%
Subscribe to:
Posts (Atom)