Monday, May 31, 2010

Talking about ParaSail

In planning for a talk to be given about ParaSail at the upcoming AdaEurope conference in Valencia, Spain, it seemed useful to try to come up with a set of bullet points that summarize why ParaSail might be of interest.  Here is the first draft of those bullets:
  • Implicit parallelism
    • Parameters evaluated in parallel with "hand-off" semantics,
    • "for" loops unordered by default, must work to force explicit sequencing between statements; 
    • Can request explicit parallelism with "||" and "concurrent loop"
  • Both Lock-free and Lock/Queue-based concurrent data structures
    • Operations on concurrent data structures presumed unordered by default;
    • Can use "dequeue conditions" to control ordering for queued operations;
    • Can wait simultaneously on multiple queued operations, selecting first one that is ready to be dequeued to proceed.
  • Small number of concepts:
    • Module, Type, Object, Operation
    • You get a Type by instantiating a Module (all Modules are "generic").
    • You get an Object by instantiating a Type.
    • An Operation takes inputs and produces outputs; no up-level variables
      • Operation names generally usable without any prefix, presuming some input or output is of type defined by module enclosing operation definition.
    • Nesting of entities uses distinct syntax:
      • type::nested_entity
      • object.component
      • object.operation(...) equiv to operation(object,...)
  • No global variables and no aliasing of (non-concurrent) variables
    • Read-write access only via parameters; 
    • "Context" parameter provided to gain access to operating system, database, user, etc.
    • Compiler prevents aliasing between parameters using "handoff" semantics.
  • Encourages formal approach to software
    • Preconditions, Postconditions, Assertions, Subtype constraints all use consistent syntax:
      • {boolean_expression} 
      • c.f. {P}S{Q} in Hoare logic
    • Compile-time enforcement of assertions and all other language-defined "checks"
      • no run-time exceptions to worry about.
    • [[Expr]] notation (c.f. denotational semantics) used to define semantics of operations in terms of "universal" types.
  • Minimal need for pointers:
    • Generalized containers support indexing and aggregate syntax 
      • A[I], [X, Y, Z], ["abc" => M, "def" => N]
    • Expandable objects with optional back/encloser references
    • Pointers only to stable or concurrent objects
    • Safe region-based/object-based storage management.
  • Five kinds of literals, all usable with user-defined types.
    • Integer, Real, Character, String, and Enumeration.
    • Corresponding Univ_XXX type for each kind of literal with full run-time semantics.
    • Syntactically distinct from other names used for entities
      • 42, 3.1459, 'X', "Hello", #red
    • User-defined types can use literals
      • provide conversion operation from Univ_xxx type to allow use of literals 
      • provide conversion operation to Univ_xxx to allow "[[...]]" denotational semantics notation.
  • Declarations syntactically distinguished from statements
    • Allows intermingling of statements and declarations within a block
    • var X [: T] [:= Expr];
    • const K [: T] := Expr;
    • type T is [new] Queue<actuals>;
    • function F(A : P; B : Q) -> R;
    • [abstract] [concurrent] interface Queue<formals> is ...

Wednesday, May 26, 2010

Handling concurrent events in ParaSail

In a real-time system it is common to have a situation where you are waiting for two or more events, and want to react to the first one that occurs, and cancel the "wait" for the other ones. In ParaSail, waiting for an event is generally represented by a call on an operation of a concurrent object with a queued operand.  For example, here is the concurrent buffer example from an earlier post:
concurrent interface Bounded_Buffer
  <Element_Type is Assignable<>;
   Index_Type is Integer<>> is
    function Create_Buffer(
      Max_In_Buffer : Index_Type {Max_In_Buffer > 0})
      -> Result : Bounded_Buffer;
      // Create buffer of given capacity

    procedure Put(Buffer : queued Bounded_Buffer; 
      Element : Element_Type);
      // Add element to bounded buffer; 
      // remain queued until there is room
      // in the buffer.

    function Get(Buffer : queued Bounded_Buffer)
      -> Element_Type;
      // Retrieve next element from bounded buffer;
      // remain queued until there is an element.
end interface Bounded_Buffer; 
To wait on a Get from two different buffers, you could do this using the select statement in ParaSail:
    var X := Get(Buf1) => ... // use X received from Buf1
    var X := Get(Buf2) => ... // use X received from Buf2
end select;
A select statement attempts to perform each of the queued operations concurrently. The first one to become dequeued causes the others to be canceled, and then proceeds to completion.

Time delays are also represented as queued operations on a concurrent object (a clock or a timer):
concurrent interface Clock<Time_Type is Assignable<>> 
    function Create_Clock(...) -> Clock;
    function Now(C : locked Clock) -> Time_Type;
    procedure Delay_Until
      (C : queued Clock; Wakeup : Time_Type);
end interface Clock;
Presumably a call of Delay_Until(C, Wakeup_Time) will queue the caller until Now >= Wakeup_Time. A possible implementation of the Delay_Until procedure could use that as its dequeue condition:
concurrent class Clock<Time_Type is Assignable<>>
    var Current_Time : Time_Type;
    function Now(C : locked Clock) -> Time_Type is
        return Current_Time;
    end Now;

    procedure Delay_Until
      (C : queued Clock; Wakeup : Time_Type)
      queued until Current_Time >= Wakeup is
        return// nothing more to do
    end Delay_Until;
end class Clock;
We can effectively put a time bound on a select statement by adding an alternative that is a call on an operation like Delay_Until:
    var X := Get(Buf1) => ...
    var X := Get(Buf2) => ...
    Delay_Until(Next_Time) => ...
end select;

Thursday, May 20, 2010

ParaSail without pointers?

We are working on defining the semantics for pointers in ParaSail.  As described in an earlier entry, we plan to use region-based storage management rather than more fine-grained garbage collection.  However, there is the question of how pointers are associated with their region, and more generally how pointers are handled in the type system.  We are considering a somewhat radical approach:  effectively eliminate most pointers, replacing them with generalized container indexing and expandable objects.  Why bother?  Because unrestricted use of pointers can create storage management complexity, including both dangling references and storage leakage, and because pointers create aliasing of variables, where there are multiple ways to get to the same variable, interfering with any attempts to use handoff semantics to avoid unsynchronized access to a (non-concurrent) variable between threads.

By generalized container indexing, we mean using an abstract notion of index into an abstract notion of container.  The most basic kind of container is an array, with the index being an integer value or an enumeration value.  Another kind of container is a hash table, with the index (the key) being something like a string value.  We could imagine that a single container might have multiple kinds of indices, in the same way that a relation in a relational DBMS can have multiple indices.  So for example, a hash table might have an index that identifies directly the node in the hash table using something that is the moral equivalent of a pointer, though it can presumably survive the node being deleted and will properly indicate that the node no longer exists, rather than simply acting as a dangling reference.  If we go to the extreme of a generalized region of heterogeneous objects as the container, and the index identifies the object within the region, we are very close to a pointer.  However, we would still require that such an index correctly handle deletion, indicating that the object no longer exists, rather than simply dangling.

Although indexing can also produce aliasing, it is very well defined and can be easily checked for at the point of the indexing.  Two expressions that index into the same container are aliased if and only if the reference into the container determined by the two indices identify the same element of the container.  For example, A[I] and A[J] are aliased only if I == J.  Or Table["abc"] and Table["def"] are aliased only if the two keys reference the same element of Table.

By expandable objects, we mean treating a declared but not initialized object (including a component of another object) as a kind of stub into which a value can be stored.  Even an initialized object could be overwritten via an assignment which might change the size of the object.  Finally, an initialized object could be set back to its stub state, where the object doesn't really exist anymore.  We discussed a related issue in a past entry on ParaSail references.

Expandable objects eliminate a common use of pointers as a level of indirection to support objects of variable or unknown size.  We propose to provide qualifiers optional and mutable on objects to indicate that they are expandable.  The qualifier optional means that the object starts out null, but may be assigned a non-null value.  The qualifier mutable means that the object may be assigned multiple values of different sizes during its lifetime.  If both are given, then the object starts out null, can be assigned non-null values of varying size during its lifetime, and can be assigned back to null.

Another source of pointers is for back pointers, that is pointers back to the enclosing object (or to the previous node in a doubly-linked list).  We propose to provide something analogous, where a component of an object has the equivalent of a (potentially null) reference to its enclosing object.  In this way a binary tree structure can be constructed by an object with Left and Right optional subobjects, plus an encloser reference to the enclosing object, if any.  Similarly a linked list could be represented as an object with an optional Next subobject, and an encloser reference to the Prev node/enclosing object.  Some kind of move or swap operation would be useful in addition to assignment to support rearranging a tree or linked list structured in this way.

Note that it is perfectly safe to use conventional pointers to objects that are no longer changing (or where any remaining dynamic parts are within concurrent subobjects), and that are known to outlive the object containing the pointer.  So ParaSail should probably allow such pointers, as they introduce no race conditions, nor any other safety issues.

Saturday, May 15, 2010

ParaSail enumeration types

We haven't talked about enumeration types in ParaSail yet.  One challenge is how to define an enumeration type by using the normal syntax for instantiating a module, which is the normal way a type is defined in ParaSail.  Generally languages resort to having special syntax for defining an enumeration type (Haskell being an interesting exception).  But that seems somewhat unsatisfying, given that we can fit essentially all other kinds of type definitions into the model of instantiating a module.  Another challenge with enumeration types is the representation of their literal values.  In many languages, enumeration literals look like other names, and are considered equivalent to a named constant, or in some cases a parameterless function or constructor.  Overloading of enumeration literals (that is using the same name for a literal in two different enumeration types declared in the same scope) may or may not be supported.

In ParaSail we propose the following model: We define a special syntax for enumerals (enumeration literals), of the form #name (e.g. #true, #false, #red, #green). We define a universal type for enumerals, Univ_Enumeration.  We allow a type to provide conversion routines from/to Univ_Enumeration, identified by operator "from_univ" and operator "to_univ".  If a type has a from_univ operator that converts from Univ_Enumeration, then that type is effectively an enumeration type, analogous to the notion that a type that has a from_univ that converts from Univ_Integer is essentially an integer type.  As with other types, double-bracket notation applied to a value of an enumeration type, e.g. [[enumval]], is equivalent to a call on to_univ(enumval), and produces a result of Univ_Enumeration type.  An additional special Boolean "in" operator which takes one Univ_Enumeration parameter determines which enumerals are values of the type.  The notation "X in T" is equivalent to a call on the operator T::"in"(X).  The precondition on from_univ would generally be {univ in Enum_Type}.

Here is an example:
interface Enum<Enumerals : Vector<Univ_Enumeration>> is
  operator "in"(Univ_Enumeration) -> Boolean<>; 
  operator "from_univ"
    (Univ : Univ_Enumeration {Univ in Enum}) -> Enum;
  operator "to_univ"(Val : Enum) 
    -> Result: Univ_Enumeration { Result in Enum };
end interface Enum;

type Color is Enum<[#red,#green,#blue]>;
var X : Color := #red;  // implicit call on from_univ

Here we presume the class associated with the Enum interface implements "in", "from_univ", and "to_univ" by using the Enumerals vector to create a mapping from Univ_Enumeration to the appropriate value of the Enum type, presuming the enumerals map to sequential integers starting at zero. A more complex interface for creating enumeration types might take such a mapping directly, allowing the associated values to be other than the sequential integers starting at zero. The built-in type Boolean is presumed to be essentially Enum<[#false,#true]>.

So why this model?  One is that it fits nicely with the way that other types with literals are defined in ParaSail, by using from_univ/to_univ.  Secondly, it naturally allows overloading of enumeration literals, in the same way that the types of numeric literals are determined by context.  Finally, it makes a strong lexical distinction between enumeration literals and other names.  This could be considered a bug, but on balance we believe it is useful for literals to stand out from, and not conflict with, normal names of objects, types, operations, and modules.  Note that this model is reminiscent of what is done in Lisp with the quote operator applied to names, e.g. 'foo.  It also borrows from the notation in Scheme of #f and #t for the literals representing false and true.