Wednesday, September 30, 2009

ParaSail extension, inheritance, and polymorphism

ParaSail fully supports object-oriented programming, including interface and class extension/inheritance/reuse, and both static (compile-time) and dynamic (run-time) polymorphism.

Interface extension is the most straightforward.  One interface can be defined to extend another, meaning that it inherits all of the operations and generic parameters of some existing interface, and optionally adds more of each:
  interface Extensible_Array extends Array  is
     operator "[]"(Arr : ref Extensible_Array;
        Index : Index_Type {Index >= First(Arr)})
         -> ref Element_Type {Last(Arr) >= Index};
  end interface Extensible_Array;
Here we have essentially the same operation, but now the array automatically grows (at only one end) if the indexing operation is applied to it with an index that is greater than the prior value of Last(Arr).  Note that the generic parameters for Extensible_Array are all inherited as is from Array.

The operations that are not redeclared in the new interface are inherited from Array, but with each occurrence of Array replaced systematically with Extensible_Array.

Unless Extensible_Array is declared as an abstract interface, there must be a corresponding Extensible_Array class that provides its default implementation.  The Extensible_Array class might be defined as an extension of the Array interface (and indirectly its associated class), but it need not be.  It could be a completely different implementation.   Here is what it would look like if it were an extension of the Array interface
  class Extensible_Array extends Parent: Array
  is
    ... 

  exports 
    operator "[]"(...) -> ... is ... end "[]"; 
    function Create_Array(...) -> Extensible_Array is ... end Create_Array;
  end class Extensible_Array;
Local operations and objects could be defined as needed, and exported operations could be redefined as appropriate.  An object of the interface being extended (the underlying interface) is created as a local variable of the class (this variable is the underlying object).  A name can be given to the underlying object -- we use "Parent" above.

Note that the class is extending an interface, rather than directly another class, and in fact any implementation of the interface can later be optionally specified when using Extensible_Array.  Effectively the actual class to use as the implementation of the underlying interface becomes another optional generic parameter.  If unspecified, it defaults to the default implementation of the underlying interface.

For those operations of the interface that are not provided explicitly in the class, an inherited operation is provided, based on the corresponding operation of the underlying interface.  The implementation of this inherited operation invokes the corresponding operation of the Array interface, passing it the underlying Array object of each operand of type Extended_Array.

An exported operation that has any results of the type of the extended interface must always be explicitly defined, because the corresponding operation of the underlying interface returns the wrong type of object (that of the underlying interface), and there is no implicitly provided mechanism for creating an object of the extended interface given an object of the underlying interface.  Hence we are obliged to explicitly define Create_Array, because in the underlying interface it returned a result of type Array, while to implement the extended interface it has to return a result of type Extended_Array.  Of course inside the new Create_Array we could (and probably would) call the Create_Array operation of the underlying interface to create the underlying object, and then presumably take additional actions needed to finish creating an Extended_Array.

Although an interface may extend any number of other interfaces, a class can only extend one interface.  Of course it may have local variables of many different interfaces, but only one of them is the underlying object, and only inherited operations corresponding to operations on the underlying interface will be automatically provided.

One important thing to note about how ParaSail implementation inheritance works.  It is a completely black box.  Each implicitly provided inherited operation calls the corresponding operation of the underlying interface, passing it the underlying objects of any operands of the type being defined.  The underlying interface operation operates only on the underlying object(s), having no knowledge that it was called "on behalf" of some extended class.  If the underlying operation calls other operations, they too are operating only on the underlying object(s).  There is no "redispatch" on these internal calls, so the extended class can treat these underlying operations as black boxes, and not worry that if it explicitly defines some operations while inheriting others, that that might somehow interact badly with how the underlying operations are implemented.

Which brings us to polymorphism.  Static, compile-time polymorphism has already been illustrated, through the use of generic parameters, and the name resolution rules.  We also see it here in that the particular class implementing the underlying interface for an extended class can also vary depending on the particular instantiation of the extended class.  Although we didn't mention the syntax for that, it makes sense for it to use the extends keyword as well at the point when we are providing the actual generic parameters:
    var XArr : Extended_Array 
       extends Sparse_Array := Create_Array(...);
Now what about dynamic, run-time polymorphism?  When do we actually need it?  A classic example is that of a heterogeneous tree structure, where all of the nodes in the tree are extensions of some basic Tree_Node type, but each is specialized to carry one particular kind of information or another.  Static polymorphism is great for creating general purpose algorithms and data structures, but they are inevitably homogeneous to some degree.  The algorithm manipulates numbers all of the same precision, or the data structure holds elements all of the same type.  With dynamic polymorphism, heterogeneity rules.

In ParaSail, dynamic polymorphism is indicated by appending a "+" to a type name.  What this means is that the corresponding object, parameter, or result might be of a type coming from any extension of the given type's interface, with the proviso that the generic parameters inherited from this original interface have the same bindings for the two types.  That is important because, without that, the operations shared between the two types would not necessarily take the same types of parameters.

To illustrate, presume we give a name to the student-array type we have been using in some of the above examples:
  subtype Student_Array is Array<String, Student_ID>;
Now if we indicate a parameter of type "Student_Array+" then we can take parameters of type Student_Array but also objects of type "Extended_Array<String, Student_Id>" and "Sparse_Array<String, Student_ID>".  Similarly, if we indicate a result of "Student_Array+", then the object returned might be of type Student_Array, or it might be of any other "similar" type.  A Student_Array "factory" might typically specify its result type as "Student_Array+" to allow run-time flexibility in choosing which particular type of array-like structure to return.

So now we have a polymorphic Student_Array+ object, say "SA_Plus," what can we do with it?  We can of course pass it to operations that take a parameter of type Student_Array+.  But what about all those nice operations of Array?  They take operations of type Array, not Array+.  What happens if we try to pass SA_Plus to one of those, say "First(SA_Plus)" or "SA_Plus[Joes_Id]"?  This is where dynamic, run-time polymorphism kicks in, and we go to the "appropriate" implementation of "First" or operator "[]", based on a run-time type identification carried around with objects of a polymorphic type.

If we happen to have a binary operation in our interface Array, for example Merge, which is defined to take two Array operands, and somehow combine them into a result Array, how does that work?  The requirement is that either all or none of the operands are polymorphic, and if all are polymorphic, then they must have the same run-time type identification (henceforth "type-id"), and the result will also be polymorphic but will be guaranteed to similarly have that same type-id.  Since all of the operands are required to have the same type-id, that type-id determines which particular implementation of the operation is actually invoked at run-time.

All of this only applies to operations declared in the interface defining the root type of the polymorphic type (Array in this example).  Operations declared elsewhere that take some particular array type have no special semantics.  But that makes sense because only the operations defined in the root interface are guaranteed to be present in all of the root interface's extensions.

What about operations that return an object of the enclosing interface type, but don't take any operands of the type?  This would be typical for a constructor function, such as Create_Array()->Array, or for something resembling an enumeration literal like "Red" for a color type.  Can such an operation be called in a "polymorphic" fashion?  Yes, if it is used as an operand in a call and some other operand in the call is polymorphic, then the rule that requires all such operands to have the same type-id is used as a kind of self-fulfilling prophecy: the type-id to use for the call on Create_Array is that of the other operand.  So if we use our hypothetical binary operation Merge and one of the operands is our polymorphic object SA_Plus, and the other is Create_Array(...), then we call the Create_Array associated with the type-id we get from SA_Plus, and then proceed to call Merge on the SA_Plus and the result of Create_Array, confident in the knowledge that the two operands will now have the same type-id.

This approach ensures that when you get into the implementation of a binary operation, the operands have the same type, and it is the type of the interface whose operation is actually being invoked.

That's probably enough for now!

No comments:

Post a Comment