We are toying with the idea of having a "mode" in the ParaSail compiler in which indenting is significant, as in Python. This would allow the programmer to omit the "end XYZ" part of a construct, provided that they start the next statement at the appropriate level of indentation. This would also allow the programmer to omit the ";" terminator of a statement or declaration so long as continuation lines are indented and new statements/declarations are not indented relative to the prior statement/declaration.
Because we would like to support both styles, we are considering having the "indentation is significant" flag be a function of the filename extension on the file. If the filename ends ".psi" or ".psl" then indentation is not significant, and ";" and "end XYZ" are needed as usual. If the filename ends ".psn" (for ParaSail, iNdented), then indentation is significant, and ";" and "end XYZ" are optional provided appropriate indenting is performed.
This is just an experiment so far, but any comments would be appreciated. This is almost certainly well into the religious zone of language design, so an extra effort toward courtesy in any comments would be very much appreciated...
This blog will follow the trials and tribulations of designing a new programming language designed to allow productive development of parallel, high-integrity (safety-critical, high-security) software systems. The language is tentatively named "ParaSail" for Parallel, Specification and Implementation Language.
Thursday, August 30, 2012
Example of doubly linked list in ParaSail
A question was posted asking how you would create in a pointer-free language like ParaSail a doubly-linked list, with constant time addition on either end of the list, and constant time removal anywhere within the list. One straightforward way to do this is to use a Vector to store the nodes, and then use indices into the vector to create the doubly linked list. Below is such an implementation.
// Illustration of a doubly-linked list implemented in ParaSail // Copyright (C) 2012, AdaCore, New York, NY // To be used only for Personal, Academic, or Evaluation Purposes; // Not for Commercial Production Use. // Report errors at http://groups.google.com/group/parasail-programming-language // There are two modules here, plus two test programs. // The first module, Doubly_Linked_Index, provides a doubly-linked // list of indices, the indices being usable with another vector // where the actual data can be stored. // The second module, Doubly_Linked_List, uses the Doubly_Linked_Index // to provide the "Elem_Id" type, which is used to identify nodes in the // linked list. It also has a formal type "Element_Type" which is the // type of the actual data to be stored in the linked list. interface Doubly_Linked_Index<Max_Len : Univ_Integer := 100_000> is // Doubly-linked list acting as index into a separate vector of data type Elem_Id is new Integer<1..Max_Len>; op "[]"() -> Doubly_Linked_Index; // Create an empty linked list func Append(var DLI : Doubly_Linked_Index) -> Elem_Id; // Add a new node onto end of linked list func Prepend(var DLI : Doubly_Linked_Index) -> Elem_Id; // Add a new node at beginning of linked list func Insert_Before(var DLI : Doubly_Linked_Index; Elem_Id) -> optional Elem_Id; // Insert new node before given element Id func Insert_After(var DLI : Doubly_Linked_Index; Elem_Id) -> optional Elem_Id; // Insert new node after given element Id func Remove(var DLI : Doubly_Linked_Index; Elem_Id); // Remove specified element from linked list (if present) op "in"(Elem_Id; DLI : Doubly_Linked_Index) -> Boolean; // Return True if given element is in linked list func Remove_First(var DLI : Doubly_Linked_Index) -> optional Elem_Id; // Remove first element from linked list func Remove_Last(var DLI : Doubly_Linked_Index) -> optional Elem_Id; // Remove last element from linked list func Remove_Any(var DLI : Doubly_Linked_Index) -> optional Elem_Id; // Remove arbitrary element from linked list func Count(DLI : Doubly_Linked_Index) -> Univ_Integer; // Return count of elements in linked list func Max_Id_In_Use(DLI : Doubly_Linked_Index) -> optional Elem_Id; // Return max Elem_Id in use end interface Doubly_Linked_Index;
interface Doubly_Linked_List <Element_Type is Assignable<>; Max_Len : Univ_Integer := 100_000> is // Doubly-linked list type DLI is new Doubly_Linked_Index<Max_Len>; type Elem_Id is DLI::Elem_Id; op "[]"() -> Doubly_Linked_List; // Create an empty linked list func Append(var DLL : Doubly_Linked_List; Elem : Element_Type) -> Elem_Id; // Add a new element onto end of linked list func Prepend(var DLL : Doubly_Linked_List; Elem : Element_Type) -> Elem_Id; // Add a new element at beginning of linked list func Insert_Before (var DLL : Doubly_Linked_List; Elem : Element_Type; Elem_Id) -> optional Elem_Id; // Insert new element before given element Id func Insert_After (var DLL : Doubly_Linked_List; Elem : Element_Type; Elem_Id) -> optional Elem_Id; // Insert new element after given element Id func Remove(var DLL : Doubly_Linked_List; Elem_Id); // Remove specified element from linked list (if present) op "in"(Elem_Id; DLL : Doubly_Linked_List) -> Boolean; // Return True if given element is in linked list func Remove_First(var DLL : Doubly_Linked_List) -> optional Element_Type; // Remove first element from linked list func Remove_Last(var DLL : Doubly_Linked_List) -> optional Element_Type; // Remove last element from linked list func Remove_Any(var DLL : Doubly_Linked_List) -> optional Element_Type; // Remove arbitrary element from linked list func Count(DLL : Doubly_Linked_List) -> Univ_Integer; // Return count of elements in linked list func Max_Id_In_Use(DLL : Doubly_Linked_List) -> optional Elem_Id; // Return max Elem_Id in use op "indexing"(ref DLL : Doubly_Linked_List; Elem_Id) -> ref optional Element_Type; // Return a reference to specified element end interface Doubly_Linked_List;
// Here are the implementations of the above two modules class Doubly_Linked_Index is interface Node<> is // Each node on list has next and prev Id var Next : Elem_Id; var Prev : Elem_Id; end interface Node; var Links : ZVector<Node>; // zero-th element is header var First_Free : Elem_Id; var Num_Free : Univ_Integer; func Add_Node(var DLI : Doubly_Linked_Index; Next, Prev : Elem_Id) -> New_Id : Elem_Id is // Add a node with given Next/Prev fields to Links vector var New_Node for DLI := Node::(Next => Next, Prev => Prev); if DLI.First_Free != 0 then // Reuse a free node` New_Id := DLI.First_Free; DLI.First_Free := DLI.Links[New_Id].Next; DLI.Num_Free -= 1; DLI.Links[New_Id] <== New_Node; else // Add a new node at end of Links vector New_Id := Length(DLI.Links); DLI.Links <|= New_Node; end if; DLI.Links[Prev].Next := New_Id; DLI.Links[Next].Prev := New_Id; return New_Id; end func Add_Node; // {Num_Free < Length(Links); First_Free in 0..<Length(Links)} // {(for all N of Links => N.Next in 0..<Length(Links) and then // N.Prev in 0..<Length(Links)} exports op "[]"() -> Doubly_Linked_Index is return (Links => [(Next => 0, Prev => 0)], First_Free => 0, Num_Free => 0); end op "[]"; func Append(var DLI : Doubly_Linked_Index) -> New_Id : Elem_Id is // Add a new node onto end of linked list return Add_Node(DLI, Next => 0, Prev => DLI.Links[0].Prev); end func Append; func Prepend(var DLI : Doubly_Linked_Index) -> Elem_Id is // Add a new node at beginning of linked list return Add_Node(DLI, Next => DLI.Links[0].Next, Prev => 0); end func Prepend; func Insert_Before(var DLI : Doubly_Linked_Index; Follower : Elem_Id) -> New_Id : optional Elem_Id is // Insert new node before given element Id if Follower in DLI then return Add_Node(DLI, Next => Follower, Prev => DLI.Links[Follower].Prev); else // Follower not in linked list return null; end if; end func Insert_Before; func Insert_After(var DLI : Doubly_Linked_Index; Elem_Id) -> optional Elem_Id is // Insert new node after given element Id if Elem_Id in DLI then return Add_Node(DLI, Next => DLI.Links[Elem_Id].Next, Prev => Elem_Id); else // Elem_Id not in linked list return null; end if; end func Insert_After; op "in"(Elem_Id; DLI : Doubly_Linked_Index) -> Boolean is // Return #true if given element is in linked list // NOTE: All elements on free list have Prev pointing to themselves return Elem_Id in 1..<Length(DLI.Links) and then DLI.Links[Elem_Id].Prev != Elem_Id; end op "in"; func Remove(var DLI : Doubly_Linked_Index; Elem_Id) is // Remove specified element from linked list (if present) if Elem_Id in DLI then // Not on the free list, so OK to remove ref Elem => DLI.Links[Elem_Id]; DLI.Links[Elem.Prev].Next := Elem.Next; DLI.Links[Elem.Next].Prev := Elem.Prev; // Mark as being on the free list Elem.Prev := Elem_Id; // Add to the free list Elem.Next := DLI.First_Free; DLI.First_Free := Elem_Id; DLI.Num_Free += 1; end if; end func Remove; func Remove_First(var DLI : Doubly_Linked_Index) -> optional Elem_Id is // Remove first element from linked list const First := DLI.Links[0].Next; if First == 0 then // List is empty return null; else Remove(DLI, First); return First; end if; end func Remove_First; func Remove_Last(var DLI : Doubly_Linked_Index) -> optional Elem_Id is // Remove last element from linked list const Last := DLI.Links[0].Prev; if Last == 0 then // List is empty return null; else Remove(DLI, Last); return Last; end if; end func Remove_Last; func Remove_Any(var DLI : Doubly_Linked_Index) -> optional Elem_Id is // Remove arbitrary element from linked list if Count(DLI) mod 2 == 0 then // Remove First if Count is odd, Remove Last if Count is even return Remove_Last(DLI); else return Remove_First(DLI); end if; end func Remove_Any; func Count(DLI : Doubly_Linked_Index) -> Univ_Integer is // Return count of elements in linked list return Length(DLI.Links) - DLI.Num_Free - 1; end func Count; func Max_Id_In_Use(DLI : Doubly_Linked_Index) -> optional Elem_Id is // Return max Elem_Id in use return Length(DLI.Links) - 1; end func Max_Id_In_Use; end class Doubly_Linked_Index;
class Doubly_Linked_List is var Index : DLI; var Data : ZVector<optional Element_Type>; // zeroth element is used to represent a missing element exports op "[]"() -> Doubly_Linked_List is // Create an empty linked list return (Index => [], Data => [null]); end op "[]"; func Append(var DLL : Doubly_Linked_List; Elem : Element_Type) -> Result : Elem_Id is // Add a new element onto end of linked list Result := Append(DLL.Index); DLL.Data[Result] := Elem; end func Append; func Prepend(var DLL : Doubly_Linked_List; Elem : Element_Type) -> Result : Elem_Id is // Add a new element at beginning of linked list Result := Prepend(DLL.Index); DLL.Data[Result] := Elem; end func Prepend; func Insert_Before (var DLL : Doubly_Linked_List; Elem : Element_Type; Elem_Id) -> Result : optional Elem_Id is // Insert new element before given element Id Result := Insert_Before(DLL.Index, Elem_Id); if Result not null then DLL.Data[Result] := Elem; end if; end func Insert_Before; func Insert_After (var DLL : Doubly_Linked_List; Elem : Element_Type; Elem_Id) -> Result : optional Elem_Id is // Insert new element after given element Id Result := Insert_After(DLL.Index, Elem_Id); if Result not null then DLL.Data[Result] := Elem; end if; end func Insert_After; func Remove(var DLL : Doubly_Linked_List; Elem_Id) is // Remove specified element from linked list (if present) Remove(DLL.Index, Elem_Id); end func Remove; op "in"(Elem_Id; DLL : Doubly_Linked_List) -> Boolean is // Return True if given element is in linked list return Elem_Id in DLL.Index; end op "in"; func Remove_First(var DLL : Doubly_Linked_List) -> Result : optional Element_Type is // Remove first element from linked list const Id := Remove_First(DLL.Index); if Id not null then // Remove element from Data vector Result <== DLL.Data[Id]; else // List is empty Result := null; end if; end func Remove_First; func Remove_Last(var DLL : Doubly_Linked_List) -> Result : optional Element_Type is // Remove last element from linked list const Id := Remove_Last(DLL.Index); if Id not null then // Remove element from Data vector Result <== DLL.Data[Id]; else // List is empty Result := null; end if; end func Remove_Last; func Remove_Any(var DLL : Doubly_Linked_List) -> Result : optional Element_Type is // Remove arbitrary element from linked list const Id := Remove_Any(DLL.Index); if Id not null then // Remove element from Data vector Result <== DLL.Data[Id]; else // List is empty Result := null; end if; end func Remove_Any; func Count(DLL : Doubly_Linked_List) -> Univ_Integer is // Return count of elements in linked list return Count(DLL.Index); end func Count; func Max_Id_In_Use(DLL : Doubly_Linked_List) -> optional Elem_Id is // Return max Elem_Id in use return Max_Id_In_Use(DLL.Index); end func Max_Id_In_Use; op "indexing"(ref DLL : Doubly_Linked_List; Elem_Id) -> ref optional Element_Type is // Return a reference to specified element if Elem_Id in DLL.Index and then Elem_Id in 1..<Length(DLL.Data) then return DLL.Data[Elem_Id]; else // return reference to null value stored in zeroth element return DLL.Data[0]; end if; end op "indexing"; end class Doubly_Linked_List;
// Two test programs, one for the Doubly_Linked_Index module // and one for the Doubly_Linked_List module. func Test_DLI() is var DLI : Doubly_Linked_Index := []; Println("Append three times"); const Id1 := Append(DLI); const Id2 := Append(DLI); const Id3 := Append(DLI); for Id in DLI forward loop Println("Found Elem_Id = " | Id); end loop; Println("Prepend twice"); const Id4 := Prepend(DLI); const Id5 := Prepend(DLI); Println("Count = " | Count(DLI)); for Id in DLI forward loop Println("Found Elem_Id = " | Id); end loop; Println("Insert before original second append"); const Id6 := Insert_Before(DLI, Id2); Println("Count = " | Count(DLI)); for Id in DLI forward loop Println("Found Elem_Id = " | Id); end loop; Println("Remove " | Id1 | " and " | Id4); Remove(DLI, Id1); Remove(DLI, Id4); Println("Count = " | Count(DLI)); for Id in DLI forward loop Println("Found Elem_Id = " | Id); end loop; Println("Reverse loop"); for Id in DLI reverse loop Println("Found Elem_Id = " | Id); end loop; Println("Append one and prepend two"); const Id7 := Append(DLI); const Id8 := Prepend(DLI); const Id9 := Prepend(DLI); Println("Count = " | Count(DLI)); for Id in DLI forward loop Println("Found Elem_Id = " | Id); end loop; Println("Reverse loop"); for Id in DLI reverse loop Println("Found Elem_Id = " | Id); end loop; Println("Unordered loop"); for Id in DLI loop Println("Found Elem_Id = " | Id); end loop; end func Test_DLI;
func Test_DLL() is var DLL : Doubly_Linked_List<Univ_Enumeration> := []; Println("Append three times"); const Id1 := Append(DLL, #one); const Id2 := Append(DLL, #two); const Id3 := Append(DLL, #three); Println("Count = " | Count(DLL)); Println("Item 2 = " | DLL[Id2]); Println("Item 3 = " | DLL[Id3]); var I := 0; for Elem in DLL forward loop Println("Found Elem = " | Elem); I += 1; if I > 5 then exit loop; end if; end loop; Println("Prepend twice"); const Id4 := Prepend(DLL, #four); const Id5 := Prepend(DLL, #five); Println("Count = " | Count(DLL)); for Elem in DLL forward loop Println("Found Elem = " | Elem); end loop; Println("Insert before original second append"); const Id6 := Insert_Before(DLL, #six, Id2); Println("Count = " | Count(DLL)); for Elem in DLL forward loop Println("Found Elem = " | Elem); end loop; Println("Remove " | Id1 | " and " | Id4); Remove(DLL, Id1); Remove(DLL, Id4); Println("Count = " | Count(DLL)); for Elem in DLL forward loop Println("Found Elem = " | Elem); end loop; Println("Reverse loop"); for Elem in DLL reverse loop Println("Found Elem = " | Elem); end loop; Println("Append one and prepend two"); const Id7 := Append(DLL, #seven); const Id8 := Prepend(DLL, #eight); const Id9 := Prepend(DLL, #nine); Println("Count = " | Count(DLL)); for Elem in DLL forward loop Println("Found Elem = " | Elem); end loop; Println("Reverse loop"); for Elem in DLL reverse loop Println("Found Elem = " | Elem); end loop; Println("Unordered loop"); for Elem in DLL loop Println("Found Elem = " | Elem); end loop; end func Test_DLL;
Wednesday, August 29, 2012
Rev 3.2 of alpha release 0.5 now available
We just released a new version of the ParaSail compiler and virtual machine. This includes one major new feature (using polymorphic types as actual parameters), and a number of smaller features:
1) Support instantiating a module with a polymorphic actual type, such as:
var VP : Vector<Expr+>;
This allows the construction of heterogeneous containers.
2) Improve detection of inappropriate calls on abstract operations. Detect when interface of operation uses polymorphic type for a parameter, while implementation uses non-polymorphic type, or vice-versa.
3) Support "continue" statement with named or multiple values, such as:
continue loop with X => X.Left;
and
continue loop with (I => I+1, J => J-1);"
4) Support defining an operation with a single expression, as in:
func Square(X : Integer) -> Integer is (X**2);
5) Support case expressions, such as:
(case C of [#red] => X; [#green] => Y; [#blue] => Z)
---------
Version 3.2 is available at the same URL as version 3.1:
http://bit.ly/Mx9DRb
1) Support instantiating a module with a polymorphic actual type, such as:
var VP : Vector<Expr+>;
This allows the construction of heterogeneous containers.
2) Improve detection of inappropriate calls on abstract operations. Detect when interface of operation uses polymorphic type for a parameter, while implementation uses non-polymorphic type, or vice-versa.
3) Support "continue" statement with named or multiple values, such as:
continue loop with X => X.Left;
and
continue loop with (I => I+1, J => J-1);"
4) Support defining an operation with a single expression, as in:
func Square(X : Integer) -> Integer is (X**2);
5) Support case expressions, such as:
(case C of [#red] => X; [#green] => Y; [#blue] => Z)
---------
Version 3.2 is available at the same URL as version 3.1:
http://bit.ly/Mx9DRb
Tuesday, August 14, 2012
Related work on pointer-free parallel programming
The prior posting on pointer-free parallel programming in ParaSail was a draft of a submission for the FOOL 2012 workshop. That draft lacked a "related work" section. Here is the Related Work section, with some associated references. Comments are welcome as usual.
--------------------
Related Work
There are few if any pointer-free languages currently under active development. Fortran 77 [2] was the last of the Fortran series that restricted itself to a pointer-free model of programming. Algol 60 lacked pointers [3], but Algol 68 introduced them [4]. Early versions of Basic had no pointers [5], but modern versions of Basic use pointer assignment semantics for most complex objects [6]. The first versions of Pascal, Ada, Modula, C, and C++ all used pointers for objects that were explicitly allocated on the heap, while still supporting stack-based records and arrays; these languages also required manual heap storage reclamation. The first versions of Eiffel, Java, and C# provided little or no support for stack-based records and arrays, moving essentially all complex objects into the heap, with pointer semantics on assignment, and automatic garbage collection used for heap storage reclamation.
In many cases, languages that originally did not require heavy use of pointers, as they evolved to support object-oriented programming, the use of pointers increased, often accompanied by a reliance on garbage collection for heap storage reclamation. For example, Modula-3 introduced “object” types, and all instances of such types were allocated explicitly on the heap, with pointer semantics on assignment, and automatic garbage collection for storage reclamation [7].
The SPARK language, a high-integrity subset of Ada with added proof annotations, omits pointers from the subset [8]. No particular attempt was made to soften the effect of losing pointers, so designing semi-dynamic data structures such as trees and linked-lists in SPARK requires heavy use of arrays [9].
Pure functional languages, such as Haskell [10], avoid many of the issues of pointers by adopting immutable objects, meaning that sharing of data creates no aliasing or race condition problems. However, mostly functional languages, such as those deriving from the ML language [11], include references to mutable objects, thereby re-introducing most of the potential issues with aliasing and race conditions. Even Haskell has found it necessary to introduce special monads such as the IO monad to support applications where side-effects are essential to the operation of the program. In such cases, these side-effects need to be managed in the context of parallel programming [12].
Minimizing use of a global heap through the use of region-based storage management was pioneered in the language Cy-clone [13]. However, Cyclone was not a pointer-free language. Instead, every pointer was associated with a particular region at compile time, allowing compile-time detection of dangling references. A global, garbage-collected heap was available, but local dynamic regions provided a safe, more efficient alternative.
Many functional (or mostly functional) languages have a notion similar to ParaSail’s optional objects. For example, in Haskell they are called maybe objects [10]. In ParaSail, because of its fundamental role in supporting recursive data structures, optional is a built-in property usable with every object, component, or type declaration, rather than being an additional level of type.
References
[1] S. Tucker Taft, Designing ParaSail: A New Programming Language, 2009, http://parasail-programming-language.blogspot.com (retrieved 8/10/2012).
[2] Ansi x3.9-1978. American National Standard – Programming Language FORTRAN. American National Standards Institute, 1978, http://www.fortran.com/fortran/F77_std/rjcnf.html (retrieved 8/10/2012).
[3] Peter Naur et al, Revised Report on the Algorithmic Language Algol 60, 1963 http://www.masswerk.at/algol60/report.htm (retrieved 8/10/2012)
[4] C.H. Lindsey, “A History of Algol 68,” Proceedings of HOPL-II: The second ACM SIGPLAN conference on History of programming languages, pp 97-132, ACM New York, NY, 1993.
[5] J.G. Kemeny and T.E. Kurtz, BASIC, 4th Edition,Trutees of Dartmouth College, 1968, http://www.bitsavers.org/pdf/dartmouth/BASIC_4th_Edition_Jan68.pdf (retrieved 8/10/2012).
[6] Microsoft, Visual Basic Concepts -- Programming with Objects, http://msdn.microsoft.com/en-us/library/aa716290.aspx (retrieved 8/10/2012).
[7] L. Cardelli et al, Modula-3 Report (revised), http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-52.pdf (retrieved 8/10/2012).
[8] R. Chapman and P. Amey, SPARK-95 – The SPADE Ada Kernel (including RavenSPARK), Altran-Praxis, 2008 http://www.altran-praxis.com/downloads/SPARK/technicalReferences/SPARK95_RavenSPARK.pdf (retrieved 8/10/2012).
[9] P. Thornley, SPARKSure Data Structures, 2009, http://www.sparksure.com/resources/SPARK_Data_Structures_10_09.zip (retrieved 8/10/2012).
[10] S. Marlow, Haskell 2010 Language Report, 2010, http://www.haskell.org/onlinereport/haskell2010/ (retrieved 8/10/2012).
[11] R. Harper, Programming in Standard ML, Carnegie Mellon University, 2005, http://www.cs.cmu.edu/~rwh/smlbook/book.pdf (retrieved 8/10/2012).
[12] S. Marlow et al, “A Monad for Deterministic Parallelism,” Haskell '11 Proceedings of the 4th ACM symposium on Haskell, pp. 71-82, ACM New York, NY, 2011.
[13] D. Grossman et al, “Region-Based Memory Management in Cyclone,” PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation, pp. 282 – 293, ACM New York, NY, 2002 http://www.eecs.harvard.edu/~greg/cyclone/papers/cyclone-regions.pdf (retrieved 8/10/2012)
--------------------
Related Work
There are few if any pointer-free languages currently under active development. Fortran 77 [2] was the last of the Fortran series that restricted itself to a pointer-free model of programming. Algol 60 lacked pointers [3], but Algol 68 introduced them [4]. Early versions of Basic had no pointers [5], but modern versions of Basic use pointer assignment semantics for most complex objects [6]. The first versions of Pascal, Ada, Modula, C, and C++ all used pointers for objects that were explicitly allocated on the heap, while still supporting stack-based records and arrays; these languages also required manual heap storage reclamation. The first versions of Eiffel, Java, and C# provided little or no support for stack-based records and arrays, moving essentially all complex objects into the heap, with pointer semantics on assignment, and automatic garbage collection used for heap storage reclamation.
In many cases, languages that originally did not require heavy use of pointers, as they evolved to support object-oriented programming, the use of pointers increased, often accompanied by a reliance on garbage collection for heap storage reclamation. For example, Modula-3 introduced “object” types, and all instances of such types were allocated explicitly on the heap, with pointer semantics on assignment, and automatic garbage collection for storage reclamation [7].
The SPARK language, a high-integrity subset of Ada with added proof annotations, omits pointers from the subset [8]. No particular attempt was made to soften the effect of losing pointers, so designing semi-dynamic data structures such as trees and linked-lists in SPARK requires heavy use of arrays [9].
Pure functional languages, such as Haskell [10], avoid many of the issues of pointers by adopting immutable objects, meaning that sharing of data creates no aliasing or race condition problems. However, mostly functional languages, such as those deriving from the ML language [11], include references to mutable objects, thereby re-introducing most of the potential issues with aliasing and race conditions. Even Haskell has found it necessary to introduce special monads such as the IO monad to support applications where side-effects are essential to the operation of the program. In such cases, these side-effects need to be managed in the context of parallel programming [12].
Minimizing use of a global heap through the use of region-based storage management was pioneered in the language Cy-clone [13]. However, Cyclone was not a pointer-free language. Instead, every pointer was associated with a particular region at compile time, allowing compile-time detection of dangling references. A global, garbage-collected heap was available, but local dynamic regions provided a safe, more efficient alternative.
Many functional (or mostly functional) languages have a notion similar to ParaSail’s optional objects. For example, in Haskell they are called maybe objects [10]. In ParaSail, because of its fundamental role in supporting recursive data structures, optional is a built-in property usable with every object, component, or type declaration, rather than being an additional level of type.
References
[1] S. Tucker Taft, Designing ParaSail: A New Programming Language, 2009, http://parasail-programming-language.blogspot.com (retrieved 8/10/2012).
[2] Ansi x3.9-1978. American National Standard – Programming Language FORTRAN. American National Standards Institute, 1978, http://www.fortran.com/fortran/F77_std/rjcnf.html (retrieved 8/10/2012).
[3] Peter Naur et al, Revised Report on the Algorithmic Language Algol 60, 1963 http://www.masswerk.at/algol60/report.htm (retrieved 8/10/2012)
[4] C.H. Lindsey, “A History of Algol 68,” Proceedings of HOPL-II: The second ACM SIGPLAN conference on History of programming languages, pp 97-132, ACM New York, NY, 1993.
[5] J.G. Kemeny and T.E. Kurtz, BASIC, 4th Edition,Trutees of Dartmouth College, 1968, http://www.bitsavers.org/pdf/dartmouth/BASIC_4th_Edition_Jan68.pdf (retrieved 8/10/2012).
[6] Microsoft, Visual Basic Concepts -- Programming with Objects, http://msdn.microsoft.com/en-us/library/aa716290.aspx (retrieved 8/10/2012).
[7] L. Cardelli et al, Modula-3 Report (revised), http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-52.pdf (retrieved 8/10/2012).
[8] R. Chapman and P. Amey, SPARK-95 – The SPADE Ada Kernel (including RavenSPARK), Altran-Praxis, 2008 http://www.altran-praxis.com/downloads/SPARK/technicalReferences/SPARK95_RavenSPARK.pdf (retrieved 8/10/2012).
[9] P. Thornley, SPARKSure Data Structures, 2009, http://www.sparksure.com/resources/SPARK_Data_Structures_10_09.zip (retrieved 8/10/2012).
[10] S. Marlow, Haskell 2010 Language Report, 2010, http://www.haskell.org/onlinereport/haskell2010/ (retrieved 8/10/2012).
[11] R. Harper, Programming in Standard ML, Carnegie Mellon University, 2005, http://www.cs.cmu.edu/~rwh/smlbook/book.pdf (retrieved 8/10/2012).
[12] S. Marlow et al, “A Monad for Deterministic Parallelism,” Haskell '11 Proceedings of the 4th ACM symposium on Haskell, pp. 71-82, ACM New York, NY, 2011.
[13] D. Grossman et al, “Region-Based Memory Management in Cyclone,” PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation, pp. 282 – 293, ACM New York, NY, 2002 http://www.eecs.harvard.edu/~greg/cyclone/papers/cyclone-regions.pdf (retrieved 8/10/2012)
Wednesday, August 1, 2012
A pointer-free balanced "AA_Tree" in ParaSail
Talking about pointer-free data structures without an example is difficult. Here is an example from the "standard library" of ParaSail. It is a balanced "AA" tree. First we have the interface to the AA_Tree module, then the class that implements it, and finally a simple test function "Test_AA_Tree."
Note that this is just part of the ParaSail standard library, which is included in the downloadable prototype compiler and virtual machine. See a separate blog entry for the most recent downloadable version. You can run the test program after installing the ParaSail release by the command "pslc aaa.psi -command Test_AA_Tree 2 9 4". See the appendix in the ParaSail reference manual for more information on running the ParaSail compiler and virtual machine.
Note that this is just part of the ParaSail standard library, which is included in the downloadable prototype compiler and virtual machine. See a separate blog entry for the most recent downloadable version. You can run the test program after installing the ParaSail release by the command "pslc aaa.psi -command Test_AA_Tree 2 9 4". See the appendix in the ParaSail reference manual for more information on running the ParaSail compiler and virtual machine.
// ParaSail Prototype Standard Library // Copyright (C) 2011-2012, S. Tucker Taft, Lexington MA, USA // To be used only for Personal, Academic, or Evaluation Purposes; // Not for Commercial Production Use. // Report errors at http://groups.google.com/group/parasail-programming-language interface AA_Tree<Element is Comparable<>> is // This module implements a balanced "AA" tree, originally // described by Arne Andersson in the "Proceedings of the Workshop // on Algorithms and Data Structures," pp 60-71, Springer Verlag, 1993. // The following algorithm and descriptions were taken from the // WikiPedia article on AA_Tree: // http://en.wikipedia.org/wiki/AA_tree // Note that various additional checks for a null tree have been added. // Only two operations are needed for maintaining balance in an AA tree. // These operations are called skew and split. Skew is a right rotation // when an insertion or deletion creates a left horizontal link. Split // is a conditional left rotation when an insertion or deletion creates two // horizontal right links, which once again corresponds to two // consecutive red links in red-black trees. op "[]"() -> optional AA_Tree; // Create an empty tree func Insert(var T : optional AA_Tree; X : Element); // input: X, the value to be inserted, and // T, the root of the tree to insert it into. // output: A balanced T' including X. func Delete(var T : optional AA_Tree; X : Element); // input: X, the value to delete, and T, // the root of the tree from which it should be deleted. // output: T', balanced, without the value X. op "in"(X : Element; T : optional AA_Tree) -> Boolean; func Overlapping(T : optional AA_Tree; X : Element) -> optional Element; // input: X, the value to find, and T, // the root of the tree to be searched. // output: the element equal to or "unordered" relative to X. op "|="(var T : optional AA_Tree; X : Element) is Insert; op "<|="(var T : optional AA_Tree; var X : optional Element); // Move X into AA_Tree, leaving X null. func First(T : optional AA_Tree) -> optional Element; // Return first (smallest) element in tree func Last(T : optional AA_Tree) -> optional Element; // Return last (greatest) element in tree func Remove_First(var T : optional AA_Tree) -> optional Element; // Remove first (smallest) element in tree func Remove_Last(var T : optional AA_Tree) -> optional Element; // Remove last (greatest) element in tree func Remove_Any(var T : optional AA_Tree) -> optional Element; // Remove some element from tree func Count(T : optional AA_Tree) -> Univ_Integer; // Return a count of the nodes in the tree func Is_Empty(T : optional AA_Tree) -> Boolean; // Return True if the tree is empty end interface AA_Tree; class AA_Tree is var Value : Element; var Level : Univ_Integer := 0; var Left : optional AA_Tree; var Right : optional AA_Tree; func Node(var Value : optional Element; Level : Univ_Integer; Left, Right : optional AA_Tree) -> AA_Tree is // Create a new tree; move Value into it. return (Value <== Value, Level => Level, Left => Left, Right => Right); end func Node; func Is_Leaf(T : optional AA_Tree) -> Boolean is return T not null and then T.Left is null and then T.Right is null; end func Is_Leaf; func Leftmost(ref T : optional AA_Tree) -> ref optional AA_Tree is for L => T loop if L not null and then L.Left not null then // Continue with Left until we reach null continue loop with L.Left; else // Found left-most return L; end if; end loop; end func Leftmost; func Successor(T : optional AA_Tree) -> optional Element is // Return element in tree greater than but closest to T.Value if T.Right not null then const Succ := Leftmost(T.Right); {Succ not null}; return Succ.Value; else return null; end if; end func Successor; func Rightmost(ref T : optional AA_Tree) -> ref optional AA_Tree is for R => T loop if R not null and then R.Right not null then // Keep following down Right side continue loop with R.Right; else // Found right-most return R; end if; end loop; end func Rightmost; func Predecessor(T : optional AA_Tree) -> optional Element is // Return element in tree less than but closest to T.Value if T.Left not null then return Rightmost(T.Left).Value; else return null; end if; end func Predecessor; func Skew(var T : optional AA_Tree) is // input: T, a node representing an AA tree that needs to be rebalanced. // output: T' Another node representing the rebalanced AA tree. if T not null and then T.Left not null and then T.Left.Level == T.Level then // The current T.Left becomes new root // Exchange value of T.Left with root T.Value <=> T.Left.Value; // Move old root and T.Left.Right over to right side of tree T.Left.Right <=> T.Right; T.Left.Left <=> T.Right; T.Left <=> T.Right; end if; end func Skew; func Split(var T : optional AA_Tree) is // input: T, a node representing an AA tree that needs to be rebalanced. // output: T' Another node representing the rebalanced AA tree. if T not null and then T.Right not null and then T.Right.Right not null and then T.Level == T.Right.Right.Level then // T.Right becomes the new root // Exchange value and level between root and T.Right T.Value <=> T.Right.Value; T.Level <=> T.Right.Level; // Move old root and T.Right.Left to left side of tree T.Left <=> T.Right.Right; T.Right.Left <=> T.Right.Right; T.Left <=> T.Right; // Increment level T.Level += 1; end if; end func Split; func Decrease_Level(var T : optional AA_Tree) is // input: T, a tree for which we want to remove links that skip levels. // output: T with its level decreased. if T is null then return; end if; var Should_Be : Univ_Integer := 1; if T.Left not null then Should_Be := T.Left.Level + 1; end if; if T.Right not null then Should_Be := Min(Should_Be, T.Right.Level + 1); end if; if Should_Be < T.Level then T.Level := Should_Be; if T.Right not null and then Should_Be < T.Right.Level then T.Right.Level := Should_Be; end if; end if; end func Decrease_Level; exports op "[]"() -> optional AA_Tree is // Create an empty tree return null; end op "[]"; // Insertion begins with the normal binary tree search and insertion // procedure. Then, as the call stack unwinds (assuming a recursive // implementation of the search), it's easy to check the validity of the // tree and perform any rotations as necessary. If a horizontal left link // arises, a skew will be performed, and if two horizontal right links // arise, a split will be performed, possibly incrementing the level of the // new root node of the current subtree. Note, in the code as given above, // the increment of T.Level. This makes it necessary to continue checking // the validity of the tree as the modifications bubble up from the leaves. op "<|="(var T : optional AA_Tree; var X : optional Element) is // Move X into AA_Tree, leaving X null. // input: X, the value to be inserted, and // T, the root of the tree to insert it into. // output: A balanced T' including X. // Do the normal binary tree insertion procedure. // Set the result of the recursive call to the correct // child in case a new node was created or the // root of the subtree changes. if T is null then // Create a new leaf node with X. T := Node(X, 1, null, null); return; end if; case X =? T.Value of [#less] => T.Left <|= X; [#greater] => T.Right <|= X; [#equal | #unordered] => // Note that the case of X == T.Value is unspecified. // As given, an insert will have no effect. // The implementor may desire different behavior. X := null; return; end case; // Perform skew and then split. // The conditionals that determine whether or // not a rotation will occur or not are inside // of the procedures, as given above. Skew(T); Split(T); end op "<|="; func Insert(var T : optional AA_Tree; X : Element) is // Just pass the buck to the "<|=" operation var X_Copy for T := X; T <|= X_Copy; end func Insert; // As in most balanced binary trees, the deletion of an internal node can // be turned into the deletion of a leaf node by swapping the internal node // with either its closest predecessor or successor, depending on which are // in the tree or on the implementor's whims. Retrieving a predecessor is // simply a matter of following one left link and then all of the remaining // right links. Similarly, the successor can be found by going right once // and left until a null pointer is found. Because of the AA property of // all nodes of level greater than one having two children, the successor // or predecessor node will be in level 1, making their removal trivial. // // To re-balance a tree, there are a few approaches. The one described by // Andersson in his original paper is the simplest, and it is described // here, although actual implementations may opt for a more optimized // approach. After a removal, the first step to maintaining tree validity // is to lower the level of any nodes whose children are two levels below // them, or who are missing children. Then, the entire level must be skewed // and split. This approach was favored, because when laid down // conceptually, it has three easily understood separate steps: // // Decrease the level, if appropriate. // Skew the level. // Split the level. // // However, we have to skew and split the entire level this time instead of // just a node, complicating our code. func Delete(var T : optional AA_Tree; X : Element) is // input: X, the value to delete, and T, // the root of the tree from which it should be deleted. // output: T', balanced, without the value X. if T is null then // Not in tree -- should we complain? return; end if; case X =? T.Value of [#less] => Delete(T.Left, X); [#greater] => Delete(T.Right, X); [#equal] => // If we're a leaf, easy, otherwise reduce to leaf case. if Is_Leaf(T) then T := null; elsif T.Left is null then // Get successor value and delete it from right tree, // and set root to have that value const Succ := Successor(T); Delete(T.Right, Succ); T.Value := Succ; else // Get predecessor value and delete it from left tree, // and set root to have that value const Pred := Predecessor(T); Delete(T.Left, Pred); T.Value := Pred; end if; [#unordered] => // Not in tree; should we complain? return; end case; // Rebalance the tree. Decrease the level of all nodes in this level if // necessary, and then skew and split all nodes in the new level. if T is null then return; end if; Decrease_Level(T); Skew(T); Skew(T.Right); if T.Right not null then Skew(T.Right.Right); end if; Split(T); Split(T.Right); end func Delete; op "in"(X : Element; T : optional AA_Tree) -> Result : Boolean is for P => T while P not null loop case X =? P.Value of [#less] => continue loop with P.Left; [#greater] => continue loop with P.Right; [#equal] => return #true; [#unordered] => return #false; end case; end loop; return #false; // Not found end op "in"; func First(T : optional AA_Tree) -> optional Element is // Return first (smallest) element in tree if T is null then return null; else return Leftmost(T).Value; end if; end func First; func Last(T : optional AA_Tree) -> optional Element is // Return last (greatest) element in tree if T is null then return null; else return Rightmost(T).Value; end if; end func Last; func Remove_First(var T : optional AA_Tree) -> Result : optional Element is // Remove first (smallest) element in tree Result := First(T); if Result not null then Delete(T, Result); end if; end func Remove_First; func Remove_Last(var T : optional AA_Tree) -> Result : optional Element is // Remove last (greatest) element in tree Result := Last(T); if Result not null then Delete(T, Result); end if; end func Remove_Last; func Remove_Any(var T : optional AA_Tree) -> Result : optional Element is // Remove some element from tree if T is null then return null; end if; Result := T.Value; if Result not null then Delete(T, Result); end if; end func Remove_Any; func Is_Empty(T : optional AA_Tree) -> Boolean is // Return True if the tree is empty return T is null; end func Is_Empty; func Count(T : optional AA_Tree) -> Univ_Integer is // Return a count of the nodes in the tree if T is null then return 0; else return Count(T.Left) + Count(T.Right) + 1; end if; end func Count; func Overlapping(T : optional AA_Tree; X : Element) -> optional Element is // input: X, the value to find, and T, // the root of the tree to be searched. // output: the element equal to or "unordered" relative to X. if T is null or else T.Value is null then return null; else case X =? T.Value of [#less] => return Overlapping(T.Left, X); [#greater] => return Overlapping(T.Right, X); [#equal | #unordered] => // Close enough return T.Value; end case; end if; end func Overlapping; end class AA_Tree; func Test_AA_Tree(A : Univ_Integer; B : Univ_Integer; C : Univ_Integer) is type Univ_Tree is AA_Tree<Univ_Integer>; var T : Univ_Tree := []; var X : Univ_Integer := A; Insert(T, A); Println("Count = " | Count(T) | " after insert of " | A); Insert(T, B); Println("Count = " | Count(T) | " after insert of " | B); Insert(T, C); Println("Count = " | Count(T) | " after insert of " | C); Insert(T, A); Println("Count = " | Count(T) | " after another insert of " | A); Println(A | " in T = " | (A in T)); Println(B | " in T = " | (B in T)); Println(C | " in T = " | (C in T)); Println("7 in T = " | (7 in T)); for E := Remove_First(T) then Remove_First(T) while E not null loop Println("Remove_First = " | E); end loop; Println("Count after loop : " | Count(T)); for I in 1..10 forward loop Insert(T, I); Println("Count = " | Count(T) | " after insert of " | I); end loop; for L := Remove_Last(T) then Remove_Last(T) while L not null loop Println("Remove_Last = " | L); end loop; Println("Count after loop : " | Count(T)); for J in 1..10 reverse loop Insert(T, J); Println("Count = " | Count(T) | " after insert of " | J); end loop; Println("Count after loop : " | Count(T)); Println("Overlapping(T, 5) = " | Overlapping(T, 5)); for Z := Remove_Any(T) then Remove_Any(T) while Z not null loop Println("Remove_Any = " | Z); end loop; Println("Count after loop : " | Count(T)); for K in 1..10 loop Insert(T, K); Println("Count = " | Count(T) | " after insert of " | K); end loop; for F := Remove_First(T) then Remove_First(T) while F not null loop Println("Remove_First = " | F); end loop; Println("Count after loop : " | Count(T)); end func Test_AA_Tree;
A Pointer-Free Path to Object-Oriented Parallel Programming
The following is a first draft of a submission for the forthcoming "Foundations of Object-Oriented Languages" (FOOL 2012) workshop accompanying this year's SPLASH/OOPSLA conference in Tuscon. Any comments would be welcome.
-----------
ParaSail: Pointer-Free Path to Object-Oriented Parallel Programming
Pointers are ubiquitous in modern object-oriented programming languages, and many data structures such as trees, lists, graphs, hash tables, etc. depend on them heavily. Unfortunately, pointers can add significant complexity to programming. Pointers can make storage management more complex, pointers can make assignment and equality semantics more complex, pointers can increase the ways two different names (access paths) can designate the same object, pointers can make program analysis and proof more complex, and pointers can make it harder to "divide and conquer" a data structure for parallel processing.
Is there an alternative to using pointers? ParaSail, a new parallel object-oriented programming language, adopts a different paradigm for defining data structures. Rather than using pointers, ParaSail supports flexible data structuring using "expandable" (and shrinkable) objects, along with generalized indexing. By eliminating pointers, ParaSail significantly reduces the complexity for the programmer, while also allowing ParaSail to provide pervasive, safe, object-oriented parallel programming.
Expandable and Optional Objects
An "expandable" object is one which can grow without using pointers, much as a house can grow through additions. Where once there was a door to the back yard, a new screened-in porch can be added. Where once there was only one floor, a new floor can be added. The basic mechanism for expansion in ParaSail is that every type has one additional value, called "null." A component can initially be null, and then be replaced by a non-null value, thereby expanding the enclosing object. At some later point the enclosing object could shrink, by replacing a non-null component with null.
Not every component of an object is allowed to be null. The component must be declared as "optional" if it is allowed to take on a null value. For example, a Tree structure might have a (non-optional) "Payload" component, and then two additional components, "Left" and "Right," which are each declared as "optional Tree." Similarly, a stand-alone object may be declared to be of a type "T," or of a type "optional T." Only if it is declared "optional" may it take on the null value. The value of an object X declared as "optional" may be tested for nullness using "X is null" or "X not null."
Another example of a data structure using optional components would be a linked list, with each node having two components, one "Payload" component, and a "Tail" component of type "optional List." There is also a built-in parameterized type, "Basic_Array<Component_Type>" which allows the Component_Type to be specified as "optional." This allows the construction of a hash table with buckets represented as linked-lists, by declaring the "backbone" of the hash table as a "Basic_Array<optional List<Hash_Table_Item>>." The components of the hash table would start out as null, but as items are added to the hash table, one or more of the component lists would begin to grow.
Assignment, Move, and Swap Operations
Because there are no pointers, the semantics of assignment in ParaSail are very straightforward, namely the entire right-hand-side object is copied and assigned into the left-hand side, replacing whatever prior value was there. However, there are times when it is desirable to "move" a component from one object to another, or "swap" two components. Because implementing these on top of an assignment that uses copying would be painful, in ParaSail, "move" and "swap" are separate operations. The semantics of "move" is that the value of the left-hand-side is replaced with the value of the right-hand-side, and the right-hand-side ends up null. For "swap," the values of the left- and right-hand-side are swapped. Syntactically, ParaSail uses ":=" for (copying) assignment, "<==" for move, and "<=>" for swap. The ParaSail compiler is smart enough to automatically use "move" semantics when the right-hand-side is the result of a computation, rather than an object or component that persists after the assignment.
As an example of where "move" might be used, if our hash table grows to the point that it would be wise to lengthen the backbone, we could create a new Basic_Array twice as large (for example), and then "move" each list node from the old array into the new array in an appropriate spot, rebuilding each linked list, and then finally "move" the new array into the original hash-table object, replacing the old array. The "swap" operation is also useful in many contexts, for example when balancing a tree structure, or when sorting an array.
Cyclic Data Structures and Generalized Indexing
Expandable objects allow the construction of many kinds of data structures, but a general, possibly cyclic graph is not one of them. For this, ParaSail provides generalized indexing. The array-indexing syntax, "A[I]," is generalized in ParaSail to be usable with any container-like data structure, where A is the container and I is the key into that data structure. A directed graph in ParaSail could be represented as an indexed set of Nodes, where the index is a unique node "Id" of some sort, with edges represented as Predecessor and Successor indice sets that are components of each Node. There is no harm in following a "dangling" unique node-id in ParaSail, as that is easy to detect because the node-id would be missing from the indexed set. In a language with pointers, either a dangling pointer will cause a storage leak, because the target node cannot be reclaimed, or it will lead to a potentially destructive reference to reclaimed storage.
Region-Based Storage Management
Storage management without pointers is significantly simplified. All of the objects declared in a given scope are associated with a storage "region," essentially a local heap. As an object grows, all new storage for it is allocated out of this region. As an object shrinks, the old storage can be immediately released back to this region. When a scope is exited, the entire region is reclaimed. There is no need for asynchronous garbage collection, as garbage never accumulates.
Every object identifies its region, and in addition, when a function is called, the region in which the result object should be allocated is passed as an implicit parameter. This "target" region is determined by how the function result is used. If it is a temporary, then it will be allocated out of a temporary region associated with the point of call. If it is assigned into a longer-lived object, then the function will be directed to allocated the result object out of the region associated with this longer-lived object. The net effect is that there is no copying at the call site upon function return, since the result object is already sitting in the correct region.
Note that pointers are likely still used "behind the scenes" in any implementation of ParaSail, but eliminating them from the surface syntax and semantics can eliminate essentially all of the complexity associated with pointers. That is, a semantic model of expandable and shrinkable objects, operating under "value semantics," rather than a semantic model of nodes connected with pointers, operating under "reference semantics," provides a number of benefits, such as simpler storage management, simpler assignment semantics, easier analyzability, etc.
Parallel and Distributed Programming
In addition to removing pointers, certain other simplifications are made in ParaSail to ease parallel and distributed programming. In particular, there are no global variables; functions may only update objects passed to them as "var" (in-out) parameters. Furthermore, as part of passing an object as a "var" parameter, it is effectively "handed off" to the receiving function, and no further reference may be made to the object, until the function completes. In particular, no part of the "var" parameter may be passed to any other function, or even to this same function as a separate parameter. This eliminates the possibility of aliasing between a "var" parameter and any other object visible to the function. These two additional rules, coupled with the lack of pointers, means that all parameter evaluation may happen in parallel (e.g. in "F(G(X), H(Y))", G(X) and H(Y) may be evaluated in parallel), and function calls may easily cross address-space boundaries, since the objects are self-contained (with no incoming or outgoing references), and only one function at a time can update a given object.
Concurrent Objects
All of the above rules apply to objects that are not designed for concurrent access. ParaSail also supports the construction of concurrent objects, which allow lock-free, locked, and queued simultaneous access. These objects are *not* "handed off" as part of parameter passing, but instead provide operations which synchronize any attempts at concurrent access. Three kinds of synchronization are supported. "Lock-free" synchronization relies on low-level hardware-supported operations such as atomic load and store, and compare-and-swap. "Locked" synchronization relies on automatic locking as part of calling a "locked" operation of a concurrent object, and automatic unlocking as part of returning from the operation. Finally, "queued" synchronization is provided, which evaluates a "dequeue condition" upon call (under a lock), and only if the condition is satisfied is the call allowed to proceed, still under the lock. A typical "dequeue condition" might be that a buffer is not full, or that a mailbox has at least one element in it. If the dequeue condition is not satisfied, then the caller is added to a queue. At the end of any operation on the concurrent object that might change the result of the dequeue condition for a queued caller, the dequeue condition is evaluated and if true, the operation requested by the queued caller is performed before the lock is released. If there are multiple queued callers, then they are serviced in turn until there are none with satisfied dequeue conditions.
Pointer-Free Object-Oriented Parallel Programming
The pointer-free nature of ParaSail is not just an interesting quirk. Rather, we believe it represents a significantly simpler way to build large object-oriented systems. By itself it simplifies storage management, assignment semantics, and analyzability, and when combined with the elimination of global variables and parameter aliasing, it allows for the easy parallelization of all expression evaluation, and the easy distribution of computations across address spaces, without having to make the jump to a pure side-effect-free functional language.
-----------
ParaSail: Pointer-Free Path to Object-Oriented Parallel Programming
Pointers are ubiquitous in modern object-oriented programming languages, and many data structures such as trees, lists, graphs, hash tables, etc. depend on them heavily. Unfortunately, pointers can add significant complexity to programming. Pointers can make storage management more complex, pointers can make assignment and equality semantics more complex, pointers can increase the ways two different names (access paths) can designate the same object, pointers can make program analysis and proof more complex, and pointers can make it harder to "divide and conquer" a data structure for parallel processing.
Is there an alternative to using pointers? ParaSail, a new parallel object-oriented programming language, adopts a different paradigm for defining data structures. Rather than using pointers, ParaSail supports flexible data structuring using "expandable" (and shrinkable) objects, along with generalized indexing. By eliminating pointers, ParaSail significantly reduces the complexity for the programmer, while also allowing ParaSail to provide pervasive, safe, object-oriented parallel programming.
Expandable and Optional Objects
An "expandable" object is one which can grow without using pointers, much as a house can grow through additions. Where once there was a door to the back yard, a new screened-in porch can be added. Where once there was only one floor, a new floor can be added. The basic mechanism for expansion in ParaSail is that every type has one additional value, called "null." A component can initially be null, and then be replaced by a non-null value, thereby expanding the enclosing object. At some later point the enclosing object could shrink, by replacing a non-null component with null.
Not every component of an object is allowed to be null. The component must be declared as "optional" if it is allowed to take on a null value. For example, a Tree structure might have a (non-optional) "Payload" component, and then two additional components, "Left" and "Right," which are each declared as "optional Tree." Similarly, a stand-alone object may be declared to be of a type "T," or of a type "optional T." Only if it is declared "optional" may it take on the null value. The value of an object X declared as "optional" may be tested for nullness using "X is null" or "X not null."
Another example of a data structure using optional components would be a linked list, with each node having two components, one "Payload" component, and a "Tail" component of type "optional List." There is also a built-in parameterized type, "Basic_Array<Component_Type>" which allows the Component_Type to be specified as "optional." This allows the construction of a hash table with buckets represented as linked-lists, by declaring the "backbone" of the hash table as a "Basic_Array<optional List<Hash_Table_Item>>." The components of the hash table would start out as null, but as items are added to the hash table, one or more of the component lists would begin to grow.
Assignment, Move, and Swap Operations
Because there are no pointers, the semantics of assignment in ParaSail are very straightforward, namely the entire right-hand-side object is copied and assigned into the left-hand side, replacing whatever prior value was there. However, there are times when it is desirable to "move" a component from one object to another, or "swap" two components. Because implementing these on top of an assignment that uses copying would be painful, in ParaSail, "move" and "swap" are separate operations. The semantics of "move" is that the value of the left-hand-side is replaced with the value of the right-hand-side, and the right-hand-side ends up null. For "swap," the values of the left- and right-hand-side are swapped. Syntactically, ParaSail uses ":=" for (copying) assignment, "<==" for move, and "<=>" for swap. The ParaSail compiler is smart enough to automatically use "move" semantics when the right-hand-side is the result of a computation, rather than an object or component that persists after the assignment.
As an example of where "move" might be used, if our hash table grows to the point that it would be wise to lengthen the backbone, we could create a new Basic_Array twice as large (for example), and then "move" each list node from the old array into the new array in an appropriate spot, rebuilding each linked list, and then finally "move" the new array into the original hash-table object, replacing the old array. The "swap" operation is also useful in many contexts, for example when balancing a tree structure, or when sorting an array.
Cyclic Data Structures and Generalized Indexing
Expandable objects allow the construction of many kinds of data structures, but a general, possibly cyclic graph is not one of them. For this, ParaSail provides generalized indexing. The array-indexing syntax, "A[I]," is generalized in ParaSail to be usable with any container-like data structure, where A is the container and I is the key into that data structure. A directed graph in ParaSail could be represented as an indexed set of Nodes, where the index is a unique node "Id" of some sort, with edges represented as Predecessor and Successor indice sets that are components of each Node. There is no harm in following a "dangling" unique node-id in ParaSail, as that is easy to detect because the node-id would be missing from the indexed set. In a language with pointers, either a dangling pointer will cause a storage leak, because the target node cannot be reclaimed, or it will lead to a potentially destructive reference to reclaimed storage.
Region-Based Storage Management
Storage management without pointers is significantly simplified. All of the objects declared in a given scope are associated with a storage "region," essentially a local heap. As an object grows, all new storage for it is allocated out of this region. As an object shrinks, the old storage can be immediately released back to this region. When a scope is exited, the entire region is reclaimed. There is no need for asynchronous garbage collection, as garbage never accumulates.
Every object identifies its region, and in addition, when a function is called, the region in which the result object should be allocated is passed as an implicit parameter. This "target" region is determined by how the function result is used. If it is a temporary, then it will be allocated out of a temporary region associated with the point of call. If it is assigned into a longer-lived object, then the function will be directed to allocated the result object out of the region associated with this longer-lived object. The net effect is that there is no copying at the call site upon function return, since the result object is already sitting in the correct region.
Note that pointers are likely still used "behind the scenes" in any implementation of ParaSail, but eliminating them from the surface syntax and semantics can eliminate essentially all of the complexity associated with pointers. That is, a semantic model of expandable and shrinkable objects, operating under "value semantics," rather than a semantic model of nodes connected with pointers, operating under "reference semantics," provides a number of benefits, such as simpler storage management, simpler assignment semantics, easier analyzability, etc.
Parallel and Distributed Programming
In addition to removing pointers, certain other simplifications are made in ParaSail to ease parallel and distributed programming. In particular, there are no global variables; functions may only update objects passed to them as "var" (in-out) parameters. Furthermore, as part of passing an object as a "var" parameter, it is effectively "handed off" to the receiving function, and no further reference may be made to the object, until the function completes. In particular, no part of the "var" parameter may be passed to any other function, or even to this same function as a separate parameter. This eliminates the possibility of aliasing between a "var" parameter and any other object visible to the function. These two additional rules, coupled with the lack of pointers, means that all parameter evaluation may happen in parallel (e.g. in "F(G(X), H(Y))", G(X) and H(Y) may be evaluated in parallel), and function calls may easily cross address-space boundaries, since the objects are self-contained (with no incoming or outgoing references), and only one function at a time can update a given object.
Concurrent Objects
All of the above rules apply to objects that are not designed for concurrent access. ParaSail also supports the construction of concurrent objects, which allow lock-free, locked, and queued simultaneous access. These objects are *not* "handed off" as part of parameter passing, but instead provide operations which synchronize any attempts at concurrent access. Three kinds of synchronization are supported. "Lock-free" synchronization relies on low-level hardware-supported operations such as atomic load and store, and compare-and-swap. "Locked" synchronization relies on automatic locking as part of calling a "locked" operation of a concurrent object, and automatic unlocking as part of returning from the operation. Finally, "queued" synchronization is provided, which evaluates a "dequeue condition" upon call (under a lock), and only if the condition is satisfied is the call allowed to proceed, still under the lock. A typical "dequeue condition" might be that a buffer is not full, or that a mailbox has at least one element in it. If the dequeue condition is not satisfied, then the caller is added to a queue. At the end of any operation on the concurrent object that might change the result of the dequeue condition for a queued caller, the dequeue condition is evaluated and if true, the operation requested by the queued caller is performed before the lock is released. If there are multiple queued callers, then they are serviced in turn until there are none with satisfied dequeue conditions.
Pointer-Free Object-Oriented Parallel Programming
The pointer-free nature of ParaSail is not just an interesting quirk. Rather, we believe it represents a significantly simpler way to build large object-oriented systems. By itself it simplifies storage management, assignment semantics, and analyzability, and when combined with the elimination of global variables and parameter aliasing, it allows for the easy parallelization of all expression evaluation, and the easy distribution of computations across address spaces, without having to make the jump to a pure side-effect-free functional language.
Subscribe to:
Posts (Atom)