Designing ParaSail, a new programming language: Related work on pointer-free parallel programming

Tuesday, August 14, 2012

Related work on pointer-free parallel programming

The prior posting on pointer-free parallel programming in ParaSail was a draft of a submission for the FOOL 2012 workshop. That draft lacked a "related work" section. Here is the Related Work section, with some associated references. Comments are welcome as usual.

--------------------
Related Work

There are few if any pointer-free languages currently under active development. Fortran 77 [2] was the last of the Fortran series that restricted itself to a pointer-free model of programming. Algol 60 lacked pointers [3], but Algol 68 introduced them [4]. Early versions of Basic had no pointers [5], but modern versions of Basic use pointer assignment semantics for most complex objects [6]. The first versions of Pascal, Ada, Modula, C, and C++ all used pointers for objects that were explicitly allocated on the heap, while still supporting stack-based records and arrays; these languages also required manual heap storage reclamation. The first versions of Eiffel, Java, and C# provided little or no support for stack-based records and arrays, moving essentially all complex objects into the heap, with pointer semantics on assignment, and automatic garbage collection used for heap storage reclamation.

In many cases, languages that originally did not require heavy use of pointers, as they evolved to support object-oriented programming, the use of pointers increased, often accompanied by a reliance on garbage collection for heap storage reclamation. For example, Modula-3 introduced “object” types, and all instances of such types were allocated explicitly on the heap, with pointer semantics on assignment, and automatic garbage collection for storage reclamation [7].

The SPARK language, a high-integrity subset of Ada with added proof annotations, omits pointers from the subset [8]. No particular attempt was made to soften the effect of losing pointers, so designing semi-dynamic data structures such as trees and linked-lists in SPARK requires heavy use of arrays [9].

Pure functional languages, such as Haskell [10], avoid many of the issues of pointers by adopting immutable objects, meaning that sharing of data creates no aliasing or race condition problems. However, mostly functional languages, such as those deriving from the ML language [11], include references to mutable objects, thereby re-introducing most of the potential issues with aliasing and race conditions. Even Haskell has found it necessary to introduce special monads such as the IO monad to support applications where side-effects are essential to the operation of the program. In such cases, these side-effects need to be managed in the context of parallel programming [12].

Minimizing use of a global heap through the use of region-based storage management was pioneered in the language Cy-clone [13]. However, Cyclone was not a pointer-free language. Instead, every pointer was associated with a particular region at compile time, allowing compile-time detection of dangling references. A global, garbage-collected heap was available, but local dynamic regions provided a safe, more efficient alternative.

Many functional (or mostly functional) languages have a notion similar to ParaSail’s optional objects. For example, in Haskell they are called maybe objects [10]. In ParaSail, because of its fundamental role in supporting recursive data structures, optional is a built-in property usable with every object, component, or type declaration, rather than being an additional level of type.

References

[1] S. Tucker Taft, Designing ParaSail: A New Programming Language, 2009, http://parasail-programming-language.blogspot.com (retrieved 8/10/2012).

[2] Ansi x3.9-1978. American National Standard – Programming Language FORTRAN. American National Standards Institute, 1978, http://www.fortran.com/fortran/F77_std/rjcnf.html (retrieved 8/10/2012).

[3] Peter Naur et al, Revised Report on the Algorithmic Language Algol 60, 1963 http://www.masswerk.at/algol60/report.htm (retrieved 8/10/2012)

[4] C.H. Lindsey, “A History of Algol 68,” Proceedings of HOPL-II: The second ACM SIGPLAN conference on History of programming languages, pp 97-132, ACM New York, NY, 1993.

[5] J.G. Kemeny and T.E. Kurtz, BASIC, 4th Edition,Trutees of Dartmouth College, 1968, http://www.bitsavers.org/pdf/dartmouth/BASIC_4th_Edition_Jan68.pdf (retrieved 8/10/2012).

[6] Microsoft, Visual Basic Concepts -- Programming with Objects, http://msdn.microsoft.com/en-us/library/aa716290.aspx (retrieved 8/10/2012).

[7] L. Cardelli et al, Modula-3 Report (revised), http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-52.pdf (retrieved 8/10/2012).

[8] R. Chapman and P. Amey, SPARK-95 – The SPADE Ada Kernel (including RavenSPARK), Altran-Praxis, 2008 http://www.altran-praxis.com/downloads/SPARK/technicalReferences/SPARK95_RavenSPARK.pdf (retrieved 8/10/2012).

[9] P. Thornley, SPARKSure Data Structures, 2009, http://www.sparksure.com/resources/SPARK_Data_Structures_10_09.zip (retrieved 8/10/2012).

[10] S. Marlow, Haskell 2010 Language Report, 2010, http://www.haskell.org/onlinereport/haskell2010/ (retrieved 8/10/2012).

[11] R. Harper, Programming in Standard ML, Carnegie Mellon University, 2005, http://www.cs.cmu.edu/~rwh/smlbook/book.pdf (retrieved 8/10/2012).

[12] S. Marlow et al, “A Monad for Deterministic Parallelism,” Haskell '11 Proceedings of the 4th ACM symposium on Haskell, pp. 71-82, ACM New York, NY, 2011.

[13] D. Grossman et al, “Region-Based Memory Management in Cyclone,” PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation, pp. 282 – 293, ACM New York, NY, 2002 http://www.eecs.harvard.edu/~greg/cyclone/papers/cyclone-regions.pdf (retrieved 8/10/2012)

19 comments:

AnonymousAugust 15, 2012 at 1:33 AM
Hi! I have several questions about ParaSail:

1. If i want to print debug info i need "var IO" as parameter, no other ways. Is this correct? If yes - maybe it's better to allow global vars but require them to be concurent objects. Also files, databases are all global state. Or you represent them in your language or FFI.

2. Could you explain refs? Manual has little information about it. (Personaly i never used refs in C#.)

func Nth_Element(ref L : List; N : Univ_Integer) -> ref optional Element;

What is a difference between "ref L : List" and "L : List"?
I cannot write "L := ..." but i can return ref, correct?

3. I think better explanation on region management required. Maybe explicit comparing to popular language like Java: here features which you loose and here features which you get. Concrete examples with picures. You don't uses pics in your blog and this is not cool.

4. What do you think about performance? Comparable to Java/C#, maybe C++ even? Where are bottlenecks? Developers need some guaranties about performance. What is fast, what is slow. This is not clear for me right now.

From last tiobe.com index: "But, Microsoft announced recently the revival of C++ (in favor of C#) within its company. C# appeared to be too high level to build high performance systems. This is confirmed by what we see happening to TIOBE's embedded software customers: after years of enthusiastic adoption of C#, there is now no significant growth any more."

5. From "A Tutorial on Parallel and Concurrent Programming in Haskell" by Simon Peyton Jones: "...Given the lack of side effects it may seem that we can productively evaluate every sub-expression of a Haskell program in parallel. In practice this does not work out well because it creates far too many small items of work which can not be efficiently scheduled and parallelism is limited by fundamental data dependencies in the source program."

How you adress first problem? Not sure what he mean in the second one...

6. What is the open issues in your design? (And of course when to expect release?)

7. Language is quite verbose, to "not helpful" degree. I think from a syntax point of view you should consider "next Java" languages: Kotlin, Ceylon, Dart maybe. What audience you trying to catch? Go language was built for C developers in mind but catch Python enthusiasts :)

8. What about GUI development?
- usualy 1 thread, more for network and computation
- global state for screen representation
- language should support tree construction + data binding, something like

Window { Title = "Hello",
StackPanel {
Button { "OK", Width = 75 },
Button { "Cancel", Height = 23 },
Textbox { Text = binding Name } // Customer.Name
}
}

This usable in many places not only in GUI.

9. When object expand/shrink how data is moving?

10. I hope for anwser, sorry for such a long comment :)

ReplyDelete
Replies
Tucker TaftAugust 15, 2012 at 9:20 AM
Here are responses to a comment recently received:

> Hi! I have several questions about ParaSail:
>
> 1. If i want to print debug info i need "var IO" as parameter, no other ways. Is this correct? If yes - maybe it's better to allow global vars but require them to be concurent objects. Also files, databases are all global state. Or you represent them in your language or FFI.

There is a version of "Println" which is useful for debugging which takes no "file" parameter, and produces output on a standard output file. This output is totally unsynchronized, and would not be useful for "production" output.

For "production" output, the "var IO" parameter will be provided to the "main" func of a program, and can be passed to any other part of the program that needs it. The IO parameter is also used for gaining access to other parts of the environment, such as a database.

>
> 2. Could you explain refs? Manual has little information about it. (Personaly i never used refs in C#.)
>
> func Nth_Element(ref L : List; N : Univ_Integer) -> ref optional Element;
>
> What is a difference between "ref L : List" and "L : List"?
> I cannot write "L := ..." but i can return ref, correct?

A "ref" is a short-lived reference to an object or a component of an object, almost like a temporary "renaming." To implement user-defined indexing which can be used on the left-hand-side of an assignment, it is necessary to return a "ref" to the particular component that is to be updated:

A[I] := 3;

This is expanded into a call on "indexing" (or "var_indexing" if provided):

"indexing"(A, I) := 3;

The "indexing" operator returns a reference to the "I"th element of A, where I is a "key" into the "A" container.

A reference "looks" and "smells" much like a pointer, but because it is short-lived, and you can't change where it points after it is created, and any use of it automatically refers to the referenced object, it avoids most of the problems with pointers. In ParaSail, unlike in C++, a function can only *return* a ref to a part of a parameter that it receives as a ref. The compiler "knows" this at the call site as well, so it presumes that the returned "ref" refers to some part of a "ref" parameter from the point of view of alias analysis.

[to be continued...]
ReplyDelete
Replies
GeorgeAugust 16, 2012 at 9:09 AM
As related work, might Luc Bläser's Composita language (http://www.composita.net/) be relevant?
ReplyDelete
Replies
AnonymousAugust 19, 2012 at 7:42 AM
How XML file could look like in standard library? Big immutable thing, big mutable thing or splitted into nodes with indexes for children?
ReplyDelete
Replies
AnonymousAugust 26, 2012 at 5:13 PM
I'm worrying about one thing. Whether container aggregates would support XML (and other deep tree structures like GUI, parse tree, configs)?

Related question: is it possible to introduce "shallow var" parameter when you could modify only attributes but not content (marked somehow)? So XML like structures could be represented as a whole without indeces and still support parallelism.
ReplyDelete
Replies

Add comment

Designing ParaSail, a new programming language

Tuesday, August 14, 2012

Related work on pointer-free parallel programming

19 comments:

Contributors

Blog Archive