Thursday, November 21, 2013

First release of Javallel, Parython; new release 5.1 of ParaSail and Sparkel

The ParaSail family of languages is growing, with two new members now available for experimentation.  We have made a new release, 5.1, which includes all four members of the family -- ParaSail itself, Sparkel based on the SPARK subset of Ada, Javallel based on Java, and Parython based on Python.  Binaries plus examples for all of these are available in a single (large) download:

   http://bit.ly/ps51bin

As before, if you are interested in sources, visit:

   http://sparkel.org

The biggest change in ParaSail was a rewrite of the region-based storage manager (actually, this same storage manager is used for all four languages), to dramatically reduce the contention between cores/processors related to storage management.  The old implementation was slower, yet still managed to have a number of race conditions.  The new one is faster and (knock on wood) free of (at least those ;-) race conditions.

As for how Javallel relates to Java, here are some of the key differences:
  1. Classes require a "class interface" to declare their visible operations and fields
  2. There is no notion of a "this" parameter -- all parameters must be declared
  3. There is no notion of "static" -- a method is effectively static if it doesn't have any parameters whose type matches the enclosing class; no variables are static
  4. You only say "public" once, in the class, separating the private stuff (before the word "public") from the implementation of the visible methods.
  5. Semi-colons are optional at the end of a line
  6. Parentheses are optional around the condition of an "if"
  7. "for" statements use a different syntax; e.g:
    •  for I in 1..10 [forward | reverse | parallel] { ... }
  8. "{" and "}" are mandatory for all control structures
  9. You can give a name to the result of a method via:
    • Vector createVec(...) as Result { ... Result = blah; ... }
    and then use that name (Result) inside the method as a variable whose final value is returned
  10. You have to say "T+" rather than simply "T" if you want to accept actual parameters that are of any subclass of T (aka polymorphic).  "T" by itself only allows actuals of exactly class T.
  11. Object declarations must start with "var," "final," or "ref" corresponding to variable objects, final objects, or ref objects (short-lived renames).
  12. There are no special constructors; any method that returns a value of the enclosing class is effectively a constructor; objects may be created inside a method using a tuple-like syntax "(a => 2, b => 3)" whose type is determined from context
  13. X.Foo(Y) is equivalent to Foo(X, Y)
  14. Top-level methods are permitted, to simplify creating a "main" method
  15. uses "and then" and "or else" instead of "&&" and "||"; uses "||" to introduce explicit parallelism.
  16. "synchronized" applies to classes, not to methods or blocks
  17. Enumeration literals start with "#"
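
Pulling several of these together, here is a rough sketch of what a small Javallel class might look like, based purely on the rules above.  The names are invented and the details may be off, so consult the included examples for the definitive syntax:

    // Item 1: a "class interface" declares the visible operations
    class interface Counter {
       Counter create(int start)    // item 12: returns the enclosing class, so acts as a constructor
       void bump(var Counter c)     // items 2, 3: no "this"; all parameters are declared
       int value(Counter c)
    }

    class Counter {
       var int count                // private: declared before the word "public" (item 4)
     public
       Counter create(int start) {
          return (count => start)   // item 12: tuple-like creation; item 5: semicolons optional
       }
       void bump(var Counter c) {
          c.count = c.count + 1
       }
       int value(Counter c) as Result {  // item 9: named result
          Result = c.count
       }
    }

    void main() {                   // item 14: top-level methods are permitted
       var Counter c = create(0)    // item 11: declarations start with "var," "final," or "ref"
       for i in 1..3 forward {      // items 7, 8: new "for" syntax; braces mandatory
          c.bump()                  // item 13: c.bump() is equivalent to bump(c)
       }
    }
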
There are examples in javallel_examples/*.jl?, which should give a better idea of what Javallel is really like.  Parython examples are in parython_examples/*.pr?
 

Thursday, November 14, 2013

Using ParaSail as a Modeling Language for Distributed Systems

The ACM HILT 2013 conference just concluded in Pittsburgh, and we had some great tutorials, keynotes, and sessions on model-based engineering, as well as on formal methods applied to both modeling and programming languages.  One of the biggest challenges identified was integrating complex systems whose components are defined in various modeling or domain-specific languages, along with an overall architecture, which might be specified in a language like AADL or SysML, or might just be sketches on a whiteboard somewhere.  A big part of the challenge is that different languages have different underlying semantic models, with different type systems, different notions of time, different concurrency and synchronization models (if any), etc.  The programming language designer in me wants to somehow bring these various threads (so to speak) together in a well-defined semantic framework, ideally founded on a common underlying language.

One way to start is by asking how you can "grow" a programming language into a modeling language (without killing it ;-).  ParaSail has some nice features that might fit well at the modeling level, in that its pointer-free, implicitly parallel control and data semantics are already at a relatively high level, and depend on neither a single-address-space view nor a central-processor-oriented view.  As an aside, Sebastian Burckhardt from Microsoft Research gave a nice talk on "cloud sessions" at the recent SPLASH/OOPSLA conference in Indianapolis (http://splashcon.org/2013/program/991, http://research.microsoft.com/apps/pubs/default.aspx?id=163842), and we chatted afterward about what a perfect fit the ParaSail pointer-free type model is for the Cloud Sessions indefinitely-persistent data model.  Modeling often abstracts away some of the details of the distribution and persistence of processing and data, so the friendliness of the ParaSail model to Cloud Sessions might also bode well for its friendliness to modeling of other kinds of long-lived distributed systems.

ParaSail's basic model is quite simple: parameterized modules, with separate definition of interface and implementation; types as instantiations of modules; objects as instances of types; and operations defined as part of defining modules, operating on objects.  Polymorphism is possible in that an object may be explicitly identified as having a polymorphic type (denoted by T+ rather than simply T), in which case the object carries a run-time type identifier.  Such an object can hold a value of any type that implements the interface defined by the type T, including T itself (if T is not an abstract type), as well as any type that provides in its interface all the operations defined in T's interface.
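
To make that concrete, here is a minimal ParaSail sketch of the module/type/object trio, using an invented Stack module (simplified, but in the style of the ParaSail examples):

    // Interface of a parameterized module:
    interface Stack<Element_Type is Assignable<>> is
       func Create() -> Stack;
       func Push(var S : Stack; X : Element_Type);
    end interface Stack;

    // The implementation of the module is defined separately:
    class Stack is
       var Contents : Vector<Element_Type>;
     exports
       func Create() -> Stack is
          return (Contents => []);
       end func Create;
       func Push(var S : Stack; X : Element_Type) is
          S.Contents |= X;   // append X to the vector
       end func Push;
    end class Stack;

    // A type is an instantiation of a module,
    // and an object is an instance of a type:
    func Test_Stack() is
       type Int_Stack is Stack<Univ_Integer>;
       var S : Int_Stack := Create();
       Push(S, 42);
    end func Test_Stack;
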

So how does this model relate to a modeling language like Simulink or Statemate?  Is a Simulink "block" a module, a type, an object, or an operation (or something else entirely)?  What about a box on a state-machine chart?  For Simulink, one straightforward answer is that a Simulink block is a ParaSail object.  The associated type of the block object defines a set of operations or parameter values that determine how it is displayed, how it is simulated, how code is generated for it, how it is imported/exported using some XML-like representation, etc.  A Simulink graph would be an object as well: an instance of a directed-graph type (e.g. DGraph), with a polymorphic type, say "Simulink_Block+", as the type of the elements in the graph.
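
In ParaSail terms, that answer might be sketched as follows; the particular operations of Simulink_Block and the parameterization of DGraph are invented for illustration:

    // The type of a block object defines the operations that determine
    // how the block is displayed, simulated, code-generated, etc.:
    interface Simulink_Block<> is
       func Display(B : Simulink_Block) -> Univ_String;
       func Simulate(var B : Simulink_Block);
       func Code_Gen(B : Simulink_Block) -> Univ_String;
       func Export(B : Simulink_Block) -> Univ_String;  // XML-like representation
    end interface Simulink_Block;

    // A Simulink graph is itself an object: an instance of a directed-graph
    // type whose elements have the polymorphic type Simulink_Block+:
    type Model_Graph is DGraph<Simulink_Block+>;
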

Clearly it would be useful to define new block types using the Simulink-like modeling language itself, rather than having to "drop down" to the underlying programming language.  One could imagine a predefined block type "User_Defined_Block" used to represent such blocks, where the various display/simulation/code-generation/import/export operations would be defined in a sub-language that is itself graphical, but relies on some additional (built-in) block types specifically designed for defining such lower-level operations.  Performing code generation on the graphs that define the primitive operations of such a new user-defined block type would ideally produce code in the underlying programming language (e.g. ParaSail) that closely mimics what a (ParaSail) programmer might have written to define a new block type directly in the (ParaSail) programming language.  This begins to turn into somewhat of a "meta" programming language, which always makes my head spin a little...

A practical issue at the programming-language level when you go in this direction is that what was a simple interface/implementation module model may need to support "sub-modules" along various dimensions.  In particular, there may be sets of operations associated with a given type that are devoted to relatively distinct problems, such as display vs. code generation, and it might be useful to allow both the interface and, certainly, the implementation of a block-type-defining module to be broken up into sub-modules.  The ParaSail design includes this notion, which we have called "interface extensions" (a somewhat ambiguous term, so "interface add-on" might be clearer).  These were described in:
http://parasail-programming-language.blogspot.com/2010/08/eliminating-need-for-visitor-pattern-in.html
but have not yet been implemented.  Clearly interface add-ons, for, say, [#display] or [#code_gen], could help separate out the parts of the definition of a given block type.
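
Since add-ons are not yet implemented, the notation here is hypothetical, but splitting the monolithic Simulink_Block interface sketched above might look roughly like this:

    // Hypothetical syntax for interface add-ons:
    interface Simulink_Block[#display] is
       func Display(B : Simulink_Block) -> Univ_String;
    end interface Simulink_Block[#display];

    interface Simulink_Block[#code_gen] is
       func Code_Gen(B : Simulink_Block) -> Univ_String;
    end interface Simulink_Block[#code_gen];

    // Each add-on could then have its own separate implementation,
    // keeping the display and code-generation concerns apart.
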

A second dimension for creating sub-modules would be alternative implementations of the same interface, with automatic selection of the particular implementation based on properties of the parameters specified when the module is instantiated.  In particular, each implementation might have its own instantiation "preconditions," which indicate what must be true of the actual module parameters before a given implementation can be chosen.  In addition, there needs to be some sort of preference rule to use when the preconditions of more than one implementation are satisfied by a given instantiation.

For example, suppose we have one implementation of an Integer interface that handles 32-bit ranges of integers, a second that handles 64-bit ranges, and a third that handles an infinite range.  Clearly the 32-bit implementation would have a precondition that the required range be within +/- 2^31, the 64-bit one would require the range be within +/- 2^63, and the infinite-range implementation would have no precondition.  If we were to instantiate this Integer module with a 25-bit range, the preconditions of all three implementations would be satisfied, but there would presumably be a preference for the 32-bit implementation over the other two.  The approach we have considered is to allow a numeric "preference" level to be specified when providing an implementation of a module, along with the implementation "preconditions," with the default level being "0" and the default precondition being "#true."  The compiler would choose the implementation with the maximum preference level among those whose preconditions are satisfied.  It would complain if there were a tie, requiring the user to specify explicitly, at the point of instantiation, which implementation of the module is to be chosen.
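
None of this selection machinery is implemented either, but in a hypothetical notation (the "with Precondition/Preference" aspects and the Int_Range module are invented here) the Integer example might look roughly like this:

    interface Int_Range<First : Univ_Integer; Last : Univ_Integer> is
       func Create(Val : Univ_Integer) -> Int_Range;
    end interface Int_Range;

    class Int_Range
      with Precondition => First >= -2**31 and then Last < 2**31,
           Preference => 2 is
       // ... 32-bit representation ...
    end class Int_Range;

    class Int_Range
      with Precondition => First >= -2**63 and then Last < 2**63,
           Preference => 1 is
       // ... 64-bit representation ...
    end class Int_Range;

    class Int_Range is
       // default Precondition => #true, Preference => 0
       // ... arbitrary-precision ("infinite range") representation ...
    end class Int_Range;

    // A 25-bit range satisfies all three preconditions, so the compiler
    // picks the satisfied implementation with the highest preference --
    // here, the 32-bit one:
    type Int_25 is Int_Range<-2**24, 2**24 - 1>;
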

Wednesday, November 6, 2013

Tech talk and Tutorial at SPLASH 2013 on parallel programming

I gave an 80-minute "tech talk" and a 3-hour tutorial on parallel programming last week at SPLASH 2013 in Indianapolis.  The audiences were modest but enthusiastic. 

The tech talk was entitled:

  Living without Pointers: Bringing Value Semantics to Object-Oriented Parallel Programming

Here is the summary:

The heavy use of pointers in modern object-oriented programming languages interferes with the ability to adapt computations to the new distributed and/or multicore architectures. This tech talk will introduce an alternative pointer-free approach to object-oriented programming, which dramatically simplifies adapting computations to the new hardware architectures. The talk will illustrate the pointer-free approach by demonstrating the transformation of two popular object-oriented languages, one compiled and one scripting, into pointer-free languages, gaining easier adaptability to distributed and/or multicore architectures, while retaining power and ease of use.

Here is a link to the presentation:  http://bit.ly/sp13pf

The tutorial was entitled:

   Multicore Programming using Divide-and-Conquer and Work Stealing

Here is the summary for the tutorial:

This tutorial will introduce attendees to the various paradigms for creating algorithms that take advantage of the growing number of multicore processors, while avoiding excessive synchronization overhead. Many of these approaches build upon the basic divide-and-conquer approach, while others might be said to use a multiply-and-conquer paradigm. Attendees will also learn the theory and practice of "work stealing," a multicore scheduling approach that is being adopted in numerous multicore languages and frameworks, and how classic work-list algorithms can be restructured to take best advantage of the load balancing inherent in work stealing. Finally, the tutorial will investigate some of the tradeoffs associated with different multicore storage management approaches, including task-local garbage collection and region-based storage management.
  
Here is a link to the presentation:  http://bit.ly/sp13mc
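
As a small taste of the divide-and-conquer style the tutorial covers, here is the classic recursive Fibonacci in ParaSail.  Because the two recursive calls have no data dependence on each other, the implementation is free to evaluate them in parallel, and a work-stealing scheduler balances the resulting computations across cores:

    func Fib(N : Univ_Integer) -> Univ_Integer is
       if N <= 1 then
          return N;
       else
          // The two operands are independent, so they may be evaluated
          // in parallel; work stealing spreads the resulting picothreads
          // across the available cores.
          return Fib(N - 1) + Fib(N - 2);
       end if;
    end func Fib;
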

Comments welcome!

-Tuck