Sunday, February 10, 2019

ParaSail Published in Programming Journal Vol. 3, Issue 3

We very recently had a long paper on ParaSail published in the relatively new Journal of the Art, Science, and Engineering of Programming (http://programming-journal.org).  This is an interesting journal, which is trying to rejuvenate the writing of Computer Science journal articles, rather than having essentially all interesting research showing up as Computer Science conference papers.  Conference papers often tend to be incremental, being a report on recent progress in CS research activities, but not necessarily having the space or time to provide an archival-level quality and depth that might be possible in a journal article.  In any case, here is the abstract for the ParaSail paper:

   http://programming-journal.org/2019/3/7/

and from there you can click through to the full PDF (the journal is all open access).  Many thanks to the editors and reviewers for their guidance in bringing this paper up to journalistic standards.

In fact, the journal has kind of flipped things around.  If a paper is accepted for publication in the journal in a given year, they invite the author(s) to present the paper at their annual conference, which this year is in Genoa, Italy, the first week in April:

   https://2019.programming-conference.org/

So come on down to Genoa in April if you are in the neighborhood.  The presentation on ParaSail will be on Thursday, April 4th.

String interpolation comes to ParaSail 8.0

ParaSail supports generic functions where the type of the parameter is determined at the point of call.  The most common use of this capability is with the string concatenation operator "|":

  interface PSL::Core::Univ_String<> is
      ...
    op "|"(Left, Right : optional Univ_String) -> Univ_String
      is import(#concat_string)

     op "|"(Left : optional Univ_String;
           Right : optional Right_Type is Imageable<>)
      -> Univ_String

    op "|"(Left : optional Left_Type is Imageable<>;
           Right : optional Univ_String)
      -> Univ_String

      ...
  end interface PSL::Core::Univ_String



The first "|" provides the builtin string concatenation.  The other two are generic functions defined in terms of this builtin concatenation, as follows:


  class PSL::Core::Univ_String is
     ...
   exports
     ...
    op "|"(Left : optional Univ_String;
           Right : optional Right_Type is Imageable<>)
      -> Univ_String is
        if Right is null then
            return Left | "null"
        else
            return Left | Right_Type::To_String(Right)
        end if
    end op "|"

    op "|"(Left : optional Left_Type is Imageable<>;
           Right : optional Univ_String)
      -> Univ_String is
        if Left is null then
            return "null" | Right
        else
            return Left_Type::To_String(Left) | Right
        end if
    end op "|"
      ...

  end class PSL::Core::Univ_String


---


What the above means is that when you write:

    "X = " | X

this will be allowed so long as "X" is of a type that satisfies the "Imageable" interface, which is defined as follows:

 abstract interface PSL::Core::Imageable<> is
    func To_String(Val : Imageable) -> Univ_String<>

    optional func From_String(Str : Univ_String<>) -> optional Imageable

    // NOTE: We include Hashable<> operations here
    //       so that Set works nicely.
    //       Clearly if something is Imageable it is possible
    //       to implement "=?" and Hash using the string image,
    //       so we might as well requires these operations too.

    op "=?"(Left, Right : Imageable) -> Ordering
    func Hash(Val : Imageable) -> Univ_Integer
 end interface PSL::Core::Imageable
 

---
The availability of the To_String operation is the critical aspect of Imageable that we use for the "|" operator.  We expand "X = " | X into:

   "X = " | To_String(X)

and so long as X's type has an appropriate To_String operation, everything works smoothly.  What this means is that you almost never have to write a call on To_String explicitly, by just alternating between string literals and expressions that you want to print, you can write things like:

 Println ("X = " | X | ", Y = " |  Y | ", X + Y = " | X + Y | "!");

which is clearly less verbose than:

 Println ("X = " | To_String(X) | ", Y = " |  To_String(Y) | ", X + Y = " | To_String(X + Y) | "!");

Nevertheless, even in the more concise version, all of those alternating quotes and '|' characters can become hard for the human to parse, and it is easy to forget one of the needed characters.  So what to do?

STRING INTERPOLATION

A (growing) number of languages support the ability to "interpolate" the values of variables, or in some cases expressions, directly into string literals.  This has been true for simple variables since the beginning for languages like the Unix/GNU-Linux shell:

    echo "X = $X, Y = $Y"

Interpolating the value of more complex expressions is sometimes supported, but generally requires some additional syntax.  

Very early on, most variants of the LISP language supported the ability to construct list structures explicitly, using the "quote" operation, along with an ability to insert a LISP expression (aka "S-expression") in the middle that was to be evaluated and substituted into the list structure that was being defined.  E.g.:

     (define celebrate (lambda (name) (quote (Happy Birthday to (unquote (capitalize name))))))

and then "(celebrate (quote sam))" evaluates to "(Happy Birthday to Sam)".

As a short-hand, in most versions of LISP:

   (quote (a b c))

becomes something like:

    '(a b c)

and (unquote blah) becomes, depending on the version of LISP, something like:

   `blah
or
   ,blah

So the above definition of "celebrate" using these short-hands becomes:

   (define celebrate (lambda (name) '(Happy Birthday to ,(capitalize name))))

with (celebrate 'sam) becoming (Happy Birthday to Sam)

PARASAIL STRING INTERPOLATION

Ok, so how does this relate to ParaSail?  So the explicit use of the "|" operator and having to end a string literal, insert the expressions between | ... |, and then restart the string literal gets old.  Given that there aren't an infinite number of ParaSail programs already in existence, and the fact that the back-quote character is almost never used, we decided to hi-jack the meaning of back-quote when it appears in the middle of a string literal to reduce all the back and forth needed when using an explicit "|" operator.  Hence, we now allow:

   Println("X = `(X), Y = `(Y),  X + Y = `(X + Y)!");

In other words, you can "interpolate" the value of an arbitrary (Imageable) expression into the middle of a ParaSail string literal by using `().  This is actually recognized during lexical analysis, and a mechanical expansion is performed by the lexical analyzer, of a back-quote character (unless preceded by '\') into the two characters:  " |

The lexical analyzer then expects a left parenthesis to be the next character, and counts matching parentheses, and when it reaches the matching right parenthesis, it emits the two characters: | "

The net effect is that "X = `(X), ..." becomes:

      "X = " | (X) | ", ..."

Note that the parentheses are preserved, to avoid issues with precedence between the "|" operator and operators that might occur within the parentheses.  Also note that the parenthesized expression can be separated from the back-quote by white space, and the expression can also span multiple lines if necessary.  The lexer is smart enough to use a stack, so that the parenthesized expression can also have nested string literals, that themselves use interpolation.

If you have looked at past source releases of the ParaSail LLVM-based compiler (which is itself written in Parasail -- see lib/compiler.psl), you might want to take a look at the version of lib/compiler.psl in the 8.0 release.   It makes heavy use of string interpolation, and hopefully is easier to read because of that.
 

 



 



Saturday, February 9, 2019

Release 8.0 of ParaSail Interpreter, Compiler, and Debugger

Finally, at long last a new release of ParaSail.  This new release incorporates an interactive debugger that is automatically invoked when the interpreter encounters an assertion, precondition, or postcondition that fails at run-time.  This is also the first release where pre- and postconditions are fully analyzed, and checked at run-time.  Finally, this has one modest enhancement to the ParaSail syntax -- "interpolation" of the value of an expression directly into a string literal.  For example:

   Println("The value of X + Y is `(X + Y).");

The back-quote character followed by a parenthesized expression may now appear within a string literal, and the value of the expression is "interpolated" into the middle of the string, in place of the back-quoted expression.  Hence, presume X is 31 and Y is 11, the above will print:

   The value of X + Y is 42.

In any case, here is the new release:


It is a "zip" archive of sources and binaries for Mac, Linux, and Windows.  Mac and Linux include executable binaries for a bootstrapped LLVM-generating ParaSail compiler.  Windows only contains the executable binaries for the ParaSail interpreter.

The web page for ParaSail has also been updated:


From a documentation point of view, the most exciting news is that we have just published a full description of the ParaSail language in the new academic "Programming Journal":


Please take a look at this article to see a comprehensive description of ParaSail, and how it fits into the world of programming language design.

Here is the latest reference manual (which is also linked from the ParaSail web page, and included in the "zip" archive for release 8.0):


We will be posting some follow-up blog entries about some of the new features in this release, in particular the debugger, and the implementation of postconditions that require automatically saving some values only available on entry to an operation.

 Enjoy!

Wednesday, November 2, 2016

Release 7.0 of ParaSail interpreter, VM, and compiler -- sources plus linux/mac binaries

We have just made a release of the ParaSail interpreter, VM, llvm-based compiler, and ParaScope static analysis tool (aka "static catcher of programming errors").  This is version 7.0.  It includes a nearly complete re-write of the llvm-based compiler.  After attempting other approaches, we finally went back to the ParaSail front end and had it annotate every PSVM instruction with a set of virtual register numbers in addition to a stack offset for local variables.  This allows the backend to generate much better LLVM (and eventually will also simplify the job of doing more advanced static analysis).  The compiler also automatically inlines small routines, including routines from the ParaSail standard library, so function call overhead in the presence of layers of abstraction can be much reduced.

This release is the first in a while to include binaries, though it only includes binaries for Mac and Linux (Windows is being difficult ... ;-{).  The release is at:

   http://bit.ly/psl7rel

If you have any problems, please report them on the ParaSail google group:

 https://groups.google.com/forum/#!forum/parasail-programming-language

Friday, November 6, 2015

ParaSail source release 6.5 now available.

We have just made a release of the ParaSail interpreter, VM, and compiler.  This is version 6.5, and is mostly a bug-fix release relative to 6.3.  We are working on a static analyzer and optimizer for ParaSail, and this release contains prototypes for both of those (ParaScope, and a combination of ParaScope and the compiler).  The static analyzer can be invoked using "bin/scope.csh file1.psl file2.psl ...".  The optimizer is not quite ready for prime time at this point, but feel free to investigate lib/vn_il.ps? to see how the compiler and static analyzer are being integrated to provide LLVM optimization.  As before, the source release is available at:

   http://bit.ly/ps6xsrc

If you have any problems, please report them at:

   http://groups.google.com/group/parasail-programming-language

Monday, January 5, 2015

Compiling ParaSail to LLVM, first in a series...

We now have an LLVM-based compiler for ParaSail, thanks to great work this past summer by Justin Hendrick, a summer intern from Cornell.  To download the latest source release for both the ParaSail interpreter and the LLVM-based compiler, visit http://parasail-lang.org 

There are a number of interesting design issues associated with creating a compiler for a pointer-free, pervasively parallel language, and it seemed worth writing about some of them here.  So here is the first in a series of such blog entries.

Clearly pervasive fine-grained parallelism is one of the things that makes ParaSail interesting and a bit different from typical languages.  So let's start with that -- how are picothreads, or parallel work items (as we have taken to describing them in some contexts), represented in the generated LLVM?  The ParaSail front end already does the work to decide where it is worth creating a separate thread of control as part of generating instructions for the ParaSail Virtual Machine (PSVM instructions).  In the PSVM, these are represented as nested blocks within a single PSVM routine.  That is, a single ParaSail operation (func or op) produces a single PSVM routine, but that routine may have zero or more nested blocks to represent code suitable for execution by a separate picothread.  A nested block is invoked by the Start_Parallel_Op or Add_Parallel_Op PSVM instructions, awaited by the Wait_For_Parallel_Op, and terminated by an Exit_Op instruction.  On the other hand, a whole routine is invoked by a Call_Op, Start_Parallel_Call_Op, or Add_Parallel_Call_Op instruction, and terminated by the Return_Op instruction -- Wait_For_Parallel_Op is used for parallel invocations of both nested blocks and whole routines.

At the LLVM level, we made the decision to make the calling sequence essentially the same whether it was an invocation of a nested block or a routine.  Every such call has three arguments.  These are the Context, the Param_List, and the Static_Link.  The Param_List points to a stack-based parameter list, where the first parameter is the result, if any, and the other parameters are the input parameters.  The Context points to a data structure that is created when a new picothread is created, and is updated when a new storage region is created (more about that in a later blog entry).  The Context is effectively the per-picothread information.  The Static_Link either points at a type descriptor, for a routine that corresponds to an operation of a module, or to the enclosing stack frame, for a nested block or a routine that corresponds to a nested operation.  The first word of a stack frame holds a copy of the passed-in static link, so this supports up-level references across multiple levels, as well as access to the type descriptor passed into an enclosing module operation.

Type descriptors are fundamental to the ParaSail run-time model.  Remember that at the source level, all ParaSail modules are considered parameterized, and a ParaSail type is produced by instantiating a module.  Operations are generally associated with modules, and unlike in C++ or Ada, instantiating a module does not produce a new set of code for these operations.  Instead, the code is generated once for a given module, and then used for all instantiations.  But that means that the code needs to be passed some information that identifies the particular instantiation for which it is being executed.  This is what a type descriptor provides -- all of the information that is dependent on the particular instantiation being used. 

For example, a Set module might have various operations like Union and Intersection and Insert and Count, but the type of the elements being manipulated is determined by the particular instantiation, be it Set<Integer>, Set<String>, or Set<Map<String, Tree<Float>>>.  Presuming it is a hashed Set, the only requirements on the element type is that it have the operations of the abstract interface Hashable, which generally means Hash and "=?" (aka compare).  So when we call the Insert operation of Set, we pass it the type descriptor (as the Static_Link argument) for the particular instance of Set we are using.  Since Set only has one type parameter, the type descriptor of Set consists primarily of a reference to a type descriptor for this one type parameter.  In addition to having references to the type descriptors for the type parameters, a type descriptor also includes the equivalent of a C++ virtual function table.  That is, for each of the operations of Set, we have a reference to the (compiled or interpreted) routine for that operation.  This is not generally needed for the code within Set itself, since that code generally knows at compile time what routines are used for all of the other Set operations.  However, since we also use a type descriptor for the type parameter of Set, we will need to use the routine table within that type descriptor to gain access to the Hash and "=?" operation for the particular element type being used.  So in the type descriptor for Set<Integer>, there will be a reference to the type descriptor for Integer, and from there you can gain access to the Integer Hash and Integer "=?" routines, which will certainly be needed at least for the Insert routine.

One last wrinkle on type descriptors -- because ParaSail uses a variant of the class-and-interface model of inheritance, it supports multiple inheritance, so different types that implement the Hashable interface might have different sets of routines in their routine table.  So how does the code for the Insert routine know where it can find the Hash and "=?" routines for Integer or String, given that they might also have many other routines?  The solution here is an extra level of indirection.  We actually have two types of type descriptors, one is the full type descriptor, and has all of the information so far described (and more, such as nested constants).  The second kind of type descriptor is called an operation map (or op map for short).  This consists of a reference to the full type descriptor for the type, plus a simple mapping from an interface's operation index to the underlying type's operation index.  If we sequentially number the operations of any given module's interface, then that gives us an operation index which we can use at run-time for selecting from the routine table in a type descriptor for an instance of that module.  The op map provides the necessary conversion when the operations defined in the abstract interface (specified for the element's type when declaring a module like Set) are numbered differently than the actual element type's operations.  An op-map for a type that implements Hashable might be defined by a simple mapping such as (1 => 5, 2 => 3), meaning that the first operation of Hashable is in fact the fifth operation of the actual type, and the second operation of Hashable is the third operation of the actual type.  An op-map provides for constant-time execution of calls on operations, even in the presence of multiple inheritance.

Note that this op-map approach does not work for languages where all objects carry around a reference to their type descriptor at run-time, since a single type descriptor must work for all uses of the object.  One of the features of ParaSail is that typical objects have no type descriptor overhead.  The compiler generally knows where to find a type descriptor for any object, without the object having to carry it around.  The one exception to this are explicitly polymorphic objects.  In ParaSail these are declared with a syntax of a type name followed by a "+".   For example, Set<Imageable+> would be a set of objects that implement the Imageable interface, where each object might be of a different type.  This means that each Imageable+ object consists of a reference to the underlying object, plus a reference to a type descriptor identifying the actual type of the object.  But even these sorts of objects can preserve the constant-time execution of operation calls, because an object is polymorphic for a particular interface (Imageable in this case), and so the type descriptor that such a polymorphic object carries around will likely be an op-map tailored to the particular interface being implemented.   The ParaSail Imageable interface actually has four operations (To_String, From_String, Hash, and "=?"), so such an op-map might look like (1 => 13, 2 => 5, 3 =>9, 4 => 11), which provides constant-time access to any of the operations of Imageable.

One final point worth noting -- type descriptors are used in various contexts, as suggested above (as a static link, as a type parameter, and for run-time-type identification).  In most of these contexts op-maps can be used instead of a full type descriptor.  An important exception to this is a type descriptor used as a static link.  In that case, it is always a full type descriptor, and the compiled code can rely on that fact.

Sunday, January 4, 2015

Expanded paper on ParaSail pointer-free parallelism

Here is a PDF for a recent paper describing in more depth ParaSail's pointer-free approach to parallelism:

   http://bit.ly/ps15pfp

Here is the abstract from the paper:


ParaSail is a language specifically designed to simplify the construction of programs that make full, safe use of parallel hardware. ParaSail achieves this largely through simplification of the language, rather than by adding numerous rules. In particular, ParaSail eliminates global variables, parameter aliasing, and most significantly, re-assignable pointers. ParaSail has adopted a pointer-free approach to defining data structures. Rather than using pointers, ParaSail supports flexible data structuring using expandable (and shrinkable) objects, along with generalized indexing. By eliminating global variables, parameter aliasing, and pointers, ParaSail reduces the complexity for the programmer, while also allowing ParaSail to provide pervasive, safe, object-oriented parallel programming.