Sunday, February 10, 2019

String interpolation comes to ParaSail 8.0

ParaSail supports generic functions where the type of the parameter is determined at the point of call.  The most common use of this capability is with the string concatenation operator "|":

  interface PSL::Core::Univ_String<> is
    op "|"(Left, Right : optional Univ_String) -> Univ_String
      is import(#concat_string)

     op "|"(Left : optional Univ_String;
           Right : optional Right_Type is Imageable<>)
      -> Univ_String

    op "|"(Left : optional Left_Type is Imageable<>;
           Right : optional Univ_String)
      -> Univ_String

  end interface PSL::Core::Univ_String

The first "|" provides the builtin string concatenation.  The other two are generic functions defined in terms of this builtin concatenation, as follows:

  class PSL::Core::Univ_String is
    op "|"(Left : optional Univ_String;
           Right : optional Right_Type is Imageable<>)
      -> Univ_String is
        if Right is null then
            return Left | "null"
            return Left | Right_Type::To_String(Right)
        end if
    end op "|"

    op "|"(Left : optional Left_Type is Imageable<>;
           Right : optional Univ_String)
      -> Univ_String is
        if Left is null then
            return "null" | Right
            return Left_Type::To_String(Left) | Right
        end if
    end op "|"

  end class PSL::Core::Univ_String


What the above means is that when you write:

    "X = " | X

this will be allowed so long as "X" is of a type that satisfies the "Imageable" interface, which is defined as follows:

 abstract interface PSL::Core::Imageable<> is
    func To_String(Val : Imageable) -> Univ_String<>

    optional func From_String(Str : Univ_String<>) -> optional Imageable

    // NOTE: We include Hashable<> operations here
    //       so that Set works nicely.
    //       Clearly if something is Imageable it is possible
    //       to implement "=?" and Hash using the string image,
    //       so we might as well requires these operations too.

    op "=?"(Left, Right : Imageable) -> Ordering
    func Hash(Val : Imageable) -> Univ_Integer
 end interface PSL::Core::Imageable

The availability of the To_String operation is the critical aspect of Imageable that we use for the "|" operator.  We expand "X = " | X into:

   "X = " | To_String(X)

and so long as X's type has an appropriate To_String operation, everything works smoothly.  What this means is that you almost never have to write a call on To_String explicitly, by just alternating between string literals and expressions that you want to print, you can write things like:

 Println ("X = " | X | ", Y = " |  Y | ", X + Y = " | X + Y | "!");

which is clearly less verbose than:

 Println ("X = " | To_String(X) | ", Y = " |  To_String(Y) | ", X + Y = " | To_String(X + Y) | "!");

Nevertheless, even in the more concise version, all of those alternating quotes and '|' characters can become hard for the human to parse, and it is easy to forget one of the needed characters.  So what to do?


A (growing) number of languages support the ability to "interpolate" the values of variables, or in some cases expressions, directly into string literals.  This has been true for simple variables since the beginning for languages like the Unix/GNU-Linux shell:

    echo "X = $X, Y = $Y"

Interpolating the value of more complex expressions is sometimes supported, but generally requires some additional syntax.  

Very early on, most variants of the LISP language supported the ability to construct list structures explicitly, using the "quote" operation, along with an ability to insert a LISP expression (aka "S-expression") in the middle that was to be evaluated and substituted into the list structure that was being defined.  E.g.:

     (define celebrate (lambda (name) (quote (Happy Birthday to (unquote (capitalize name))))))

and then "(celebrate (quote sam))" evaluates to "(Happy Birthday to Sam)".

As a short-hand, in most versions of LISP:

   (quote (a b c))

becomes something like:

    '(a b c)

and (unquote blah) becomes, depending on the version of LISP, something like:


So the above definition of "celebrate" using these short-hands becomes:

   (define celebrate (lambda (name) '(Happy Birthday to ,(capitalize name))))

with (celebrate 'sam) becoming (Happy Birthday to Sam)


Ok, so how does this relate to ParaSail?  So the explicit use of the "|" operator and having to end a string literal, insert the expressions between | ... |, and then restart the string literal gets old.  Given that there aren't an infinite number of ParaSail programs already in existence, and the fact that the back-quote character is almost never used, we decided to hi-jack the meaning of back-quote when it appears in the middle of a string literal to reduce all the back and forth needed when using an explicit "|" operator.  Hence, we now allow:

   Println("X = `(X), Y = `(Y),  X + Y = `(X + Y)!");

In other words, you can "interpolate" the value of an arbitrary (Imageable) expression into the middle of a ParaSail string literal by using `().  This is actually recognized during lexical analysis, and a mechanical expansion is performed by the lexical analyzer, of a back-quote character (unless preceded by '\') into the two characters:  " |

The lexical analyzer then expects a left parenthesis to be the next character, and counts matching parentheses, and when it reaches the matching right parenthesis, it emits the two characters: | "

The net effect is that "X = `(X), ..." becomes:

      "X = " | (X) | ", ..."

Note that the parentheses are preserved, to avoid issues with precedence between the "|" operator and operators that might occur within the parentheses.  Also note that the parenthesized expression can be separated from the back-quote by white space, and the expression can also span multiple lines if necessary.  The lexer is smart enough to use a stack, so that the parenthesized expression can also have nested string literals, that themselves use interpolation.

If you have looked at past source releases of the ParaSail LLVM-based compiler (which is itself written in Parasail -- see lib/compiler.psl), you might want to take a look at the version of lib/compiler.psl in the 8.0 release.   It makes heavy use of string interpolation, and hopefully is easier to read because of that.



No comments:

Post a Comment