Designing ParaSail, a new programming language: May 2011

Wednesday, May 25, 2011

When null isn't just null -- value representation in ParaSail

The qualifier optional may be applied to any type in ParaSail, effectively producing a type with one additional value, namely the null value. In many languages, null is only used for pointer types, and is often represented as a zero address. In ParaSail, we want to be able to add the null value to any type without presuming that the zero value is available for that purpose, so this means that the representation for null will vary depending on the particular type.

For simple types, coming up with a null value is pretty straightforward. For unsigned integer types, a value one greater than the maximum value could be used for null. For example, given an unsigned type going from 0 to 100, the value 101 could be used to represent null. Similarly, for enumeration types that use an internal unsigned integer representation, such as 0 for #false and 1 for #true, the null value could be represented by one more than the maximum internal unsigned integer code, for example 2 for the null value of an optional Boolean type. For a signed integer type, one less than the minimum value might make sense, particularly on a 2's complement machine where there is one "extra" negative value anyway. So for a 64-bit signed integer, one might use -2**63+1 .. 2**63-1 for "normal" values of the type, and reserve -2**63 for the null value. Most floating point representations include the notion of "Not a Number" (NaN), and some NaN value would make sense for null. Since there are no run-time checks in ParaSail (checking all being done at compile-time), it would be fine to use a "non-signaling" NaN for null.

For more complex types, the representation for null is a bit more interesting. One common kind of type would be a simple "wrapper," that is, a type defined by a class that has exactly one var component. For example:

class Box<Contents_Type is Assignable<>> is
    var Contents : Contents_Type;
  exports
    function Create(Value : Contents_Type) -> Box is
      return (Contents => Value);
    end function Create;
    ...
end class Box;

In this case, it would be nice to have the wrapper type use exactly the same representation as that of the underlying component type (e.g. Contents_Type). This would mean that the null value for the wrapper would be the same as the null value for the component type. This does mean that the component type must not itself be marked as optional, as then there would be no way to distinguish the wrapper being non-null but containing a single null component, from the case where the wrapper itself is null.

So our conclusion is that a wrapper type can use the same representation as its component type so long as the component type is not itself marked optional. If the component type is itself marked optional, then the wrapper needs to allow for its "own" representation for null, which might in some cases be accommodated by simply allowing for yet one more value if the component type is "simple," or a level of indirection for a more complex component type.

Now what about more complex types? For example, a type defined by a class with multiple var components:

class Pair<Element_Type is Assignable<>> is
    var Left : Element_Type;
    var Right : Element_Type;
  exports
    function Cons(Left, Right : Element_Type) -> Pair is
      return (Left => Left, Right => Right);
    end function Cons;
    ...
end class Pair;

One obvious representation for a type with multiple components is as a sequence of memory cells long enough to accommodate the representation of each of the components, and then some provision for representing null, which could be by piggy-backing on one of the components if it is not itself allowed to be null, or by adding an additional bit to indicate null-ness. However, in our initial (prototype, ParaSail-Virtual-Machine-based) implementation, we have chosen to represent every object using a single 64-bit memory cell. This basically means that if the value cannot fit in a single 64-bit cell, it will need to use a level of indirection. To simplify further, we won't be doing any "packing" initially, so even if the components are each quite short (such as booleans), we will nevertheless go to an indirect representation. We do anticipate supporting packed arrays, but that would initially be handled by doing explicit shifting and masking, rather than by building in the notion of packing into the ParaSail Virtual Machine instruction set. In the ParaSail Virtual Machine, pretty much everything occupies a single 64-bit word.

So back to the initial question -- how will we represent objects with multiple components (or with a single component whose type is marked optional)? And how will we represent null for such objects? One important thing to remember is that (large) ParaSail objects live in regions, and the regions are associated with stack frames. Whenever assigning a value to an object, if the new value can't "fit" in the same space as its current value, then the space for the old value needs to be released and the space for the new value needs to be allocated, in the appropriate region. Since a "ref var" parameter (or subcomponent thereof) might be assigned a new value, it won't necessarily be known at compile-time what region the object being assigned "lives" in. This suggests that every value for a large object must identify its region, so that its space can be released back to that region when it is freed, and so that the new value can be allocated in that region. This same requirement applies even if the object has a null value to begin with. Hence, we are left with the conclusion that a null value for a "large" type needs to identify its region.

Another issue for "large" objects is that we need to know how to reclaim the entire "subtree" of subcomponents when we release the enclosing object. In some cases we will know the type at compile-time, but in the case of polymorphic objects, or in cases where the type of the object is a formal type parameter of the enclosing module, we won't necessarily know. This implies that we may want to have some kind of "type-id" associated with large objects. This then results in the following representation for large objects:

    A 64-bit pointer to:
         [Region, Type-Id, component1, component2, component3, ...]

where the Type-Id provides enough information to know which of the components are themselves "large" and hence need to be recursively released. In addition, there would probably be a special null Type-Id, which would allow the "is null" operation to be implemented relatively efficiently by comparing the Type-Id against some kind of standard "Null-Type-Id" value. Each region would only need a single null value, which pointed back to the region, and had the Null-Type-Id. Such null objects would not be reclaimed if they were the "old" value of the left-hand side of an assignment, since there is only one such value per region, and it is shared among all objects in that region currently having a null value.

So to summarize, we now see that null is not a single value, but rather each simple type potentially has its own representation for null, and for "large" types that use a level of indirection, there is a separate null value for each region. So the "is null" operation is not necessarily a simple check for equality with the global null value, but instead would depend on the type, and in particular for "large" objects, would be a check of the Type-Id field of the object to see if it has the Null-Type-Id.

On a final note, when a function returns "large" values, it needs to be "told" in which region to allocate the value(s) being returned. One simple way to do this is for the "large" output parameter(s) to be initialized with a (large) null value by the caller. Since such a null value identifies the region, the function can use that information when allocating the space for the returned value.

Monday, May 9, 2011

Updated YACC grammar for ParaSail

Here is a more up-to-date YACC grammar from ParaSail. It uses "--" for comments, but otherwise is pretty vanilla "yacc" syntax.

--------------------------------------
-- YACC Grammar for ParaSail
--------------------------------------

-- Single-character delimiters --
%token ',' ';' ':' '.'
%token '+' '-' '*' '/' 
%token '?'
%token '(' ')' '[' ']' '<' '>' ''
%token '|' 
%token '='    -- for error recovery only
%token PRIME -- '''

-- Compound delimiters --
%token COMPARE -- "=?"
%token EQ   -- "=="
%token NEQ  -- "!="
%token GEQ  -- ">="
%token LEQ  -- "<="
%token POWER  -- "**"
%token ASSIGN -- ":="
%token SWAP   -- ":=:"
%token DOT_DOT -- ".."
%token OPEN_CLOSED_INTERVAL -- "<.."
%token OPEN_INTERVAL -- "<..<"
%token CLOSED_OPEN_INTERVAL -- "..<"
%token DOUBLE_COLON  -- "::"
%token REFERS_TO  -- "=>"
%token GIVES    -- "->"
%token IMPLIES    -- "==>"
%token SEQUENCE   -- ";;"
%token PARALLEL   -- "||"
%token PLUS_ASSIGN -- "+="
%token MINUS_ASSIGN -- "-="
%token TIMES_ASSIGN -- "*="
%token DIVIDE_ASSIGN -- "/="
%token POWER_ASSIGN -- "**="
%token CONCAT_ASSIGN -- "|="
%token AND_ASSIGN -- "and="
%token OR_ASSIGN -- "or="
%token XOR_ASSIGN -- "xor="

-- Literals --
%token Char_Literal
%token Enum_Literal
%token Integer_Literal 
%token Real_Literal
%token String_Literal

-- Identifier --
%token Identifier 

-- Reserved words --
%token ABS_kw
%token ABSTRACT_kw
%token ALL_kw
%token AND_kw
%token BEGIN_kw   -- used for error recovery only
%token BLOCK_kw
%token CASE_kw
%token CLASS_kw
%token CONCURRENT_kw
%token CONST_kw
%token CONTINUE_kw
%token EACH_kw
%token ELSE_kw
%token ELSIF_kw
%token END_kw
%token EXIT_kw
%token EXPORTS_kw
%token EXTENDS_kw
%token FOR_kw
%token FORWARD_kw
%token FUNCTION_kw
%token GLOBAL_kw
%token IF_kw
%token IMPLEMENTS_kw
%token IMPORT_kw
%token IN_kw
%token INTERFACE_kw
%token IS_kw
%token LAMBDA_kw
%token LOCKED_kw
%token LOOP_kw
%token MOD_kw
%token MUTABLE_kw
%token NEW_kw
%token NOT_kw
%token NULL_kw
%token OF_kw
%token OPERATOR_kw
%token OPTIONAL_kw
%token OR_kw
%token PRIVATE_kw
%token PROCEDURE_kw
%token QUEUED_kw
%token REF_kw
%token REM_kw
%token RETURN_kw
%token REVERSE_kw
%token SELECT_kw
%token SOME_kw
%token THEN_kw
%token TYPE_kw
%token UNTIL_kw
%token VAR_kw
%token WHILE_kw
%token WITH_kw
%token XOR_kw

%start module_list

%%

module_list : 
    module
  | module_list module
  ;

module : 
    import_clauses interface_declaration ';' 
  | import_clauses class_definition ';' 
  | import_clauses standalone_operation_definition ';'
  | import_clauses error ';'
  ;

import_clauses : 
  | import_clauses IMPORT_kw qualified_name_list ';' 
  ;

qualified_name_list : 
    qualified_name 
  | qualified_name_list ',' qualified_name 
  ;

interface_declaration : 
   opt_interface_qualifier INTERFACE_kw module_defining_name 
     formals_and_implemented_interfaces
     IS_kw
      interface_element_list
   END_kw opt_INTERFACE_kw module_defining_name 
   ; 

opt_INTERFACE_kw : INTERFACE_kw
  | 
  ;
   
opt_interface_qualifier : 
    interface_qualifier 
  | 
  ;

interface_qualifier : 
    class_qualifier 
  | ABSTRACT_kw opt_class_qualifier 
  | PRIVATE_kw opt_class_qualifier 
  ;

opt_class_qualifier : 
    class_qualifier 
  | 
  ;

class_qualifier : CONCURRENT_kw ;

standalone_operation_definition : 
    function_definition 
  | procedure_definition 
  | operator_definition 
  | operation_import 
  ;

formals : '<' opt_module_formal_list '>' ;

formals_and_implemented_interfaces :
    opt_formals opt_implements_list 
  | opt_formals EXTENDS_kw interface_name opt_implements_list 
  ; 

opt_formals : 
    formals  
  | 
  ;

opt_implements_list : 
    implements_list 
  | 
  ;

implements_list : IMPLEMENTS_kw interface_name_list  ;

interface_name_list :
    interface_name 
  | interface_name_list ',' interface_name 
  ;

interface_name : 
    module_name 
  | module_instantiation 
  ;
   
module_name : qualified_name ;

module_defining_name : 
    qualified_name 
  | qualified_name add_on_label 
  ;

add_on_label : 
    '[' operation_actual_list ']' ;

opt_module_formal_list : 
    module_formal_list 
  | ;

module_formal_list : 
    annotated_module_formal 
  | module_formal_list ';' annotated_module_formal 
  ;

annotated_module_formal : 
    opt_annotation type_formal opt_annotation 
  | opt_annotation operation_formal 
  | opt_annotation value_formal opt_annotation 
  ;

opt_annotation : 
    annotation 
  | 
  ;

type_formal : 
    id IS_kw module_instantiation 
  | module_instantiation 
  ;

operation_formal : 
    operation_declaration opt_operation_default  
  ;

opt_operation_default : 
    IS_kw simple_expression 
  | 
  ;

value_formal : 
    id_list ':' opt_output_modifier operand_type_specifier 
  | id_list ':' opt_output_modifier operand_type_specifier 
      ASSIGN_or_equal simple_expression  
  | ref_or_global_modifier operand_type_specifier 
  ;

id : Identifier 
   ;

id_list : 
    id 
  | id_list ',' id 
  ;

type_name : 
    qualified_name 
  | polymorphic_type_name 
  ;

polymorphic_type_name : qualified_name '+' ;

qualified_name : 
    id_or_string_literal 
  | qualified_name DOUBLE_COLON id_or_string_literal ;

id_or_string_literal :
    id 
  | String_Literal 
  ;
  
module_instantiation : 
    module_name '<' opt_module_actual_list '>' 
  | name '[' opt_operation_actual_list ']' '<' opt_module_actual_list '>' 
  ;

opt_add_on_label :
    add_on_label 
  | 
  ;

opt_module_actual_list : 
    module_actual_list 
    | 
    ;

module_actual_list :
    module_actual 
  | module_actual_list ',' module_actual 
  ;

module_actual : 
    simple_type_specifier_or_expression 
  | id REFERS_TO simple_type_specifier_or_expression 
  ;

-- simple_expression subsumes simple type_name in this rule
simple_type_specifier_or_expression : 
    qualified_name annotation 
  | qualified_type_specifier opt_annotation 
  | simple_expression              
    -- simple_expr to avoid problems with '>'
  | lambda_expression 
  | module_instantiation 
  ;
  
annotated_type_specifier : 
    type_specifier 
  | type_specifier annotation 
  ;

type_specifier :
    basic_type_specifier 
  | qualified_type_specifier 
  ;

basic_type_specifier : 
    type_name 
  | module_instantiation 
  | module_instantiation EXTENDS_kw type_specifier 
  ;

qualified_type_specifier : 
    type_qualifier type_name 
  | type_qualifier module_instantiation 
  | type_qualifier module_instantiation 
      EXTENDS_kw type_specifier 
  ;

type_qualifier :
    OPTIONAL_kw opt_MUTABLE_kw opt_CONCURRENT_kw 
  | MUTABLE_kw opt_CONCURRENT_kw 
  | CONCURRENT_kw 
  ;

opt_CONCURRENT_kw : 
    CONCURRENT_kw 
  | 
  ;

interface_element_list : 
  | interface_element_list interface_element ';' 
  | interface_element_list operation_import ';' 
  | interface_element_list error ';' 
  ;

interface_element : 
    operation_declaration 
  | object_declaration 
  | interface_declaration 
  | type_declaration 
  ;

operation_import :
    function_declaration IS_kw import_operation 
  | procedure_declaration IS_kw import_operation 
  | operator_declaration IS_kw import_operation 
  ;

class_definition :
   opt_class_qualifier CLASS_kw module_defining_name 
      class_formals_and_implemented_interfaces
   IS_kw
      class_element_list
   END_kw opt_CLASS_kw module_defining_name ; 

opt_CLASS_kw : CLASS_kw
  | 
  ;
   
class_formals_and_implemented_interfaces :
    formals_and_implemented_interfaces 
  | opt_formals EXTENDS_kw id ':' interface_name opt_implements_list 
  ; 


class_element_list : 
    local_class_element_list
  EXPORTS_kw 
    exported_class_element_list 
  | local_class_element_list
    annotation
  EXPORTS_kw 
    exported_class_element_list 
  | local_class_element_list 
  ;

local_class_element_list : 
  | local_class_element_list local_class_element ';' 
  ;

local_class_element : 
    interface_element  
  | operation_import 
  | annotated_exported_class_element 
  ;
  
exported_class_element_list : 
  | exported_class_element_list annotated_exported_class_element ';' 
  | exported_class_element_list operation_import ';' 
  | exported_class_element_list interface_element ';' 
  | exported_class_element_list error ';' 
  ;

annotated_exported_class_element : 
  | exported_class_element 
  | annotation 
  | annotation exported_class_element 
  ;

exported_class_element : 
    operation_definition  
  | object_definition  
  | class_definition  
  ;
  
   
annotation : '' 
  | annotation '' 
  ;

annotation_element_list : 
    annotation_element 
  | annotation_element_list ';' annotation_element 
  | annotation_element_list ';' error 
  ;

annotation_element : 
    interface_element 
  | operation_import 
  | condition 
  | quantified_expression 
  | annotation 
  ;

condition : expression  ;


operation_declaration : 
    function_declaration 
  | procedure_declaration 
  | operator_declaration 
  ;

function_declaration :
  opt_ABSTRACT_or_OPTIONAL_kw FUNCTION_kw id_or_string_literal
    operation_inputs opt_annotation 
      GIVES_or_RETURN_kw operation_outputs opt_annotation ;

GIVES_or_RETURN_kw : GIVES
  | RETURN_kw 
  ;

procedure_declaration :
  opt_ABSTRACT_or_OPTIONAL_kw PROCEDURE_kw id 
    operation_inputs opt_annotation ;

operator_declaration :
    opt_ABSTRACT_or_OPTIONAL_kw OPERATOR_kw operator_designator 
      operation_inputs opt_annotation 
  | opt_ABSTRACT_or_OPTIONAL_kw OPERATOR_kw operator_designator 
    operation_inputs opt_annotation 
      GIVES_or_RETURN_kw operation_outputs opt_annotation 
  ;

opt_ABSTRACT_or_OPTIONAL_kw :
    ABSTRACT_kw 
  | OPTIONAL_kw 
  | 
  ;

operator_designator : String_Literal ;
  
operation_inputs :
    simple_operation_input 
  | '(' opt_annotated_operation_input_list ')' 
  | '(' id_list ',' id ')' 
  | 
  ;

simple_operation_input :   -- avoids trailing use of "IS"
    id ':' opt_input_modifier simple_operand_type_specifier 
  | input_modifier simple_operand_type_specifier 
  | simple_operand_type_specifier 
  ;

opt_annotated_operation_input_list : 
    annotated_operation_input_list 
  | 
  ;

annotated_operation_input_list : 
    annotated_operation_input 
  | annotated_operation_input_list ';' annotated_operation_input 
  ;

annotated_operation_input : 
    operation_input opt_annotation 
  | annotation operation_input opt_annotation 
  | function_declaration opt_operation_default 
  | annotation function_declaration opt_operation_default 
  | procedure_declaration opt_operation_default 
  | annotation procedure_declaration opt_operation_default 
  ;

operation_input : 
    id_list ':' opt_input_modifier operand_type_specifier opt_ASSIGN_expression
  | input_modifier operand_type_specifier opt_ASSIGN_expression 
  | operand_type_specifier opt_ASSIGN_expression 
  ;

opt_input_modifier : 
    input_modifier 
  | 
  ;
  
simple_operand_type_specifier :
    type_name 
  | module_instantiation 
  ;

operand_type_specifier : 
    simple_operand_type_specifier 
  | id IS_kw module_instantiation 
  ;

input_modifier : 
    output_modifier 
  | QUEUED_kw mutable_or_var_or_const 
  | LOCKED_kw mutable_or_var_or_const 
  ;

mutable_or_var_or_const :
    MUTABLE_kw opt_VAR_kw 
  | VAR_kw 
  | CONST_kw 
  ;

opt_VAR_kw : 
    VAR_kw 
  | 
  ;

operation_outputs : 
    simple_operation_output 
  | annotation simple_operation_output 
  | '(' annotated_operation_output_list ')' 
  | '(' id_list ',' id ')' 
  ;

simple_operation_output :   -- avoids trailing use of "IS"
    id ':' opt_output_modifier simple_operand_type_specifier 
  | output_modifier simple_operand_type_specifier 
  | simple_operand_type_specifier 
  ;

annotated_operation_output_list :
    annotated_operation_output 
  | annotated_operation_output_list ';' annotated_operation_output 
  ;

annotated_operation_output : 
    operation_output opt_annotation 
  | annotation operation_output opt_annotation  
  ;

operation_output : 
  id_list ':' opt_output_modifier operand_type_specifier 
  | output_modifier operand_type_specifier  
  | operand_type_specifier  
  ;

opt_output_modifier :  
    output_modifier 
  | 
  ;

output_modifier :  
    OPTIONAL_kw 
  | ref_or_global_modifier 
  ;

ref_or_global_modifier :
    REF_or_GLOBAL_opt_optional_mutable 
  | REF_or_GLOBAL_opt_optional_mutable VAR_kw 
  | REF_or_GLOBAL_opt_optional_mutable CONST_kw 
  ;

REF_or_GLOBAL_opt_optional_mutable :
    REF_or_GLOBAL_kw 
  | REF_or_GLOBAL_kw OPTIONAL_kw 
  | REF_or_GLOBAL_kw MUTABLE_kw 
  | REF_or_GLOBAL_kw OPTIONAL_kw MUTABLE_kw 
  ;

REF_or_GLOBAL_kw : 
    REF_kw 
  | GLOBAL_kw 
  ;

object_declaration : 
    VAR_kw id ':' 
      annotated_type_specifier 
      opt_ASSIGN_expression 
  | CONST_kw id ':' annotated_type_specifier 
      opt_ASSIGN_expression 
  | id ':' annotated_type_specifier 
      opt_ASSIGN_expression 
  ;

opt_ASSIGN_expression : 
    ASSIGN_or_equal expression 
  | 
  ;
   
object_definition :
    CONST_kw id ASSIGN_or_equal expression 
  | VAR_kw id ASSIGN_or_equal expression 
  | CONST_kw id REFERS_TO name 
  | VAR_kw id REFERS_TO name 
  ;

opt_OPTIONAL_kw : 
    OPTIONAL_kw 
  | 
  ;
opt_MUTABLE_kw : 
    MUTABLE_kw 
  | 
  ;

type_declaration : 
    TYPE_kw id IS_kw opt_NEW_kw annotated_type_specifier ;

opt_NEW_kw : 
    NEW_kw 
  | 
  ;

operation_definition : 
    function_definition 
  | procedure_definition 
  | operator_definition 
  ;

function_definition : 
  function_declaration IS_kw opt_queued_clause 
     statement_list_with_semi 
  END_kw opt_FUNCTION_kw id 
  ;

opt_FUNCTION_kw : FUNCTION_kw
  | 
  ;

procedure_definition : 
  procedure_declaration IS_kw opt_queued_clause 
     statement_list_with_semi 
  END_kw opt_PROCEDURE_kw id 
  ;

opt_PROCEDURE_kw : PROCEDURE_kw
  | 
  ;

operator_definition : 
  operator_declaration IS_kw 
     statement_list_with_semi 
  END_kw opt_OPERATOR_kw operator_designator  
  ;

opt_OPERATOR_kw : OPERATOR_kw
  | 
  ;

opt_queued_clause : 
    queued_clause 
  | 
  ;

queued_clause :
    QUEUED_kw WHILE_or_UNTIL_kw condition THEN_kw ;

import_operation :
    IMPORT_kw '(' opt_operation_actual_list ')' ;

statement_list_with_semi : 
    parallel_sequence_with_semi 
  | statement_list_no_semi THEN_kw parallel_sequence_with_semi 
  | statement_list_with_semi THEN_kw parallel_sequence_with_semi 
  | statement_list_with_semi use_BEGIN_kw parallel_sequence_with_semi 
  | use_BEGIN_kw parallel_sequence_with_semi 
  ;

use_BEGIN_kw : BEGIN_kw ;


statement_list_no_semi : 
    parallel_sequence_no_semi 
  | statement_list_no_semi THEN_kw parallel_sequence_no_semi 
  | statement_list_with_semi THEN_kw parallel_sequence_no_semi 
  ;

parallel_sequence_with_semi : 
    statement_sequence_with_semi 
  | parallel_sequence_no_semi PARALLEL statement_sequence_with_semi 
  | parallel_sequence_with_semi PARALLEL statement_sequence_with_semi 
  ;

parallel_sequence_no_semi :
    statement_sequence 
  | parallel_sequence_no_semi PARALLEL statement_sequence 
  | parallel_sequence_with_semi PARALLEL statement_sequence 
  ;

statement_sequence_opt_semi :
    statement_sequence_with_semi 
  | statement_sequence 
  ;

statement_sequence_with_semi : statement_sequence ';' ;

statement_sequence :
    annotated_statement 
  | statement_sequence SEQUENCE annotated_statement 
  | statement_sequence ';' SEQUENCE annotated_statement 
  | statement_sequence ';' annotated_statement 
  ;
  
annotated_statement : 
    opt_annotation local_declaration 
  | opt_annotation statement opt_annotation 
  | annotation 
  ;

statement : 
    local_definition 
  | simple_statement 
  | label compound_statement 
  | compound_statement 
  ;

simple_statement :
    primitive_statement 
  | name equal_as_assign expression 
  | NULL_kw 
  | name '(' opt_operation_actual_list ')' 
  | RETURN_kw expression 
  | RETURN_kw opt_WITH_values  
  | CONTINUE_kw LOOP_kw opt_id opt_WITH_values  
  | EXIT_kw compound_statement_kind opt_id opt_WITH_values 
  ;

primitive_statement :
    name assign_operator_not_divide expression 
  | name DIVIDE_ASSIGN expression 
  | name SWAP name 
  | '(' opt_operation_actual_list ')' ASSIGN expression 
  ;

opt_operation_actual_list : 
    operation_actual_list 
  | 
  ;

opt_WITH_values : 
    WITH_values 
  | 
  ;

WITH_values : WITH_kw operation_actual ;

opt_id : 
    id 
  | 
  ;

compound_statement_kind : 
    LOOP_kw 
  | IF_kw 
  | CASE_kw 
  | SELECT_kw 
  | BLOCK_kw 
  ;

local_declaration : 
    operation_declaration 
  | type_declaration 
  | object_declaration
  ;

local_definition :
    object_definition 
  | operation_definition 
  ;

label : '*' id '*' ;

compound_statement :
    if_statement 
  | case_statement 
  | indefinite_loop_statement 
  | while_until_loop_statement 
  | for_loop_statement 
  | block_statement  
  | select_statement 
  | error ';' 
  ;

if_statement : 
  IF_kw condition THEN_kw 
     statement_list_with_semi
  opt_else_part
  END_kw IF_kw opt_id opt_WITH_values ;

opt_else_part : 
    ELSIF_kw condition THEN_kw
      statement_list_with_semi 
    opt_else_part 
  | ELSE_kw statement_list_with_semi 
  | 
  ;


case_statement : 
  CASE_kw expression OF_kw
    case_alt_list
    opt_default_alt
  END_kw CASE_kw opt_id opt_WITH_values ;

case_alt_list : 
    case_alt 
  | case_alt_list case_alt 
  ;

case_alt :
    '[' simple_expression_opt_named ']' REFERS_TO statement_list_with_semi 
  ;

simple_expression_opt_named :
    simple_expression 
  | id ':' simple_expression 
  ;
  
opt_default_alt : 
    '[' dot_dot_opt_named ']' REFERS_TO statement_list_with_semi 
  | 
  ;

dot_dot_opt_named :
    dot_dot_as_interval 
  | id ':' dot_dot_as_interval 
  ;
    
dot_dot_as_interval : DOT_DOT ;


indefinite_loop_statement :
  LOOP_kw 
    statement_list_with_semi
  END_kw LOOP_kw opt_id opt_WITH_values ;

while_until_loop_statement :
  WHILE_or_UNTIL_kw condition LOOP_kw
    statement_list_with_semi
  END_kw LOOP_kw opt_id opt_WITH_values ;

WHILE_or_UNTIL_kw :
    WHILE_kw 
  | UNTIL_kw 
  ;

for_loop_statement :
  FOR_kw iterator_spec opt_direction LOOP_kw
    statement_list_with_semi
  END_kw LOOP_kw opt_id opt_WITH_values ;

iterator_spec : 
    iterator 
  | '(' iterator_list ')' 
  ;

iterator_list :
    iterator 
  | iterator_list ';' iterator 
  ;

iterator :
    index_set_iterator 
  | EACH_kw element_iterator 
  | initial_next_while_iterator 
  | initial_value_iterator 
  ;

index_set_iterator :
    id opt_COLON_type_specifier IN_kw opt_REVERSE_kw expression ;

opt_REVERSE_kw : 
  | REVERSE_kw ;

element_iterator :
    id opt_COLON_type_specifier OF_kw expression 
  | '[' id REFERS_TO id ']' OF_kw expression 
  ;

initial_next_while_iterator :
    id opt_COLON_type_specifier ASSIGN_or_equal expression 
      opt_THEN_next_value_list 
      WHILE_or_UNTIL_kw condition 
  | id opt_COLON_type_specifier REFERS_TO name 
      opt_THEN_next_name_list
      WHILE_or_UNTIL_kw condition 
  ;

opt_THEN_next_value_list :
    THEN_kw next_value_list 
  | 
  ;

opt_THEN_next_name_list :
    THEN_kw next_name_list 
  | 
  ;

initial_value_iterator :
    id opt_COLON_type_specifier ASSIGN_or_equal expression 
  | id opt_COLON_type_specifier REFERS_TO name 
  ;

opt_COLON_type_specifier : 
    ':' type_specifier 
  | 
  ;

next_value_list : 
    expression  
  | next_value_list PARALLEL expression 
  ;

next_name_list : 
    name  
  | next_name_list PARALLEL name 
  ;

opt_direction : direction  
  | 
  ;

direction : 
    CONCURRENT_kw 
  | FORWARD_kw 
  | REVERSE_kw 
  ;

select_statement :
  SELECT_kw 
     select_alt_list
  END_kw SELECT_kw opt_id opt_WITH_values ;

select_alt_list : 
    select_alt
  | select_alt_list PARALLEL select_alt
  ;

select_alt : 
  '[' statement_sequence_opt_semi ']' REFERS_TO statement_sequence_with_semi ;

block_statement :
    BLOCK_kw
      statement_list_with_semi
    END_kw opt_BLOCK_kw opt_id opt_WITH_values 
  ;

opt_BLOCK_kw : BLOCK_kw
  | 
  ;
   
expression :
    expression_no_err 
  | expression_no_err divide_assign_as_not_equal expression_no_err 
  ;

divide_assign_as_not_equal :
    DIVIDE_ASSIGN 
  ;

expression_no_err :
    logical_expression 
  | logical_expression '?' expression ':' expression_no_err 
  | lambda_expression 
  ;

lambda_expression :
    LAMBDA_kw operation_inputs opt_annotation IS_kw 
       simple_expression_or_expr_stmt_seq 
  | LAMBDA_kw operation_inputs opt_annotation
      GIVES operation_outputs opt_annotation IS_kw 
    simple_expression_or_expr_stmt_seq 
  ;

simple_expression_or_expr_stmt_seq :
    simple_expression 
  | '(' expr_statement_seq ')' 
  ;

expr_statement_seq : expr_statement ';' expr_statement 
  | expr_statement_seq ';' expr_statement 
  ;

expr_statement : 
    primitive_statement 
  | expression_no_err 
  ;

logical_expression :  
    comparison_expression 
  | logical_expression logical_operator comparison_expression 
  ;

comparison_expression :  -- comparisons are non associative
    simple_expression 
  | simple_expression comparison_operator simple_expression 
  | adding_expression IN_kw simple_expression 
  | adding_expression NOT_kw IN_kw simple_expression 
  | adding_expression IS_kw NULL_kw 
  | adding_expression NOT_kw NULL_kw 
  ;

simple_expression : -- used to avoid use of '>' in module instantiation
    simple_expression_component 
  | simple_expression '|' simple_expression_component 
  ;

simple_expression_component :
    adding_expression 
  | adding_expression interval_operator adding_expression 
  | adding_expression interval_operator 
  | interval_operator adding_expression 
  | adding_expression '+' 
  ;

adding_expression :
    term 
  | adding_expression adding_operator term 
  ;

term : 
    factor 
  | term multiplying_operator factor 
  ;

factor : 
    primary 
  | primary power_operator factor 
  | unary_operator factor  
  ;

primary :
    name 
  | literal 
  | '(' conditional_expression ')' 
  | '(' quantified_expression ')' 
  | aggregate 
  ;
  
literal:  -- NOTE: See "name" for String_Literal
    Integer_Literal 
  | Real_Literal 
  | Char_Literal 
  | Enum_Literal 
  | NULL_kw 
  ;

name :
    qualified_name_and_property opt_PRIME 
  | qualified_name DOUBLE_COLON literal 
  | name '(' opt_operation_actual_list ')' 
  | name '[' opt_operation_actual_list ']' 
  | name '.' selector 
  ;

qualified_name_and_property :
    qualified_name 
  | qualified_name_and_property Enum_Literal 
  ;

opt_PRIME : 
    PRIME 
  | PRIME Identifier 
  | 
  ;

operation_actual_list : 
    operation_actual 
  | operation_actual_list ',' operation_actual 
  ;

operation_actual : 
    expression_no_err 
  | id REFERS_TO expression 
  ;

selector : id ;

unary_operator : 
    '+' 
  | '-' 
  | ABS_kw 
  | NOT_kw 
  ;

adding_operator : 
  '+' 
  | '-' 
  ;

multiplying_operator : 
    '*' 
  | '/' 
  | MOD_kw 
  | REM_kw 
  ;

power_operator : POWER ;

assign_operator : assign_operator_not_divide 
  | DIVIDE_ASSIGN 
  ;

assign_operator_not_divide :
    ASSIGN 
  | PLUS_ASSIGN 
  | MINUS_ASSIGN 
  | TIMES_ASSIGN 
  | CONCAT_ASSIGN 
  | AND_ASSIGN 
  | OR_ASSIGN 
  | XOR_ASSIGN 
  ;

ASSIGN_or_equal : ASSIGN 
  | equal_as_assign 
  ;

equal_as_assign : '=' 
  ;

comparison_operator : 
    COMPARE 
  | EQ 
  | NEQ 
  | '<' 
  | LEQ 
  | '>' 
  | GEQ 
  | '<' '<' 
  | '>' '>' 
  | '=' 
  ;

logical_operator :
    AND_kw 
  | OR_kw 
  | XOR_kw  
  | AND_kw THEN_kw 
  | OR_kw ELSE_kw 
  | IMPLIES 
  ;

interval_operator :
    DOT_DOT 
  | OPEN_INTERVAL 
  | CLOSED_OPEN_INTERVAL 
  | OPEN_CLOSED_INTERVAL 
  ;

aggregate : 
    class_aggregate  
  | container_aggregate  
  ;

class_aggregate : 
    '(' opt_operation_actual_list ')' 
  | '(' name DIVIDE_ASSIGN expression ')' 
  | qualified_name DOUBLE_COLON '(' opt_operation_actual_list ')' 
  ;

container_aggregate : 
    '[' opt_container_element_list ']' 
  | qualified_name DOUBLE_COLON '[' opt_container_element_list ']' 
  ;
  
opt_container_element_list : 
    container_element_list 
  | DOT_DOT 
  | 
  ;

container_element_list : 
    container_element 
  | container_element_list ',' container_element 
  ;

container_element : 
    expression 
  | simple_expression REFERS_TO filtered_expression_stream 
  | DOT_DOT REFERS_TO filtered_expression_stream 
  | FOR_kw iterator REFERS_TO filtered_expression_stream 
  ;

filtered_expression_stream : 
    expression 
  | expression ':' condition 
  ;

conditional_expression :
    if_expression 
  | case_expression 
  ;

if_expression : 
  IF_kw condition THEN_kw 
     expression
  opt_else_expr ;

opt_else_expr : 
    ELSIF_kw condition THEN_kw 
      expression 
    opt_else_expr 
  | ELSE_kw expression 
  | 
  ;

case_expression : 
  CASE_kw expression OF_kw
    case_expr_alt_list ;

case_expr_alt_list : 
    case_expr_alt 
  | case_expr_alt_list ';' case_expr_alt 
  ;

case_expr_alt : 
    '[' simple_expression_opt_named ']' REFERS_TO expression 
  | '[' dot_dot_opt_named ']' REFERS_TO expression 
  ;

quantified_expression :
    FOR_kw ALL_or_SOME_kw quantified_iterator REFERS_TO
      condition_or_quantified_expression ;

condition_or_quantified_expression :
    condition 
  | if_expression 
  | quantified_expression 
  ;

ALL_or_SOME_kw : 
    ALL_kw 
  | SOME_kw 
  ;

quantified_iterator :
    index_set_iterator 
  | element_iterator 
  | initial_next_while_iterator 
  ;

%%

Sunday, May 8, 2011

ParaSail Reference Manual -- first complete draft posted

We have posted the first complete draft of the reference manual for the ParaSail programming language to the ParaSail google group:

http://groups.google.com/group/parasail-programming-language

Comments welcomed!

Designing ParaSail, a new programming language