Nemerle Language Reference

1. Introduction

This document presents the syntax and semantics of the Nemerle language in a semi-formal way. It is not meant to be a tutorial.

We often refer to .NET terminology [FIXME: reference it].

2. Lexical Conventions

Programs are written using the Unicode character set. Every Nemerle source file is reduced to a sequence of lexical units (tokens) separated by sequences of whitespace characters (blanks).

2.1. Tokens

There are five classes of lexical tokens:

/* A comment. */
// Also a comment

foo               // identifier
foo_bar foo' foo3 // other identifiers
42                // integer literal
0x2a              // hexadecimal integer literal
0o52              // octal integer literal
0b101010          // binary integer literal
'a'               // character literal
'\n'              // also a character literal
"foo\nbar"        // string literal
@"x\n"            // same as "x\\n"
@if               // keyword used as identifier

2.2. Blanks

Spaces, vertical and horizontal tabulation characters, new-page characters, new-line characters and comments (collectively called blanks) are discarded, but can separate other lexical tokens.

A traditional comment begins with /* and ends with */. An end-of-line comment starts with // and ends at the line terminator (ASCII LF character).

2.3. Type variables

Type variables occur in polymorphic type definitions. A type variable can be any valid identifier.

2.4. Identifiers

Ordinary identifiers consist of letters, digits, underscores and apostrophes, but cannot begin with a digit or an apostrophe. Identifiers may be quoted with the @ character, which is stripped. The @ removes any lexical and syntactic meaning from the characters that follow it, up to the next blank, thus enabling the programmer to use keywords as identifiers.

There is an important difference between identifiers starting with the underscore character _ and all other identifiers. When you define a local value whose name starts with _ and never use it, the compiler does not complain. It does warn about other unused values.
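
As a small illustration of this rule, consider the following fragment, meant to sit inside a method body. The function compute is hypothetical and stands for any function that returns a value.

```nemerle
// compute is a hypothetical function, assumed to be defined elsewhere.
def _scratch = compute ();   // never used below: no warning, the name starts with _
def scratch = compute ();    // never used below: the compiler warns about it
()
```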

Symbolic identifiers consist of the following characters: =, <, >, @, ^, |, &, +, -, *, /, $, %, !, ?, ~, ., :, #. Symbolic identifiers are treated as standard identifiers, except that they are always treated as infix operators.

2.5. Keywords

The following identifiers are used as keywords and may not be used unquoted in any other context: [[FIXME: update this list.]] _, abstract, and, array, as, base, class, const, def, else, ensure, enum, extern, finally, fun, if, in, interface, internal, let, macro, match, module, mutable, namespace, new, null, using, out, private, protected, public, throw, ref, require, sealed, static, struct, then, this, try, tymatch, type, variant, void, volatile, where, with.

The following infix identifiers are reserved keywords: =, $, ?, |, <-, ->, =>, <[, ]>, &&, ||.

2.6. Literals

There are a few kinds of literals:

3. Compilation units

compilation_unit ::=

A Nemerle program consists of one or more compilation units. Compilation units are text files with the .n extension. A compilation unit consists of namespace-related declarations and the types within them.

toplevel_declaration ::=

Add the specified namespace (which, unlike in C#, can also be a type name) to the symbol search path. Every symbol up to the end of the current namespace, or of the compilation unit if not within a namespace, will also be searched for in the location specified by this path.

toplevel_declaration ::=
namespace IDENTIFIER = qualified_identifier ;

Define an alias for a namespace. After namespace Foo = Bar.Baz; any reference to Foo.bar will be expanded to Bar.Baz.bar.

toplevel_declaration ::=

Put declarations within the specified namespace. Namespaces can be nested, creating a tree of namespaces.

toplevel_declaration ::=

Define a new top level type.

4. Stuff that doesn't fit anywhere else

This section lists grammar rules common to most other sections.

4.1. Identifier or dummy

identifier_or_dummy ::=
IDENTIFIER
identifier_or_dummy ::=
_

In several places it is possible to use the _ keyword to denote the intent to ignore a parameter or return value. In such places _ causes a new temporary name to be generated.

4.2. Qualified identifier

qualified_identifier ::=
IDENTIFIER { . IDENTIFIER }

Identifiers can be qualified with namespaces.

5. Type declarations

Types are defined at the top level, within namespaces or within other types. Top level type names are prefixed with the namespace they are defined in. Nested type names are prefixed with the parent type name. Nesting affects accessibility of a type.

5.1. Type header

type_header ::=
IDENTIFIER [ type_parameters ] [ : type { , type } ] where_constraints

The type header is similar to the one in .NET. The main difference is the optional type_parameters list, defined below.

5.2. Type parameters

type_parameters ::=
< TYPE_VARIABLE { , TYPE_VARIABLE } >
where_constraints ::=
{ where TYPE_VARIABLE : type { , type } }

When defining a polymorphic type, one has to specify the list of type variables in the declaration. It can have the following form:

class Foo <a, b>

An optional list of where clauses can be used to add constraints to the type variables (type coercion).

where a : Nemerle.Collections.IEnumerable, IComparable
where b : Nemerle.Collections.IDictionary

5.3. Type alias

type_declaration ::=

This type declaration creates an alias to another type.

5.4. Interface definition

type_declaration ::=

Nemerle interfaces are similar to .NET interfaces.

5.5. Class definition

type_declaration ::=

A class definition is similar to its .NET counterpart.

5.6. Module definition

type_declaration ::=

A module is much like a class, but all module members are static. There is no need to place the static attribute on module members. It is also not possible to create instances of module types.

5.7. Variant definition

type_declaration ::=

A variant declaration consists of a type name and a list of bar-separated constructors enclosed in brackets.

variant_option ::=
IDENTIFIER [ { { field_definition } } ]

The constructor declaration describes a constructor associated with this variant type. A constructor may take an argument. A constructor name must be capitalized.
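
A minimal sketch of a variant declaration, assuming the options are listed between braces after the type name (the full grammar rule is not reproduced in this section). The names Tree, Leaf and Node are made up for illustration.

```nemerle
// Hypothetical binary tree of integers.
variant Tree {
  | Leaf                 // option without fields
  | Node {               // option carrying three fields
      left : Tree;
      value : int;
      right : Tree;
    }
}
```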

6. Attributes

The semantics of attributes is the same as in C#.

attributes ::=
{ attribute }
attribute ::=
new
attribute ::=
public
attribute ::=
protected
attribute ::=
internal
attribute ::=
private
attribute ::=
abstract
attribute ::=
sealed
attribute ::=
override
attribute ::=
static

7. Declarations within types

The following members are allowed in a class or module body:

type_member ::=
type_member ::=
type_member ::=

7.1. Field definition

field_definition ::=
attributes [ mutable ] IDENTIFIER : type ;

Unless the optional mutable keyword is used, the field can be modified only inside the constructor.

7.2. Interface member

interface_member ::=
[ new ] method_header ;

The new keyword is necessary when the declared method hides a method inherited from another interface.

7.3. Method definition

method_definition ::=

This is the definition of a method within a class or module. The program entry point is the static method Main.

7.4. Method header

method_type_parameters ::=
< TYPE_VARIABLE { , TYPE_VARIABLE } >

The declaration of a polymorphic method needs its type variables listed after the identifier.

method_header ::=

This is the declaration of a method. Unlike in C#, the type is specified after the parameter list.

method_header ::=

A special method named this specifies a constructor. Its declaration cannot contain a method type, and the method has to have type void.
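
The following sketch combines a field definition with a constructor. It rests on two assumptions not confirmed by the grammar reproduced here: that a class body is written between braces as in C#, and that assignment uses the reserved <- operator (the same arrow that appears in the mutable value definition). The class Counter and its members are hypothetical.

```nemerle
// Hypothetical class; class-body syntax and <- assignment are assumptions.
class Counter {
  public name : string;           // no mutable: can be set only in the constructor
  public mutable count : int;     // mutable: may be modified anywhere

  // Constructor: named this, declared without a return type.
  public this (n : string) {
    this.name <- n;
    this.count <- 0
  }
}
```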

method_implements ::=

7.5. Method parameters

method_parameter ::=

A method parameter is a pair consisting of an identifier or _ and its type specification. The type declaration can be omitted in local function definitions.

method_parameters ::=

Method parameters are a comma-separated list of parameter specifications.

7.6. Method body

method_body ::=
= extern STRING_LITERAL ;

A method body can be linked to an external function, for example:

static ps (s : string) : void = extern "System.Console.Write";

This feature is used to give meaning to infix operators. It is not fully supported at present and should not be considered stable.

method_body ::=
;

An empty method body (a single ;) makes this a method declaration.

method_body ::=

This is a classical method definition.

8. Type expressions

Type expressions relate to type declarations much like function calls relate to function and value definitions. Type declarations define the ways types can be constructed, and type expressions define actual types.

Types are both a static and a dynamic characterization of values. The static type of an expression depends on its building blocks and is defined in the paragraph describing that expression. The dynamic (runtime) type is bound to the value at the moment it is created, and remains there until the value is garbage collected.

The type system is modeled after the .NET Generics design, except for tuple and function types, which are new but can easily be simulated using generics.

8.1. Type constructor application

primary_type ::=

A type constructor (defined with a type declaration) can be applied to zero or more arguments, forming a type expression. The number of type arguments in a type application must match the number of type arguments in the definition. Moreover, the actual type arguments must satisfy the where constraints imposed on the formal type arguments.

8.2. Type variable reference

primary_type ::=
TYPE_VARIABLE

Refers to the type substituted for the given type variable. The type variable has to be defined (bound, quantified) before it is used. A type variable can be defined in type arguments or in a method header (of a global or local function).

8.3. Grouping types

primary_type ::=
( type )

This construct has no semantic meaning -- it exists only to enforce particular syntax decomposition.

8.4. Void type

primary_type ::=
void

This is mostly an alias for System.Void, a type with exactly one inhabiting value. It is, however, a first-class value: it can be passed as a function parameter as well as returned from functions.

The name comes from System.Void, but the type should in fact be called unit.

8.5. Ref and Out types

primary_type ::=
primary_type ::=

These are for parameters passed by reference. This is not implemented yet, but will have semantics similar to those in C#.

8.6. Tuple type

type ::=

Constructs a product (tuple) type. This operator is not associative, which means that any two of the following types are different:

int * int * int
(int * int) * int
int * (int * int)
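
To make the distinction concrete, the fragment below (intended for a method body) builds one value of each of the three types with the tuple constructor. The names a, b and c are arbitrary.

```nemerle
def a = (1, 2, 3);      // int * int * int: a flat triple
def b = ((1, 2), 3);    // (int * int) * int: a pair whose first member is a pair
def c = (1, (2, 3));    // int * (int * int): a pair whose second member is a pair
()
```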

8.7. Function type

type ::=
type -> type

Constructs a function type with the specified argument and return types, respectively. The -> operator is right associative, which means that the following types are equivalent:

int -> int -> int
int -> (int -> int)

Multi-argument function types are written using tuple notation, for example after local declaration:

def some_function (a : int, b : string) : float { ... }

the expression some_function has type int * string -> float.

9. Literal expressions

These are used in expressions and patterns.

9.1. Boolean

literal ::=
true
literal ::=
false

These literals have type bool and represent the true and false boolean values, respectively.

9.2. Null

literal ::=
null

Represents the null reference, one that does not refer to any object. It possesses the types of all reference types, so it can be used in any context where a reference type is required. It does not, however, possess the void type or any value type (like System.Int32 or System.Single).

9.3. Void

literal ::=
( )

Indicates returning no value. It is the only possible value of type void. See also void type.

9.4. String

literal ::=
STRING_LITERAL

Represents a string constant. Nemerle supports two forms of string literals:

A regular string literal consists of zero or more characters enclosed in double quotes and may include both simple escape sequences (such as \n for the newline character) and hexadecimal and Unicode escape sequences (see character literals for details).

A verbatim string literal consists of an @ character followed by a double-quote character, zero or more characters, and a closing double-quote character. In a verbatim string literal, the characters between the double quotes are taken verbatim; the only exception is the sequence "", which denotes a single " character. Simple escape sequences and hexadecimal and Unicode escape sequences are not recognized in verbatim string literals. A verbatim string literal may span multiple lines.

Examples:

def s1 = "Nemerle string !";            // Nemerle string !
def s2 = @"Nemerle string !";           // Nemerle string !
def s3 = "Nemerle\tstring !";           // Nemerle    string !
def s4 = @"Nemerle\tstring !";          // Nemerle\tstring !
def s5 = "I heard \"zonk !\"";          // I heard "zonk !"
def s6 = @"I heard ""zonk !""";         // I heard "zonk !"
def s7 = "\\\\trunk\\ncc\\ncc.exe";     // \\trunk\ncc\ncc.exe
def s8 = @"\\trunk\ncc\ncc.exe";        // \\trunk\ncc\ncc.exe
def s9 = "\"Nemerle\"\nstring\n!";      // "Nemerle"
                                        // string
                                        // !
def s10 = @"""Nemerle""
string
!";                                     // same as s9

String s10 is a verbatim string literal that spans 3 lines.

9.5. Number

literal ::=
NUMBER_LITERAL

Represents a value of one of the numeric types. See literals for details on how particular numeric types are represented.

9.6. Character

literal ::=
CHARACTER_LITERAL

A character literal consists of one character enclosed in single quotes (' ') or of an escape sequence of the form '\X', where X can be one of the following: [FIXME : characters with (N) are not implemented yet (will they ?)]

It has type char.

10. Primary expressions

Primary expressions form a grammar category of expressions that have a closed structure and are otherwise simple. Primary expressions and plain expressions do not differ at the semantic level.

10.1. Literal expression

primary_expr ::=

The value and type of a literal expression are the value and type of the respective literal.

10.2. Variable reference

primary_expr ::=

The result of this expression is the variable itself (not its value). [[FIXME: hem?!]] The type of this expression is ref 'a, where 'a is the type of the variable being referenced.

10.3. this pointer reference

primary_expr ::=
this

This expression can only be used within non-static methods and denotes a reference to the current instance of the class (the class that owns this method).

An expression like this.foo can be shortened to foo, unless that would create an ambiguity with a variable in the current lexical scope.

10.4. Grouping expression

primary_expr ::=
( expr )

A grouping expression enforces a particular syntax decomposition of an expression.

10.5. Type cast

primary_expr ::=
( expr :> type )

This expression performs a dynamic type coercion. It is done at runtime, and if it cannot be carried out, System.InvalidCastException is thrown. If it succeeds, the type of this expression is type.

10.6. Type enforcement

primary_expr ::=
( expr : type )

This expression performs static type enforcement. It is checked at compile time, and an error is reported if the type of expr is not a subtype of type. It allows only type widening. If it succeeds, the type of this expression is type.
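
The difference between type enforcement and the type cast of the previous section can be sketched as follows. This is a fragment for a method body; the names s, o, t and bad are arbitrary.

```nemerle
def s = "text";
def o = (s : object);     // enforcement: string widens to object, checked at compile time
def t = (o :> string);    // cast: checked at runtime, succeeds because o refers to a string
// The next cast also compiles, but fails at runtime with
// System.InvalidCastException, since o does not refer to an ArrayList.
def bad = (o :> System.Collections.ArrayList);
()
```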

10.7. Member reference

primary_expr ::=
primary_expr . IDENTIFIER

This expression refers to a field or method of the object represented by primary_expr.

10.8. Tuple constructor

primary_expr ::=
( expr { , expr }+ )

This expression creates a tuple from the given expressions, whose types may differ. The type of the tuple is type_1 * ... * type_n, where type_1 through type_n are the types of the corresponding expressions.

10.9. Indexer reference

primary_expr ::=
expr [ expr { , expr } ]

This expression refers to an indexed element (possibly with multiple indexes) of the object represented by the leftmost expr; the second and further expr are the index values of the element being referred to. The leftmost expr must refer to an indexable object.

11. Core Expressions

11.1. Primary expression

expr ::=

The value and type are the same as those of the primary_expr being referred to.

11.2. Function call

expr ::=

Calls a function with the given parameters. The type of the function call expression is the same as the type of the function's return value; that is, if the function's type is 'a -> 'b, then the type of the function call expression is 'b. The value of the whole expression is the return value of the function.

11.3. Assignment

expr ::=

Assigns a value to a variable. The left side of the assignment expression must evaluate to a mutable variable. The type of the assignment is always void.

11.4. Match expression

expr ::=
match ( expr ) { [ | ] match_case { | match_case } }

expr is matched sequentially against the patterns in the given match cases. If one of the patterns is consistent with the value of expr, then the corresponding computation branch of the match case is evaluated. Patterns in all the match cases must be of the same type. The expressions that form the computation branches in all the match cases must be of the same type as well. The type of the match expression is the same as the type of the computation branches.
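
A small sketch of a match expression over integer literals, using the case layout shown elsewhere in this document (| pattern => expr). The function name describe is hypothetical.

```nemerle
def describe (n : int) : string {
  match (n) {
    | 0 => "zero"
    | 1 => "one"
    | _ => "many"    // throw-away pattern: matches any other value
  }
};
describe (1)         // evaluates to "one"
```

All three patterns have type int and all three branches have type string, so the whole match expression has type string.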

11.5. Throw expression

expr ::=
throw expr

Throws the given exception. The expression must be of type System.Exception.

11.6. Try..catch expression

expr ::=
try expr catch { [ | ] try_catch_handler { | try_catch_handler } }

If the evaluation of expr does not throw any exception, then the result is that of the evaluation of expr. Otherwise, the runtime type of the exception which was thrown is compared against each type description in the handlers. The first matching handler is executed and its value returned. If none of the handlers matches, the exception is propagated. The type of the whole expression is the same as the type of the guarded expression. The value is the value of expr or of the executed handler. Consult the .NET specification if you want to know more about exceptions.

11.7. Try..finally expression

expr ::=
try expr finally expr

Evaluates the first expression and -- regardless of whether the evaluation has finished correctly or some exception has been thrown during the evaluation -- the second expression is evaluated. The value (and thus the type) of the whole expression is the value of the first expression.

11.8. Unary operator application

expr ::=
OPERATOR expr

11.9. Binary operator application

expr ::=
expr OPERATOR expr

11.10. Block expression

expr ::=

The value (and thus the type) of the whole expression is the value of the last expression in the sequence.

11.11. Array constructor

expr ::=
array [ [ { expr , } expr ] ]

Create an array consisting of given elements. All elements must be of the same type. If the elements are of the type 'a then the whole expression is of the type array ('a).
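
For instance, the fragment below builds an array of four integers and reads its first element through an indexer reference. The names primes and first are arbitrary; indexes start at zero, as for any .NET array.

```nemerle
def primes = array [2, 3, 5, 7];   // all elements are int, so the type is array (int)
def first = primes [0];            // indexer reference; first is 2
()
```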

11.12. Value definition

expr ::=
def pattern = expr

Defines the binding between the variables in the pattern and the value of the expression expr which will be known to all subsequent expressions in the current block.
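
Because the left side is a pattern rather than a single name, one definition can bind several variables at once. A short sketch with a tuple pattern (the names are arbitrary):

```nemerle
def pair = (7 / 2, 7 % 2);          // a tuple of type int * int
def (quotient, remainder) = pair;   // tuple pattern binds both members
// From here on, quotient is 3 and remainder is 1.
()
```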

11.13. Local function definition

expr ::=

Defines the functions which will be known to all subsequent expressions in the current block. Names of all defined functions are put into the symbol space before their bodies are parsed.

11.14. Mutable value definition

expr ::=
mutable IDENTIFIER <- expr

Defines a new variable, the value of which can be changed at any time using the assignment expression.

12. Secondary Expressions

This section describes expressions that are in fact just syntactic sugar over Core Expressions. We just present translation of Secondary Expressions into Core Expressions.

12.1. Conditional expression

expr ::=
if ( expr ) expr else expr

A standard branch: it evaluates and returns the value of the first expression if the condition evaluates to true, or of the second one otherwise.

Internally it is translated into

match (cond) {
  | true => expr1
  | false => expr2
}

12.2. While loop

expr ::=
while ( expr ) expr

A loop that executes the body expression as long as the condition is true. The condition is always checked before the body is executed, and if it evaluates to false, the loop ends. The body must be of type void.

A while loop is translated internally into the following code

def loop () {
  if (cond) 
    { body; loop () }
  else
    ()
};
loop ()
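
A usage sketch of a while loop driven by a mutable counter. It assumes that assignment is written with the reserved <- operator, the same arrow used in the mutable value definition (the assignment grammar rule itself is not reproduced in this document), and it uses printf as in the lambda example of section 12.5.

```nemerle
mutable i <- 0;
while (i < 3) {
  printf ("%d\n", i);   // prints 0, 1 and 2 on separate lines
  i <- i + 1            // assumed assignment syntax; has type void, as the body requires
};
()
```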

12.3. When expression

expr ::=
when ( expr ) expr

A version of the if expression with only one branch: the body is executed only when the condition is satisfied. If the condition is false, nothing is done (i.e. () is returned).

Its semantics is the same as

if (cond) body else ()

12.4. Unless expression

expr ::=
unless ( expr ) expr

The opposite of when. It executes and returns the value of the body only if the condition is not satisfied (i.e. evaluates to false).

Its semantics is the same as

if (cond) () else body
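
A short sketch showing when and unless side by side. The local function report is hypothetical, and printf is used as in the lambda example of section 12.5.

```nemerle
def report (x : int) : void {
  when (x < 0) printf ("negative\n");           // runs only if x < 0
  unless (x < 0) printf ("zero or positive\n")  // runs only if x >= 0
};
report (5)    // prints "zero or positive"
```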

12.5. Lambda expression

expr ::=

Lambda expressions can be thought of as anonymous local functions. This construct defines such a function and returns it as a functional value. This value can be used just like the name of a regular local function.

Example:

List.Iter (fun (x) { printf ("%d\n", x) }, intList)

is equivalent to

def tmpfunc (x) { printf ("%d\n", x) };
List.Iter (tmpfunc, intList)

Lambda expression is indeed translated internally to

expr ::=

where temporary_name is automatically created by compiler.

12.6. List constructor

expr ::=
[ [ { expr , } expr ] ]

[1, 2, 3] is translated to Cons (1, Cons (2, Cons (3, Nil ()))).

13. Expression helpers

This section describes some constructs used in Expressions section.

13.1. Sequence

sequence ::=
expr { ; expr } [ ; ]

Expressions in the sequence are evaluated sequentially, and the value (and thus the type) of the sequence is the value of the last expression in the sequence.

The values of all expressions except the last one are ignored, and thus a warning is generated if the type of any such expression is not void.

13.2. Block

block ::=
{ sequence }

This is just a standard execution of a sequence of expressions. The value (and type) of the block is the same as that of the last expression in the sequence.

block ::=

This syntax is a shortcut for matching the parameters of the function being defined against a given list of patterns. It is equivalent to making a tuple from the function's parameters and creating a match expression.

def f (p1, p2, p3) { 
  | (1, 3, "a") => 1
  | _ => 2
}

translates to

def f (p1, p2, p3) { 
  match ((p1, p2, p3)) {
    | (1, 3, "a") => 1
    | _ => 2
  }
}

Note also that when the function has only one parameter, the matching is done on this parameter itself (there are no one-element tuples).
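
For the single-parameter case, the shortcut and its expansion would look roughly as follows. The function names are hypothetical.

```nemerle
// Shortcut form: the cases are matched directly against n.
def is_zero (n) {
  | 0 => true
  | _ => false
};

// Equivalent explicit form: no tuple is built around the single parameter.
def is_zero_explicit (n) {
  match (n) {
    | 0 => true
    | _ => false
  }
};
is_zero (0)    // evaluates to true
```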

13.3. Try..catch handler

try_catch_handler ::=

13.4. Function parameter

parameter ::=
[ ref ] expr

parameter ::=
[ ref ] IDENTIFIER = expr

ref is used to denote a parameter passed by reference. This is not implemented yet, but will have semantics similar to those in C#.

13.5. Match case

guarded_pattern ::=
pattern [ when expr ]

A guarded pattern requires the expression expr to be of type bool. A given expression e satisfies the guarded pattern only if it matches pattern and expr evaluates to true.

match_case ::=

Given some expression e this expression satisfies this match case if and only if it satisfies one of the guarded patterns in this match case.
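
A sketch of a guard in use. It assumes that a bare identifier in pattern position binds the matched value to that name, as in value definitions; the function name sign is hypothetical.

```nemerle
def sign (n : int) : int {
  match (n) {
    | 0 => 0
    | x when x > 0 => 1    // the pattern x matches anything; the guard narrows it to positives
    | _ => -1              // reached only for negative values
  }
};
sign (-7)                  // evaluates to -1
```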

14. Patterns

Patterns are a way of accessing data structures, especially trees. Patterns can match values. The definition of what it means to match is given with each pattern construct. However, the main idea behind patterns is that they match values that look like them.

Patterns are used in match expressions and value definitions.

14.1. Constructor pattern

pattern ::=

The identifier should refer to the name of a variant option. This pattern matches a value iff the value is the specified variant option and the sub-pattern matches the variant option's payload.

14.2. Throw-away pattern

pattern ::=
_

This pattern matches any value.

14.3. Record pattern

pattern ::=
{ IDENTIFIER = pattern { ; IDENTIFIER = pattern } [ ; ] }

This pattern matches a value of a class that has all the specified fields (this is checked statically), provided the value of each field matches the respective pattern.

14.4. As binding

pattern ::=
( pattern ) as IDENTIFIER

This pattern matches the same values as the enclosed pattern does. In addition, the value matched by the enclosed pattern is bound to the specified variable, which can be used in the when guard or the match body.
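
As a sketch, the function below takes a list apart with the :: pattern of section 14.6 while also keeping a name for the whole list. The function name head_and_all is hypothetical, and parameter types are omitted, as is permitted for local functions.

```nemerle
def head_and_all (l) {
  match (l) {
    | (x :: _) as whole => (x, whole)   // x is the first element, whole is the entire list
    | [] => (0, [])                     // empty list: return a default pair
  }
};
head_and_all ([4, 5, 6])                // evaluates to (4, [4, 5, 6])
```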

14.5. Tuple pattern

pattern ::=
( pattern { , pattern } )

This pattern matches a tuple with the specified contents (each tuple member is matched by the respective pattern).

In addition, when a tuple pattern is seen where a record pattern would otherwise be expected, the tuple pattern is transformed into a record pattern by adding field identifiers in the order they appear in the definition of the given class. A tuple pattern transformed into a record pattern cannot match fields inherited from the base class.

14.6. List constructor pattern

pattern ::=

The following two lines are equivalent:

pattern1 :: pattern2
Cons (pattern1, pattern2)
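
The :: pattern is the usual way to walk a list recursively. A minimal sketch, with a hypothetical local function sum whose parameter type is left out (allowed for local functions); the empty-list case uses the list literal pattern of the next section.

```nemerle
def sum (l) {
  match (l) {
    | x :: xs => x + sum (xs)   // head plus the sum of the tail
    | [] => 0                   // empty list
  }
};
sum ([1, 2, 3])                 // evaluates to 6
```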

14.7. List literal pattern

pattern ::=
[ [ { pattern , } pattern [ , ] ] ]

The following are equivalent:

[ pattern1 , pattern2 , ... , patternN ]
Cons (pattern1, Cons (pattern2, ... Cons (patternN, Nil) ... ))

14.8. Literal pattern

pattern ::=

This pattern matches the specified constant value.

15. Macros

Please refer to macros.html for now.