Specification Syntax

Not logged in

Arend's specification syntax is based on Prolog, but is more flexible, with fewer predefined operators.

Lexical syntax

Arend's lexical syntax is largely defined in terms of Unicode character categories. Only a few characters are treated specially.

Whitespace is ignored, but required to separate some tokens. Comments normally count as whitespace.

Comments come in two forms:

// Single-line comment
/* Multi-line comment */

Tokens are broken into three groups:

A raw token is any other type of token enclosed in backquotes. A raw token is exactly equivalent to the unquoted token, except that for symbolic tokens it is not assumed to be infix. Thus, the following are equivalent (assuming the term-based parsing described below):

a + b
`+`(a,b)

Grammatical syntax

Although Arend's specification language has only a handful of built-in operations, it's syntax includes support for a number of common extensions, under the assumption that specifications will want to use them in their natural forms. Thus, e.g., arithmetic operators are parsed according to the usual precedence, even though there are no built-in arithmetic operations.

The specification parser outputs a list of terms, from the terms module. Atomic and variable tokens are translated into Atoms and Variables, respectively. Compound terms of the form head(args...) or the infix form lhs OP rhs are translated into Compounds; for infix operators, the OP becomes the head.

Bracket-enclosed lists are parsed specially. The internal syntax for lists is a:b:c:[], being equivalent to [a,b,c]. The elements of the list can be any term.

Curly-braces are currently reserved for future extension, as are single- and double-quoted strings.

The grammatical syntax of Arend's specification language is roughly

specification = definition* 

definition = head ":-" body "."
           / head "."

head = term

body = term

term = atomicTerm
     / infixTerm
     / prefixTerm
     / listTerm
     / "(" term ")"

atomicTerm = atom

infixTerm = term OP term   // oversimplification

termList = (term ("," term)*)?

prefixTerm = atom "(" termList ")"

listTerm = "[" termList "]"

Note that, due to the ambiguity with ,, infix terms with , as their head inside lists must either be enclosed in parentheses or use the raw form of ,.

Operator precedence

Infix operators are parsed according to the following precedence table, from lowest to highest. The second column lists associativity.

Operator          Assoc.  Description
================= ======= ==============================
h(...)   (t)              Compounds, parenthesized terms

- !                       Prefix minus, not
^                 right   Exponentiation
* / %             left    Multiplication, division, mod
+ -               left    Addition, subtraction
< <= > >= == !=   non     Comparison
:                 right   Cons

<user-defined infix operators>

=                 other* Unification
, &               r,l    Conjunction
| ;               left   Disjunction
:-                left   Predicate definition

Note that in the first group of operators, the descriptions are just the "standard" interpretations. These operators have no predefined meaning in Arend. The second group of operators includes those which have some semantic meaning in Arend.

Associativity of unification: A = B = C is parsed as A = B, B = C, except that the entire unification is setup all at once, so it's slightly more efficient.

Also note that while , is right associative, & is left associative.

Note that, in its current state, Arend's grammar requires memoization to be enabled in order to avoid exponential parsing time in some cases. Hopefully this will be remedied in the future.