The Fuchsia shell syntax is defined as a Parsing Expression Grammar (PEG). This means alternation in the grammar is explicitly ordered: A ← B | C will always match B if it can, and tries C only if B fails. By convention, we write A ← B / C to make it explicit that alternation is sequential.
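To make the ordered-choice rule concrete, here is a minimal sketch in Python (used only for illustration; it is not part of the shell). The helper names `literal` and `choice` are hypothetical. It shows that swapping the order of two alternatives changes how much input is consumed.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]

def literal(s: str) -> Parser:
    """Match the exact string s at the current position."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def choice(*alternatives: Parser) -> Parser:
    """Ordered choice: the first alternative that matches wins."""
    def parse(text: str, pos: int) -> Optional[int]:
        for alt in alternatives:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse

short_first = choice(literal("a"), literal("ab"))  # A ← 'a' / 'ab'
long_first = choice(literal("ab"), literal("a"))   # A ← 'ab' / 'a'

print(short_first("ab", 0))  # 1
print(long_first("ab", 0))   # 2
```

With `'a'` listed first, the second alternative is never tried on the input `ab`, even though it would match more text.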
We will use the following syntax to specify our grammar in this document:
Literals are written in single quotes, as in 'while' or '|>'.
Nonterminals are written as a single capitalized word, as in Expression or FunctionBody.
⊔ indicates a sequence of one or more whitespace characters. See below.
Terms are suffixed with * to indicate zero or more repetitions, + to indicate one or more repetitions, ? to indicate zero or one occurrences, {m,n} to indicate between m and n repetitions, and {n} to indicate exactly n repetitions.
Terms separated by / are sequential alternatives.
A term prefixed by & is a zero-length match. It will not consume the text it matches, and terms after it will attempt to match at the same position the zero-length term did.
A term prefixed by ! is an inverse match. This behaves as a zero-length match, but parsing fails if matching this term succeeds and vice versa.
Terms separated by ∩ are intersected terms. We define the intersected term as a term which matches the longest possible string which matches both terms. By convention, the parse tree yielded from this operation is assumed to be the parse tree of the right-hand operand term.
[ and ] delineate a character match block, as would appear in a Perl-compatible regular expression.
<nl> indicates the newline character.
. is a term which matches any single character.
Productions are indicated by ←, as in Addition ← Multiplication '+' Multiplication.
←⊔ indicates a production where between each term, and each subterm in grouped sequences, the term ⊔? is present, but has been elided for clarity. More plainly, ←⊔ indicates a term which is whitespace-insensitive.
We will assume our input is a stream of UTF-8 characters.
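The & and ! predicates behave much like zero-width lookahead assertions in regular expressions. The following sketch uses Python's `re` module as a stand-in; the patterns and the helper `end_of_match` are illustrative only.

```python
import re

# (?=...) plays the role of '&': "bar" must follow, but it is not consumed.
AND_PREDICATE = re.compile(r"foo(?=bar)")
# (?!...) plays the role of '!': the match fails if "bar" follows.
NOT_PREDICATE = re.compile(r"foo(?!bar)")

def end_of_match(pattern, text):
    """Return the position where the match ends, or None if there is no match."""
    m = pattern.match(text)
    return None if m is None else m.end()

print(end_of_match(AND_PREDICATE, "foobar"))  # 3
print(end_of_match(NOT_PREDICATE, "foobar"))  # None
print(end_of_match(NOT_PREDICATE, "foobaz"))  # 3
```

In both successful cases the match ends at position 3, so whatever follows `foo` is still available to the next term.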
We define Whitespace as follows:
⊔ ← '#' (!<nl> .)* <nl> / AnyUnicodeWhitespace+
Where AnyUnicodeWhitespace is any single character classified as whitespace by the Unicode standard. (NOTE: The current parser only recognizes space, newline, carriage return, and tab.)
Note that our comment syntax is embedded in our whitespace definition:
# This line will parse entirely as whitespace.
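As a rough illustration, the ⊔ production can be approximated with a regular expression. This sketch follows the note above and treats only space, newline, carriage return, and tab as whitespace; the name `match_whitespace` is hypothetical.

```python
import re
from typing import Optional

# ⊔ ← '#' (!<nl> .)* <nl> / AnyUnicodeWhitespace+
# Restricted to the four whitespace characters named in the note.
WHITESPACE = re.compile(r"#[^\n]*\n|[ \n\r\t]+")

def match_whitespace(text: str, pos: int = 0) -> Optional[int]:
    """Return the position after one ⊔ match at pos, or None if nothing matches."""
    m = WHITESPACE.match(text, pos)
    return None if m is None else m.end()

comment = "# This line will parse entirely as whitespace.\n"
print(match_whitespace(comment) == len(comment))  # True
print(match_whitespace("  \tfoo"))                # 3
print(match_whitespace("foo"))                    # None
```

Under this reading of the production, a comment must end with a newline, so a `#` comment on the last line of input with no trailing newline does not match.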
Identifiers are defined as follows:
UnescapedIdentifier ← [a-zA-Z0-9_]+
Identifier ← ![0-9] UnescapedIdentifier
Valid identifiers might include:
foo
item_0
a_Mixed_Bag
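The Identifier production translates almost directly into a regular expression, with the ![0-9] predicate becoming a negative lookahead. A minimal sketch in Python (the function name `is_identifier` is hypothetical):

```python
import re

# Identifier ← ![0-9] UnescapedIdentifier
IDENTIFIER = re.compile(r"(?![0-9])[a-zA-Z0-9_]+")

def is_identifier(text: str) -> bool:
    """True if the whole string is a valid Identifier."""
    return IDENTIFIER.fullmatch(text) is not None

print(is_identifier("item_0"))  # True
print(is_identifier("0item"))   # False
```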
Integers are defined as follows:
Digit ← [0-9]
HexDigit ← [a-fA-F0-9]
DecimalInteger ← '0' !Digit / !'0' Digit+ ( '_' Digit+ )*
HexInteger ← '0x' HexDigit+ ( '_' HexDigit+ )*
Integer ← DecimalInteger / HexInteger
Valid integers might include:
0
12345
12_345
0x1234abcd
0x12_abcd
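The integer productions can be sketched as a single regular expression. Note that this sketch tries the hexadecimal alternative first: under ordered choice, an alternative that accepts a lone 0 would otherwise succeed on the leading 0 of 0x1234abcd and leave the rest unparsed. That ordering, the function name `parse_integer`, and the conversion to a Python `int` are choices made for this illustration.

```python
import re
from typing import Optional, Tuple

INTEGER = re.compile(
    r"0x[0-9a-fA-F]+(?:_[0-9a-fA-F]+)*"  # HexInteger
    r"|0(?![0-9])"                       # '0' !Digit
    r"|(?!0)[0-9]+(?:_[0-9]+)*"          # !'0' Digit+ ( '_' Digit+ )*
)

def parse_integer(text: str, pos: int = 0) -> Optional[Tuple[int, int]]:
    """Return (value, end position) for an Integer at pos, or None."""
    m = INTEGER.match(text, pos)
    if m is None:
        return None
    # Underscores are digit separators only; drop them before converting.
    return int(m.group().replace("_", ""), 0), m.end()

print(parse_integer("12_345"))     # (12345, 6)
print(parse_integer("0x12_abcd"))  # (1223629, 9)
print(parse_integer("0123"))       # None
```

The last call returns None because a decimal integer may not begin with 0 unless it is exactly 0.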
Strings are defined as follows:
EscapeSequence ← '\n' / '\t' / '\r' / '\' <nl> / '\\' / '\"' / '\u' HexDigit{6}
StringEntity ← !( '\' / '"' / <nl> ) . / EscapeSequence
NormalString ← '"' StringEntity* '"'
String ← NormalString / MultiString
TODO: Define MultiString
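The NormalString production above can be approximated with a regular expression. This sketch only recognizes strings; it does not decode escape sequences, since their runtime meaning is not specified here. The name `is_normal_string` is hypothetical.

```python
import re

NORMAL_STRING = re.compile(
    r'"(?:'
    r'[^\\"\n]'            # any character except backslash, quote, or newline
    r'|\\[ntr\n\\"]'       # \n, \t, \r, backslash-newline, \\ and \"
    r'|\\u[0-9a-fA-F]{6}'  # \u followed by exactly six hex digits
    r')*"'
)

def is_normal_string(text: str) -> bool:
    """True if the whole text is a valid NormalString."""
    return NORMAL_STRING.fullmatch(text) is not None

print(is_normal_string(r'"A code point\u00264b"'))  # True
print(is_normal_string(r'"Too short\u26"'))         # False
print(is_normal_string('"line one \\\nline two"'))  # True
```

The third call passes a backslash followed by a real newline, which the grammar accepts as an escape sequence; a bare newline inside the quotes is rejected.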
Valid strings might include:
"The quick brown fox jumped over the lazy dog."
"A newline.\nA tab\tA code point\u00264b"
"String starts here \
and keeps on going"
Paths are defined as follows:
PathCharacter ← ![`&;|/\()[]{}] .
PathElement ← PathCharacter+ / '\' . / '`' ( !'`' . )* '`'
RootPath ← ( '/' PathElement+ )+
Path ← '.'? RootPath '/'? / '.'? '/' / '.'
Valid paths might include:
/foo
/foo/bar
/foo/bar/
./foo/bar/
./
/
.
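The path productions compose naturally, so a sketch can build one regular expression from the pieces. The variable names mirror the productions; `is_path` is a hypothetical helper. The pattern is compiled with `re.DOTALL` so that `.` matches any character, as it does in the grammar.

```python
import re

PATH_CHAR = r"[^`&;|/\\()\[\]{}]"                 # PathCharacter
PATH_ELEMENT = rf"(?:{PATH_CHAR}+|\\.|`[^`]*`)"   # PathElement
ROOT_PATH = rf"(?:/{PATH_ELEMENT}+)+"             # RootPath
PATH = re.compile(rf"\.?{ROOT_PATH}/?|\.?/|\.", re.DOTALL)

def is_path(text: str) -> bool:
    """True if the whole text is a valid Path."""
    return PATH.fullmatch(text) is not None

for candidate in ["/foo/bar/", "./", ".", "foo", "//"]:
    print(candidate, is_path(candidate))
```

The first three candidates are accepted. `foo` is rejected because every non-trivial path must contain a leading `/`, and `//` is rejected because each `/` in a RootPath must be followed by at least one PathElement.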
Variable declarations are defined as follows:
KWVar ← 'var' !IdentifierCharacter
KWConst ← 'const' !IdentifierCharacter
VariableDecl ←⊔ ( KWVar / KWConst ) Identifier '=' Expression
Valid variable declarations might include:
var foo = 4
const foo = "Ham Sandwich"
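The !IdentifierCharacter predicate is what keeps a name such as `variable` from being read as the keyword `var` followed by `iable`. The sketch below matches only the head of a declaration (keyword, identifier, and `=`) and returns the position where the Expression would begin. It assumes IdentifierCharacter means [a-zA-Z0-9_], which this document does not define, and `parse_decl_head` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# ⊔? as elided by ←⊔: one optional comment or whitespace run between terms.
WS = r"(?:#[^\n]*\n|[ \n\r\t]+)?"

DECL_HEAD = re.compile(
    r"(?P<kind>var|const)(?![a-zA-Z0-9_])"        # KWVar / KWConst (assumed IdentifierCharacter)
    rf"{WS}(?P<name>(?![0-9])[a-zA-Z0-9_]+)"      # Identifier
    rf"{WS}="                                     # '='
)

def parse_decl_head(text: str, pos: int = 0) -> Optional[Tuple[str, str, int]]:
    """Return (keyword, identifier, position after '=') or None."""
    m = DECL_HEAD.match(text, pos)
    if m is None:
        return None
    return m.group("kind"), m.group("name"), m.end()

print(parse_decl_head("var foo = 4"))   # ('var', 'foo', 9)
print(parse_decl_head("variable = 4"))  # None
```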
Object literals are defined as follows:
Object ←⊔ '{' ObjectBody? '}'
ObjectBody ←⊔ Field ( ',' Field )* ','?
Field ←⊔ ( NormalString / Identifier ) ':' SimpleExpression
Valid object literals might include:
{}
{ foo: 6, "bar & grill": "Open now" }
{ foo: { bar: 6 }, "bar & grill": "Open now" }
Addition is defined as follows:
Addition ← Value ( [+-] Value )*
It looks as you'd expect:
a + b
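A repetition such as Value ( [+-] Value )* is typically parsed with a loop. The sketch below is a simplified illustration with several assumptions: Value is limited to identifiers and decimal integers, whitespace around the operator is skipped so that the example `a + b` parses, and the result is folded into a left-nested tuple. None of these choices are dictated by the grammar, and `parse_addition` is a hypothetical name.

```python
import re

# Simplified Value: an Identifier or a DecimalInteger (assumption of this sketch).
ATOM = re.compile(r"(?![0-9])[a-zA-Z0-9_]+|0(?![0-9])|(?!0)[0-9]+(?:_[0-9]+)*")
WS = re.compile(r"[ \n\r\t]*")

def parse_addition(text, pos=0):
    """Parse Value ( [+-] Value )*; return (tree, end position) or None."""
    m = ATOM.match(text, pos)
    if m is None:
        return None
    tree, pos = m.group(), m.end()
    while True:
        op_pos = WS.match(text, pos).end()
        if op_pos >= len(text) or text[op_pos] not in "+-":
            return tree, pos
        m = ATOM.match(text, WS.match(text, op_pos + 1).end())
        if m is None:
            # The repetition stops; the dangling operator is not consumed.
            return tree, pos
        tree, pos = (text[op_pos], tree, m.group()), m.end()

print(parse_addition("a + b"))      # (('+', 'a', 'b'), 5)
print(parse_addition("a + b - 3"))  # (('-', ('+', 'a', 'b'), '3'), 9)
```

When an operator is not followed by a value, as in `a +`, the loop ends and only `a` is consumed, which mirrors how a failed iteration of a PEG repetition leaves the input position unchanged.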
Values are defined as follows:
Value ← Object / Atom
Atom ← Identifier / String / Real / Integer / Path
Expressions are defined as follows:
Expression ← Addition
A program is defined as:
Program ←⊔ VariableDecl ([;&] Program)?