Both the table and a version of the algorithm that prints hitting productions, can be coded in F# as follows
Discriminating unions
type Token = | Imp | Biimp | Or | And | Not | LPar | RPar | Atom of string | Dollar type Action = | Reduce of int | Shift of int | Goto of int | Error | AcceptTables (where actionrow2dict and gotorow2dict converts the input array to a dict with the proper index keys matching the original table)
let actionTable = [| actionrow2dict [|Error;Error;Error;Error;Shift 5;Shift 4;Error;Shift 3;Error|] actionrow2dict [|Reduce 6;Reduce 6;Reduce 6;Reduce 6;Error;Error;Reduce 6;Error;Reduce 6|] actionrow2dict [|Shift 8;Shift 7;Shift 9;Shift 6;Error;Error;Error;Error;Accept|] actionrow2dict [|Reduce 8;Reduce 8;Reduce 8;Reduce 8;Error;Error;Reduce 8;Error;Reduce 8|] actionrow2dict [|Error;Error;Error;Error;Shift 5;Shift 4;Error;Shift 3;Error|] actionrow2dict [|Error;Error;Error;Error;Shift 5;Shift 4;Error;Shift 3;Error|] actionrow2dict [|Error;Error;Error;Error;Shift 5;Shift 4;Error;Shift 3;Error|] actionrow2dict [|Error;Error;Error;Error;Shift 5;Shift 4;Error;Shift 3;Error|] actionrow2dict [|Error;Error;Error;Error;Shift 5;Shift 4;Error;Shift 3;Error|] actionrow2dict [|Error;Error;Error;Error;Shift 5;Shift 4;Error;Shift 3;Error|] actionrow2dict [|Shift 8;Shift 7;Shift 9;Shift 6;Error;Error;Shift 16;Error;Error|] actionrow2dict [|Reduce 5;Reduce 5;Reduce 5;Reduce 5;Error;Error;Reduce 5;Error;Reduce 5|] actionrow2dict [|Reduce 4;Reduce 4;Reduce 4;Reduce 4;Error;Error;Reduce 4;Error;Reduce 4|] actionrow2dict [|Shift 8;Shift 7;Shift 9;Shift 6;Error;Error;Reduce 2;Error;Reduce 2|] actionrow2dict [|Shift 8;Shift 7;Shift 9;Shift 6;Error;Error;Reduce 1;Error;Reduce 1|] actionrow2dict [|Reduce 3;Reduce 3;Reduce 3;Reduce 3;Error;Error;Reduce 3;Error;Reduce 3|] actionrow2dict [|Reduce 7;Reduce 7;Reduce 7;Reduce 7;Error;Error;Reduce 7;Error;Reduce 7|] |] let gotoTable = [| gotorow2dict [|Some 2;Some 1|] gotorow2dict [|None;None|] gotorow2dict [|None;None|] gotorow2dict [|None;None|] gotorow2dict [|Some 10;Some 1|] gotorow2dict [|Some 11;Some 1|] gotorow2dict [|Some 12;Some 1|] gotorow2dict [|Some 13;Some 1|] gotorow2dict [|Some 14;Some 1|] gotorow2dict [|Some 15;Some 1|] gotorow2dict [|None;None|] gotorow2dict [|None;None|] gotorow2dict [|None;None|] gotorow2dict [|None;None|] gotorow2dict [|None;None|] gotorow2dict [|None;None|] gotorow2dict [|None;None|] |]The parser
let parser tokens = let tLen = Array.length tokens let popOne = function | s::stack -> (s,stack) | _ -> failwith "parser: popOne error" let popN n stack = let rec exec n0 acc = function | stack when n0 = n -> (acc,stack) | s::stack -> exec (n0 + 1) (acc @ [s]) stack | [] -> failwith "parser: popN error" exec 0 [] stack let pushGoto stack = function | Some g -> g::stack | None -> stack let getNextFromInput i = if i < tLen then (i + 1,tokens.[i+1]) else failwith "parser: getNextFromInputError" let removeAtomName = function | Atom _ -> Atom "" | t -> t let rec exec (i,a) sStack = let (s,_) = popOne sStack match actionTable.[s].[removeAtomName a] with | Shift t -> let newStack = t::sStack let (i,a) = getNextFromInput i exec (i,a) newStack | Reduce r -> let (prod,rSide) = productions.[r] let betaLen = Array.length rSide let (_,newStack) = popN betaLen sStack let (t,_) = popOne newStack let newStack = pushGoto newStack gotoTable.[t].[prod] printfn "prodId = %d: %s -> %A" r prod rSide exec (i,a) newStack | Accept -> printfn "parsing done" printfn "stack : %A" sStack () | _ -> failwith "parser: error case reached" exec (0,tokens.[0]) [0]The cool thing is that this parser is bottom-up, which means that for some input the leaves are parsed first. This cooperates very well with functional programming since we can create a new node with two already calculated notes. We do not have to mutate the tree.
We can insert error messages into the table as follows
DFA state | Actions | GOTO | |||||||||
imp | biimp | or | and | not | lpar | rpar | atom | $ | Exp | Atom | |
s0 | e4 | e4 | e4 | e4 | s5 | s4 | e3 | s3 | e2 | g2 | g1 |
s1* | r6 | r6 | r6 | r6 | e4 | e4 | r6 | e4 | r6 | ||
s2* | s8 | s7 | s9 | s6 | e4 ... | accept | |||||
s3* | r8 | r8 | r8 | r8 | e4 | e4 | r8 | e4 | r8 | ||
s4 | e4 | e4 | e4 | e4 | s5 | s4 | e3 | s3 | e2 | g10 | g1 |
s5 | e4 | e4 | e4 | e4 | s5 | s4 | e3 | s3 | e2 | g11 | g1 |
s6 | e4 | e4 | e4 | e4 | s5 | s4 | e3 | s3 | e2 | g12 | g1 |
s7 | e4 | e4 | e4 | e4 | s5 | s4 | e3 | s3 | e2 | g13 | g1 |
s8 | e4 | e4 | e4 | e4 | s5 | s4 | e3 | s3 | e2 | g14 | g1 |
s9 | e4 | e4 | e4 | e4 | s5 | s4 | e3 | s3 | e2 | g15 | g1 |
s10 | s8 | s7 | s9 | s6 | s16 | e1 | |||||
s11* | r5 | r5 | r5 | r5 | e4 | e4 | r5 | e4 | r5 | ||
s12* | r4 | r4 | r4 | r4 | e4 | e4 | r4 | e4 | r4 | ||
s13* | s8 | s7 | s9 | s6 | e4 | e4 | r2 | e4 | r2 | ||
s14* | s8 | s7 | s9 | s6 | e4 | e4 | r1 | e4 | r1 | ||
s15* | r3 | r3 | r3 | r3 | e4 | e4 | r3 | e4 | r3 | ||
s16* | r7 | r7 | r7 | r7 | e4 | e4 | r7 | e4 | r7 |
Where
The error system is quite intuitive. Look up s1. Here we see that it has actions on imp, biimp, or, and, rpar all of which in our syntax follows an atom.
For this to work we have to change the type Error to Error of string.