
Syntax

25.08.2018


We need a parser. I have developed a liking for SLR parsers, so let's make one of those. First we need a set of tokens:

int = [0-9]+
float = [0-9]*,[0-9]+
id = [a-zA-Z]+
plus = +
minus = -
times = *
divide = /
power = ^
lpar = (
rpar = )
comma = ,

We need the id token for built-in functions such as sqrt.
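As a rough illustration of how these token definitions could be turned into a lexer, here is a minimal Python sketch. It is not the code used in this project. The names `TOKEN_SPEC`, `MASTER` and `tokenize` are made up for the example, and two things are assumptions: whitespace is skipped, and float is tried before int, so that `2,5` becomes one float rather than int, comma, int.

```python
import re

# Token definitions from the list above. Order matters: float is tried
# before int (an assumption of this sketch), and whitespace is skipped.
TOKEN_SPEC = [
    ("float", r"[0-9]*,[0-9]+"),
    ("int", r"[0-9]+"),
    ("id", r"[a-zA-Z]+"),
    ("plus", r"\+"),
    ("minus", r"-"),
    ("times", r"\*"),
    ("divide", r"/"),
    ("power", r"\^"),
    ("lpar", r"\("),
    ("rpar", r"\)"),
    ("comma", r","),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs, or raise on bad input."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("sqrt(2,5) + 3"))
```

Because the decimal separator and the argument separator are both a comma, a lexer like this reads `2,5` inside a call as a single float.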

Since I have already built a parser for propositional logic, Prop2Table, which is very similar to this one, I will skip a lot of the details. Let us build the grammar:

Exp \rightarrow Exp plus Exp
Exp \rightarrow Exp minus Exp
Exp \rightarrow Exp times Exp
Exp \rightarrow Exp divide Exp
Exp \rightarrow Exp power Exp
Exp \rightarrow minus Exp
Exp \rightarrow Call
Exp \rightarrow Atom
Call \rightarrow id lpar Args rpar
Args \rightarrow Arg Args' | \epsilon
Args' \rightarrow comma Arg Args' | \epsilon
Arg \rightarrow Exp
Atom \rightarrow lpar Exp rpar
Atom \rightarrow float
Atom \rightarrow int

With this grammar we can nest calls. That's pretty decent, I think. As we have done before, we first construct an NFA to use with the nfa2dfa converter. To do this we add a start production and number all the productions (thus obtaining the extended grammar).

#Transitions
#[0] Exp' -> Exp
start (A) -Exp>(B)
((B))
#[1] Exp -> Exp plus Exp
(C) -Exp>(D)
(D) -plus>(E)
(E) -Exp>(F)
((F))
#[2] Exp -> Exp minus Exp
(G) -Exp>(H)
(H) -minus>(I)
(I) -Exp>(J)
((J))
#[3] Exp -> Exp times Exp
(K) -Exp>(L)
(L) -times>(M)
(M) -Exp>(N)
((N))
#[4] Exp -> Exp divide Exp
(O) -Exp>(P)
(P) -divide>(Q)
(Q) -Exp>(R)
((R))
#[5] Exp -> Exp power Exp
(S) -Exp>(T)
(T) -power>(U)
(U) -Exp>(V)
((V))
#[6] Exp -> Call
(A1) -Call>(B1)
((B1))
#[7] Exp -> Atom
(C1) -Atom>(D1)
((D1))
#[8] Exp -> minus Exp
(F1) -minus>(G1)
(G1) -Exp>(H1)
((H1))
#[9] Call -> id lpar Args rpar
(I1) -id>(J1)
(J1) -lpar>(K1)
(K1) -Args>(L1)
(L1) -rpar>(M1)
((M1))
#[10] Args -> Arg Args'
(O1) -Arg>(P1)
(P1) -Args'>(Q1)
((Q1))
#[11] Args ->
((R1))
#[12] Args' -> comma Arg Args'
(S1) -comma>(T1)
(T1) -Arg>(U1)
(U1) -Args'>(W1)
((W1))
#[13] Args' ->
((A2))
#[14] Arg -> Exp
(B2) -Exp>(C2)
((C2))
#[15] Atom -> lpar Exp rpar
(D2) -lpar>(E2)
(E2) -Exp>(F2)
(F2) -rpar>(G2)
((G2))
#[16] Atom -> float
(H2) -float>(I2)
((I2))
#[17] Atom -> int
(J2) -int>(K2)
((K2))

We add epsilon transitions from every state that has a transition on a non-terminal to the first state of every production of that non-terminal:

#added epsilon
#[0] Exp' -> Exp
start (A) -Exp>(B) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
((B))
#[1] Exp -> Exp plus Exp
(C) -Exp>(D) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
(D) -plus>(E)
(E) -Exp>(F) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
((F))
#[2] Exp -> Exp minus Exp
(G) -Exp>(H) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
(H) -minus>(I)
(I) -Exp>(J) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
((J))
#[3] Exp -> Exp times Exp
(K) -Exp>(L) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
(L) -times>(M)
(M) -Exp>(N) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
((N))
#[4] Exp -> Exp divide Exp
(O) -Exp>(P) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
(P) -divide>(Q)
(Q) -Exp>(R) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
((R))
#[5] Exp -> Exp power Exp
(S) -Exp>(T) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
(T) -power>(U)
(U) -Exp>(V) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
((V))
#[6] Exp -> Call
(A1) -Call>(B1) -eps>(I1)
((B1))
#[7] Exp -> Atom
(C1) -Atom>(D1) -eps>(D2) -eps>(H2) -eps>(J2)
((D1))
#[8] Exp -> minus Exp
(F1) -minus>(G1)
(G1) -Exp>(H1) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
((H1))
#[9] Call -> id lpar Args rpar
(I1) -id>(J1)
(J1) -lpar>(K1)
(K1) -Args>(L1) -eps>(O1) -eps>(R1)
(L1) -rpar>(M1)
((M1))
#[10] Args -> Arg Args'
(O1) -Arg>(P1) -eps>(B2)
(P1) -Args'>(Q1) -eps>(S1) -eps>(A2)
((Q1))
#[11] Args ->
((R1))
#[12] Args' -> comma Arg Args'
(S1) -comma>(T1)
(T1) -Arg>(U1) -eps>(B2)
(U1) -Args'>(W1) -eps>(S1) -eps>(A2)
((W1))
#[13] Args' ->
((A2))
#[14] Arg -> Exp
(B2) -Exp>(C2) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
((C2))
#[15] Atom -> lpar Exp rpar
(D2) -lpar>(E2)
(E2) -Exp>(F2) -eps>(C) -eps>(G) -eps>(K) -eps>(O) -eps>(S) -eps>(A1) -eps>(C1) -eps>(F1)
(F2) -rpar>(G2)
((G2))
#[16] Atom -> float
(H2) -float>(I2)
((I2))
#[17] Atom -> int
(J2) -int>(K2)
((K2))
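The nfa2dfa converter itself is not shown here. A standard subset construction begins by taking the epsilon closure of the start state, and the following Python sketch shows that step on a fragment of the NFA. The function and dictionary names are made up for the example, only the edges reachable from (A) are included, and (C1) is assumed to point at the first states of the three Atom productions, (D2), (H2) and (J2).

```python
def epsilon_closure(states, eps_edges):
    """Return every state reachable from `states` using only epsilon edges."""
    closure = set(states)
    worklist = list(states)
    while worklist:
        state = worklist.pop()
        for target in eps_edges.get(state, ()):
            if target not in closure:
                closure.add(target)
                worklist.append(target)
    return closure


# First states of the eight Exp productions.
EXP_STARTS = ["C", "G", "K", "O", "S", "A1", "C1", "F1"]

# Fragment of the epsilon edges: only what is reachable from the start state.
EPS_EDGES = {
    "A": EXP_STARTS,
    "C": EXP_STARTS,
    "G": EXP_STARTS,
    "K": EXP_STARTS,
    "O": EXP_STARTS,
    "S": EXP_STARTS,
    "A1": ["I1"],
    "C1": ["D2", "H2", "J2"],  # assumption: first states of the Atom productions
}

# Terminal edges leaving states of that closure (state -> terminal).
TERMINAL_EDGES = {"F1": "minus", "I1": "id", "D2": "lpar", "H2": "float", "J2": "int"}

start_closure = epsilon_closure({"A"}, EPS_EDGES)
shiftable = sorted(TERMINAL_EDGES[s] for s in start_closure if s in TERMINAL_EDGES)
print(sorted(start_closure))
print(shiftable)
```

Under these assumptions the initial DFA state can shift on five terminals: float, id, int, lpar and minus.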

We run the NFA through the converter to obtain the shift actions. Before we create the tables, let us decide on precedence and associativity. We get:

precedence:
plus:1
minus:1
times:2
divide:2
power:3

associativity:
plus:left
minus:left
times:left
divide:left
power:right
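One common way to use such a table is to settle each shift/reduce conflict between a binary operator already on the stack and the operator in the lookahead. The Python sketch below shows that rule. It is an illustration, not the project's code: the name `resolve` is made up, and unary minus is left out because the table above does not give it a precedence.

```python
PRECEDENCE = {"plus": 1, "minus": 1, "times": 2, "divide": 2, "power": 3}
ASSOCIATIVITY = {
    "plus": "left",
    "minus": "left",
    "times": "left",
    "divide": "left",
    "power": "right",
}


def resolve(stack_op, lookahead_op):
    """Choose between shifting lookahead_op and reducing Exp stack_op Exp."""
    if PRECEDENCE[lookahead_op] > PRECEDENCE[stack_op]:
        return "shift"  # the lookahead binds tighter, so keep reading
    if PRECEDENCE[lookahead_op] < PRECEDENCE[stack_op]:
        return "reduce"  # the operator on the stack binds tighter
    # Equal precedence: left-associative operators reduce, right-associative shift.
    return "reduce" if ASSOCIATIVITY[stack_op] == "left" else "shift"


print(resolve("plus", "times"))
print(resolve("power", "power"))
```

With `1 + 2 * 3` the parser shifts on times after seeing `1 + 2`, while with `2 ^ 3 ^ 4` it shifts on the second power, which gives the right-associative reading.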

From this we create the tables. First we calculate the follow sets using the extended grammar and the Grammar2Set converter. With the follow sets we can obtain the reduce actions, as done in the Prop2Table project. With the conflicts resolved we get:

Action Table

state comma divide float id int lpar minus plus power rpar times $
0 s5 s6 s7 s8 s9
1 r8 r8 r8 r8 r8 r8 r8
2 r7 r7 r7 r7 r7 r7 r7
3 s10 s11 s12 s13 s14 r1
4 accept
5 r17 r17 r17 r17 r17 r17 r17 r17
6 s15
7 r18 r18 r18 r18 r18 r18 r18 r18
8 s5 s6 s7 s8 s9
9 s5 s6 s7 s8 s9
10 s5 s6 s7 s8 s9
11 s5 s6 s7 s8 s9
12 s5 s6 s7 s8 s9
13 s5 s6 s7 s8 s9
14 s5 s6 s7 s8 s9
15 r15 r15 s5 s6 s7 s8 r15 r15 r15 r15 r15 r15
16 s10 s11 s12 s13 s27 s14
17 s10 r9 r9 s13 r9 s14 r9
18 r5 r5 r5 s13 r5 r5 r5
19 s10 r2 r2 s13 r2 s14 r2
20 s10 r3 r3 s13 r3 s14 r3
21 r6 r6 r6 s13 r6 r6 r6
22 r4 r4 r4 s13 r4 r4 r4
23 s28 r11
24 s29
25 r13 r13 r13 r13 r13 r13 r13 r13
26 r14 r14 r14 r14 r14 r14 r14 r14
27 r16 r16 r16 r16 r16 r16 r16 r16
28 r15 r15 s5 s6 s7 s8 r15 r15 r15 r15 r15 r15
29 r10 r10 r10 r10 r10 r10 r10 r10
30 r12
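In an SLR table, a reduce action for a production is placed under every terminal in the follow set of its left-hand side. The Grammar2Set converter is not shown here, so the Python sketch below is only a stand-in: a plain fixed-point computation of the FIRST and FOLLOW sets for the extended grammar. All names in it are made up for the example.

```python
EPS = "eps"
END = "$"

# The extended grammar, productions [0] to [17]; an empty list is an empty body.
GRAMMAR = [
    ("Exp'", ["Exp"]),
    ("Exp", ["Exp", "plus", "Exp"]),
    ("Exp", ["Exp", "minus", "Exp"]),
    ("Exp", ["Exp", "times", "Exp"]),
    ("Exp", ["Exp", "divide", "Exp"]),
    ("Exp", ["Exp", "power", "Exp"]),
    ("Exp", ["Call"]),
    ("Exp", ["Atom"]),
    ("Exp", ["minus", "Exp"]),
    ("Call", ["id", "lpar", "Args", "rpar"]),
    ("Args", ["Arg", "Args'"]),
    ("Args", []),
    ("Args'", ["comma", "Arg", "Args'"]),
    ("Args'", []),
    ("Arg", ["Exp"]),
    ("Atom", ["lpar", "Exp", "rpar"]),
    ("Atom", ["float"]),
    ("Atom", ["int"]),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def first_of_sequence(symbols, first):
    """FIRST of a symbol string; contains EPS if the whole string can vanish."""
    result = set()
    for symbol in symbols:
        symbol_first = first[symbol] if symbol in NONTERMINALS else {symbol}
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def compute_first():
    """Iterate until no FIRST set grows any more."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            new = first_of_sequence(body, first)
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def compute_follow(first, start="Exp'"):
    """Iterate until no FOLLOW set grows any more."""
    follow = {nt: set() for nt in NONTERMINALS}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, symbol in enumerate(body):
                if symbol not in NONTERMINALS:
                    continue
                rest = first_of_sequence(body[i + 1:], first)
                new = rest - {EPS}
                if EPS in rest:  # everything after symbol can vanish
                    new |= follow[head]
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
for name in sorted(FOLLOW):
    print(name, sorted(FOLLOW[name]))
```

For this grammar the follow set of Exp (and therefore of Call and Atom) comes out as the eight terminals comma, divide, minus, plus, power, rpar, times and $.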

Goto Table

state Exp' Exp Call Args Arg Atom
0 g4 g3 g2 g1
1
2
3
4
5
6
7
8 g16 g2 g1
9 g17 g2 g1
10 g18 g2 g1
11 g19 g2 g1
12 g20 g2 g1
13 g21 g2 g1
14 g22 g2 g1
15 g26 g24 g23 g25
16
17
18
19
20
21
22
23
24
25
26
27
28 g26 g30 g23 g25
29
30
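To show how action and goto tables of this kind are consumed, here is a minimal Python sketch of a table-driven shift/reduce loop that evaluates while it parses. It deliberately uses a much smaller toy grammar with hand-written tables, not the tables above. The grammar, the state numbers and every name in the code are assumptions made for the illustration.

```python
# Toy grammar (an assumption for illustration only):
#   [0] S -> E
#   [1] E -> E plus int
#   [2] E -> int
# Each production: (left-hand side, body length, semantic action).
PRODUCTIONS = {
    1: ("E", 3, lambda values: values[0] + values[2]),
    2: ("E", 1, lambda values: values[0]),
}
ACTION = {
    (0, "int"): ("shift", 2),
    (1, "plus"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "plus"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "int"): ("shift", 4),
    (4, "plus"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}


def parse(tokens):
    """Run the shift/reduce loop over (name, value) tokens and return the value."""
    tokens = list(tokens) + [("$", None)]
    states = [0]  # stack of parser states
    values = []   # stack of semantic values, one per symbol on the stack
    pos = 0
    while True:
        name, value = tokens[pos]
        entry = ACTION.get((states[-1], name))
        if entry is None:
            raise SyntaxError(f"unexpected {name} in state {states[-1]}")
        kind, arg = entry
        if kind == "shift":
            states.append(arg)
            values.append(value)
            pos += 1
        elif kind == "reduce":
            head, length, semantic = PRODUCTIONS[arg]
            popped = values[len(values) - length:]
            del values[len(values) - length:]
            del states[len(states) - length:]
            values.append(semantic(popped))
            states.append(GOTO[(states[-1], head)])
        else:  # accept
            return values[-1]


print(parse([("int", 1), ("plus", None), ("int", 2), ("plus", None), ("int", 3)]))
```

Each reduce pops as many stack entries as the production body is long, applies the semantic action to their values, and then uses the goto table to find the next state.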

And this is actually it. When the parser parses a string, it does so in the same way one would traverse the resulting syntax tree. That is, we can evaluate directly with the parser. To get a deeper understanding, let us draw a tree. Say we have

1 + 2 + 3 * 4 * 5

For this expression we get the tree

Syntax tree of above expression

We traverse this tree bottom-up, or depth-first, and obtain an evaluated expression that respects the precedence and associativity rules. This is exactly the way the SLR parser traverses the expression.
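The traversal can be sketched in a few lines of Python. The tree below is written out by hand from the precedence and associativity rules, which group the expression as ((1 + 2) + ((3 * 4) * 5)). The tuple encoding and the names `OPS`, `TREE` and `evaluate` are assumptions made for the example.

```python
import operator

OPS = {
    "plus": operator.add,
    "minus": operator.sub,
    "times": operator.mul,
    "divide": operator.truediv,
    "power": operator.pow,
}

# 1 + 2 + 3 * 4 * 5 as nested (operator, left, right) tuples; leaves are numbers.
TREE = ("plus", ("plus", 1, 2), ("times", ("times", 3, 4), 5))


def evaluate(node):
    """Evaluate both children first, then apply the operator (bottom-up)."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))


print(evaluate(TREE))  # 63
```

The inner nodes are evaluated in the order 1 + 2, 3 * 4, 12 * 5 and finally 3 + 60, which is the order in which a bottom-up parser would perform the corresponding reductions.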
