# Grammar to Set - Syntax of Input

## Grammar2Set #2 :: 05-05-2018

This is quite peculiar. I'm writing a grammar for a grammar. This is also one of those accomplishments I'll hold on to. This parser might be usable for other things as well.

We need a set of tokens

Prod = prod Rightarrow = -> Mid = | NonTerm = [A-Z][a-zA-Z']* Term = "[^"]+" Ignore = [\n\t ]+

While writing the grammar what occurred to me, is this: The parser is again LL(1). If a production is not started by prod, then how can the look-aheader distinguish in an expression like this: A -> B 'b' B -> C. Solution is to add prod. Adding " quotes to any terminal is done so to make the grammar accept terminals like spaces, -> or even prod.

The grammar is as follows

Prod \rightarrow prod nonTerm -> ProdRight ProdRight \rightarrow RightSide ProdRight' ProdRight' \rightarrow Prod ProdRight' \rightarrow \epsilon RightSide \rightarrow RightExp RightSide' RightSide' \rightarrow | RightSide RightSide' \rightarrow \epsilon RightExp \rightarrow nonTerm RightExp RightExp \rightarrow term RightExp RightExp \rightarrow \epsilon

Left recursion is removed before hand, and the grammar has been left factorized. It might look a bit excessive for a relatively simple language. It can be expressed as a regular expression, if one would like that.

Set calculation. If all goes as planned, this should be the last time I calculate any of these sets. For those not familiar with these calculations, they are explained further in the next chapters.Nullable

Nullable(Prod) = false Nullable(ProdRight) = true \land Nullable(ProdRight') = true Nullable(ProdRight') = true Nullable(RightSide) = Nullable(RightExp) \land Nullable(RightSide') = true \land true Nullable(RightSide') = true Nullable(RightExp) = true

First

First(Prod) = {prod} First(ProdRight) = {prod} \union First(RightSide) = {prod,nonTerm,term,|} First(ProdRight') = {prod} First(RightSide) = First(RighExp) \union First(RightSide') = {nonTerm,term,|} First(RightSide') = {|} First(RightExp) = {nonTerm,term}

Follow. First add the extra production S \rightarrow Prod $S \rightarrow Prod$ Prod \rightarrow prod nonTerm -> ProdRight ProdRight \rightarrow RightSide ProdRight' ProdRight' \rightarrow Prod ProdRight' \rightarrow \epsilon RightSide \rightarrow RightExp RightSide' RightSide' \rightarrow | RightSide RightSide' \rightarrow \epsilon RightExp \rightarrow nonTerm RightExp RightExp \rightarrow term RightExp RightExp \rightarrow \epsilon

Then calculate constraints based on the resulting grammar

#S {$} \subset Follow(Prod) #Prod Follow(Prod) \subset Follow(ProdRight) #ProdRight First(ProdRight') \subset Follow(RightSide) Follow(ProdRight) \subset Follow(RightSide) Follow(ProdRight) \subset Follow(ProdRight') #ProdRight' Follow(ProdRight') \subset Follow(Prod) #RightSide First(RightSide') \subset Follow(RightExp) Follow(RightSide) \subset Follow(RightExp) Follow(RightSide) \subset Follow(RightSide') #RightSide' newly added Follow(RightSide') \subset Follow(RightSide) #RightExp Iterate the constraints. One iterations should suffice. Follow(Prod) = Follow(ProdRight) = {$} Follow(ProdRight') = {$} Follow(RightSide) = First(ProdRight') \union Follow(ProdRight) = {prod,$} Follow(RightSide') = Follow(RightSide) = {prod,$} Follow(RightExp) = First(RightSide') \union Follow(RightSide) = {prod,$,|}

LookAhead-sets are obtained as a combo of the above calculations

la(S \rightarrow Prod $) = First(Prod) = {prod} la(Prod \rightarrow prod nonTerm -> ProdRight) = {prod} la(ProdRight \rightarrow RightSide ProdRight') = First(ProdRight) \union Follow(ProdRight) = {prod,nonTerm,term,|,$} la(ProdRight' \rightarrow Prod) = {prod} la(ProdRight' \rightarrow \epsilon) = Follow(ProdRight') = {$} la(RightSide \rightarrow RightExp RightSide') = First(RightSide) \union Follow(RightSide) = {nonTerm,term,|,prod,$} la(RightSide' \rightarrow | RightSide) = First(| RightSide) = {|} la(RightSide' \rightarrow \epsilon) = Follow(RightSide') = {prod,$} la(RightExp \rightarrow nonTerm RightExp) = First(nonTerm RightExp) = {nonTerm} la(RightExp \rightarrow term RightExp) = First(term RightExp) = {term} la(RightExp \rightarrow \epsilon) = Follow(RightExp) = {prod,|,$}

And we are set to go. The implementation can be found in the Git-repo linked to in the last chapter.