
Parser Generator - Workings

Grammar2Slr #4 :: 31-08-2018

Contents
 -Parser Generator - Intro
 -Parser Generator - Syntax
 -Parser Generator - Definitions
 @Parser Generator - Workings

Running the application

First, save the grammar in a file. Then run ./create_slr.sh "input file name". You might need to run chmod +x create_slr.sh first to make the script executable. The application writes the table as JavaScript objects to the file tableout.js and then tries to open my_parser.html with Firefox. If that fails for any reason, just open the file yourself and choose the output target. Besides creating a parser, you can choose to have the table generated as HTML code, or simply have the table shown.

Configuring the parser

For a complete example, have a look at my ComCalc Project - Generating Parser. Its target language is FSharp, and it is built using this application.

If the target is either JavaScript or FSharp (the two targets are a bit alike), there are two elements that need to be added to or changed. Let's consider JavaScript. The FSharp version is quite similar, but FSharp has discriminated unions, which streamline the process a bit.

addToken2tree: Any token that has the -cap flag set ends up in this function. You can add further tokens, but the most important thing is to add the tokens that are already set to some data structure so they can be used later. In the scope I have included a tree. The tree is a stack with a push and a pop function. One way of adding to this structure could be

"id":function(tokenType,tokenVal){ tree.push({type:tokenType,v:tokenVal}); },

But you can of course manage this however you want. You can even add the tokens as a side effect to some structure living in an outer scope or at top level.
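To make the idea concrete, here is a minimal sketch of such a stack together with two token handlers. It assumes that addToken2tree is an object keyed by token name, as the "id" entry above suggests. The makeTree helper, the size function and the "num" token are made up for illustration and are not part of the generated output.

```javascript
// Hypothetical stand-in for the generated tree: a stack with push, pop and peek.
function makeTree() {
  const items = [];
  return {
    push: function (node) { items.push(node); },
    pop: function () { return items.pop(); },
    peek: function () { return items[items.length - 1]; },
    size: function () { return items.length; }
  };
}

const tree = makeTree();

// One handler per token that has the -cap flag set.
// "id" is taken from the text above; "num" is an assumed extra token.
const addToken2tree = {
  "id": function (tokenType, tokenVal) {
    tree.push({ type: tokenType, v: tokenVal });
  },
  "num": function (tokenType, tokenVal) {
    // Convert the lexeme once here so later steps receive a number.
    tree.push({ type: tokenType, v: Number(tokenVal) });
  }
};

// Simulate the parser handing over two captured tokens.
addToken2tree["id"]("id", "x");
addToken2tree["num"]("num", "42");
console.log(tree.peek()); // the most recently captured token
```

Doing the conversion in the handler keeps the production functions free of string handling, but that is a design choice, not a requirement.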

production_fun: This is a data structure that contains the actual parsing. Each function represents a production that is stated above, and each function takes the tree as its argument. This is the tree from above: a stack that can be pushed to or popped from. When the bottom is reached (a right side consisting of terminals), the given token should either be popped and processed, or be left as is and just processed. From now on I will build projects using this tool, so to get a better understanding of what is going on in production_fun, refer to those. Alternatively, experiment your way around. In JavaScript you can use peek on the stack and alert the result to get insight.
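As a rough illustration, the sketch below shows what such functions could look like for a tiny, made-up grammar. The grammar, the keys of production_fun, the node shapes and the makeStack helper are all assumptions for this example. The generated file will name and order things differently.

```javascript
// Minimal stack with the push/pop/peek interface described above (assumed shape).
function makeStack() {
  const items = [];
  return {
    push: function (x) { items.push(x); },
    pop: function () { return items.pop(); },
    peek: function () { return items[items.length - 1]; }
  };
}

// Hypothetical grammar:  Sum -> Sum "+" Atom | Atom   and   Atom -> id
// Only id is captured, so only id tokens ever land on the stack.
const production_fun = {
  // Terminal right side: pop the captured token and turn it into a leaf node.
  "Atom_id": function (tree) {
    const tok = tree.pop();
    tree.push({ node: "var", name: tok.v });
  },
  // Sum -> Atom: the Atom result is already on top, so leave it as is.
  "Sum_Atom": function (tree) { },
  // Sum -> Sum "+" Atom: pop the right operand first, then the left one.
  "Sum_plus": function (tree) {
    const right = tree.pop();
    const left = tree.pop();
    tree.push({ node: "add", left: left, right: right });
  }
};

// Simulate the reductions an LR parser would perform for the input "a + b".
const demo = makeStack();
demo.push({ type: "id", v: "a" });
production_fun["Atom_id"](demo);
production_fun["Sum_Atom"](demo);
demo.push({ type: "id", v: "b" });
production_fun["Atom_id"](demo);
production_fun["Sum_plus"](demo);
console.log(JSON.stringify(demo.peek()));
```

Because reductions happen bottom-up, the operands come off the stack in reverse order, which is why the right operand is popped first.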

Common pitfalls in the grammar

This will be updated from time to time. Using the SLR parser, I have found some mistakes that I often make:

  • Several non-terminals that start with the same terminal symbol, e.g.

        prod A -> "a" A
        prod B -> "a" "b" B
        prod C -> A | B

    Here the parser can't decide whether to go from C to A or to B. The parser has a look-ahead of 1 only, and A and B have the same first set. The generator should fail with something like "possibly overlapping look-ahead set on".
  • A production contains two (or more) non-terminals where the first contains the second, e.g.

        prod A -> C | ...
        prod B -> A | C

    This is very similar to the above: the first sets overlap.
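To see the overlap from the first pitfall for yourself, the following sketch computes first sets for that grammar and reports alternatives that share a terminal. It is only an illustration and not how Grammar2Slr performs its check. The grammar encoding, the function names and the assumption that there are no empty productions are all mine.

```javascript
// Grammar from the first pitfall. Symbols that are not keys are terminals.
const grammar = {
  A: [["a", "A"]],
  B: [["a", "b", "B"]],
  C: [["A"], ["B"]]
};

function isNonTerminal(g, sym) {
  return Object.prototype.hasOwnProperty.call(g, sym);
}

// First sets by fixed-point iteration. Assumes no empty productions,
// so only the first symbol of each alternative matters.
function firstSets(g) {
  const first = {};
  for (const nt of Object.keys(g)) first[nt] = new Set();
  let changed = true;
  while (changed) {
    changed = false;
    for (const nt of Object.keys(g)) {
      for (const alt of g[nt]) {
        const head = alt[0];
        const incoming = isNonTerminal(g, head) ? [...first[head]] : [head];
        for (const t of incoming) {
          if (!first[nt].has(t)) { first[nt].add(t); changed = true; }
        }
      }
    }
  }
  return first;
}

// Report every pair of alternatives of one non-terminal that share a terminal.
function overlappingAlternatives(g) {
  const first = firstSets(g);
  const firstOfAlt = function (alt) {
    return isNonTerminal(g, alt[0]) ? first[alt[0]] : new Set([alt[0]]);
  };
  const report = [];
  for (const nt of Object.keys(g)) {
    const alts = g[nt];
    for (let i = 0; i < alts.length; i++) {
      for (let j = i + 1; j < alts.length; j++) {
        const other = firstOfAlt(alts[j]);
        const shared = [...firstOfAlt(alts[i])].filter(function (t) { return other.has(t); });
        if (shared.length > 0) report.push({ nonTerminal: nt, shared: shared });
      }
    }
  }
  return report;
}

console.log(overlappingAlternatives(grammar));
```

For this grammar the only entry reported is for C, whose two alternatives both begin with "a".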

Installing Grammar2Slr

To install, just clone the Git repo. However, you need the whole shebang, since Grammar2Slr needs both Grammar2Set and Nfa2Dfa to function. Furthermore, you'll need a Mono installation on your computer that supports FSharp. The application is made as a shell script for the Linux terminal. It can surely be run on a Windows machine, but in that case you'll have to adapt the calls to the FSharp compiler.
