The Nullable-set is pretty straight forward, at least the formal version. It's given as
Nullable() = true Nullable(a) = false Nullable(\alpha \beta) = Nullable(\alpha) \land Nullable(\beta) Nullable(N) = Nullable(\alpha_{1}) \lor ... \lor Nullable(\alpha_{n}), where N \rightarrow \alpha_{1}, ..., N \rightarrow \alpha_{n}But say we have some grammar for the regex a*b*
A \rightarrow 'a' A | B B \rightarrow 'b' B |Both the productions A and B are nullable. But if we start computing on A, then we need the unresolved Nullable(B) to finish. We thus need to iterate when Nullable(B) has been found.
The grammar when parsed looks like this
("A",[[Term "a";NonTerm "A"];[NonTerm "B"]]) ("B",[[Term "b";NonTerm "B"];[]])Where tuples are key * value into dictionaries. To calculate nullable we need a cooperation between logical ∧ and ∨ and nullable-values of the lists in the value of the tuple. Let for explanation
t_{a} = Term "a" t_{b} = Term "b" nonT_{A} = NonTerm "A" nonT_{B} = NonTerm "B" rightA_{1} = t_{a} nonT_{A} rightA_{2} = nonT_{B} rightB_{1} = t_{b} nonT_{B} rightB_{2} = \epsilonWhere ε be emtpy. Now
null_{A1} = Nullable(rightA_{1}) = Nullable(t_{a}) \land Nullable(nonT_{A}) null_{A2} = Nullable(rightA_{2}) = Nullable(nonT_{B}) null_{B1} = Nullable(rightB_{1}) = Nullable(t_{b}) \land Nullable(nonT_{B}) null_{B2} = Nullable(rightB_{2}) = trueThese depend on conjunctions. Furthermore we get
Nullable(A) = null_{A1} \lor null_{A2} Nullable(B) = null_{B1} \lor null_{B2}These depend on disjunctions. To optimize and better our understanding we can use short circuit evaluation. Say a production contains a set of right-sides. For every right-side, if it contains just one terminal, this right-side is not nullable. For every right-side, if just a single one is nullable, the whole production is nullable.
Since we have to iterate, we have to take into account grammars that are circular. Say we have the grammar
A \rightarrow B B \rightarrow A | 'b'The alternate B-production containing the terminal 'b' is just there to point out that this could be a grammar. It isn't totally useless. I haven't allowed grammars like the circular one in this application. So I can use my own algorithm to calculate Nullable. The idea is taken from fixpoint iterations used to solve the next two sets. This one works as follows
The idea is to gradually remove any unsolved production using values from solved ones. This idea does not work with the above circular grammar since we can't obtain any Nullable-values and thus can't remove anything from unsolved: B \rightarrow 'b' isn't nullable, and we have to obtain Nullable(A) to close the disjunction.