BisonGen creates LALR parsers from a formal grammar input file, both as a C extension for Python (using GNU Bison) and in pure Python.
Some usage notes
Lexer states
In the <lexer> section it's possible to define states which determine which token patterns the lexer is looking for while scanning:
{{{
<states>
  <exclusive>OPERATOR</exclusive>
</states>
}}}
The default state is INITIAL; this declaration adds an OPERATOR state. <exclusive> tells the lexer to apply only the OPERATOR rules while in this state. Any other declaration, such as <inclusive>, extends the INITIAL (top-level) rules while the state is active. Two states can therefore be active at the same time: INITIAL and one inclusive state. Rules for a state are placed inside a scope block:
{{{
<scope state='OPERATOR'>
  <pattern expression='or'>
    <begin>INITIAL</begin>
    <token>OR</token>
  </pattern>
  <!-- more rules -->
</scope>
}}}
The <begin> element determines the state the lexer switches to when the rule matches; in this case it goes back to INITIAL. The same element is used in the top-level INITIAL rules, which are not placed inside a scope block:
{{{
<pattern expression='\)|\]'>
  <begin>OPERATOR</begin>
  <token>@ASCII@</token>
</pattern>
}}}
This rule switches the lexer to the OPERATOR state.
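To make the state switching concrete, here is a minimal Python sketch of a lexer with one exclusive state. It is not BisonGen's implementation and uses none of its API. The names `RULES`, `EXCLUSIVE`, `active_rules` and `tokenize` are hypothetical, as are the NAME, opening-bracket and whitespace rules, which exist only so the sketch can scan a whole input. Only the `\)|\]` and `or` patterns come from the examples above. As a simplification the sketch takes the first rule that matches rather than the longest match, and it records `@ASCII@` as a plain token name without interpreting it.

```python
import re

# Hypothetical rule table: state -> list of (regex, token name, next state).
# A token name of None means "skip the match"; a next state of None means
# "stay in the current state".
RULES = {
    "INITIAL": [
        (r"\)|\]", "@ASCII@", "OPERATOR"),  # from the text: switch to OPERATOR
        (r"\(|\[", "@ASCII@", None),        # invented for illustration
        (r"[A-Za-z_]\w*", "NAME", None),    # invented for illustration
        (r"\s+", None, None),               # invented: skip whitespace
    ],
    "OPERATOR": [
        (r"or", "OR", "INITIAL"),           # from the text: back to INITIAL
        (r"\s+", None, None),               # invented: skip whitespace
    ],
}

# States declared with <exclusive>; any other non-INITIAL state is inclusive.
EXCLUSIVE = {"OPERATOR"}


def active_rules(state):
    """Return the rules the lexer tries while in `state`."""
    if state == "INITIAL" or state in EXCLUSIVE:
        return RULES[state]
    # Inclusive state: its own rules extend the INITIAL rules.
    return RULES[state] + RULES["INITIAL"]


def tokenize(text):
    """Scan `text` and return a list of (token name, matched text) pairs."""
    state = "INITIAL"
    pos = 0
    tokens = []
    while pos < len(text):
        for pattern, token, next_state in active_rules(state):
            match = re.compile(pattern).match(text, pos)
            if match and match.end() > pos:
                break  # first matching rule wins (simplification)
        else:
            raise ValueError(f"no rule matches at position {pos} in state {state}")
        if token is not None:
            tokens.append((token, match.group()))
        if next_state is not None:
            state = next_state  # the effect of <begin>
        pos = match.end()
    return tokens


print(tokenize("(a) or b"))
print(tokenize("a or b"))
```

In the first call the closing parenthesis switches the lexer to OPERATOR, so the following `or` is scanned as an OR token and the lexer returns to INITIAL. In the second call the lexer never leaves INITIAL, so `or` is scanned as an ordinary NAME. Because OPERATOR is exclusive, an input such as `") x"` fails in this sketch: no INITIAL rule is tried after the closing parenthesis.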