BisonGen generates LALR parsers from a formal grammar input file, both as a C Python extension (using GNU Bison) and in pure Python.
Some usage notes
Lexer states
In the <lexer> section you can define states that determine which token patterns the lexer looks for while scanning:
{{{
<states>
  <exclusive>OPERATOR</exclusive>
</states>
}}}
The default state is INITIAL; the declaration above adds an OPERATOR state. <exclusive> tells the lexer to apply only the OPERATOR rules while in this state. A state declared with any other element, such as <inclusive>, extends the INITIAL (top-level) rules while it is active. Two states can therefore be active at the same time: INITIAL and one inclusive state. Rules for a state are placed inside a scope block:
{{{
<scope state='OPERATOR'>
  <pattern expression='or'>
    <begin>INITIAL</begin>
    <token>OR</token>
  </pattern>
  <!-- more rules -->
</scope>
}}}
The begin element names the state the lexer enters when the rule matches; in this case it goes back to INITIAL. The same element is used in the top-level INITIAL rules, which are not placed inside a scope block:
{{{
<pattern expression='\)|\]'>
  <begin>OPERATOR</begin>
  <token>@ASCII@</token>
</pattern>
}}}
This rule switches the lexer to the OPERATOR state.
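The state switching described above can be illustrated with a small, self-contained Python model. This is only a sketch of the idea, not BisonGen's implementation or API: the rule tables, the `tokenize` and `active_rules` helpers, the NAME, opening-bracket and whitespace rules, and the first-match strategy are all assumptions made for the example. `@ASCII@` is modelled as "emit the matched character itself as the token", which is also an assumption about its meaning.

```python
import re

# Hypothetical rule tables: state -> [(regex, token, next_state)].
# token None means "skip the match"; next_state None means "stay in this state".
RULES = {
    "INITIAL": [
        (r"\)|\]", "@ASCII@", "OPERATOR"),   # rule from the text above
        (r"\(|\[", "@ASCII@", None),          # assumed extra rule
        (r"[A-Za-z_]\w*", "NAME", None),      # assumed extra rule
        (r"\s+", None, None),                 # assumed: skip whitespace
    ],
    "OPERATOR": [
        (r"or", "OR", "INITIAL"),             # rule from the text above
        (r"\s+", None, None),                 # assumed: skip whitespace
    ],
}
EXCLUSIVE = {"OPERATOR"}  # states declared with <exclusive>


def active_rules(state, exclusive=EXCLUSIVE):
    """Return the rules the lexer tries while in `state`."""
    if state == "INITIAL" or state in exclusive:
        return RULES[state]
    # Inclusive state: its own rules extend the INITIAL rules.
    # Trying the state's rules first is an assumption of this sketch.
    return RULES[state] + RULES["INITIAL"]


def tokenize(text, exclusive=EXCLUSIVE):
    """Split `text` into (token, lexeme) pairs, following <begin> transitions."""
    state, pos, tokens = "INITIAL", 0, []
    while pos < len(text):
        # First matching rule wins (a simplification of real lexer matching).
        for regex, token, next_state in active_rules(state, exclusive):
            match = re.compile(regex).match(text, pos)
            if match:
                break
        else:
            raise ValueError(f"no rule matches at {pos} in state {state}")
        lexeme = match.group()
        if token is not None:
            tokens.append((lexeme if token == "@ASCII@" else token, lexeme))
        if next_state is not None:
            state = next_state  # the effect of <begin>
        pos = match.end()
    return tokens


print(tokenize("(or) or b"))
```

In this model the first `or` is read in the INITIAL state and becomes a NAME, while the second one follows a `)` and is therefore read in the OPERATOR state as OR. Because OPERATOR is exclusive, an input such as `) b` is rejected: no OPERATOR rule matches `b`. Calling `tokenize(") b", exclusive=set())` treats OPERATOR as inclusive instead, so the INITIAL rules still apply and `b` is accepted as a NAME.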
