Definition:

Lexer :
    MOVETO      "-->"
    NONE        "NONE"
    IDENTIFIER  [a-zA-Z_][a-zA-Z0-9_]*
    RANGE       \[[^\n\]]*\]
    STATE       "%state"
    START       "%start"
    ACTION      "%action"

New states
----------
COMMENT -- single "//" or multiline "/* */" comment
MACRO   -- %macro and that line
INLINE  -- between %{ & %}

Grammar
-------
CompilationUnit : SectionsList

SectionsList    : Section
                | SectionsList Section

Section         : Inline
                | Macro
                | State
                | Action
                | Start

Inline          : INLINE

Macro           : MACRO

State           : STATE IDENTIFIER ':' TransitionList ';'

TransitionList  : Transition
                | Transition '|' TransitionList

Transition      : CHAR_LITERAL MOVETO IDENTIFIER
                | OTHER MOVETO IDENTIFIER
                | RANGE MOVETO IDENTIFIER

Action          : ACTION IdentifierList INLINE

IdentifierList  : IDENTIFIER
                | IdentifierList IDENTIFIER

Start           : START IDENTIFIER

Semantic checks
---------------
No multiple states or actions
Only a single transition for a particular input from a state
No scoping of state names (thank god!)
Always peek?
Debug & Trace generation
Epsilon transitions allowed only when that is the only transition there

Future Improvements Planned
---------------------------
Removal of unconnected states
Merge multiple transitions to the same target into a single comparison
Epsilon transitions from a node
Documentation via Graphviz or other diagramming tool
Multi-line macros
Inline node actions which are accessed only from one location
Jump distance re-arrangement, for minimum traversal
DFA state reduction algorithms
"%include" support
Code generation for C

Example fragment
----------------
%{
    // rest of code
    public override bool Read()
    {
        ....
        locals
%}

%macro read $1 = ReadChar()
%macro peek $1 = PeekChar()

%start START

%state START :
      '<'        --> opening_tag
    | [a-zA-Z_]  --> text
    | '&'        --> entity
    | NONE       --> empty
    | .          --> error
    ;

%state END ;

Seemed to be a good idea with

%action START
%{
    // action for start ...
%}

etc...

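As a rough illustration of the first two semantic checks listed earlier (no state defined twice, and only a single transition per input from a state), here is a minimal sketch. Python is used purely for brevity. The function name check_semantics and the parsed-spec layout (a list of (state_name, transitions) pairs) are assumptions made for this example, not part of the tool. Inputs are compared as literal text only, so an overlap between a RANGE and a CHAR_LITERAL is not detected.

```python
from collections import Counter


def check_semantics(states):
    """Return a list of error strings for a parsed spec.

    `states` is a hypothetical parse result: a list of
    (state_name, [(input_label, target_name), ...]) pairs.
    """
    errors = []

    # Check 1: no state may be defined more than once.
    name_counts = Counter(name for name, _ in states)
    for name, count in name_counts.items():
        if count > 1:
            errors.append(f"state '{name}' defined {count} times")

    # Check 2: at most one transition per input label within a state.
    # Labels are compared textually; range overlap is not analysed.
    for name, transitions in states:
        seen = set()
        for label, _target in transitions:
            if label in seen:
                errors.append(
                    f"state '{name}': duplicate transition on {label}"
                )
            seen.add(label)

    return errors
```

Calling check_semantics([("START", [("'<'", "opening_tag"), ("'&'", "entity")])]) returns an empty list, since the state is defined once and each input appears once.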
Intended code output
--------------------
START:
    c = ReadChar();
    do
    {
        // action;
    } while(false);
    c = PeekChar();
    if(c == '<')
    {
#if TRACE
        Trace.WriteLine(" got '<' going to opening_tag");
#endif
#if DEBUG_STACK
        .......
#endif
        goto opening_tag;
    }
    ...
END:
    return true;
ERROR:
    return false;
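
The output shape above could be produced by an emitter along the following lines. This is only a sketch under stated assumptions, not the generator's actual code. The function name emit_state, the (label, target) pair layout, and the treatment of "." as a trailing unconditional goto are all hypothetical. RANGE and NONE transitions and the trace/debug blocks are left out to keep it short.

```python
def emit_state(name, transitions, action="// action;"):
    """Return generated source text for one state.

    `transitions` is a hypothetical list of (label, target) pairs, where
    label is an already-quoted character literal such as "'<'", or "."
    for "any other input". Ranges, NONE and trace blocks are omitted.
    """
    out = [
        f"{name}:",
        "    c = ReadChar();",
        "    do",
        "    {",
        f"        {action}",
        "    } while(false);",
        "    c = PeekChar();",
    ]

    fallback = None
    for label, target in transitions:
        if label == ".":
            # Assumption: the catch-all is emitted last, unconditionally.
            fallback = target
            continue
        out += [
            f"    if(c == {label})",
            "    {",
            f"        goto {target};",
            "    }",
        ]

    if fallback is not None:
        out.append(f"    goto {fallback};")

    return "\n".join(out)
```

For example, print(emit_state("START", [("'<'", "opening_tag"), (".", "error")])) writes the START label, the read/action/peek preamble, one if-block that jumps to opening_tag, and a final goto error.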