Dave Hanold
CS 536 Lecture Notes -- 02/22/99
================================================================================

** Read Chapter 4
** Parser project to be handed out on Wednesday

-----------------------------------------------

* How do scanner generators operate?

  1. Translate regular expression to NFA
  2. Convert the NFA to a DFA (construct set of states)
  3. Optimize DFA (reduce states)

     example: (aa|aaaaa)+
       -- subdivide the string of a's into sets of 2 and 5
       -- any even # of a's (or any odd # of at least 5) will work
       -- FA: (()) means a final state

              a     a       a     a       a       a       a       a       a
       -->(1)-->(2)-->((3))-->(4)-->((5))-->((6))-->((7))-->((8))-->((9))-->((10))

       -- ((10)) also has an edge on a leading back into the final states,
          so every longer string of a's is accepted

* When are 2 states in a DFA equivalent?
  -- Criteria for Merging States:
     ONLY IF THEY HAVE IDENTICAL FUTURE BEHAVIOR

* GREEDY ALGORITHM (to merge states of a DFA):

  1. Merge together all final states into a group G1;
     merge together all nonfinal states into a group G2
  2. REPEAT (until no group can be split any further):
       Let G  = any group of states = {S1, S2, ..., Sn}
       Let c  = any character
       Let G' = {t1, t2, ..., tn} where S1 --c--> t1, S2 --c--> t2, ...
       if G' is not entirely contained in some existing group,
       then split G into new groups such that Si and Sj remain
       together iff ti and tj are in the same group

  --example:

            a      b
       -->(1)-->(2)-->((3))
           |
           |c    d      b
           +-->(4)-->(5)-->((6))

       G1 = [3, 6]
       G2 = [1, 2, 4, 5]

  --On b, states 2 and 5 both move into G1 while 1 and 4 do not,
    so G2 splits into [2, 5] and [1, 4].
  --We can pull out the states numbered 1 and 4 since their
    outcomes are unique (1 has edges on a and c, 4 only on d).
  --The algorithm gives the maximum reduction of states:

            a        b
       -->(1)-->(2,5)-->((3,6))
           |      ^
          c|      |d
           +->(4)-+

  --example (aa|aaaaa)+ yields:

              a     a       a     a
       -->(1)-->(2)-->((3))-->(4)-->((5-10))
                                      |  ^
                                     a|  |
                                      +--+

* How do scanner generators operate? (continued)

  4. Generate scanner code in executable form
     --Regular expression rules followed by appropriate code

* CREATE TRANSITION TABLE FOR OPTIMIZED DFA

  (aa|aaaaa)+ example:

               Char
       State   'a'
       -----   ----
         1      2
         2      3
         3      4
         4      5
         5      5       (state 5 = merged states 5-10)

  is implemented as a switch statement:

       case 1: {Java Code A}
               break;
       case 2: {Java Code B}
               break;

===============================================================================
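The state-merging algorithm from these notes can be sketched in Java roughly as follows. This is only an illustration, not the code a scanner generator actually emits. It uses the signature-based variant of the same idea, splitting every group on every character in each pass rather than picking one G and one c at a time. The class name DfaMinimizer, the array layout, and the extra error state 0 (the target of every missing edge) are assumptions made for this example. The input is the a/b/c/d example with states 1-6.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of DFA state merging by repeated group splitting.
// Assumption: state 0 is an added error state; every missing edge points to it.
class DfaMinimizer {

    // delta[s][c] = state reached from s on character index c.
    // Returns group[s] = number of the group that state s ends up in.
    static int[] minimize(int[][] delta, boolean[] isFinal) {
        int n = delta.length;
        int[] group = new int[n];
        // Step 1: all final states in one group, all nonfinal states in another.
        for (int s = 0; s < n; s++) {
            group[s] = isFinal[s] ? 1 : 0;
        }
        int groupCount = 0; // forces at least one refinement pass
        // Step 2: repeat until a pass produces no new groups.
        while (true) {
            Map<List<Integer>, Integer> ids = new HashMap<>();
            int[] next = new int[n];
            for (int s = 0; s < n; s++) {
                // Signature = own group plus the group of each successor.
                // Two states stay together iff their signatures are equal.
                List<Integer> sig = new ArrayList<>();
                sig.add(group[s]);
                for (int t : delta[s]) {
                    sig.add(group[t]);
                }
                Integer id = ids.get(sig);
                if (id == null) {
                    id = ids.size();
                    ids.put(sig, id);
                }
                next[s] = id;
            }
            if (ids.size() == groupCount) {
                return next; // nothing was split in this pass
            }
            groupCount = ids.size();
            group = next;
        }
    }

    public static void main(String[] args) {
        // Character indices: a=0, b=1, c=2, d=3. Unset entries stay 0 (error).
        int[][] delta = new int[7][4];
        delta[1][0] = 2; // 1 --a--> 2
        delta[1][2] = 4; // 1 --c--> 4
        delta[2][1] = 3; // 2 --b--> 3
        delta[4][3] = 5; // 4 --d--> 5
        delta[5][1] = 6; // 5 --b--> 6
        boolean[] isFinal = new boolean[7];
        isFinal[3] = true;
        isFinal[6] = true;

        int[] group = minimize(delta, isFinal);
        for (int s = 1; s <= 6; s++) {
            System.out.println("state " + s + " -> group " + group[s]);
        }
    }
}
```

Running main should place states 2 and 5 in one group and states 3 and 6 in another, with 1 and 4 each left alone, matching the merged diagram in the notes. The group numbers themselves are arbitrary labels.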
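The transition table for the optimized (aa|aaaaa)+ DFA can also drive a recognizer directly. The sketch below is one possible way to do it, assuming array-based tables instead of the switch statement mentioned in the notes. The names TableScanner, NEXT, and FINAL are made up for this example, and index 0 is left unused so array indices match the state numbers.

```java
// Sketch of a table-driven recognizer for the minimized DFA of (aa|aaaaa)+.
class TableScanner {
    // NEXT[state] = state reached on 'a'. Index 0 is unused.
    // State 5 stands for the merged states 5-10.
    static final int[] NEXT = {0, 2, 3, 4, 5, 5};

    // FINAL[state] is true for the accepting states 3 and 5.
    static final boolean[] FINAL = {false, false, false, true, false, true};

    static boolean accepts(String input) {
        int state = 1; // start state
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) != 'a') {
                return false; // no transition on this character
            }
            state = NEXT[state];
        }
        return FINAL[state];
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int n = 0; n <= 8; n++) {
            System.out.println(n + " a's: " + accepts(sb.toString()));
            sb.append('a');
        }
    }
}
```

Under these tables, 2 a's and every count from 4 upward are accepted, while 0, 1, and 3 a's are rejected, which agrees with "any even # or any odd # of at least 5".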