Theory of finite automata and regular expressions


  1. { [^i ]^i | i >= 1 } is NOT regular
  2. If R is regular, -R (complement of R) is regular
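    EX: /* ... */ comments can be described this way.
    "/*" -("*/") "*/" looks like it should work, but it is wrong. -("*/") excludes only the single two-character string */, so it still allows a comment body that contains */ somewhere inside it. The body must exclude every string with the two characters */ anywhere in it: "/*" -(V* "*/" V*) "*/".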
  3. If R is regular and s is a subset of R, this does NOT mean that s is regular.
    EX: if V = ascii vocabulary, then V* denotes everything (books, stock reports, etc.), and "everything" contains a lot of extra strings that are not wanted.
    V* is regular and is a superset of the language in #1 (above), yet #1 is not regular. A smaller language is not necessarily easier to describe.
  4. If R1 is regular and R2 is regular, then R1 intersection R2 is regular.
    EX: CSXtokens = CSX and all-three-letter-tokens = VVV. CSX intersection VVV narrows both sets down to only the three-letter CSX tokens.


    R1: -->(1)--a-->((2))<==a
    R2: -->(3)--a-->((4))--b-->((5))
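    R1 intersection R2: -->(1,3)--a-->((2,4))--a-->(2,?)
                                             \--b-->(?,5)
    The '?' marks a component that has no transition on that input (R2 has no 'a' out of state 4, and R1 has no 'b' at all). A pair is final only if both of its components are final, so (2,?) and (?,5) can never accept and we can go ahead and delete them.
    R1 intersection R2: -->(1,3)--a-->((2,4)) ; that is, a+ intersection (a|ab) = a.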
    We can also prove this using complementation (De Morgan's Law): R1 intersection R2 = -(-R1 union -R2)
  5. If R is regular, then R^rev (where 'rev' reverses all strings) is regular.
    EX: (xyz)^rev = zyx
    This is most easily seen by drawing a finite automaton (FA), reversing the direction of every arrow, and swapping the start and final states.
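
The intersection example in #4 can be checked mechanically. The following is only a sketch of the pair-of-states (product) construction, not a general tool: the tuple layout (start, finals, delta) and the names intersect and accepts are assumptions made for this illustration. A missing entry in delta plays the role of the '?' in the diagram.

```python
# A machine is (start, finals, delta); delta maps (state, symbol) -> state.
# A missing delta entry means "no move", i.e. the '?' component in the notes.
R1 = (1, {2}, {(1, "a"): 2, (2, "a"): 2})       # a+
R2 = (3, {4, 5}, {(3, "a"): 4, (4, "b"): 5})    # a | ab


def intersect(m1, m2):
    """Build the product machine by running both machines in lockstep."""
    (s1, f1, d1), (s2, f2, d2) = m1, m2
    symbols = {sym for (_, sym) in d1} | {sym for (_, sym) in d2}
    start = (s1, s2)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        p, q = todo.pop()
        for sym in symbols:
            # Keep a move only if BOTH components can move on this symbol;
            # otherwise one side would be '?' and the pair could never accept.
            if (p, sym) in d1 and (q, sym) in d2:
                nxt = (d1[(p, sym)], d2[(q, sym)])
                delta[((p, q), sym)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    # A pair is final only when both of its components are final.
    finals = {(p, q) for (p, q) in seen if p in f1 and q in f2}
    return start, finals, delta


def accepts(machine, text):
    """Run a machine on text; a missing transition rejects immediately."""
    start, finals, delta = machine
    state = start
    for ch in text:
        if (state, ch) not in delta:
            return False
        state = delta[(state, ch)]
    return state in finals
```

With both = intersect(R1, R2), the only reachable pairs are (1,3) and (2,4), and accepts(both, "a") is the only call among "a", "aa", "ab" that returns True, matching a+ intersection (a|ab) = a.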

Parsing and Context Free Grammar (CFG)



---->SCANNER--tokens-->PARSER--structural representation of the language-->

The parser uses a context-free grammar (CFG) the same way the scanner uses regular expressions.
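
As a rough illustration of that division of labor, here is a small sketch in which the scanner is driven by a regular expression and the parser by a grammar. The grammar S -> [ S ] | [ ] is an assumption chosen for this example because it generates the bracket language { [^i ]^i | i >= 1 } that regular expressions cannot describe; the function names scan, parse_s and parse are likewise made up here.

```python
import re

# Scanner side: one regular expression describes a token (optional
# whitespace followed by a single bracket).
TOKEN = re.compile(r"\s*([\[\]])")


def scan(text):
    """Turn raw characters into a list of '[' and ']' tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_s(tokens, pos=0):
    """Parser side: recognize S -> '[' S ']' | '[' ']'.

    Returns (tree, position of the next unread token).
    """
    if pos >= len(tokens) or tokens[pos] != "[":
        raise SyntaxError("expected '['")
    pos += 1
    if pos < len(tokens) and tokens[pos] == "[":
        inner, pos = parse_s(tokens, pos)   # the nested S
    else:
        inner = None                        # the '[' ']' alternative
    if pos >= len(tokens) or tokens[pos] != "]":
        raise SyntaxError("expected ']'")
    return ("S", inner), pos + 1


def parse(text):
    """Scanner feeds tokens to the parser; the result is a nested tree."""
    tokens = scan(text)
    tree, pos = parse_s(tokens)
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return tree
```

For instance, parse("[ [ ] ]") gives the nested tree ("S", ("S", None)), while parse("[[]") raises SyntaxError because the brackets do not balance.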