Symbolic Automata

Classic automata theory builds on the assumption that the alphabet is finite. Unfortunately, practical applications such as XML processing and program trace analysis use values for individual symbols that are typically drawn from an infinite domain. Even when the alphabet is finite, classic automata may sometimes be a bad choice: for example a deterministic finite automata modelling a language over the UTF16 alphabet requires 2^16 transitions out of each state!

What are Symbolic Automata and Transducers?

Symbolic Finite Automata (SFAs) are finite state automata in which the alphabet is given by a Boolean algebra that may have an infinite domain, and transitions are labeled with first-order predicates over such algebra. For example a symbolic automaton (shown on the right) can define the following property:
OddG1 = {l | l is a list of odd numbers with length greater than 1}
In order for SFAs to be closed under Boolean operations and preserve decidability of equivalence, it should be decidable to check whether predicates in the algebra are satisfiable. In the example above predicates are expressed in Presburger arithmetic which is indeed a decidable theory closed under Boolean operations. Symbolic Finite Transducers (SFTs) extend SFAs to output lists. In a SFT transitions, upon reading an input symbol, can compute an output that is expressed as a function of the input being read. Such a function has to belong to the underlying alphabet theory. Many variants of SFAs and SFTs have been proposed, and this page tries to keep up with such extensions.

How do they relate to classic Automata?

Symbolic Finite Automata are strictly more expressive than deterministic finite automata. Despite this fact, Symbolic Finite Automata are closed under Boolean operations and admit decidable equivalence. In general for large alphabets Symbolic Automata outperforms their classic counterpart. In fact even complex regular expressions over UTF16 can be analyzed using symbolic automata.

References

We recommend reading this paper to get started. You can also watch this talk. The purpose of this page is to keep track of the latest results related to this topic. Email me (loris at cs.wisc.edu) with comments and/or suggested additions.

Decision Problems and Closure Properties

Extensions to Trees and Nested Words

Other Extensions

Learning

Applications

Tools based on Symbolic Automata and Transducers