Symbolic Automata

Classic automata theory builds on the assumption that the alphabet is finite. Unfortunately, practical applications such as XML processing and program trace analysis use values for individual symbols that are typically drawn from an infinite domain. Even when the alphabet is finite, classic automata may sometimes be a bad choice: for example a deterministic finite automata modelling a language over the UTF16 alphabet requires 2^16 transitions out of each state!

What are Symbolic Automata and Transducers?

Symbolic Finite Automata (SFAs) are finite state automata in which the alphabet is given by a Boolean algebra that may have an infinite domain, and transitions are labeled with first-order predicates over such algebra. For example a symbolic automaton (shown on the right) can define the following property:
OddG1 = {l | l is a list of odd numbers with length greater than 1}
In order for SFAs to be closed under Boolean operations and preserve decidability of equivalence, it should be decidable to check whether predicates in the algebra are satisfiable. In the example above predicates are expressed in Presburger arithmetic which is indeed a decidable theory closed under Boolean operations. Symbolic Finite Transducers (SFTs) extend SFAs to output lists. In a SFT transitions, upon reading an input symbol, can compute an output that is expressed as a function of the input being read. Such a function has to belong to the underlying alphabet theory. Many variants of SFAs and SFTs have been proposed, and this page tries to keep up with such extensions.

How do they relate to classic Automata?

Symbolic Finite Automata are strictly more expressive than deterministic finite automata. Despite this fact, Symbolic Finite Automata are closed under Boolean operations and admit decidable equivalence. In general for large alphabets Symbolic Automata outperforms their classic counterpart. In fact even complex regular expressions over UTF16 can be analyzed using symbolic automata.

References

We recommend reading this paper to get started. You can also watch this talk. The purpose of this page is to keep track of the latest results related to this topic. Email me (loris at cs.wisc.edu) with comments and/or suggested additions.

Decision Problems and Closure Properties

  • Symbolic Finite State Transducers: Algorithms and Applications [ pdf ], N, Bjorner, P. Hooimeijer, B. Livshits, D. Molnar, M. Veanes, POPL12
  • Minimization of Symbolic Automata [ pdf ], L. D'Antoni, M. Veanes, POPL14
  • A Symbolic Decision Procedure for Symbolic Alternating Finite Automata [ pdf ], L. D'Antoni, Z. Kincaid, F. Wang, MFPS XXXIII
  • Forward Bisimulations for Nondeterministic Symbolic Finite Automata [ pdf ], L. D'Antoni, M. Veanes, TACAS 17
  • The power of symbolic automata and transducers [ pdf ], L. D'Antoni, M. Veanes, TACAS 17

Extensions to Trees and Nested Words

  • Symbolic Tree Transducers [ pdf ], M. Veanes, N. Bjorner, PSI11, WARNING! Theorem 1 is wrong.
  • Forward and Backward Application of Symbolic Tree Transducers [ pdf ], Z. Fulop, H. Vogler, Acta Informatica
  • Fast: A Transducer-Based Language for Tree Manipulation [ pdf ], L. D'Antoni, M. Veanes, B. Livshits, D. Molnar TOPLAS
  • Symbolic Visibyly Pushdown Automata [ pdf ], L. D'Antoni, R. Alur, CAV14
  • Minimization of Symbolic Tree Automata [ pdf ], L. D'Antoni, M. Veanes, LICS16

Other Extensions

  • Extended Symbolic Finite Automata and Transducers [ link ], L. D'Antoni, and M. Veanes, FMSD 15
  • A Theory of Synchronous Relational Interfaces (Section 6) [ pdf ], S. Tripakis, B. Lickly, T. A. Henzinger, E. A. Lee, TOPLAS11
  • Monadic second-order logic on finite sequences L. D'Antoni, M. Veanes POPL 17
  • Abstract Symbolic Automata: Mixed syntactic/semantic similarity analysis of executables M. Dalla Preda, R. Giacobazzi, A. Lakhotia, I. Mastroieni POPL 17

Learning

  • Sigma*: Symbolic Learning of Input-Output Specifications [ pdf ], M. Botinkan, D. Babic, POPL13
  • Learning Regular Languages over Large Alphabets [ pdf ], O. Maler, I. E. Mens, TACAS14
  • Learning Symbolic Automata, [ pdf ], S. Drews and L. D'Antoni, TACAS 17

Applications

  • Symbolic Automata Constraint Solving [ pdf ], M. Veanes, N. Bjorner, L. de Moura, LPAR10
  • From Sequential Extended Regular Expressions to NFA with Symbolic Labels [ pdf ], Alessandro Cimatti, Sergio Mover, Marco Roveri, Stefano Tonetta, CIAA10
  • Symbolic Automata: The Toolkit [ pdf ], M. Veanes, N. Bjorner, TACAS11
  • Fast and Precise Sanitizer Analysis with BEK [ pdf ], P. Hooimeijer, B. Livshits, D. Molnar, P. Saxena, M. Veanes, USENIX Security'11
  • Static Analysis of String Encoders and Decoders [ pdf ], L. D'Antoni, M. Veanes,, VMCAI13
  • Applications of Symbolic Finite Automata [ pdf ], M. Veanes, CIAA13
  • Data-Parallel String-Manipulating Programs [ pdf ], M. Veanes, D. Molnar, T. Mytkowicz, B. Livshits POPL 2015
  • DReX: A Declarative Language for Efficiently Evaluating Regular String Transformations [ pdf ], R. Alur, L. D'Antoni, M. Raghothaman, POPL 2015
  • Automatic Program Inversion using Symbolic Transducers [ pdf ], Q. Hu and L. D'Antoni, PLDI 2017

Tools based on Symbolic Automata and Transducers

  • symbolicautomata: a Java symbolic automata library. L. D'Antoni
  • Microsoft Automata Library: a library for symbolic automata. M. Veanes, N. Bjorner
  • Rex: a command line tool that generates matching strings for .NET regexes. M. Veanes
  • Bek: a transducer-based language for string manipulation. M. Veanes, P. Hooimeijer, B. Livshits, D. Molnar, N. Bjorner
  • Bex: a transducer-based language for analysis of string encoders and decoders. L. D'Antoni, M. Veanes
  • Fast: a transducer-based Language for analysis of tree manipulating programs. L. D'Antoni, M. Veanes, B. Livshits, D. Molnar
  • Mona: A solver for Monadic Second-Order logic. Aarhus University