Symbolic Automata
Classic automata theory builds on the assumption that the alphabet is finite.
Unfortunately, practical applications such as XML processing and program trace analysis use values for individual symbols
that are typically drawn from an infinite domain.
Even when the alphabet is finite, classic automata may sometimes be a bad choice: for example a deterministic finite automata
modelling a language over the UTF16 alphabet requires 2^16 transitions out of each state!
What are Symbolic Automata and Transducers?
Symbolic Finite Automata (SFAs) are finite state automata in which the alphabet is given by a Boolean algebra that
may have an infinite domain, and transitions are labeled with firstorder predicates over such algebra.
For example a symbolic automaton (shown on the right) can define the following property:
OddG1 = {l  l is a list of odd numbers with length greater than 1}


In order for SFAs to be closed under Boolean operations and preserve decidability of equivalence,
it should be decidable to check whether predicates in the algebra are satisfiable.
In the example above predicates are expressed in Presburger arithmetic which is
indeed a decidable theory closed under Boolean operations.
Symbolic Finite Transducers (SFTs) extend SFAs to output lists. In a SFT
transitions, upon reading an input symbol, can compute an output that
is expressed as a function of the input being read.
Such a function has to belong to the underlying alphabet theory.
Many variants of SFAs and SFTs have been proposed, and this page tries to keep up with such extensions.
How do they relate to classic Automata?
Symbolic Finite Automata are strictly more expressive than deterministic finite automata.
Despite this fact,
Symbolic Finite Automata are closed under Boolean operations and admit decidable equivalence.
In general for large alphabets Symbolic Automata outperforms their classic counterpart.
In fact even complex regular expressions over UTF16 can be analyzed using symbolic automata.
References
We recommend reading
this paper to get started.
You can also watch this
talk.
The purpose of this page is to keep track of the latest results related to this topic. Email me (loris at cs.wisc.edu) with comments and/or suggested additions.
Decision Problems and Closure Properties

Symbolic Finite State Transducers:
Algorithms and Applications
[ pdf ],
N, Bjorner, P. Hooimeijer, B. Livshits, D. Molnar, M. Veanes,
POPL12

Minimization of Symbolic Automata
[ pdf ],
L. D'Antoni, M. Veanes,
POPL14

A Symbolic Decision Procedure for
Symbolic Alternating Finite Automata
[ pdf ],
L. D'Antoni, Z. Kincaid, F. Wang,
MFPS XXXIII

Forward Bisimulations for Nondeterministic Symbolic Finite Automata
[ pdf ],
L. D'Antoni, M. Veanes,
TACAS 17

The power of symbolic
automata and transducers
[ pdf ],
L. D'Antoni, M. Veanes,
TACAS 17

Theoretical Aspects of Symbolic Automata
[ pdf ],
H. Tamm, M. Veanes,
SOFSEM 18
Extensions to Trees and Nested Words

Symbolic Tree Transducers
[ pdf ],
M. Veanes, N. Bjorner,
PSI11,
WARNING! Theorem 1 is wrong.

Forward and Backward Application of
Symbolic Tree Transducers
[ pdf ],
Z. Fulop, H. Vogler,
Acta Informatica

Fast: A TransducerBased Language for Tree Manipulation
[ pdf ],
L. D'Antoni, M. Veanes, B. Livshits, D. Molnar
TOPLAS

Symbolic Visibyly Pushdown Automata
[ pdf ],
L. D'Antoni, R. Alur,
CAV14

Minimization of Symbolic Tree Automata
[ pdf ],
L. D'Antoni, M. Veanes,
LICS16
Other Extensions

Extended Symbolic Finite Automata and Transducers
[ link ],
L. D'Antoni, and M. Veanes,
FMSD 15

A Theory of Synchronous Relational Interfaces (Section 6)
[ pdf ],
S. Tripakis, B. Lickly, T. A. Henzinger, E. A. Lee,
TOPLAS11

Monadic secondorder logic on finite sequences [ pdf ]
L. D'Antoni, M. Veanes
POPL 17

Abstract Symbolic Automata: Mixed syntactic/semantic similarity analysis of executables
M. Dalla Preda, R. Giacobazzi, A. Lakhotia, I. Mastroieni
POPL 17

Symbolic Register Automata
[ pdf ]
L. D'Antoni, T. Ferreira, M. Sammartino, A. Silva
CAV 19
Learning

Learning Regular Languages over Large Alphabets
[ pdf ],
O. Maler, I. E. Mens,
TACAS14

Learning Symbolic Automata,
[ pdf ],
S. Drews and L. D'Antoni,
TACAS 17

The Learnability of Symbolic Automata,
[ pdf ],
G. Argyros and L. D'Antoni,
CAV 18

Verifying Sanitizer Correctness through BlackBox Learning: A Symbolic Finite Transducer Approach,
[ pdf ],
S. Lathouwers, M. Everts, M. Huisman,
FORSE 20
Applications

Symbolic Automata Constraint Solving [ pdf ],
M. Veanes, N. Bjorner, L. de Moura,
LPAR10

From Sequential Extended Regular Expressions to NFA with Symbolic Labels
[ pdf ],
Alessandro Cimatti, Sergio Mover, Marco Roveri, Stefano Tonetta,
CIAA10

Symbolic Automata: The Toolkit
[ pdf ],
M. Veanes, N. Bjorner,
TACAS11

Fast and Precise Sanitizer Analysis with BEK [ pdf ],
P. Hooimeijer, B. Livshits, D. Molnar, P. Saxena, M. Veanes,
USENIX Security'11

Static Analysis of String Encoders and Decoders
[ pdf ],
L. D'Antoni, M. Veanes,,
VMCAI13

Applications of Symbolic Finite Automata
[ pdf ],
M. Veanes,
CIAA13

DataParallel StringManipulating Programs
[ pdf ],
M. Veanes, D. Molnar, T. Mytkowicz, B. Livshits
POPL 2015

DReX: A Declarative Language for Efficiently Evaluating
Regular String Transformations
[ pdf ],
R. Alur, L. D'Antoni, M. Raghothaman,
POPL 2015

Back in Black: Towards Formal, Black Box Analysis of Sanitizers and Filters
[ web ],
G. Argyros, I. Stais, A. Kiayias, A. Keromytis,
S&P 2016

SFADiff: Automated Evasion Attacks and Fingerprinting Using Blackbox Differential Automata Learning
[ web ],
G. Argyros, I. Stais, S. Jana, A. Keromytis, A. Kiayias,
CCS 2016

Automatic Program Inversion using Symbolic Transducers
[ pdf ],
Q. Hu and L. D'Antoni,
PLDI 2017

An Algebraic Framework for Runtime Verification
[ web ],
S. Jaksic, E. Bartocci, R. Grosu, D. Nickovic,
TCAD 2018

Aragog: Scalable Runtime Verification of Shardable Networked Systems
[ web ],
N. Yassen, B. Arzani, R. Beckett, S. Ciraci, B. Liu,
OSDI 2020
Tools based on Symbolic Automata and Transducers

symbolicautomata: a Java symbolic automata library. L. D'Antoni

Microsoft Automata Library: a library for symbolic automata.
M. Veanes, N. Bjorner

Rex: a command line tool that generates matching strings for .NET regexes. M. Veanes

Bek: a transducerbased language for
string manipulation. M. Veanes, P. Hooimeijer, B. Livshits, D. Molnar, N. Bjorner

Bex: a transducerbased language for analysis
of string encoders and decoders. L. D'Antoni, M. Veanes

Fast: a transducerbased Language for analysis of tree manipulating
programs. L. D'Antoni, M. Veanes, B. Livshits, D. Molnar

Mona: A solver for Monadic SecondOrder logic. Aarhus University