Symbolic Automata
Classic automata theory builds on the assumption that the alphabet is finite.
Unfortunately, practical applications such as XML processing and program trace analysis use values for individual symbols
that are typically drawn from an infinite domain.
Even when the alphabet is finite, classic automata may sometimes be a bad choice: for example a deterministic finite automata
modelling a language over the UTF16 alphabet requires 2^16 transitions out of each state!
What are Symbolic Automata and Transducers?
Symbolic Finite Automata (SFAs) are finite state automata in which the alphabet is given by a Boolean algebra that
may have an infinite domain, and transitions are labeled with first-order predicates over such algebra.
For example a symbolic automaton (shown on the right) can define the following property:
OddG1 = {l | l is a list of odd numbers with length greater than 1}
|
|
In order for SFAs to be closed under Boolean operations and preserve decidability of equivalence,
it should be decidable to check whether predicates in the algebra are satisfiable.
In the example above predicates are expressed in Presburger arithmetic which is
indeed a decidable theory closed under Boolean operations.
Symbolic Finite Transducers (SFTs) extend SFAs to output lists. In a SFT
transitions, upon reading an input symbol, can compute an output that
is expressed as a function of the input being read.
Such a function has to belong to the underlying alphabet theory.
Many variants of SFAs and SFTs have been proposed, and this page tries to keep up with such extensions.
How do they relate to classic Automata?
Symbolic Finite Automata are strictly more expressive than deterministic finite automata.
Despite this fact,
Symbolic Finite Automata are closed under Boolean operations and admit decidable equivalence.
In general for large alphabets Symbolic Automata outperforms their classic counterpart.
In fact even complex regular expressions over UTF16 can be analyzed using symbolic automata.
References
We recommend reading
this paper to get started.
You can also watch this
talk.
The purpose of this page is to keep track of the latest results related to this topic. Email me (loris at cs.wisc.edu) with comments and/or suggested additions.
Decision Problems and Closure Properties
-
Symbolic Finite State Transducers:
Algorithms and Applications
[ pdf ],
N, Bjorner, P. Hooimeijer, B. Livshits, D. Molnar, M. Veanes,
POPL12
-
Minimization of Symbolic Automata
[ pdf ],
L. D'Antoni, M. Veanes,
POPL14
-
A Symbolic Decision Procedure for
Symbolic Alternating Finite Automata
[ pdf ],
L. D'Antoni, Z. Kincaid, F. Wang,
MFPS XXXIII
-
Forward Bisimulations for Nondeterministic Symbolic Finite Automata
[ pdf ],
L. D'Antoni, M. Veanes,
TACAS 17
-
The power of symbolic
automata and transducers
[ pdf ],
L. D'Antoni, M. Veanes,
TACAS 17
-
Theoretical Aspects of Symbolic Automata
[ pdf ],
H. Tamm, M. Veanes,
SOFSEM 18
Extensions to Trees and Nested Words
-
Symbolic Tree Transducers
[ pdf ],
M. Veanes, N. Bjorner,
PSI11,
WARNING! Theorem 1 is wrong.
-
Forward and Backward Application of
Symbolic Tree Transducers
[ pdf ],
Z. Fulop, H. Vogler,
Acta Informatica
-
Fast: A Transducer-Based Language for Tree Manipulation
[ pdf ],
L. D'Antoni, M. Veanes, B. Livshits, D. Molnar
TOPLAS
-
Symbolic Visibyly Pushdown Automata
[ pdf ],
L. D'Antoni, R. Alur,
CAV14
-
Minimization of Symbolic Tree Automata
[ pdf ],
L. D'Antoni, M. Veanes,
LICS16
Other Extensions
-
Extended Symbolic Finite Automata and Transducers
[ link ],
L. D'Antoni, and M. Veanes,
FMSD 15
-
A Theory of Synchronous Relational Interfaces (Section 6)
[ pdf ],
S. Tripakis, B. Lickly, T. A. Henzinger, E. A. Lee,
TOPLAS11
-
Monadic second-order logic on finite sequences [ pdf ]
L. D'Antoni, M. Veanes
POPL 17
-
Abstract Symbolic Automata: Mixed syntactic/semantic similarity analysis of executables
M. Dalla Preda, R. Giacobazzi, A. Lakhotia, I. Mastroieni
POPL 17
-
Symbolic Register Automata
[ pdf ]
L. D'Antoni, T. Ferreira, M. Sammartino, A. Silva
CAV 19
Learning
-
Learning Regular Languages over Large Alphabets
[ pdf ],
O. Maler, I. E. Mens,
TACAS14
-
Learning Symbolic Automata,
[ pdf ],
S. Drews and L. D'Antoni,
TACAS 17
-
The Learnability of Symbolic Automata,
[ pdf ],
G. Argyros and L. D'Antoni,
CAV 18
-
Verifying Sanitizer Correctness through Black-Box Learning: A Symbolic Finite Transducer Approach,
[ pdf ],
S. Lathouwers, M. Everts, M. Huisman,
FORSE 20
Applications
-
Symbolic Automata Constraint Solving [ pdf ],
M. Veanes, N. Bjorner, L. de Moura,
LPAR10
-
From Sequential Extended Regular Expressions to NFA with Symbolic Labels
[ pdf ],
Alessandro Cimatti, Sergio Mover, Marco Roveri, Stefano Tonetta,
CIAA10
-
Symbolic Automata: The Toolkit
[ pdf ],
M. Veanes, N. Bjorner,
TACAS11
-
Fast and Precise Sanitizer Analysis with BEK [ pdf ],
P. Hooimeijer, B. Livshits, D. Molnar, P. Saxena, M. Veanes,
USENIX Security'11
-
Static Analysis of String Encoders and Decoders
[ pdf ],
L. D'Antoni, M. Veanes,,
VMCAI13
-
Applications of Symbolic Finite Automata
[ pdf ],
M. Veanes,
CIAA13
-
Data-Parallel String-Manipulating Programs
[ pdf ],
M. Veanes, D. Molnar, T. Mytkowicz, B. Livshits
POPL 2015
-
DReX: A Declarative Language for Efficiently Evaluating
Regular String Transformations
[ pdf ],
R. Alur, L. D'Antoni, M. Raghothaman,
POPL 2015
-
Back in Black: Towards Formal, Black Box Analysis of Sanitizers and Filters
[ web ],
G. Argyros, I. Stais, A. Kiayias, A. Keromytis,
S&P 2016
-
SFADiff: Automated Evasion Attacks and Fingerprinting Using Black-box Differential Automata Learning
[ web ],
G. Argyros, I. Stais, S. Jana, A. Keromytis, A. Kiayias,
CCS 2016
-
Automatic Program Inversion using Symbolic Transducers
[ pdf ],
Q. Hu and L. D'Antoni,
PLDI 2017
-
An Algebraic Framework for Runtime Verification
[ web ],
S. Jaksic, E. Bartocci, R. Grosu, D. Nickovic,
TCAD 2018
-
Aragog: Scalable Runtime Verification of Shardable Networked Systems
[ web ],
N. Yassen, B. Arzani, R. Beckett, S. Ciraci, B. Liu,
OSDI 2020
Tools based on Symbolic Automata and Transducers
-
symbolicautomata: a Java symbolic automata library. L. D'Antoni
-
Microsoft Automata Library: a library for symbolic automata.
M. Veanes, N. Bjorner
-
Rex: a command line tool that generates matching strings for .NET regexes. M. Veanes
-
Bek: a transducer-based language for
string manipulation. M. Veanes, P. Hooimeijer, B. Livshits, D. Molnar, N. Bjorner
-
Bex: a transducer-based language for analysis
of string encoders and decoders. L. D'Antoni, M. Veanes
-
Fast: a transducer-based Language for analysis of tree manipulating
programs. L. D'Antoni, M. Veanes, B. Livshits, D. Molnar
-
Mona: A solver for Monadic Second-Order logic. Aarhus University