This research was conducted by Cathrin Weiss, Cindy Rubio González, and Ben Liblit. The paper appeared in the ACM SIGSOFT / IEEE TCSE 37th International Conference on Software Engineering (ICSE 2015).
Software is rapidly increasing in size and complexity. Static analyses must be designed to scale well if they are to be usable with realistic applications, but prior efforts have often been limited by available memory. We propose a database-backed strategy for large program analysis based on graph algorithms, using a Semantic Web database to manage representations of the program under analysis. Our approach is applicable to a variety of interprocedural finite distributive subset (IFDS) dataflow problems; we focus on error propagation as a motivating example. Our implementation analyzes multi-million-line programs quickly and in just a fraction of the memory required by prior approaches. When memory alone is insufficient, our approach falls back on disk using several hybrid configurations tuned to put all available resources to good use.
The full paper is available as a single PDF document. A suggested BibTeX citation record is also available.