Analyzing Memory Ownership Patterns in C Libraries

This paper appeared in the ACM SIGPLAN International Symposium on Memory Management (ISMM) 2013.

This research was conducted by Tristan Ravitch and Ben Liblit.

The source code is available under the BSD license on GitHub.

Abstract

Programs written in multiple languages are known as polyglot programs. In part due to the proliferation of new and productive high-level programming languages, these programs are becoming more common in environments that must interoperate with existing systems. Polyglot programs must manage resource lifetimes across language boundaries. Resource lifetime management bugs can lead to leaks and crashes, which are more difficult to debug in polyglot programs than in monoglot programs.

We present analyses to automatically infer the ownership semantics of C libraries. The results of these analyses can be used to generate bindings to C libraries that intelligently manage resources, to check the correctness of polyglot programs, and to document the interfaces of C libraries. While these analyses are unsound and incomplete, we demonstrate that they significantly reduce the manual annotation burden for a suite of fifteen open-source libraries.
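As an illustration of what "ownership semantics" can mean for a C interface, the following is a minimal sketch and is not taken from the paper or its test suite. Every name in it (buffer, buffer_new, buffer_free, queue, queue_push, queue_clear, demo) is hypothetical. It shows three conventions that a binding generator would need to know about: a function that returns a freshly allocated object owned by the caller, a function that releases such an object, and a function that takes ownership of its argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library type; not from the paper. */
typedef struct buffer {
    char *data;
    size_t len;
} buffer;

/* Allocator: returns a new object that the caller owns. */
buffer *buffer_new(const char *text) {
    buffer *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->len = strlen(text);
    b->data = malloc(b->len + 1);
    if (b->data == NULL) { free(b); return NULL; }
    memcpy(b->data, text, b->len + 1);
    return b;
}

/* Finalizer: releases the object and the memory it owns. */
void buffer_free(buffer *b) {
    if (b == NULL) return;
    free(b->data);
    free(b);
}

/* Hypothetical container; holds at most one buffer for brevity. */
typedef struct queue {
    buffer *slot;
} queue;

/* Ownership transfer: on success (0) the queue owns b and the caller
 * must not free it. On failure (-1) the caller keeps ownership. */
int queue_push(queue *q, buffer *b) {
    if (q->slot != NULL) return -1;
    q->slot = b;
    return 0;
}

/* Releases whatever the queue currently owns. */
void queue_clear(queue *q) {
    buffer_free(q->slot);
    q->slot = NULL;
}

/* Exercises the three conventions; returns 0 if they behave as described. */
int demo(void) {
    queue q = { NULL };
    buffer *a = buffer_new("first");
    buffer *b = buffer_new("second");
    if (a == NULL || b == NULL) {
        buffer_free(a);
        buffer_free(b);
        return -1;
    }

    int accepted = queue_push(&q, a);   /* queue now owns a */
    assert(accepted == 0 && q.slot == a);

    int rejected = queue_push(&q, b);   /* queue is full: we still own b */
    if (rejected != 0) buffer_free(b);

    queue_clear(&q);                    /* frees a exactly once */
    return rejected == -1 ? 0 : 1;
}
```

None of these rules is visible in the C prototypes themselves. A caller in another language that freed `a` after a successful `queue_push` would double-free it, and one that never freed `b` after a failed push would leak it. That is the kind of manual annotation burden the abstract refers to.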

Paper

The full paper is available in PDF or PostScript.