Using SD-Dyninst to Study Malware Obfuscations
This page describes an ongoing research project that is supported
by grants from DOE, DHS, NSF, and
PC anti-virus reviews.
Security analysts' understanding of the behavior and
intent of malware samples depends on their ability to
build high-level analysis artifacts from the raw bytes of program binaries.
Thus, the first step in analyzing defensive malware is understanding
what obfuscations are present in real-world malware binaries.
To this end, we present a thorough examination of the obfuscation
techniques used by the packer tools that are most
popular with malware authors [Bustamante 2008].
Though previous studies have discussed the current state of
binary packing [Yason 2007], anti-debugging [Falliere 2007], and
anti-unpacking [Ferrie 2008] techniques, there have been no
comprehensive studies of the obfuscation techniques that are
applied to binary code.
While some of the individual obfuscations that we discuss have
been reported independently, this paper consolidates the
discussion while adding substantial depth and breadth to it.
We describe obfuscations that make binary code difficult
to discover
(e.g., control-transfer obfuscations,
exception-based control transfers,
incremental code unpacking);
to accurately disassemble into instructions
(e.g., ambiguous code and data);
to structure into functions and basic blocks
(e.g., obfuscated calls and returns,
overlapping functions and basic blocks);
to understand
(e.g., obfuscated constants);
and to manipulate.
We also discuss how to mitigate the impact of these
obfuscations on analysis tools such as disassemblers, decompilers,
slicers, instrumenters, and emulators.
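To make the "ambiguous code and data" category concrete, the following minimal sketch contrasts two disassembly strategies. It uses a made-up four-opcode instruction set rather than x86, and the names (LENGTHS, CODE, linear_sweep, traverse) are illustrative assumptions, not part of Dyninst or of any packer. A single junk byte placed after an unconditional jump causes a linear sweep to decode a bogus multi-byte instruction that swallows the real code, while a traversal that follows control flow skips the junk byte.

```python
# Toy instruction set (hypothetical, for illustration only; not x86).
# opcode -> (mnemonic, total instruction length in bytes)
LENGTHS = {
    0x01: ("nop", 1),
    0x02: ("jmp", 2),   # jmp rel8: target = address of next instruction + rel8
    0x03: ("ret", 1),
    0x04: ("push", 5),  # push imm32: one opcode byte plus four operand bytes
}

# addr 0: jmp +1   -> jumps over the junk byte to address 3
# addr 2: 0x04     -> junk data byte that happens to look like a "push" opcode
# addr 3: nop, addr 4: nop, addr 5: ret   (the real code)
# addr 6..7: padding that is never executed
CODE = bytes([0x02, 0x01, 0x04, 0x01, 0x01, 0x03, 0x01, 0x03])


def decode(code, addr):
    """Return (mnemonic, length) for the instruction at addr, or None."""
    entry = LENGTHS.get(code[addr])
    if entry is None or addr + entry[1] > len(code):
        return None
    return entry


def linear_sweep(code):
    """Decode instructions back to back, starting at address 0."""
    addr, found = 0, []
    while addr < len(code):
        insn = decode(code, addr)
        if insn is None:
            addr += 1  # skip an undecodable byte
            continue
        found.append(addr)
        addr += insn[1]
    return found


def traverse(code, entry=0):
    """Decode only addresses reachable by following control flow."""
    found, worklist = set(), [entry]
    while worklist:
        addr = worklist.pop()
        while 0 <= addr < len(code) and addr not in found:
            insn = decode(code, addr)
            if insn is None:
                break
            name, length = insn
            found.add(addr)
            if name == "ret":
                break  # no successors
            if name == "jmp":
                # Unconditional jump: follow the target, no fall-through.
                worklist.append(addr + length + code[addr + 1])
                break
            addr += length
    return sorted(found)
```

On CODE, linear_sweep reports instruction starts at addresses 0, 2, and 7, so the bogus "push" at address 2 hides the real instructions at 3, 4, and 5. traverse reports 0, 3, 4, and 5. Real packers combine this trick with control-transfer obfuscations that also defeat a purely static traversal, which is one reason static parsing alone is not enough.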
This work is done in the context of our project to build tools for the
analysis [Jacobson et al. 2011; Rosenblum et al. 2008] and instrumentation
[Bernat and Miller 2011; Hollingsworth et al. 1994] of binaries, and of our recent work
on extending these analyses to malware binaries that are highly defensive
[Bernat et al. 2011; Roundy and Miller 2010].
We use a combination of manual and automated analysis techniques for
this study. We began by creating a set of defensive program binaries
that incorporate all of the anti-analysis techniques found in real
obfuscated malware. We created these binaries by obtaining the latest
versions of the binary packer and protector tools that are most
popular with malware authors [Bustamante 2008] and applying them to
program binaries of various sizes. We carefully analyzed the binaries,
paying special attention to the obfuscated bootstrap code with which
the modified program unrolls the original binary payload into the
address space, and to any changes that the obfuscation tool made to
the payload code itself.
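The role of the bootstrap code can be illustrated with a deliberately simplified sketch. It assumes a toy "packer" that compresses the payload with zlib and applies a single-byte XOR; the key, the function names, and the bytearray standing in for the address space are all hypothetical, and real packer tools use far more elaborate, multi-stage schemes.

```python
import zlib

KEY = 0x5A  # hypothetical single-byte XOR key, for illustration only


def pack(payload: bytes) -> bytes:
    """Toy packer: compress the payload, then XOR every byte with KEY."""
    return bytes(b ^ KEY for b in zlib.compress(payload))


def bootstrap(packed: bytes, memory: bytearray, base: int) -> int:
    """Toy bootstrap: undo the XOR, decompress, and write the payload
    into the simulated address space starting at base.

    Returns the address at which the unpacked payload begins, which is
    where a real bootstrap would transfer control next.
    """
    payload = zlib.decompress(bytes(b ^ KEY for b in packed))
    if base < 0 or base + len(payload) > len(memory):
        raise ValueError("payload does not fit in the simulated address space")
    memory[base:base + len(payload)] = payload
    return base
```

A static view of the packed file contains only the encoded bytes; the original payload bytes exist only in memory after the bootstrap has run, which is why an analysis tool has to observe the program at run time to see the payload code at all.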
We obtained most of our observations on these obfuscated binaries by
adapting the Dyninst binary code analysis and instrumentation tool for
the analysis of highly defensive program binaries, and then using it
for that purpose [Bernat et al. 2011; Roundy and Miller 2010]. Our
design and development process required a detailed understanding of
the obfuscation techniques employed by these packers, which resisted
our attempts to discover, analyze, monitor, and modify their code. Our
ambitious analysis and instrumentation goals made this a significant
challenge. Dyninst applies parsing techniques to disassemble
obfuscated code and construct control-flow graphs (CFGs) of the
program, updating this analysis at runtime by observing the behavior
of the monitored program. Based on this analysis of the defensive
binaries, we stress-tested our tool's analysis and instrumentation
techniques by instrumenting every memory access and every basic block
in the program. Our instrumentation tool is designed to be resistant
to any errors in the analysis [Bernat et al. 2011]; however, our
initial prototype was not, and therefore ran head-on into nearly every
obfuscation technique employed by these programs [Roundy and Miller
2010]. We automatically generate statistical reports of defensive
techniques employed by these packer tools with our malware-resistant
version of Dyninst and present those results in this study. We also
spent considerable time perusing each binary's obfuscated code by hand
in the process of getting Dyninst to successfully analyze these
binaries, aided by the OllyDbg and IDA Pro interactive debuggers
(Dyninst does not have a code-viewing GUI). In particular, we
systematically studied the bootstrap code of each packed binary to
achieve a thorough understanding of its overall behavior and
high-level anti-analysis techniques.
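The idea of parsing statically and then updating the control-flow graph from observed runtime behavior can be sketched as follows. This is a minimal illustration under stated assumptions, not Dyninst's implementation or API: the program is a hypothetical table of pre-decoded instructions, the graph is kept at instruction rather than basic-block granularity, and the class and method names (HybridCFG, parse, observe_transfer) are invented for this example.

```python
from collections import defaultdict

# Hypothetical pre-decoded program: address -> (mnemonic, static target).
PROGRAM = {
    0: ("nop", None),
    1: ("jmp", 3),      # direct jump: target is known statically
    3: ("ijmp", None),  # indirect jump: target is only known at run time
    7: ("nop", None),   # reachable only through the indirect jump
    8: ("ret", None),
}


class HybridCFG:
    """Control-flow graph built statically and extended at run time."""

    def __init__(self, program):
        self.program = program
        self.parsed = set()            # addresses already analyzed
        self.edges = defaultdict(set)  # source address -> successor addresses
        self.unresolved = set()        # indirect transfers awaiting a target

    def parse(self, entry):
        """Static step: follow every statically known transfer from entry."""
        worklist = [entry]
        while worklist:
            addr = worklist.pop()
            if addr in self.parsed or addr not in self.program:
                continue
            self.parsed.add(addr)
            name, target = self.program[addr]
            if name == "ret":
                continue
            if name == "ijmp":
                # Target cannot be determined statically; wait for run time.
                self.unresolved.add(addr)
                continue
            successor = target if name == "jmp" else addr + 1
            self.edges[addr].add(successor)
            worklist.append(successor)

    def observe_transfer(self, source, target):
        """Dynamic step: called when the monitored program is seen to
        transfer control from source to target. Returns True if the
        target was previously unanalyzed code."""
        is_new = target not in self.parsed
        self.edges[source].add(target)
        self.unresolved.discard(source)
        if is_new:
            self.parse(target)  # extend the static analysis from the new code
        return is_new
```

After parse(0), only addresses 0, 1, and 3 are known and the indirect jump at 3 is recorded as unresolved. A later call to observe_transfer(3, 7) adds the missing edge and triggers parsing of addresses 7 and 8, so the graph grows as the program reveals its behavior.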
Bernat, A. R. and Miller, B. P. 2011. Anywhere, Any Time Binary Instrumentation. In Workshop on Program Analysis for Software Tools and Engineering (PASTE). Szeged, Hungary.
Bernat, A. R., Roundy, K. A., and Miller, B. P. 2011. Efficient, Sensitivity Resistant Binary Instrumentation. In International Symposium on Software Testing and Analysis (ISSTA). Toronto, Canada.
Bustamante, P. 2008. Packer (r)evolution. Panda Research web article.
Falliere, N. 2007. Windows anti-debug reference. Infocus web article.
Ferrie, P. 2008. Anti-unpacker tricks. In International CARO Workshop. Amsterdam, Netherlands.
Hollingsworth, J. K., Miller, B. P., and Cargille, J. 1994. Dynamic program instrumentation for scalable performance tools. In Scalable High Performance Computing Conference. Knoxville, TN.
Jacobson, E. R., Rosenblum, N. E., and Miller, B. P. 2011. Labeling library functions in stripped binaries. In Workshop on Program Analysis for Software Tools and Engineering (PASTE). Szeged, Hungary.
Rosenblum, N. E., Zhu, X., Miller, B. P., and Hunt, K. 2008. Learning to analyze binary computer code. In Conference on Artificial Intelligence (AAAI). Chicago, IL.
Roundy, K. A. and Miller, B. P. 2010. Hybrid analysis and control of malware. In Symposium on Recent Advances in Intrusion Detection (RAID). Ottawa, Canada.
Yason, M. V. 2007. The art of unpacking. In Blackhat USA. Las Vegas, NV.