Andrea Arpaci-Dusseau's Research Summary
Very Short Research Statement

Large new software systems are not built from scratch, but instead leverage existing software. One difficulty with using existing software is that it may not always behave as desired -- it may have performance, reliability, or security problems. My research addresses the problem of building layered systems from existing software that cannot be modified. Building layered systems can be simplified by treating each layer as a gray box. In a gray-box system, one starts with basic knowledge of how a layer is likely to be implemented; one then builds successively refined models of the layer by observing how the layer responds to inputs at run-time. Gray-box knowledge allows one to innovate in systems when one is unable to change either an interface or the implementation of a layer. Specifically, gray-box knowledge allows one both to acquire information about the internal state of an existing layer and to control its behavior in unexpectedly powerful ways.

We have investigated gray-box systems in three important domains: user-level processes interacting with gray-box commodity operating systems, storage systems (e.g., RAIDs or single disks) interacting with gray-box file systems, and virtual machine monitors (VMMs) interacting with gray-box operating systems. Across all of these domains, we have developed fingerprinting tools to automatically infer and characterize the behavior of existing layers. We have also shown how gray-box knowledge can be used to improve performance, reliability, and security.

Very Long Research Statement

As systems become more complex, are implemented by more developers, and contain more lines of code, it becomes increasingly less likely that any single person can understand how a system behaves.
Understanding how our computer systems behave is of utmost importance for developers, administrators, and users -- all must be able to identify when a system is not behaving as expected, whether to fix a bug, re-configure a parameter, or switch to an entirely different system. The challenge for developers is even greater. Large new software systems are not built from scratch, but instead leverage existing software. One difficulty with using existing software is that it may not always behave as desired -- it may have performance, security, or reliability problems in certain environments. When existing software, and even the interfaces to that software, cannot be changed, then developers must figure out how to use the existing code appropriately (e.g., by either adapting to the existing software or by subtly controlling its behavior).
My research at the University of Wisconsin addresses both the problems of understanding complex systems and of building layered systems from existing software. My thesis is that these problems can be simplified by treating each layer of the system as a gray box.
My research on gray-box systems can be roughly divided into two areas. First, we have developed a range of fingerprinting tools that automatically infer the behavior of existing layers. Second, we have built new systems that use gray-box knowledge to adapt to, or even control, the layers around them.

I have been investigating layered systems primarily in two domains. The first domain consists of user-level processes interacting with commodity operating systems; in this case, we have assumed that the user processes view the OS as the gray box. The second domain consists of the file system interacting with the storage system, whether a RAID or a single disk; in this domain, we have investigated both the file system viewing the storage system as a gray box and the storage system viewing the file system as a gray box. This last instance is the environment in which we have performed our most in-depth research; we refer to this type of storage system as a semantically-smart disk system (SDS).

In this document, I summarize our research on understanding and building layered systems. I first describe our work developing techniques to automatically infer complex system behavior. I then summarize our results from using gray-box knowledge to build new systems.

Fingerprinting Existing Systems
We have been developing techniques for automatically characterizing, or
fingerprinting, the behavior of software systems. The fingerprinting
software starts with high-level knowledge of how the system is likely
to be implemented, and then constructs probes and observes the
resulting outputs from that component (e.g., the time required for a
particular request). The fingerprinting code can successively refine
its hypotheses by performing increasingly specific tests.
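To make this probe-and-observe loop concrete, here is a minimal sketch of inferring the size of a layer's cache by growing a working set until re-reads stop fitting. The code runs against a simulated layer with an invented hidden capacity and latency values; a real fingerprint would time actual requests and contend with noise, but the successive refinement follows this pattern.

```python
class OpaqueLayer:
    """Stand-in for an unmodifiable layer with a hidden LRU block cache."""
    def __init__(self, cache_blocks):
        self._capacity = cache_blocks   # hidden from the fingerprinting code
        self._lru = []                  # most recently used block at the end
    def read(self, block):
        """Return an observed 'latency': cheap on a hit, expensive on a miss."""
        if block in self._lru:
            self._lru.remove(block)
            self._lru.append(block)
            return 0.1                  # cache hit
        self._lru.append(block)
        if len(self._lru) > self._capacity:
            self._lru.pop(0)            # evict the least recently used block
        return 10.0                     # cache miss

def fingerprint_cache_size(layer, max_blocks=256):
    """Grow the working set until re-reads stop hitting in the cache."""
    for n in range(1, max_blocks + 1):
        blocks = list(range(n))
        for b in blocks:                            # probe: warm the cache
            layer.read(b)
        cost = sum(layer.read(b) for b in blocks)   # observe: time re-reads
        if cost > n * 1.0:              # mostly misses: working set too large
            return n - 1                # the largest size that fit entirely
    return max_blocks
```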
Fingerprinting helps one to infer the internal algorithms and policies of a system. We have developed innovative, yet practical, techniques for fingerprinting a variety of systems. These techniques can be roughly divided into three categories of increasing sophistication: those that insert probe operations and measure their completion time, those that make observations from multiple vantage points, and those that also manipulate the behavior of the system. I discuss these three classes of fingerprinting techniques in more detail.
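Raw probe timings are noisy, so the measured latencies must be classified before any inference can be drawn from them. As a deliberately simple stand-in for the robust statistical detection our tools require, one can separate probe latencies into fast (hit) and slow (miss) modes by cutting at the largest gap between sorted values:

```python
def split_latencies(samples):
    """Split probe latencies into fast and slow modes by cutting at the
    largest gap between neighboring sorted values; returns both modes
    and the threshold used. Needs at least two samples."""
    s = sorted(samples)
    gap, cut = max((s[i + 1] - s[i], i) for i in range(len(s) - 1))
    threshold = (s[cut] + s[cut + 1]) / 2.0
    fast = [x for x in samples if x <= threshold]
    slow = [x for x in samples if x > threshold]
    return fast, slow, threshold
```

A production fingerprinting tool would use more robust clustering and report confidence, but the principle -- derive the hit/miss boundary from the data rather than hard-coding it -- is the same.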
Fingerprinting tools are useful and practical because they enable users and administrators to understand the actual systems that they are using. We have fingerprinted a variety of commodity systems, from the buffer replacement policies in NetBSD, Linux, Solaris, and HPUX, to the file systems Linux ext2, ext3, ReiserFS, JFS, NetBSD FFS, and Windows NTFS. In many cases, our tools have revealed interesting problems within the systems. For example, Shear revealed that the RAID-5 mode of a common hardware controller employs a non-optimal left-asymmetric parity placement, and our failure-policy analysis isolated numerous bugs and illogical inconsistencies in several file systems.

Our research on fingerprinting across these domains has revealed common principles. For example, we have found it useful to ensure that the system under test is operating in its steady-state regime before observing its outputs. We have also found that statistical techniques are needed to deliver automated and reliable detection, yet graphical depictions are useful for helping users interpret the results.

Building New Systems
Due to the amount of time, money, and effort required to build a large
software system, most systems leverage some amount of existing
software (e.g., the OS). Unfortunately, there are complications when
one leverages existing code that cannot be modified: the borrowed code
may not have the desired behavior in some environments. In this
situation, one can use gray-box knowledge to better operate with the
existing code. Specifically, one can either adapt to the behavior of the existing layers or indirectly control that behavior.

Adaptation
When a new layer has gray-box knowledge of how other existing layers behave, the new layer can adapt its own behavior appropriately. In our investigations, we have found that the primary challenge is to infer the internal state of the existing layers.
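As a minimal sketch of adaptation, consider an application that mirrors its own accesses in a user-level model of the OS buffer cache and then reorders a batch of reads so that likely hits are issued before likely misses. The model below assumes, as a gray-box guess, that the cache uses LRU replacement with a known capacity; the class names and sizes are illustrative, not from any real system.

```python
class ShadowLRU:
    """User-level mirror of the OS buffer cache, built on the gray-box
    assumption that the cache uses LRU replacement of a known capacity."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []                 # most recently used block at the end
    def accessed(self, block):
        """Mirror every read the application itself issues."""
        if block in self.blocks:
            self.blocks.remove(block)
        self.blocks.append(block)
        if len(self.blocks) > self.capacity:
            self.blocks.pop(0)           # model the eviction we expect
    def probably_cached(self, block):
        return block in self.blocks

def schedule_reads(model, requests):
    """Adapt to the inferred state: issue likely hits before likely misses."""
    hits = [b for b in requests if model.probably_cached(b)]
    misses = [b for b in requests if not model.probably_cached(b)]
    return hits + misses
```

The hard part in practice is exactly the challenge named above: keeping such a model accurate when the real layer buffers, reorders, or shares its state with other processes.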
While performing this research, we have addressed a number of overarching challenges. For example, when inserting probes, a primary challenge is to perform probes that do not change the state of the system. When performing on-line simulation, one of the primary challenges is to develop a model of the gray-box layer that is accurate enough to predict internal state, while being simple enough to be used efficiently on-line. Finally, the major challenge we have addressed is dealing with asynchrony within the gray-box layer; that is, the gray-box layer may buffer or reorder its outputs, such that the outputs do not match the current state of the layer.

Control

When building a layered system, if the existing layers do not exhibit the desired behavior, gray-box knowledge can be used in a more radical way: the system can indirectly modify the behavior of existing layers without changing their implementation. As an example, consider the case where a gray-box layer (such as the OS) implements a page replacement policy that is non-optimal for the user workload (e.g., LRU). The user process can influence the OS replacement policy, without changing the OS code, if it knows the internal state of the OS (i.e., which pages are likely to be evicted next) and if it can then access the pages that it does not want evicted, thereby encouraging the OS to keep them resident.

We have implemented a few case studies in which one layer influences the policy of an underlying gray-box layer. For example, we have developed a user-level service that changes how the file system places files and directories on disk by selectively naming, inserting, and deleting files. However, our experience has shown that, given the detailed gray-box knowledge needed to support this type of control, it is useful to expand interfaces slightly to expose more internal information.
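The page-replacement example can be sketched with a small simulation. This is a toy LRU model, not real OS code, and the page numbers and sizes are invented; it shows only the mechanism: by re-accessing the pages it wants kept resident, a process keeps them near the MRU end of the list, so a streaming scan no longer evicts them.

```python
def simulate(accesses, cache_size, hot_pages, keep_warm):
    """Simulate an LRU page cache and count the misses the workload takes
    on its hot pages. If keep_warm, the process re-accesses its hot pages
    after every workload access -- the gray-box control trick."""
    lru = []                             # least recently used page at front
    misses = 0
    def access(page, count_miss):
        nonlocal misses
        if page in lru:
            lru.remove(page)
        else:
            if count_miss and page in hot_pages:
                misses += 1
            if len(lru) >= cache_size:
                lru.pop(0)               # evict the LRU page
        lru.append(page)
    for page in accesses:
        access(page, count_miss=True)
        if keep_warm:
            for h in hot_pages:
                access(h, count_miss=False)   # keep hot pages near MRU end
    return misses

# A streaming scan between two uses of the hot pages:
workload = [100, 101] + list(range(10)) + [100, 101]
```

In this workload, without the extra touches the scan evicts both hot pages and they miss again at the end; with the touches, only the compulsory first miss remains.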
In our work, we have focused on how to make minimal changes to interfaces, under the theory that one should leverage the existing code base to the fullest extent possible.
One context in which we have explored the benefits of exposing more internal state is within an infokernel, an operating system that exports key pieces of its internal state and policies to user level.
One of the additional contributions of the infokernel research is in
identifying key abstractions that can be exported with little
complexity from the OS, yet support a broad range of more
precise user-level control. For example, our case studies have shown that a small number of such abstractions can support a wide range of user-level services.
Exploring how gray-box systems can be constructed with
absolutely no changes to existing layers makes it easier to then
identify the small changes to interfaces that
greatly improve system functionality. We have explored
the issue of how interfaces should be changed in a variety of
contexts. For example, we have investigated the ability to control TCP more precisely through small extensions to its interface.

Finally, our experience with gray-box systems has stressed the need for models of system behavior. Complex systems make assumptions about the behavior of their subsystems, beyond those specified by their interfaces; however, for systems to operate correctly, these assumptions must be explicitly stated. As a starting point, we have developed a logical framework for modeling the interaction of a file system with the storage system. This model defines the assumptions that the storage system can make about the file system and can help ensure that on-disk data structures are kept consistent. I believe the value of such models will increase as systems continue to increase in complexity.
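To give the flavor of such a model, here is one plausible ordering rule encoded as a check over a log of writes reaching disk. The encoding is hypothetical (the actual framework is expressed as a logic, not code): an inode that points to a data block should not become durable before the block itself.

```python
def violates_pointer_order(write_log):
    """write_log: a sequence of ('data', block_number) or
    ('inode', [pointed_to_blocks]) events, in the order they reach disk.
    Returns True if some inode became durable before a block it points to."""
    durable = set()
    for kind, payload in write_log:
        if kind == 'data':
            durable.add(payload)
        else:  # an inode write: every block it points to must be durable
            if any(b not in durable for b in payload):
                return True
    return False
```

A storage system that can check rules like this against the write stream can detect when its assumptions about the file system above it are violated.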