Lakshmi Narayanan Bairavasundaram
Email: laksh AT SERVER cs.wisc.edu

 
 
             
        

About Me

I was a Ph.D. student in Computer Sciences at the University of Wisconsin-Madison from August 2002 to August 2008. My advisors were Prof. Andrea C. Arpaci-Dusseau and Prof. Remzi H. Arpaci-Dusseau. My dissertation is titled "Characteristics, Impact, and Tolerance of Partial Disk Failures."

I currently work at a storage startup called Datrium. Previously, I was a researcher in the Advanced Technology Group at NetApp. My research interests include operating systems, file systems, storage systems, and data management.
 

News

  • May 2013: I joined Datrium, a storage startup.
 

Professional Activities

 

Publications


Warming Up Storage-Level Caches with Bonfire

Yiying Zhang, Gokul Soundararajan, Mark Storer, Lakshmi N. Bairavasundaram, Sethuraman Subbiah, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau
Proceedings of the 11th USENIX conference on File and Storage Technologies (FAST'13)
San Jose, California. February 2013.

Available as: PDF

Responding Rapidly to Service Level Violations using Virtual Appliances

Lakshmi N. Bairavasundaram, Gokul Soundararajan, Vipul Mathur, Kaladhar Voruganti, Kiran Srinivasan
Operating Systems Review 46(3): 32-40 (2012)
Available as: PDF

An Empirical Study on Configuration Errors in Commercial and Open Source Systems

Zuoning Yin, Xiao Ma, Jing Zheng, Yuanyuan Zhou, Lakshmi N. Bairavasundaram, Shankar Pasupathy
Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP'11)
Cascais, Portugal. October 2011.

Available as: PDF

How Do Fixes Become Bugs? A Comprehensive Characteristic Study on Incorrect Fixes in Commercial and Open Source Operating Systems

Zuoning Yin, Ding Yuan, Yuanyuan Zhou, Shankar Pasupathy, Lakshmi Bairavasundaram
Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE'11)
Szeged, Hungary. September 2011.

Distinguished Paper Award!
Available as: PDF

Italian for Beginners: The Next Steps for SLO-Based Management

Lakshmi N. Bairavasundaram, Gokul Soundararajan, Vipul Mathur, Kaladhar Voruganti, Steven Kleiman
Proceedings of the 3rd USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage'11)
Portland, Oregon. June 2011.

Available as: PDF

Fido: Fast Inter-Virtual-Machine Communication for Enterprise Appliances

Anton Burtsev, Kiran Srinivasan, Prashanth Radhakrishnan, Lakshmi N. Bairavasundaram, Kaladhar Voruganti, Garth Goodson
Proceedings of the 2009 USENIX Annual Technical Conference (USENIX'09)
San Diego, California. June 2009.

Available as: HTML PDF

Tolerating File-System Mistakes with EnvyFS

Lakshmi N. Bairavasundaram, Swaminathan Sundararaman, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the 2009 USENIX Annual Technical Conference (USENIX'09)
San Diego, California. June 2009.

Best Paper Award!
Available as: Abstract PDF Postscript BibTex

Characteristics, Impact, and Tolerance of Partial Disk Failures

Lakshmi N. Bairavasundaram
Ph.D. Dissertation, University of Wisconsin-Madison
Madison, Wisconsin. August 2008.

Available as: PDF

Analyzing the Effects of Disk-Pointer Corruption

Lakshmi N. Bairavasundaram, Meenali Rungta, Nitin Agrawal, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Michael M. Swift
Proceedings of the International Conference on Dependable Systems and Networks (DSN'08)
Anchorage, Alaska. June 2008.

Available as: Abstract PDF Postscript BibTex

An Analysis of Data Corruption in the Storage Stack

Lakshmi N. Bairavasundaram, Garth R. Goodson, Bianca Schroeder, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the 6th USENIX conference on File and Storage Technologies (FAST'08)
San Jose, California. February 2008.

Best Student Paper Award!
Available as: Abstract PDF Postscript BibTex

Parity Lost and Parity Regained

Andrew Krioukov, Lakshmi N. Bairavasundaram, Garth R. Goodson, Kiran Srinivasan, Randy Thelen, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the 6th USENIX conference on File and Storage Technologies (FAST'08)
San Jose, California. February 2008.

Available as: Abstract PDF Postscript BibTex

An Analysis of Latent Sector Errors in Disk Drives

Lakshmi N. Bairavasundaram, Garth R. Goodson, Shankar Pasupathy, Jiri Schindler
Proceedings of the International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS'07)
San Diego, California. June 2007.

Kenneth C. Sevcik Outstanding Student Paper Award!
Available as: Abstract PDF Postscript BibTex

Limiting Trust in the Storage Stack

Lakshmi N. Bairavasundaram, Meenali Rungta, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the 2nd International Workshop on Storage Security and Survivability (StorageSS'06)
Alexandria, Virginia. October 2006.

Available as: Abstract PDF Postscript BibTex

Dependability Analysis of Virtual Memory Systems

Lakshmi N. Bairavasundaram, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the International Conference on Dependable Systems and Networks (DSN'06)
Philadelphia, Pennsylvania. June 2006.

Available as: Abstract PDF Postscript BibTex

Semantically-Smart Disk Systems: Past, Present, and Future

Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Lakshmi N. Bairavasundaram, Timothy E. Denehy, Florentina I. Popovici, Vijayan Prabhakaran, Muthian Sivathanu
Sigmetrics Performance Evaluation Review (PER)
Volume 33, Number 4. March 2006.

Available as: Abstract PDF Postscript BibTex

Database Aware Semantically-smart Storage

Muthian Sivathanu, Lakshmi N. Bairavasundaram, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the 4th USENIX conference on File and Storage Technologies (FAST'05)
San Francisco, California. December 2005.

Available as: Abstract PDF Postscript BibTex

IRON File Systems

Vijayan Prabhakaran, Lakshmi N. Bairavasundaram, Nitin Agrawal, Haryadi S. Gunawi, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP'05)
Brighton, United Kingdom. October 2005.

Available as: Abstract PDF Postscript BibTex

Life or Death at Block Level

Muthian Sivathanu, Lakshmi N. Bairavasundaram, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI'04)
San Francisco, California. December 2004.

Available as: Abstract PDF Postscript BibTex

X-RAY: A Non-Invasive Exclusive Caching Mechanism for RAIDs

Lakshmi N. Bairavasundaram, Muthian Sivathanu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Proceedings of the 31st Annual International Symposium on Computer Architecture (ISCA'04)
Munich, Germany. June 2004.

Available as: Abstract PDF Postscript BibTex

Dynamic Path Profile Aided Recompilation in a JAVA Just-In-Time Compiler

R. Vinodh Kumar, B. Lakshmi Narayanan, R. Govindarajan
Proceedings of the International Conference on High Performance Computing (HiPC'02)
Bangalore, India. December 2002.

Available as: PDF

Functional Unit Usage Based Thread Selection in a Simultaneous Multithreaded Processor

Deepak Babu M.I, Lakshmi Narayanan B, Madhu Saravana Sibi G, Ranjani Parthasarathi
Poster Session of the International Conference on High Performance Computing (HiPC'01)
Hyderabad, India. December 2001.

Available as: PDF
 
 

Ph.D. Research


Dissertation: "Characteristics, Impact, and Tolerance of Partial Disk Failures"
Much of the value people place in computers stems from the data stored in them. My dissertation focuses on failures that cause loss or corruption of data. Disk drive failures are the primary cause of data loss. These failures are typically partial failures, where some disk sectors are unavailable due to a latent sector error or some disk blocks are silently corrupted. The goals of my dissertation are to (i) understand the characteristics of partial disk failures, (ii) analyze how these failures impact components of the storage stack, and (iii) develop solutions to tolerate such failures.

Failure characteristics: I have analyzed the occurrence and characteristics of latent sector errors and data corruption in a population of 1.53 million disk drives [SIGMETRICS07, FAST08a]:
  • The study of latent sector errors found that almost 20% of nearline (SATA) disk drives are afflicted by latent sector errors in 2 years of use, and that latent sector errors show high spatial and temporal locality.
  • The analysis of data corruption identified interesting corruption trends including the tendency of consecutive disk blocks to become corrupt, and the non-independence of corruption instances within the same disk and across different disk drives in the same storage system.
Impact on storage stack: Given that partial failures could affect a significant percentage of disks, it is important to understand their impact on different elements of the storage stack:
  • I have analyzed the impact of such errors on modern commodity file systems (IBM JFS, ext3, and Windows NTFS) using "type-aware" fault injection techniques [SOSP05, StorageSS06, DSN08]. The analyses found that even widely used file systems have bugs in failure-handling code, use illogically inconsistent policies, and do not implement fault tolerance techniques like type-checking and replication effectively.
  • I have also investigated the mechanisms used by virtual memory systems (of Linux, FreeBSD, and Windows XP) to tolerate disk errors [DSN06], and found that they are inconsistent and ineffective as well.
  • I have applied model checking to data protection schemes used in real parity-based RAID systems and found that they are vulnerable to data loss or corruption due to disk errors such as "lost" writes [FAST08b].
Tolerance: The studies above show that even widely used systems have bugs and cannot be trusted to handle partial disk failures. Therefore, it is important to lower the trust we place in any single system. I have developed a file system architecture based on N-version programming principles that reduces the need to trust any one file system [USENIX09b].
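As an illustration only, the general N-version idea can be sketched as majority voting over the results of independently implemented versions. The sketch below is not the architecture from [USENIX09b]; the function name majority_read and the reader callables are hypothetical, and each callable stands in for reading the same data through a different file system.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_read(readers: Sequence[Callable[[], bytes]]) -> bytes:
    """Return the value that a strict majority of the versions agree on.

    Each reader is a hypothetical stand-in for one independent
    implementation (for example, one file system) returning the same data.
    """
    results = []
    for read in readers:
        try:
            results.append(read())
        except OSError:
            # A version that fails outright simply does not get a vote.
            continue
    if not results:
        raise OSError("all versions failed")
    value, votes = Counter(results).most_common(1)[0]
    # Require agreement from more than half of all versions, including
    # the ones that failed, before trusting the result.
    if votes * 2 <= len(readers):
        raise OSError("no majority among versions")
    return value
```

With three versions, a single version that returns corrupted data or fails is outvoted by the other two; if no value reaches a majority, the caller gets an error rather than possibly corrupt data.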
    
Other Research: My initial research experience was built on studying how different layers of a system interact and how "gray-box" techniques can be used in such layers to overcome the limitations of narrow interfaces. A specific application of such techniques has been in semantically-smart disk systems. These are disk systems that leverage basic knowledge of file system operation to provide significant improvements in performance, availability, and security. I have primarily explored the performance angle. I have developed a cache mechanism for disk arrays that utilizes basic knowledge of file system data structures to provide an exclusive cache, i.e., a cache that retains blocks that are not present in the file system cache [ISCA04]. I have also extended this technique to work effectively with database systems [FAST05]. Finally, I have worked on using block liveness information in a semantically-smart disk [OSDI04].

In addition to my research at Madison, I have been on research internships at IBM T. J. Watson Research Center, NY (June-Aug 2004), where I developed a distributed caching mechanism for storage systems; Intel Corporation, OR (May-Aug 2005), where I developed techniques for hypervisor-based fault injection; and NetApp, CA (Jun-Aug 2006, Jun-Aug 2007), where I analyzed data on storage system errors.
 
 

Internships


Summer Intern. Advanced Development Group, Network Appliance.
Sunnyvale, CA. June - August, 2007.
I analyzed the occurrence of silent data corruption in disk drives, identifying important characteristics that would be useful for corruption-proof system design.

Summer Intern. Advanced Development Group, Network Appliance.
Sunnyvale, CA. June - August, 2006.
I analyzed data on latent sector errors (disk errors wherein a sector becomes inaccessible), examining the dependence on factors such as disk drive age and capacity, and identifying characteristics such as spatial and temporal locality.

Summer Intern. Core Virtualization Research Group, Intel Corporation.
Hillsboro, OR. May - August, 2005.
I developed techniques for hypervisor-based fault injection and applied these techniques to study operating system behavior when memory and disk errors occur.

Summer Intern. IBM T.J. Watson Research Center.
Hawthorne, NY. May - August, 2004.
I designed and implemented an extensible and scalable distributed cache architecture for an enterprise storage system. My scheme involved the use of remote memory (on machines over a high-speed LAN) for caching disk blocks.

Summer Research Fellow. Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR)
Bangalore, India. May - June, 2001.
I developed an instruction scheduler for a Java Just-in-Time compiler. The scheduler was built to utilize path profile information.