About Me

I am a first-year graduate student pursuing a Master's degree at the University of Wisconsin-Madison. I graduated in Computer Science from BITS Pilani.

I have three years of work experience at NVIDIA, where I was involved in projects including 3D Vision Pro and NVIDIA Grid vGPU. My areas of interest include virtualization, systems, networking, and artificial intelligence.

I am currently part of a research project that profiles the impact on an enterprise network of deploying its network middleboxes in the cloud.


University of Wisconsin-Madison 2014 to 2016 GPA 4.0/4.0
  1. Advanced Operating System
  2. Introduction to Artificial Intelligence
  3. Advanced Databases
  4. Machine Learning
BITS Pilani 2006 to 2011 GPA 8.3/10.0
  1. Operating Systems
  2. Computer Networks
  3. Network Programming
  4. Advanced Computer Organization
  5. Data Structures and Algorithms
  6. Data Storage Technologies and Networks

Work Experience

NVIDIA Graphics Pvt Ltd Jul'11 to Jul'14 System Software Engineer
  1. Implemented NVIDIA driver changes that tripled the maximum number of vGPU VMs supported on a system.
  2. Improved virtual machine graphics performance by implementing an optimized guest-physical-to-machine page translation.
  3. Designed and implemented the infrastructure for automated builds of the vGPU XenServer driver.
Texas Instruments Jan'06
  1. Ported the logging mechanism of Eclipse's Real-Time Software Component (RTSC) to C.
  2. Added further features to the logging mechanism based on requirements from Texas Instruments.


Programming Languages
  1. C
  2. Java
  3. Python
  4. Bash
Operating Systems
  1. Linux (RHEL, Fedora, Ubuntu)
  2. Microsoft Windows Family


Virtual Middlebox

We are analyzing the performance impact of moving network middleboxes to virtual machines and containers. The goal is to design a scalable, optimized enterprise network whose middlebox elements are deployed in the cloud. We will explore deployment options with the following objectives in mind:

  1. Minimize the data copying between middleboxes.
  2. Optimize the data transfer time between the middleboxes.
  3. Minimize the computation performed on a packet as it passes through the middlebox stack.
  4. Balance the computational load of processing network packets.
  5. Secure data by restricting communication to a set of authorized middleboxes.

Eclipse Plugin for SWAMP

SWAMP is a framework that analyzes user source code for potential security vulnerabilities. I am building a plugin for the Eclipse IDE that will let users ship their projects (C/C++, Java, and Python) from the workspace to the SWAMP cloud infrastructure for analysis. The plugin is being developed in Java.

Isolation: Linux Containers vs. Xen Virtual Machines

Virtualization systems share several hardware resources, including the CPU, memory, and I/O devices, which raises the challenge of performance isolation. Ideally, an application running inside one environment should not be affected at all by the activities of the other environments with which it shares hardware resources. We quantify the extent of isolation provided by two popular virtualization systems, Xen and Linux containers (LXC), by performing a series of experiments on the CPU, memory, network card, and block I/O. Findings:

Test            LXC                 Xen VM
CPU             Isolated            Isolated
Recursive Fork  Not Isolated        Isolated
Network         Not Isolated        Only egressing traffic isolated
Block I/O       Partially Isolated  Not Isolated

Code Compression Inspired by Procedural Abstraction

Motivation: Reducing code size for embedded systems, which have limited memory resources.
Procedural Abstraction: The code is searched for matching instruction sequences, which are then replaced by calls to a single procedure, resulting in a smaller program.
Additional Compression: We observed an opportunity for further compression by identifying instruction sequences that match only partially (in the opcode part). We proposed a new compression algorithm that looks for repeating sequences of opcodes in the code section. Since the probability of finding a matching sequence of opcodes is greater than that of finding a fully matching instruction sequence, it is possible to find sequences that are both frequent and long. Each repeating sequence is replaced by a marker instruction and divided into its opcode and non-opcode parts. The code size reduction comes from saving only one copy of the opcode part of the repeating sequence and discarding the others.
Opportunity: Tests on multiple small ARMv6 programs showed a compression opportunity in the range of 3-5%.
Roadblock: We proposed a hardware implementation of the decompression algorithm that required changes to the ARMv6 instruction pipeline. However, maintaining clock sync to correctly regenerate instructions from their subparts, and stalling new instructions until decompression completes, were the biggest setbacks to the implementation.
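
The opcode-matching idea above can be illustrated with a small Python sketch. This is not the project's implementation: the instruction representation (an `(opcode, operands)` tuple), the fixed window length, the `MARK` pseudo-opcode, and the function names are all assumptions made for illustration, and the sketch ignores the real encoding cost of markers and operands.

```python
from collections import defaultdict

MARK = "MARK"  # hypothetical marker opcode, assumed not to occur in real code


def find_repeated_opcode_runs(program, length):
    """Return {opcode sequence: [start indices]} for sequences that repeat.

    Only the opcode part of each instruction is compared, and occurrences
    of the same sequence are kept only if they do not overlap.
    """
    positions = defaultdict(list)
    for start in range(len(program) - length + 1):
        key = tuple(op for op, _ in program[start:start + length])
        if not positions[key] or start >= positions[key][-1] + length:
            positions[key].append(start)
    return {k: v for k, v in positions.items() if len(v) > 1}


def compress(program, length):
    """Replace the most frequent repeated opcode run with marker instructions.

    The opcode part is stored once in `table`; each marker keeps only the
    non-opcode (operand) parts of the instructions it replaces.
    """
    repeats = find_repeated_opcode_runs(program, length)
    if not repeats:
        return list(program), {}
    key = max(repeats, key=lambda k: len(repeats[k]))
    starts = set(repeats[key])
    table = {0: key}
    out, i = [], 0
    while i < len(program):
        if i in starts:
            operands = tuple(args for _, args in program[i:i + length])
            out.append((MARK, (0, operands)))
            i += length
        else:
            out.append(program[i])
            i += 1
    return out, table


def decompress(compressed, table):
    """Rebuild the original instructions from markers and the opcode table."""
    out = []
    for op, args in compressed:
        if op == MARK:
            index, operands = args
            out.extend(zip(table[index], operands))
        else:
            out.append((op, args))
    return out


# Two load/add/store runs share opcodes but differ in registers and constants.
program = [
    ("ldr", ("r0", "[r1]")),
    ("add", ("r0", "r0", "#1")),
    ("str", ("r0", "[r1]")),
    ("mov", ("r2", "#0")),
    ("ldr", ("r3", "[r4]")),
    ("add", ("r3", "r3", "#4")),
    ("str", ("r3", "[r4]")),
]
compressed, table = compress(program, 3)
restored = decompress(compressed, table)
```

In this toy program, plain procedural abstraction would find no match because the operands differ, while opcode-only matching finds the repeated `ldr`/`add`/`str` run. Whether that translates into an actual size reduction depends on how markers and operands are encoded, which the sketch does not model.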


Virtualized GPU Engine NVIDIA Graphics Pvt Ltd Pending under USPTO: 13/915630
  1. Proposed a mechanism to virtualize the various GPU engines so that Virtual Machines (VMs) can use them as independent devices.
  2. Suggested the design architecture to support the feature. The proposal is to build a software unit that traps the I/O requests from VMs to their GPU engine devices and manages their execution on the real GPU hardware.