Memory Resource Management in VMware ESX Server
Memory Resource Management in VMware ESX Server Carl Waldspurger. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation, 2002.
Reviews due Tuesday, 2/7 at 8:00 am
Comments
1. Summary
This paper introduces ESX Server, VMware's virtual machine layer for efficiently managing hardware resources. The focus of the paper is mainly memory management, including ballooning, the idle memory tax, content-based page sharing, and hot I/O page remapping.
2. Problem
This paper deals with the problem of how a VMM can efficiently manage memory: flexibly allocating, reclaiming, and overcommitting memory for multiple guest OSes. This is hard because the VMM knows little about the internals of each guest OS. Also, previous virtual machine monitors like Disco still needed to modify the guest OS running on the VMM for resource management purposes, which is not feasible for commercial VMM software. With ESX, this paper tries to make the VMM completely independent of guest OS modifications.
3. Contributions
This paper introduces the following techniques for efficiently managing memory resources.
Ballooning adds a pseudo-device driver to the guest OS. It leverages the memory management policy that already exists in the guest OS, letting the guest itself decide which pages are least important as the balloon inflates or deflates.
Content-based page sharing shares pages across different guest OSes. It uses a hash function to identify pages with identical content, and relies on a copy-on-write technique so that a private copy is created only when a shared page is written.
Idle memory (memory that is not frequently used or belongs to an idle process) is detected by a sampling algorithm run over each guest's memory. An idle memory tax then lets the server take memory from virtual machines with many idle pages and give it to virtual machines that need memory.
Other techniques include remapping of 'hot' (frequently involved in I/O) pages for data transfer. Such pages can be automatically remapped to low memory, so that devices such as NICs whose DMA can only address the lowest 4 GB of memory perform better.
Notably, none of these techniques requires modifying the guest OS source code, which is very important for using a VMM in the real world.
4. Evaluation
The experiment section designs separate experiments to test each technique mentioned in the contribution section, with results in separate charts. Different workloads (such as database applications) and different OSes (Linux and Windows) are used to test VMM performance. The results are good and show that the methods can be widely used across workloads and OSes without modifying the OS source code.
5. Confusion
How does content-based page sharing deal with hash collisions? The paper only claims that the collision rate is small, but I think that is not good enough for VMM-level code. The paper also mentions chaining; are there more details about this?
Posted by: TIanrun Li | February 7, 2017 07:06 AM
1. Summary
The paper presents mechanisms and policies for memory reclamation and management in a hypervisor running multiple unmodified guest VMs in an over-provisioned setting. The author introduces three techniques: page swapping, ballooning, and content-based page sharing.
2. Problem
When doing memory management with unmodified guests, only the guest is aware of which pages are not in use. This makes it difficult for the hypervisor to reclaim memory from guests without severely limiting guest performance, and it can even crash the guest if the wrong page (e.g., a kernel page) is accidentally reclaimed.
3. Contributions
The three mechanisms provided for memory reclamation are page swapping, ballooning, and content-based page sharing. They are transparent mechanisms, which in this context means that the guest is unaware of their existence. Page swapping has the hypervisor map the pages to be reclaimed to a swap file and move them out of physical memory onto disk. The ballooning method adds memory pressure inside a guest by provisioning pinned physical pages private to the hypervisor, forcing the guest OS's own memory management to kick in; this method is also made more efficient by the server's shadow page table support. The final mechanism finds common pages among guest VMs using the contents of their individual pages. This is done by hashing each page and periodically scanning to detect and combine identical pages in the guests' shadow page tables.
4. Evaluation
The author makes a convincing case for these techniques and provides a detailed analysis of their individual and combined effects on memory throughput and usage in the described environment. They use a balanced set of benchmarks relevant at the time as well as abstract use cases such as identical guests.
Posted by: Akhil Guliani | February 7, 2017 05:48 AM
1. summary
This paper introduced several techniques for virtual machine memory resource management in ESX Server, including cooperative page reclaiming, an idle memory tax for utilization with isolation guarantees, content-based transparent page sharing, I/O page remapping, and dynamic resource reallocation.
2. Problem
Efficiently support virtual machine workloads that overcommit memory. This resembles the traditional operating system problem of supporting more virtual memory than physical memory, which in most cases works well because much application memory sits idle. The problem is more interesting in a VMM because, again, the knowledge is not held entirely by one layer.
3. Contributions
Several novel and elaborate techniques were proposed that could provide insight for resource management systems in more general settings.
a). Cooperative page reclaiming: the ballooning driver method is neat, exploiting the driver interface to allocate pages inside the guest.
b). Memory sampling and idle memory: use of statistical sampling to estimate active memory.
c). Content-based transparent page sharing
d). I/O page remapping: tracking hot pages, detecting the problem, and adjusting mappings at runtime
e). Dynamic resource reallocation
4. Evaluation
The paper evaluated the proposed methods using different workloads to demonstrate the effectiveness of each specific technique. They also included an experiment showing the effect of the combination with five Windows VMs. I wonder what the results would be on real-world workloads; VMware presumably has this kind of data.
5. Confusion
a). What about the CPU overheads of these techniques?
b). How do the balloon driver and content-based page sharing interact, e.g., what happens if the balloon frees a shared page?
Posted by: Jing Liu | February 7, 2017 04:44 AM
VMware ESX server is a Type 1 hypervisor. This paper discusses memory resource management techniques employed in VMware ESX server to efficiently support virtual machines that overcommit memory and reduce overall memory pressure on the system. Several mechanisms such as memory reclamation, memory sharing and policies for dynamic reallocation of memory are discussed.
Problem:
* Support dynamic memory management among Virtual Machines running on hypervisor without modifying the operating system.
* Memory management from the hypervisor level using meta-level page replacement can be inefficient and may lead to issues such as double paging.
Contributions:
* The ballooning technique makes use of a pseudo-device driver or kernel module in the guest OS that communicates and cooperates with the ESX server via a private channel. The balloon module can be controlled by the ESX server to "inflate" or "deflate", increasing or decreasing memory pressure in the guest, and thus helps reclaim memory as and when needed, alongside demand paging.
* Demand Paging
Memory can be reclaimed without guest OS support by paging out to an ESX server swap area on disk.
* Sharing Memory between VMs running similar workloads to consume less memory
> Content Based Page Sharing:
Pages with identical contents can be shared across VMs. Such pages are identified using a hash value summarizing each page's contents. The implementation especially benefits homogeneous VMs. The standard copy-on-write technique is used to share the pages.
* Share-Based Allocation: enables VMs to achieve efficient memory utilization while maintaining memory performance isolation guarantees. Shares indicate a fraction of the total shares in the system and thus the relative resource rights of a VM; each VM is entitled to consume resources in proportion to its share allocation.
* Reclaiming Idle memory using Idle memory tax:
> "Statistical sampling approach" is used to determine idle memory.
> The paper introduces the notion of an idle memory tax, which specifies the maximum fraction of idle pages that can be reclaimed from a client.
* Allocation Policies and Admission Control
Different policies for Admission Control and I/O Page Remapping are discussed.
Evaluation:
* The paper presents a reasonably good evaluation of the strategies it employs, demonstrating them on different OSes, including Microsoft Windows and Linux based VMs, under both uniform and diverse workloads.
Confusion
1. When techniques such as ballooning are employed in virtual machines in the cloud (e.g., Amazon EC2), what kind of SLAs are guaranteed to the users of the VM?
2. The paper mentions that the sharing level among identically configured VMs reaches 67%; isn't it the case that OS security features, and address space layout randomization (ASLR) in particular, can significantly reduce sharing to such extents even with identical OSes running in VMs?
3. The discussion on I/O page remapping is not very clear.
Posted by: Lokananda Dhage Munisamappa | February 7, 2017 04:32 AM
Summary: This paper presents the VMware ESX VMM, a thin abstraction layer between the hardware and unmodified commodity guest OSes running in VMs, which aims to manage memory efficiently across VMs. It does so by allowing memory to be overcommitted and by using a variety of memory management techniques: ballooning to reclaim pages, content-based transparent page sharing to save space, and an idle memory tax to incentivize efficient memory use.
Problem: It is not uncommon for a large server running multiple guest OSes across VMs to have many resources underutilized. Memory is an expensive resource, and earlier VMMs such as Disco did not utilize it efficiently and additionally required changes to guest OSes. These problems motivated the author to design a system that avoids unnecessary swapping of physical pages to disk and does not require OS changes for features such as transparent page sharing, in order to use memory more resourcefully.
Contributions: 1. Uses the concept of over-commitment: the total memory configured for all VMs can exceed the actual memory of the system.
2. Ballooning – cajoles a guest OS into releasing memory using its own native page replacement policies.
3. Content-based Page Sharing – pages with the same content can be shared. Hashing computes a summary of each page, which is used as a lookup key to find candidate matches; once a candidate is found, a full content comparison confirms the match. Copy-on-write is used so that writes to shared pages remain correct.
4. Idle Memory Taxation: idle pages are charged more than active ones. When memory is required by other VMs, it is preferentially taken from VMs holding large numbers of idle pages.
5. Dynamic Memory Reallocation: recalculates allocations when the system configuration changes, e.g., a VM is added or removed, or the amount of free memory crosses a threshold.
6. Statistical Analysis of Pages: Keeps track of I/O intensive pages (hot pages) and remaps them.
Evaluation: Each new memory management technique was tested for performance. Ballooning was tested by running dbench with 40 clients in a Linux VM; the author claims the added overhead (1.4% - 4.4%) is not significant. Content-based page sharing was evaluated by running VMs with similar OSes; page reclamation in both the best-case scenario (up to 60%) and the regular-case scenario (7-33%) is significant. The idle memory tax experiment shows a roughly 30% throughput improvement. Overall, the author has evaluated the new techniques reasonably well.
Confusion: 1. Can a VM game the system by keeping resources busy unnecessarily to avoid idle memory tax?
2. What happens if balloon driver is disabled?
Posted by: Rahul Singh | February 7, 2017 04:11 AM
1. summary
This paper is about the memory management architecture in VMware's ESX bare-metal hypervisor and its important features such as memory overcommitment, the idle memory tax, and content-based memory sharing.
2. Problem
Running multiple virtual machines on a single host may not utilize resources (especially memory) efficiently. This paper proposes designs for memory overcommitment, memory ballooning, and memory sharing, all without modifying the guest OS. The paper also discusses the policies governing dynamic allocation of memory to virtual machines.
3. Contributions
The paper starts by describing the low-level memory virtualization design. The ESX server uses one more layer of address translation to provide a complete address space (starting from zero) to each VM. The server (running directly on the hardware) keeps the mapping from physical addresses (visible to the VM) to machine addresses in a 'pmap' table. This virtualized layer of addresses makes it easy to plug in or swap out pages allotted to any VM, which is very important for supporting memory overcommitment. To efficiently reclaim memory, the ESX server relies on a balloon driver loaded as a module in the guest OS; the balloon influences the page replacement policy of the guest OS. Many VMs running on a single host might be running similar processes, so VMs can share such pages with copy-on-write. ESX Server employs content-based page sharing, so every page in a guest VM is a potential shared page. The ESX server maintains a hash table for all shared pages, which is consulted before a fresh page would be allocated; it contains shared frames and hint frames, storing the hash of a shared page or the partial hash of a potential shared page. The paper also discusses the higher-level policies governing the dynamic allocation of memory to each VM. The authors devise mechanisms like the 'idle memory tax', per-VM shares, and min-max guarantees as per SLAs. The idle tax helps the ESX server calculate the amount of memory to reclaim from a VM. The overall free memory should not fall below the lowest threshold; as free memory drops below the higher thresholds, the server reclaims memory from VMs increasingly aggressively.
4. Evaluation
The authors measure the performance downside of memory overcommitment with a virtualized physical address space using real-world applications. They also describe the memory overhead, which seems small. Different parameters influence performance, such as the tax rate in the experiments, but the paper does not describe the impact of other parameters, nor does it present worst-case examples.
5. Confusion
The use of hint frames in ESX Server's content-based sharing implementation was not clear. What exactly is I/O page remapping? What is the significance of PAE mode?
Posted by: Rohit Damkondwar | February 7, 2017 03:40 AM
Summary
This paper focuses entirely on memory management policies and mechanisms in VMware ESX Server. It covers memory virtualization, memory reclamation techniques, memory sharing, and allocation policies.
Problem
The rise of cheaper shared-memory multiprocessors has led to the advent of virtualization techniques for efficient utilization of hardware. This comes with several challenges: isolation between VMs, load balancing, resource management, and the ability to run commodity OSes without any changes, all while keeping overhead low. This paper provides solutions to the memory management challenges through various mechanisms, techniques, and policies.
Contribution
The main contribution of the paper is the novel algorithms and techniques for memory management to efficiently handle multiple VMs running on the same machine.
1. Memory Virtualization: to translate guest virtual addresses to host physical/machine addresses, the ESX server maintains per-VM shadow page tables. This lets the TLB cache VA-to-MA translations directly.
2. Memory Reclamation techniques: ballooning is a way of creating artificial memory pressure in a guest OS to reclaim pages allocated to it. This clever technique obviates a generic eviction algorithm across all VMs and instead uses the guest OS's own memory management algorithms. The idea is to pin the required number of pages inside the guest OS so that the underlying machine pages can be reclaimed by the ESX server.
3. Though ballooning works well, it has limitations due to guest OS allocation limits and the possibility of the balloon driver being unloaded. The solution is to fall back on good old demand paging, swapping pages without any guest involvement; a randomized page replacement algorithm is used.
4. Transparent Page Sharing: similar to Disco, ESX also shares read-only data and code across VMs, marking shared pages copy-on-write so that writes to shared pages remain private.
5. Content-based page sharing: in this technique, pages are shared based purely on their contents, regardless of which VM they belong to. If the contents are identical, two or more VMs can share a single copy. Hash values of pages are used to find candidate matches; once the hashes of two pages match, the pages are checked byte-for-byte for identity. Unshared pages are recorded as hints to be checked later; if a hinted page has not been modified since the last scan, it remains a potential candidate for sharing.
6. Resource Entitlement: a share-based approach is used, representing the relative resource rights of clients. This ratio helps in maintaining SLAs with customers.
7. Idle memory reclamation: though the share ratios are fair, a lower-share client may suffer under memory pressure while a higher-share client hoards idle pages. This technique therefore samples page usage in VMs and reclaims idle pages via ballooning and/or swapping (the idle memory tax).
Evaluation
The author does a good job of providing a detailed analysis of the ESX Server, with workloads analyzing the performance of each of the techniques mentioned above. Starting with balloon performance, the author shows an overhead of 1.4% to 4.4%. With page sharing, ESX reclaims about 60% of memory for identically configured machines. Most interesting was Figure 7, which demonstrates the advantages of the idle memory tax: a VM with higher memory utilization benefits from pages reclaimed from idle VMs.
Confusions
1. How are virtual addresses from different VMs differentiated in the TLB (two applications may generate the same VA) when using shadow page tables? Does it use something like ASIDs in the TLB?
2. Don't the periodic TLB flushes for sampling idle pages lower the performance of the guest OS?
3. It was a bit obscure to understand the "High" and "low" memory issues with I/O page remapping.
Posted by: Pradeep Kashyap Ramaswamy | February 7, 2017 02:52 AM
1. Summary
ESX Server is a virtual machine monitor made by VMWare that runs commodity operating systems without modifications. Some of the mechanisms and policies used by ESX Server include ballooning, content-based page sharing, share-based memory allocation, an idle memory tax, and dynamic reallocation.
2. Problem
Disco required changes to the IRIX operating system to run virtual machines. For example, the IRIX bcopy() routine was modified for Disco to allow virtual machines to share their file buffer cache, a form of transparent page sharing. Furthermore, Disco and other virtual machine monitors did not achieve full utilization of memory, CPU, and other resources, especially when virtual machines requested large amounts of these resources but later remained idle.
3. Contributions
The authors implemented ESX Server as a commercial product. To control the amount of memory occupied by a virtual machine, ESX Server uses a "ballooning" mechanism. A balloon module in the operating system can "inflate" to reclaim guest physical addresses for the VMM, or "deflate" to return memory to the guest. Content-based page sharing is a mechanism in which guest physical pages in different operating systems that contain the same contents can be marked as copy-on-write under the covers, allowing the VMM to reclaim one page.
Share-based memory allocation is a policy in which each virtual machine has a certain number of "shares", and whenever one virtual machine demands more pages, the VMM revokes pages from the client with the fewest shares per allocated page. The idle memory tax is related to sharing, and it allows the system administrator to configure the importance of the number of shares versus the number of idle pages on the revocation of pages from a virtual machine. Dynamic reallocation allows ESX Server to adjust to changes in allocation parameters and system load; the system tries to keep the amount of free machine memory above certain thresholds.
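For reference, the paper's dynamic reallocation uses four free-memory thresholds (defaulting to 6%, 4%, 2%, and 1% of system memory) named high, soft, hard, and low. A simplified C sketch of how the state might be selected, ignoring the hysteresis the paper applies between states (the exact threshold-to-state mapping here is my simplification):

```c
/* Simplified sketch of the reclamation states described in the paper;
 * threshold values follow the paper's defaults, but the mapping and
 * hysteresis between states are glossed over here. */
typedef enum { MEM_HIGH, MEM_SOFT, MEM_HARD, MEM_LOW } mem_state_t;

static mem_state_t mem_state(double free_frac)   /* fraction of memory free */
{
    if (free_frac > 0.06) return MEM_HIGH;   /* enough free: no reclamation */
    if (free_frac > 0.04) return MEM_SOFT;   /* reclaim via ballooning      */
    if (free_frac > 0.02) return MEM_HARD;   /* reclaim via forced paging   */
    return MEM_LOW;                          /* also block VMs over target  */
}
```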
4. Evaluation
The evaluation in this paper is very extensive. In Figure 2, the authors show that ballooning allows the virtual machine to provide nearly the same throughput as a virtual machine configured to the desired memory size. Figures 4 and 5 show that page sharing allows virtual machines to share large amounts of pages and ESX Server to reclaim large amounts of memory, assuming all virtual machines are running Linux. Figure 7 shows that share-based allocation with the idle tax behaves as expected with two virtual machines. Lastly, Figure 8 shows the effect of dynamic reallocation on memory allocation for multiple virtual machines. The evaluation would have been slightly better if there was more explanation of some of the fluctuations in Figures 7 and 8.
5. Confusion
I did not understand the problem with using 36-bit addresses inside a virtual machine, and how IO page remapping solved this problem.
Posted by: Varun Naik | February 7, 2017 02:44 AM
1. Summary
VMware ESX Server is a hypervisor designed with the goal of efficiently managing the hardware resources among various virtual machines running unmodified commodity operating systems. This paper talks about various mechanisms and policies employed by ESX server for memory resource management.
2. Problem
Memory resource requirements of virtual machines keep on changing dynamically. With static allocation, a lot of resources remain unused which results in systems not achieving their peak performance. Some solutions were provided before but they involved modification of operating systems. This paper tries to tackle the problem of poor resource utilization without altering the operating systems.
3. Contribution
ESX server uses various techniques for efficient memory resource utilization.
Ballooning:
A balloon driver is loaded into the guest operating system as a pseudo-device driver. The balloon can be inflated by allocating and pinning physical pages within the VM, and deflated by releasing those pages. Whenever the server needs to reclaim memory, it instructs the balloon to inflate; the balloon then informs the ESX server of the physical page number of each page it has allocated.
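To make the mechanism concrete, here is a rough, hypothetical sketch of what the guest-side inflate path of such a driver could look like, in Linux kernel style. The paper does not describe the driver's code or its private channel to the hypervisor; esx_report_ppn() below is an invented stand-in for that channel.

```c
/* Hypothetical sketch of a balloon driver's inflate loop (not VMware's
 * actual code). esx_report_ppn() stands in for the private guest-to-
 * hypervisor channel, whose real interface the paper does not describe. */
#include <linux/gfp.h>
#include <linux/list.h>
#include <linux/mm.h>

static LIST_HEAD(balloon_pages);      /* pages currently held by the balloon */

extern void esx_report_ppn(unsigned long ppn);   /* assumed channel wrapper */

static void balloon_inflate(unsigned long target_pages)
{
    unsigned long held = 0;

    while (held < target_pages) {
        /* Ask the guest kernel for a page; the guest's own replacement
         * policy decides what to evict or swap to satisfy this request. */
        struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NOWARN);
        if (!page)
            break;                     /* guest is under pressure; back off */

        list_add(&page->lru, &balloon_pages);

        /* Tell ESX which guest physical page is now pinned, so the
         * hypervisor can reclaim the machine page backing it. */
        esx_report_ppn(page_to_pfn(page));
        held++;
    }
}
```

Deflating would walk balloon_pages in reverse, notifying the hypervisor and returning each page to the guest with __free_page().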
Content Based Page Sharing:
The ESX server finds duplicated pages by checking their content. Only one copy of such a page is stored in memory, and all VMs use the same copy as long as they do not modify it; on a modification, a private copy is created for that VM.
Share-Based Allocation and Idle Memory Reclamation:
Different VMs can be allocated different shares of memory based on their priority. This is a fair allocation policy, and all VMs' allocations degrade gracefully in overload situations. But it can lead to situations where a client with many shares sits idle and wastes memory. The "idle memory tax" tackles this problem by charging VMs more for idle memory than for active memory; pages are preferentially reclaimed from such VMs under overload.
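The paper formalizes this as an adjusted shares-per-page ratio: with tax rate tau, an idle page costs k = 1/(1 - tau) times as much as an active page, and memory is revoked from the VM with the lowest ratio. Below is a small illustrative C sketch of that metric; the surrounding bookkeeping (the vm_t struct, pick_victim) is invented for the example.

```c
/* Sketch of the min-funding revocation metric: reclaim from the VM with
 * the lowest shares-per-page ratio, where idle pages are charged extra
 * via the tax rate tau (k = 1/(1 - tau)). Illustrative only. */
#include <stddef.h>

typedef struct {
    double shares;        /* S: memory shares assigned to the VM     */
    double pages;         /* P: pages currently allocated to the VM  */
    double active_frac;   /* f: estimated fraction of pages in use   */
} vm_t;

static double adjusted_ratio(const vm_t *vm, double tau)
{
    double k = 1.0 / (1.0 - tau);              /* cost multiplier for idle pages */
    return vm->shares /
           (vm->pages * (vm->active_frac + k * (1.0 - vm->active_frac)));
}

/* Pick the VM from which to reclaim memory next. */
static size_t pick_victim(const vm_t *vms, size_t n, double tau)
{
    size_t victim = 0;
    for (size_t i = 1; i < n; i++)
        if (adjusted_ratio(&vms[i], tau) < adjusted_ratio(&vms[victim], tau))
            victim = i;
    return victim;
}
```

With tau = 0 this reduces to plain proportional-share allocation; as tau approaches 1, idle pages count for almost nothing and are reclaimed first.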
4. Evaluation
This paper is very well written and presents all the ideas clearly and concisely. It provides evaluations for all the techniques. Ballooning was evaluated using dbench and showed very low overhead (1.4% to 4.4%), insignificant compared with the advantages it provides. Page sharing was evaluated on SPEC95 benchmarks and showed that nearly 67% of memory can be shared and nearly 60% of all VM memory reclaimed. The paper also provides experimental results for active memory sampling, the idle memory tax, and dynamic reallocation, all of which present a positive picture of the system.
5. Confusion
a. How is the shadow page table modified when the OS in a VM changes the guest-virtual-to-guest-physical mapping of a page?
b. How is the TLB virtualized for VMs? Does each VM get some portion of the TLB, or does the TLB hold the VM number and process ID along with the guest-virtual-to-machine mapping?
c. I could not understand I/O page remapping.
Posted by: Gaurav Mishra | February 7, 2017 02:20 AM
Summary :
VMware ESX Server is a hypervisor which efficiently manages hardware resources such as CPU and memory among concurrently running virtual machines. This paper focuses on the memory management techniques adopted by the ESX Server: ballooning to reclaim the least valuable memory, the idle memory tax to ensure efficient memory utilization without compromising performance isolation, and content-based page sharing to eliminate redundancy. All of these are implemented without any modification to the VMs' operating systems.
Problem :
The rapid increase in inexpensive shared-memory multiprocessors made virtual machines an attractive option for reducing server maintenance costs and improving resource management. Previous virtual machine designs required some modifications to the commodity operating system. Allowing the hypervisor to select the pages to reclaim can result in the double paging problem.
Contributions
> ESX Server doesn’t require any modifications to the VM’s operating system.
> It uses a new memory reclamation technique called ballooning, in which the hypervisor presents its request for pages to a balloon driver loaded into the guest OS, allowing the guest OS to decide which pages are least valuable and should consequently be reclaimed.
> Content-based sharing allows sharing among all pages that have the same content. This is achieved by computing a hash value and finding the corresponding match. Sharing can greatly reduce memory consumption and allow higher overcommitment.
> Memory allocation is done proportionally based on the shares owned by VMs. Memory sampling is used to estimate the fraction of active pages. The idle memory tax keeps the fraction of idle pages in check by reclaiming pages when the threshold is exceeded.
Evaluation :
This paper does a good job of evaluating the different memory management techniques it proposes. The amount of memory shared increases nearly linearly with the number of VMs, and they verify that throughput is not hurt by sharing. Experiments on various workloads testing the effectiveness of memory sampling, the idle memory tax, and dynamic reallocation also show promising results.
Confusion :
> How does I/O page remapping work?
> In the memory sampling technique, how are the three moving averages calculated? (What do the fast moving average and slow moving average mean here?)
Posted by: Pallavi Maheshwara Kakunje | February 7, 2017 01:38 AM
Summary
ESX Server is a VMM designed to multiplex hardware resources efficiently among virtual machines running unmodified commodity operating systems, using innovative memory allocation, sharing, and reclamation mechanisms.
Problem
The paper focuses on system memory management problems, including:
1. Optimizing memory reclamation by causing the guest OS to invoke its own memory management routines.
2. Efficiently (in terms of both overhead and size) share pages among different VMs
3. Identifying and reclaiming idle memory
Contributions
1. Adding an extra level of indirection (machine, physical, and virtual addresses) and performing PPN-to-MPN translation transparently to the VMs.
2. Using ballooning to cause the guest OS to invoke its own memory management routines instead of always relying on forced paging. Ballooning is also configured to minimize the overhead it introduces.
3. Using hashing to efficiently match pages with the same content for sharing. Content-based memory sharing can potentially share every shareable page. Shared pages are marked COW and a private copy is made on write.
4. Using statistical sampling to estimate the level of idle memory, and imposing an idle memory tax to achieve both performance isolation and efficient memory utilization.
Evaluation
- I like that the authors evaluate on two levels: 1. it works; 2. performance is good. This evaluation design is followed for both memory sharing and idle page management, where the author first uses a synthetic workload and then a real-world workload.
- The authors developed interesting and comprehensive evaluation mechanisms to test the results. I found the memory toucher interesting and useful for demonstrating that the memory sampling mechanism behaves as intended.
- One advantage of the paper is that ESX is an available commercial platform, so much convincing real-world data is provided.
- In Figure 5, the paper demonstrates a high percentage of sharing among identical VMs running the same OS, where much of the OS code can be shared. Taylor in our reading group raised a good point that the sharing percentage could be lower if the system is a mix of VMs running different OSes.
Confusion
1. Are there protections around ballooning? What if malicious software takes control of the balloon and puts memory pressure on all the VMs?
2. Why is it beneficial to make address translation transparent to VMs?
Posted by: Yunhe Liu | February 7, 2017 01:23 AM
Summary
The paper explains VMware ESX Server, a thin software layer sitting on top of the hardware, which performs virtualization and resource management without requiring modifications to or extensive support from guest operating systems. It provides strategies for page reclamation (ballooning), efficient memory utilization (idle memory tax), and page sharing (content-based page sharing) while maintaining performance isolation guarantees.
Problem
The problem of underutilization of individual servers is solved by consolidating servers as virtual machines on a single physical server with little or no performance penalty. Earlier attempts at solving this problem required modifications to the guest operating systems to achieve virtualization, which was tedious and introduced potential security risks.
Contributions
This paper introduced several novel mechanisms and policies that ESX server uses to manage memory.
Following are some of the main features :
Ballooning (A technique to coax a guest Operating System into cooperating with the server)
This technique reclaims pages from the guest operating system by loading a pseudo-device driver or kernel service, which inflates (making the OS free or swap out pages) and deflates (deallocating previously allocated pages).
Transparent page sharing
This is achieved through content-based page sharing. ESX scans the content of guest physical memory for sharing opportunities. Instead of comparing each byte of a candidate guest physical page to other pages, an action that is prohibitively expensive, ESX uses hashing to identify potentially identical pages.
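A simplified C sketch of how such a scan might work follows. The data structures are invented for illustration; the paper's real implementation also keeps per-frame back-references and uses a higher-quality 64-bit hash, both omitted here.

```c
/* Illustrative content-based sharing lookup: hash a candidate page,
 * chase the hash chain, do a full byte compare on a hash hit, and
 * otherwise record the page as a "hint" for future scans. */
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

typedef struct frame {
    uint64_t hash;
    const uint8_t *data;      /* page contents (canonical copy if shared) */
    int refcount;             /* >0: shared frame, 0: hint frame          */
    struct frame *next;       /* chaining handles hash collisions         */
} frame_t;

/* Any decent 64-bit page hash would do; FNV-1a is used here for brevity. */
static uint64_t page_hash(const uint8_t *page)
{
    uint64_t h = 1469598103934665603ull;
    for (int i = 0; i < PAGE_SIZE; i++)
        h = (h ^ page[i]) * 1099511628211ull;
    return h;
}

/* Returns the shared frame if the candidate page can be shared (the
 * caller would then map it copy-on-write), or NULL if it becomes a hint. */
static frame_t *try_share(frame_t **bucket, frame_t *candidate,
                          const uint8_t *page)
{
    uint64_t h = page_hash(page);

    for (frame_t *f = *bucket; f; f = f->next) {
        if (f->hash != h)
            continue;
        if (f->refcount == 0 && page_hash(f->data) != h)
            continue;                  /* stale hint: page changed since scan */
        if (memcmp(f->data, page, PAGE_SIZE) == 0) {
            /* 0 -> 2 promotes a hint frame to a shared frame; otherwise
             * just add another reference to the existing shared copy. */
            f->refcount = (f->refcount == 0) ? 2 : f->refcount + 1;
            return f;
        }
    }

    /* No match: record the candidate as a hint so a future scan of an
     * identical page elsewhere can find it. */
    candidate->hash = h;
    candidate->data = page;
    candidate->refcount = 0;
    candidate->next = *bucket;
    *bucket = candidate;
    return NULL;
}
```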
Idle memory tax
This is used to penalize VMs that have a large fraction of inactive allocated pages. The basic idea is to charge a client more for an idle page than for one it is actively using. The fraction of idle memory is computed statistically by sampling the VM's working set through invalidated mappings (active memory sampling).
Evaluations
The authors have extensively evaluated the efficiency of all the features introduced in the paper. Ballooning performance is evaluated using dbench, and page sharing results show that up to 60% of memory can be reclaimed.
The idle memory tax evaluation shows a 30% throughput increase in the experimental setting.
Confusions
It is kind of difficult to draw a line where the OS layer ends and the hypervisor layer starts. It would be great if this were explained with an example, tracing from the application layer through the guest OS to the hypervisor layer (maybe a system call to allocate memory on a process's heap).
This would also give a clear picture of how OSes and the hypervisor interact at a low level.
How does the hypervisor differentiate the mapping of guest virtual addresses to host machine addresses across VMs? (Is there a special bit maintained for each VM?)
Posted by: Om Jadhav | February 7, 2017 01:17 AM
1. summary
This paper describes the VMware ESX server, which was designed to provide an efficient VM environment without requiring any modification of the guest OS.
2. Problem
The problem VMware faced was that they did not have any control over which operating systems would be run on their system, yet they wanted to support unmodified operating systems. Memory management was also a major concern, as there was often a large amount of idle memory that could ideally be used by processes that were actually running. They wished to be able to overcommit memory to the guest systems without performance losses.
3. Contributions
Without modification of the guest OSes, correctly selecting an idle page of memory to page out can be difficult. They introduced a "balloon" module within guest systems that can be controlled by the ESX server. The balloon module can be signaled to allocate pages, forcing the guest OS itself to choose which old pages should be paged out instead of the server selecting somewhat arbitrarily. They also introduced a method of hashing page contents to facilitate shared pages between VMs: when pages have matching hashes, and their true contents match, the server can merge them into a single copy-on-write page, allowing better overprovisioning of memory. An idle memory tax was also included to allow better allocation of memory between VMs, favoring those that are actively using their allocated memory, which further allows more pages to be reclaimed and better overprovisioning.
4. Evaluation
Each of their memory allocation tools were evaluated over a range of use cases and generally shown to improve memory usage. I was surprised to see how much memory was able to be shared and especially that it actually improved performance slightly even with the overhead of checking for equivalent pages.
5. Confusion
When ballooning, can the ESX server be sure that the pages the VM is attempting to page out don’t belong to the balloon module or is it just a generally safe assumption?
Posted by: Taylor Johnston | February 7, 2017 01:08 AM
1. Summary
The concept of server virtualization has been rejuvenated by trends such as server consolidation and inexpensive shared-memory multiprocessors. VMware ESX Server achieves this virtualization by directly managing system hardware resources and running unmodified commodity operating systems, thereby achieving higher I/O performance and complete control over resource management.
2. Problem
To get the maximum benefit from statistical multiplexing of resources via server virtualization, resource allocation must be flexible enough to allow overcommitment while still providing resource guarantees to VMs, all while preserving high I/O performance. A major constraint in implementing this virtualization is that existing operating systems could not be modified.
3. Contributions
The paper introduces some beautiful concepts like ballooning, content-based page sharing, shares, and taxes. Ballooning is a reclamation mechanism which allows the ESX Server to provide predictable performance and to avoid traditional page replacement issues like the double paging problem. The ESX server maintains pmap data structures for address mapping, along with shadow page tables that feed the hardware TLBs.
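As a rough illustration of that extra level of indirection (the struct layout and field names below are invented, not ESX's actual data structures), a per-VM pmap entry and the composition of a shadow page-table entry might look like this:

```c
/* Illustrative sketch: guest "physical" page number (PPN) -> machine
 * page number (MPN) via a per-VM pmap, and a shadow PTE that combines
 * the guest's VPN->PPN mapping with the VMM's PPN->MPN mapping. */
#include <stdint.h>

#define NPPN (1u << 20)           /* a 4 GB guest with 4 KB pages */

typedef struct {
    uint32_t mpn;                 /* backing machine page          */
    uint8_t  swapped;             /* paged out to the ESX swap area? */
    uint8_t  shared;              /* content-shared (COW) page?    */
} pmap_entry_t;

static pmap_entry_t pmap[NPPN];   /* one pmap per VM */

/* Build a shadow page-table entry the hardware TLB can use directly:
 * VPN -> MPN, derived from the guest PTE (VPN -> PPN) plus the pmap. */
static uint64_t make_shadow_pte(uint32_t ppn, uint64_t guest_flags)
{
    pmap_entry_t *pe = &pmap[ppn];
    uint64_t flags = guest_flags;

    if (pe->shared)
        flags &= ~0x2ull;         /* clear writable bit: a write faults
                                     and triggers copy-on-write        */

    return ((uint64_t)pe->mpn << 12) | flags;
}
```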
Since transparent page sharing as done in Disco is nearly impossible to implement without changing the guest OS, ESX Server uses a completely different approach, content-based page sharing, where duplicate pages are identified by their contents. The paper also explains optimizations like hashing and storing metadata in hint frames.
Share-based allocation is the policy ESX Server employs to provide memory performance isolation without compromising memory utilization. The paper also introduces the concept of taxing idle memory in order to reclaim it, along with a statistical method to identify idle memory pages. The server further provides robust policies to dynamically reallocate memory, taking the new parameters into account, and keeps track of "hot" pages involved in repeated I/O by maintaining statistics about them.
4. Evaluation
The paper clearly explains most of the concepts in an easy to understand way. Almost all the concepts are backed by empirical analysis. The one area in which the paper is lacking is in answering the “how”s of some implementations. For example, it is not explained “how” the server intercepts VM instructions that manipulate guest OS page tables or TLB.
To demonstrate the performance of Ballooning, the paper presents the results of running dbench workloads. Similarly, the analysis of page sharing performance is also provided for different kinds of workloads. The paper also provides detailed experimental results for memory sampling, idle memory tax and dynamic reallocation.
5. Confusion
1. The default time for sampling i.e. 30s seems long to me. And the sampling happens multiple times. Wouldn’t this result in extremely low to no idle pages? Can you please explain the details of sampling methods to measure the idle memory pages?
2. There is a statement in the paragraph right below the Ballooning diagram – “ When guest ppn is ballooned, the system annotates its pmap entry and deallocates the associated MPN”. I wasn’t able to understand what was being conveyed in this statement and the one following this.
Posted by: Sharath Hiremath | February 7, 2017 01:03 AM
1.Summary
VMware ESX Server is a software layer which virtualizes and manages the hardware resources and allocates them to the various VMs running on top of it, without needing to alter the operating systems that run inside the VMs. ESX introduced novel techniques: reclaiming memory from a virtual machine via ballooning, ensuring fair utilization of allocated memory via the idle memory tax, and sharing memory between VMs via a content-based sharing approach.
2. Problems
-> In previous efforts to virtualize hardware resources, like Disco, the operating system running inside the VMs had to be altered for resource management.
-> When reclaiming resources from virtual machines, the hypervisor had to pick which pages to reclaim based on some policy, even though, unlike the OS, it has no inside knowledge of which pages are idle or rarely used and could be paged out.
-> The problem of double paging arises when both the OS and the hypervisor page out the same page.
3. Contributions
The ESX server maintains a pmap for each VM which maps the VM's physical page numbers to the actual machine page numbers. It also maintains shadow page tables which map the VM's virtual addresses directly to machine page numbers, for use by hardware such as the TLB.
Main contributions of this paper are as follows:
Ballooning - when ESX needs to reclaim memory from the OS, it inflates the balloon device driver, which triggers the memory management algorithms in the guest OS, which in turn gives pages to the balloon. These pages are then reclaimed by ESX.
Content Based Sharing - pages which have exactly the same contents are backed by the same machine page by ESX. This is implemented using hashing: a hash of a page's contents is used as a key to look up a hash table; on a match, the actual pages are compared to see if they match exactly, and if so a single copy of the page is kept and marked copy-on-write. Whenever a shared page is altered in any VM, a private copy of the page is created for that particular VM.
Shares and Working Sets - rights to resources are encapsulated by shares. The system administrator can set the shares for each VM depending on the priority of the VM/client, and thus the amount of resources allocated to a VM depends on the shares it holds.
Idle Memory Tax - this technique takes idle pages away from a VM; idle pages are found using a sampling technique.
Evaluation
The ESX server is evaluated on various kinds of production workloads, and the results show that the novel techniques introduced in the paper make applications run more efficiently, with very little overhead, in a virtual environment.
Confusion
1. Explain the I/O page remapping part of the paper.
2. Can ESX read the page tables of an individual VM? (How would ESX know about read-only pages of the OS, or pages from which code has been executed, in order to pick these up while scanning for sharing candidates?)
Posted by: Sowrabha Horatti Gopal | February 7, 2017 12:54 AM
1. Summary
The paper describes mechanisms and policies in VMware ESX Server for efficiently supporting virtual machine workloads that overcommit memory. These are designed to work without modifying the guest operating system and are useful for server consolidation.
2. Problem
Virtualizing memory for operating systems running on a VMM is difficult because the VMM has to manage the underlying machine memory transparently while providing performance isolation. Adding an additional page replacement policy over that of the guest operating system may cause conflicts in their choices, resulting in performance anomalies. Also, existing work on VMMs like Disco had to do minor modifications to guest operating system kernel in order to effectively virtualize them.
3. Contributions
In order to overcome the undesired effect of conflicts between the page replacement policies of the virtual machine monitor and the operating system, a technique called ballooning is introduced. A balloon can be introduced as a kernel service in a guest OS, and then be inflated to increase memory pressure in the virtual machine, or deflated to decrease it. This can be used to make the guest OS cooperate with the policies of the VMM. Because ballooning takes time to take effect, demand paging is used as a fallback mechanism when ballooning is unavailable or too slow. Content-based page sharing is used to reclaim pages which contain identical contents across virtual machines. Pages that can be shared are identified by hashing the contents of the page and looking up a hash table of scanned pages; these pages are marked copy-on-write when shared. The idea of share-based allocation is extended with an idle memory tax, which can be used to reclaim pages from a virtual machine that has a large amount of memory but is not actively using most of it.
4. Evaluation
The evaluation of the ideas presented in the paper is comprehensive. The ideas are tested on real hardware, with real-life workloads and for long periods of time. The techniques seem to work well and provide significant benefit. My only gripe is that they do not discuss how reliant they are on hardware support provided by the architecture and how easy it is to port these to architectures other than x86.
5. Confusion
How does Physical Address Extension (PAE) work, and why is it a problem for I/O transfers?
Posted by: Suhas Pai | February 7, 2017 12:44 AM
1. Summary
The paper talks about memory management in VMware ESX, a type 1 hypervisor. It introduces several novel mechanisms and policies - ballooning, content-based page sharing, hot I/O page remapping - to manage memory.
2. Problem
In traditional computing environments, memory is often underutilized, which allows servers to be consolidated as VMs on one machine with little or no penalty. In such a virtualized setup, one should be able to flexibly overcommit memory. From VMware's perspective, the setup should also allow VMs to run unmodified OSes, which is difficult because of the inability to influence the design of the guest OS running within the VMs.
3. Contributions
Important contributions are :
i) Ballooning mechanism, which coaxes the guest OS to cooperate with the ESX hypervisor in reclaiming pages considered least valuable.
ii) Idle memory 'tax': this charges the VM client more for an idle page than for one it is actively using, which allows efficient memory utilization.
iii) Content-based page sharing, which identifies pages by contents. This eliminates the need to understand guest OS code and more importantly such pages can now be easily shared based on the contents.
iv) Hot I/O page remapping, which maintains statistics to keep track of ‘hot’ pages and thereby remap them to low memory.
4. Evaluation
The effectiveness of the ballooning mechanism is demonstrated using the dbench benchmark with no significant reduction in performance. ESX's page sharing implementation is studied by considering the effectiveness of memory reclamation and the overhead on system performance. The idle memory tax is studied using two VMs: with an increased tax rate, idle memory is reclaimed from one and allocated to the other, boosting its performance by about 30%. Overall, every memory management technique proposed in the paper is evaluated and shown to be effective.
5. Confusion
I am not sure if I totally understand the I/O Page remapping and the notion of high/low memory.
Posted by: Dastagiri Reddy Malikireddy | February 7, 2017 12:41 AM
Summary
The paper describes various design principles used for memory management in ESX Server, a Virtual Machine monitor used to multiplex hardware resources among multiple commodity operating systems running on a single physical machine.
Problem
The main challenges the paper tries to tackle are: efficiently managing machine memory with minimal or no changes to the guest operating systems' kernel code (which earlier popular hypervisors like Disco could not achieve), giving system administrators the flexibility to overcommit various resources, and allocating those resources fairly.
Contributions
ESX Server introduced various mechanisms and policies to drastically improve memory usage and server consolidation.
a. To reduce aggregate memory consumption across all the VMs, ESX uses the shared pages idea, where multiple physical pages map to one machine page. The implementation differs significantly from Disco: ESX identifies identical pages by their content instead of hooking into guest kernel code.
b. To ensure fairness in distributing resources, ESX employs a share-based model to redistribute resources dynamically. Memory is reclaimed from VMs with the lowest number of shares per allocated page, though the number of actively used pages is also considered. A mechanism called ballooning is used to reclaim the memory.
c. ESX added support for remapping pages from high memory addresses to low ones. This can be very useful in reducing the data-copy overhead during common I/O transfers where the I/O devices can address only low memory.
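A hedged sketch of the idea in C follows; the threshold and helper functions are invented, since the paper only says that per-page I/O statistics are kept and that "hot" pages get remapped to low memory.

```c
/* Illustrative sketch of hot I/O page remapping: count bounce-buffer
 * copies made on behalf of a high page, and permanently remap pages
 * that cross a (made-up) threshold into low memory. */
#include <stdint.h>

#define HOT_THRESHOLD 16           /* copies before remapping (invented)  */
#define LOW_MEM_LIMIT (1u << 20)   /* MPNs below 4 GB with 4 KB pages     */
#define MAX_MPN       (1u << 24)   /* enough counters for 64 GB of RAM    */

extern uint32_t copy_to_low_bounce_buffer(uint32_t high_mpn);  /* assumed */
extern uint32_t allocate_low_mpn(void);                        /* assumed */
extern void     remap_ppn(uint32_t ppn, uint32_t new_mpn);     /* assumed */

static uint16_t io_count[MAX_MPN];    /* per-machine-page I/O copy counts */

/* Called when a device that can only address low memory needs to DMA
 * to/from the page backing this PPN; returns the MPN to hand the device. */
uint32_t io_target_mpn(uint32_t ppn, uint32_t mpn)
{
    if (mpn < LOW_MEM_LIMIT)
        return mpn;                               /* already DMA-addressable */

    if (io_count[mpn] < HOT_THRESHOLD) {
        io_count[mpn]++;                          /* occasional copy is cheap */
        return copy_to_low_bounce_buffer(mpn);
    }

    /* Page is "hot": move it to low memory once, so future I/O to it
     * needs no copying at all. */
    uint32_t low = allocate_low_mpn();
    remap_ppn(ppn, low);
    return low;
}
```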
Evaluation
The paper evaluates the memory consumption of the system under a variety of loads while gradually adding the policies described above. While the effects of some policies like memory sharing are very workload dependent, others like share-based allocation and I/O remapping clearly stand out in improving memory usage under different kinds of loads.
Confusion
I’ve a few questions:
a. The hash table used for matching page content which is maintained at the server level has fixed size. However since the size of VMs can exceed this, is there some kind of a replacement policy for the hash table ?
b. To tackle the double paging problem why can’t we somehow invalidate the page table entries for the physical addresses maintained inside the VMs for pages which are paged out by the Server ?
c. Why should the balloon driver poll the server periodically? Is it not easy for the server to notify the driver?
Posted by: Mayur Cherukuri | February 7, 2017 12:21 AM
1. Summary
Virtual machines are one of the most prominent ways to achieve higher hardware utilization while providing full isolation. VMware continues the pursuit of higher efficiency by extending work on Disco and other type 1 hypervisors. Most notably, they opt not to modify existing kernels, in an effort to generalize the solution and minimize the effort required.
2. Problem
Memory is considered the primary bottleneck in these systems. To improve utilization, hypervisors, like OSes, overcommit memory since many processes sit idle. The goal of this work is to identify ways to manage memory more efficiently and to gracefully roll with the punches as system memory pressure increases and decreases.
3. Contribution
The first novel contribution is the use of hashing for content-based page sharing. This allows the hypervisor to be significantly more efficient as a direct comparison against every other page is not necessary. The use of the hash function is safe; on detection of equivalent hashes, the pages are compared to verify they indeed are equivalent.
Second, the idea of ballooning effectively tackles the knowledge problem for page reclamation. Traditionally, the hypervisor has little insight into what pages should be reclaimed. By using a driver to pin pages and give them back to the hypervisor, the decision to which page to evict is passed back to the OS. It also removes the chance of double paging where the page that is paged out of memory by the hypervisor is then paged out by the OS.
Third, statistical sampling of pages identifies memory idleness, and combined with the share system it allows the hypervisor to allocate memory more elastically. The combination of moving averages helps the system gracefully absorb memory pressure and smoothly release memory.
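A rough sketch of how such an estimate could be maintained is below. The constants and the two-average simplification are mine; the paper actually combines three estimates (slow, fast, and one for the sampling period in progress) and samples roughly 100 pages per 30-second period by invalidating their mappings.

```c
/* Sketch of an active-memory estimate driven by page sampling, assuming
 * the VMM invalidates mappings for SAMPLE_SIZE pages each period and
 * counts how many fault back in. Weights are illustrative. */
#define SAMPLE_SIZE 100.0

typedef struct {
    double slow_ewma;     /* long-horizon average, decays slowly       */
    double fast_ewma;     /* short-horizon average, reacts quickly     */
} activity_est_t;

static double ewma(double prev, double sample, double weight)
{
    return weight * sample + (1.0 - weight) * prev;
}

/* Called at the end of each sampling period with the number of sampled
 * pages the guest touched; returns the fraction of memory deemed active. */
static double update_activity(activity_est_t *est, int touched)
{
    double frac = touched / SAMPLE_SIZE;

    est->slow_ewma = ewma(est->slow_ewma, frac, 0.1);
    est->fast_ewma = ewma(est->fast_ewma, frac, 0.5);

    /* Taking the max makes allocations respond quickly to rising
     * activity but decay slowly when the VM goes idle. */
    return est->fast_ewma > est->slow_ewma ? est->fast_ewma
                                           : est->slow_ewma;
}
```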
Lastly, hot I/O page remapping is an important observation and a clever hack around poor device support. By preferentially allocating VM memory above the 4 GB boundary initially, low memory for actually communicating with devices can be treated as another resource; on frequent use, guest pages can be remapped into that low region, removing the additional copy to an intermediary page.
4. Evaluation
This paper is one of the clearest papers I've read, and it is easy to follow the thought process throughout. While ballooning and content-based sharing offer the most insight, it would have been nice to go into more detail on the statistical sampling and dynamic reallocation. Would it be possible to tune the parameters of the moving averages to achieve better estimates for specific workloads? And for dynamic reallocation, what happens if we choose different thresholds for the states? What is the performance implication of fewer states?
5. Confusion
Is there a situation where I/O remapping can cause worse performance among several VMs attempting to communicate with the same set of devices?
Posted by: Dennis Zhou | February 6, 2017 11:48 PM
1. Summary
This paper describes the VMware ESX Server's memory management techniques and evaluates those techniques on real-world systems.
2. Problem
The main problem that this paper addresses is how to utilize memory efficiently across multiple virtual machines. The paper gives specific problems that each of its memory management techniques tries to address. For page replacement, this means avoiding another level of paging and letting the guest OS choose which page to give up. Another problem is finding pages that can be shared without changing the OS, as Disco does, and without causing too much overhead.
3. Contributions
The first of the memory management techniques the paper describes is ballooning, a method of recovering pages from and returning pages to a guest machine. This is accomplished by loading a module into the guest OS which can then inflate by allocating pinned physical pages. The major benefit of this method over meta-level page replacement is that ballooning allows the guest OS to decide which pages to give up. The next technique described is how the ESX Server implements page sharing. The ESX Server uses hashing to find identical pages efficiently, and after further checking that the full pages are identical, the machine page can be shared. The paper also describes the algorithm for determining page idleness and uses an “idle memory tax” to determine which pages should be reclaimed.
4. Evaluation
The paper does a good job of evaluating each technique after presenting it. For ballooning, this meant measuring the overhead of the technique on throughput, which turned out to have a seemingly reasonable maximum of only 4.4%. However, the first shared-memory evaluation chose the best case of homogeneous VMs, showing only that with a specific VM setup you could achieve a maximum of 67% sharing. The second evaluation, using real-world setups that achieved over 40% sharing with no significant overhead, was the impressive portion.
5. Confusion
I was confused by the idle memory tax portion of the paper, and by why the authors used a 30-second sampling period, which seems like a very long time.
Posted by: Brian Guttag | February 6, 2017 11:41 PM
1. Summary
A very joyful read. This paper is an overview of VMware's ESX Server, a hypervisor layer designed by VMware to multiplex virtual machines on a single physical machine. It presents some very interesting techniques for more efficient memory management and memory sharing for virtual machines. These include ballooning for efficient page reclamation, an idle memory tax for better memory utilization across VMs, and content-based intra- and inter-VM page sharing.
2. Problem
Memory management for guest operating systems running atop shared physical hardware is challenging. The memory demand of the VMs can change dynamically: memory pressure may increase on some VMs while memory of other VMs sits underutilized. Moreover, since the VMs might be running the same OS or even the same applications, there will be multiple redundant copies of the same code and data in machine memory, wasting resources. The fact that the hypervisor that has to do this management is blind to OS-level memory policies and operations makes it even more challenging.
3. Contributions
ESX Server uses ballooning to dynamically reclaim pages from the target VM. A balloon is a module that is installed as a driver in the guest kernel and has a private channel to communicate with the ESX server. The reclamation is performed according to the guest OS's own reclamation policy. To reclaim pages, ESX asks the balloon module to inflate. The guest OS assigns physical pages to the balloon to fulfill that request, either by handing over free pages or by reclaiming pages from its own memory. The balloon reports those physical page numbers to ESX, which can then look into its pmap to locate and reclaim the corresponding machine pages.
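A very simplified sketch of that interaction (the toy pmap, page numbers, and the guest-side allocator are stand-ins; the real driver uses the guest kernel's native allocation and pinning interfaces):

# hypervisor side: one pmap per VM, mapping guest "physical" pages (PPNs) to machine pages (MPNs)
pmap = {ppn: 1000 + ppn for ppn in range(8)}   # toy PPN -> MPN mapping for one VM
free_machine_pages = []

def guest_balloon_inflate(n_pages):
    """Guest driver: allocate and pin pages, then report their PPNs over the private channel."""
    return [0, 3, 5][:n_pages]                 # pretend the guest OS handed us these PPNs

def server_reclaim(n_pages):
    for ppn in guest_balloon_inflate(n_pages):
        mpn = pmap.pop(ppn)                    # unbind the machine page backing the pinned PPN
        free_machine_pages.append(mpn)         # now reusable for other VMs

server_reclaim(2)
print(free_machine_pages)                      # machine pages reclaimed without guessing by the VMM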
Another interesting technique ESX uses is content-based sharing of machine pages. A background service computes a hash of each page's contents; pages with the same content are marked COW and shared by modifying the entries in the pmap. This sharing can be inter- or intra-VM and saves a lot of memory.
Another very interesting technique used in ESX is the reclamation of idle memory. This is achieved by using a parameter called the idle memory tax and penalizing each VM's idle pages in proportion to that tax. To measure the idle fraction of a VM's pages, ESX randomly selects a small number of physical pages and invalidates the cached mappings (such as TLB entries) for them. The next access to such a page results in a hypervisor trap, so a touch count for the sampled pages can be maintained. The count for idle pages stays at or near zero, which yields the fraction of idle pages for the VM.
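A sketch of the sampling estimate (the sample size, page counts, and bookkeeping are illustrative; in ESX it is the trap on the next access that re-establishes the mapping and bumps the counter):

import random

def estimate_active_fraction(all_ppns, pages_touched_this_period, n_samples=100):
    """Invalidate mappings for a random sample of pages; count how many get touched again."""
    sample = random.sample(all_ppns, min(n_samples, len(all_ppns)))
    touched = sum(1 for ppn in sample if ppn in pages_touched_this_period)
    return touched / len(sample)               # idle fraction is 1 minus this value

ppns = list(range(10_000))
touched = set(random.sample(ppns, 2_500))      # pretend 25% of memory was actually used
print(round(estimate_active_fraction(ppns, touched), 2))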
4. Evaluation
The evaluation in this paper was quite comprehensive. For each of the major techniques, experimental results on appropriate benchmarks were shown to prove its effectiveness. For example, in Section 3, the authors show that the overhead of ballooning decreases VM throughput very little. Similarly, for page sharing, the authors show that the amount of saved memory increases nearly linearly with the number of identical Linux VMs; this is due to the high amount of identical code since the OS is the same. To strengthen their point, results for memory saved with page sharing under a real-world load with heterogeneous guests are also shown.
5. Confusion
Will every guest need to install the balloon driver? What if a VM runs without it?
Why does the Xen paper have so many more citations than this one?
Papers from industry always seem easier to understand (in my very limited experience), e.g., MapReduce, GFS, ESX, and a few more. Why?
Posted by: Hasnain Ali Pirzada | February 6, 2017 10:51 PM
Summary:
VMware ESX server is a thin software layer designed to multiplex hardware resources efficiently among multiple virtual machines running unmodified commodity operating systems. The paper presents novel memory management mechanisms and policies employed in ESX server design.
Problem:
Many VMMs modify guest OS code or rely on traditional paging schemes to virtualize memory among virtual machines, which has a lot of overhead. VMMs should be able to utilize memory efficiently while still providing performance isolation guarantees, without any modifications to the guest operating systems.
Contributions:
VMware ESX Server directly controls the system hardware, unlike other VMware products which run on top of a pre-existing operating system. The following memory management mechanisms are used in the ESX Server design:
Ballooning: A small balloon module is loaded into the guest OS as a pseudo-device driver. When the server wants to reclaim memory, it instructs the driver to “inflate” by allocating pinned physical pages within the VM using native interfaces; deflating the balloon releases those pages back to the guest. The system falls back to a paging mechanism when ballooning is not possible.
Content based sharing: Pages are shared based on their contents among VMs. A hash value that summarizes a page’s content is used as a look up key into a hash table containing entries for other pages that are already shared (or marked as COW).
Idle memory tax: Proportional-share algorithms to allocate memory to VMs are ineffective due to their lack of knowledge of active memory usage. Idle memory tax solves this problem by allowing pages to be reclaimed from clients not actively using their full allocation.
I/O page remapping: ESX server maintains statistics to track “hot” pages in high memory that are involved in repeated I/O operations. After a certain threshold, “hot” page is remapped to low memory.
Evaluation:
The paper provides a detailed evaluation of the different techniques presented. Ballooning introduces an overhead of not more than 4.4% over VMs with no ballooning. Content-based sharing among 10 VMs running homogeneous workloads allows reclamation of nearly 60% of all VM memory. The dbench workload benefits significantly from the additional memory made available by the idle memory tax scheme. Dynamic memory reallocation policies are evaluated by running five VMs on a system with more than 60% overcommitted memory.
Confusion:
I did not quite understand the statistical sampling approach to estimate aggregate VM working set.
Posted by: Neha Mittal | February 6, 2017 10:41 PM
1) Summary
The authors share insights and design decisions from the VMware ESX VMM. In particular, they focus on memory management mechanisms and policies which they use. They describe ballooning for coaxing guest OSes into releasing memory, content-based page sharing for reducing redundant pages in memory, and the idle-memory tax for efficient memory allocation.
2) Problems
Previous work such as Disco never fully accomplished the goal of transparency to an unmodified guest OS. Its authors often resorted to tricks, hacks, and minor changes to the guest kernel. However, in ESX's commercial environment, this is not feasible; guests must run completely unaltered.
Moreover, while Disco's goal was to run commodity OSes on new hardware, such as NUMA systems, ESX's goal is to efficiently consolidate servers to improve server utilization. To this end, ESX must be able to efficiently overcommit and allocate/reclaim resources for multiple running server workloads in a commercial environment.
In particular, the authors of this VMware paper describe a few problems regarding memory management. VMMs have to reclaim and allocate memory without any knowledge of the importance of the processes being allocated memory. For example, the VMM has no way of knowing whether a guest is running an idle process or an important computation, yet it still has to decide which VM to allocate memory to. Moreover, when the VMM is under memory pressure, it must reclaim memory without knowing which pages would cause the least performance reduction or which pages are not in use.
The authors also mention the double-paging problem, in which the VMM swaps out a page and the guest pulls the page back into memory only to swap it back out again. The underlying problem here is that often, the goals of the guest OS conflict with the goals of the VMM.
Finally, sharing pages between VMs for efficiency is difficult because the VMM has no semantic information about the pages. Disco resorts to guest OS modifications, but this is undesirable.
3) Contributions
The authors describe three memory management techniques and two policies.
First, they describe their ballooning technique as a way to coax the guest OSes to release memory. This decreases memory pressure on ESX. More importantly, though, ballooning allows ESX to make reclamation decisions without acting blindly. In fact, it is a clever way to make guest OSes choose which pages to release themselves; the ESX server makes guests feel its own memory pressure.
Second, the authors describe content-based page-sharing as an optimization to reduce memory pressure. They find that a significant amount of memory is actually the same across different VMs. By sharing these pages, memory is freed for other uses.
Third, the authors describe a sampling technique for measuring the number of idle pages in a guest's memory space without guest modifications. This information is used by policies to make decisions on how to allocate and reclaim memory.
Fourth, the authors describe their idle-memory tax policy. Traditional share-based allocation can allocate the same amount of memory to an idle VM as to one under "physical" memory pressure. The idle-memory tax is a technique for taking memory from VMs that don't need it and giving it to VMs that do need it but without violating share percentages.
Fifth, the authors describe an intuitive threshold system for triggering ballooning and demand paging, which they show to be sufficient for managing memory pressure.
4) Evaluation
This paper is really well-written, in my opinion. It is concise and its modular sections follow a clear line of thought from lower-level mechanisms to higher-level policies. Also, throughout the paper, the authors show how they maintain their "no modification" goal while demonstrating how they solved various problems.
Their evaluation sections are pretty straightforward. They demonstrate all of their techniques and mechanisms in action. Also, their test workloads seem reasonably representative of common workloads (e.g. databases, web servers) and operating systems (e.g. Linux and Windows).
Perhaps one flaw is the unmentioned assumption that hardware provides the functionality of an x86 processor. It seems doubtful that this assumption is very important, but it is never addressed, so we don't know.
5) Confusion
a) How is this paper so well-written?
b) There was some minor confusion about how hash collisions are handled in the shared-memory section.
Posted by: Mark Mansi | February 6, 2017 10:37 PM
1. summary
This paper introduces VMware ESX Server, a type 1 VMM, and focuses on its memory management techniques.
2. Problem
1. Servers in many cases are underutilized, allowing them to be consolidated as virtual machines on a single server.
2. The traditional method of reclaiming memory from VMs is to swap some VM physical pages to disk, but this requires the VMM to make page replacement decisions without knowledge of the guest OS's page replacement policy, which is likely to cause interference.
3. Disco's transparent page sharing requires modifying the guest OS to identify redundant pages, which is not practical when running commodity commercial OSes.
3. Contributions
1. In order to avoid a meta-level page replacement policy interfering with the guest OS's replacement policy, ESX uses the ‘ballooning’ technique. The VMM loads a balloon module into each guest OS as a pseudo-device driver or kernel service with no interface within the guest. When the server wants to reclaim memory, it instructs the balloon module to inflate by allocating guest pages and pinning them in memory, so the machine pages backing the pinned pages can be reclaimed by the ESX VMM. The balloon module polls the VMM once per second for a target balloon size. When ballooning is too slow or not possible, the VMM falls back to paging.
2. To avoid modifying the guest operating system, ESX uses content-based page sharing. The VMM scans guest pages randomly and computes a hash value that summarizes the contents of each page. The hash value is used as a lookup key into a global hash table of potential matches, and a successful match is confirmed by a full comparison. Shared pages are marked COW, and unshared pages are tagged as hint entries, which are dropped if the page is later modified.
3. ESX uses a share-based allocation policy combined with an idle memory tax to achieve efficient memory utilization while ensuring performance isolation. A client consumes resources in proportion to its share allocation while maintaining a minimum resource allocation. When memory reclamation is required, the client with the fewest shares per page is chosen as the victim, and idle pages are charged more than active pages to improve memory usage. ESX measures the idle memory percentage through statistical sampling: every 30 seconds it invalidates the mappings (including TLB entries) of 100 randomly chosen pages and monitors their usage as a representative sample of the whole memory.
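The adjusted shares-per-page ratio from the paper, sketched in code with made-up share and page numbers (reclamation picks the client with the lowest ratio; with the paper's default tax rate of 0.75, an idle page costs four times an active one):

def adjusted_shares_per_page(shares, pages, active_fraction, tax_rate=0.75):
    # idle pages are charged k times as much as active ones, where k = 1 / (1 - tax_rate)
    k = 1.0 / (1.0 - tax_rate)
    return shares / (pages * (active_fraction + k * (1.0 - active_fraction)))

vms = {
    "busy_vm": adjusted_shares_per_page(shares=1000, pages=4096, active_fraction=0.9),
    "idle_vm": adjusted_shares_per_page(shares=1000, pages=4096, active_fraction=0.1),
}
victim = min(vms, key=vms.get)   # min-funding revocation: reclaim from the lowest ratio
print(victim)                    # the mostly idle VM gives up memory first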
4. Evaluation
Evaluation is very convincing in this paper. Experiments in this paper are well designed to test specific features separately, using representative benchmarks like spec and real-world workloads.
To test the effectiveness of ballooning, memory-intensive workloads are run on a single VM with its memory ballooned down to various sizes, and ballooned performance tracks non-ballooned throughput closely. To test content-based page sharing, the authors use multiple identical OSes as VMs running SPEC95 benchmarks, and around 67% of aggregate guest memory is shared. Tests on real-world workloads also show sharing ranging from 7.2% to 32.9%. In addition, the paper evaluates the effectiveness of idle memory sampling and the idle tax with a toucher application and with two identical systems (one idle) under different tax parameters.
5. Confusion
I don’t understand why I/O remapping could be a problem in VMM (Section 7)
Posted by: Yanqi Zhang | February 6, 2017 10:29 PM
1. Summary
This paper talks about the memory management mechanisms and policies used by VMware ESX Server, a bare-metal hypervisor that supports server consolidation and efficiently manages memory between VMs. The techniques described include ballooning to reclaim pages from VMs, an idle memory tax to adjust for idle page usage, content-based page sharing, and I/O remapping.
2. Problem
Server consolidation via overcommitment of memory and other resources is done to improve utilization for SMPs. However, most of the solutions prior to this paper required modifications to the guest OSes, which was not a feasible business option for VMware. Also, the memory management techniques used in previous solutions were inefficient. In the case of overcommitted memory, the host was unable to make informed decisions about which guest OS pages to replace. Previous solutions also did not account for idle memory pages, which resulted in suboptimal allocations.
3. Contributions
1. The ballooning mechanism, which forces the guest OS to give up pages using its own memory management techniques. This allows more informed decisions about page replacement candidates than the host OS could make on its own.
2. Content-based page sharing, which made it possible to share memory pages without modifying the guest OS code as Disco had.
3. More accurate memory allocation through the use of an idle memory tax, driven by an estimate of the working set of pages obtained via sampling.
4. Dynamic reallocation policies invoked during system changes or the crossing of pre-defined thresholds in free memory.
5. I/O page remapping to reduce overheads while dealing with 32-bit PCI interfaces which could only deal with the lowest 4 GB of memory.
4. Evaluation
The biggest strength of this paper is its detailed evaluation, where experiments are performed for each technique individually to isolate its impact on memory usage. The ballooning technique is shown to be helpful in reclaiming memory, and its overhead is not more than 4.4%. Content-based page sharing across 10 VMs is shown to result in about 67% of VM memory being shared. Experiments show that their sampling-based working set computation is quite close to the actual active memory being used, and that the idle tax helps boost performance. Dynamic allocation policies are evaluated for varying parameters and for crossings of the free-memory thresholds, and detailed results are provided.
5. Confusion
1. When are the page scans for the content-based page sharing mechanisms performed? The paper hints that they can be done when the system is idle. When do modern hypervisors do this?
Posted by: Karan Bavishi | February 6, 2017 09:26 PM
1. Summary
This paper introduces memory management methods of virtual machine monitor (VMM) used in VMware. The key features include ballooning, content-based page sharing, share-based allocation and hot I/O page remapping.
2. Problem
The main problem is how to implement a VMM in a production environment. From this perspective, we are not allowed to use methods such as Disco's that manually change guest OS code. Under this restriction, the authors present methods to tackle memory management problems including page replacement, sharing, and allocation.
3. Contributions
For page replacement, the main problem is how to leverage the guest OS's existing page replacement mechanism efficiently. Implementing a sophisticated page replacement policy at the VMM level may degrade performance, chiefly because the guest OS's existing policy can interact badly with the VMM's policy, for example causing double paging. To leverage the guest OS's memory management module, the VMM installs a pseudo-device driver or kernel service (the balloon) into each guest OS. When the VMM wants to reclaim memory, it asks the balloon to request more memory from the guest OS (inflation). Inflation may lead the guest OS to invoke its own page replacement module and choose "physical" pages to swap out to its virtual disk, which the VMM in effect writes out to the real disk. The efficiency of ballooning relies on the fact that the VMM does not need to keep real machine pages backing the "physical" pages the balloon acquires from the guest OS. This is an elegant way for the VMM to steer the guest OS's behavior with little performance overhead (Figure 2).
For page sharing, the main problem is how to share memory among guest OSes. The authors realize that the essential property for sharing is having the same content, so a routine method (hashing) is used to identify pages with identical content. Using the traditional copy-on-write approach, the page table entries for these shared pages are marked with a copy-on-write bit; a later write to such a page generates a fault that lets the VMM allocate and copy a new private page for it.
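A toy sketch of that copy-on-write break (the pmap, page numbers, and fault hook are stand-ins, not real ESX structures): a write to a shared machine page faults, and the VMM gives the writing VM a private copy before retrying the write.

machine_pages = {101: bytearray(b"same" * 1024)}    # MPN -> 4 KB of contents
refcount = {101: 2}                                 # two VMs currently share MPN 101
pmap = {("vm_a", 7): 101, ("vm_b", 9): 101}         # (VM, PPN) -> MPN

def write_fault(vm, ppn, offset, value):
    mpn = pmap[(vm, ppn)]
    if refcount[mpn] > 1:                           # shared page: break the sharing first
        new_mpn = max(machine_pages) + 1
        machine_pages[new_mpn] = bytearray(machine_pages[mpn])   # private copy
        refcount[mpn] -= 1
        refcount[new_mpn] = 1
        pmap[(vm, ppn)] = new_mpn                   # remap only the writing VM
        mpn = new_mpn
    machine_pages[mpn][offset] = value              # retry the write on the private page

write_fault("vm_a", 7, 0, ord("X"))
print(machine_pages[pmap[("vm_a", 7)]][:4], machine_pages[pmap[("vm_b", 9)]][:4])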
For page allocation, the main problem is how to enforce QoS across different guest OSes (clients) from a production perspective. The authors model the problem in an economic way (min-funding revocation): the client with the minimal shares-per-page ratio is asked to relinquish memory. The authors also distinguish idle memory (not used for a long time) from active memory. Idle memory is charged at a higher rate (the idle memory tax), with the goal of a better balance between performance isolation and efficient memory utilization across the whole system. To measure the idle memory fraction per client, the paper uses a sampling method that randomly selects n pages, invalidates their mappings (so accesses trap into the VMM), and counts accesses at the VMM level. Several methods in this paper use randomization (the page sharing scanner, the idle memory estimate, ...), which suggests that randomization is sometimes simple and efficient enough for production use.
4. Evaluation
The authors conducted performance experiments on ballooning (Figure 2), page sharing (Figures 4 and 5), memory sampling (Figure 6), the idle memory tax (Figure 7), and dynamic reallocation (Figure 8). Most experiments are conducted with different guest OSes, including different versions of Windows and Linux. The results show that the techniques presented in the paper improve efficiency while introducing little overhead. I like the production deployment data they share (Figure 5), which convinces the audience more easily than a synthesized experiment.
5. Confusion
What are current research or industry problems for virtual machine monitor?
Posted by: Cheng Su | February 6, 2017 08:13 PM
1. summary
This paper presents the novel mechanisms and policies for memory management in VMware ESX Server.
2. Problem
Memory usage is inefficient, and there is room for improvement:
- Underutilized servers can be consolidated as virtual machines on a single physical server.
- Many small servers can be consolidated onto fewer larger machines.
How can existing OSes be run without modifications?
3. Contributions
The contribution of this paper is the design of memory management for VMware ESX Server, which contains several novel mechanisms and policies that boost the efficiency of memory usage. High-level resource management policies compute a target memory allocation for each VM based on specified parameters and the system load, while lower-level mechanisms reclaim memory from the VMs.
The main aspects of memory management are as follows, from bottom to top:
1) memory virtualization
Provide guest OSes with the illusion of a zero-based physical address space. Add an extra level of address translation and maintain a pmap data structure for each VM, translating "physical" page numbers (PPNs) to machine page numbers (MPNs).
2) mechanisms reclaiming memory
Support overcommitment of memory. Each VM is configured with a fixed memory size (its maximum, which it receives when memory is not overcommitted). When memory is overcommitted, the ballooning technique is used to reclaim memory.
3) conserving memory (sharing identical pages)
Using content-based page sharing, pages with identical contents can be shared. Only one machine copy is kept, and a global hash table indexed by content hash tracks the shared pages.
4) proportional-share allocation algorithm
It trades off efficiency against quality-of-service guarantees. Resources are encapsulated by shares owned by clients, and each client consumes resources in proportion to its share allocation. Memory is revoked from the client with the smallest adjusted shares-per-page ratio.
5) a high-level allocation policy maintains a minimum amount of free memory. There are four states (a rough sketch follows this list):
high state (6%): no reclamation
soft state (4%): ballooning
hard state (2%): paging, forcibly reclaiming memory
low state (1%): paging, and blocking the execution of VMs that are above their target allocations.
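A rough sketch of how those thresholds might select a reclamation mechanism (the mapping from free-memory fraction to action is a simplification of the paper's state machine, and the hysteresis between states is omitted):

def reclamation_state(free_fraction):
    # thresholds on free machine memory, checked from most to least urgent
    if free_fraction < 0.01:
        return "low: keep paging and block VMs above their target allocations"
    if free_fraction < 0.02:
        return "hard: forcibly reclaim memory via paging"
    if free_fraction < 0.04:
        return "soft: reclaim via ballooning, falling back to paging if needed"
    return "high: no reclamation (the server tries to keep at least 6% free)"

for f in (0.10, 0.05, 0.03, 0.015, 0.005):
    print(f, "->", reclamation_state(f))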
4. Evaluation
This paper evaluates all of the mechanisms and policies in Sections 3-6 (memory reclamation via ballooning, memory conservation via sharing, the proportional-share allocation algorithm, and the high-level allocation policies), and the results are as expected. However, the evaluation is not fully convincing: there are too few experiments and they lack comparisons. I doubt whether these results are reproducible and would hold in other test cases.
5. Confusion
Section 2, memory virtualization: "VM instructions that manipulate guest OS page tables or TLB contents are intercepted", and the VMM maintains shadow page tables for the processor. Does this really not introduce overhead? Where does the OS's original page table go? Do the intercepted instructions become more complex (since they include the PPN-to-MPN translation)?
Section 7, I/O page remapping: what exactly is the problem? How is it solved in detail? How is remapping a page into low memory implemented?
Posted by: Huayu Zhang | February 6, 2017 03:18 PM
Summary:
I like to think of the ESX Server as an OS for OSes, where different guest OSes behave like processes and the server behaves like an OS with one main responsibility: managing I/O, which here mostly means managing memory. The writers of this paper have designed a system that runs different operating systems, with each of them behaving as if it were the only system running. It's different from VMware Workstation: a read() issued under VMware Workstation invokes a Linux read call under the hood, whereas ESX Server manages the I/O for that read directly on the hardware.
Problem:
So the problem is memory allocation/freeing.
1) How does the server know which pages to reclaim when there is a shortage of machine memory, especially since the server supports overcommitting? The guest OS knows best which pages are important and which are being used; if the server reclaims important ones there is a problem. And the server cannot just keep track of every page each OS thinks is important when so many OSes are running.
2) How does the server share memory across different operating systems?
3) How does it decide the policy for which OS gets how much memory?
Contribution
1) Ballooning: The server loads a small balloon module into the guest OS as a pseudo-device driver or kernel service. When the server detects a shortage of machine memory, it inflates the balloons in the respective guest OSes. The guest OS reacts by giving up pages, swapping the less important ones to disk, so the server doesn't have to do any extra tracking work. In case the ballooning service is unavailable, the server can still reclaim memory by paging guest memory out to an ESX server swap area on disk.
2) Content-based sharing: If two OSes have pages with identical contents, they are simply pointed at the same machine page. The server uses a hash-based approach, storing a hash value for each scanned page; if another page hashes to the same value (and a full comparison confirms the match), it is shared. A single global hash table contains frames for all scanned pages, and chaining is used to handle collisions. To track how many OSes are using a page, each shared frame simply carries a reference count.
3) It uses a share-based approach with an idle tax. Each guest OS has a share allocation which roughly represents its priority, but just because an OS has high priority doesn't mean it is using all of its pages. So there is also an idle tax: a client is charged more for an idle page than for one it is actively using. When memory is scarce, pages are reclaimed preferentially from clients that are not actively using their full allocations, which ensures we don't simply pick on the lower-priority OSes. The server also has a random sampling algorithm for tracking idle pages: "Each sampled page is tracked by invalidating any cached mappings associated with its PPN, such as hardware TLB entries and virtualized MMU state. The next guest access to a sampled page will be intercepted to re-establish these mappings, at which time a touched page count is incremented."
Evaluation and Confusion:
I have never read about something like this before. The results read well, and overall I liked the paper a lot!
I haven't thought of a why yet, but they claimed content-based memory sharing has negligible memory and CPU overhead. It also surprises me that there is that much similarity between workloads across OSes. I would imagine that the content of pages is constantly changing: if some page was previously shared and is now modified, a copy has to be made. I find it amazing that this does not produce overhead.
Posted by: Ari | February 6, 2017 02:57 PM
1. Summary
VMware ESX server is a hypervisor that lives directly above the hardware layer. This paper discusses the mechanisms that allow it to effectively manage memory for VMs running unmodified operating systems.
2. Problem
Previous VMMs such as Disco or VMWare Workstation make modifications to OSs to run them in VMs. VMware ESX server has the goal of leaving the hosted OS completely untouched. This makes resource management tricky. Specifically, ESX server wants to allocate memory resources to different VMs based on utilization and QoS goals. It also strives to perform transparent page sharing as in Disco.
3. Contributions
This paper contributes the idea of having a “balloon” process running in hosted VMs to occupy memory space and reclaim pages for the VMM. This is an elegant and flexible way to steal memory back from a hosted OS, although it does have some flaws.
They also generalize the Disco idea of sharing pages. Instead of examining addresses as in Disco, they identify shared pages based on the data they contain. This feels like a crazy idea, but they implemented it in their production system and have real data to back up its effectiveness (figure 5).
Finally, they provide an implementation of a practical algorithm for balancing QoS and idle memory reclamation.
4. Evaluation
ESX server has the advantage of being a product in production use, so all of the data is extracted from running workloads on real machines. There is also a good variety presented: tests are run on different hardware and using a variety of hosted OSs.
They use synthetic workloads to illustrate the effectiveness of a couple of their mechanisms, such as the balloon process and the idle memory checker. These are good proof-of-concept tests that illustrate the ideal behavior of these mechanisms.
In addition to these synthetic tests, they include data from the SPEC benchmarks to illustrate their page-sharing mechanism. It seems like they prepared this test to be optimal for their mechanism; it likely involved many VMs running the exact same OS and benchmark in lockstep. They subsequently report similar data in real machines running real workloads, which makes the SPEC results feel misleading. Why not just report the real data?
To that end, they also present a lot of data from machines running a variety of real workloads, which is pretty cool.
5. Confusion
I was lost by the discussion on I/O page remapping (section 7).
Posted by: Mitchell Manar | February 5, 2017 05:00 PM