NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories
NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories. Jian Xu and Steven Swanson, Usenix FAST 2016.
Reviews due Thursday, 3/30.
Comments
1. Summary
This paper describes NOVA, a file system for a hybrid of volatile and non-volatile memories. NOVA uses ideas from journaling file systems, conventional file systems, and log-structured file systems to provide atomic file system operations.
2. Problem
Existing file systems assume that hard disks or solid-state drives are used for non-volatile storage. Disks often have high latency but high throughput, while non-volatile main memories (NVMMs) have low latency and high throughput. As a result, existing file systems do not experience significant speedup when using a combination of DRAM and NVMM. New file systems that use NVMM must also guard against write reordering, in case the system crashes between writes, and they must ensure that file system operations are atomic.
3. Contributions
NOVA uses journaling to provide atomicity. If NOVA performs updates across multiple inodes, it writes information about these updates to the journal for that CPU, and it uses this information to roll back updates after a crash. NOVA also uses a per-inode log to record metadata updates to each directory or file. Unlike Sprite LFS, these logs do not require large free segments. Each log is implemented as a linked list in NVMM, along with a radix tree in DRAM. The system performs fast garbage collection to quickly reclaim space while extending a log, and it performs thorough garbage collection when fast garbage collection does not provide enough space. Furthermore, NOVA uses shadow paging to modify file data. This eliminates any need for garbage collection of data pages.
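To make the per-inode log concrete, here is a minimal Python sketch of a log stored as a linked list of pages, with a volatile index rebuilt by walking it. The class names, the 4-entry page capacity, and the dict standing in for the radix tree are assumptions for illustration, not NOVA's actual layout.

```python
class LogPage:
    """One log page; pages are linked, so they need not be contiguous."""
    CAPACITY = 4  # entries per page (illustrative, not NOVA's real value)

    def __init__(self):
        self.entries = []
        self.next = None


class InodeLog:
    """Per-inode metadata log, modeled as a singly linked list of pages."""

    def __init__(self):
        self.head = LogPage()
        self.last = self.head
        self.written = 0   # entries physically written
        self.tail = 0      # entries committed (visible after a crash)

    def append(self, entry):
        if len(self.last.entries) == LogPage.CAPACITY:
            page = LogPage()           # any free page will do
            self.last.next = page
            self.last = page
        self.last.entries.append(entry)
        self.written += 1
        self.tail = self.written       # stands in for the atomic tail update

    def committed(self):
        """Yield entries up to the committed tail."""
        seen, page = 0, self.head
        while page is not None:
            for entry in page.entries:
                if seen == self.tail:
                    return
                yield entry
                seen += 1
            page = page.next


def rebuild_index(log):
    """Rebuild the volatile index (a dict standing in for the radix tree)."""
    index = {}
    for file_page, data_page in log.committed():
        index[file_page] = data_page   # later entries override earlier ones
    return index


log = InodeLog()
for i in range(6):
    log.append((i % 3, 100 + i))       # (file page number, data page address)
index = rebuild_index(log)
```

Because the index lives only in DRAM in this model, it can always be thrown away and recomputed from the log, which is the property the review attributes to the radix tree.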
4. Evaluation
The evaluation consists of several microbenchmarks and macrobenchmarks. The microbenchmarks compare NOVA to other filesystems for NVMM, and several filesystems for more conventional disks. They show that filesystems for conventional disks can be very inefficient on hybrids of DRAM and NVMM, and NOVA is generally more efficient than other filesystems for NVMM. The macrobenchmarks use four Filebench workloads to show that NOVA can generally perform more operations per second than other filesystems.
5. Confusion
What are the similarities and differences among the different NVMM technologies? Why does NOVA assume that they can all be treated similarly?
Posted by: Varun Naik | March 30, 2017 08:00 AM
1. summary
The paper proposes NOVA, a log-structured POSIX file system designed for NVMs working alongside DRAM (hybrid memory). It modifies LFS to exploit the performance of NVM while providing consistency guarantees. The authors exploit the fact that random access is cheap on NVM, and thus remove unnecessary overhead of traditional LFS to allow more concurrency, make GC easier, provide atomic operations, and implement lightweight journaling.
2. Problem
Hybrid DRAM/NVMM storage systems present new opportunities and challenges for file system design. These systems need to reduce software overhead to fully exploit NVMM’s high performance and efficiently support more flexible access patterns, and at the same time they must provide the strong consistency guarantees that applications require. Traditional file systems, which are designed for either HDDs or SSDs, cannot exploit the performance of NVMM and depend on the atomicity guarantees of that hardware. When LFS is used on NVMM, conventional LFSs rely on GC to maintain contiguous free regions, which introduces overhead, so they perform worse than journaling file systems on NVMM.
3. Contributions
a). They keep logs in NVMM, use a radix tree in DRAM to index each log, and keep a separate log for each inode to provide more concurrency with quick recovery.
b). They use a linked list to organize each log because random access is efficient. This data structure makes it possible to avoid contiguous regions, clean at a fine granularity, and reclaim space more easily.
c). Handle write re-ordering: commit data and log entries to NVMM before updating the log tail, commit journal data to NVMM before propagating updates, commit new versions of data pages to NVMM before recycling stale ones.
d). Provide atomicity: through 64-bit atomic updates, logging in the inode’s log, and lightweight journaling for directory operations.
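The first ordering rule above (data and log entry before the tail) can be sketched with a toy crash model. Everything here is hypothetical: a dict stands in for NVMM, and the `crash_before_tail` flag simulates a power failure. On real hardware each step would also need cache-line flushes and memory fences, which this sketch does not model.

```python
def write_page(nvmm, file_page, payload, crash_before_tail=False):
    """Model one NOVA-style write against a dict standing in for NVMM."""
    # Step 1: persist the new data page.
    addr = nvmm["next_free"]
    nvmm["next_free"] += 1
    nvmm["pages"][addr] = payload
    # Step 2: persist the log entry that points at the new page.
    nvmm["log"].append((file_page, addr))
    if crash_before_tail:
        return  # power fails here; the tail still names the old state
    # Step 3: only now publish the entry by advancing the tail.
    nvmm["tail"] = len(nvmm["log"])


def recover(nvmm):
    """After a crash, trust only entries below the committed tail."""
    view = {}
    for file_page, addr in nvmm["log"][: nvmm["tail"]]:
        view[file_page] = nvmm["pages"][addr]
    return view


nvmm = {"pages": {}, "log": [], "tail": 0, "next_free": 0}
write_page(nvmm, 0, "v1")
write_page(nvmm, 0, "v2", crash_before_tail=True)
state = recover(nvmm)
```

The interrupted second write leaves a data page and a log entry behind, but since the tail was never advanced, recovery still sees the old contents.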
4. Evaluation
They use both microbenchmarks and macrobenchmarks for evaluation. They compare NOVA to two existing NVMM file systems, traditional log-structured file systems with emulation, and other default file systems. In most cases, NOVA outperforms the others. NOVA has the lowest file system operation latency and performs better under the fileserver, web proxy, web server, and varmail workloads.
5. Confusion
a) I do not have a model for NVM in mind, such as its size, its guarantees, and its interface.
b) Why did they choose LFS for NVMM rather than modifying another file system?
Posted by: Jing Liu | March 30, 2017 07:55 AM
1. Summary
The paper talks about NOVA, a log-structured file system designed for hybrid volatile/non-volatile main memories. By extending the ideas of LFS, NOVA provides a high-performance file system that allows both fast and efficient garbage collection and quick recovery from failures.
2. Problem
Existing file systems built for spinning or solid-state disks incur software overheads, thereby obscuring the performance that NVMs would provide. Moreover, file systems proposed for NVMs also incurred similar overheads and failed to provide the strong consistency guarantees that applications often require. There was, thus, a need for a file system that guarantees consistency while also providing improved performance.
3. Contributions
The novel contributions of the paper are the several considerations that went into designing NOVA. To maximize concurrency during normal operation and recovery, NOVA assigns each inode a separate log. These logs are stored as linked lists, which means they need not be contiguous in memory. Since NOVA does not log data, the recovery process only scans a small fraction of NVMM. Keeping data out of the log also allows NOVA to immediately reclaim stale pages, which significantly reduces garbage collection overhead and lets NOVA achieve good performance even when the file system is nearly full.
By extending the ideas of LFS, NOVA exploits the characteristics of hybrid memory systems. Atomic mmap, in NOVA, provides a simplified interface that exposes NVMM directly to applications requiring strong consistency. NOVA also outperforms other file systems running on hybrid memory systems.
4. Evaluation
The authors evaluate NOVA on PMEP, which emulates NVMMs with configurable latency and bandwidth. A single-thread micro-benchmark was used to evaluate the latency of basic file system operations. NOVA outperformed other file systems and provided the lowest latency for all operations. Additionally, macro-benchmarks (fileserver, web proxy, web server and varmail) were used to evaluate the application-level performance of NOVA. Here, NOVA provides the best performance while guaranteeing strong consistency.
5. Confusion
Could you explain section 4.8 on NVMM protection?
Posted by: Dastagiri Reddy Malikireddy | March 30, 2017 07:36 AM
Summary
The paper describes the design and implementation of a new filesystem called NOVA which is optimized for Non-Volatile Main Memory (NVMM) which would be used in conjunction with DRAM as a hybrid main memory connected to the CPU memory bus.
Problem
The expected performance benefits of NVMM-equipped systems are not fully realised because of the various software overheads in existing file systems. Also, conventional file systems are built around the performance characteristics and consistency guarantees provided by disks. But NVMMs differ significantly in both these aspects: they provide much better performance and different consistency guarantees.
The main challenge for NOVA was to improve performance while guaranteeing consistency and atomicity of complex operations.
Contributions
NOVA adapts LFS to take advantage of the random access ability of NVMM. Some of its main contributions are:
a. An inode table for each CPU to enable concurrency during normal operation and during recovery.
b. Using a separate log for each inode. This is to maximize concurrency in a multi-core system.
c. There is no need for NOVA to value sequentiality anymore because of the random access ability provided by NVMM. For this reason it implements the logs as linked lists, thereby relaxing the need for large chunks of contiguous memory and thus reducing garbage collection costs.
d. Atomicity of single-inode log updates is guaranteed by an atomic tail pointer update, and updates which span multiple inodes are made atomic by using a lightweight journal.
e. Another major contribution is that NOVA doesn’t log updates to file data; instead it uses copy-on-write for the modified pages. Thus it reduces log size, which accelerates the recovery process, reduces the garbage collection cost, and makes it easier to reclaim stale pages and allocate new pages.
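Item (e) can be illustrated with a small Python sketch of copy-on-write data pages whose stale copies go straight back to a free list. The `CowFile` class and its fields are hypothetical names for illustration; the real allocator and log formats are more involved.

```python
class CowFile:
    """Toy file whose data pages are updated by copy-on-write."""

    def __init__(self, num_pages):
        self.free = list(range(num_pages))  # DRAM free list (illustrative)
        self.store = {}                     # page address -> contents
        self.log = []                       # metadata-only log entries
        self.index = {}                     # file page -> page address

    def write(self, file_page, payload):
        new_addr = self.free.pop(0)             # allocate a fresh page
        self.store[new_addr] = payload          # write data outside the log
        self.log.append((file_page, new_addr))  # log only the metadata
        old_addr = self.index.get(file_page)
        self.index[file_page] = new_addr
        if old_addr is not None:                # stale page reclaimed at once
            del self.store[old_addr]
            self.free.append(old_addr)

    def read(self, file_page):
        return self.store[self.index[file_page]]


f = CowFile(num_pages=3)
f.write(0, "a")
f.write(1, "b")
f.write(0, "c")   # the old copy of page 0 is never modified in place
```

Note that the log entries carry only page numbers and addresses, never payloads, which is why the log stays short and no garbage collection pass over data pages is needed in this model.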
Evaluation
NOVA is evaluated on Intel PMEP, which can emulate various types of NVMM, and compared against seven other file systems across various micro-benchmarks like file create, fsync and delete and macro-benchmarks like fileserver, webproxy, webserver etc. NOVA performed especially well on the write-intensive workloads, mainly due to its reduced garbage collection costs.
Confusion
I would like the implementation of write ordering to be discussed in detail in class.
The additional bandwidth requirement of logical journaling is not large, yet it is not discussed in the paper. Are there any downsides to using it with NVMM?
Posted by: Mayur Cherukuri | March 30, 2017 04:24 AM
Summary:
This paper provides design and implementation of a log structured file system (NOVA) for hybrid (volatile and non-volatile) memory system with two major goals in mind – 1) Make full and optimized use of increasingly large, fast memories especially NVMM (in comparison to disks which are slow), and 2) Provide consistency similar to disks. NOVA improves performance by reducing concurrency overheads (by maintaining a log per inode), and persisting file data separate from log which in turn leads to smaller log size and reduced garbage collection overhead. It provides fast atomicity for metadata, data, and mmap updates and enforces write ordering to ensure consistency. The paper finds NOVA having stronger consistency and higher performance than current NVMM FSs in its evaluation.
Problem:
Rapid improvements in NVMMs have the potential to expand current memory systems from volatile memory only to hybrid memory systems containing both volatile and non-volatile memories. In such a case, hardware (disk I/O) may no longer be a bottleneck, and software would need to become more efficient and performant. Also, a FS faces the additional challenges of managing, retrieving and sustaining the consistency of NVMM data. Contemporary file systems designed for NVMMs either do not make full use of the low latency of NVMMs while maintaining consistency or exhibit weaker consistency while trying to utilize the performance of NVMMs. NOVA takes the best of both worlds and provides a FS with high performance and strong consistency.
Contributions:
1. Keeps logs in NVMM and indices in DRAM. Uses radix tree for search operations in DRAM. NVMM is divided into 4 components – a) superblock (contains global filesystem information) and recovery inode (contains recovery information which helps in faster remount after clean shutdown), b) inode tables (contain inodes), c) journals (provide atomicity to directory operations), d) NVMM log and data pages.
2. Provides log per inode. This makes concurrent operations easier, and faster. Also, number of log files in this case is small. Both these factors help in improving performance of the system.
3. Logs are implemented as singly linked lists. This makes allocation and deallocation of log space easier and more efficient.
4. Uses 64-bit atomic updates, inode logging, and lightweight journaling to provide fast atomicity for data, metadata, and mmap updates. Enforces write ordering to ensure consistency.
5. File data is not logged. Copy-On-Write (COW) is used for modified pages.
6. Immediately cleans stale data page(s) during a write operation. Uses “Fast GC” to reclaim a log page if all entries of that page are dead. If live entries are
7. Uses lazy rebuild policy, i.e. postpones rebuilding the radix tree and inode until first access of the inode to make the recovery process faster.
Evaluation:
The paper is concise and an easy read. For evaluation, the authors emulate different types of NVMMs on Intel PMEP in 2 configurations – a) the same read latency and bandwidth as DRAM, and b) higher read latency and lower bandwidth than DRAM. The authors use a single-thread micro-benchmark to evaluate the latency of basic filesystem operations and various macro-benchmark workloads such as fileserver, webproxy, webserver and varmail. NOVA performed significantly better in most of the cases. The paper also evaluates the efficiency of garbage collection and the recovery overhead in NOVA.
Confusion:
On page 326, paper says that “logs that support atomic updates are easy to implement correctly in NVMM, but they are not efficient for search operations”. Why is that so?
Posted by: Rahul Singh | March 30, 2017 04:17 AM
1. Summary
File systems for hybrid systems with volatile and non-volatile memories face consistency issues and performance challenges. The authors address these issues and propose a new log-structured file system with better performance and consistency guarantees.
2. Problem
Traditional file systems are not suitable for NVMM because they rely on disks for performance and consistency. Problems like reordered memory stores, cache flush orderings, etc. can cause serious consistency issues. LFS (Log structured file system) doesn’t offer any advantage of bulk writes as random writes are cheaper in NVMM. LFS has costly garbage collection cycle. Authors have proposed a new file system NOVA to tackle these problems.
3. Contributions
The NOVA file system extends LFS’s log-structuring techniques. Unlike LFS, NOVA defines a different log for each inode. As a result, NOVA offers more concurrency than LFS. Atomic updates to logs are supported in NOVA. NOVA doesn’t need contiguous free space and hence uses a linked list with a tail pointer to store each log. NOVA also offers lightweight journaling to track multiple-inode modifications and re-run the transaction in case of crashes. Unlike LFS, NOVA doesn’t store data in the log, and hence logs are smaller in NOVA. File indexes are stored in volatile memory in radix trees. The leaf nodes point to log entries. These log entries point to data blocks.
NOVA uses red-black trees to track free blocks, and NVMM is divided into a number of pools to offer a higher degree of parallelism for storage allocation.
NOVA offers atomic-mmap, which provides strong consistency guarantees for writes. Atomic-mmap maps the file into the application's address space through replica pages. NOVA writes the updates back to the file when the application invokes the msync function.
NOVA offers two types of Garbage collection – Fast GC and Thorough GC. Fast GC is used if all entries in a log are dead. Thorough GC is used in case live log entries account for less than 50%. This phase is similar to LFS’s garbage collection. But this is faster in NOVA as logs are smaller.
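The two collection modes described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: a Python list of pages stands in for the linked list, each entry is a `(name, is_live)` pair, and the page capacity and function names are made up for illustration. The 50% threshold is the one quoted in this review.

```python
PAGE_CAPACITY = 4  # entries per log page (illustrative)


def fast_gc(pages):
    """Drop log pages whose entries are all dead; nothing is copied."""
    return [p for p in pages if any(is_live for _, is_live in p)]


def live_fraction(pages):
    total = sum(len(p) for p in pages)
    live = sum(1 for p in pages for _, is_live in p if is_live)
    return live / total if total else 1.0


def thorough_gc(pages):
    """Copy live entries into a fresh, compact log when under 50% is live."""
    if live_fraction(pages) >= 0.5:
        return pages
    survivors = [(e, True) for p in pages for e, is_live in p if is_live]
    return [survivors[i:i + PAGE_CAPACITY]
            for i in range(0, len(survivors), PAGE_CAPACITY)]


log = [
    [("a", False), ("b", False), ("c", False), ("d", False)],  # all dead
    [("e", True), ("f", False), ("g", False), ("h", False)],
    [("i", False), ("j", True), ("k", False), ("l", False)],
]
after_fast = fast_gc(log)
after_thorough = thorough_gc(after_fast)
```

In a real linked list, the fast path amounts to unlinking a page by updating a pointer, while the thorough path is the only one that copies entries.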
NOVA has two recovery procedures – Recovery after normal shutdown and recovery after failure. Recovery after normal shutdown is fast as it simply involves restoring the allocator from recovery inode. Recovery after a failure is costlier as it involves inode logs scanning to fix possible corruptions.
4. Evaluation
Authors have compared the performance of NOVA against PMFS, Ext4-DAX, NILFS2, F2FS, default EXT4, Ext4-data and Btrfs with micro-benchmarks and macro-benchmarks. NOVA seems to provide lower latency for basic file operations. NOVA offers better throughput for workloads like Webproxy, Webserver, etc. against seven file systems. Authors have also measured Garbage collection efficiency of NOVA. NOVA has slightly higher overhead in recovery from power failure.
5. Confusion
Is there any limitation on number of files that can be created as a log is assigned for each inode?
Can you please explain the importance of ordering of writes on various data structures of NOVA?
Posted by: Rohit Damkondwar | March 30, 2017 04:16 AM
Summary:
This paper describes NOVA which extends the ideas of LFS for systems with hybrid memory (DRAM and NVMM). In NOVA the data is not stored in the log resulting in smaller log which results in fast and efficient garbage collection and quick recovery from system failures.
Problem:
As conventional File systems are designed considering the performance characteristics of disks they are not suited for Hybrid memory systems.
Various techniques to provide atomicity like journaling, shadow paging and conventional log-structuring are not well suited to NVMM as they incur significant overheads.
Providing strong consistency guarantee is challenging for memory based file systems as modern processors can reorder store operations and power failures can leave the data in an inconsistent state.
The proposed file system NOVA addresses these issues.
Contributions:
- NOVA is a log-structured file system which exploits the fast random access provided by hybrid memory systems. This allows NOVA to support massive concurrency and reduce log size.
- NOVA does not log data; instead it uses copy-on-write for modified pages and appends metadata about the write to the tail of the log.
- The log itself is implemented as a singly linked list and is stored in NVMM, and indexes in the form of radix trees are stored in DRAM to support quick searches.
- Each inode has its own log which greatly increases concurrency.
- It makes use of lightweight journaling for complex atomic updates such as directory operations involving multiple inodes.
- To avoid the complexity of DAX-mmap NOVA proposes Atomic-mmap, a direct NVMM access model with stronger consistency.
- Two garbage collection techniques, namely Fast GC (speed over thoroughness) and Thorough GC, are used to reclaim dead space.
- NOVA supports fast recovery after normal shutdowns as well as crashes, which is mainly due to shorter log size and parallel execution of recovery threads on multiple CPUs.
Evaluation:
The authors have implemented NOVA in Linux kernel 4.0, tested it on the Intel Persistent Memory Emulation Platform (PMEP), which emulates NVMM, and compared its performance against 7 other file systems using a collection of micro-benchmarks and macro-benchmarks. It is found that NOVA achieves the best performance under different workload scenarios and also provides strong data consistency.
Confusion:
1. Could you elaborate on the working of DAX-mmap and the Atomic-mmap proposed in the paper?
2. It was not clear to me as to how Kernel's Virtual File System plays a part in preventing concurrent transactions from modifying the same inode.
Posted by: Lokananda Dhage Munisamappa | March 30, 2017 03:56 AM
Summary
This paper talks about NOVA, a log-structured file system for hybrid volatile and non-volatile main memories. It takes conventional log-structured file system techniques and modifies them to exploit the fast random access provided by hybrid memory systems. It provides much better performance than other NVMM file systems while providing consistency and atomicity guarantees.
Problem
Hybrid volatile and non-volatile memory systems provide low latency and high bandwidth operations. But managing, accessing and maintaining data stored in NVM is challenging. Conventional file systems are built for the performance characteristics of disks and are thus not suitable for hybrid memories. Current file systems introduce software overheads that may obscure performance whereas the proposed file systems either incur similar overheads or fail to provide strong consistency guarantees that applications require.
Contributions
This paper makes the following contributions towards a fast, concurrent, atomic and scalable file system:
>>Placement of logs in NVMM using simple data structures and indices in DRAM for fast search operations.
>>Use of a separate log for each inode to provide high concurrency both in file access and during recovery.
>>Log structure provides cheaper atomic updates than journaling. Journaling is required only for complex atomic updates.
>>Logs can be stored as linked lists as random access is faster and there is no need to provide contiguous memory.
>>Only metadata is logged and not the file data. This results in short logs, efficient garbage collection, efficient reclamation of stale pages and better performance under heavy workloads.
Evaluation
NOVA was implemented in Linux kernel version 4.0. Single-thread micro-benchmarks were used to evaluate the latency of basic file system operations. NOVA outperforms other file systems by between 35% and 17x, and improves the append performance by 7.3x and 6.7x compared to Ext4-data and Btrfs respectively. To evaluate the application-level performance, four Filebench workloads were used: fileserver, webproxy, webserver and varmail. NOVA achieved better performance in almost all the cases.
Confusion
Could you please explain atomic mmap and NVMM protection in more detail?
Posted by: Gaurav Mishra | March 30, 2017 03:39 AM
Summary:
This paper presents a file system, NOVA, which is designed to provide maximum performance on a hybrid DRAM/ NVMM storage system. It is a modified version of log-structured file system which exploits fast random access provided by NVMM and is optimized for NVMM’s requirements such as improved performance by reducing software overhead and strong consistency guarantees.
Problem:
The requirements of hybrid systems differ from that of existing disk file systems: NVMMs provide higher performance (low latency) and hence more need to reduce software overhead, NVMMs have different consistency guarantees (64-bit atomic stores). None of the existing file systems satisfy all the requirements. Techniques to provide atomicity like journaling, shadow paging and conventional log-structuring have overheads and limit the performance of NVMM.
Contributions:
> NOVA is a form of log-structured file system which is specialized to take advantage of the fast random access of NVMM, support concurrency, minimize garbage collection cost and provide strong consistency guarantees.
> It has separate logs for each inode, which allows concurrent access.
> Due to faster random access, it does not require large contiguous regions in memory, thus logs are stored as linked lists.
> Log size is kept minimal by storing file data outside the log. To make searches faster, NOVA builds per-directory and per-file radix trees in DRAM which point to the corresponding inode.
> Atomicity is achieved by atomic updates to the log’s tail pointer. For directory operations (ones that require updates to multiple inodes), NOVA uses lightweight journaling.
> File data is updated by copy-on-write and the stale page is immediately reclaimed.
Evaluation:
NOVA is run on a hardware-based NVMM emulator and tested on several micro- and macro-benchmarks. It provides 22% to 216x throughput improvement over existing file systems. For write-intensive workloads, it outperforms the file systems that provide the same consistency guarantees by between 3.1x and 13.5x.
Confusion:
It is said that the inode table is per-CPU. What if a CPU wants to access an inode table belonging to another CPU? How does this work?
Posted by: Pallavi Maheshwara Kakunje | March 30, 2017 02:39 AM
Summary
The paper presents NOVA, a modified LFS for hybrid volatile/non-volatile main memory systems. It leverages NVMM’s performance to provide efficient garbage collection and quick recovery from system failures.
Problem
Conventional FS are built specific to performance characteristics of disks and are not suitable for hybrid memory systems. Hybrid memory systems differ from conventional storage system in performance and consistency guarantees.
The existing techniques that try to provide atomicity guarantees (journaling, shadow paging, LFS) have many issues and are not suitable for hybrid memory systems.
Contributions
The paper contributes the design and implementation of NOVA. The design is modified from LFS to exploit NVM’s performance. NOVA assigns each inode a separate log to maximize concurrency during normal operation and recovery. Data is not stored in the log, which keeps its size small and makes the recovery process faster. Logs are stored as linked lists, which eliminates LFS's biggest demand: contiguous free memory. Atomic log updates are provided using updates to the log’s tail pointer. NOVA uses lightweight journaling to provide atomicity for directory operations that update multiple inodes. To optimise search operations, NOVA builds radix trees in DRAM for directory and file data. NOVA provides a faster GC mechanism, which works in two modes: “fast” (faster but less effective) and “thorough” (slower and more effective). NOVA also protects against corruption by errant stores from the kernel when NVMM is mapped into the kernel’s address space, by making the NVMM region read-only and providing write windows.
Evaluations
The authors evaluate their system against a series of other file systems. They use a micro-benchmark and 4 Filebench workloads to compare NOVA against seven other file systems, and NOVA is shown to perform best while providing strong consistency guarantees.
Confusions
- I didn’t get the involvement of kernel’s VFS layer for locking affected inodes during concurrent transactions.
- Can you please elaborate more on Direct Access (DAX), or eXecute In Place (XIP) (Bypassing DRAM page cache)?
- How does COW for file data reduce the log size? I understand that data is not written to the log, but couldn’t figure out its connection with COW.
- How exactly is saving state of page allocator used in recovery process?
Posted by: Om Jadhav | March 30, 2017 02:27 AM
Summary
In this paper the authors introduce a new file system called NOVA that is designed for systems with hybrid memory of DRAM and Non-Volatile Main Memory. They adopt a log-structured FS and modify it to suit their needs (to provide maximum performance and consistency guarantees). The paper addresses several challenges that are associated with NVMM, and explains why existing file systems are unable to meet the requirements and how NOVA solves them.
Problem
Conventional file systems are not suitable for hybrid memory; they are built for the performance characteristics of disk (contiguous access) and rely on disk guarantees for atomicity. Providing consistency guarantees in NVMM can be costly, as NVMM might not provide large atomic writes (it might provide only 64-bit atomic writes), and thus file systems might have to use journaling or shadow paging. The issue with journaling is that it uses double the bandwidth, and shadow paging causes cascaded updates from leaf to root. LFS is a decent option but requires contiguous free space, and maintaining those regions requires expensive garbage collection operations. In addition to these, the CPU and memory system may reorder stores for performance reasons; explicitly flushing data to provide guarantees can cause significant overhead and degrade performance. NOVA tries to solve all these problems and aims to achieve the following:
1) Fast Random Access 2) Support massive concurrency 3) Reduce log size 4) Minimizing Garbage collector cost 5) provide strong consistency guarantees.
Contributions
The main contribution is the extension of log-structured file system techniques to exploit the characteristics of hybrid memory systems in NOVA. It is different from LFS in many ways:
1) It keeps a separate log for each inode: Due to disk limitations, LFS maintained one log for all the files. As no such limitations exist in NVMM, individual logs are used for inodes. This is possible because of NVMM's random access ability. This also helps in maximizing concurrency.
2) Logs are stored as lists: As the contiguity requirement does not exist in NVMM, logs can be non-contiguous. To keep track of the logs, 4KB pages are linked together in a linked list. There are many advantages of doing this: i) Allocation becomes easier (no contiguity maintenance) ii) Cleaning can be fine-grained down to page size (no need to write out live data to another page as in LFS) iii) Reclaiming stale pages is just an update of a few pointers.
3) Atomic updates to tails provide atomic log appends: This helps in achieving consistency guarantees by writing the data to NVMM, appending the entry to the inode's log, and then updating the tail pointer atomically to reflect the change.
4) Uses lightweight journaling for operations that span multiple inodes: For operations such as moving a file from one directory to another, we have to atomically update multiple inodes. Thus NOVA uses journaling to update all the inodes. It is called lightweight because there is no journaling involved for file data.
5) Data is not logged: Again, with the removal of the contiguity constraint, data can reside in NVMM in a different location than the inode (no locality). This has the following advantages: i) It results in shorter logs, speeding up the recovery process ii) GC is simpler since logs are small (metadata only) and there is no need to copy data out of pages iii) Allocation and reclamation of data pages are easy (just modifications to the free list in DRAM) iv) Since reclamation is easy, NOVA performs well on write-intensive workloads.
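Point 4 can be sketched as a toy rename across two directory inodes. This is a minimal model under assumptions: the journal is taken to record only the old log tails of the inodes involved, and the `Inode` class, `rename` function, and `crash_midway` flag are hypothetical names used to simulate a failure between the two tail updates.

```python
class Inode:
    def __init__(self, name):
        self.name = name
        self.log = []    # appended entries
        self.tail = 0    # committed length of the log


def rename(src_dir, dst_dir, fname, journal, crash_midway=False):
    """Move fname between directories, journaling only the two tails."""
    # 1. Record the old tails so a partial update can be undone.
    journal.append({
        "tails": [(src_dir, src_dir.tail), (dst_dir, dst_dir.tail)],
        "committed": False,
    })
    # 2. Append the log entries for both directories.
    src_dir.log.append(("remove", fname))
    dst_dir.log.append(("add", fname))
    # 3. Advance the tails one by one.
    src_dir.tail = len(src_dir.log)
    if crash_midway:
        return  # crash: only one of the two tails was updated
    dst_dir.tail = len(dst_dir.log)
    # 4. Mark the journal record as committed.
    journal[-1]["committed"] = True


def recover(journal):
    """Roll back tails named by any uncommitted journal record."""
    for record in journal:
        if not record["committed"]:
            for inode, old_tail in record["tails"]:
                inode.tail = old_tail
                del inode.log[old_tail:]
    journal.clear()


a, b, journal = Inode("a"), Inode("b"), []
a.log.append(("add", "x"))
a.tail = 1
rename(a, b, "x", journal, crash_midway=True)
recover(journal)
```

The journal record here holds two small integers and no file data, which is one way to read the "lightweight" label.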
Evaluation
NOVA is evaluated on the Intel PMEP platform through emulation. Basic read and write latencies are calculated for STT-RAM and PCM emulations of NVMM. The authors use a single-thread micro-benchmark to evaluate the latencies of basic file system operations and compare NOVA with state-of-the-art file systems; the results show that NOVA has the lowest latency. The authors run the Filebench macro-benchmarks to evaluate the application-level performance of NOVA; NOVA either dominates the other file systems or is at least competitive. NOVA works well even at 95% NVMM utilization, whereas other file systems like NILFS2 fail to pass the 10s test due to GC inefficiencies. F2FS survives for 2 minutes but fails afterwards; NOVA performs with the same efficiency for the full 1-hour test.
Question
Could you please explain shadow paging? Specifically, the cascading update issue from leaf to root.
Posted by: Pradeep Kashyap Ramaswamy | March 30, 2017 02:24 AM
1. Summary
The paper describes a log-structured file system designed for hybrid volatile/non-volatile memories. It provides high-performance, efficient garbage collection and quick recovery from system failures.
2. Problem
Availability of byte-addressable non-volatile memories that are slightly slower than DRAM posed challenges for systems software developers. Due to the low latencies of these persistent memories, software overheads of file systems become significant. Out-of-order processors can reorder stores which may result in unordered updates to these memories and leave them in an inconsistent state.
3. Contributions
The authors make file system design decisions based on observed properties of NVMM. Random access is cheap and hence logs can be implemented as linked lists and multiple logs can be maintained. Since data structures that can be searched quickly are difficult to implement in NVMM, radix trees are maintained in DRAM for fast access. Data structures such as inode tables, journals and free lists are maintained per CPU to avoid scalability bottlenecks. Each inode is given its own log allowing concurrent updates without synchronization. Lightweight journaling and logging are used for complex atomic updates such as move between directories spanning multiple inodes. NOVA provides an atomic-mmap operation to map files in NVMM to a process' address space which provides strong consistency by creating replica pages. It also provides fast garbage collection of log pages through pointer updates to remove invalid pages and thorough GC by creating new versions of the log. It allows very quick remounts on boot by using the recovery inode to restore the page allocator and lazily building the radix tree.
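The lazy rebuild mentioned at the end of this section can be sketched as follows. This is only an illustration: `LazyInode`, its `rebuilds` counter, and the dict standing in for the radix tree are assumptions, and mounting is modeled simply as constructing the objects without touching their logs.

```python
class LazyInode:
    def __init__(self, log):
        self.log = log        # persistent per-inode log (list of entries)
        self._index = None    # volatile index, absent right after mount
        self.rebuilds = 0     # counts how often the index was rebuilt

    def lookup(self, file_page):
        if self._index is None:            # first access after mount
            self._index = {}
            for page_no, addr in self.log:
                self._index[page_no] = addr
            self.rebuilds += 1
        return self._index.get(file_page)


# "Mount": create the inodes without scanning any log.
inodes = {n: LazyInode([(0, 10 * n), (0, 10 * n + 1)]) for n in range(3)}
addr = inodes[1].lookup(0)      # triggers a rebuild for inode 1 only
again = inodes[1].lookup(0)     # served from the existing index
```

Only the inode that is actually accessed pays the cost of walking its log, so mount time in this model does not grow with the number of inodes.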
4. Evaluation
Evaluation is done by emulation on the Intel PMEP platform and various NVMM technologies are studied. Microbenchmarks are used to measure latencies of basic file system operations. Filebench workloads are used to evaluate application level performance. NOVA is shown to perform significantly better than other file systems across the board. It also behaves well with write-intensive workloads while other file systems fail due to garbage collection inefficiencies. It is also able to recover very quickly, especially on a clean remount, as it builds inode information lazily.
5. Confusion
Could you please explain enforcing write ordering in more detail?
Posted by: Suhas Pai | March 30, 2017 01:42 AM
1. Summary
NOVA is a file system, based on the log-structured file system, that is designed to maximize performance on hybrid memory systems without compromising consistency. NOVA only maintains logs for inodes and keeps data outside the logs, thereby minimizing log size, speeding up crash recovery and providing atomicity and reliability.
2. Problem
The advent of hybrid main memory, which will place fast non-volatile memories alongside DRAM, will provide sub-microsecond access to persistent data. Neither existing file systems, which incur high overheads, nor proposed NVM file systems, which trade off consistency for performance, provide a satisfactory way to leverage the benefits of NVMs. Conventional file systems are unsuitable for hybrid memory systems because their design is built around the mechanics of disks. Achieving strong consistency can be costly because of the significant overheads caused by ordering.
3. Contributions
NOVA follows the ‘best of both worlds’ path by leveraging the benefits of log-structured file systems and lightweight journaling to achieve high performance on hybrid memory with strong consistency guarantees. NOVA is built on observations about atomicity, garbage collection and concurrency in NVMM.
In almost every design decision, NOVA takes into account the characteristics of both NVMM and DRAM. NOVA maintains logs for each inode in NVMM while building radix trees in DRAM for faster search operations. By giving each inode its own log, it increases concurrency without any need for synchronization. Atomic updates are made lighter by using logging. Operations that modify multiple inodes are also kept low cost by using lightweight journaling along with logging.
The log structure used in NOVA is a linked list, so a log need not be contiguous. This eases log allocation and garbage collection. NOVA maintains related but different data structures in NVMM and DRAM, according to each structure's suitability for the type of memory, thereby exploiting the properties of the underlying memory technology. For example, a directory tree is stored in DRAM whereas directory updates are written to logs maintained in NVMM.
To abstract the complexities involved in DAX-mmap, NOVA proposes a new NVMM access model called atomic-mmap, which provides stronger consistency. NOVA is structured in such a way that garbage collection is simplified. Since the logs are maintained as linked lists, garbage collection is comparatively easy. It also provides two different GC methods: fast GC, which is used to quickly reclaim space, and thorough GC, which ensures complete garbage collection. NOVA adopts a lazy rebuild policy to speed up the recovery process.
4. Evaluation
NOVA is implemented and evaluated on emulated NVMM on Intel PMEP. Microbenchmarks and macrobenchmarks are run to evaluate performance. A single-threaded microbenchmark is used to evaluate the latency of basic file system operations. Four Filebench workloads are used to measure the application-level performance of NOVA, and throughput on these workloads is compared across file systems. The performance of NOVA against other file systems for garbage collection efficiency and recovery overhead is also presented. NOVA clearly outperforms other file systems for write-intensive workloads.
5. Confusion
1. It would be helpful to know more about the allocator state that is mentioned in the paper. The concept of allocator state is unclear.
2. Can you please explain DAX-mmap? Also the details of atomic-mmap.
Posted by: Sharath Hiremath | March 30, 2017 01:42 AM
1) Summary
NVMM technologies are due to become more common in coming years. This leads to the rise of hybrid DRAM and NVMM systems which allow for fast, byte-addressable access to persistent memory on the memory bus via load and store instructions. However, this poses several design challenges for new filesystems that strive to take advantage of these systems. The authors propose a new filesystem design for hybrid DRAM/NVMM systems that achieves significant performance improvements over existing systems and maintains strong consistency guarantees.
2) Problem
NVMM technologies are due to become more common in coming years. NVMM technologies often have higher latency than DRAM but also offer higher density (capacity) and bandwidth and lower energy usage. Many have proposed placing DRAM and NVMM together on the memory bus to offer the benefits of both. Salvador Dali first proposed persistent memory in 1931 [1].
These advancements open the path for persistent storage systems living close to the processor, prompting the design of filesystems for persistent memory. However, there are several challenges when designing such a filesystem. First, by reducing the latency of storage by several orders of magnitude, NVMM forces software storage stacks to be much more efficient. Second, processors tend to reorder and cache memory accesses and provide different consistency guarantees than disks. Third, processors provide different atomic primitives from disks. These differences in consistency, ordering, and atomics mean that filesystems cannot be directly ported to NVMM systems; many current paradigms of filesystem design need to change for NVMM systems.
3) Contributions
The authors propose a filesystem for hybrid DRAM/NVMM systems called NOVA. It combines several well-known techniques while taking advantage of NVMM's characteristics. First, because NVMM is fast random-access memory, there is no need to optimize the filesystem to do sequential reads and writes. Likewise, optimizing for concurrent accesses offers an opportunity to improve performance. Second, the filesystem offers direct access to the filesystem, avoiding the overheads of a large portion of the storage stack via its atomic mmap primitive.
A major contribution of this paper is its description of design decisions in section 3. These design decisions help to identify differences between disk-based and NVMM-based filesystems. Likewise, they demonstrate several opportunities for improvement over disk-based filesystems, such as by improving concurrency, recovery time, and garbage collection, or by providing new storage primitives.
4) Evaluation
In my opinion, one of the coolest things about this paper is that it explores many of the potential improvements an NVMM-based filesystem can have over traditional filesystems. While there are several previous related works cited by the paper, this is (if I understand correctly) the first to combine and analyze several insights into a single filesystem.
Overall, the paper is pretty well-written. The concise overview of design decisions in section 3 is particularly useful and insightful both in understanding the design of NOVA and in demonstrating some of the design challenges at hand.
The evaluation itself seems reasonable. However, I was surprised that no database or virtualization benchmarks were tested since one would expect data centers to be among the first to adopt NVMMs.
5) Questions
- The instructions for flushing/fencing/committing seem incredibly easy to get wrong. It seems like memory transactions would be ideal here since that's basically what storage systems provide already. Are there many research projects on persistent transactional memory?
- With HDDs and SSDs, graph-based filesystems (every file is a node in a DAG) would be inefficient because they require lots of pointer chasing, but with NVMM, they might be feasible. Do you know of any benefits to such a filesystem, apart from being more expressive?
- Are there research projects on using nested page tables with persistent memory to provide fast persistent storage to VMs?
[1] https://en.wikipedia.org/wiki/The_Persistence_of_Memory
Posted by: Mark Mansi | March 30, 2017 01:02 AM
Summary
The paper presented NOVA, a file system aiming to provide both high performance and strong consistency guarantees on hybrid memory systems (NVMM + DRAM). In addition to the ideas from traditional LFS, NOVA separates logs for each inode to improve concurrency, stores file data outside the log to minimize log size and reduce garbage collection costs. NOVA provides atomicity for metadata, data and mmap. Meanwhile NOVA keeps complex metadata structures in DRAM to accelerate lookup operations. Experiments showed that NOVA achieved significant gains in performance even compared with file systems providing equally strong data consistency guarantees.
Problem
1. Overhead on performance
Traditionally, the latency of slow storage devices like disks dominates access latency; however, recent research has shown that software costs dominate latency with fast NVMM.
Current file systems are mostly built for solid-state disks and introduce software overheads that hinder the performance that NVMs should provide. For example, modern CPUs and memory systems may reorder stores to memory for better performance. To ensure consistency, file systems need to explicitly flush data from the CPU’s caches to enforce ordering, which introduces significant overhead.
Journaling, shadow paging, and log-structuring are popular techniques for providing atomicity in current file systems; each imposes strict ordering requirements, which hinder performance.
2. Weak consistency guarantee
At the same time, many conventional file systems rely on disks’ consistency guarantees, such as atomic sector updates, to ensure correctness. Meanwhile, many research file systems aiming to improve NVM efficiency fail to provide strong consistency guarantees.
3. Problem with conventional LFS
Conventional LFS introduces expensive garbage collection (cleaning) overheads in order to maintain contiguous free regions.
Contribution
- Here is a list of NOVA’s key design decisions:
1. Keeps logs in NVMM and indexes in DRAM for fast look up.
2. Assigns each inode a separate log to maximize concurrency during normal operation and recovery.
3. Stores the logs as linked lists, so they do not need to be contiguous in memory
4. Uses atomic updates to a log’s tail pointer to provide atomic log append. For operations that span multiple inodes, NOVA uses lightweight journaling.
5. Does not log data, thus recovery process only needs to scan a small fraction of the NVMM.
- These design decisions are based on 3 important observations mentioned in the paper:
1. Logs that support atomic updates are easy to implement correctly in NVMM but they are not efficient for search operations. On the other hand, data structures that are more efficient are difficult to implement correctly and efficiently in NVMM.
2. Overheads from cleaning log primarily come from the need to maintain large contiguous free regions of storage which is not necessary for NVMM.
3. Using multiple logs does not negatively impact performance as NVMMs support fast, highly concurrent random access.
- According to the authors, the main contributions of the paper were:
1. Built on LFS and proposed a file system that works better with hybrid systems
2. Described atomic mmap
3. Demonstrated that NOVA outperforms current consistency-guarantee mechanisms and showed its benefits on various NVMM technologies.
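The tail-pointer commit in design decision 4 can be illustrated with a minimal sketch. It assumes a toy in-memory log (the class `InodeLog` and its method names are made up here); in NOVA the tail is a pointer in NVMM that is updated atomically, whereas this model just uses an integer.

```python
class InodeLog:
    """Toy per-inode log: an entry counts only once the tail covers it."""

    def __init__(self):
        self.entries = []  # stands in for log pages in NVMM
        self.tail = 0      # committed length (a pointer in the real system)

    def write_entry(self, entry):
        # Step 1: write the entry past the tail; it is not yet visible.
        self.entries.append(entry)

    def commit(self):
        # Step 2: one tail update publishes everything written so far.
        self.tail = len(self.entries)

    def recover(self):
        # After a crash, anything beyond the tail is simply ignored.
        del self.entries[self.tail:]
        return list(self.entries)


log = InodeLog()
log.write_entry("create a.txt")
log.commit()
log.write_entry("write a.txt 0-4096")  # crash happens before commit()
recovered = log.recover()              # only the committed entry survives
```

Because the only step that makes an entry visible is a single small update, a crash leaves the log either with or without the new entry, never with part of it.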
Evaluation
The paper evaluated NOVA using Intel Persistent Memory Emulation Platform (PMEP) which is a dual-socket Intel Xeon processor-based platform with special CPU microcode and firmware. As mentioned in the paper, the processors on PMEP run at 2.6GHz with 8 cores and 4 DDR3 channels. In the tests, the PMEP was configured with 32 GB of DRAM and 64 GB of NVMM.
NOVA was evaluated using a collection of microbenchmarks and macrobenchmarks. The paper concluded that NOVA is significantly faster than existing file systems in a wide range of applications and outperforms file systems that provide the same data consistency guarantees by between 3.1x and 13.5x in write-intensive workloads. When it comes to garbage collection and recovery overheads, NOVA provides stable performance under high NVMM utilization levels and fast recovery in the case of system failure.
Confusion
1. The paper talks very briefly about NVMM protection in section 4.8, which mentions that NOVA must make sure it is the only system software that accesses the NVMM. Is this a fair assumption? The same section also mentions that NOVA disables the processor’s write protect control (CR0.WP) whenever it needs to write to NVMM. What impact does this have on safety?
Posted by: Yunhe Liu | March 30, 2017 12:56 AM
1.Summary
NOVA is a modified log-structured file system designed to take advantage of faster, hybrid volatile/non-volatile main memories.
2. Problem
Conventional file systems are not suitable for hybrid main memory since they are optimized for the characteristics of disk storage. It is more difficult to guarantee consistency in main memory than on disk. The various techniques available to ensure consistency, such as journaling, shadow paging and log structuring, are not efficient when considered individually. An LFS need not keep its log contiguous, since random access is fast in NVMM, and this makes memory allocation and garbage collection much easier.
3. Contribution
Some of the key observations that lead to the design of NOVA are as follows: i) it is easy to implement logs that support atomic update in NVMM but it is difficult to design an efficient lookup in NVMM. ii) random access in NVMM is cheap and hence we don’t need to write the log in the contiguous locations. iii) there is no need to maintain a single log file and hence could improve the concurrency by having multiple log files.
NOVA is a modified version of the LFS design that works best on NVMM. Every CPU has an inode table, a journal and a free list of pages. Every inode is associated with a log, which is a singly linked list. The tail pointer of the inode points to the latest committed log entry. NOVA keeps the index to the logs in DRAM as a radix tree, which makes access faster. It uses journaling to ensure consistency when multiple inodes are involved. NOVA uses copy-on-write and appends the metadata about the write to the log. NOVA uses two kinds of garbage collection: i) fast GC: if all the entries in an inode's log page are dead, it reclaims the entire page; ii) thorough GC: if live entries make up less than 50% of the log's space, the live entries are written out to new pages and the old pages are garbage collected.
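The two GC modes can be sketched on a toy linked list of log pages. This is only an illustration under simplifying assumptions: `LogPage`, `link`, `walk`, `fast_gc`, `thorough_gc` and `live_fraction` are hypothetical names, each entry is a `(payload, is_live)` pair, and the 50% threshold follows the description above.

```python
class LogPage:
    def __init__(self, entries):
        self.entries = list(entries)  # (payload, is_live) pairs
        self.next = None

    def live(self):
        return [e for e in self.entries if e[1]]


def link(pages):
    """Chain pages into a singly linked list and return the head."""
    for a, b in zip(pages, pages[1:]):
        a.next = b
    return pages[0] if pages else None


def walk(head):
    while head is not None:
        yield head
        head = head.next


def fast_gc(head):
    """Unlink pages whose entries are all dead; nothing is copied."""
    dummy = LogPage([])
    dummy.next = head
    prev = dummy
    while prev.next is not None:
        if prev.next.live():
            prev = prev.next
        else:
            prev.next = prev.next.next  # a pointer update is the whole cost
    return dummy.next


def thorough_gc(head, page_size):
    """Copy the live entries into a new, compact chain of pages."""
    live = [e for page in walk(head) for e in page.live()]
    chunks = [live[i:i + page_size] for i in range(0, len(live), page_size)]
    return link([LogPage(chunk) for chunk in chunks])


def live_fraction(head, page_size):
    pages = list(walk(head))
    if not pages:
        return 1.0
    return sum(len(p.live()) for p in pages) / (len(pages) * page_size)


head = link([
    LogPage([("a", False), ("b", False), ("c", False)]),
    LogPage([("d", True), ("e", False), ("f", False)]),
    LogPage([("g", False), ("h", True), ("i", False)]),
])
head = fast_gc(head)                      # drops the fully dead first page
pages_after_fast = len(list(walk(head)))
if live_fraction(head, 3) < 0.5:          # still mostly dead space
    head = thorough_gc(head, 3)           # compact the survivors
```

In this example fast GC removes one page for free, and thorough GC then packs the two remaining live entries into a single page.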
4. Evaluation
A single-threaded micro-benchmark is used to evaluate the latency of basic file system operations. NOVA provides the lowest latency for each operation and outperforms the other file systems. To evaluate the application-level performance of NOVA, four Filebench workloads are used: fileserver, webproxy, webserver and varmail. NOVA performs better than all seven other file systems tested on all four workloads.
5.Confusion
Could you please elaborate on the NVMM protection mechanism implemented in NOVA.
Posted by: Sowrabha Horatti Gopal | March 30, 2017 12:47 AM
1. Summary
This paper presented NOVA, a log structured file system for hybrid volatile/non-volatile memory system. NOVA guarantees consistency and good performance with scalable design and low overhead garbage collection mechanisms.
2. Problems
Researchers proposed hybrid volatile/non-volatile memory system which puts DRAM and NVM side by side on processor’s memory bus. Since cache lines may be written back out of order, consistency has to be guaranteed for such hybrid system to avoid leaving data in NVM in inconsistent state. Conventional file systems are not suitable for such hybrid memory systems because they are built for high latency and un-byte-addressable disk storage and rely on disks’ atomic sector updates to provide consistency guarantees. Existing technologies for consistency including journaling, shadow paging and conventional log-structured file systems incur high overheads.
3. Contributions
1) NOVA only logs metadata and uses copy-on-write for file data to reduce garbage collection overheads. Since NVM supports random access, the NOVA log is organized as a linked list. For each file operation, NOVA first updates file data, then appends the metadata updates to the log of the file, and finally updates the tail pointer of the log atomically. For directory operations that update multiple logs, NOVA uses journaling to store updates to all corresponding logs. Garbage collection is lightweight because the logs are small and do not require contiguous free space, and because the copy-on-write policy for file data enables recycling stale pages at write time. NOVA uses a hybrid garbage collection approach (fast and thorough) to trade off performance against overhead.
2) NOVA is designed to be scalable. A log is assigned to each inode to support concurrent operations, and an inode table is used to locate each inode's log. The inode table and journal are per-CPU data structures, which provides scalability, and locks guarantee synchronized access to the same inode.
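The write path in 1) can be sketched as a toy model. Everything here is an assumption made for illustration: `ToyFile` and its fields are hypothetical, a dict stands in for NVMM data pages and for the DRAM index, and the "atomic" tail update is just an assignment.

```python
class ToyFile:
    """Toy copy-on-write file: data pages live outside the metadata log."""

    def __init__(self):
        self.pages = {}    # page_id -> bytes (stands in for NVMM data pages)
        self.log = []      # metadata entries: (file_page_no, page_id)
        self.tail = 0      # committed log length
        self.index = {}    # DRAM index: file_page_no -> page_id
        self.free = []     # reclaimed page ids
        self.next_id = 0

    def _alloc(self):
        if self.free:
            return self.free.pop()
        self.next_id += 1
        return self.next_id - 1

    def write_page(self, page_no, data):
        new_id = self._alloc()
        self.pages[new_id] = data           # 1. write data to a fresh page
        self.log.append((page_no, new_id))  # 2. append the metadata entry
        self.tail = len(self.log)           # 3. commit by moving the tail
        old_id = self.index.get(page_no)
        self.index[page_no] = new_id        # 4. update the DRAM index
        if old_id is not None:              # 5. stale page is reclaimed now
            del self.pages[old_id]
            self.free.append(old_id)

    def read_page(self, page_no):
        return self.pages[self.index[page_no]]


f = ToyFile()
f.write_page(0, b"old")
f.write_page(0, b"new")  # the page holding b"old" is freed at write time
```

Step 5 is why data pages never need a separate cleaning pass in this model: the overwritten page is known to be stale as soon as the new entry commits.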
4. Evaluation
Authors test NOVA on the Intel Persistent Memory Emulation Platform with a combination of micro and macro benchmarks, and show that NOVA outperforms other NVM file systems and conventional file systems.
5. Confusion
1) A hybrid memory system seems to be limited in persistent storage size. Is a hybrid memory system practical if we need terabytes of storage?
2) NOVA’s approach to guarantee its sole access to NVM seems unsafe (disabling cpu write control).
Posted by: Yanqi Zhang | March 30, 2017 12:46 AM
1. Summary
This paper introduces a log-structured file system in the setting of non-volatile main memory (NVMM). The key features include per-inode log in NVMM, indexes in DRAM, fast log cleaning, and lightweight journaling.
2. Problem
The problem is that, with the emergence of non-volatile memory, file systems should exploit the advantages of NVMM to get better performance. (1) Because hardware latency in NVMM is much smaller than in disk/SSD, the software latency of the file system dominates and needs to be optimized. (2) Hardware atomicity behaviour differs between NVMM (bytes) and disk/SSD (sectors), so file systems need to be ported accordingly while meeting the requirements of POSIX semantics. (3) Past NVMM-based file systems either have unsatisfactory overhead or don't guarantee atomicity and consistency.
3. Contributions
First, the system the authors designed maintains a log per inode (file). This is a good strategy to increase concurrency between different files. Garbage collection of logs makes more sense here than in LFS because the per-inode log gives good performance isolation between files. Because NVMM can be randomly accessed, it does not favor sequential access as a disk does. In NVMM, the data doesn't need to be stored in the log, and the log is used for atomicity and consistency (with journaling). In addition, because NVMM can be randomly accessed, the log can be implemented as a linked list instead of a contiguous array. A linked list makes insertion and deletion easier, so garbage collection of the log (fast and thorough GC) becomes more efficient. Second, the system (NOVA) also maintains indexes in DRAM to speed up file system operations. A tree of directory entries is used to speed up directory operations and a per-file tree (offset-to-log) is used to speed up file operations. A journal of logs (implemented as a circular buffer) is used (with the per-inode log) to bring the file system to a consistent state after failure.
4. Evaluation
The authors implemented NOVA in Linux kernel 4.0, and evaluated the performance of NOVA on an Intel Persistent Memory Emulation Platform (PMEP). NVMM is emulated by DRAM on that PMEP (emulated as STT-RAM and PCM). In the microbenchmark (create, append, fsync and delete 10,000 files), NOVA has the lowest latency among 8 file systems. In the macrobenchmarks, NOVA has better performance (finished operations per second) than other file systems not only for read-intensive workloads (webproxy and webserver), but also for write-intensive workloads (fileserver and varmail). The authors also evaluated the garbage collection efficiency of NOVA under high NVMM utilization with a write-intensive workload. NOVA survived the test while the other two file systems (NILFS2 and F2FS) failed. In addition, NOVA can recover 50 GB of data in around 100 ms.
5. Confusion
Why did the authors emulate NVMM with DRAM for the experiments? What could go wrong during emulation, compared with working on real NVMM hardware?
Posted by: Cheng Su | March 30, 2017 12:29 AM
1. Summary
This paper introduces NOVA, a file system designed to be used with non-volatile memory technologies.
2. Problem
Standard file systems do not function nearly as well on non-volatile memory storage since they have been designed with disk or flash storage technologies in mind. Many older file systems such as LFS were designed to optimize for the sequential operations that disk storage devices perform well, whereas NVMM storage can have random access patterns without a loss of performance. Standard file systems also offer far different consistency guarantees than memory can offer. Memory controllers can reorder stores and have a much finer-grained write size.
3. Contributions
The paper introduces NOVA which is a log based file system adjusted for NVMM usage scenarios. NOVA changes from the idea of having a single log for all disk operations to make operations more sequential to giving each inode its own log. This allows increased concurrency across different files. The logs are implemented as linked lists which allow for atomic writes by updating pointers to commit the operation. More complicated file operations also make use of a simple journaling system when needed to guarantee an atomic operation. Log cleaning is optimized to automatically update pointers to skip empty log pages in fast cases and compacting lightly used entries more infrequently when necessary.
4. Evaluation
The evaluation was done on a system that did not actually use NVMM but instead emulated it. The results largely showed improvements over other existing NVMM file systems, ranging from small to large across the benchmarks. They also show their garbage collection method to perform far better in that it can actually finish the test running on a nearly full system whereas other technologies would crash. The evaluation did seem to try to hide some of their worse results though. Most of the performance improvement data was expressed as ##x improvement but then for small improvements would change to percentages so the numbers didn’t seem quite so small at first glance, e.g. “between 22% and 9.1x.”
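To put the two ends of a range like that on the same scale, a percentage gain can be converted to a speedup factor and back. A small sketch, with helper names made up for illustration:

```python
def percent_gain_to_factor(percent):
    """A gain of p percent is a speedup factor of 1 + p/100."""
    return 1 + percent / 100


def factor_to_percent_gain(factor):
    """A speedup factor of f is a gain of (f - 1) * 100 percent."""
    return (factor - 1) * 100


low_as_factor = percent_gain_to_factor(22)      # the "22%" end as a factor
high_as_percent = factor_to_percent_gain(9.1)   # the "9.1x" end as a percent
```

Under this conversion 22% corresponds to roughly 1.22x, and 9.1x corresponds to roughly 810%.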
5. Confusion
I’m not sure I understand exactly how the recovery process works in the event of a crash.
Posted by: Taylor Johnston | March 30, 2017 12:13 AM
1. Summary
Xu and Swanson design a log-structured file system for hybrid memory systems. It is based on three observations. First is the recognition that data structures for atomic update are easy to implement while fast search data structures are more difficult in NVMM. Second, cleaning stems from the need for contiguous free regions. Lastly, multiple logs enable concurrency previously limited by a single disk head.
2. Problem
NVM is becoming more mainstream and thus will appear in more systems. Modern file systems are designed around disks or SSDs. NVM fits in between SSDs and DRAM in terms of latency. How should file systems incorporate this technology especially with the residence alongside DRAM on the processor memory bus? Among the considerations for the common case include crash recovery and atomicity.
3. Contribution
As mentioned in the summary, the three observations drive the design of NOVA. Building off of 1, logs and file data are kept in NVMM and radix trees are built in DRAM to perform search. From 3, they opt to give each inode its own log thus allowing concurrent updates. If multiple inodes are involved, lightweight journaling and logging are used to ensure atomicity. They rely on the VFS locking all affected inodes for directory operations to prevent concurrent transactions modifying the same inode.
NOVA uses per CPU inode tables, journals, and NVMM free page lists. This allows the common case to avoid global locking and scalability bottlenecks. The free lists are red-black trees sorted by address for efficient merging and deallocation. Garbage Collection is minimized by the logs being linked lists and containing only metadata. The requirement for large contiguous free regions is no longer needed.
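The benefit of keeping free ranges sorted by address is that a freed range can be merged with its neighbours in one step. A minimal sketch of that idea follows; note that it substitutes a sorted Python list with `bisect` for the red-black tree, and the class `FreeList` and its methods are hypothetical.

```python
import bisect


class FreeList:
    """Free page ranges as (start, end) tuples, end exclusive, sorted by start.

    A sorted list stands in for the red-black tree described above.
    """

    def __init__(self):
        self.ranges = []

    def free(self, start, end):
        i = bisect.bisect_left(self.ranges, (start, end))
        # Merge with the predecessor if it ends exactly where we start.
        if i > 0 and self.ranges[i - 1][1] == start:
            start = self.ranges[i - 1][0]
            i -= 1
            del self.ranges[i]
        # Merge with the successor if it starts exactly where we end.
        if i < len(self.ranges) and self.ranges[i][0] == end:
            end = self.ranges[i][1]
            del self.ranges[i]
        self.ranges.insert(i, (start, end))

    def alloc(self, n):
        """First-fit allocation of n pages; returns the start or None."""
        for i, (s, e) in enumerate(self.ranges):
            if e - s >= n:
                if e - s == n:
                    del self.ranges[i]
                else:
                    self.ranges[i] = (s + n, e)
                return s
        return None


fl = FreeList()
fl.free(0, 4)
fl.free(8, 12)
fl.free(4, 8)        # bridges the gap, so the three ranges become one
first = fl.alloc(3)  # carved from the front of the merged range
```

With one such list per CPU, as the review describes, allocation and freeing in the common case would not need a global lock.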
Recovery is handled by storing the page allocator state. After a normal shutdown, it is very fast as it does not need to scan inode logs. On a failure, NOVA has to scan the inode logs to rebuild the NVMM allocator information. Despite this, it is fast because of the per-CPU inode tables, the per-inode logs, and the shortness of the logs. Protection is done by mapping the NVM region read-only and creating write windows by opening and closing the write protect control.
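The failure-recovery scan can be pictured as follows. This is a simplified sketch under assumed data shapes, not the real on-media format: `rebuild_free_pages` is a hypothetical helper, each inode's log is a dict holding the pages the log itself occupies plus its committed `(file_page_no, page_id)` entries, and pages are plain integers.

```python
def rebuild_free_pages(total_pages, inode_logs):
    """Recompute the free pages by scanning every inode log.

    inode_logs maps inode number -> {"log_pages": [...], "entries": [...]},
    a made-up layout used only for this illustration.
    """
    in_use = set()
    for log in inode_logs.values():
        in_use.update(log["log_pages"])      # pages holding the log itself
        latest = {}
        for file_page_no, page_id in log["entries"]:
            latest[file_page_no] = page_id   # later entries supersede earlier
        in_use.update(latest.values())       # only current data pages count
    return sorted(set(range(total_pages)) - in_use)


logs = {
    1: {"log_pages": [0], "entries": [(0, 2), (0, 3)]},  # page 2 was overwritten
    2: {"log_pages": [1], "entries": [(0, 4)]},
}
free_pages = rebuild_free_pages(8, logs)
```

Since each inode's log is scanned independently, the loop body is the part that could be spread across CPUs, which matches the review's point about per-CPU tables and per-inode logs keeping this fast.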
4. Evaluation
NOVA is evaluated using Intel’s Persistent Memory Emulation Platform (PMEP). First a microbenchmark is presented creating, appending fixed amounts of data, persisting, and deleting 10,000 files. Due to the parallelism in individual inode logs, it is not surprising that NOVA performs well. In the macro benchmarks, NOVA’s performance is heavily related to the workload nature as it doesn’t just provide a blanket improvement. In the fileserver and varmail tests, it significantly outperformed the other file systems due to the higher write ratio. Meanwhile with webproxy and webserver, the performance increase was not significant, but NOVA still performed well on average. This seems to be due to the read-heavy nature of the workloads.
5. Discussion
I don’t fully understand how accesses occur across CPU inode tables? How can NOVA tackle the limited size of NVM?
Posted by: Dennis Zhou | March 30, 2017 12:12 AM
1. Summary
The paper presents a novel file system for a hybrid non-volatile/DRAM memory system called NOVA. The paper forecasts the availability of persistent non-volatile memory chips on the CPU memory bus alongside volatile DRAM. And based on this they evaluate the current file systems for this scenario, finding that they are unable to utilize the system effectively because they have additional overheads or they don't provide strong consistency guarantees.
So the authors propose a mix of good file system ideas from the last decade to build a file system whose basis is a log structure, along with journals, separation of metadata and data, and a non-persistent tree-like search structure to enable fast lookups for reads and writes. The authors also propose a new atomic mmap to deal with consistency issues in memory-mapped files residing on persistent memory.
2. Problem
The primary problem is new technology, i.e. the availability of persistent non-volatile memory chips on the CPU memory bus alongside volatile DRAM, and current software being unable to handle the hardware correctly or extract the best possible performance from it.
3. Contributions
The primary contributions are as follows:
- Combine concepts from log-structuring, journaling and copy-on-write to provide atomicity of file system operations
- Provide high performance by splitting data structures between DRAM and NVMM.
- A highly scalable file system providing CPU-specific copies of data structures, and reducing log size by making logs per inode.
- Methods for efficient garbage collection: fine-grained log cleaning with the log as a linked list, and a smaller log obtained by splitting data and metadata so that the log contains only metadata.
- Methods for fast recovery: lazily rebuild the inode map tree in DRAM; allow a parallel scan to recover per inode, owing to the split nature of the design. Availability of both a journal and a log helps this process.
4. Evaluation
The authors provide a thorough evaluation of the system by providing comparisons against the state of the art including both NVMM specific and DRAM specific systems.
The only question the authors leave unanswered is what is the recovery time for the competing systems. They just claim they are fast (but don't tell compared to what), or what should be the ideal recovery time.
5. Confusion
What is stopping the availability of such a system configuration, even if only just for research, is it the cost, incompatibility with the current memory bus architecture or do they just exist as models which need to be proven for allowing a physical build?
Posted by: Akhil Guliani | March 29, 2017 11:55 PM
1. Summary
This paper proposes NOVA, a file system on the hybrid of non-volatile memory and DRAMs.
2. Problem
The problem this paper solves is building a file system for emerging non-volatile memory technology. Conventional file systems focus on hardware latency. However, for file systems based on NVMs or a hybrid of NVMs and DRAM, hardware latency is not critical. Instead, software overhead becomes the largest challenge. This is because modern processors and their caching hierarchies may reorder store operations to improve performance, so file systems need to prevent this by explicitly flushing caches and issuing memory barriers to enforce write ordering.
3. Contributions
This paper highlights four contributions in its introduction section. However, the most important contribution is modifying LFS according to the new features of NVMM. NOVA notices that random access is not a bottleneck in NVMM, so it discards the large contiguous segments of the original log-structured FS and replaces them with a list of entries. This improvement greatly reduces the overhead of garbage collection. The modifications to the FS structure, journaling and atomic operations also provide both efficiency and consistency guarantees.
4. Evaluation
The authors use a single-thread micro-benchmark to evaluate the latency of basic file system operations. The benchmark creates 10000 files, makes 16 4KB appends to each file, calls fsync to persist the files, and finally deletes them. NOVA provides the lowest latency for each operation. The latency breakdown of NOVA operations also shows that NOVA is more sensitive to NVMM performance. The authors use four Filebench workloads - fileserver, webproxy, webserver and varmail - to evaluate the application-level performance of NOVA. NOVA outperforms other FSs, especially for the large dataset and write-intensive workloads. Regarding garbage collection efficiency, NOVA can do more operations than F2FS and NILFS2 and can run for a long time. The authors also measure the recovery overhead using three workloads - Videoserver, Fileserver, Mailserver.
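The shape of that micro-benchmark (create, 16 appends of 4 KB, fsync, delete) is easy to reproduce on any mounted file system. The sketch below is not the authors' harness: `run_microbenchmark` is a hypothetical function, it defaults to 100 files rather than 10000, and it only sums wall-clock time per operation type using standard `os` calls.

```python
import os
import tempfile
import time


def run_microbenchmark(directory, num_files=100, appends=16, append_size=4096):
    """Create files, append to them, fsync, then delete; return seconds per phase."""
    block = b"x" * append_size
    totals = {"create": 0.0, "append": 0.0, "fsync": 0.0, "delete": 0.0}
    paths = [os.path.join(directory, f"file{i}") for i in range(num_files)]

    for path in paths:
        t0 = time.perf_counter()
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_APPEND, 0o644)
        t1 = time.perf_counter()
        for _ in range(appends):
            os.write(fd, block)
        t2 = time.perf_counter()
        os.fsync(fd)  # force the appended data to stable storage
        t3 = time.perf_counter()
        os.close(fd)
        totals["create"] += t1 - t0
        totals["append"] += t2 - t1
        totals["fsync"] += t3 - t2

    for path in paths:
        t0 = time.perf_counter()
        os.unlink(path)
        totals["delete"] += time.perf_counter() - t0

    return totals


with tempfile.TemporaryDirectory() as d:
    totals = run_microbenchmark(d, num_files=10)
    leftover = os.listdir(d)  # should be empty once every file is deleted
```

Pointing `directory` at mounts of different file systems would give a rough per-operation comparison, though Python's own overhead would make the absolute numbers much coarser than the paper's.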
5. Confusion
Does the paper want to substitute the disk with NVMs?
As random access is no longer a problem in NVMM, why not use the most basic UNIX system (supernode, inode, data) with journaling?
4.1 Journal "To coordinate updates that across multiple inodes, NOVA first appends log entries to each log, and then". What are the log entries appended to each log? How does NOVA record the old inodes log?
Posted by: Huayu Zhang | March 29, 2017 10:20 PM
1. Summary
The focus of this paper was on the technological innovation of non-volatile memories and designing a file system to handle them. NOVA takes the typical log-structured file system and modifies it to be used with fast non-volatile memories.
2. Problem
Because typical disks are so slow, software overhead is negligible; however, this is no longer true with fast non-volatile memory (NVM). Not only will this overhead hurt NVM systems, but you must also consider consistency guarantees when using a file system with NVM. Most CPUs reorder stores to improve performance, but this breaks consistency if the system fails. Typical log-structured file systems, as we have seen, require contiguous sections of memory to function and therefore rely on expensive garbage collection.
3. Contributions
NOVA takes a novel approach to logs and stores them as linked lists. Making each log a linked list of pages makes allocation easier and removes the need for large contiguous regions. This vastly reduces the challenge of garbage collection, which NOVA implements in two ways. The first is fast garbage collection, which requires no copying and reclaims pages that have no valid log entries. The second is thorough garbage collection, which consolidates logs from pages with less than 50% valid entries. Another part of NOVA is handling complex atomic updates, such as moves between directories, that involve multiple inodes. In this case NOVA uses lightweight journaling to atomically update multiple logs. Another problem that NOVA solves is that typical mmap only provides 64-bit writes, fences, and cache flush instructions, which makes it difficult to build non-volatile data structures. NOVA uses atomic-mmap, which allocates replica pages and then maps the replicas into the address space. For recovery, NOVA uses lazy rebuild to improve speed. Lazy rebuild only rebuilds the radix tree and inode when the inode is accessed for the first time. This takes advantage of the fact that applications will probably only access a portion of the inodes anyway.
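To make the two garbage collection modes described above concrete, here is a minimal Python sketch. It is an illustration only, not NOVA's implementation: the `LogPage` class, the `collect` helper, the page capacity of 4, and the way the 50% threshold is applied to the whole log are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LogPage:
    """One page of an inode log; each flag says whether that entry is still live."""
    live: List[bool] = field(default_factory=list)

def fast_gc(log: List[LogPage]) -> List[LogPage]:
    """Drop pages that hold no live entries. Nothing is copied; in a
    linked list this amounts to unlinking the dead page."""
    return [page for page in log if any(page.live)]

def thorough_gc(log: List[LogPage], page_capacity: int) -> List[LogPage]:
    """Copy only the live entries into freshly packed pages."""
    live_count = sum(sum(page.live) for page in log)
    new_log = []
    while live_count > 0:
        n = min(page_capacity, live_count)
        new_log.append(LogPage(live=[True] * n))
        live_count -= n
    return new_log

def collect(log: List[LogPage], page_capacity: int = 4, threshold: float = 0.5):
    """Run fast GC first; fall back to thorough GC if the log is still mostly dead."""
    log = fast_gc(log)
    total = sum(len(page.live) for page in log)
    live = sum(sum(page.live) for page in log)
    if total and live / total < threshold:
        log = thorough_gc(log, page_capacity)
    return log

log = [
    LogPage([False, False, False, False]),  # fully dead: fast GC unlinks it
    LogPage([True, False, False, False]),   # mostly dead
    LogPage([False, True, False, False]),   # mostly dead
]
compacted = collect(log)
print(len(compacted), compacted[0].live)  # 1 [True, True]
```

In this toy run, fast GC removes the first page without copying anything, and thorough GC then packs the two surviving entries into a single page.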
4. Evaluation
To evaluate the system, the authors set up the Intel persistent memory emulation platform with two configurations of NVMM latency and bandwidth. They then compared NOVA to 7 different file systems, looking at latency and performance. Whether on microbenchmarks stressing latency or on full workloads, NOVA outperformed the other file systems, sometimes by over 10x. Another major point of evaluation was NOVA's garbage collection system. Compared to the two other log-structured file systems, NOVA was the only system that could run for a full hour under a heavy utilization test. It is also interesting to note that NOVA used its fast garbage collection method to reclaim 94% of pages when running for a long time.
5. Confusion
How practical is NOVA? It would seem that as NVM catches on, and especially in the early stages when systems could have only DRAM or a hybrid, you would not want a system that could not handle only DRAM.
Posted by: Brian Guttag | March 29, 2017 10:15 PM
1. Summary
This paper presents NOVA which is a file system for hybrid volatile/non-volatile memory systems. NOVA's design is inspired by log structured file systems for traditional disk based file systems. However, NOVA uses a log only for metadata. It also uses a separate log for each i-node which increases parallelization of the system.
2. Problem
Traditional disk based file systems are not well suited for NVM systems because they rely on some of the guarantees provided by the disk such as atomic sector writes etc. Providing strong consistency guarantees using these file systems for NVM can be hard in the face of CPU reordering of stores. One idea is to flush CPU caches at a very high frequency but this would outweigh any performance benefits provided by NVM. To overcome these limitations, these authors designed and implemented a log structured file system for NVM systems.
3. Contributions
NOVA tries to achieve four basic goals. It achieves atomicity of memory operations by combining log structuring with journaling. It achieves high performance by splitting data structures between DRAM and NVM. It achieves efficient garbage collection by storing only metadata in the log and treating log as a linked list. It achieves fast crash recovery by lazy rebuild and parallel memory scanning.
Atomicity is achieved by using a lightweight journal that stores the head and tail of the log. The update is first committed to persistent memory before the tail of the log is updated. High performance is achieved by placing the volatile tree structures that manage the logs in DRAM. These structures are volatile, so there are no consistency requirements for them. Since the log is stored as a linked list, garbage collection is extremely simple and efficient. Freeing up space inside the log is akin to removing a node from the linked list. Finally, the DRAM tree structure is built in parallel during crash recovery, making it very efficient.
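The journaling idea can be sketched roughly as follows. This is a toy model under stated assumptions, not NOVA's code: the `Nvmm` class, the `rename` and `recover` functions, and the choice to journal only the old tail values are hypothetical, and a real implementation would also need cache flushes and fences so that the journal is persistent before any tail changes.

```python
class Nvmm:
    """Toy stand-in for persistent state: per-inode logs, tails, and a journal."""
    def __init__(self):
        self.logs = {}      # inode number -> list of log entries
        self.tails = {}     # inode number -> number of committed entries
        self.journal = {}   # inode number -> tail value before the transaction

def rename(nvmm, src_dir, dst_dir, name, crash_after_first_tail=False):
    # 1. Append entries past the current tails; they are not yet visible.
    nvmm.logs[src_dir].append(("remove", name))
    nvmm.logs[dst_dir].append(("add", name))
    # 2. Record the old tails in the journal before touching any tail.
    nvmm.journal = {src_dir: nvmm.tails[src_dir], dst_dir: nvmm.tails[dst_dir]}
    # 3. Advance the tails one at a time.
    nvmm.tails[src_dir] = len(nvmm.logs[src_dir])
    if crash_after_first_tail:
        return  # simulated power failure between the two tail updates
    nvmm.tails[dst_dir] = len(nvmm.logs[dst_dir])
    # 4. Clearing the journal commits the whole operation.
    nvmm.journal = {}

def recover(nvmm):
    """A non-empty journal means a half-finished operation: roll the tails back."""
    for ino, old_tail in nvmm.journal.items():
        nvmm.tails[ino] = old_tail
        del nvmm.logs[ino][old_tail:]
    nvmm.journal = {}

nvmm = Nvmm()
nvmm.logs = {1: [("add", "a.txt")], 2: []}
nvmm.tails = {1: 1, 2: 0}
rename(nvmm, 1, 2, "a.txt", crash_after_first_tail=True)
recover(nvmm)
print(nvmm.tails)  # {1: 1, 2: 0}
```

After the simulated crash, recovery restores both directories to their state before the rename, so the file is never visible in neither or both directories.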
4. Evaluation
The authors compare the latency of NOVA with ext4 journaling and ext4-DAX on the Intel emulation platform. They find that NOVA has lower latency than both of these file systems for create, append, and delete operations. Similarly, NOVA provides better throughput than these two file systems on the Filebench benchmark. The garbage collection efficiency of NOVA is also an order of magnitude better than that of F2FS.
5. Confusions
Why has the adoption of NVM been so slow?
Posted by: Hasnain Ali Pirzada | March 29, 2017 09:33 PM
1. Summary
This paper introduces NOVA, a log structured file system designed for the hybrid volatile/non-volatile main memory systems, which makes full use of the good read and write performance of NVMM storage system.
2. Problems
Non-volatile memory is a new memory technology that is faster than disk and SSD, slightly slower than DRAM, and non-volatile, so a hybrid DRAM/NVMM system is very appealing: it has good performance and allows flexible access patterns. The log-structured file system was first designed for HDD systems to turn writes into sequential writes, but NVMM supports fast random access and is much faster than HDD, so there will be a large software overhead if we simply apply LFS to NVMM.
3. Contributions
This paper makes full use of NVMM characteristics to optimize the log-structured file system. First, traditional LFS uses 512K segments to write sequentially, but this leads to garbage collection overhead in order to obtain contiguous free space. NVMM is good at random access, so NOVA uses fine-grained 4K pages as the log to reduce the overhead of garbage collection. Second, a disk has only one disk head, so traditional LFS has one log for all inodes, which limits concurrency across multiple cores. But NVMM supports fast concurrent random access, so NOVA maintains a log for each inode, which allows concurrent updates. The high-level layout of NOVA builds a free list and an inode table for each CPU, which also allows concurrent updates. Third, the data structures in NVMM should be simpler than those in DRAM, so this paper implements logs that support concurrent updates in NVMM and builds a radix tree in DRAM for quick search operations.
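The split between a simple persistent log and a richer volatile index can be illustrated with a short Python sketch. This is a simplified model, not NOVA's data layout: the `Inode` class is hypothetical, a plain `dict` stands in for the DRAM radix tree, and rebuilding the index on first lookup is an assumption borrowed from the lazy rebuild idea mentioned in other reviews in this thread.

```python
class Inode:
    """Per-inode state: an append-only log (simulated NVMM) plus a
    lookup index (simulated DRAM) that can always be rebuilt from the log."""
    def __init__(self):
        self.log = []      # persistent: (page_offset, data_page) entries
        self.index = None  # volatile: page_offset -> data_page, lost on crash

    def write(self, page_offset, data_page):
        self.log.append((page_offset, data_page))
        if self.index is not None:
            self.index[page_offset] = data_page

    def lookup(self, page_offset):
        if self.index is None:          # first access after a (re)mount
            self.index = {}
            for off, page in self.log:  # later entries override earlier ones
                self.index[off] = page
        return self.index.get(page_offset)

f = Inode()
f.write(0, "page-A")
f.write(1, "page-B")
f.write(0, "page-C")   # overwrite of offset 0
f.index = None         # simulate losing DRAM contents
print(f.lookup(0), f.lookup(1))  # page-C page-B
```

Because each inode owns its log and index, two threads working on different files would not share any of this state, which is the source of the concurrency described above.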
4. Evaluation
This paper runs experiments on an emulated hybrid DRAM/NVMM system and compares NOVA with other file systems. First, it uses micro-benchmarks to evaluate the basic file system operations; NOVA provides the lowest latency of all systems for all operations. Second, it uses macro-benchmarks such as fileserver and webproxy to evaluate NOVA's performance in real applications, and NOVA is better than the other systems in almost all cases. There is also a study of the overhead of garbage collection and recovery.
5. Confusion
(1) I am not very clear about NVMM; I have heard of it but not in much detail. Can you introduce the existing NVMM systems and how they are implemented? Also, the experiment section uses an emulation instead of real hardware. Is that because real NVMM hardware is still not available?
(2) I am not very clear about why NOVA only logs inodes and the log does not contain file data. How is the file data stored on NVMM?
Posted by: Tianrun Li | March 29, 2017 09:17 PM
1. Summary
The paper presents NOVA, a file system which adapts conventional techniques from log-structured filesystems to exploit the fast random access of NVMs and target a hybrid memory system with strong consistency guarantees. It does this by removing data and keeping only metadata in logs, which makes garbage collection simpler. It also supports data atomicity and concurrency through its abstractions for journaling and inode logs.
2. Problem
Emerging non-volatile memory (NVM) technologies promise to revolutionize I/O performance, but the authors argue that existing filesystems would fail to leverage its benefits. Conventional file systems are not suitable for hybrid memory systems because they are built for the performance characteristics of HDDs or SSDs and rely on disks’ consistency guarantees which are not the same for NVMs. NVMM filesystems developed by other researchers fail to provide data atomicity because of the costly overheads associated with traditional techniques such as journaling and shadow paging. The authors propose using LFS techniques to provide data atomicity for NVMMs, and suggest methods to solve the garbage collection overheads associated with traditional LFSes.
3. Contributions
1. Simpler logs by not logging file data and using linked list structure - Sequential logs are not needed because NVMs have good random access times. This allows the use of linked lists for logs, and eliminates the overheads of freeing enough sequential space in LFSes. NOVA also does not log file data, and instead uses COW, which simplifies garbage collection.
2. Concurrency through per-inode logs and per-CPU inode tables - Per-inode logs ensure that most applications which deal with different inodes would not interfere with one another. Per-CPU inode tables enable high concurrency.
3. Data atomicity via lightweight journaling - Complex atomic updates such as directory operations can be done by using inode logs and a journaling mechanism involving updating log tails.
4. Keep logs in NVMM and indexes in DRAM - This allows for faster searching
5. Efficient garbage collection (GC) - Simplified logs mean simplified GC. Two forms of GC are used in conjunction- fast GC and thorough GC.
6. Shutdown & Recovery - Lazy rebuilds for the radix tree in DRAM to reduce recovery time. Page allocator state is stored in the recovery inode, which allows fast recovery in case of a normal shutdown. Recovery from a failure is also fast because of possible parallelism and short inode logs.
7. Atomic mmap - Allows stronger consistency guarantees with mmap by copying data between actual data pages and replica pages atomically. This is better than dealing with low-level consistency primitives in DAX-mmap.
4. Evaluation
Intel PMEP was used to emulate two different types of NVMs (STT-RAM and PCM) with different characteristics. NOVA was evaluated on Linux kernel 4.0 against seven file systems, including existing NVMM systems, LFSes for HDD and flash, and journaling-based and COW systems. Microbenchmarks were run which demonstrated that NOVA had the lowest latency for create, append and delete operations. Macrobenchmarks consisting of 4 workloads (Fileserver, Webserver, Webproxy, Varmail) were run. NOVA achieved the best performance in almost all cases while providing stronger data consistency guarantees in several cases. Benchmarks were also run to test the garbage collection efficiency under heavy utilization, and it was found that NOVA outperformed the other LFSes in terms of operations/sec and also lasted longer without failure. Benchmarks to test the recovery overhead showed that it took 1.2 msec to recover 50 GB after a normal shutdown, and 116 msec after a failure (an impressive rate of over 400 GB/s).
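As a quick check on the recovery figure quoted above, the effective rate follows directly from the two numbers in the review. The variable names below are just for this example.

```python
recovered_gb = 50            # data set size quoted in the review
failure_recovery_s = 116e-3  # 116 msec after a failure

failure_rate = recovered_gb / failure_recovery_s
print(round(failure_rate))   # 431 (GB/s)
```

This is an effective rate rather than a raw memory bandwidth: as other reviews in this thread note, recovery scans the short inode logs and not the file data itself.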
5. Confusion
1. What is it that has slowed down the adoption of NVMs? Is it the scale of software changes needed to fully adopt them?
2. What exactly are errant stores from the kernel? A brief overview of the NVMM protection section would be nice.
Posted by: Karan Bavishi | March 29, 2017 08:32 PM
1. Summary
Due to its fast performance, Non-Volatile Memory (NVM) needs a different kind of FS to perform effectively. NOVA does this by providing small, non-contiguous, per-file logs and keeping complex lookup structures in DRAM.
2. Problem
Non-volatile memory systems are on the rise and will soon be commercially available. Conventional FSs are designed to work with slow disks. NVM technologies have performance characteristics much closer to that of normal RAM. Additionally, disks also provide atomic writes for a whole sector at once, while NVM is written a word at a time like conventional memories. There are some previous proposals for NVM file systems, but they rely on high-overhead journaling, COW systems, or do not provide good memory consistency guarantees.
3. Contributions
The authors contribute the design and implementation of NOVA, a high-performing FS for use with NVM. The design identifies several ways to tune an FS for use in NVM. They keep complex indexing structures in DRAM so they don’t need to be kept crash consistent. Logs don’t need to be in large contiguous chunks, as NVM has good random-write performance. As a result, they give each inode its own log and structure each log as a linked list. They only use journaling when strictly necessary and never log data updates, using a COW approach for data writes. They also improve the speed of their garbage collector. Stale data pages are reclaimed instantly on write, and they have a no-copy “fast” GC that runs frequently in addition to a slower, “thorough” GC mode.
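The copy-on-write data path with immediate reclamation of stale pages might look roughly like the sketch below. It is a toy model with made-up names (`CowFile`, `write_page`, `_alloc`), not the paper's code, and it ignores partial-page writes and persistence ordering.

```python
class CowFile:
    """Toy copy-on-write file: data pages are never modified in place."""
    def __init__(self):
        self.pages = {}     # page id -> bytes (simulated NVMM data pages)
        self.free = []      # page ids available for reuse
        self.next_id = 0
        self.log = []       # committed (file_offset, page id) entries
        self.current = {}   # file_offset -> live page id

    def _alloc(self):
        if self.free:
            return self.free.pop()
        self.next_id += 1
        return self.next_id - 1

    def write_page(self, offset, data):
        new_id = self._alloc()
        self.pages[new_id] = data           # 1. write the new copy elsewhere
        self.log.append((offset, new_id))   # 2. commit by appending to the log
        old_id = self.current.get(offset)
        self.current[offset] = new_id
        if old_id is not None:              # 3. the old version is now stale,
            del self.pages[old_id]          #    so reclaim it right away
            self.free.append(old_id)

f = CowFile()
f.write_page(0, b"v1")
f.write_page(0, b"v2")   # frees the page that held v1
f.write_page(1, b"x")    # reuses the freed page
print(f.current, f.free)  # {0: 1, 1: 0} []
```

Since the log holds only small metadata entries and stale data pages go straight back to the free list, the garbage collector never has to move file data.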
4. Evaluation
They implement their FS in Linux 4.0, noting it passes the POSIX file system test suite.
They run their tests on Intel’s Persistent Memory Emulation Platform. This is a dual-socket Xeon system that allows access to emulated persistent memory with tunable performance characteristics. Of course, it doesn’t provide actual persistence for the emulated memory.
They evaluate their system against a series of other research FSs proposed for NVM.
First, they measure their performance on a file microbenchmark that creates, appends to, and deletes 10K files. Then they run the FSs on a series of macrobenchmark programs. NOVA does predictably well, especially in cases that are write-heavy with many files.
5. Confusion
The discussion on protection says they make NOVA FS data read only and turn file protection completely off when NOVA needs to make a change. This seems like a hack, and an unsafe one at that. Would this actually matter in a real system?
Posted by: Mitchell Manar | March 29, 2017 08:28 PM
Summary:
In this paper, the authors present a new file system designed for hybrid volatile/non-volatile main memory system with the goals of maximizing performance while providing strong consistency guarantees.
Problem:
Emerging non-volatile memory technologies promise to provide fast, non-volatile, byte-addressable memories. There are proposals to combine DRAM with non-volatile main memories, leading to hybrid volatile/non-volatile main memory systems. Conventional file systems are not suitable for hybrid memory systems because they are designed for disks, which have different performance characteristics and consistency guarantees than NVMs. A new file system is needed that can take advantage of the better speed of NVMMs and can efficiently provide strong consistency guarantees.
Contributions:
The authors present the design and implementation of the NOn-Volatile memory Accelerated (NOVA) file system. NOVA adapts conventional log-structured file system techniques to exploit the fast random access provided by hybrid memory systems. It assigns each inode a separate log to maximize concurrency during normal operation and recovery. Data is not stored in the log, which keeps the logs short, so the recovery process only needs to scan a small fraction of NVMM. Logs need not be in contiguous memory because they are stored as linked lists. To provide atomic log append, NOVA uses atomic updates to a log's tail pointer. The new file system uses lightweight journaling to provide atomicity for directory operations that require changes to multiple inodes. Since the data is not logged, stale pages can be immediately reclaimed, which significantly reduces the garbage collection overhead. To keep search operations fast, NOVA keeps indexes (radix trees) to log and file data (stored in NVMM) in DRAM.
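The atomic append through the tail pointer can be sketched in a few lines of Python. This is a hedged illustration, not NOVA's implementation: the `InodeLog` class and its `crash_before_commit` flag are invented for the example, and on real hardware the commit would be a single atomic store to the tail followed by the appropriate flush and fence.

```python
class InodeLog:
    """Toy log in which an entry becomes visible only once the tail moves past it."""
    def __init__(self):
        self.entries = []  # simulated NVMM log space
        self.tail = 0      # persistent tail: entries[:tail] are committed

    def append(self, entry, crash_before_commit=False):
        # Step 1: write the entry past the tail; readers cannot see it yet.
        del self.entries[self.tail:]   # reuse any leftover uncommitted slot
        self.entries.append(entry)
        if crash_before_commit:
            return                     # simulated power failure
        # Step 2: a single tail update commits the entry.
        self.tail = len(self.entries)

    def committed(self):
        return self.entries[:self.tail]

log = InodeLog()
log.append("create a")
log.append("write a", crash_before_commit=True)
print(log.committed())  # ['create a']
log.append("write a")
print(log.committed())  # ['create a', 'write a']
```

A crash before the tail update leaves the half-written entry outside the committed range, so the log never exposes a partial append.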
Evaluation:
The authors emulated fast and slow NVMMs to study their effect on NVMM file systems. They used one micro-benchmark and four Filebench workloads to compare NOVA against seven file systems. In all the experiments, NOVA is shown to perform better than the other file systems while providing strong consistency guarantees. It would be interesting to see NOVA's performance on real NVMM rather than on an emulated one.
Confusion:
Upon mount, the whole NVMM region is mapped as read only to provide protection. Processor’s write protect control is disabled whenever NOVA needs to write to NVMM. PMFS does the same. Is this the only way to provide protection in NVMs?
Posted by: Neha Mittal | March 29, 2017 07:29 PM