
FlashTier: A Lightweight, Consistent and Durable Storage Cache


FlashTier: A Lightweight, Consistent and Durable Storage Cache. Mohit Saxena, Michael M. Swift, and Yiying Zhang, European Conference on Computer Systems 2012.

Reviews due Tuesday, March 29th.

Comments

Summary : The authors propose the FlashTier architecture, built upon a solid-state cache (SSC), a flash device with an interface designed for caching, together with management software at the operating system layer that directs caching. Because solid-state disks (SSDs) are designed to serve as drop-in disk replacements, they are not well suited for use as a cache; FlashTier addresses these limitations by providing a unified address space, cache consistency guarantees, and the ability to evict data during garbage collection. FlashTier's implementation comprises a cache manager, an SSC functional emulator and an SSC timing simulator. Trace-based evaluation showed improved performance, reduced space consumption for address translation and quicker recovery after a crash.

Problem : FlashTier was motivated by the fact that caching and storage have different behaviors and requirements. a] A hard disk or SSD mostly has a dense address space, whereas a cache (SSC) stores only hot data and so must be optimized for sparse address spaces. b] An SSD-backed cache must persist its metadata to avoid long cache-warming periods after a restart, and existing consistency mechanisms such as barriers apply only to a single device, not across multiple devices. c] SSDs are also challenging because of their limited write endurance, with garbage collection contributing further to wear. FlashTier, whose price and performance lie between DRAM and disk, was designed to overcome these limitations through a unified address space (eliminating the need for a separate mapping table), cache consistency guarantees, and the use of cache state to reduce the cost of garbage collection.

Contributions : FlashTier is suitable for use below a file system, virtual memory manager or database. It mainly involves two components: a cache manager, interposed above the disk device driver in the OS to route requests to either the flash device or the disk, and the SSC, which stores the cached data and assists in managing it.
a] Cache Management - handles requests from the block layer and decides how to service them. The cache manager supports two modes: write-through (the SSC contains only clean data) and write-back (the SSC may contain dirty data that must be written back to disk before eviction).
b] Addressing - the SSC does not expose its own set of addresses; instead it exposes a unified address space, enabling the cache manager to write to the SSC using logical block numbers (LBNs), which the SSC internally maps to physical locations in flash. I like the fact that the authors leverage the mapping table that already exists to support garbage collection, so that FlashTier's cache manager need not store mapping data persistently.
c] Space Management - the SSC exposes three operations, evict, clean and exists, to the cache manager for managing cached data; these speed up garbage collection and reduce the need for over-provisioned blocks. The SSC optimizes for sparseness using a sparse hash map organized into buckets and groups with an occupancy bitmap, giving bounded lookup time. It also maintains a reverse map and a dirty-block bitmap.
d] Crash Behavior - the SSC's state is persistent (using logging, checkpointing and out-of-band writes) and durable, which avoids extended warm-up periods, and it guarantees correctness by never returning stale data (via the six operations: write-dirty, write-clean, read, evict, clean and exists).
e] Silent Eviction - the SSC leverages the behavior of caches by evicting clean data when possible rather than copying it as part of garbage collection. SE-Util and SE-Merge are the two policies used to select victim blocks for silent eviction.
I like the fact that after a failure, the cache manager (in either write-through or write-back mode) performs recovery operations that can overlap normal activity, so recovery is not delayed.
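The two cache-manager modes described in a] can be sketched in a few lines of Python. This is an illustrative toy, not FlashTier's actual interface: `CacheManager`, `FakeSSC`, `FakeDisk` and the method names are stand-ins.

```python
# Illustrative sketch of the two cache-manager write modes; FakeSSC/FakeDisk
# are in-memory stand-ins, not FlashTier's actual device interface.

class FakeSSC:
    def __init__(self):
        self.blocks = {}                      # lbn -> (data, dirty?)
    def write_dirty(self, lbn, data):
        self.blocks[lbn] = (data, True)
    def write_clean(self, lbn, data):
        self.blocks[lbn] = (data, False)
    def read(self, lbn):
        entry = self.blocks.get(lbn)
        return entry[0] if entry else None

class FakeDisk:
    def __init__(self):
        self.blocks = {}
    def write(self, lbn, data):
        self.blocks[lbn] = data
    def read(self, lbn):
        return self.blocks.get(lbn)

class CacheManager:
    def __init__(self, ssc, disk, write_back=False):
        self.ssc, self.disk, self.write_back = ssc, disk, write_back

    def write(self, lbn, data):
        if self.write_back:
            self.ssc.write_dirty(lbn, data)   # disk updated later, on destage
        else:
            self.disk.write(lbn, data)        # disk always has the latest copy,
            self.ssc.write_clean(lbn, data)   # so the cached copy is clean

    def read(self, lbn):
        data = self.ssc.read(lbn)
        if data is None:                      # miss: fetch from disk, populate
            data = self.disk.read(lbn)
            self.ssc.write_clean(lbn, data)
        return data
```

Note how in write-through mode every cached block is clean (silently evictable), while in write-back mode the SSC may hold the only copy.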

Evaluation : To evaluate the FlashTier design, the authors based the cache manager on Facebook's FlashCache, added a trace-replay framework callable from user space, and built an SSC simulator based on FlashSim, which they enhanced to support different mapping data structures, address translations, etc. Two configurations of the simulator evaluated the two silent-eviction policies, SE-Util and SE-Merge. Inter-plane copy was also implemented to balance the number of free blocks across all planes during garbage collection. The extra effort the authors put into implementing these components, with the aim of evaluating each proposed design aspect and policy, is highly commendable. The evaluation focused on three key questions. a] What are the benefits of a sparse unified address space? It showed a 78% reduction in memory usage. b] What is the cost of providing cache consistency and recovery guarantees in FlashTier? It was less than 26 microseconds across workloads, an acceptable overhead. c] What are the benefits of silent eviction for free-space management and write performance? SE-Util had 45% fewer erases and SE-Merge 57% fewer; silent eviction accounted for much of the difference in wear, and write amplification was also reduced. The SSC clearly performed better for write-intensive workloads, while for read-intensive workloads it performed slightly worse due to silent evictions. I appreciate that the authors evaluated their design across appropriate and diverse workloads and scenarios (host and client side, throughput and response times, read-intensive and write-intensive, etc.).

Confusion : 1) I did not quite understand the terms flash package, die, plane and block, or their granularity and relationship. 2) Could you please explain wear leveling and the out-of-band (OOB) area/writes?

1.Summary
This paper describes FlashTier, a system that explores the opportunities for tightly integrating a solid-state caching device into the storage hierarchy. The FlashTier design addresses the limitations of using traditional SSDs for caching.

2.Problem
Flash is an attractive medium for caching because of its relatively low cost and high performance. Further, its persistence enables cache contents to survive crashes and power failures, solving the cold-start problem. However, flash as general storage and flash as cache storage have different behaviors and challenges:
* Address-space indirection with high memory overhead
* Free space management
* Cache consistency and durability (with high consistency cost)

3.Contributions
The overall contribution of this paper is the design and implementation of FlashTier, a block-level caching system suitable for use below a file system, virtual memory manager or database. More specifically, the paper presents novel solutions to the problems identified above:
* Unified Address Space (without additional overheads)
* Cache-aware space management: silent eviction drops clean data rather than copying it, with the cache manager in the OS identifying cold data and the device evicting the least-utilized blocks.
* Crash consistency guarantees, which avoid the extended warm-up period.
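The cache-aware space management bullet can be made concrete with a small sketch of utilization-based victim selection. This is a simplification of the SE-Util idea; the function names and block representation are hypothetical.

```python
# Hedged sketch: silent eviction erases the least-utilized erase block and
# simply drops its clean pages, instead of copying valid pages forward as
# normal garbage collection would. Block/mapping layout is illustrative.

def pick_victim(blocks):
    # blocks: list of dicts with 'valid' (count of valid pages) and 'pages'
    # (the LBNs cached in the block); choose the least-utilized block.
    return min(blocks, key=lambda b: b["valid"])

def silently_evict(block, mapping):
    # Drop clean pages: remove their mappings so later reads become cache
    # misses (served from disk), rather than relocating data to a new block.
    for lbn in block["pages"]:
        mapping.pop(lbn, None)
    block["pages"].clear()
    block["valid"] = 0
```

The point is that the evicted block costs zero page copies; the data is recoverable from disk because it was clean.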

4.Evaluation
The authors lay out the assumptions they make for the system early in the paper and thoroughly evaluate them, clearly stating their methodology.
The authors implement FlashTier by modifying Facebook's FlashCache cache manager and use a modified FlashSim flash timing simulator for their experiments. They compare their implementations of FlashTier using SSC and SSC-V against Facebook FlashCache using an SSD with GC, driven by production server traces (homes, mail, usr and proj). This is a good setup, since the comparison is made against another state-of-the-art system that is available, and the workloads reasonably resemble real-world ones.
FlashTier outperforms the native system in both write-through (38%-79%) and write-back (58%-128%) modes. Further, it consumes less memory, incurs fewer erases (45% fewer, reducing wearout), has lower cache-guarantee costs and recovers faster (2.4 s for 100 GB). Overall, the authors evaluated their system thoroughly.

5.Confusion
* What is the need to propose a caching interface (again)?
* How is the unified address space different from just another lookup table (in terms of the problem it addresses)?

1. Summary
The paper describes FlashTier, a system architecture built upon the solid-state cache (SSC), a flash device with an interface designed for caching.

2. Problem
SSDs are often used as a cache in front of cheap, slow disks, which provides the performance of flash at the cost of disk for large data sets. But the narrow block interface and internal block management of SSDs hinder building a cache on them. Existing SSD-based caches keep in-memory mappings from disk addresses to SSD addresses, handle consistency in complicated ways, and use free-space management schemes that result in poor write performance and wear leveling.


3. Contribution
The authors propose a new design to better use flash as a cache and prototype a block-level caching system. The SSC in FlashTier provides a unified cache address space and a consistent cache interface, through which the cache manager can address blocks directly by their logical block numbers. The SSC uses a sparse hash map to support sparse mappings, and it improves free-space management with silent eviction, which drops clean data rather than copying it to a new block, reducing wear-leveling costs. FlashTier works in two modes, write-through and write-back, with which it provides consistency between cache and disk by preventing reads of stale data. The cache manager is interposed above the disk device driver in the operating system to enforce the cache policies, and it interacts with the SSC to manage cached data. In write-through mode, data is written to the disk and populates the cache either on read requests or at the same time as the write to disk. In write-back mode, data is written to the SSC without updating the disk.
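The sparse hash map mentioned above can be illustrated with a minimal group structure, in the spirit of the sparsehash design of an occupancy bitmap plus a dense array of occupied slots. The layout here is a simplified assumption, not the SSC's actual data structure.

```python
# Illustrative sparse hash group: a bitmap records which of the group's
# slots are occupied, and only occupied slots consume array space. A slot's
# array position is the popcount of the bitmap bits below it, so lookups
# stay bounded while empty slots cost essentially nothing.

class SparseGroup:
    def __init__(self):
        self.bitmap = 0
        self.values = []           # dense array of occupied slots only

    def _rank(self, slot):
        # number of occupied slots strictly before `slot`
        return bin(self.bitmap & ((1 << slot) - 1)).count("1")

    def set(self, slot, value):
        pos = self._rank(slot)
        if self.bitmap & (1 << slot):
            self.values[pos] = value           # overwrite in place
        else:
            self.values.insert(pos, value)     # grow only when slot is new
            self.bitmap |= 1 << slot

    def get(self, slot):
        if not self.bitmap & (1 << slot):
            return None            # unoccupied: no memory was spent on it
        return self.values[self._rank(slot)]
```

This is why a cache holding a sparse subset of disk blocks can keep its mapping far smaller than a dense page-map.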


4. Evaluation
The authors demonstrate the design using a cache manager, an SSC emulator and an SSC timing simulator. Evaluation is done on four real-world traces, in both write-back and write-through caching modes under different FlashTier policies, compared against the original FlashCache. Performance-wise, write-intensive workloads did much better than on the original FlashCache, and FlashTier performed similarly for read-intensive workloads. Total memory usage across device and host is also reduced by nearly 80%. The consistency mechanisms perform much better than previous designs, especially in recovery time. Overall, the evaluation covers both read and write workloads and properly discusses the behavior of the SSC relative to an SSD. They conclude that the SSC with the merge policy performs best, with better wear levelling and consistency and low miss rates.


5. Confusion
Can we have more discussion of the optimizations due to SSC-R and SSC-Util? Also, has this been adopted in industry?

1. Summary
The authors introduce FlashTier, a block-level caching system built on top of a solid-state cache to make caching on solid-state devices better in terms of speed and consistency.

2. Problem
SSDs deployed as caches are hindered by their narrow block interface and internal block management. The authors classify the problem along three axes: address-space management with two levels of indirection; consistency and durability, which otherwise require a long warm-up after boot or crash; and free-space management, where garbage collection gives low write performance and endurance for caching workloads. FlashTier addresses these concerns.

3. Contributions
There is no need to persistently store mappings for clean data. The cache manager offers two operation modes: write-back for higher performance and write-through for higher safety, with silent eviction enabling write-through caching of clean data. For write-back, three operations manage dirty and clean data: evict, clean and exists. The SSC does not commit to a fixed capacity. It never returns stale data and preserves the flash-to-disk mapping after a crash. A large unified cache address space lets the cache manager address the SSC by logical block number, while the device internally maps disk addresses to physical locations; the sparse hash map is fully associative and returns the physical flash page number. The out-of-band area of each page efficiently stores statistics that guide the wear-levelling and eviction policies and is written simultaneously with the data. A dirty-block bitmap supports consistent cache storage. Cache-aware free-space management uses the garbage collector to select a flash plane to clean and incurs no copy overhead for rewriting valid pages.
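The out-of-band idea above can be sketched roughly as follows. The exact OOB contents and names are assumptions for illustration: the point is that per-page metadata written alongside the data lets the device rebuild its maps after a crash.

```python
# Illustrative sketch: each flash page program also records metadata in an
# out-of-band (OOB) area (here, the reverse mapping LBN and a dirty flag),
# so a scan of the OOB areas can reconstruct device state after a crash.

def program_page(flash, ppn, data, lbn, dirty):
    # The OOB fields are written in the same operation as the page data.
    flash[ppn] = {"data": data, "oob": {"lbn": lbn, "dirty": dirty}}

def rebuild_state(flash):
    # Scan OOB areas to reconstruct the forward map and the dirty set.
    mapping, dirty = {}, set()
    for ppn, page in flash.items():
        mapping[page["oob"]["lbn"]] = ppn
        if page["oob"]["dirty"]:
            dirty.add(page["oob"]["lbn"])
    return mapping, dirty
```

A real device would combine this with checkpoints and a log so it need not scan all of flash, but the OOB data makes the mapping recoverable in principle.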


4. Evaluation
The evaluation is very thorough in supporting all the design decisions, and the results are duly justified. FlashTier performed 38-167% better than a native SSD with both of its policies, SE-Util and SE-Merge. The system takes less time for garbage collection, and recovery after a crash is significantly faster (57%) thanks to the checkpointing and logging system, even with the low overhead of maintaining consistency data structures. Only for read-intensive workloads were the erases higher. FlashTier uses 89% less host memory, since it does not store any data for clean blocks. The simulation is carried out in FlashSim with a kernel module based on Facebook's FlashCache cache manager. In my opinion, the advantages of the unified address-space management and free-space management mechanisms were well highlighted in the authors' extensive evaluation.

5. Question
More details on caching with TLC (triple-level cell) SSDs, and how FlashTier adapts to different SSD hardware types when optimizing block management.

Summary
The paper discusses FlashTier, a flash-storage-based caching system that provides silent eviction, cache consistency guarantees and sparse address-space management. The FlashTier design relies on two components – the Solid-State Cache (SSC), a flash device dedicated to caching, and the Cache Manager, a software component that resides below the block layer and interacts with the SSC.
Unlike an SSD (Solid-State Drive), most of the work (block management, block address translation) in FlashTier is performed internally in the SSC, allowing a simple and consistent cache interface to the Cache Manager.

Problem
While flash storage was well-suited for use as a cache to disks due to its cost, performance and persistence properties, existing system designs that used flash-storage backed disk caches treated the flash storage in the same vein as a regular disk (SSD), instead of making special considerations for it as a disk cache. However, fundamental differences existed in the expected behavior of a disk and a cache. The existing interface and internal block management of these designs did not allow efficient and reliable implementations to handle the cache aspects of durability, persistence and consistency. Also, frequent garbage collection in an SSD would lead to write amplification that gave poor performance and increased wear management.

Contribution
The authors introduce the concept of a Solid State Cache, a device that is optimized for using flash storage as a cache for disks. The SSC uses the following features -

1. Unified address space – This feature unifies the address space translation and the block state between the OS and the SSC. The sparse hash map stores mappings from the disk block address to the physical flash page number. Additional state is maintained (for garbage collection, wear leveling) in the Out Of Band (OOB) region of each flash page.

2. Free space management – To prevent write amplification, FlashTier uses silent eviction and eviction policies such as SE-Util or SE-Merge. Silent eviction discards cache data when possible rather than relocating it during garbage collection. While both SE-Util and SE-Merge choose the most under-utilized erase-block for eviction, SE-Merge allows the evicted erase-block region for both log and data blocks, while SE-Util restricts this newly freed region for data blocks only.

3. Consistent interface – The consistent cache interface exposed by the SSC to the Cache Manager allows it to persist cached data across a system crash and to ensure that it never returns stale data. The interface consists of the write-clean, write-dirty, clean, evict, read and exists operations. SSC uses logging, checkpoints and other out-of-band writes to persist the internal data kept in the device memory and to perform faster crash recovery.

The FlashTier design also consists of the Cache Manager software in the OS that interacts with the SSC through its interface. The Cache Manager supports both write-back and write-through techniques and implements the crash recovery mechanism.
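The six-operation interface described above can be summarized as a toy Python class. The signatures and in-memory representation are illustrative, not the device's real command set; the intent is to show the semantics each operation carries.

```python
# Hedged sketch of the six SSC operations: write-dirty, write-clean, read,
# evict, clean and exists. Internal state is a plain dict plus a dirty set.

class SSCInterface:
    def __init__(self):
        self.data, self.dirty = {}, set()

    def write_dirty(self, lbn, blk):
        # The cache may hold the only copy; must be durable before returning.
        self.data[lbn] = blk
        self.dirty.add(lbn)

    def write_clean(self, lbn, blk):
        # Disk also has this block, so it may be silently evicted later.
        self.data[lbn] = blk
        self.dirty.discard(lbn)

    def read(self, lbn):
        # None models the not-found error, letting the cache manager keep
        # an imprecise record of cache contents and fall back to disk.
        return self.data.get(lbn)

    def evict(self, lbn):
        self.data.pop(lbn, None)
        self.dirty.discard(lbn)

    def clean(self, lbn):
        # Metadata-only: the block is now also on disk, hence evictable.
        self.dirty.discard(lbn)

    def exists(self, start, end):
        # Report dirty blocks in [start, end); used for crash recovery.
        return sorted(l for l in self.dirty if start <= l < end)
```

The write-back cache manager leans on `exists` after a crash to relearn which blocks still need destaging, and on `clean` to downgrade blocks once they reach disk.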

Evaluation
The authors perform a thorough evaluation of the different aspects of the FlashTier implementation against those of the native device (the FlashCache implementation). For each evaluation, they compared these cases – Native, SSC (the FlashTier implementation with the SE-Util policy) and SSC-R (SSC with SE-Merge). The SSC and SSC-R were implemented with write-through and/or write-back policies. The evaluations were run on workloads with different I/O traces and different mixes of reads and writes. The authors first demonstrated that FlashTier performed the same as a native system for read-intensive workloads and better for write-intensive workloads. The device and host memory overhead was then evaluated for FlashTier and the native implementation, which gave mixed results. The consistency overhead of FlashTier for persisting both clean and/or dirty blocks was then found to be lower than that of a native implementation. Also, the crash recovery time of FlashTier beats that of a native implementation due to the presence of checkpoints. FlashTier's silent eviction mechanism ensured that the SSC and SSC-R implementations performed better than their native counterpart in garbage collection and wear management. However, the focus of SSC eviction on erase-block utilization instead of recency caused a slight reduction in the cache hit rate, compared to a native system.

Overall, I believe that all design aspects of FlashTier were sufficiently evaluated to provide confidence in its performance as a flash-storage based caching system.

Questions/Confusion
1. Flash storage terminology of dies, planes, blocks, out-of-band plane regions.
2. Hybrid layer mapping.


Summary:
Existing SSDs are designed to be drop-in disk replacements and hence are a mismatch when used as a cache.
FlashTier is a system architecture built upon the SSC, a flash device with an interface designed for caching. Management software at the operating system block layer directs caching. FlashTier provides:
1) Unified logical address space
2) Cache consistency
3) Silent eviction during garbage collection to improve performance.
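The consistency point in 2) amounts to the guarantee that an evicted block can never be read stale. A toy sketch (illustrative names; a dict stands in for the logical-to-physical map):

```python
# Illustrative: once evict() removes the mapping, a read can only miss;
# the old flash copy is unreachable through the interface, so the cache
# manager is forced to fetch the current data from disk.

class TinySSC:
    def __init__(self):
        self.map = {}              # lbn -> cached data (stands in for the
                                   # logical-to-physical flash mapping)
    def write_clean(self, lbn, data):
        self.map[lbn] = data
    def evict(self, lbn):
        self.map.pop(lbn, None)    # drop the mapping, not just the data
    def read(self, lbn):
        return self.map.get(lbn)   # None = miss; manager must go to disk
```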

Problem: Flash caching promises an inexpensive boost to storage performance. Traditional SSDs are designed as drop-in disk replacements and do not leverage the unique behavior of caching workloads, such as large, sparse address spaces and clean data that can safely be lost. Traditional SSDs also do not efficiently persist cached data across reboots.

Contribution:
1) Address Space Management : the SSC is optimized for storing sparse address spaces. It does not have its own address space; it exposes a unified address space, so the cache manager can write to the SSC using logical block numbers, which the SSC maps internally. This has a major benefit: the cache manager need not store the mapping table persistently, because this functionality is provided by the SSC.
2) Free Space Management : the SSC also maintains block state, such as usage statistics and the state of flash blocks for garbage collection, accessed by physical address. It likewise supports reverse-map structures for faster garbage collection and eviction.
3) New interfaces : evict, clean and exists help improve SSC performance. An evict or clean operation, when issued by the cache manager, just updates metadata; the silent-eviction mechanism can then drop the affected blocks in a later collection cycle. There is no copy overhead, since the mechanism evicts only clean data. Regular garbage collection happens only when there are not enough candidate blocks for silent eviction.
4) Two policies for silent eviction: SE-Util and SE-Merge.
5) Consistent Interface: FlashTier never returns stale data and persists data across reboots, which greatly improves performance by reducing cache warm-up time. The new interface provides precise guarantees about the consistency of both cached data and mapping information.
6) Recovery: recovery times are very fast even though FlashTier reduces write latency with logging, because the logs are checkpointed and only the log records after the last checkpoint need to be replayed after a crash.
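The checkpoint-plus-log replay idea in 6) can be sketched as follows. This is a simplified model: the record format (sequence number, LBN, physical page or None for an eviction) is an assumption for illustration.

```python
# Illustrative checkpoint + log recovery: start from the mapping saved in
# the last checkpoint and replay only the log records appended after it,
# instead of scanning all of flash.

def recover(checkpoint, log, checkpoint_seq):
    mapping = dict(checkpoint)             # state as of the checkpoint
    for seq, lbn, ppn in log:
        if seq > checkpoint_seq:           # only post-checkpoint records
            if ppn is None:
                mapping.pop(lbn, None)     # an eviction was logged
            else:
                mapping[lbn] = ppn         # a (re)mapping was logged
    return mapping
```

Recovery time is thus bounded by the log tail since the last checkpoint, not by device capacity, which is why it can be seconds even for a large cache.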


Evaluation:
The costs of the new features (the sparse unified map, cache consistency, fast recovery, silent eviction and wearout) have all been extensively evaluated in this paper. The implementation comprises a cache manager, an SSC functional emulator and an SSC timing simulator. The cache manager and SSC functional emulator are Linux kernel modules, and the timing simulator models the completion time of each request.
The SSC is emulated with parameters taken from the latencies of a third-generation Intel 300-series SSD. The FlashTier system is compared against a native system that uses the unmodified Facebook FlashCache cache manager. Four real-world workloads (homes, mail, usr and proj) are used.
First, overall performance is measured in both write-through and write-back modes. Read performance is identical, but write performance is better due to FlashTier's better garbage collection mechanism.
The memory overhead is measured, but there is no concrete explanation of why SSC-R consumes 160% more device memory while SSC consumes only 11% more. I understand this is due to logging overhead, but more explanation would have been better.
The consistency cost is measured in terms of no-consistency IOPS, and FlashTier is seen to be marginally better than FlashCache. Recovery time is shown to be much better, which is intuitive given FlashTier's checkpointing and its consistency and persistence across reboots. Garbage collection cost is measured in terms of performance, and wear management is measured for all workloads. Also, cache misses are shown to be higher in FlashTier because its eviction policy is not based on recency.
More explanation of the benefits and drawbacks of SSC-R versus SSC would have been welcome, but overall every feature has been evaluated to showcase its benefits and drawbacks.

Confusion:
How does SE-Util maintain a constant log size if it does not make the garbage-collected blocks available for use as log blocks? Overall, what is the benefit of SE-Util?
Is this just a prototype? Do we have anything like the SSC in industry today?

1. Summary
This paper discusses a new caching architecture called FlashTier, aimed at addressing the limitations of existing SSD caches; it consists of a new solid-state device optimized for caching and an operating system layer that manages and directs caching.

2. Problem
Existing SSDs are optimized to replace disks as persistent storage devices and hence are not well suited for use as a disk cache. SSD-based caches suffer from the following limitations, owing to the differing characteristics of storage and caches. They incur extra memory overhead to store the additional address-space translations that convert disk logical addresses to SSD logical addresses. Garbage collection is not cache-aware and fails to exploit the fact that clean data can be silently evicted rather than copied, hurting cache performance. And warming the cache after a reboot takes time, so the cache should be made durable while maintaining consistency.

3. Contributions
The main contribution of the paper is the system architecture of FlashTier, which exploits caching behavior and requirements and proposes a new solid-state device (the SSC), optimized for caching and exposing a new interface that a cache-management layer in the OS uses to direct caching. i) FlashTier provides unified address-space management that maintains a single level of indirection between disk logical addresses and flash locations in the SSC, where previously the mapping was split between the cache manager and the SSD firmware. This address mapping is optimized for the sparseness of cached blocks and uses an existing sparse hash map structure. The SSC also maintains block state in the out-of-band area of each flash page and a dirty-block bitmap for dirty pages in an erase block. ii) FlashTier provides a new crash-consistent interface for the device, reflecting the guarantees that the SSC never loses dirty data and never returns stale data due to inconsistent mappings, thus making cache management effective. Internally, the SSC relies on a combination of logging, checkpointing and out-of-band writes to keep its persistent data consistent. iii) The new interface enables silent eviction of clean blocks, with separate policies guiding the selection of victim blocks.
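A back-of-the-envelope model shows why the silent eviction in iii) helps write performance. This is a simplification, not the paper's model: consider cleaning one erase block of P pages of which V are still valid.

```python
# Simplified write-amplification arithmetic for cleaning one erase block:
# normal GC must rewrite the V valid pages, so servicing the (P - V) new
# writes costs P page programs in total; silent eviction drops the valid
# (clean) pages and pays only for the new writes themselves.

def write_amplification(pages_per_block, valid_pages, silent_eviction):
    new_writes = pages_per_block - valid_pages      # free slots gained
    copies = 0 if silent_eviction else valid_pages  # relocation cost
    return (new_writes + copies) / new_writes
```

For example, with 64 pages per block of which 48 are valid, copying gives an amplification of 4x, while silent eviction gives 1x; the gap widens as blocks get fuller, which is exactly the regime of a nearly full cache.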

4. Evaluation
The evaluation of FlashTier is done using real-world workloads consisting of a mix of read-intensive and write-intensive traces. The experiments show performance improvements for write-heavy workloads in both write-back and write-through modes when compared to a native system with an SSD cache. They also show the benefit of the unified address-space management, which reduces host memory overhead significantly at the cost of a slight increase in device memory consumption. The evaluations show that the cost of maintaining a consistent cache is small when weighed against the performance benefit of a much-reduced crash recovery time, compared to native SSD caches. They also bring out the benefit of silent data eviction in garbage collection, showing improved performance over a native SSD cache, along with reductions in write amplification and the number of erase operations. Overall, the experiments evaluate the performance benefits of each design choice and are very thorough.

5. Confusion
What is a flash plane? Is it the same as an erase block?

1. Summary
This paper describes a new design for controllers for solid-state memory, and for the associated portions of the operating system, that gives better performance when a solid-state drive is used as a cache. In particular, the design takes into account the sparse usage of the drive and the possibility to free space by evicting clean pages.

2. Problem
When solid state drives were first introduced, people assumed that they would be used in place of hard drives. This meant that the controller, drivers, and kernel modules for solid-state drives were designed for that use case. However, many solid-state drives were instead used as caches, since, although they were slower than RAM, they were faster than hard drives and still persistent. In these use cases, the drives are rarely filled and data can be safely evicted without having to worry about copying it. This allows faster performance and also less wear upon the solid-state drives.

3. Contributions
The system described in this paper in part modifies the cache manager, the software responsible for managing the transfer of data from solid-state drive to hard drive. The manager has two modes of usage: write-through and write-back. In write-through mode, it always writes data to the hard disks, using the solid-state drive only as a local copy. In write-back mode, the cache manager must actively manage the contents of the solid-state cache since it may write data only to the cache and not to disk.

To implement this, the paper describes a new interface for a solid-state cache, with six operations. Write-dirty and write-clean both write data to the cache, but only write-dirty guarantees that the data in the cache is durable before returning, since data written with write-clean can always be recovered from the hard disk. A read operation can return a data-not-found error, which allows the cache manager to maintain an imprecise record of the cache. An evict operation implements the removal of data from the cache, while a clean operation allows the cache manager to indicate that data is also present on the hard disk. Finally, to allow the cache manager to recover the list of dirty blocks after a crash, an exists operation provides a list of dirty blocks in a range.
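The six-operation interface described above can be sketched as a small in-memory class. This is a minimal illustration of the semantics, not the authors' code; all names and the dictionary-based storage are assumptions.

```python
# Toy sketch of the SSC's six-operation cache interface.
class SSC:
    def __init__(self):
        self.data = {}    # block number -> bytes
        self.dirty = {}   # block number -> True if dirty

    def write_dirty(self, blk, buf):
        # Dirty data must be durable before returning (modeled as a store).
        self.data[blk] = buf
        self.dirty[blk] = True

    def write_clean(self, blk, buf):
        # Clean data can always be re-read from disk, so durability is relaxed.
        self.data[blk] = buf
        self.dirty[blk] = False

    def read(self, blk):
        # May report "data not found"; the cache manager then reads the disk.
        return self.data.get(blk)

    def evict(self, blk):
        # Remove the block from the cache entirely.
        self.data.pop(blk, None)
        self.dirty.pop(blk, None)

    def clean(self, blk):
        # Cache manager indicates the block is now also on disk.
        if blk in self.dirty:
            self.dirty[blk] = False

    def exists(self, lo, hi):
        # List dirty blocks in [lo, hi): lets the manager recover after a crash.
        return sorted(b for b, d in self.dirty.items() if d and lo <= b < hi)
```

A `read` returning `None` here plays the role of the data-not-found error, which is what permits the cache manager's record of cache contents to be imprecise.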

In addition, the design includes a unified address space for both the hard disk and the solid-state cache. This removes a level of indirection, since solid-state drives must always maintain a mapping from addresses to physical locations in order to avoid wear.

4. Evaluation
The authors evaluate this work on four traces of real-world usage, from an email server, a file server, a small data center, and project directories. They simulate their design using both a functional emulator and a timing simulator and compare it to an unmodified cache manager. They find that their system performs faster than the native system on write-intensive workloads and nearly the same on read-intensive workloads. They also find that their simulator consumes more drive memory, but less host memory, and that it always performs fewer erasures than the native system. They also compare the recovery time after an induced crash, finding that it is much faster than for the native system. All of this evaluation seems to demonstrate well the claims made in the paper.

5. Confusion
Has this interface design been used in any actual solid-state drive?

1. Summary
The availability of high-speed solid-state storage has added a new tier in the disk storage hierarchy by providing low latency and high IOPS to cache data in front of high-capacity disks. This paper presents FlashTier, a system architecture built on a solid-state flash device providing an interface for caching, with a cache manager at the OS layer that implements caching policies over the flash device.

2. Problem
Flash/SSD is an attractive technology for caching as its performance is between DRAM and disk and it has the property of persistence to survive crashes. In the FlashTier design, the authors address the limitations of using traditional SSDs for caching, which arise primarily because caches behave differently from storage: data in a cache need not be durable, a cache stores data from a different address space, and a cache must ensure it does not return stale data (consistency guarantees). Building a cache on an SSD is further hindered by the narrow block interface and internal block management of SSDs.

3. Contributions
The main contributions of the paper are the ideas of changing the interface and internal block management of SSDs to provide effective caching, and the design of a cache manager that migrates data between the flash caching tier and the disk. FlashTier exploits the features that differentiate caching workloads from storage workloads to improve over traditional SSD-based caches. It provides a unified address space, allowing data to be written to the SSC at its disk address; this removes the need for a separate disk-to-SSD address translation and uses a data structure optimized for a large, sparse address space (mapping block numbers to locations on flash). FlashTier provides cache consistency guarantees to ensure correctness after a system crash or failure, guaranteeing that the cache will not return stale data for both clean and dirty data, and supporting both write-through and write-back caching. They provide six interface operations (write-clean, write-dirty, read, evict, clean, and testing with exists) and use a combination of checkpointing, logging and out-of-band writes to ensure persistence. They also propose silent eviction alongside garbage collection for free-space management in the cache, implemented through the SE-Util and SE-Merge policies.
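The unified, sparse address space can be illustrated with a toy map keyed directly by the disk's logical block number (LBN), so only resident blocks cost memory. This is a hedged sketch with assumed names; the real SSC uses a memory-optimized sparse hash map at roughly 8.4 bytes per entry, not a Python dict.

```python
# Toy sketch of the unified sparse address space: the cache manager
# addresses the SSC by disk LBN, and the SSC keeps one sparse map
# from LBN to flash location. No separate disk->SSD table is needed.
class SparseMap:
    def __init__(self):
        self.m = {}  # disk LBN -> physical flash page

    def insert(self, lbn, flash_page):
        self.m[lbn] = flash_page

    def lookup(self, lbn):
        # None means "not cached": the cache manager falls back to disk.
        return self.m.get(lbn)

    def remove(self, lbn):
        self.m.pop(lbn, None)
```

Because the map is keyed by LBN, a cached block with a very large disk address costs no more than one with a small address, which is the point of optimizing for sparseness.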

4. Evaluation
The authors have done a comprehensive evaluation of their design choices and provide reasons for the experimental results. The costs and benefits of FlashTier’s design components were compared against traditional caching on SSDs. The three components of the evaluation were: the benefit of a sparse unified cache address space, the cost of providing cache consistency guarantees, and the benefit of silent eviction. The authors evaluated these on both read- and write-intensive workloads. The SSC showed significant performance benefits in both write-back and write-through modes on write-based loads, and identical performance on read loads. FlashTier consumes more device memory (11% for SSC, 160% for SSC-R), although it consumes less host memory. FlashTier also performs about 50% fewer erases for write-intensive workloads. The overhead of maintaining consistency and persistence is around 18-29%. The recovery time for FlashTier varies from 34 ms for small caches to 2.4 s for a 100 GB cache, which is much faster than native SSD caches. FlashTier outperforms native SSDs on write-based workloads in garbage collection due to silent eviction, and also improves the reliability of the flash device by decreasing the number of blocks erased (by around 26-35%).

5. Confusion
Can the cache hierarchy be improved by adding a software based TLB in OS?
Could you also explain about the policies used for free space management in the cache manager?

Summary
This paper describes the design of a solid-state cache (SSC), a flash device that modifies the interface to conventional SSDs for precise caching support. It then proposes a new system architecture known as FlashTier that can leverage the SSC to improve performance and provide better cache consistency & crash resiliency.

Problem
SSDs guarantee low latency & high IOPS speed and provide a faster storage device. In fact, they are now being extensively used as a caching device paired with large slow disks to provide fast access to large datasets at a much reduced cost. However, using SSDs as a drop-in disk replacement for caching does not leverage the unique behavior of caching workloads and hence, their performance has been suboptimal. This paper aims to achieve better performance by proposing a SSD-backed caching solution that specifically targets the unique requirements of caching workloads.

Contributions
In my opinion, the following are the novel contributions of this paper:
(1) The paper distinguishes the unique properties of a caching device from a typical storage device and highlights the three important design goals of an SSD-backed caching solution: address space management, free space management, and a consistent interface.
(2) Existing caching solutions over SSDs are primarily software-only systems limited by the narrow storage interface of a typical SSD. This paper takes the radical approach of proposing a new solid-state cache (SSC) device that alleviates the problems with the underlying interface altogether.
(3) Using an optimized sparse hash map, the paper discusses how a unified address space can be used to simplify cache management.
(4) By providing a consistent cache interface with explicit semantics for read/write and eviction, FlashTier is able to optimize for both write-back and write-through cache guarantees.
(5) Through a combination of logging, check pointing and out-of-band writes, Flashtier is able to provide faster recovery and better crash resiliency.
(6) The paper proposes a novel silent eviction mechanism with its related policies (SE-Util & SE-Merge) that not only provide high write performance but also extend the lifetime of the SSD device.

Evaluation
To evaluate the various design ideas presented in the paper, the authors have implemented the three components of FlashTier- cache manager (based on FlashCache), SSC functional emulator and SSC timing simulator (based on FlashSim). This implementation is then evaluated against a native system, using the unmodified FlashCache with FlashSim, through four real-world workload traces (homes, mail, usr, proj).
For write-intensive workloads, the FlashTier system outperforms the native system by 59-128% in write-back mode and 38-79% in write-through mode, while for read-intensive workloads the performance of the two systems is almost identical. Although FlashTier consumes more device memory, it consumes 89% less host memory, leading to a 60% reduction in address translation space. Although there is some additional cost of consistency for write-intensive workloads, FlashTier is able to recover from the crash of a 100 GB cache in only about 2.4 seconds. Due to silent eviction, FlashTier performs about 45-57% fewer erases on a write-intensive workload, improving performance by up to 167%.
Overall, it was a good evaluation, as the paper clearly highlighted the three key design goals and built all the experiments around them, presenting precise measurements for each of the evaluated metrics. The introduction section of the paper mentions that traditional SSDs at Facebook are used to provide access to petabytes of data. Therefore, I am curious how the sparse hash map design scales with the size of data it can address, and it would have been interesting to see a discussion of this in the paper.

Confusion
Could we briefly go over the internals of SSDs in class, for example, the details of planes and dies, and how addressing and read/write/erase cycles in SSDs differ from traditional mechanical disks?

1. Summary
This paper describes FlashTier, a system architecture that provides a new flash device, the solid-state cache (SSC), which has an interface designed for caching. The authors first identify three shortcomings of using traditional SSDs for caching and propose a new design that takes into account the fact that caching and storage have different behaviour and requirements. Finally, the authors evaluate their proposed design by implementing an SSC simulator and a cache manager in Linux. Results show that the proposed design is more effective than traditional caching on SSDs.
2. Problem
Flash is an attractive technology for caching because its price and performance are between DRAM and disk. However, the narrow block interface and internal block management of SSDs hinder building a cache upon a standard SSD. The current practice of using SSDs as caches is not effective: the SSD is designed to be a drop-in disk replacement and does not take into account the fact that caching and storage have different requirements. As a result, the current practice suffers from memory overhead due to two levels of indirection; long recovery times after a crash, because the mapping must be persisted along with the cached data and consistency must also be guaranteed; and a high cost of garbage collection. The authors recognize the difference in requirements between caching and storage and propose a solution tailored for caching.
3. Contribution
The main contributions of the authors are the three design goals of the proposed solution that tackle the limitations of caching on SSDs. First, the authors propose a unified address space and block state shared between the OS and the SSC, optimized for caching via a sparse hash map. This reduces the cost of cache management by eliminating the extra level of indirection, and reduces memory requirements thanks to the sparse hash map. Second, the authors propose a consistent cache interface. This ensures that stale data is never returned (via the six operations: write-dirty, write-clean, read, evict, clean and exists) and that cached data persists across a system crash (via logging, checkpointing and out-of-band writes). Lastly, the authors propose a new form of free space management that ensures high write performance by leveraging the semantics of caches for garbage collection. They use the idea of silent eviction, which reduces the copy overhead of rewriting valid pages, and give details of two possible policies (SE-Util and SE-Merge) for selecting victim blocks for eviction. The cache manager supports both write-back and write-through caching modes. In my opinion, the proposed solution works because it is aware of its use case: caching.
4. Evaluation
The authors implemented their proposed solution by means of a simulator, focused the evaluation on verifying the effectiveness of the three main design goals, and compared the proposed solution to a native system. The evaluation of the two systems revealed that the proposed solution outperforms the native system in all cases in terms of performance, memory consumption and wear-out. A 78% reduction in total memory usage (host and device combined) confirms the effectiveness of the unified address management. The extra consistency cost in request response time is less than 26 μs for all workloads, which shows that cache consistency is achieved at an acceptable cost. The authors also evaluate the effectiveness of the free space management; results show it improves performance by up to 167%. I feel the authors have done a commendable job of carrying out an extensive evaluation of their proposed solution using a diverse set of workloads.
5. Confusion
Could more details about how SSDs work be discussed in class?

Summary:
The paper describes a new caching system architecture called FlashTier, motivated by the introduction of SSDs into the storage hierarchy as caches. It was implemented and evaluated using the FlashSim simulator and a cache-management software layer in the kernel.

Problem:
SSDs were designed as a replacement for disks and hence were not meant for caching. If SSDs are used as a cache for disks, the mapping between flash addresses and logical disk blocks needs to be maintained durably across crashes, which is an additional overhead. Crash consistency and the need to reduce garbage collection in flash-based systems are other current issues. The proposed kernel-level cache manager that uses the SSC as a cache helps avoid these overheads and consistency issues.

Contributions:
The Solid State Cache (SSC) device maintains cached copies of logical blocks. The paper proposes a modified flash device (SSC) that provides a CONSISTENT CACHE INTERFACE to the OS cache manager through six operations: testing with exists, reads, block cleaning, eviction, and writing data in two modes (clean and dirty). The SSC can track dirty blocks and silently perform evictions itself. The SSC is addressed by disk block numbers, and a sparse hash map data structure is stored in the SSC, providing a UNIFIED ADDRESS SPACE. SSCs permit write-back and write-through caching and maintain dirty/clean information about blocks to enable silent cache evictions, reducing write latency and flash-device wear. The SSC also provides crash consistency using traditional logging and checkpointing techniques to reduce recovery time (by replaying logs), and defines consistency guarantees for clean and dirty data. The mapping of data blocks is made persistent in the SSC, which removes the slow cache-warming issue after crashes and reboots. The paper also discusses the SE-Util and SE-Merge policies, which determine data/log block management as well as block eviction. In effect, the design unifies the address space across the kernel cache manager and the SSC, provides silent eviction in the SSC, and offers durability for clean and dirty data with fast crash recovery.
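The exists operation mentioned above is what lets the cache manager rebuild its list of dirty blocks after a crash. A hypothetical sketch of that scan, with all names and the chunked range query as illustrative assumptions:

```python
# Sketch: rebuild the dirty-block list after a crash by scanning the
# address space in chunks with the SSC's exists operation, which
# reports the dirty blocks within a requested range.
def recover_dirty_list(exists, total_blocks, chunk=1 << 20):
    """exists(lo, hi) -> sorted dirty block numbers in [lo, hi)."""
    dirty = []
    for lo in range(0, total_blocks, chunk):
        dirty.extend(exists(lo, min(lo + chunk, total_blocks)))
    return dirty

# Fake device state standing in for a real SSC, for illustration only:
device_dirty = {10, 2_000_000, 3_500_000}

def fake_exists(lo, hi):
    return sorted(b for b in device_dirty if lo <= b < hi)
```

Because only dirty blocks must be recovered exactly (clean blocks can be re-fetched from disk), this scan is all the host-side state reconstruction a write-back cache manager needs.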

Evaluation:
This design has been evaluated against the FlashSim SSD simulator and the Facebook FlashCache cache manager, using four different real-world workloads. FlashTier reduces the memory required for address translation by 60% relative to an SSD cache, and by 78% for the combination of host and device. The paper also evaluates the cost of providing consistency guarantees and of silent cache eviction. The SSC performs better than the SSD in cache eviction by 34-52%, and the SSC-R by 71-83%; FlashTier outperforms the SSD in cache performance by up to 167%. Thanks to logging and checkpoints, FlashTier can recover from the crash of a 100 GB cache in 2.4 seconds. The best part of the evaluation is that it addresses each of the contributions separately, as it should. Write workloads also performed better due to silent eviction and SE-Merge. The additional cost of providing consistency guarantees was found to be less than 26 microseconds (a 3-5% increase) on all workloads.

Why was the FlashSim SSD simulator+Facebook FlashCache used for comparison? Why not build a similar SSD based system and use the Facebook FlashCache to run benchmarks and evaluate this system against those benchmarks?

Doubts
What are the interfaces that SSDs provide and are there benefits of adding new interfaces. If yes, why don’t the device manufacturers provide them initially? Is the wearing out of flash memory considered to be a serious concern in large production systems?

1. Summary
This paper describes FlashTier, a system architecture that tightly integrates solid-state block cache (SSC) into the storage hierarchy. The authors delineate a software component at the OS layer called cache-manager that is responsible for migrating data between the flash and disk tiers; and a solid-state block level cache that has been specially designed for caching.

2. Problem
SSDs are deployed as caches in front of cheap, slow, high-capacity disks to obtain the high performance and low-latency access of flash at the cost of disk drives. Use of SSDs as caches is hindered by their narrow block interface and internal block management, as most of them are designed to be drop-in disk replacements. There are three inefficiencies associated with an SSD block cache:
->> Address Space Management: Address translation requires two levels of indirection. One mapping for disk to SSD and another by the Flash Translation Layer (FTL) for logical to physical address.
->> Consistency and Durability: Handling crash failures is expensive and it requires the maintenance of additional structures.
->> Free Space Management: Low write performance and endurance due to garbage collection for caching.

3. Contributions
The primary contributions of FlashTier is the system design that includes a solid-state block-level cache (SSC) and the cache manager. This design has three key aspects:
->> Address Space Management: FlashTier unifies the address space and cache block state split between the cache manager and firmware in SSC. The SSC is optimised for sparseness by using a special sparse hash map data structure (LBN -> PBA) that grows with the actual number of entries and uses approximately 8.4 bytes per entry. Additionally, the block state is stored in OOB area of each flash page. This is used to guide wear levelling and eviction policies. Also, a dirty-block bitmap is used to track dirty pages within the erase block.
->> Consistent Cache Interface: This helps in persisting cache contents across a reboot or crash and ensures that stale data is never returned from the cache. It is provided by six interfaces (read, write-clean, write-dirty, clean, evict, exists). Persistence is provided by a combination of logging, checkpointing and recovery.
->> Free Space Management: This is achieved by combining silent eviction with eviction policies. Silent eviction integrates cache replacement with garbage collection: the garbage collector selects the top-k victim erase blocks based on a policy, which ensures that FlashTier does not incur any overhead for rewriting valid pages. The SE-Util policy selects the erase blocks with the smallest number of valid pages to create erased data blocks. The SE-Merge policy employs the same strategy for selecting candidate victims but allows the erased blocks to be used for either data or log blocks.
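The victim-selection idea behind SE-Util, and the key difference between silent eviction and ordinary garbage collection, can be sketched briefly. This is a minimal illustration with assumed names, not the device firmware.

```python
# SE-Util-style victim selection: pick the k erase blocks with the
# fewest valid pages, since they free the most space per erase.
def se_util_victims(valid_pages, k):
    """valid_pages: dict erase_block_id -> count of valid pages."""
    return sorted(valid_pages, key=lambda b: valid_pages[b])[:k]

# Silent eviction vs. garbage collection in a victim block: clean
# pages are simply dropped (the disk still holds them), while dirty
# pages would still have to be copied forward before the erase.
def silent_evict(pages):
    """pages: list of (page_id, is_dirty) in the victim erase block."""
    copied = [p for p, dirty in pages if dirty]       # must survive
    evicted = [p for p, dirty in pages if not dirty]  # dropped silently
    return copied, evicted
```

Dropping clean pages instead of copying them is what reduces both garbage-collection time and wear in the paper's measurements.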

The Cache Manager based on FlashCache supports both write-through and write-back modes and implements a recovery mechanism to enable cache use after crash. In write-through mode, the cache is consulted on every read and fetches the data from disk on a miss and writes it to the SSC with write-clean. In write-back mode, the reads are handled like write-through mode but on a write, the data is updated only on the SSC with write-dirty.
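The two cache-manager modes described above can be sketched as follows. This is an illustrative sketch, not the authors' code; MemSSC is a toy in-memory stand-in for an SSC, and all names are assumptions.

```python
# Toy in-memory stand-in for an SSC, just enough for the sketch.
class MemSSC:
    def __init__(self):
        self.blocks, self.dirty = {}, {}
    def read(self, blk):
        return self.blocks.get(blk)          # None models "not found"
    def write_clean(self, blk, buf):
        self.blocks[blk] = buf; self.dirty[blk] = False
    def write_dirty(self, blk, buf):
        self.blocks[blk] = buf; self.dirty[blk] = True

class CacheManager:
    def __init__(self, ssc, disk, write_back=False):
        self.ssc, self.disk, self.write_back = ssc, disk, write_back

    def read(self, blk):
        buf = self.ssc.read(blk)
        if buf is None:                      # miss: fetch from disk,
            buf = self.disk[blk]             # then populate the cache
            self.ssc.write_clean(blk, buf)
        return buf

    def write(self, blk, buf):
        if self.write_back:
            self.ssc.write_dirty(blk, buf)   # disk updated later, on clean
        else:
            self.disk[blk] = buf             # write-through: disk first...
            self.ssc.write_clean(blk, buf)   # ...then a clean cached copy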

4. Evaluation
The implementation consists of a Cache Manager (FlashCache based), SSC functional emulator (FlashSim based) and SSC timing simulator which models the time for the completion of each request. There are two basic configurations:
- SSC using SE-Util Policy and log blocks fixed at 7% capacity
- SSC-R using SE-Merge policy and the fraction ranges from 0-20%

Performance
The system is evaluated using four real-world traces (file server, mail server, user home directory, and project directory workloads). Experiments compare the write-through and write-back modes against a native system. For the write-intensive workloads (homes, mail), the FlashTier system with SSC outperforms native systems by 59-128% in write-back mode and 38-79% in write-through mode. With SSC-R, the FlashTier system outperforms the native system by 101-167% in write-back mode and 65-102% in write-through mode. For the read-intensive workloads (usr, proj), FlashTier performs identically to a native system.

Memory Consumption:
FlashTier with SSC consumes 11% more device memory and with SSC-R 160% but both configurations consume 89% less host memory.

Wearout:
On write-intensive workloads, SSC performs 45% fewer erases while SSC-R performs 57% fewer.
On read-intensive workloads, SSC performs 3% fewer erases while SSC-R performs 6% fewer.

Host Memory
The unification of address space and metadata across cache manager and SSC reduces total memory use by 60%.

Consistency
The cost of consistency guarantees is just 8-16% for SE-Util and SE-Merge, compared to 18-29% for the native system with no consistency.

Recovery
The recovery time for FlashTier has improved drastically as compared to a native system and typically varies from 34ms (small cache) to 2.4 seconds (large cache) for several workloads.

Silent Eviction
With silent eviction, on write-intensive workloads; FlashTier with SSC outperforms native SSD by 34-52% and SSC-R by 71-83%. This improvement can be attributed to the reduction in time spent for garbage collection as silent eviction avoids reading and rewriting data.

Cache Miss and Wear Management
For read-intensive workloads, miss rate increases negligibly.
On write-intensive workloads, miss rate is slightly higher but still under 2.5%. This increase occurs because SSC eviction policy relies on erase block utilization rather than recency which evicts blocks that were later referenced and resulted in a miss. For SSC-R, the extra log blocks reduce the number of valid pages evicted and the miss rate increases by just 1.5%.
Both SSC and SSC-R reduce the total number of erases by an average of 26% and 35% and the overhead of copying by an average of 32% and 52% respectively.

Overall, the evaluation appears to be comprehensive covering all the aspects of the proposed system design.

5. Confusion
Why is the current cache interface insufficient?
Does the narrow block interface imply ATA/SCSI?

1. Summary
This paper introduces a new design called FlashTier, which is built upon a solid-state cache (SSC) and improves the caching capability of solid-state devices (SSDs). The paper discusses the limitations of existing flash systems and then introduces its ideas on caching, space management, addressing, guarantees and crash behavior. It then explains the design and its three main features (address space management, free space management and a consistent interface) in detail. Finally, it presents experimental results showing how the system outperforms the native system in various scenarios.

2. Problem
The design of SSDs has mainly focused on improving storage. Using flash as a caching layer in front of disks is highly beneficial (its performance and price are between those of DRAM and disks), but flash caches need to be optimized for sparse address spaces (there are mappings from disk to SSD), to provide durability without expensive persistence and cache-consistency costs, and to reduce the demand for garbage collection. The existing design focus is mainly on replacing disks rather than acting as a cache.

3. Contributions
They introduce a new design called FlashTier, which consists of a solid-state (block-level) cache and a cache manager. It has three main features. First, a unified address space (avoiding an extra mapping) with sparse hash maps (reducing memory) and internal block-state metadata (with a reverse map, stored in the out-of-band area of each flash page), which is used in policy decisions. Second, cache consistency guarantees via six interface operations (clean and dirty writes, reads, block cleaning, eviction, and testing with exists); interestingly, they use a combination of checkpointing, logging and out-of-band writes to ensure persistence. I believe this, along with the idea that data need not always be present in the cache, is the main selling point of the design. Third, free space management using silent eviction to reduce copying overhead, with two proposed policies (SE-Util and SE-Merge) for selecting pages for eviction. Are these policies still being used? The cache manager supports both write-back (dirty writes) and write-through (clean writes) modes.
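The checkpointing-plus-logging combination mentioned above can be sketched as a write path: each mapping update is appended to a log before being acknowledged, and a periodic checkpoint bounds how much log would need replaying after a crash. This is a toy illustration; names and structures are assumptions, not the paper's on-flash layout.

```python
# Sketch: persist mapping changes via a log, truncated at checkpoints.
class MappingStore:
    def __init__(self):
        self.mapping = {}     # LBN -> flash location (in-memory copy)
        self.checkpoint = {}  # last durable snapshot of the mapping
        self.log = []         # durable updates since that checkpoint

    def update(self, lbn, loc):
        self.log.append((lbn, loc))  # log first, so the update survives
        self.mapping[lbn] = loc

    def take_checkpoint(self):
        # Snapshot the whole mapping; older log entries become redundant.
        self.checkpoint = dict(self.mapping)
        self.log.clear()
```

Recovery then only needs the last checkpoint plus the (short) log tail, which is why the paper's recovery times are so small compared to scanning the whole device.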

4. Evaluation
The evaluation part of the paper is quite thorough: the authors justify all their design decisions and provide the reasons behind the results. It would be nice to see how the system performs under other workload patterns, e.g. read-write-read. Also, when can the policy change from write-through to write-back and vice versa? FlashTier performs better than the native system in all the experiments (using both the SE-Util and SE-Merge policies). It is significantly better for write-intensive workloads (especially SSC-R in write-back mode). It also incurs fewer erases (because of silent eviction and reduced copies, except for read-intensive workloads where they are slightly higher) and lower memory consumption (unified address space). The overhead of maintaining consistency is small and slightly better than the native system, and the recovery time of FlashTier is significantly faster. They also show that FlashTier spends relatively less time in garbage collection and has a slightly higher miss rate for write-intensive workloads.

5. Confusion
Would like to see an example of how SE-Merge policy works. Are these policies (SE-Util & SE-Merge) used in the industry? Is the decision on when to use write-through vs write-back automated (or can it be made reliably automated)?

1. Summary
The paper introduces FlashTier, a storage architecture built on top of SSCs (Solid State Caches), a new type of solid state drive (SSD) optimized to serve as a cache for a larger, slower backing store rather than as a standalone disk. The paper highlights the various optimizations possible for this kind of workload. An SSC is simulated over FlashSim to demonstrate the viability and gains of this solution.
2. Problem
SSDs are ideally positioned to be used as caches in front of hard disks, due to their cost and access speeds as well as their persistence guarantees. However, most SSD firmware is designed for the drive to be used as a standalone storage device. A dedicated cache device benefits from optimizations, such as silent eviction of clean pages, that are never exploited in traditional SSD firmware, and hence performance is left on the table.
3. Contribution
The paper makes a few key contributions. It removes a layer of address mapping by making the SSC work directly off a large sparse address space, which vastly reduces the state-management requirements of the host. SSCs also provide the crash-consistency semantics of caches rather than those of storage disks, making for faster recovery after a crash via checkpointed persistent logs. The paper exploits the fact that any clean data in a cache is also present elsewhere in the system to silently evict clean blocks, reducing reliance on costly, wear-inducing garbage collection. SSCs also change the interface to the SSD to one resembling a cache, allowing more granular control over what data is retained, evicted, and returned on a read request.
4. Evaluation
The paper thoroughly evaluates each of the above contributions, providing both the pros and cons of each design decision. The authors compare the SSC against Facebook’s FlashCache as a cache manager using commodity SSDs for caching. They emulated both a standard SSD and an SSC over FlashSim with valid metadata but invalid data to provide a common test bench. They test the memory requirements of an SSC's large sparse address space versus an SSD's small dense address space on both the host and the device. As a persistent cache is supposed to be crash-consistent, the paper tests the recovery time following a crash for a native SSD with FlashCache and an SSC with FlashTier. The paper also catalogues the garbage-collection overheads of the new silent eviction policy. Overall, the paper convinces the reader of the viability of the solution through the real-world workloads used in the evaluation. I wonder if a custom SSD with this firmware could be used to evaluate the solution against a standard enterprise SSD for an apples-to-apples comparison.
5. Confusion
I am confused about how the SE-Util policy keeps a constant number of log blocks always available. The semantics of mapping both 4 KB and 64 KB blocks are also not clear to me.

1. Summary
This paper describes FlashTier, a system architecture aimed at providing a data cache layer in the storage hierarchy using flash memory.
2. Problem
Solid-state storage devices have been considered for use as a layer of data cache on top of the hard disk, given their high access speed and relatively cheap cost. The non-volatility of SSDs also makes it possible to recover data stored in the cache after a system crash or reboot. The goal of using SSDs as caches efficiently poses several challenges for the design of FlashTier: managing the sparse address space of the cache, garbage collection and eviction of data, and ensuring consistency of data to support recovery.
3. Contributions
FlashTier provides a unified address space implemented in SSC firmware and maintains a sparse mapping table from disk addresses to flash locations in its memory. Two write modes are supported in FlashTier: write-through mode, where a write to the cache is always immediately reflected on disk, and write-back mode, where the disk may not contain the newest copy of cached data. During cleaning, dirty data in write-back mode has to be explicitly written back to the disk before it can be marked clean. The garbage-collection procedure for the SSC can be sped up when data can simply be evicted rather than compacted and copied around. FlashTier uses an operation log to persist changes to the sparse hash map. On recovery, the SSC loads the latest checkpoint and uses the log to roll forward and reconstruct the mapping structure as well as the reverse mappings in the data blocks.
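The checkpoint-plus-roll-forward recovery described above can be sketched in a few lines. This is a hedged illustration; the log format and names are assumptions, not the paper's actual on-flash layout.

```python
# Sketch: recover the sparse map by loading the last checkpoint and
# replaying the operation log recorded since that checkpoint.
def recover(checkpoint, log):
    """checkpoint: dict LBN -> flash location at checkpoint time.
    log: ordered updates since the checkpoint, as (op, lbn, loc)."""
    mapping = dict(checkpoint)
    for op, lbn, loc in log:
        if op == "insert":
            mapping[lbn] = loc
        elif op == "remove":       # e.g. an eviction recorded in the log
            mapping.pop(lbn, None)
    return mapping
```

Because only the log tail since the last checkpoint is replayed, recovery time depends on log length rather than device size, consistent with the millisecond-to-second recovery times reported in the evaluations.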
4. Evaluation
To evaluate the performance of FlashTier, it is compared against a native system using the unmodified Facebook FlashCache cache manager and the FlashSim SSD simulator. The performance is evaluated under four different workloads of varying duration. The results show a large performance gain on write-intensive workloads in both write-through and write-back mode, while the benefit of garbage collection is offset by the need for consistency in read-intensive cases. The evaluation section contains a fairly comprehensive measurement of various aspects of the system; especially for flash-based storage, the measurement of wear is meaningful and shows the benefit of using FlashTier even where the performance gains are not very significant.
5. Confusion
If the sparse hash map is stored on a fixed portion of flash, how do we avoid wear on this portion?

Summary
The paper presents a new system called FlashTier to efficiently use solid state drives as caches for large-capacity disks, by addressing problems such as address translation (using a unified address space), cache consistency (by providing a better interface for the solid state device) and free space management (using checkpointing and logging).
Problem
SSDs are now used as caches visible to the file system, but the full potential of an SSD as a cache is unrealized due to the lack of a proper interface and block management. The resulting problems are two levels of indirection for address translation, several hours to recover from a crash, and no proper free space management. The paper presents the FlashTier system to address these issues.
Contributions
1] Unified address space: The cache manager now directly addresses blocks in the Solid State Cache (SSC) by the logical block number of the disk, so it no longer needs to maintain a mapping of disk blocks in the SSC, giving faster access. The SSC is optimized for a sparse address space using a sparse hash table. Hence FlashTier reduces the memory footprint both on the host and on the device.
2] Consistent cache interface: The SSC interface provides operations such as write-dirty, write-clean, read, evict, clean and exists, which help in efficient management of the cache. Read now returns an error if the requested block is not present in the SSC, and the SSC never returns stale data, so the cache manager can always consult the SSC after a crash. The SSC uses internal logging and checkpointing mechanisms for persistent mapping. Hence the overall cost of consistency is reduced.
3] Better free space management: Freeing up space is done using silent eviction, where clean data blocks are dropped rather than copied; in the write-back case, a silently evicted clean block need not be written to disk. Silent eviction has two policies, one with fixed log space and the other with variable log space. Overall, both write-through and write-back performance are improved compared to an SSD.
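The unified address space of point 1] can be sketched as follows. This is a toy illustration using a plain Python dict; the real SSC uses an optimized sparse hash map, but the key property is the same: only cached blocks consume mapping space, no matter how large the disk address space is.

```python
# Sketch of unified, sparse address translation: the SSC is indexed
# directly by disk logical block number (LBN), so the cache manager
# needs no separate disk-to-SSD table. Illustrative only.

class SparseMap:
    def __init__(self):
        self.table = {}                 # disk LBN -> flash location

    def insert(self, lbn, flash_page):
        self.table[lbn] = flash_page

    def lookup(self, lbn):
        return self.table.get(lbn)      # None means "not cached"

    def evict(self, lbn):
        self.table.pop(lbn, None)

m = SparseMap()
m.insert(1_000_000_000, 42)   # a block far into a huge disk address space
assert m.lookup(1_000_000_000) == 42
assert m.lookup(5) is None    # sparse: uncached addresses cost nothing
```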
Evaluation
Good real-world workloads are picked for the evaluation of the system. Overall, application performance increases with the FlashTier system, as shown in the figures, when the application has a larger number of write requests. The evaluation clearly shows the performance improvement of the three features discussed in the paper: unified address space, cache consistency and free space management. Because of the unified address space, memory usage at the cache manager level is reduced, as shown in Table 4: 78% lower memory usage for device and host combined. Recovery is faster because of checkpoints and logging, as shown in Figure 5. Wear is lower because the total number of erases is reduced by silent eviction, as clearly shown in Table 5.
Drawbacks:
All evaluations are done in an emulator, and the data range of the workloads is in gigabytes, whereas solid state caches are used when the disk is large, on the order of terabytes or petabytes. So issues such as the one mentioned in the evaluation (inserts into the sparse hash map are 90% slower than on an SSD due to rehashing operations) may matter greatly when the data size is huge. If the workloads were in terabytes and the evaluation were run on a real solid state device, the evaluation would have been stronger.
Confusion
Please explain out of bound writes and flash planes.

1.Summary:
This paper is about the design of FlashTier, which consists of a solid-state cache (SSC) designed for caching purposes and a cache manager at the operating system layer that directs caching. This system architecture addresses the limitations of using SSDs for caching. The authors have built an SSC simulator and a cache manager in Linux and have evaluated them against traditional SSDs used as caches.

2.Problem:
Using SSDs as block cache has the following inefficiencies:
1) The cache manager has to maintain a mapping table in DRAM to translate disk addresses to SSD addresses, while the SSD block cache maintains another mapping table to convert logical addresses to physical flash addresses. These two levels of indirection impose memory overhead on both host DRAM and device memory.
2) To maintain consistency, cache manager needs to persist the cache block state and the mapped addresses. This is costly. As a result, it takes a long time for cache warmup after a crash.
3) SSDs do not scale well for write-intensive workloads such as caching, and deliver lower write performance during garbage collection.
The authors argue that caching and storage have different behaviors and come up with their own design.

3.Contributions:
1) A cache manager that receives requests from the block layer and redirects them to either the cache or the disk. Two modes of usage: write-through, where data is written to disk and the cache is updated simultaneously or during read requests (read-heavy workloads), and write-back, where data is written to the SSC and the disk may not be updated (write-heavy workloads).
2) Unified address space: The cache manager directly addresses flash blocks in the SSC by disk logical addresses, and the SSC internally maps those addresses to physical locations in flash. The SSC optimizes for a sparse address space using a sparse hash map. This solves the problems of two levels of indirection in addressing and of persisting the address mappings.
3) Consistent cache interface: FlashTier provides commands such as write-dirty, write-clean, read, evict, clean and exists, which enable it to persist cache data during system reboot/crash and never return stale data. This is guaranteed by consistent reads following a cache write to dirty/clean data and cache eviction. It also uses logging and checkpointing for persistence and faster recovery.
4) Free space management: FlashTier employs silent eviction, where clean data is simply evicted rather than copying it. The cache manager uses evict/clean to identify cold clean data and evict them using SE-Util and SE-Merge policies for recycling.
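A rough sketch of how a cache manager might drive the six-operation interface above, covering the write-through/write-back split from 1) and the read-miss error from 3). The class and method names mirror the paper's operations, but the implementation is purely illustrative.

```python
# Toy cache manager over a toy SSC. The disk is modeled as a plain
# dict. Illustrative only; not the paper's kernel implementation.

class CacheMiss(Exception):
    pass

class SSC:
    def __init__(self):
        self.data, self.dirty = {}, set()

    def write_clean(self, lbn, block):
        self.data[lbn] = block          # evictable without write-back

    def write_dirty(self, lbn, block):
        self.data[lbn] = block
        self.dirty.add(lbn)             # must persist until cleaned

    def read(self, lbn):
        if lbn not in self.data:
            raise CacheMiss(lbn)        # never returns stale data
        return self.data[lbn]

class CacheManager:
    def __init__(self, ssc, disk, write_back=False):
        self.ssc, self.disk, self.write_back = ssc, disk, write_back

    def write(self, lbn, block):
        if self.write_back:
            self.ssc.write_dirty(lbn, block)   # disk updated later
        else:
            self.disk[lbn] = block             # write-through
            self.ssc.write_clean(lbn, block)

    def read(self, lbn):
        try:
            return self.ssc.read(lbn)
        except CacheMiss:
            block = self.disk[lbn]
            self.ssc.write_clean(lbn, block)   # populate on miss
            return block

disk = {}
mgr = CacheManager(SSC(), disk, write_back=True)
mgr.write(8, "v1")              # write-back: goes to the SSC only
assert 8 not in disk            # disk not yet updated
assert mgr.read(8) == "v1"
```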

4.Evaluations:
The authors have implemented FlashTier using Facebook's FlashCache as the cache manager and FlashSim as the SSC simulator, both Linux kernel modules.
They have focused evaluation on three key questions:
1) Benefits of providing a sparse unified cache address space: FlashTier reduces host memory usage by 60% relative to SSD caches, and by 78% overall for device and host combined. Real-world workloads are used for the evaluation, such as home and mail workloads and usr and proj directory workloads.
2) Cost of cache consistency and recovery: They have compared the relative costs of consistency for SSD block cache, SSC persisting clean and dirty pages, and SSC Dirty persisting only dirty pages. Overall 16% less overhead for all workloads and request response time less than 26 microseconds.
3) Benefits of silent eviction and write performance: The SSC outperforms the SSD by 34-52% in garbage collection and by 167% in cache performance. It also performs well for read-intensive workloads despite silent eviction.
In addition to evaluation of the features that FlashTier provide, comparison against native SSDs performance has also been made, which makes the evaluation complete in my opinion.

5.Confusion:
Details of flash storage.

Summary
The paper takes the intuitive idea that flash-based storage devices present a great opportunity as a performance-enhancing tier in front of conventional magnetic disks, and explores how both operating system and device implementations might usefully change in order to better support it. It presents FlashTier, a system architecture built upon solid-state cache(SSC), a flash device with an interface designed for caching. The authors implemented an SSC simulator and a cache manager in Linux and have carefully evaluated their performance.
The Problem
Flash is an attractive option for caching because of its price and performance benefits. However, flash has two characteristics that require special management to achieve high reliability and performance: the lack of in-place writes, and the use of address mapping and the resulting garbage collection. Caching and storage have different behavior and different requirements. First, storage should optimise for a dense address space, while an SSC, storing active data only, should optimise for a sparse address space. The next issue at hand is that the consistency guarantees of a cache are different from those of storage; for the latter, ordering suffices, while a cache needs to guarantee that it will never return stale data or lose dirty data. Finally, a major problem with SSDs is their limited write endurance. The facts that caching workloads are more intensive than regular workloads and that caches operate at full capacity make garbage collection an even bigger problem for SSCs. All the above factors motivated the authors to come up with an optimised flash-based cache, FlashTier.
Contribution
FlashTier mainly exploits three features of caching workloads to improve over SSD-based caches.
1.FlashTier unifies address space and cache block state between the OS and the SSC to optimise for sparseness of cached blocks. It removes two levels of indirection for address translation.
2.FlashTier provides cache consistency guarantees to provide consistent reads after cache writes and eviction, and make both clean and dirty data as well as the address mapping durable across a system crash or reboot. It uses a combination of logging, checkpointing and out-of-band writes to persist its internal data.
3. To improve cache write performance, FlashTier provides free space management by silently evicting data rather than copying it within the SSC. It uses a cost/benefit mechanism to select victim blocks.

Evaluation
The evaluation has been done quite convincingly. In trace-based experiments, it is shown that FlashTier reduces address translation space by 60% and that silent eviction improves performance by up to 167%. Furthermore, FlashTier can recover from the crash of a 100 GB cache in only 2.4 seconds.
I really appreciate the way the evaluation addresses the performance of each of the three contributions separately. A new thing I saw in this paper, which I really liked, is the tabulation of the characteristics of the four real-world traces. The traces are not only varied in type but also have different read/write mixes to analyse the performance impact of the SSC and silent eviction. My only concern is that the evaluation is based on simulation only. Also, a comparison of FlashTier with existing, deployed systems with commodity SSDs would have been a valuable addition.
Confusions
I could not understand the working of the sparse hash map.

1. Summary
This paper discusses FlashTier, a system architecture built upon the new concept of Solid State Caches (SSC). SSCs differ from regular SSDs in that the interface to the device, and the internal block management, are modified for usage specifically as a cache. FlashTier provides a block-level SSC system with unified logical address space, cache consistency on crash guarantees, and silent data block eviction. These result in sizeable improvements over existing SSD as cache implementations.

2. Problem
Existing SSDs being used as caches have shortcomings due to the nature of the interface and their internal management. This is due to the fact that SSDs are considered / handled as storage. Caches differ from storage in that they hold sparsely populated address spaces, provide cache consistency guarantees, and data need not be durable, as it will be present on the backing store. This meant that there were unleveraged opportunities for optimizing the SSD interface / internal management for the caching scenario.

3. Contributions
Primary contribution, in my opinion : The SSC concept - interface changes, internal data management change (data structure + eviction / garbage collection).
Interface changes - Cache data management (write-dirty, write-clean, read) and Free Space Management (evict, clean, exists)
Unified logical address space - disk addresses can directly be passed to SSC. (the SSC is aware that it is a cache.)
Sparse hash map - address space management - less address translation space, grows linearly with cached data.
Block State - additional data for internal management like wear levelling, garbage collection, eviction.
Persistence - The mapping (metadata) is moved from main memory into the SSC, which also means that the cache can be persistent across crashes. Earlier, metadata had to be persisted on each update -> overhead. Logging, checkpoints, and out-of-band writes are used to further provide low-latency writes to distributed data, fast recovery, and low-latency metadata writes.
Reduced garbage collection - better reliability, longer SSD life. GC is needed only for log blocks, and even this is reduced in SE-Merge.
Silent eviction - In storage, live data cannot be dropped, while the freedom to drop is leveraged in SSCs. This speeds up garbage collection, reduces copying and writing.
Eviction policies - SE-Util, SE-Merge. Both select the erase block with the fewest valid pages; they differ in whether cleaned blocks may be used for logging.
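The two policies can be sketched as below. The victim selection (fewest valid pages) matches the description above; the helper names are invented for illustration, not taken from the paper.

```python
# Sketch of silent-eviction victim selection and the SE-Util/SE-Merge
# distinction. Illustrative only; real blocks also carry clean/dirty
# state, wear counts, and so on.

def pick_victim(valid_counts):
    """valid_counts: dict erase-block id -> number of valid pages.
    Choose the block with the least valid data to lose on eviction."""
    return min(valid_counts, key=valid_counts.get)

def reclaim_targets(policy):
    """SE-Util returns reclaimed blocks to the data-block pool only;
    SE-Merge may also carve them into smaller log blocks."""
    return {"SE-Util": ["data"], "SE-Merge": ["data", "log"]}[policy]

blocks = {"A": 12, "B": 3, "C": 7}
assert pick_victim(blocks) == "B"          # least valid data to lose
assert reclaim_targets("SE-Util") == ["data"]
```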


4. Evaluation
Address Translation Space - Sparse Hash Map - reduces usage of device memory by 60%.
Performance benefits - silent eviction - performance benefits of up to 167%.
Wear - silent eviction - no need to copy during garbage collection - reduced wear by up to 57%.
Crash Recovery - 100 GB cache crash recovery in 2.4 seconds, compared to 9.4 s (native), smaller cache 36 ms vs 133 ms.
Garbage Collection - much better perf on write-intensive workloads, thanks to silent eviction and SE-Merge.
Consistency costs - 3-5% increase in time, but less than 26 microseconds for all workloads.
My opinion -
Negative is that it seems to me that off the shelf SSDs cannot be plugged and played with minimal work, which can be done with the traditional native implementations. The Evaluation is thorough, and sensible in comparison to the existing state-of-the-art (FlashCache), albeit through simulations.

5. Confusion
Explanation of how the existing SSD interface was insufficient for consistency. Also, how did SSD-as-cache implementations handle this?
What is the IT overhead in generating SSCs / converting SSDs to SSCs ?
What was the reaction to / impact of this paper?


1. Summary

This paper describes FlashTier, a system architecture built upon solid state cache with an interface designed for caching. It explores the changes to the interface and internal block management of conventional SSDs and changes to the cache manager for an effective caching device. FlashTier achieves this through a unified address space, cache consistency guarantees and reduced garbage collection costs.

2. Problem

The performance of SSD-backed caches is hindered by the narrow block interface and internal block management of SSDs. Storage and caching have different requirements, and SSD characteristics such as erase-before-write and address mapping do not let SSDs cache well.

3. Contribution

The authors explore the different optimisation options and clearly lay out the functional distinctions between a cache and persistent storage that trigger the changes. Because it stores only hot data, an SSC should optimise for a sparse address space. Caching data persistently across system restarts greatly improves cache effectiveness. Data is written frequently in a cache, and there is a greater demand for garbage collection because caches operate at full capacity. The SSC defines two usage modes: write-through and write-back. Rather than maintaining a separate mapping table, the SSC uses a unified address space; address mapping to physical locations in flash is done internally. Space management is done through the evict, clean and exists mechanisms. The SSC provides cache guarantees such as durability and safety. The FlashTier interface provides six operations: write-dirty, write-clean, read, evict, clean and exists, which are used by the cache manager. The SSC uses a combination of logging, checkpoints and out-of-band writes to persist its internal data. FlashTier implements a silent eviction mechanism by integrating cache replacement with garbage collection. The paper describes two eviction policies: SE-Util and SE-Merge. The paper also provides implementation details. The authors give a whole-system consideration of the problem without limiting themselves to a specific software layer, and assert that solid state disks can be better caches if they don't behave like disks. The design simplifies garbage collection and favours relatively simple interfaces for providing efficient caching and quick reboots.

4. Evaluation

The system has been carefully and convincingly evaluated. The authors evaluate the system on four main workloads: home, mail, usr and proj. They evaluate different aspects of system performance and compare them with other systems. SSC and SSC-R outperform the native system for write-intensive workloads, while performance on the read-intensive workloads is comparable. The authors also analyse the system in terms of memory consumption, i.e., device and host memory usage. Consistency and recovery times are also evaluated. The silent eviction mechanism is evaluated through garbage collection performance, cache misses and wear management. Overall, the system outperforms the existing native interfaces.

5. Confusion

Would like to review logging.

1. Summary
The paper presents FlashTier, a solid state cache architecture which improves the performance of SSD based caching. The proposed technique greatly improves performance (1.6x) and reduces the host memory footprint considerably. It also provides faster cache data recovery after crashes.

2. Problems
The major problem with using SSDs as caches is that they were designed as drop-in replacements for disks. This leads to the following issues:
Unlike disks, caches have a sparse data address range, i.e., they store details about a few hot pages. Hence the cache address range need not correspond to the size of the disk.
Existing crash recovery mechanisms like barriers do not provide address consistency between data on different devices. When designing a cache, one should ensure this between the disk and the cache.
SSDs have wear levelling issues and a limited number of writes, so unnecessarily moving dead data across the cache leads to unnecessary writes and reduces device lifetime.

3. Contribution
The FlashTier design is built upon the SSC architecture and tries to address the issues in SSD based caches which are designed for disk use case.
The proposed design presents a unified address space which removes the need to maintain a separate disk-to-SSD translation table in main memory. They have also implemented a sparse hash table index whose size depends on the number of blocks cached in the SSC, rather than a fixed table. And since the table is in the SSC, the cache manager need not ensure its persistence.
The SSC exposes mechanisms like evict, clean and exists, which the cache manager can use to tell the SSC when it no longer needs some data. It implements what is known as silent eviction, which combines cache management with garbage collection: the garbage collector will not move data marked clean or evicted. This creates more space each time the garbage collector is invoked and hence significantly improves performance.
The paper presents two different policies to manage the erase blocks and allowing writes at data block or log block level granularity.
This design provides persistence through a combination of logging, checkpointing and out of band metadata updates. It is ensured in both the cases of write-back and write-through mechanisms. Checkpointing ensures that the logs are small and aids in faster recovery after crashes.

4. Evaluation
The paper presents a comprehensive evaluation of the proposed technique. The evaluation focuses on the various overheads involved in mechanisms like cache persistence and address space maintenance, and on the performance improvements due to the adopted policies like silent eviction (SE-Util and SE-Merge). The chosen workloads characterize the model under read-intensive and write-intensive conditions. To show the performance improvements, the paper compares IOPS across various configurations of the SSC and explains how the proposed technique performs better due to its efficient garbage collection. It also tabulates the memory overheads in device and host memory. The paper evaluates consistency by characterizing the overheads involved (logging) and the recovery time.

5. Confusion
What are the out-of-band(OOB) area in SSD? Are these fixed locations? In that case do they wear out faster than data block locations?
Does this technique give similar performance improvement when adopted for smaller datasets like a laptops? Is it relevant for consumer grade applications?

1. Summary
This paper describes a solid state cache (SSC) called FlashTier. It is a system architecture optimized to use a flash device as a cache for high capacity disks. The cache manager within the OS along with the modified flash device interface and internal block management provide a lightweight, consistent and durable caching mechanism.

2. Problem
Present Solid State Drives (SSDs) are designed to be drop-in replacements for disks. As a result, using them as a cache in front of a high-capacity disk creates an inefficient system: an interface designed for disks is not flexible enough to optimize for different caching policies. A cache stores data from a separate address space rather than a native address space. In order to manage such a cache, traditional systems use two levels of translation, one from disk address to SSD address (stored in DRAM) and another from logical to physical address (the FTL), which leads to significant overheads. Cache properties are also significantly different from storage drive behavior: data in a cache need not be durable, and its consistency requirements differ from those of a normal storage device. These properties can be exploited to significantly improve the performance of such a system.

3. Contribution
The authors introduce the SSC, a flash device optimized for caching, and develop a cache manager optimized to leverage the features of such a device. They identify that caching requirements differ from storage requirements and use this observation for optimization. The SSC provides unified address translation from disk address to physical address, so the host does not need to store a translation from disk address to SSD address; to accommodate the large address space of a disk, they use a sparse data structure for translations. The flash device interface is modified to give the OS more control over cache management: it allows a way to distinguish between dirty and clean blocks, and the cache can be used as a write-through or write-back cache, with writes marked clean or dirty depending on the use case. Crash recovery and persistence are provided using logging, checkpointing and recovery. Finally, free space management is significantly optimized: the distinction between clean and dirty blocks makes garbage collection faster, since the garbage collector can silently remove clean (or evicted) data, improving performance while reducing device wear. The unified address translation results in a lightweight cache manager which needs to maintain mappings for dirty data only, which helps it write back data to disk as and when necessary.
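The wear argument above can be made concrete with a small sketch that counts page copies (a proxy for write amplification) with and without silent eviction. The function and its inputs are my own illustration, not structures from the paper.

```python
# Contrast ordinary SSD garbage collection with SSC silent eviction
# by counting how many valid pages must be copied before a victim
# erase block can be erased. Illustrative only.

def gc_copies(valid_pages, silent_eviction):
    """valid_pages: list of 'clean'/'dirty' states for the valid pages
    in the victim block. Returns the number of page copies required."""
    if silent_eviction:
        # Clean pages are simply dropped (the disk still has them);
        # only dirty pages must be preserved by copying.
        return sum(1 for s in valid_pages if s == "dirty")
    return len(valid_pages)            # SSD: every valid page is copied

pages = ["clean", "clean", "dirty", "clean"]
assert gc_copies(pages, silent_eviction=False) == 4
assert gc_copies(pages, silent_eviction=True) == 1
```

Fewer copies mean fewer program/erase operations, which is why silent eviction both speeds up garbage collection and reduces wear.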

4. Evaluation
The FlashTier implementation consists of a cache manager, an SSC emulator and an SSC timing simulator. They use four real-world workloads for evaluation, two write-heavy and two read-heavy. FlashTier shows comparable performance for read-heavy workloads while significantly outperforming for write-heavy workloads. Although the device memory overhead is higher for FlashTier, the increase is not significant enough to cause alarm, and the host memory overhead is significantly reduced. The paper also evaluates the cost of guaranteeing persistence: the consistency overhead of persisting clean and dirty data in FlashTier is lower than that of a native system for write-heavy workloads and almost equivalent for read-heavy workloads. The time to recover from a crash is significantly better for FlashTier than for the native system. Silent eviction improves garbage collection performance significantly without any noticeable increase in cache misses, and reduces device wear by reducing the total number of blocks erased for merge operations.

5. Questions
1. Can we talk about the challenges of maintaining crash consistency in Class? From a point of view of what sort of meta data should be maintained, what could go wrong if they are not maintained?
2. What are Out of Band regions?

1. Summary
The paper designs a new caching interface for flash devices, directed by management software running at the OS block layer, while solving SSD challenges such as durability, flexibility, separate address spaces and correctness that are hindered by the narrow block interface and internal block management.
2. Problem
A new caching interface is required to accommodate a high volume of cache blocks cheaply. Data structures need to be differentiated based on whether the device is used for storage or for caching. Large caches and poor disk performance result in long cache warming periods. Making cache data durable is expensive, and frequent garbage collection can hurt reliability by copying data to make empty blocks. Crash consistency guarantees provided by SSDs/hard disks, like barriers, do not address consistency between data on different devices.
3. Contributions
It introduces a unified logical address space to reduce the cost of cache block management within the OS and SSD, guarantees cache consistency by allowing cached data to be used after a crash with support for both write-back and write-through caching, and leverages cache behavior to silently evict data blocks during garbage collection to improve performance. It provides isolation between the caching device and its internal structures, with system software managing the cache and the disk storing data. SSD-backed caching improves cold-start performance with persistent cache contents, is cheaper than DRAM and increases cache effectiveness. The SSC is optimized for a sparse address space for effective storage of hot data. It employs data eviction in garbage collection, with reverse maps that let it effectively erase clean blocks. It guarantees durability of dirty data, safety to consult the cache for data, and invalidation of cache data, and these guarantees are implemented in the SSC itself for simplicity. It follows LFS-like mechanisms such as logging, checkpoints and OOB writes to persist its internal data, allowing low-latency writes and fast recovery. Eviction policies are introduced: SE-Util erases data blocks with the fewest valid pages, while SE-Merge erases both log and data blocks at the cost of increased memory usage.
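One consequence of the durable-dirty-data guarantee above is that a write-back cache manager can rebuild its set of dirty blocks after a crash by querying the cache rather than rescanning it. A minimal sketch, assuming a hypothetical exists-style query function (the paper's exists operation works over block ranges; this is simplified):

```python
# Sketch of write-back state recovery: after a crash, ask the SSC
# which cached blocks are dirty so the cache manager knows what it
# must eventually write back to disk. Illustrative API only.

def recover_dirty_set(ssc_exists, candidate_lbns):
    """ssc_exists(lbn) -> 'dirty' | 'clean' | None (not cached).
    Returns the LBNs that still need write-back to disk."""
    return {lbn for lbn in candidate_lbns if ssc_exists(lbn) == "dirty"}

state = {10: "dirty", 11: "clean", 12: "dirty"}
dirty = recover_dirty_set(lambda lbn: state.get(lbn), range(10, 15))
assert dirty == {10, 12}
```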
4. Evaluation
The prototype was implemented as a cache manager (FlashCache) and an SSC functional emulator (FlashSim) in the kernel, along with a timing simulator. A reduction in host memory usage is achieved even though device memory increases; an overall reduction in total memory usage is observed, which comes from unifying the address space and metadata across the cache manager and SSC. Free space management improves performance with fewer erase cycles than a native SSD, more prominently in write-back caching. The silent eviction policy reduces wearout by reducing the erase count in write-heavy workloads, at the cost of evicting useful data that may have to be written again. Garbage collection cost is reduced by the SE-Merge eviction policy by increasing the number of log blocks; these performance advantages come at the cost of doubling device memory. Even though remove and lookup operations have the same latency, insert latency increases with the sparse hash map in the SSC. Synchronous logging for insert/remove operations leads to consistency overhead in preserving clean and dirty blocks. The exists operation proved useful for fast recovery of the write-back cache manager's state. The SSC-R configuration seems to perform better than SSC in the above aspects, including miss rate.
5. Confusion
Why is write-back cache more prominent in increase in performance even when write-through is optimized with silent eviction policy?

Summary:
The paper describes FlashTier, a system architecture which guarantees persistent and consistent caching in the storage hierarchy using a solid-state cache (SSC) that provides different service guarantees from an SSD. The paper also describes the properties and implementation of an SSC.

Problem:
The use of off-the-shelf SSDs as a cache for a disk is inefficient considering the difference in design goals of an SSD and a cache. For example, while an SSD must support dense address spaces, an SSD-backed cache has to handle only sparse address spaces. While an SSD needs to guarantee persistence of all data blocks, an SSD-backed cache needs to guarantee persistence only for dirty blocks that are not yet updated on the disk. The paper describes the implementation of an SSC that addresses these differences.

Contributions:
1) FlashTier has a unified address space across the disk and the SSC. This obviates the need to maintain a separate mapping between disk block addresses and solid state addresses. FlashTier also optimizes the mapping from the global disk address to the physical address in flash for a sparsely populated address space, by using an efficient sparse hash map data structure.
2) Since the SSC is used as a cache, data that is written to both the disk and the SSC is marked clean in the SSC. This allows the SSC to silently evict clean data without affecting correctness. In SSD-backed caches this was not possible, as the interface to the device was limited to read/write/trim commands and did not carry more information about the data being read or written. The SSC handles this by providing write-clean and write-dirty primitives, and also exposes clean and evict operations to give the cache manager better control. The SSC thus provides an interface consistent with its use as a solid-state-backed cache. Persistence for the write-dirty operation is guaranteed by the SSC using synchronous commits. FlashTier borrows ideas from LFS to support persistence.
3) FlashTier provides two different policies for the silent eviction of data blocks. In both policies, clean data blocks with the least amount of live data are selected. However, in one policy (SE-Util), erased blocks are reserved only for other data blocks, while in the other (SE-Merge), an erased block can be used as a data block or as multiple smaller log blocks. Requests that cannot be satisfied by silent eviction are handled by the default garbage collector.

Evaluation:
The authors evaluate FlashTier on a number of fronts like performance in terms of IOPS relative to the native implementation of using a SSD as a cache, memory consumption due to the data structures to maintain the dirty blocks and wearout in terms of number of erases performed. The authors also evaluate the cost of maintaining consistency due to the overheads of logging and checkpointing relative to the system which does not need such structures to guarantee persistence. The paper also demonstrates the performance improvement using the Silent eviction policies. Overall, the authors have demonstrated that FlashTier achieves the goals that were set.

Confusion:
How does FlashTier perform at the system level? Given the SSC’s performance, does the scheduler have to be kept in the loop for making preempting decisions on an I/O which might get satisfied by the SSC?

1. Summary
FlashTier is a novel design that provides caching between the OS and disks through a flash device that exposes a cache-specific interface. It provides a unified address space to directly map a large, sparse disk address space, provides disk-like cache consistency guarantees to ensure correctness after a crash, and optimizes garbage collection by evicting data blocks. Thus, it improves performance and reduces both memory consumption and wear on the device.

2. Problem
A flash-based caching layer between RAM and disk is beneficial, as flash is cheaper than RAM and faster than disk. SSD-based caches inherit mappings designed for a dense disk address space, whereas a cache holds only active data from a sparse address space. They provide expensive persistence and only simple consistency guarantees, and garbage collection is difficult and costly since flash offers limited write endurance. So it is important to recognize that the use and behavior of caching and storage differ, and to design the cache accordingly.

3. Contributions
FlashTier is designed with two components: a cache manager, above the device driver, that routes requests to flash and/or disk, and a solid-state cache (SSC) that stores the cached data. The main idea is to use a sparse hash map data structure, indexed by LBN, to map the sparse set of disk addresses holding active data to flash addresses. FlashTier then provides consistency semantics through an SSC interface (six operations) appropriate for a cache rather than a disk: persist cached data and never return stale data. The interface provides consistent reads after cache writes and eviction, and makes metadata, clean blocks, and dirty blocks durable. The SSC persists its state using logs that record changes to the sparse hash map, checkpointing of the mapping structures, and roll-forward for crash recovery. Finally, the SSC performs silent eviction, applying cache replacement to write-clean blocks without copying valid pages, with two policies: SE-Util, which selects the erase block with the smallest number of valid pages, and SE-Merge, which additionally allows erased blocks to be used for data or log blocks.

4. Evaluation
The authors compare FlashTier against SSD-based caching. They use real-world workloads: a file server, mail, and a small enterprise data center with user home and project directories. They measure the performance of the write-back and write-through modes against the write-back mode of the SSD cache and observe that the SSC performs very well for write-intensive workloads, while showing no improvement for read-intensive ones. The number of erases is significantly reduced using the SSC, more so for write-intensive workloads than read-intensive ones. Due to the unified address space and metadata, memory usage results show that the SSC provides savings of up to 78%. They then evaluate the cost of the consistency model against a no-consistency system, showing reduced crash recovery time with slight overheads to track clean and dirty blocks. Finally, the silent eviction mechanism gives FlashTier much lower garbage collection time, while also decreasing the number of erases, mostly for write-intensive workloads.
The evaluation is pretty detailed and covers all the key aspects of the system design. The workloads have the right mix of characteristics and real-world behavior that can measure the performance of the SSC interface and mechanisms.

5. Comments/Confusion
A brief introduction on the storage devices and the device driver policies and interfaces, along with the trend in this area would be helpful. Couldn’t understand the working of the SE-Merge policy.

1. Summary
This paper talks about FlashTier, a novel durable flash-based caching system for disks. FlashTier extends the SSD interface to directly support cache operations, addressing the device with disk addresses and optimizing garbage collection using silent eviction of clean blocks.
2. Problem
SSD-backed storage is appealing because of the attractive price/performance of flash. However, the SSD interface is designed for block-based storage, making it a mismatch for cache behavior. SSD-based caches need to maintain an in-memory map for cached blocks; on top of this, the flash FTL adds an additional layer of translation for wear leveling and garbage collection. The garbage collection mechanism assumes all blocks are needed and copies around cached data that may already be backed by the disk. Thirdly, warming up a large cache is time consuming, so keeping cache state durable is favorable; however, this adds overheads due to the need for synchronous metadata updates.
3. Contributions
Overall the authors propose a modified SSD device - SSC with a suitable interface for cache operations and a light cache management layer in the system:
1- The first main idea is to merge the two translation layers, from the system and the FTL, into the SSC. The SSC device is addressed using disk block numbers. The sparse address space is stored efficiently using an optimized Google sparse-hashmap data structure. (The system uses a hybrid mapping.)
2- Secondly, the SSC is made aware of whether cache blocks are clean or dirty via separate operations, write-clean and write-dirty. The SSC leverages this information to perform silent eviction of clean data blocks during garbage collection. The flash can thus avoid copying the blocks, reducing both latency and wear. The authors evaluate two policies for selecting silent-eviction victims; the second allows the freed data blocks to be used for logs.
3- Since the SSC stores mapping information the cache manager only tracks dirty blocks to implement LRU replacement. This ensures there is free space in the cache for new writes.
4-Lastly, the SSC supports crash consistency by using logs and regular checkpoints. The authors define consistency guarantees for clean and dirty data which allow asynchronous group commits for clean data updates. Recovery is done using the checkpoint and rolling forward of the log.
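The checkpoint-plus-log recovery described in point 4 can be sketched minimally: after a crash, only log records newer than the last checkpoint are replayed on top of it. The record format and function names here are invented for illustration, assuming simple insert/remove log records for the mapping.

```python
# Minimal sketch of checkpoint + log roll-forward recovery for the SSC's
# LBN -> flash mapping: start from the checkpointed snapshot and replay
# only records with a higher sequence number. Layout is illustrative.

def recover_mapping(checkpoint_map, checkpoint_seq, log):
    """checkpoint_map: {lbn: flash_page} snapshot taken at checkpoint_seq.
    log: ordered list of (seq, op, lbn, flash_page) records."""
    mapping = dict(checkpoint_map)
    for seq, op, lbn, flash_page in log:
        if seq <= checkpoint_seq:
            continue  # already reflected in the checkpoint
        if op == "insert":
            mapping[lbn] = flash_page
        elif op == "remove":
            mapping.pop(lbn, None)
    return mapping

ckpt = {1: 100, 2: 101}
log = [
    (5, "insert", 2, 101),    # older than checkpoint: skipped
    (11, "insert", 3, 102),   # replayed
    (12, "remove", 1, None),  # replayed
]
state = recover_mapping(ckpt, checkpoint_seq=10, log=log)
assert state == {2: 101, 3: 102}
```

Checkpointing bounds the replay length, which is why the reviews below report much faster recovery than the native system's full device scan.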
4. Evaluation
The authors compare their system against the state-of-the-art FlashCache design using I/O traces with varying amounts of writes. They evaluate both I/O throughput and memory overhead; FlashTier improves performance on write-intensive workloads due to silent evictions and incurs lower overall memory overhead. They also measure the relative overhead of maintaining consistency for clean and dirty data, though they do not clearly explain why the native system faces higher overheads in spite of persisting only dirty data updates. The authors also isolate the benefits of silent evictions and measure write amplification and wear leveling.
5. Confusion
Do the logs for write-clean operations get buffered and written out lazily?
Also, has this new system been commercialized?

1. Summary
The authors changed the interface and internal block management of SSDs so they can be used as cache devices. They also provided semantics for crash consistency, better wear management, and policies for writing to the disk. They introduced an OS-based cache manager and the SSC (an SSD acting as a cache). They show that their design performs better, uses less memory, and provides better crash-consistency semantics than Facebook's native FlashCache.

2. Problem
Flash-based disks were designed to be used as a backing store, whereas their characteristics make them attractive as caches. But there was no interface or internal block management policy that could provide cache-friendly semantics.

3. Contribution
The main contribution of the authors was the observation that the mapping of logical blocks to physical blocks can be stored effectively in the SSC using sparse hash maps instead of a linear address space. This opened up several opportunities for faster access to blocks and for the implementation of policies. In addition, they maintain an additional mapping for internal operations, used by the garbage collector (GC).
They introduced write-back and write-through modes, which produce dirty and clean data blocks respectively. Since a clean data block is backed by the disk, it can be silently evicted by the GC under space pressure. This preserves consistency and durability guarantees. Also, since the mapping of data blocks is persistent across crash/reboot, the slow cache-warming problem is avoided.
The authors describe several operations: the write modes write-clean and write-dirty, and reads after write-clean, which can return a not-found error because a clean block may have been evicted by the GC before its contents were read. Other operations are evict (synchronous), clean (asynchronous), and exists (which checks whether a block is dirty). Persistence is guaranteed by means of logging (storing metadata writes in batches), checkpointing (for faster recovery and shorter log replay), and recovery. They also discuss the two main policies, SE-Util and SE-Merge, which describe how data blocks and log blocks are managed: the former allows erased blocks to be reused only as data blocks, whereas the latter allows them to be used for both data and log blocks. The policy determines how the victim block for eviction is picked.
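Because the exists operation reports which blocks are cached dirty, a cache manager could in principle rebuild its dirty-block table after a crash by querying the device rather than persisting its own state. The sketch below illustrates that idea; the function signature is an assumption (the real operation likely works over ranges, not single blocks).

```python
# Sketch: rebuilding the cache manager's dirty-block table after a crash
# using an exists()-style query against the SSC. API names are invented.

def rebuild_dirty_table(ssc_exists, candidate_lbns):
    """ssc_exists(lbn) -> True if the block is cached and dirty.
    Returns the set of LBNs that still need write-back to disk."""
    return {lbn for lbn in candidate_lbns if ssc_exists(lbn)}

# Pretend the device survived a crash with two dirty blocks:
dirty_on_device = {4, 9}
exists = lambda lbn: lbn in dirty_on_device
assert rebuild_dirty_table(exists, range(12)) == {4, 9}
```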

4. Evaluation
The evaluation was thorough; the authors evaluated their implementation by comparing the various modes of FlashTier along with Facebook's native FlashCache. They showed that it performs more I/O operations for write-intensive workloads due to performance gains from the garbage collection mechanisms. They then compared the memory consumption of the system under two different modes of operation, FlashTier C and FlashTier C/D, and showed that it has a lower memory overhead, mainly due to unification of the address space and metadata across the cache manager and SSC. They then evaluated consistency and showed that recovery is much faster for FlashTier, since it has to replay only the logs from the last checkpoint, whereas the native system had to read the entire OOB region. Also, the comparison of silent eviction across SSD, SSC, and SSC-R showed far fewer writes; the garbage collection tests corroborated this.

5. Confusion
I am still not clear with why exactly the existing interface for SSD was not sufficient for SSD caches as briefly mentioned in the paper.

1. Summary
The paper describes the design and implementation of a system architecture - FlashTier, which uses a flash device with a special interface for caching. This special flash device is called a Solid State Cache (SSC) and it gives a better performance as a cache than the traditional SSDs.

2. Problem
SSDs deployed as caches for slow HDDs, along with software to migrate data between SSD and HDD, can improve storage performance. But SSDs are designed for storing persistent data and have different consistency requirements. Data in a cache is also present in another location, so it does not have to be strictly durable; a cache must ensure that it never returns stale data, while a storage device has to provide ordering guarantees for writes to ensure durability. SSDs internally maintain a mapping table from logical blocks to physical blocks, and when a traditional SSD is used as a cache the software has to maintain an extra level of indirection to map logical addresses to block addresses on the HDD, which is inefficient.

3. Contributions
Recognizing the problems with using traditional SSDs as caches, the paper proposes a device called the SSC. The SSC provides consistency and durability guarantees for the cached data. It unifies the address space between the cache manager running on the host and the firmware running in the SSC; this design greatly reduces the host memory footprint compared to other caching solutions because the cache manager does not need to maintain a hash map. The paper also improves cache write performance by silently evicting data instead of copying it within the SSC when garbage collection is triggered, which not only simplifies the mechanism but also helps increase the life of the flash device. The consistent interface between the OS and the SSC, as proposed in this paper, helps provide durability for both clean and dirty data across reboots.

4. Evaluation
The authors ran their evaluations on an SSC implemented as a simulation in FlashSim and a cache manager based on Facebook's FlashCache, built as a kernel module. The paper does a thorough evaluation of the proposed architecture: the authors ran real-world I/O workloads with two different cache eviction policies and both write-back and write-through configurations, and observed a performance gain of 38%-167% compared to a caching system based on a native SSD (FlashCache). The authors show that even though device memory use increases, host memory use significantly decreases, and the number of erases decreases by up to 57%. They show that recovery times after a crash were significantly reduced by the checkpointing and logging system. Garbage collection times were also reduced, which resulted in overall better performance of the SSC.
I feel that the evaluation of the system was pretty detailed; it highlighted the performance gains of the SSC over a cache implemented on a traditional SSD and also reinforced the authors' design decisions for unified address space management and free space management mechanisms.

5. Confusion
Can we talk more about the design and mechanics of SSDs in class?

1. Summary: This paper presents a new system-wide architecture, and a new storage medium design built on a SSD, which can efficiently act as a cache and improve the effective IOPS of the system. The authors achieve this by redefining the interface of SSDs, and implementing policies to effectively make it aware that it is being used as a cache.
2. Problem: SSDs were designed to be used as a replacement for disks. Thus, they differ from caches in important ways: 1) Disks don't have the concept of data eviction, which is central to caches. 2) In a typical system, the cache and the memory unit below it in the hierarchy (RAM) share the same address space. But with an SSD as a cache, the OS would have to maintain an extra translation table mapping logical blocks to flash addresses. Not only does this translation cause extra overhead, it needs to be durable for quick recovery from crashes. 3) The consistency requirements of caches and storage media are different. In the midst of these fundamental differences between SSDs and caches, the authors propose a flash device called the SSC and an OS-level cache manager, together allowing the OS to use the SSC effectively as a cache.
3. Contributions: The authors have combined the best of two worlds: storage media and caches. They achieve this by changing the internal block management of conventional SSDs. They also added a kernel-level cache manager which operates above the device drivers. The SSC keeps an in-device copy of the logical-block-to-flash-address translation, presenting a single address space to the cache manager and reducing its complexity. The internal block management was changed by tweaking the SSC firmware to provide interfaces that give the cache manager finer control over SSC data. The cache manager can now evict data, check if it exists, control cache free space, and write data in different modes. Since the dirtiness of a block can now be tracked, they added support for silent eviction. This was one of the most important design decisions, since it enables efficient garbage collection, reduces write amplification, and provides effective wear distribution. The persistent nature of the SSC means that data and metadata are always on the device and can be used for quick recovery. At the same time, since the SSC is a true cache, they can afford to use a Bloom filter-like structure to store translations on the OS side, reducing host memory usage. They also use a sparse hashmap data structure to optimize for sparse address spaces in the SSC. Overall, they not only solved the problem at hand, but paved the way for numerous other optimizations.
4. Evaluation: The authors clearly evaluate each design policy by implementing a cache manager, an SSC functional emulator, and a timing simulator. They use a mix of write-intensive and read-intensive workloads for their evaluation. The performance of the SSC is better than a simple write-back SSD for write-intensive workloads. Write-back SSC performs better because it writes to disk less often, and out-of-place overwrites in the SSC avoid the erase-write cycle of a typical SSD. For reads, the SSC performs slightly worse because of silent evictions. Because it removes translation state from the OS, the SSC reduces host memory usage significantly, though additional structures like the sparse hashmap increase device memory usage slightly. Since the SSC checkpoints and provides an "exists" operation in its interface, its recovery time is significantly lower across all workloads. The authors also show the extra overhead added by the consistency guarantees. The silent eviction policy shows a remarkable improvement in write performance and helps spread wear across the SSC by reducing erase operations. Overall, most aspects of the design were evaluated, though they missed out on comparing the performance of their cache manager with a traditional manager. Since their cache manager has more decisions to make, it might perform a bit worse.
5. Confusion: Isn’t a write endurance of 10^4 too low to be of any commercial value? How do the policies designed here translate to other forms of fast persistent storage? How did the cache manager in a traditional SSD manage to get cache-like performance when the device itself was not aware that it was a cache?

1. Summary
FlashTier is a hardware/software codesign of using flash devices as cache. It provides better efficiency, performance and crash recovery.

2. Problem
SSDs designed to be replacements of disks are not suitable to be used as disk cache. There are two levels of address mapping. The interface and garbage collection behavior are not tuned for cache. Maintaining consistency by writing metadata synchronously harms the performance greatly.

3. Contributions
The SSC uses the same address space as its paired disk. The only address translation is from block address to flash location and it is handled by the SSC using a sparse hash map. By eliminating the additional mapping, the memory usage is reduced and crash recovery is simplified.
The cache-aware interface helps improve the performance of crash recovery and garbage collection. As the SSC knows explicitly which blocks are clean, the garbage collector can simply erase those blocks without copying.
Cache durability is achieved by combining logging, checkpoints, and out-of-band writes. The first two ideas seem to be borrowed from journaling file systems, and the last is a feature of flash. Together these provide fast synchronous metadata writes.

4. Evaluation
FlashTier is implemented with simulators on Linux and is compared with the native FlashCache + FlashSim implementation with respect to performance, memory usage, and wearout. Four real-world traces, ranging from write-intensive to read-intensive, are used. As the results show, unifying the address space costs more device memory but less host memory, and overall memory consumption is greatly reduced. Consistency-guarantee overhead is lower than native for write-intensive traces, and recovery time is always lower. Silent eviction provides a large performance boost for write-intensive workloads and performs fewer erase operations. In general, the main contributions of FlashTier help it beat the native system, as expected.

5. Confusion
Why are data blocks and log blocks mapped at different granularities? Do log blocks get erased more often?

1. Summary
The paper presents the design of fast flash memory based caching architecture, FlashTier. Traditional caching systems used SSD as a cache between system memory and persistent disk storage. In this work, authors highlight the inefficiencies of this approach and propose a new design based on solid state cache (SSC) which has an interface different from SSD to implement a more efficient cache architecture.
2. Problem
Conventional SSDs are designed with persistent-storage goals, but are increasingly used as a cache in front of the disk. However, this approach does not consider the characteristics of caching workloads and thus fails to incorporate optimizations and flexibilities that help caching. SSD-based systems have two layers of address translation: first, disk address to SSD block number, and second, SSD block number to SSD physical address. For use of the SSD as persistent storage the address maps are optimized for dense address spaces, while caching prefers sparse address spaces. Secondly, since clean data in a cache can be evicted freely, SSDs fail to exploit this flexibility in cache management due to their storage-focused design. Finally, the consistency requirements of a cache are very different from what SSDs currently offer and thus need to be managed differently. Overall, SSD-based caching fails miserably to exploit the flexibilities that its use as a cache would allow.
3. Contributions
The biggest contribution of this work is highlighting the fact that the use of SSDs as caches has very different design requirements and choices compared to their use as persistent storage. The authors thus propose a system design, in hardware as well as software, to address the problems highlighted above. FlashTier uses only a single layer of address translation: the SSC accepts the disk address directly and internally maintains a mapping to flash block addresses using an optimized sparse hash map data structure. The most crucial part of their design is enhancing the flash interface to suit the needs of cache management. The interface allows the SSC to distinguish between dirty and clean blocks, and thus makes garbage collection faster by silently evicting clean data without any copying. This also reduces device wear because of the reduced number of block writes. Further, the interface gives the cache manager in the OS more control over cache management. The design supports write-back and write-through caching, each with its own set of optimizations. The authors also formalize the consistency guarantees offered by the SSC. They use logging to make writes persistent without hurting performance as much as SSDs do, and checkpointing to speed up crash recovery. Lastly, the design accommodates various policies (like SE-Util and SE-Merge) for efficient free space management, using silent eviction in addition to normal compaction-based garbage collection, which allows dynamic adjustment to workload requirements.
4. Evaluation
The authors have done a fantastic job of evaluating the design by implementing a cache manager, an SSC emulator, and an SSC timing simulator. They use two write-heavy workloads and two read-heavy workloads. Overall, FlashTier outperforms SSD designs by 65%-167% for write-heavy workloads. For read-heavy ones, the performance is comparable. The authors show that the performance benefits of FlashTier come with a minimal increase in device memory usage. On the other hand, FlashTier shows a significant decrease in host memory usage due to the unified address scheme, to the effect that total combined memory usage is reduced by 60% to 78%. They have also evaluated the cost of the overheads associated with guaranteeing persistence. Persisting data for write-heavy workloads has much less overhead in FlashTier than in native systems: 18-29% in native systems as against 8-16% in FlashTier. For read-heavy benchmarks, the performance degradation is almost negligible. Silent eviction improves garbage collection performance by up to 81%, with almost no degradation in performance from the risk of increased cache misses. Lastly, the benefits to wear management are presented: FlashTier significantly reduces the average number of block writes and the copying overhead. Overall, FlashTier boosts system performance for write-heavy workloads without compromising read-heavy performance.
5. Confusion
The semantics of “evict” are a bit confusing, as the paper seems to contradict itself. Does an “evict” request from the cache manager guarantee that the block is no longer cached, or is it merely a hint, where one can still observe cache hits after an evict?
Some details about the design and usage of out-of-band (OOB) regions would be helpful.

1. Summary
This paper describes a system architecture called FlashTier that is built upon a solid-state cache. It aims to support two usage modes, write-through and write-back, and does so in an SSC simulator in Linux. The system is compared against traditional caching on SSDs.

2. Problem
SSDs are often used as caches in front of cheap, slow disks, which provides the benefits of flash at a lower cost. SSDs, however, have a narrow block interface, internal disk-replacement policies, and no support for in-place writes, which can make them unsuitable as caches. Caches have specific behaviors: 1) cached data doesn't have to be durable; 2) caches store data from a different address space rather than their own; 3) consistency requirements differ. FlashTier explores ways in which an SSC can be integrated into the storage hierarchy.

3. Contributions
FlashTier is a block-level caching system with a cache manager (above the disk device driver, sends requests to the flash device or disk), and the SSC which actually stores the cached data and assists in managing it. There are three basic mechanisms at work:
1) Unified address space: data can be written to the SSC at the disk address, which removes the need for a map between disk and SSD.
2) Cache consistency: on an SSD-backed cache, cache metadata must be stored and durable. Clean and dirty data have their own set of guarantees, with a common factor being the guarantee that stale data will never be returned. Internal SSC metadata is always persistent and recoverable after a crash.
3) Garbage collection: an SSC may just silently evict data instead of copying and erasing. This frees up more space faster.
There are two usage modes. The cache manager receives requests from the block layer and decides when the cache should be used. In write-through (WT) mode, the CM writes the data to the disk and populates the cache. This guarantees that the SSC contains only clean data and is best for read-heavy workloads. In write-back (WB) mode, the cache manager can write to the SSC without updating the disk, resulting in dirty data in the cache. This complicates cache management: when the cache needs more space, the CM has to consult its table of dirty cached blocks and make sure they aren't evicted. This mode is best for write-heavy workloads.
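The two modes above can be sketched as a small dispatch function: write-through always writes the disk and inserts clean data, while write-back writes only the SSC and must track the resulting dirty block. The device objects and method names are assumptions for illustration, not FlashTier's actual code.

```python
# Sketch of the cache manager's write path in its two modes.
# ssc/disk objects and method names are invented for illustration.

def handle_write(mode, lbn, data, ssc, disk, dirty_table):
    if mode == "write-through":
        disk.write(lbn, data)          # disk is always up to date
        ssc.write_clean(lbn, data)     # evictable at the SSC's discretion
    elif mode == "write-back":
        ssc.write_dirty(lbn, data)     # only copy lives in the cache
        dirty_table.add(lbn)           # CM must protect it from eviction

class FakeDev:
    def __init__(self): self.store = {}
    def write(self, lbn, d): self.store[lbn] = d
    write_clean = write_dirty = write  # same effect in this toy model

ssc, disk, dirty = FakeDev(), FakeDev(), set()
handle_write("write-through", 1, b"x", ssc, disk, dirty)
handle_write("write-back", 2, b"y", ssc, disk, dirty)
assert 1 in disk.store and 1 in ssc.store and 1 not in dirty
assert 2 in ssc.store and 2 not in disk.store and 2 in dirty
```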
There are several major points in the system design:
1) Address space management: unifies addresses and reduces translation needs. A sparse hash map is used for caching at the block level, and this map is kept in memory. A fixed portion of the blocks are mapped at 4KB pages and the rest at 256KB blocks. The SSC can also maintain a reverse map to support fast translation for physical addresses when garbage collecting.
2) Free space management: internally, flash is organized as a set of blocks containing pages, and erasure happens only at block granularity. Modern SSDs use logs to help work around this, merging data blocks during garbage collection. In the SSC, blocks can instead be silently evicted without being copied. Only write-clean or explicitly cleaned blocks can be collected this way.
3) Consistent cache interface: the SSC provides six operations: write-dirty, write-clean, read, evict, clean, and exists. Each has its own guarantees: write-dirty guarantees that data is durable before returning, and write-clean guarantees that a later read will return either the data or a not-found error.
4) Persistence: logging, checkpoints, and out-of-band writes are used to persist internal data. Recovery compares the sequence number of the last checkpoint with that of the log, and replays the log on top of the checkpoint as needed.
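Point 1 above mentions a hybrid mapping: part of the device is mapped at 4 KB page granularity and the rest at 256 KB block granularity. A quick back-of-the-envelope sketch shows the memory trade-off this buys; the constants follow the text, everything else is illustrative.

```python
# Hybrid mapping trade-off from the review: per-page mapping is flexible
# but needs one entry per 4 KB; per-block mapping needs one entry per
# 256 KB (64x fewer). Layout and function are illustrative only.

PAGE = 4 * 1024
BLOCK = 256 * 1024
PAGES_PER_BLOCK = BLOCK // PAGE  # 64

def map_entries(page_mapped_bytes, block_mapped_bytes):
    """Number of mapping entries each region of the device needs."""
    return {
        "page_entries": page_mapped_bytes // PAGE,
        "block_entries": block_mapped_bytes // BLOCK,
    }

# Mapping 1 GiB per-page costs 64x the entries of mapping it per-block:
e = map_entries(2**30, 2**30)
assert e["page_entries"] == 64 * e["block_entries"]
```

This is why only a fixed portion of the device gets the fine-grained page map, with the remainder mapped coarsely to keep the in-device mapping structure small.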
4. Evaluation
The main comparison is against traditional SSD caching. For write-intensive workloads, SSDs issue small, frequent metadata writes that increase response time by 24-37%; FlashTier increases it by a smaller amount, 18-32%. Similar improvements can be seen in host memory usage and performance, with the final conclusion that the SSC improves performance on write-intensive workloads, decreasing write amplification and the number of erases needed. Read-intensive workloads see less of an improvement. Overall, the evaluation seems pretty comprehensive in the types of workloads tested and the criteria covered.
5. Confusion
What exactly does traditional caching on an SSD look like? What adaptations would be needed to allow replacement at a page-level scale?

1. Summary
This paper is about FlashTier, a caching system built upon a solid-state cache device with management software at the operating system layer that directs caching. It provides a unified address space and cache consistency guarantees, and silently evicts data during garbage collection, hence improving performance.
2. Problem
FlashTier aims to address the following problems. Firstly, since a cache stores only hot data, it has a sparse address space compared to a disk. Secondly, an SSD provides consistency and crash recovery by using barriers to order requests, but a different mechanism is needed to maintain ordering between requests to multiple devices. Thirdly, caching is a write-intensive task, which can diminish flash cell endurance, especially in the context of garbage collection.
3. Contributions
The two components of FlashTier are the cache manager and the solid-state cache (SSC). FlashTier supports a write-through mode (for read-heavy workloads) and a write-back mode (for write-heavy workloads). With write-through caching, data may be silently evicted; with write-back caching, the evict, clean, and exists operations are used to manage eviction. The cache manager maintains a mapping table indexed by logical block number (LBN). The SSC can be referenced directly by LBNs and exposes a unified address space. The sparse address-space mappings for the cache are maintained in a sparse hash map.

FlashTier provides the following operations for a consistent cache interface: write-dirty, write-clean, read, evict, clean, and exists. The SSC relies on logging, checkpointing, and out-of-band writes to persist internal data. It uses roll-forward recovery to reconstruct the mappings in device memory after a power failure. It is guaranteed that dirty data is durable and not lost in a crash, and that the cache manager can always safely consult the cache.
The silent eviction mechanism is used for garbage collection. Two policies can be used: SE-Util (creates only erased data blocks) and SE-Merge (creates erased blocks usable as data or log blocks).
4. Evaluation
The cost/benefit of the FlashTier design components is compared against traditional caching on SSDs: for write-intensive workloads FlashTier significantly improves write performance, while for read-intensive workloads system performance is nearly identical. It is shown that the cache manager requires no per-block state in memory in write-through mode, so its memory usage is effectively zero, whereas the native cache system uses the same amount of memory in both write-back and write-through modes. For write-intensive workloads, FlashTier's consistency overhead is lower than the native system's due to the lack of synchronous logging for insert and remove operations; performance is similar for read workloads. For write-intensive workloads, the silent eviction garbage collection policies (SSC and SSC-R) outperform the native SSD, because reading and re-writing of data is avoided. FlashTier also improves wear management, as it decreases the number of blocks erased for various operations.

The evaluation section of this paper was particularly good because, for each contribution, a performance comparison for write- and read-intensive applications is presented, which clarifies the environments in which FlashTier can be leveraged for good cache performance.
5. Confusion
What is the out-of-band area of a flash page? Please explain the following concepts with respect to flash storage: flash package, die, plane, block, and page.

Summary
This paper introduces a new system architecture, FlashTier, that provides caching in the storage hierarchy for disks. It is built upon a solid-state cache (SSC), a flash device with an interface designed for caching. The authors implement such a system with a FlashSim simulator simulating the SSC and a software management layer, the cache manager, in the Linux kernel. The system is then evaluated for its performance benefits and the overheads of using one.

Problem
Traditional SSDs, when used as a cache, fall short of caching requirements owing to their design as persistent storage devices. Software implementations of such a caching system were limited by the device's narrow storage interface, multiple levels of address space, and free-space management within the device.

Contributions
a. The paper provides a design for a new flash device, the SSC, with a new set of interfaces and a software management layer targeting the limitations of traditional SSDs used as caches. The SSC stores and manages the cached data, while the cache manager forwards read/write requests to either the SSC or the disk.
b. The multiple levels of addressing are reduced by unifying the address space, with cache-block state split between the cache manager running on the host and the firmware in the SSC. A sparse hash map is maintained in the SSC to translate a logical block number to the actual physical page number on the flash device. Thus, the host does not have to maintain a persistent mapping for the cache, and the OS/cache manager can address the device directly with logical block addresses. Per-block state is maintained for garbage collection and to supply usage statistics to the eviction policies.
c. New interfaces are introduced to facilitate write-back and write-through cache policies and to handle eviction and read/write requests.
d. To maintain persistence, a combination of logging and checkpointing is used. Changes made to the sparse hash map are logged, and to bound the length of the log, checkpoints are taken periodically. On a crash, the log entries after the latest checkpoint are replayed to restore consistency.
e. Silent eviction policies such as SE-Util and SE-Merge are used to evict pages to free space on the device. Such eviction reduces the overhead of copying data during the traditional garbage collection done in SSDs. These policies also reduce the number of erases on the flash device, thus increasing its lifetime.
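The checkpoint-plus-log recovery described in point (d) can be sketched roughly as follows; the record format and function names are invented for illustration and are not the paper's:

```python
# Minimal sketch of roll-forward recovery: start from the last checkpoint
# of the mapping, then replay log records written after the checkpoint.

def recover(checkpoint, log):
    """checkpoint: dict LBN -> PPN snapshot; log: list of (op, lbn, ppn)."""
    mapping = dict(checkpoint)
    for op, lbn, ppn in log:        # roll-forward replay, oldest first
        if op == "insert":
            mapping[lbn] = ppn
        elif op == "remove":
            mapping.pop(lbn, None)
    return mapping

ckpt = {5: 100, 9: 200}
log = [("insert", 7, 300), ("remove", 9, None), ("insert", 5, 101)]
assert recover(ckpt, log) == {5: 101, 7: 300}
```

Checkpointing bounds recovery time because only the log suffix after the latest checkpoint must be replayed, which is why the paper's recovery times are short even for large caches.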

Evaluation
The SSC was implemented on the FlashSim simulator, with the cache manager built as part of the Linux kernel. Two configurations, SSC and SSC-R, were used, based on the SE-Util and SE-Merge eviction policies respectively. The system was analyzed in detail over four real-world I/O workloads: a departmental email server, a file server, and a small data center hosting user home directories and project directories. In terms of IOPS, the FlashTier system outperformed native SSD caches with a performance boost of 38-167% over the different workloads. Due to the unification of address spaces, the FlashTier system showed a significant reduction in host memory usage, at the cost of a slight increase in device memory usage. Recovery times after a crash were significantly reduced owing to the checkpointing and logging system. Garbage-collection times were reduced while device reliability improved through fewer erase operations. Overall, the evaluation was detailed and focused on each of the newly added features, measuring its overheads under a good mix of read- and write-intensive workloads.

Confusion
I am not very clear on the internal structure of a flash device (dies, blocks, planes). It would be good if this gets covered in class.

1. summary
FlashTier rethinks the caching model for SSDs, deploying the flash cache efficiently by eliminating the duplicate mapping information between the SSD and the disk and reducing lookup time. It provides fast garbage collection with silent eviction, fast crash recovery with a log-and-checkpoint policy, and cache-management policies such as write-through and write-back.

2. Problem
Previous caching methods using SSDs sacrifice some memory capacity, because the data structure holding the mapping between flash logical and physical addresses must be stored in memory. In addition, caching requires long cache-warming periods when the data set is large, e.g., at boot. Caches also frequently update data in place, which aggravates flash write wear: merging old clean data with newly arriving data causes frequent garbage collection, and this worsens when the cache runs at full capacity.

3. Contributions
Flash memory suffers from limited write endurance. Most, if not all, SSDs adopt an LFS-like policy to increase lifetime by avoiding in-place writes. In my opinion, this paper takes advantage of that policy to remove a redundant software layer and redundant mapping information from memory. FlashTier can eliminate the additional address translation for the SSD and unify the address space between SSD and disk, because flash memory already maintains a translation in its controller. In addition, crash recovery is straightforward because the SSC adopts a pseudo (or pure) LFS design.
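The LFS-like, out-of-place write behavior mentioned above can be sketched as follows; the class and field names are illustrative assumptions, not any real FTL's interface:

```python
# Sketch of log-structured (out-of-place) flash writes: a rewrite of an
# already-mapped LBN goes to the next free page and the old page is only
# invalidated, deferring reclamation to garbage collection.

class LogStructuredFlash:
    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.next_free = 0
        self.mapping = {}      # LBN -> current PPN
        self.invalid = set()   # superseded pages awaiting erase

    def write(self, lbn):
        if lbn in self.mapping:
            self.invalid.add(self.mapping[lbn])  # never overwrite in place
        ppn = self.next_free
        self.next_free += 1
        self.mapping[lbn] = ppn
        return ppn

flash = LogStructuredFlash(num_pages=64)
flash.write(3)          # first write of LBN 3 lands on page 0
flash.write(3)          # rewrite goes to page 1; page 0 is invalidated
assert flash.mapping[3] == 1 and 0 in flash.invalid
```

This is the controller-level translation the review refers to: since it already exists for wear management, FlashTier reuses it rather than layering a second host-side mapping on top.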
Several policies improve the performance of the SSC as a cache. Silent eviction is introduced to lessen the burden of garbage collection: it turns blocks holding only clean pages into free blocks, so garbage collection does not cause additional write operations. Furthermore, the cache manager can convert space between log and data blocks to utilize capacity wisely and reduce garbage-collection frequency.
FlashTier also provides cache-management policies: write-through for read-heavy workloads and write-back for write-heavy workloads. The write-through policy writes data to both the SSC and the disk simultaneously, while the write-back policy writes data only to the SSC, marking it dirty. The write-back policy therefore needs additional management, such as deciding when to evict dirty blocks and when to write them back to disk.
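The two write paths just described can be sketched in a few lines; `ssc` and `disk` are stand-in dict-like objects, and the function shape is an assumption for illustration, not FlashTier's interface:

```python
# Write-through vs. write-back, as described above: write-through makes
# data durable on disk immediately; write-back updates only the cache and
# must track the block as dirty until it is flushed.

def write(mode, lbn, data, ssc, disk, dirty):
    if mode == "write-through":
        disk[lbn] = data        # durable on disk immediately
        ssc[lbn] = data         # cache holds a clean copy
    elif mode == "write-back":
        ssc[lbn] = data         # only the cache is updated...
        dirty.add(lbn)          # ...so the block must be tracked as dirty

ssc, disk, dirty = {}, {}, set()
write("write-through", 1, "a", ssc, disk, dirty)
write("write-back", 2, "b", ssc, disk, dirty)
assert disk == {1: "a"} and ssc == {1: "a", 2: "b"} and dirty == {2}
```

The asymmetry explains the durability requirement on the SSC: in write-back mode the cache holds the only copy of dirty data, so the device must guarantee it survives a crash.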

4. Evaluation
The evaluation is conducted with three components: the cache manager, an SSC emulator, and an SSC timing simulator. The paper uses the simulator because real SSC hardware is not yet available. The authors evaluated the SSC configurations, including the eviction policies, against an SSD on several applications. The results show that the SSC improves application performance, and that SSC-R (the SSC with variable log and data blocks) with write-back performs best among the options. SSC-R consumes more device memory to maintain the allocation information, but the memory required is not significant. Garbage-collection performance and recovery time both outperform the conventional SSD cache.

5. Confusion
Can you explain what a sparse hash map is and how it is used in this paper?

1. Summary
The paper introduces a new interface to flash-based storage to allow for a better use of flash as a cache of data on a slower disk.
2. Problem
Today, SSDs have become popular as drop-in replacements for disks, but because of their "disk nature," they fail to perform well as caches. Solid-state caches (SSCs) need to be faster and more flexible to perform well, but certain characteristics of SSDs, such as their erase operation and their address mapping, prevent SSDs from serving well as caches.
3. Contributions
Through a new design of the cache manager and a new device interface, the authors propose a better way to use flash as a cache that avoids SSDs' drawbacks. FlashTier uses a sparse hash map to support a sparse mapping, since the LBN represents the disk's block number. It also has an improved free-space management algorithm: data may be evicted from the cache rather than being copied to a new block as it would be in a normal SSD, lowering wear-leveling costs. FlashTier also provides consistency between cache and disk by preventing reads of stale data using atomic operations, and it offers both write-through and write-back options.
In the kernel, a cache manager lies between the block layer and the device drivers to control the cache and enforce its caching policies, possibly based on the workload. It maintains its own data structures to keep track of LRU blocks, dirty blocks in the cache, etc.
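The host-side bookkeeping mentioned above might look like the following sketch; an `OrderedDict` gives LRU ordering for choosing eviction candidates, and a set tracks dirty blocks. The structure and names are assumptions for illustration, not FlashTier's actual data structures:

```python
from collections import OrderedDict

# Sketch of per-cache-manager state: LRU ordering for eviction candidates
# plus a dirty-block set for write-back mode.

class CacheManagerState:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lru = OrderedDict()   # least recently used first
        self.dirty = set()

    def touch(self, lbn):
        """Record an access; return an eviction candidate if over capacity."""
        self.lru[lbn] = True
        self.lru.move_to_end(lbn)  # most recently used moves to the end
        if len(self.lru) > self.capacity:
            victim, _ = self.lru.popitem(last=False)
            return victim          # candidate to evict from the SSC
        return None

state = CacheManagerState(capacity=2)
state.touch(1)
state.touch(2)
assert state.touch(3) == 1         # LBN 1 was least recently used
```

In FlashTier such hints feed the device's eviction policies rather than forcing the host to manage flash space itself.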
4. Evaluation
The authors modified an SSD simulator to implement their SSC design, and modified Facebook's FlashCache to work like their cache manager in the kernel. The goal of this project was to improve SSC performance through a new interface while also providing consistency guarantees. The authors used four real-world workload traces to measure performance under different FlashTier policies and compared them to the original FlashCache. Performance-wise, write-intensive workloads did much better than on the original FlashCache, and FlashTier performed similarly for read-intensive workloads. They also show their design reduced total memory usage by almost 80% for device and host combined. In addition, their consistency policies perform much better than previous designs, especially during recovery. This evaluation seems complete, as two write-intensive and two read-intensive workloads were used. The authors clearly show that in every aspect their design performs better and is more efficient. From the results, they conclude that the SSC with the SE-Merge erase policy performs best, providing low miss rates together with better wear-leveling and consistency.
5. Confusion
Does FlashCache come with policies/mechanisms to deal with many-caches-one-disk or many-caches-many-disks scenarios or is the one-cache-one-disk mapping the most used in industry and the most focused on in research?

1. Summary FlashTier improves the performance of flash-based disk caches by redefining the hardware interface exposed by flash drives intended to be used as caches, altering the division of labor between the cache-control software and the drive. The authors push address-mapping logic into the flash drive to support sparse address spaces, and rather than disk-like persistence, the drive provides cache-style consistency guarantees.

2. Problem DRAM is fast and expensive, while rotational media is slow and cheap; SSDs provide a convenient middle ground, with speed suitable for a cache between conventional disks and RAM. However, SSDs provide an interface suited to persistent storage. SSDs expose a dense address space, necessitating translation between cache and disk addresses; thus, the OS kernel must keep cache state, such as address mappings, in kernel memory. This state must be made persistent with additional SSD writes, or users must endure a long warm-up period after crashes. Additionally, SSDs perform internal cleanup to support flash's erase-then-write behavior. By definition caches store hot data, and garbage collection over cached data has the potential both to slow SSD accesses and to greatly reduce the lifespan of the storage media.

3. Contributions The authors note that the semantics of a correctly functioning cache differ from those of a hard disk. While a disk must persist all writes, a cache's obligation is to never return stale data; a cache is free to lose data, as long as it returns a 'not-present' error. This loosening of the persistence requirement frees the cache to perform management in a way that is more consistent with the performance and longevity properties of flash, as well as allowing the hardware to make internal decisions about data placement and removal. Internally, an SSC uses an optimized sparse hash table to map logical addresses to physical addresses, allowing software to query it with regular disk addresses. The SSC also provides an extended version of the disk interface, which allows the cache manager to perform standard reads and writes, flag data as dirty or clean, and request eviction. Rather than freeing blocks via garbage collection, the cache can simply choose to evict a block to obtain more space, erasing clean pages and writing dirty ones to disk. By doing this, FlashTier eliminates write amplification, improving performance and increasing the lifespan of the hardware. To further improve performance, FlashTier behaves like a log-structured file system, writing to designated log blocks and periodically checkpointing and coalescing data pages. FlashTier also supports write-through and write-back behavior, depending on workload properties. Since cache state is managed and logged internally and is recoverable, FlashTier remains warm across crashes.
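The "never stale, may be absent" contract described above can be made concrete with a small sketch; the `NOT_PRESENT` sentinel and method names are an invented convention, not the paper's interface:

```python
# Sketch of the cache contract: after silent eviction, a read must answer
# 'not present' rather than ever returning stale data. The mapping entry
# and the data are dropped together, so no stale lookup is possible.

NOT_PRESENT = object()

class SSC:
    def __init__(self):
        self.store = {}

    def write_clean(self, lbn, data):
        self.store[lbn] = data

    def silently_evict(self, lbn):
        self.store.pop(lbn, None)  # mapping removed along with the data

    def read(self, lbn):
        # A miss is a correct answer; returning old data would not be.
        return self.store.get(lbn, NOT_PRESENT)

ssc = SSC()
ssc.write_clean(8, "hot")
ssc.silently_evict(8)
assert ssc.read(8) is NOT_PRESENT  # caller falls back to the disk
```

A disk could never do this: losing a written block would be data loss. For a cache it is merely a miss, which is exactly the freedom silent eviction exploits.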

4. Evaluation As it is infeasible to obtain actual FlashTier hardware, the authors evaluate a simulated version of the hardware on several real-world workloads. They show that both write-back and write-through caches outperform a standard SSD cache model by as much as 2.5x on write-dominated workloads while improving longevity, and that FlashTier maintains performance parity with a standard SSD cache on read-heavy workloads. FlashTier also exhibits an order-of-magnitude improvement in RAM usage across all experiments, but may use as much as three times more solid-state device space than a standard SSD-based cache. I found these experiments convincing, as they created a realistic model and used real workloads that illustrate meaningful performance characteristics.

5. Confusion It would be nice to have an overview of the components and the read/write/erase granularity of flash. In particular, I am unfamiliar with the notion of a "plane" in a flash device and its access characteristics.

Summary
The paper describes the FlashTier system, which consists of two components: a solid-state cache (SSC), a flash device specifically designed for caching, and a cache manager that manages caching at the OS layer.

Problem
High-speed SSDs are often deployed as a cache in front of high-capacity disk storage to improve storage performance. However, an SSD cache is hindered by the narrow block interface and internal block management of an SSD designed to serve as a disk replacement. This leads to three inefficiencies with an SSD block cache:
1. Address space management - There are two levels of indirection for address translation: one mapping stored in DRAM (disk to SSD address) and another in the FTL (logical to physical address).
2. Consistency and durability - Ensuring consistency and durability across crash failures can be complicated and time consuming.
3. Free space management - Garbage collection results in low write performance and low endurance for caching, since a cache works at full capacity, demanding more garbage collection.

Contribution
There are two main contributions in this paper: the SSC, a flash device designed for caching, and the design of a cache manager that leverages the SSC for better performance. The SSC provides a unified cache address space, a consistent cache interface, and cache-aware free-space management. The cache manager directly addresses flash blocks in the SSC by disk LBN. The SSC optimizes for a sparse address space using a sparse hash map, which ensures a low memory footprint in SSC memory. The consistent cache interface persists cached data across system reboots or crashes and never returns stale data due to inconsistent mappings; thus, it is always safe for the cache manager to consult the SSC even after a crash. The SSC employs silent eviction, losing clean data rather than copying it, by integrating cache replacement with garbage collection. There are two policies for selecting victim blocks for eviction: SE-Util, which selects the erase blocks with the smallest number of valid pages, and SE-Merge, which uses the same selection policy but allows the erased block to be used for either data or logging.
The cache manager supports either write-back or write-through mode. In write-through mode, the cache manager writes data to the disk and populates the cache either on read requests or at the same time as writing to disk. In write-back mode, the cache manager may write to the SSC without updating the disk.

Evaluation
The FlashTier implementation consists of the cache manager, an SSC emulator (providing support for the SSD controller and the NAND-flash hierarchy), and an SSC timing simulator. Evaluation is done on four real-world traces: a file-server workload, a mail-server workload, a project-directories workload, and a user-home-directories workload. The authors run experiments for both write-back and write-through caching modes. FlashTier improves cache performance for write-intensive workloads by 101-168% in write-back mode and 65-102% in write-through mode with the SE-Merge policy. FlashTier performs as well as an SSD block cache despite silent evictions. FlashTier provides a 78% and 60% reduction in total memory usage for device and host combined under the SE-Util and SE-Merge policies respectively. The cost of consistency guarantees is just 8-16% for SE-Util and SE-Merge, compared to 18-29% for a native SSD. Recovery times for the SSC improve drastically over a native SSD, ranging from 34 ms to 2.4 seconds across the workloads. From the silent-eviction perspective, SE-Util outperforms the native SSD by 34-52% and SE-Merge by 71-83%; this improvement is largely due to garbage collection avoiding reading and rewriting data. Silent eviction also improves reliability, as it decreases the number of blocks erased for merge operations. SE-Util and SE-Merge reduce the total number of erases by an average of 26% and 35%, and the overhead of copying valid pages by an average of 32% and 52%, respectively. However, for read workloads SE-Util increases the number of erases by 5%, as evicted data must be brought back in and rewritten.
Overall, the evaluation covers both read- and write-intensive workloads and properly discusses the behavior of the SSC relative to the SSD. However, it would have been interesting to see how a read-modify-write (RMW) workload would look from a performance perspective. I think the write-through policy would perform better in such a case, as the cache absorbs the write itself, thereby improving performance.

Confusion
1. I would like a discussion of how a generational garbage-collection technique would work here, since we don't want to keep choosing erase blocks that hold hot data.
2. How does the idea of the SSC apply to other, faster NVM devices? Is any architectural/design change needed to support other NVM devices as a cache?
