SummaryStore

SummaryStore is an approximate time-series store designed for analytics. It can hold large volumes of time-series data (roughly a petabyte per node) while preserving high query accuracy and supporting near-real-time queries at significant cost savings. SummaryStore contributes time-decayed summaries, a novel abstraction for summarizing data streams, and returns reliable error estimates alongside its approximate answers. It supports a range of machine learning and analytical workloads, such as forecasting, anomaly detection, and traffic monitoring.
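
To give a feel for what a time-decayed summary with an error estimate might look like, here is a minimal Python sketch. It is not SummaryStore's implementation or API: it swaps in exponential-histogram-style merging as a stand-in decay scheme, and the names DecayedStore, Window, max_per_size, and query_sum are hypothetical. It assumes integer timestamps appended in increasing order and non-negative values, and the "bound" it returns is a crude worst-case figure, not the statistical error estimate the system itself reports.

```python
from dataclasses import dataclass


@dataclass
class Window:
    start: int    # first timestamp covered (inclusive)
    end: int      # last timestamp covered (inclusive)
    count: int    # number of raw values summarized
    total: float  # sum of those values


class DecayedStore:
    """Toy store: recent data stays fine-grained, older data is merged
    into progressively larger windows (hypothetical illustration)."""

    def __init__(self, max_per_size=2):
        self.max_per_size = max_per_size
        self.windows = []  # ordered oldest to newest

    def append(self, ts, value):
        self.windows.append(Window(ts, ts, 1, float(value)))
        self._compact()

    def _compact(self):
        # Whenever too many windows share a size, merge the two oldest
        # of that size into one window of twice the size, then re-check
        # the next size up.
        size = 1
        while True:
            idx = [i for i, w in enumerate(self.windows) if w.count == size]
            if len(idx) <= self.max_per_size:
                break
            a, b = idx[0], idx[1]  # adjacent, since sizes shrink with recency
            wa, wb = self.windows[a], self.windows[b]
            merged = Window(wa.start, wb.end,
                            wa.count + wb.count, wa.total + wb.total)
            self.windows[a:b + 1] = [merged]
            size *= 2

    def query_sum(self, t0, t1):
        """Estimate the sum of values with t0 <= ts <= t1.

        Returns (estimate, bound). Partially covered windows are
        interpolated as if values were spread evenly over the window;
        for non-negative values the true sum is within `bound` of
        the estimate.
        """
        estimate, bound = 0.0, 0.0
        for w in self.windows:
            overlap = min(w.end, t1) - max(w.start, t0) + 1
            if overlap <= 0:
                continue
            span = w.end - w.start + 1
            if overlap == span:
                estimate += w.total  # fully covered: exact contribution
            else:
                frac = overlap / span
                estimate += frac * w.total
                bound += max(frac, 1.0 - frac) * w.total
        return estimate, bound


store = DecayedStore(max_per_size=2)
for t in range(100):
    store.append(t, 1.0)
print(len(store.windows), store.query_sum(90, 99))
```

In this sketch the number of windows grows only logarithmically with the number of appended values, which is the intuition behind trading precision on old data for storage savings. Queries over recent data touch small windows and come back with tight bounds, while queries over old data are answered from coarse windows with wider bounds.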

News

SummaryStore to appear at SOSP '17.
Position paper on building highly-available geo-distributed stores for ML applications to appear at SysML, co-located with NIPS '18.
Position paper on the role of approximate storage systems in machine learning to appear at AISys, co-located with SOSP '17.
Source code is available on GitHub under the Apache 2.0 License.

Team Members

Nitin Agrawal (contact)
Ashish Vulimiri

Publications

Low-Latency Analytics on Colossal Data Streams with SummaryStore
Nitin Agrawal, Ashish Vulimiri
Proceedings of the 26th Symposium on Operating Systems Principles (SOSP '17), Shanghai, China, October 2017.
[PDF]
 
Learning with Less: Can Approximate Storage Systems Save Learning From Drowning in Data?
Nitin Agrawal, Ashish Vulimiri
Workshop on AI Systems at the Symposium on Operating Systems Principles (SOSP), Shanghai, China, October 28, 2017.
[PDF]

Building Highly-Available Geo-Distributed Datastores for Continuous Learning
Nitin Agrawal, Ashish Vulimiri
Workshop on Systems for ML at NIPS '18, Montreal, Canada, December 7, 2018.
[PDF]