SummaryStore is an approximate time-series store designed for analytics. It can store large volumes of time-series data (roughly a petabyte per node) while preserving a high degree of query accuracy and enabling near-real-time querying at unprecedented cost savings. SummaryStore contributes time-decayed summaries, a novel abstraction for summarizing data streams, and returns reliable error estimates alongside its approximate answers. It supports a range of machine learning and analytical workloads such as forecasting, anomaly detection, and traffic monitoring.



SummaryStore to appear at SOSP '17.
Position paper on building highly-available geo-distributed stores for ML applications to appear at SysML, co-located with NIPS '18.
Position paper on the role of approximate storage systems in machine learning to appear at AISys, co-located with SOSP '17.
Source code: available on GitHub under the Apache 2.0 License

Team Members

Nitin Agrawal (contact)
Ashish Vulimiri


Low-Latency Analytics on Colossal Data Streams with SummaryStore
Nitin Agrawal, Ashish Vulimiri
Proceedings of the 26th Symposium on Operating Systems Principles (SOSP '17), Shanghai, China, October 2017.

Learning with Less: Can Approximate Storage Systems Save Learning From Drowning in Data?
Nitin Agrawal, Ashish Vulimiri
Workshop on AI Systems at the Symposium on Operating Systems Principles (SOSP), Shanghai, China, October 28, 2017.

Building Highly-Available Geo-Distributed Datastores for Continuous Learning
Nitin Agrawal, Ashish Vulimiri
Workshop on Systems for ML at NIPS '18, Montreal, Canada, December 7, 2018.