SummaryStore
SummaryStore is an approximate time-series store designed for
analytics. It can store large volumes of time-series data (∼1 petabyte
per node) while preserving high query accuracy and enabling near
real-time querying at unprecedented cost savings. SummaryStore
contributes time-decayed summaries, a novel abstraction for summarizing
data streams, and returns reliable error estimates alongside its
approximate answers. It supports a range of machine learning and
analytical workloads such as forecasting, anomaly detection, and
traffic monitoring.
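To make the idea of a time-decayed summary concrete, here is a minimal Python sketch. It is not SummaryStore's implementation or API: the names Window, DecayedSumStore, and query_sum, the doubling merge rule, and the simple lower/upper bounds (which hold only for non-negative values) are all assumptions made for illustration. The sketch keeps recent data in fine-grained windows, merges older data into progressively coarser ones, and answers a range-sum query with an estimate plus bounds.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Aggregate over a contiguous run of stream elements (hypothetical layout)."""
    t_start: int   # timestamp of the oldest element covered
    t_end: int     # timestamp of the newest element covered
    count: int     # number of elements covered
    total: float   # sum of the covered values


class DecayedSumStore:
    """Toy time-decayed store: the older the data, the coarser its window."""

    def __init__(self):
        self.windows = []  # ordered oldest -> newest

    def append(self, t, value):
        """Add one element; timestamps are assumed to arrive in increasing order."""
        self.windows.append(Window(t, t, 1, value))
        i = len(self.windows) - 1
        # Whenever three adjacent windows hold the same number of elements,
        # merge the two older ones, then re-check at the coarser level.
        while i >= 2:
            a, b, c = self.windows[i - 2:i + 1]
            if not (a.count == b.count == c.count):
                break
            self.windows[i - 2:i] = [
                Window(a.t_start, b.t_end, a.count + b.count, a.total + b.total)
            ]
            i -= 2

    def query_sum(self, t0, t1):
        """Return (estimate, lower, upper) for the sum over [t0, t1].

        The bounds hold only for non-negative values. The estimate assumes
        values are spread evenly over each window's integer timestamps.
        """
        estimate = lower = upper = 0.0
        for w in self.windows:
            if w.t_end < t0 or w.t_start > t1:
                continue  # no overlap with the query range
            upper += w.total
            if t0 <= w.t_start and w.t_end <= t1:
                lower += w.total      # fully covered: contributes exactly
                estimate += w.total
            else:
                overlap = min(w.t_end, t1) - max(w.t_start, t0) + 1
                estimate += w.total * overlap / (w.t_end - w.t_start + 1)
        return estimate, lower, upper


store = DecayedSumStore()
for t in range(8):
    store.append(t, 1.0)
print([w.count for w in store.windows])  # [4, 2, 1, 1]
print(store.query_sum(2, 7))             # (6.0, 4.0, 8.0)
```

In this toy version, eight elements occupy four windows, and a query that cuts through the oldest window gets an interpolated answer bracketed by a lower and an upper bound. Queries over recent data touch only fine-grained windows and are answered exactly.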
News
SummaryStore to appear at SOSP '17.
Position paper on building highly-available geo-distributed stores for ML applications to appear at SysML, co-located with NIPS '18.
Position paper on the role of approximate storage systems in machine learning to appear at AISys, co-located with SOSP '17.
Source code: available on GitHub under the Apache 2.0 License
Team Members
Nitin Agrawal (contact)
Ashish Vulimiri
Publications
Low-Latency Analytics on Colossal Data Streams with SummaryStore
Nitin Agrawal, Ashish Vulimiri
Proceedings of the 26th Symposium on Operating Systems Principles (SOSP '17), Shanghai, China, October 2017.
Learning with Less: Can Approximate Storage Systems Save Learning From Drowning in Data?
Nitin Agrawal, Ashish Vulimiri
Workshop on AI Systems at Symposium on Operating Systems Principles (SOSP), Shanghai, China, October 28, 2017.
Building Highly-Available Geo-Distributed Datastores for Continuous Learning
Nitin Agrawal, Ashish Vulimiri
Workshop on Systems for ML at NIPS '18, Montreal, Canada, December 7, 2018.