Dec 11, 2017 9:30AM-10:45AM, 1257 CS
9:30-9:45 AM |
Adding Global Indices in Quickstep
Lokananda Dhage Munisamappa and Om Jadhav
Quickstep is a new data processing platform being developed at the
University of Wisconsin. It currently has block-based storage with support
for indices that are local to a block; i.e., there are no traditional global
indices. In this talk we describe our design and implementation of a global
index in Quickstep. |
9:45-10:00 AM |
Building a Suffix Tree/Array index on string attributes in Quickstep
The goal of this project is to study suffix trees and suffix arrays, compare
them with respect to time and space efficiency, and build an index-based
search for the string attributes in Quickstep. A suffix tree is a data
structure that represents all the suffixes of a given string, whereas a
suffix array is a sorted array of the suffixes of the text. Because these
data structures are very powerful, they have numerous applications, such as
string matching and finding the longest common substring or subsequence of
two strings. Our aim is to use them to implement an efficient search for
string patterns in Quickstep, and in this talk we describe our initial
approach. |
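The suffix array described in this abstract can be sketched in a few lines. This is only an illustration, not the project's implementation: the function names `build_suffix_array` and `find_occurrences` are hypothetical, and the construction is the naive sort of all suffixes rather than one of the linear-time algorithms a real index would use.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text, in sorted order.

    Naive construction (sorts the suffixes directly); chosen for clarity,
    not speed.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return every position where pattern occurs, using binary search on sa."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Suffixes in sa[start:lo] all begin with pattern.
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
matches = find_occurrences(text, sa, "ana")
```

Each lookup costs O(m log n) character comparisons for a pattern of length m, which is the property that makes a suffix array attractive as a substring index.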
10:00-10:15 AM |
Decomposing the Monolithic Database
Dennis Zhou
Databases today adhere to a monolithic process model where all services are
consolidated. While this makes sense for box product databases, this model
is problematic at scale where cloud providers must keep these services
online. Furthermore, when partnering with a cloud provider that also owns
the software, there is the opportunity to roll out both bug fixes and
performance improvements. In this talk, I will discuss decomposing the
monolithic model and the benefits of such a model in the cloud. |
10:15-10:30 AM |
Revisiting query scheduling policies for Lambda Functions
Karan Bavishi
Lambda Functions have been gaining a lot of traction recently as a way of
deploying applications. They promise reduced costs because the user is
charged only for the actual execution time and not for idle capacity.
Lambdas also have a deadline associated with them, equal to the configured
timeout. Current query scheduling algorithms such as FIFO can result in
higher costs, especially for short queries. We explore an alternative policy
based on Earliest Deadline First, which tries to exploit buffer pool locality
by comparing predicates. Results show that the median job response time
improves by about 5.6X, but the 99th-percentile response time is 2.7X worse. |
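To make the contrast between FIFO and Earliest Deadline First concrete, here is a minimal sketch of plain EDF ordering on a single worker. It does not model the predicate comparison or buffer pool locality mentioned in the abstract, and the `Query` class, its fields, and the sample numbers are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Query:
    deadline: float                         # only field used for ordering
    name: str = field(compare=False)
    run_time: float = field(compare=False)


def edf_schedule(queries):
    """Return the queries ordered by earliest deadline first."""
    heap = list(queries)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def count_missed(schedule):
    """Run the schedule back to back and count queries that finish late."""
    clock, missed = 0.0, 0
    for q in schedule:
        clock += q.run_time
        if clock > q.deadline:
            missed += 1
    return missed


# Arrival (FIFO) order: one long query followed by two short ones.
arrivals = [
    Query(deadline=30.0, name="a", run_time=10.0),
    Query(deadline=5.0, name="b", run_time=2.0),
    Query(deadline=12.0, name="c", run_time=3.0),
]

fifo_missed = count_missed(arrivals)
edf_missed = count_missed(edf_schedule(arrivals))
```

In this made-up workload the short queries miss their deadlines under FIFO because they wait behind the long one, while EDF runs them first.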
10:30-10:45 AM |
Efficient Storage and Retrieval of JSON objects in relational systems
Saranya Baskaran and Palaniappan Nagarajan
There has been an explosion of RESTful services that use the JSON format for
exchanging data, including the use of NoSQL storage for such data. Although
these NoSQL stores have a few significant advantages in storing
semi-structured data, they lack the advantages of a traditional RDBMS. Our
proposal is an efficient way to store and manipulate such
semi-structured/unstructured JSON data on top of a relational/SQL-based
system. We present our approach and point to our plans to conduct a detailed
evaluation. |
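One common way to hold JSON in a relational system is to shred each document into (document, path, value) rows. The sketch below shows that generic mapping with Python's built-in `sqlite3` module; it is not necessarily the approach taken in this talk, and the table name `json_paths`, the path syntax, and the helper names are assumptions made for illustration.

```python
import json
import sqlite3


def flatten(value, prefix="$"):
    """Yield (path, json_encoded_leaf) pairs for every leaf in a JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        yield prefix, json.dumps(value)


def store(conn, doc_id, document):
    """Shred one JSON document (given as text) into path/value rows."""
    rows = [(doc_id, path, leaf) for path, leaf in flatten(json.loads(document))]
    conn.executemany(
        "INSERT INTO json_paths (doc_id, path, value) VALUES (?, ?, ?)", rows
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE json_paths (doc_id INTEGER, path TEXT, value TEXT)")
# An ordinary B-tree index makes path/value lookups cheap.
conn.execute("CREATE INDEX idx_path_value ON json_paths (path, value)")

store(conn, 1, '{"user": {"name": "ada", "tags": ["db", "json"]}}')
store(conn, 2, '{"user": {"name": "bob", "tags": []}}')

rows = conn.execute(
    "SELECT doc_id FROM json_paths WHERE path = ? AND value = ?",
    ("$.user.name", json.dumps("ada")),
).fetchall()
```

The trade-off is that point lookups on a path become plain indexed SQL queries, while reassembling a whole document requires gathering all of its rows.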
Dec 13, 2017 9:30AM-10:45AM, 1257 CS
9:30-9:45 AM |
Adding Fine-grained Concurrency control in Quickstep
Quickstep is a new data processing kernel that is being developed in the
Computer Sciences Department at the University of Wisconsin. Currently, the
system has a basic DB-level locking scheme, where the entire database gets
locked for a transaction. This project aims to modify the lock manager in
Quickstep to use intention locks at a finer granularity and, potentially, to
extend it to perform lock escalation when necessary. In this talk, we will
describe our work on implementing a Gray-style lock manager in Quickstep. |
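For readers unfamiliar with Gray-style multiple-granularity locking, the sketch below encodes the standard compatibility matrix for the IS, IX, S, SIX, and X modes and shows which locks a request takes along a hierarchy. The function names and the three-level `db`/`table`/`tuple` hierarchy are hypothetical; this is not Quickstep's lock manager.

```python
# For each requested mode, the set of held modes it is compatible with.
COMPATIBLE = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}


def can_grant(requested, held_modes):
    """A lock is grantable only if it is compatible with every held mode."""
    return all(held in COMPATIBLE[requested] for held in held_modes)


def lock_plan(path, mode):
    """List the locks needed to take mode ('S' or 'X') on the last node of path.

    Every ancestor gets the matching intention mode: IS for S, IX for X.
    """
    intention = "IS" if mode == "S" else "IX"
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]


plan = lock_plan(["db", "table", "tuple"], "X")
```

Because two writers of different tuples both hold only IX on the table, they no longer block each other the way a database-level lock would.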
9:45-10:00 AM |
Pre-Execution Concurrency Control Using Predicate Logic
To achieve full serializability, we propose a mechanism that enforces a
serialized transaction order prior to query execution. To do this, we analyze
the theoretical tuples (based on the SQL request) that a query could possibly
touch before beginning query execution. This allows the database system to
admit only those queries that are known not to conflict with other active
queries. Compared to other locking mechanisms, this approach reduces runtime
system complexity since isolation is handled completely outside of query
execution. |
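The admission idea can be illustrated with a deliberately simple predicate language: conjunctions of inclusive ranges on named columns. Everything in this sketch (the `Request` class, `may_conflict`, `admit`, and the restriction to range predicates) is an assumption made for illustration; the talk's actual analysis of SQL predicates may differ.

```python
from dataclasses import dataclass


@dataclass
class Request:
    table: str
    ranges: dict   # column -> (low, high), inclusive; absent column = unconstrained
    writes: bool


def may_conflict(a, b):
    """Conservatively decide whether two requests could touch a common tuple."""
    if a.table != b.table:
        return False
    if not (a.writes or b.writes):
        return False  # two readers never conflict
    for column in a.ranges.keys() & b.ranges.keys():
        low_a, high_a = a.ranges[column]
        low_b, high_b = b.ranges[column]
        if high_a < low_b or high_b < low_a:
            return False  # disjoint on one column: no tuple satisfies both
    return True


def admit(new_request, active_requests):
    """Admit a request only if it cannot conflict with any active request."""
    return not any(may_conflict(new_request, r) for r in active_requests)


active = [Request("orders", {"id": (1, 10)}, writes=True)]
disjoint_writer = Request("orders", {"id": (11, 20)}, writes=True)
overlapping_reader = Request("orders", {"id": (5, 15)}, writes=False)
```

The check is conservative: it may reject requests that would not actually conflict at run time, but it never admits a pair whose predicates can select the same tuple.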
10:00-10:15 AM |
Integration of BitWeaving and Quickstep
Modern servers have a large amount of compute power, and optimizing
single-box performance is important. In this talk, we will introduce our work
to integrate BitWeaving into the current open-source version of Quickstep,
with the goal of exploiting the full potential of a single machine for
scans. We present our implementation and selected benchmark results. |
10:15-10:30 AM |
Comparing Different Join Algorithms: Partitioning vs Non-partitioning
Philip Martinkus and Zubeyr Eryilmaz
We have compared several in-memory hash-based join algorithms on a modern
multicore system. We have focused on the two best algorithms: radix
partitioning and no partitioning. Our results show that each algorithm has
its own strengths and weaknesses. For uniform data, radix partitioning is
faster, and for skewed data, no partitioning performs better. |
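The two join strategies compared in this abstract differ only in whether the inputs are split by key bits before hashing. The sketch below shows both in single-threaded Python under simplifying assumptions: rows are `(key, payload)` tuples, partitioning is a single pass on the low bits of the key's hash, and the function names are hypothetical. It illustrates the logic only and says nothing about the cache behavior that drives the measured results.

```python
from collections import defaultdict


def no_partition_join(build, probe):
    """Build one hash table over the whole build side, then probe it."""
    table = defaultdict(list)
    for key, payload in build:
        table[key].append(payload)
    return [(key, b, p) for key, p in probe for b in table.get(key, ())]


def radix_join(build, probe, bits=2):
    """Split both inputs on the low bits of the key hash, then join pairwise."""
    mask = (1 << bits) - 1

    def partition(rows):
        parts = [[] for _ in range(1 << bits)]
        for row in rows:
            parts[hash(row[0]) & mask].append(row)
        return parts

    result = []
    # Matching keys always land in the same partition number on both sides.
    for build_part, probe_part in zip(partition(build), partition(probe)):
        result.extend(no_partition_join(build_part, probe_part))
    return result


build_rows = [(1, "a"), (2, "b"), (2, "c")]
probe_rows = [(2, "x"), (3, "y"), (1, "z")]
plain = no_partition_join(build_rows, probe_rows)
radix = radix_join(build_rows, probe_rows)
```

Both functions return the same set of joined rows; only the order of the output may differ.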
10:30-10:45 AM |
Fusing Multiple Hash Join Operators
Dylan Bacon and Aarati Kakaraparthy
We will discuss the inefficiencies in how Quickstep processes multiple
consecutive equijoin hash joins. We will present our solution, which fuses
multiple operators into a single operator to save on processing and data
streaming between operators, along with the challenges and results that we
have seen along the way. Future extensions of the new operators will also be
discussed. |
If you are looking for the CS 764 course home page, click here.