Zheguang Samuel Zhao
Email: zzhao6 [at] wisc.edu
445 Henry Mall Room 518
Madison, WI 53706
- I am a student at University of Wisconsin at Madison. I will receive Bachelor of Science Degree in Computer Science in May 2012.
- I grew up in Canton, China, where the Founding Father, Dr. Sun Yat-Sen, was born and later initiated the revolution to establish the first republic government in China.
- When I am not with computers, I am an athlete in heart. I love running and soccer. I played in a soccer team called "ShockWave" in the intramural soccer league. I usually run 6-10 miles in the afternoon by Lake Monona.
My interests are databases and systems.
I am involved in "QuickStep", a database research group led by Professor Jignesh Patel.
QuickStep is an ongoing research project in the Database Systems Group. It aims to build an in-memory database engine for multi-core, large memory environment.
Under the guidance of Professor Jignesh Patel, I have been working on the following projects within the scope of QuickStep. My works are inspired by the Data Morphing Technique.
- QuickStep Storage
- Guided by Craig Chasseur, I am working on the storage system in QuickStep. The storage module eventually will support a variety of different storage types, e.g. RowStore, ColumnStore, Data Morphing, Hash Index, Cache-Sensitive B+tree, R-tree, etc. My short-term goal is to migrate my CSB-tree implementation to QuickStep and research on its performance.
- Cache Sensitive B+Tree (CSB-Tree)
- This work is inspired by R. A. Hankins and J. M. Patel's paper and J. Rao and K. A. Ross' paper.
In particular, I implemented the Full CSB-Tree running on an independent sandbox, and examined the effect of node sizes on the cache performance. Without going into too much details here,
my main discovery was:
- CSB-Tree insertion achieves the best cache performance when the node size is set to the range of a few times (3~6x) the cache line size, but not larger.
- CSB-Tree search generally performs better for larger cache line sizes.
- Bulkloaded CSB-Trees have significant improvement on search than CSB-trees built by random insertion.
- When the node size is much larger than several times the cache line size, the major player to shoot up the cpu cycles and cache misses is the group moving operation.
- On a 64-bit architecture, the advantage of CSB-Tree over the traditional B-Tree is enlarged due to CSB-Tree's capability to pack more keys than the B-Tree in an index node (3x than on a 32-bit machine). The CSB-tree runs about 2x faster than the B-Tree in my model (experiments in collaboration with Yinan Li on his implementation of B-tree with special tuning).
- Concurrency Control on Cache Sensitive B+Tree as Compared to Hash with New Partition Techniques
- The lack of examination of concurrency control on CSB-Tree in the literature prompts me to look at the protocol design. One interesting area to look at is how multi-threads in one transaction and multi-threads in multi-transactions can fit together on in-memory data structure in the context of preserving ACID properties.
- One interesting algorithm I am implementing is a protocol where readers does not acquire latches, and only updaters acquire latches in a short period. This protocol is inpired by S. K. Cha et al's paper.
- - After this implmentation, I will look at a latch-free protocal with tree partition techniques.
- I am also kindly guided by Yinan Li on this project.
I love spending my spare time hacking on some of my random ideas as follows.
- Instead of Google+ or like an entire web page, you want to just keep some parts of it that intersts you. WebClip is a new type of web browsing tool that allows you to quickly clip out parts of the web page. A legitimate "part" of a web page ranges from a phrase, a paragraph, to pictures and video. HTML5 will be the heart of WebClip.
Classes I enjoyed the most:
- Google Papers
- Database Storage
- Computer Architecture
- Papers on C++ Atomic Types and Operations