My current research interests lie in data visualization and its applications to various problem domains, especially the use of visual summaries to understand and navigate large datasets. Michael Gleicher (graphics) is my advisor through this endeavor!
My graduate work so far has involved visualizing the binding interfaces of proteins to ligand substrates; visualizing overviews of large datasets in WebGL to support identification and comparison tasks; and visualizing co-occurrences of mutations on virus genomes. Currently, I am working toward both a descriptive and a prescriptive understanding of the design of visual summaries, looking at how visual summaries are designed with respect to characteristics of the data and the analyst's task.
I am on the job market. Feel free to contact me about any opportunities, whether you are from the academic, industry, or government world. You can access my résumé (short CV) or CV through their respective links. Statements on research, teaching, and diversity and inclusion, as well as references, are available on request.
I did my undergraduate work at the University of Washington in Seattle. In five years there, I earned bachelor's degrees in both Computer Science and Chemistry (ACS certification). My primary research advisor was Dave Bacon (currently at Google) in the area of quantum computation. Before coming to Madison, I spent two years at Microsoft working on their telemetry-gathering systems. Yes, sending your crash reports gets those kernel crashes fixed!
Scatterplots are among the most common methods for exploring and presenting data. We tease apart scatterplot design decisions by focusing on the trade-offs among task affordances. In this work, we synthesize recent work to derive twelve abstracted scatterplots that help to formulate a basis for prescriptive scatterplot design.
In the larger picture, we are looking at how we can take both task and data characteristics into account for prescriptive scatterplot design. This involves collecting the full set of scatterplot design decisions and their individual affordances. As a result of this work, we expect to identify areas of the scatterplot design space that are under-explored, and provide guidance for effective scatterplot design, given the required affordances and relevant characteristics of the data.
Log event data are becoming increasingly prevalent in analysis scenarios. In this work, we identify several common patterns of noise that appear in telemetry or machine log data. We then propose multiple methods for pre-processing this data and describe how these methods may integrate into the analysis workflow to help focus the analysis. We envision that these methods could be used in an iterative fashion in a future visual analytics tool and become a valuable provenance component of such visualizations.
Our exploration of the co-occurrences of mutations in populations of viral genomes led us to propose several designs for the rapid identification of these co-occurrences. Identifying these co-occurrences is a critical step in targeting potential therapies for fast-mutating viruses, especially those in the class of RNA viruses. Co-occurrences of mutations can indicate that the virus is successfully adapting to outside pressures, such as the host's immune system. We present a design study that looks at the problem of identifying potentially interesting co-occurrences of events (in this case, mutations), and show both a negative example (MatrixViewer) and a positive example (CooccurViewer) that helps analysts identify interesting co-occurrences.
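To make the notion of a co-occurrence concrete, here is a minimal sketch of counting how often two mutated positions appear together in the same sequencing read. It is only an illustration under stated assumptions, not the computation behind MatrixViewer or CooccurViewer: the function name `cooccurrence_counts`, the dictionary-based read format, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(reads, reference):
    """Count pairs of mutated positions that appear together in one read.

    Assumed (hypothetical) input format:
      reads     -- list of dicts mapping genome position -> observed base
      reference -- dict mapping genome position -> reference base
    """
    pair_counts = Counter()
    for read in reads:
        # A position counts as mutated when the read disagrees with the reference.
        mutated = sorted(
            pos for pos, base in read.items()
            if pos in reference and reference[pos] != base
        )
        # Every unordered pair of mutated positions in this read co-occurs once.
        for a, b in combinations(mutated, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Toy data, invented purely for illustration.
reference = {1: "A", 2: "C", 3: "G"}
reads = [
    {1: "T", 2: "G", 3: "G"},  # mutations at positions 1 and 2
    {1: "T", 2: "G"},          # mutations at positions 1 and 2
    {1: "A", 2: "G", 3: "T"},  # mutations at positions 2 and 3
]
counts = cooccurrence_counts(reads, reference)
```

Raw counts like these grow quadratically with the number of variable positions, which is one reason a visual summary is useful for finding the interesting pairs.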
This project implements the splatterplot paradigm in WebGL so that arbitrary data can be loaded into a widely accessible prototype. A splatterplot supports higher-level judgments of data than a conventional scatterplot: it minimizes overdraw by showing dense regions as closed, KDE-blurred shapes, while preserving potentially interesting features such as outliers. This methodology scales well in the browser on modern GPUs, and can support comparisons between multiple data series.
Our preliminary publication submitted to the DSIA workshop discusses the potential benefits of utilizing WebGL and HTML5 binary scaffolds for supporting interactive visualizations and summarizations of large amounts of data in the browser.
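The core splatterplot idea described above (estimate density, draw dense areas as regions, keep the remaining points as outliers) can be sketched on the CPU as follows. This is a simplified illustration under stated assumptions, not the WebGL implementation: the function names, the integer grid, the unnormalized Gaussian kernel, and the fixed threshold are all hypothetical choices, and the subsampling of outliers is omitted.

```python
import math


def gaussian_kde_grid(points, width, height, bandwidth):
    """Evaluate an unnormalized Gaussian KDE at every integer grid cell."""
    grid = [[0.0] * width for _ in range(height)]
    for gy in range(height):
        for gx in range(width):
            total = 0.0
            for px, py in points:
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                total += math.exp(-d2 / (2 * bandwidth ** 2))
            grid[gy][gx] = total
    return grid


def split_dense_and_outliers(points, grid, threshold):
    """Return the set of dense cells and the points that fall outside them."""
    dense_cells = {
        (gx, gy)
        for gy, row in enumerate(grid)
        for gx, value in enumerate(row)
        if value >= threshold
    }
    # Points whose nearest grid cell is not dense are kept as individual marks.
    outliers = [
        p for p in points
        if (round(p[0]), round(p[1])) not in dense_cells
    ]
    return dense_cells, outliers


# Toy data: a tight cluster near (2, 2) plus one isolated point.
points = [(2, 2), (2.2, 2.1), (1.9, 2.0), (2.1, 1.8), (7, 7)]
grid = gaussian_kde_grid(points, width=10, height=10, bandwidth=1.0)
dense_cells, outliers = split_dense_and_outliers(points, grid, threshold=2.0)
```

In this toy run the cluster is absorbed into a dense region and only the isolated point remains as an outlier. On the GPU, the per-cell density loop is the part that maps naturally onto a blur pass over a rendered point texture.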
This project led us to run empirical studies on how the perception of lightness constancy comes into play when displaying data using color on surfaces (such as molecular surfaces) that use shading to convey a sense of three-dimensional shape. Our work uses a sequential series of experiments to test the presence of lightness constancy, the effect of structural information, the approximation of shadow darkening, the effect of luminance-encoded color ramps, and the addition of optional shape cues (stereoscopic viewing and suggestive contours). We provide a series of conclusions about supporting color constancy to ensure viewers make accurate judgments of shadowed data in computer-generated visualizations.
For machine learning methods that predict structural features on proteins, it can be difficult to understand the performance of the classifier. Summary statistics are usually used to evaluate these models, but these measures lack the fidelity to analyze how the classifier works over many proteins with different structures and chemistry. We have developed a visualization platform for viewing the results of protein structural classifiers, showing performance over both the entire test corpus and the protein structure itself.
Adiabatic quantum computation (AQC) is a class of quantum computation that seems to be more physically realizable than a full quantum computer. However, we postulate, through iterative experimentation, that AQC only offers a polynomial speedup over classical computation, not the exponential speedup offered by (theoretical) quantum computation.
Last updated on 22 February 2017.