Query Optimization for Object-Relational Database Systems

Download Postscript Download PDF

Abstract

Modern database systems are placing an increasingly heavy burden upon their query optimizers. Commercial vendors are all scrambling to add object-relational features to their database systems, but unfortunately, optimizer technology has not kept pace with these advances for a number of reasons. Writing an optimizer, debugging it, and evaluating different optimization strategies remains a time-consuming and difficult task. Another problem is that attempts to design optimizers that can easily be extended to incorporate new operators, algorithms, or search strategies have met with limited success. Finally, even the best of optimizers very often produce sub-optimal query evaluation plans. This problem is further aggravated by the presence of novel data domains and user-definable data-types and functions which make it very difficult to maintain statistics and estimate query execution costs.

In this thesis, we present some solutions to the problems that are currently facing query optimizers. We describe OPT++ an architecture that significantly improves the extensibility and maintainability of a query optimizer. OPT++ considerably simplifies both, the task of implementing an optimizer for a new database system, and the task of evaluating alternative optimization techniques and strategies to decide what techniques are best suited for that database system. We use this as a tool to implement a number of relational and object-relational optimization techniques and search strategies and perform a study to compare their relative performance. We also describe {\em Dynamic Re-optimization}, a technique that can be used to tackle sub-optimality of plans produced by a query optimizer. The basic idea is to collect statistics at key points during the execution of a complex query. These statistics are then used to optimize the execution of the query, either by improving the resource allocation for that query, or by changing the execution plan for the remainder of the query. To ensure that this does not significantly slow down the normal execution of a query, the Query Optimizer carefully chooses what statistics to collect, when to collect them, and the circumstances under which to re-optimize the query.

Navin Kabra (navin@cs.wisc.edu)