Inclusion of New Types in Relational Data Base Systems
Michael Stonebraker
University of California, Berkeley
Scribe: Zuyu Zhang
One-line Summary
The paper explores a mechanism to enable access methods and query optimizations over user-defined data types and operators for columns in a relational DBMS.
Overview/Main Points
- Background
- DBMS types have int, float, char array, ...
- Applications would like polygons (to answer queries such as "where do these polygons overlap?"), points, lines, line groups, complex numbers, time series, video, and images.
- Such queries are hard to express with the operations built into the DBMS, e.g., over 2D boxes stored as ((x1, y1), (x2, y2))
- create table boxes (id: int, x1: float, x2:float, y1:float, y2: float);
- To find all boxes overlapping the box (0, 1, 0, 1), assuming x1 ≤ x2 and y1 ≤ y2: SELECT boxes.* FROM boxes WHERE NOT ((boxes.x2 ≤ 0 OR boxes.x1 ≥ 1) OR (boxes.y2 ≤ 0 OR boxes.y1 ≥ 1))
- Too hard to understand.
- Possibly too slow, both because the optimizer has no special handling for such complex predicates and because the many clauses add syntax-checking overhead.
- Instead, we want to
- create a type boxType
- with operations like "overlaps"
- with special indexes (like R-Trees)
- But how to use new indexes?
- Create table boxes(id: integer, box: boxType)
- Select boxes.* from boxes where boxes.box overlaps (0, 1, 0, 1);
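The contrast between the flattened coordinate query and the ADT-style query can be tried out with a minimal Python sketch, assuming SQLite (through the standard `sqlite3` module) as a stand-in DBMS. SQLite has no user-defined column types, so the box is stored here as a text value `'x1,x2,y1,y2'`, and `overlaps` is registered as a user-defined SQL function rather than an infix operator. The table names `boxes_flat` and `boxes_adt`, the text encoding, and the sample rows are assumptions made for illustration, not anything from the paper. Boxes that merely touch along an edge are treated as non-overlapping.

```python
import sqlite3

def overlaps(a: str, b: str) -> int:
    """Return 1 if boxes a and b, each encoded as 'x1,x2,y1,y2', share interior area."""
    ax1, ax2, ay1, ay2 = map(float, a.split(","))
    bx1, bx2, by1, by2 = map(float, b.split(","))
    disjoint = ax2 <= bx1 or ax1 >= bx2 or ay2 <= by1 or ay1 >= by2
    return 0 if disjoint else 1

conn = sqlite3.connect(":memory:")
# Make the Python routine callable from SQL as overlaps(box, box).
conn.create_function("overlaps", 2, overlaps)

conn.execute("CREATE TABLE boxes_flat (id INTEGER, x1 REAL, x2 REAL, y1 REAL, y2 REAL)")
conn.execute("CREATE TABLE boxes_adt (id INTEGER, box TEXT)")

sample = [
    (1, 0.5, 1.5, 0.5, 1.5),  # overlaps the unit box
    (2, 2.0, 3.0, 2.0, 3.0),  # disjoint from the unit box
    (3, 1.0, 2.0, 0.0, 1.0),  # only touches the unit box's right edge
]
for row in sample:
    conn.execute("INSERT INTO boxes_flat VALUES (?, ?, ?, ?, ?)", row)
    encoded = ",".join(map(str, row[1:]))
    conn.execute("INSERT INTO boxes_adt VALUES (?, ?)", (row[0], encoded))

# Flattened form: the overlap test is spelled out over four columns.
flat_ids = [r[0] for r in conn.execute(
    "SELECT id FROM boxes_flat "
    "WHERE NOT (x2 <= 0 OR x1 >= 1 OR y2 <= 0 OR y1 >= 1) ORDER BY id")]

# ADT-style form: one value per box, one named operation.
adt_ids = [r[0] for r in conn.execute(
    "SELECT id FROM boxes_adt WHERE overlaps(box, '0,1,0,1') ORDER BY id")]
```

Both queries select the same rows, but only the second states the intent directly. Note that this sketch still scans every row: it shows the operator side of the idea, not the special index (such as an R-tree) that the notes also call for.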
When adding ADTs to an RDBMS, there are several options for where the ADT methods run:
- run ADT methods in an address space separate from that of the RDBMS
- For reliability and security: a buggy or, even worse, a malicious ADT routine may crash the DBMS by overwriting DBMS data structures, or overwrite the entire data base with zeros.
- When an error occurs, it is easy to distinguish whether bugs exist in the ADT routine or the DBMS.
- run ADT methods in the same address space as that of the RDBMS
- For performance: the round-trip communication overhead between address spaces is avoided.
- A hybrid approach: two environments for ADT procedures
- A protected environment for debugging purposes
- A trusted environment, inside the unprotected DBMS address space, for ADT routines known to be reliable
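The trade-off between the two environments can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: an operating-system process stands in for an address space, a child Python interpreter stands in for the protected environment, and the helper names `run_trusted` and `run_protected`, the JSON exchange, and the sample `area` routine are all hypothetical.

```python
import json
import subprocess
import sys

def area(box):
    """A well-behaved ADT routine: area of a box given as [x1, x2, y1, y2]."""
    x1, x2, y1, y2 = box
    return (x2 - x1) * (y2 - y1)

def run_trusted(routine, arg):
    """Call the routine directly in the caller's process: fast, but unprotected."""
    return routine(arg)

def run_protected(routine_source, routine_name, arg, timeout=10):
    """Run the routine in a child interpreter so a crash cannot take down the caller."""
    child_program = (
        routine_source
        + "\nimport json, sys\n"
        + f"print(json.dumps({routine_name}(json.loads(sys.argv[1]))))\n"
    )
    proc = subprocess.run(
        [sys.executable, "-c", child_program, json.dumps(arg)],
        capture_output=True, text=True, timeout=timeout,
    )
    if proc.returncode != 0:
        # The failure is attributable to the routine, not to the caller.
        return {"ok": False, "error": f"routine exited with status {proc.returncode}"}
    return {"ok": True, "value": json.loads(proc.stdout)}

AREA_SOURCE = '''
def area(box):
    x1, x2, y1, y2 = box
    return (x2 - x1) * (y2 - y1)
'''

BUGGY_SOURCE = '''
import os
def area(box):
    os._exit(3)  # simulate a routine that kills the process it runs in
'''

trusted_result = run_trusted(area, [0, 1, 0, 1])
protected_result = run_protected(AREA_SOURCE, "area", [0, 1, 0, 1])
crashed_result = run_protected(BUGGY_SOURCE, "area", [0, 1, 0, 1])
```

The protected call pays for process start-up and argument serialization on every invocation, which is the round-trip overhead the in-process option avoids. In exchange, the buggy routine only ends its own child process, and the caller can tell that the fault lies in the routine.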
Why is adding schema inheritance to the relational data model likely to be easier than adding ADTs and their methods to a relational DBMS?
- When adding ADTs and their methods, we have to
- extend the access methods, which retrieve and update data and must work well with the log manager (event recording and crash recovery), the concurrency control manager (transaction management), and the buffer manager (page layout).
- support query optimizations for user-defined data types and operators.
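One way to picture the two requirements above is a catalog that records, for each user-defined operator, the facts an access method and an optimizer would need. The sketch below is purely illustrative: the `OperatorInfo` fields, the `choose_plan` rule, the 10% threshold, and every number in it are assumptions, not the paper's design.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

@dataclass(frozen=True)
class OperatorInfo:
    name: str
    operand_type: str
    selectivity: float          # assumed fraction of rows satisfying the operator
    index_kind: Optional[str]   # access method able to evaluate it, if any

CATALOG: Dict[str, OperatorInfo] = {}

def register_operator(info: OperatorInfo) -> None:
    """Record a user-defined operator so the planner can look it up by name."""
    CATALOG[info.name] = info

def choose_plan(op_name: str, table_rows: int, available_indexes: Set[str]) -> str:
    """Pick an index scan only if the operator is known, indexable, and selective."""
    info = CATALOG.get(op_name)
    if info is None or info.index_kind not in available_indexes:
        return "sequential scan"
    expected_rows = info.selectivity * table_rows
    # Toy rule: use the index when under 10% of the table is expected to match.
    if expected_rows < 0.1 * table_rows:
        return f"{info.index_kind} index scan"
    return "sequential scan"

register_operator(OperatorInfo("overlaps", "boxType", 0.01, "rtree"))

plan_with_index = choose_plan("overlaps", 100_000, {"rtree"})
plan_without_index = choose_plan("overlaps", 100_000, {"btree"})
plan_unknown = choose_plan("contains", 100_000, {"rtree"})
```

Without such registered information the planner can only fall back to a sequential scan, which is why new types need support in both the access methods and the optimizer rather than in the type system alone.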
Object Relational db (Illustra)
Relevance
R-tree.
Offtopic: stored procedure
Flaws