DTE Overview

DTE (Data Transformation Engine) is a distributed database system whose aim is to support database-style querying on the Web. The system is mainly a read-only database, and the emphasis is on non-traditional database issues.

In what follows, we describe the basic architecture of the system by explaining what happens after a request (data definition or query) is submitted to it.
DTE is called by instantiating an object of class Engine:

class Engine {
public:
    Engine();
    void submit(string request);   // hand a query or data definition request to the engine
    Schema* getSchema();           // schema describing the result tuples
    Tuple* getNext();              // next tuple of the result stream
};
 
Every request results in a stream of tuples (which may be empty).
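
As an illustration, a client application might use the Engine roughly as follows; the convention that getNext returns NULL once the stream is exhausted is an assumption on our part, not something spelled out by the interface above.

// Hypothetical usage sketch: submit a request and pull every answer tuple.
// The NULL-at-end-of-stream convention is an assumption.
void runRequest(const string& request) {
    Engine engine;
    engine.submit(request);                 // hand the request to the engine
    Schema* schema = engine.getSchema();    // invokes the parser; describes the answers
    for (Tuple* tuple = engine.getNext(); tuple != NULL; tuple = engine.getNext()) {
        // interpret *tuple according to *schema (see the Tuple discussion below)
    }
}
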
Tuple is a structure that is implemented as an array of ADT pointers:

class Tuple {
    ADT* fields[0];    // zero-length array idiom: the field pointers follow the Tuple object
public:
    void* operator[](int i){return fields[i];}    // access the i-th field
};

ADT pointers are implemented as void pointers:

typedef void ADT;

Some basic ADTs are implemented as follows: integers as the native C type int, real numbers as the native C type double, and strings as native C character arrays terminated by the null character.
 
The application that receives the answers has to interpret them according to the Schema returned by the getSchema method.
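
For example, a consumer might switch on each field's type and cast the corresponding ADT pointer accordingly. The following is only an illustrative sketch: the Schema accessors (getNumFields, getTypeID) and the INT_TP, DOUBLE_TP, STRING_TP constants are assumptions, not the actual DTE interfaces.

#include <cstdio>

// Illustrative only: print one tuple by casting each field according to its
// type. getNumFields(), getTypeID(), and the *_TP constants are assumptions.
void printTuple(Tuple& tuple, const Schema& schema) {
    for (int i = 0; i < schema.getNumFields(); i++) {
        ADT* field = tuple[i];
        switch (schema.getTypeID(i)) {
        case INT_TP:    printf("%d ", *(int*) field);    break;
        case DOUBLE_TP: printf("%g ", *(double*) field); break;
        case STRING_TP: printf("%s ", (char*) field);    break;
        }
    }
    printf("\n");
}
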
As a result of calling the getSchema method, the parser module is invoked.
The parsing step is largely trivial and is accomplished using standard tools for lexical analysis (flex) and parsing (bison).
The input to the parsing module is the request string; the output is a parse tree object.

class ParseTree {
public:
    virtual const Schema* getSchema();     // schema describing the result of the request
    virtual Iterator* createExec() = 0;    // build the iterator that executes the request
};

So, most abstractly, every request (not necessarily a query) results in an Iterator that is described by a Schema. A Schema is a list of (type, identifier) pairs; an Iterator is an object that supports the getNext interface.
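
Purely as an illustration of that structure, a Schema could be represented along these lines; the actual DTE class may differ, and TypeID is only suggested here by the TypeIDList type that appears in the Iterator interface below.

// Illustrative sketch only: a Schema as a list of (type, identifier) pairs.
// The real Schema class in DTE may look different.
struct Attribute {
    TypeID typeID;    // type of the attribute (int, double, string, ...)
    string name;      // attribute identifier
};

class Schema {
    Attribute* attributes;   // one entry per attribute of the result
    int numAttributes;
public:
    int getNumFields() const { return numAttributes; }
    TypeID getTypeID(int i) const { return attributes[i].typeID; }
};
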
We have several subclasses of the parse tree, each corresponding to a different type of request.
The main one is QueryParseTree, which results from an SQL query. The others correspond to data definition commands, such as CreateIndexParseTree, DropIndexParseTree, InsertParseTree, DeleteParseTree, CreateTableParseTree, and similar.
The createExec() method returns an empty Iterator for all data definition parse trees, while for the QueryParseTree it returns an Iterator over the query answer.
 
class Iterator {
public:
    virtual void initialize();              // prepare the iterator for producing tuples
    virtual const Tuple* getNext();         // pull the next tuple of the stream
    virtual const TypeIDList& getTypes();   // types of the tuples produced
};
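
For instance, the empty Iterator that the data definition parse trees return could be as simple as the following sketch; the class name EmptyIterator, the NULL end-of-stream convention, and the assumption that TypeIDList is default-constructible are all ours.

// Hypothetical sketch of the empty Iterator returned by the data definition
// parse trees: it produces no types and no tuples.
class EmptyIterator : public Iterator {
    TypeIDList types;                       // stays empty
public:
    virtual void initialize() {}
    virtual const Tuple* getNext() { return NULL; }         // empty stream
    virtual const TypeIDList& getTypes() { return types; }
};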
 

If the request fails to parse, the system returns a parse exception; otherwise, the createExec() method is invoked on the parse tree object, which in turn calls the type checker module.
Type checking is similar, but not identical, for data definition requests and for queries.
Here we will only discuss query type checking (QueryParseTree).
An important part of the QueryParseTree class is the Query object.
Query is internally represented by the following class:

class Query {
    Expression** selectClause;   // expressions of the SELECT clause
    TableAlias** fromClause;     // table references of the FROM clause
    Expression* whereClause;     // predicate of the WHERE clause
};
 
 

Class Expression is an abstract base class and represents SQL expressions such as t1.m, 1, t1.m + 1, t1.m = t2.m, etc. All these expressions are created by the parser.
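
To illustrate, one such subclass might look roughly like the sketch below; the class name BinaryOp, the typeCheck signature, and the resultType helper are assumptions rather than the actual DTE interfaces.

// Hypothetical sketch of one Expression subclass: a binary operator such as
// t1.m + 1 or t1.m = t2.m. BinaryOp, the typeCheck signature, and
// resultType() are illustrative assumptions.
class BinaryOp : public Expression {
    string op;            // "+", "=", ...
    Expression* left;
    Expression* right;
public:
    BinaryOp(string op, Expression* l, Expression* r)
        : op(op), left(l), right(r) {}
    // Type check the operands, then derive the type of this node.
    virtual TypeID typeCheck(SymbolTable& symtab) {
        TypeID leftType = left->typeCheck(symtab);
        TypeID rightType = right->typeCheck(symtab);
        return resultType(op, leftType, rightType);   // hypothetical helper
    }
};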

An example of a TableAlias is .a.b.c as t1. You may notice an unconventional specification of the FROM clause. Tables in DTE can be located within other collections. In this example, table c is located in a collection of tables called b, which is in turn located in the collection a. Collections may be of different types; in DTE we have special tables that act as collections, and we call them catalogs or directories.
Typical catalogs may look like this:
 
Relation 1.1:
    a    collection    1.2
    x    singleton     1.3

Relation 1.2:
    b    collection    2.1
    y    singleton     1.4
Catalogs are tables with a fixed format. The schema of a catalog file is (string relationName, bit isCollection, number relationId). A relation id is made up of two parts: the first part denotes the server and the second part denotes the relation within that server. By convention, the root catalog of every server has a relation id equal to 1. We have made a distinction between relation names, which are for user convenience only, and relation ids, which are used by the system to refer to tables. Because of this distinction, internal references in the system will not break when a user renames or moves relations.
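
In code, a relation id and a catalog entry might be represented along these lines; the structure names below are illustrative, not the actual DTE classes.

// Illustrative sketch of a relation id and a catalog entry; the real DTE
// classes may differ.
struct RelationId {
    int serverId;       // which server owns the relation
    int localId;        // relation number within that server
};

struct CatalogEntry {
    string relationName;     // user-visible name, e.g. "a"
    bool isCollection;       // true for catalogs, directories, and servers
    RelationId relationId;   // e.g. {1, 2} for relation 1.2
};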

Another important type of collection is a remote DTE server. Conceptually, servers just store a collection of relations, and in that sense they are similar to directories. Implementation-wise, however, they are quite different. While directories are actively traversed by the local server, all operations that relate to a remote server are simply handed over to be performed autonomously by the remote server.

Consider now what happens when the system has to find table .a.b.c. It first looks in the root catalog (1.1) for the entry a. It finds that this entry exists and is a name for relation 1.2. Then it finds the entry b in 1.2 and discovers that this entry is an object on another server (number 2). At this point the system forwards the request to resolve the table to server 2. Assume that the response from server 2 is that relation c has an id of 4.5. Mapping of names to relation ids is not implemented yet.

At this point server 1 knows that server 4 holds the table (aliased t1 in our example) and that it is registered there under relation id 5. From this information (the server and relation ids) we construct a class that implements the DataSource interface. This interface provides methods to retrieve all the relevant information about a particular table, including its schema.
In general, when resolving a table name, we might have to cross more than one server boundary. If the system naively chased names like this every time, name resolution would be very inefficient. This is an additional reason to maintain only the destination server id when referencing other tables.
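
The traversal described above could be coded roughly as follows, reusing the illustrative RelationId and CatalogEntry structures sketched in the catalog discussion; lookupEntry, forwardResolve, and LOCAL_SERVER_ID are hypothetical names, and recall that the name-to-id mapping is not implemented yet.

// Rough sketch of resolving a path such as ".a.b.c"; lookupEntry(),
// forwardResolve(), and LOCAL_SERVER_ID are hypothetical, not existing DTE code.
RelationId resolve(const char* path[], int numComponents) {
    RelationId current = {LOCAL_SERVER_ID, 1};    // start at the local root catalog
    for (int i = 0; i < numComponents; i++) {
        if (current.serverId != LOCAL_SERVER_ID)  // crossed a server boundary:
            return forwardResolve(current, &path[i], numComponents - i);
        CatalogEntry entry = lookupEntry(current, path[i]);   // scan the catalog table
        current = entry.relationId;
    }
    return current;    // final server id + relation id for the table
}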

Examples of classes that implement the DataSource interface can be found in the DataSources package.
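
As a minimal sketch of what such an interface might look like (the exact method set in the DataSources package may differ, and Stats here is just a placeholder type):

// Illustrative sketch of the DataSource interface; the actual methods in the
// DataSources package may differ.
class DataSource {
public:
    virtual const Schema* getSchema() = 0;   // attribute names and types of the table
    virtual Stats* getStats() = 0;           // statistics used by the optimizer
    virtual Iterator* createExec() = 0;      // iterator that scans the table
};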

Once the schema is known, server 1 can type check the query. Type checking is done in two steps:

  1. Load the schemas of all the tables referenced in the FROM clause into the SymbolTable.
  2. Walk the SELECT and WHERE clauses and invoke the typeCheck method on every expression (a sketch follows below).
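
A hypothetical Query::typeCheck member illustrating these two steps is sketched below; it assumes NULL-terminated clause arrays, a SymbolTable::insert(alias, schema) method, and TableAlias accessors alias() and getSchema(), none of which are spelled out in this document.

// Hypothetical sketch of the two type-checking steps; the SymbolTable and
// TableAlias interfaces used here are assumptions.
void Query::typeCheck(SymbolTable& symtab) {
    // Step 1: register the schema of every table in the FROM clause.
    for (TableAlias** t = fromClause; *t != NULL; t++)
        symtab.insert((*t)->alias(), (*t)->getSchema());

    // Step 2: type check every expression in the SELECT and WHERE clauses.
    for (Expression** e = selectClause; *e != NULL; e++)
        (*e)->typeCheck(symtab);
    if (whereClause != NULL)
        whereClause->typeCheck(symtab);
}
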
If type checking is successful, server 1 proceeds to optimize the query. For optimization, one needs to know statistics about the relevant tables, the access methods available for them, the capabilities of the servers involved in the query, and the physical properties of each server, including network traffic and server load. All table-specific information can be retrieved through the DataSource interface. All site-specific information can be retrieved through the SiteDescriptor interface.
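
A minimal sketch of what the SiteDescriptor interface might expose is given below; the method names are assumptions, not the actual DTE interface.

// Illustrative sketch of the SiteDescriptor interface; the method names are
// assumptions, not the actual DTE interface.
class SiteDescriptor {
public:
    virtual double getLoad() = 0;        // current load of the server
    virtual double getBandwidth() = 0;   // network bandwidth to the site
    virtual bool supportsOperator(const string& op) = 0;   // capability test
};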

Once the optimization is completed, parts of the best plan are shipped as directives to the servers involved. Nodes of the best plan are used to create the tree of iterators that use a pull mechanism to retrieve all the answers. To learn more about iterators, check out the Iterator package.

Presently, DTE does not have a buffer manager; it relies on the underlying operating system to do the buffering. We plan to incorporate the Range Buffer manager for accessing local and remote files. The Range Buffer manager will not affect query optimization, but it will speed up execution.

The only join method implemented is the nested loops join. The inner table is loaded into virtual memory (not into a buffer pool), and we do not keep track of the amount of memory allocated.
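
A simplified sketch of such a join iterator is shown below; the use of std::vector, the assumption that pulled tuples remain valid after subsequent getNext calls, and the matches() and concat() helpers are all illustrative choices, not the actual DTE implementation.

#include <vector>

// Simplified sketch of a nested loops join: the inner table is materialized
// in ordinary (virtual) memory, and nothing bounds its size. matches() and
// concat() stand in for predicate evaluation and output tuple construction.
class NLJoinIterator : public Iterator {
    Iterator* outer;
    Iterator* inner;
    std::vector<const Tuple*> innerTuples;   // inner table held in virtual memory
    size_t innerPos;
    const Tuple* outerTuple;
public:
    NLJoinIterator(Iterator* o, Iterator* i)
        : outer(o), inner(i), innerPos(0), outerTuple(NULL) {}
    virtual void initialize() {
        outer->initialize();
        inner->initialize();
        // Load the entire inner table up front.
        for (const Tuple* t = inner->getNext(); t != NULL; t = inner->getNext())
            innerTuples.push_back(t);
        outerTuple = outer->getNext();
    }
    virtual const Tuple* getNext() {
        while (outerTuple != NULL) {
            if (innerPos == innerTuples.size()) {   // inner scan exhausted:
                innerPos = 0;                       // restart it for the next
                outerTuple = outer->getNext();      // outer tuple
                continue;
            }
            const Tuple* innerTuple = innerTuples[innerPos++];
            if (matches(outerTuple, innerTuple))        // hypothetical predicate test
                return concat(outerTuple, innerTuple);  // hypothetical output tuple
        }
        return NULL;                                    // end of stream
    }
    // getTypes() omitted; it would return the concatenated input types.
};
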
We have an implementation of Gestalts. The only queries possible on a gestalt at the present time are single-table queries (selections and aggregates).
Selections are pushed down to the Gestalt members; aggregates are not (they should be too).