For this part of the project, you will implement the front-end and database utilities of Minirel. You will also have to design and implement the Minirel catalog relations. To help you implement the front-end quickly, we will provide a parser to parse user commands and (a subset of ) SQL. The parser works by reading in one line from stdin, parsing the line and making the appropriate calls to the back-end.
There are three database utilities each of which can be run as a standalone UNIX program
We have provided
complete implementations of dbcreate <dbname> and dbdestroy <dbname> and a skeleton of
minirel <dbname> functions
for these three routines in the files minirel.C, dbcreate.C, and dbdestroy.C
respectively. Each of these executables can be created by giving appropriate
arguments to make.
This creates the database dbname. You should examine this code carefully as it will give you some hints about how to implement operations like create relation(). The program first creates a unix directory to hold the database and then changes the working directory to that directory. Next it creates two relations called RelCatalog and AttrCatalog for the database catalogs. First of course it creates these two relations by creating instances of the RelCatalog and AttrCatalog classes. More details about this will be given in the section on Catalogs.
This destroys the database dbname. This is done by using the following UNIX command:
ostrstream sysString;
sysString << "rm -r " << dbname;
system(sysString.str());
By doing this, the directory dbname
and all the files (relations) it contains will be deleted.
This is the main
routine which is used to access and query the Minirel database dbname. When minirel is invoked with a dbname, the very first thing it does is to change
its working UNIX directory (using chdir)
to dbname. Each Minirel database
corresponds to a UNIX directory and each relation within that database
corresponds to a file within the directory. The minirel utility starts off by creating a buffer manager, opening
the relation and attribute catalogs and then calling the function parse().
Important: at this point global variable relCat refers to the relation catalog table
and global variable attrCat refers to the attribute catalog table. Keep in mind these two
variables, because when you implement various methods of class RelCatalog, you may
have to refer to the attribute catalog table, and vice versa, when you implement
various methods of class AttrCatalog, you may have to refer to the relation
catalog table.
With the call to parse(), an infinite loop will start prompting the user for
a new command, calling the appropriate routines of the lower layers to execute
the command and then prompting the user again. The parser (provided by us) will
be responsible for generating the prompt. After a command is parsed, it is
transformed by the interp()
function (in file parser/interp.C) into a sequence of operators (e.g. selects,
joins, creates, destroys, prints, loads etc.), which are then executed one by
one by the database back-end. Executing an operator involves calling some
routine which you have or will implement in other parts of the project. For
example, when the front-end receives a select-project query, the parser will
call a procedure named select()
that will in turn call the methods of the HeapFileScan class to scan the
appropriate heapfile.
Once minirel has been invoked, it can be used to either answer queries (using SQL DML) or to perform various utility operations (via SQL DDL). (Note: DML = Data Manipulation Language, and DDL = Data Definition Language.) Part5 of the project is mainly concerned with implementing the front-end utilities necessary to support a part of SQL DDL. The syntax of these commands will be described using a pseudo context-free grammar. All commands to the Minirel front-end must end with a semi-colon.
<SQL_COMMAND>
::= <DDL_STATEMENT>;
| <DML_STATEMENT>;
<DDL_STATEMENT> ::= <CREATE>
| <DESTROY>
| <LOAD>
| <PRINT>
| <HELP>
| <QUIT>
We now describe the semantics of each utility in detail.
<CREATE>::= CREATE TABLE relName (attr1 format1, ..., attrN
formatN) ;
The create command creates the
relation relName with the given
schema. Relation and attribute names can be at most MAXNAME = 32 characters
long. Each format argument has a type specification ('int', 'real' , 'char' or
'char' followed by an integer inside a pair of parenthesis specifying the
length of the string. e.g. 'char(20)'). For the 'char' format valid lengths are
between 1 to 255 (both inclusive). This command results in an invocation
of the RelCatalog::createRel() method that updates the catalogs
appropriately and then creates a HeapFile to store the tuples of the relation.
Thus, once you have implemented the method RelCatalog::createRel(), parse() will
call that method to execute this command.
<DESTROY>::= DESTROY TABLE relName ;
The destroy command should
destroy the HeapFile for relation relName. Finally it should remove all
relevant catalog entries for that relation. All this is done by invoking the
RelCatalog::destroyRel method.
Thus, once you have implemented the method RelCatalog::destroyRel,
parse() will call that method to execute this command.
<PRINT> ::= PRINT TABLE relName
The print routine prints the contents
of the specified relation in a reasonably tidy format. This is done by the
UT_Print function.
Thus, once you have implemented the function UT_Print,
parse() will call that function to execute this command.
<LOAD> ::= LOAD TABLE relname FROM (filename) ;
The load utility bulk loads tuples into a relation (which must have already
been created) using data in a binary UNIX file named filename. The load command
begins by examining the relation catalog to determine length of the tuple
length. Then it opens the UNIX file filename containing the input data. Each
input tuple is in binary format (i.e. there is no space wasted for delimiters).
The next step is to create an instance of the InsertFileScan class, passing the
relname as parameter which cause the appropriate HeapFile to be opened.
Finally, one tuple at a time should be read from the UNIX file and inserted
into the HeapFile using the InsertFileScan::insertRecord() method. After
loading all the tuples, the UNIX file should be closed and the InsertFileScan
object should be deleted. This will result in the underlying DB files being
closed. All this happens in the UT_Load function (described below).
Thus, once you have implemented the function UT_Load,
parse() will call that function to execute this command.
<HELP> ::= HELP [TABLE relName]
;
If a relname is not specified, help lists the names of all the relations in
the database. Otherwise, it lists the name, type, length and offset of each
attribute together with any other information you feel is useful. This is done
by the RelCatalog::help function.
Thus, once you have implemented the function RelCatalog::help,
parse() will call that function to execute this command.
<QUIT> ::= QUIT ;
The quit command exits minirel, causing all buffers and relations to be
flushed to disk. Prior to terminating, the system should close down any open
files (including catalogs). This is achieved by the UT_Quit function.
Thus, once you have implemented the function UT_Quit,
parse() will call that function to execute this command.
Summary: To implement the above commands, you will have to
implement RelCatalog::createRel(), RelCatalog::destroyRel, RelCatalog::help,
UT_Print, UT_Load, and UT_Quit, among others. These methods and functions
are described below.
One of the neat things about a relational database is that in addition to the data, even the metadata that describe what relations exist in the database, the types of their attributes and so on are also stored in relations, called catalogs. Minirel has two catalog relations : relcat and attrcat. The relcat relation contains one tuple for every relation in the database (including itself). The attrcat relation contains one tuple for every attribute of every relation (including the catalog relations) and this tuple contains information about the attribute. Both attrcat and relcat are created by the dbcreate utility and together they contain the schema of the database. Since each database has its own schema, relcat and attrcat must be placed in the UNIX directory corresponding to the database. relcat and attrcat are instances of the RelCatalog and AttrCatalog classes respectively.
#define RELCATNAME
"relcat" // name of relation catalog
#define ATTRCATNAME
"attrcat" // name of attribute catalog
#define MAXNAME
32
// length of relName, attrName
#define MAXSTRINGLEN
255 // max length of string attribute
// schema for tuples in the relation
relcat:
// relation name : char(32) <-- lookup
key
// attribute count : integer(4)
typedef struct {
char
relName[MAXNAME]; // relation
name
int
attrCnt;
// number of attributes
} RelDesc;
class RelCatalog : public HeapFile {
public:
// open relation catalog
RelCatalog(Status &status);
// get relation descriptor for a
relation
const Status getInfo(const string &
relName, RelDesc& record);
// add information to catalog
const Status addInfo(RelDesc &
record);
// remove tuple from catalog
const Status removeInfo(const string
& relName);
// create a new relation
const Status createRel(
const string
& relName,
const int
attrCnt, // number of elements in attrList[]
const attrInfo
attrList[] // see defn of attrInfo below
);
// define attrInfo
typedef struct {
char relName[MAXNAME];
// relation name
char
attrName[MAXNAME]; // attribute name
int attrType; //
INTEGER, FLOAT, or STRING
int attrLen; // length
of attribute in bytes
void *attrValue; //
ptr to binary value
} attrInfo;
// destroy a relation
const Status destroyRel(const string
& relName);
// print catalog information (relation
may be an empty string)
const Status help(const string &
relName);
// get rid of catalog
~RelCatalog();
};
RelCatalog::RelCatalog()
The constructor just invokes the HeapFile constructor with the constant
RELCATNAME (declared above).
RelCatalog::~RelCatalog()
You need to do nothing here as the underlying HeapFile will be
close automatically.
const Status createRel(const string & relName, const int attrCnt,
const attrInfo attrList[])
First make sure that a relation with the same name doesn't already exist (by
using the getInfo() function described below). Next add a tuple to the relcat
relation. Do this by filling in an instance of the RelDesc structure above and
then invoking the RelCatalog::addInfo() method. Third, for each of the attrCnt
attributes, invoke the AttrCatalog::addInfo() method of the attribute
catalog table (remember that this table is referred to by the global
variable attrCat), passing the appropriate
attribute information from the attrList[] array as an instance of the AttrDesc
structure (see below). Finally, create a HeapFile instance to hold tuples of
the relation (hint: there is a procedure to do this which we have seen
in the last project stage; you need to give it a string that is the relation
name). Implement this function in create.C
const Status destroyRel(const string & relName)
First remove all relevant schema information from both the relcat and
attrcat relations. Then destroy the heapfile corresponding to the relation
(hint: again, there is a procedure to destroy heap file that we have
seen in the last project stage; you need to give it
a string that is the relation name). Implement this function in destroy.C
const Status addInfo(RelDesc & record)
Adds the relation descriptor contained in record to the relcat relation. RelDesc represents both the
in-memory format and on-disk format of a tuple in relcat. First, create
an InsertFileScan object on the relation catalog table. Next, create
a record and then insert it into the relation catalog table using
the method insertRecord of InsertFileScan.
const Status getInfo(const string & relName, RelDesc& record)
Open a scan on the relcat relation by invoking the startScan() method on
itself. You want to look for the tuple whose first attribute matches the string
relName. Then call scanNext() and getRecord() to get the desired tuple.
Finally, you need to memcpy() the tuple out of the buffer pool into the return
parameter record.
const Status removeInfo(const string & relName)
Remove the tuple corresponding to relName from relcat. Once again, you have
to start a filter scan on relcat to locate the rid of the desired tuple. Then
you can call deleteRecord() to remove it.
const Status help(const string & relName)
If relation.empty() is true (empty() is a method on the string class), print
(to standard output) a list of all the relations in relcat (including how many
attributes it has). Otherwise, print all the tuples in attrcat that are relevant
to relName. Implement this function in help.C
// schema of tuples in the attribute catalog:
typedef struct {
char relName[MAXNAME]; // relation name
char attrName[MAXNAME]; // attribute name
int attrOffset; // attribute offset
int attrType; // attribute type
int attrLen; // attribute length
} AttrDesc;
class AttrCatalog : public HeapFile {
friend RelCatalog;
public:
// open attribute catalog
AttrCatalog(Status &status);
// get attribute catalog tuple
const Status getInfo(const string &
relName,
const string & attrName,
AttrDesc &record);
// add information to catalog
const Status addInfo(AttrDesc &
record);
// remove tuple from catalog
const Status removeInfo(const string
& relName,
const string & attrName);
// get all attributes of a relation
const Status getRelInfo(const string
& relName,
int &attrCnt,
AttrDesc *&attrs);
// delete all information about a
relation
const Status dropRelation(const string
& relName);
// close attribute catalog
~AttrCatalog();
};
AttrCatalog(Status &status)
Just call the new HeapFile constructor (see below), passing the constant
ATTRCATNAME (defined above)
~AttrCatalog()
Do nothing. File gets closed automatically by HeapFile destructor.
const Status getInfo(const string & relName, const string &
attrName, AttrDesc &record)
Returns the attribute descriptor record for attribute attrName in relation
relName. Uses a scan over the underlying heapfile to get all tuples for
relation and check each tuple to find whether it corresponds to attrName. (Or
maybe do it the other way around !) This has to be done this way because a
predicated HeapFileScan does not allow conjuncted predicates. Note that the
tuples in attrcat are of type AttrDesc (structure given above).
const Status getRelInfo(const string & relName, int &attrCnt,
AttrDesc *&attrs)
While getInfo() above returns the description of a single attribute, this
method returns (by reference) descriptors for all attributes of the relation
via attr, an array of AttrDesc
structures, and the count of the number of attributes in attrCnt. The attrs array is allocated by this function, but
it should be deallocated by the caller.
const Status addInfo(AttrDesc & record)
Adds a tuple (corresponding to an attribute of a relation) to the attrcat
relation.
const Status removeInfo(const string & relName, const string &
attrName)
Removes the tuple from attrcat that corresponds to attribute attrName of
relation.
const Status dropRelation(const string & relName)
Deletes all tuples in attrcat for the relation relName. Again use a scan to
find all tuples whose relName attribute equals relation and then call
deleteRecord() on each one. Implement this function in destroy.C
When the Minirel parser reads in a command, it will parse the command and call the necessary function to execute it. Some of the front-end DDL commands such as create and destroy are implemented by invoking the appropriate methods on the catalog classes. The implementation of the other utility functions load, print and quit are described below.
void UT_quit(void)
This function should simple delete the RelCatalog and AttrCatalog objects
(causing the underlying files to be closed), delete the BufMgr object (causing
any remaining dirty pages to be flushed to disk) and then terminate the minirel
program by calling exit().
const Status UT_load(const string & relName, const string &
fileName)
Loads the tuples from the UNIX file fileName into the specified
relation. The relation must already exist before UT_load is
invoked.
const Status UT_print(const string & relName)
Prints the values of all the attributes of all tuples of the specified
relation. We will give you this code.
You should continue to use the same error handling mechanism that you used for the previous parts of the project. Feel free to augment this with new error codes and messages as needed.
Start by downloading the Zip file for this stage (since you are not doing this stage, you can just ignore this). This Zip file contains the following files that are relevant to this part of the project (in addition to other files which were created while developing the lower layers).
In the Zip file you will find a shell program called uttest
along with two directories uttestqueries and uttestdata. If you look at the
files in the directory uttestqueries you will notice that they contain a
progressively more difficult series of tests . The directory uttestdata
contains some small tables that will get loaded by the test queries.
To run all the tests, simply invoke uttest while in the
part 5 directory. If you want to run one test at a time simply type uttest
i where where i is a number between 1 and 10. For example, uttest
3 will run the test in the file ut.3 in the uttestqueries directory.