CS 186: Introduction to Database Systems
Homework 5: Functional Dependencies and Schema Refinement
Fall, 1996
Redundancy is at the root of several problems associated with relational
schemata (schemata is the plural of schema). The problems include
redundant storage and insert/delete/update anomalies. This exercise introduces
you to a normalization tool called designview, which is distributed
with minibase. It allows you to design schemata, identify functional dependencies,
isolate sources of potential redundancy, and decide how you want to deal
with the redundancy. As you work through this assignment, you will be asked
questions; your answers to these questions will form your solution set
for this assignment. The questions to be answered for the assignment
are numbered and highlighted in the text. Be sure to answer all of them!
Creation of a Relation
- Open up the designview tool by typing designview.(Note: if
you have modified your class account environment, you may need to add the
line
setenv PLATFORM
hppa1.1-hp-hpux9.05
to your .cshrc.)
- Under the File menu, select New. This will open a window
in which you can create a new schema.
- In the Database Attributes frame click on the Add button.
This will open a window where you can specify the information for the new
attribute.
- In the Name: box enter StudentID.
- In the Type: box click on Int. Click the OK button.
- In a similar manner, create attributes Course (char[20]), Grade
(char[1]), Hour (datetime), Room (char[10]), and Instructor
(char[20]). For the attributes with char types don't forget
to enter the appropriate length in the Length: box.
Creation of Functional Dependencies
- In the Functional Dependencies frame click on the Add
button. This will open a window where you can edit the new functional dependency.
- In the All Attributes frame click on the Course attribute.
- In the Antecedent frame click on the Add button. Notice
that Course now apears in the Antecedent frame.
- Now click on the Instructor attribute in the All Attributes
frame.
- In the Consequence frame click on the add button.
- Click on the OK button. Note that the FD Course ->Instructor
now appears in the Functional Dependencies frame.
- In a similar manner, create the FDs:
- {Course,StudentID} -> {Grade}
- {Hour,StudentID} -> {Course}
- {Hour,Instructor} -> {Room}
- {Hour,Room} -> {Course}
Creation of a Schema
Now that we have defined what attributes and FD's the database contains
we must decide on a schema. To start with, we will be simple-minded and
simply take all of the attributes in the DB as the only relation. To do
this:
- In the Source Relations frame, under the Names: box,
click on the add button.
- A new relation window will pop up, enter the name "Base".
Then hit OK.
- Click and drag over all the attributes in the Database Attributes
frame.
- Click on the Add button under the Attributes: box. All
the attributes should now appear in the box.
- We are done creating an initial schema. Go to the File menu
and select Save As, and save the schema in some convenient place
in your directory. (Remember the name you choose!)
- Close the schema creation window by selecting Exit from the
File menu.
- Important: In order to do any work with the
schema you have just created you must now load it. To do this select Open
from the File menu of the original designview window, and load the
file in which you saved the schema.
Potential problems
With the schema that you have just created there are several problems
caused by the functional dependencies. The first set of questions is designed
to illustrate these problems.
Consider a possible set of data that may be stored in the above schema:
Course |
Teacher |
Room |
Hour |
Student ID |
Grade |
CS 186 |
Hellerstein |
Soda 306 |
12:30 TR |
2154 |
A |
CS 186 |
Hellerstein |
Soda 306 |
12:30 TR |
1129 |
A |
CS 186 |
Hellerstein |
Soda 306 |
12:30 TR |
8510 |
AB |
CS 186 |
Hellerstein |
Soda 306 |
12:30 TR |
4243 |
C |
CS 286 |
Stonebraker |
Cory 150 |
11:00 TR |
9821 |
C |
CS 286 |
Stonebraker |
Cory 150 |
11:00 TR |
8435 |
A |
CS 286 |
Stonebraker |
Cory 150 |
11:00 TR |
2835 |
A |
CS 150 |
Patterson |
Soda 505 |
1:00 TRF |
7235 |
AB |
CS 150 |
Patterson |
Soda 505 |
1:00 TRF |
8510 |
AB |
CS 150 |
Patterson |
Soda 505 |
1:00 TRF |
2449 |
AB |
CS 170 |
Blum |
Soda 405 |
2:25 MWF |
9821 |
B |
CS 170 |
Blum |
Soda 405 |
2:25 MWF |
8510 |
A |
CS 170 |
Blum |
Soda 405 |
2:25 MWF |
2154 |
BC |
Questions on the Original Schema:
- Give an example of an anomaly
resulting from an update in the Teacher field.
- Give an example of an insertion
anomaly that could occur.
- What would happen if all
the students were to drop Professor Hellerstein's course?
Decomposition of the schema can eliminate the above problems!
BCNF Testing & Decomposition in Designview
We learned in class that if all of the relations in a schema are in
BCNF, then no redundancies should arise. To test if the Base relation is
in BCNF, do the following:
- In the main designview window, you should see one box, representing
the Base relation you created. (If you don't, perhaps you forgot to open
your schema?)
- You will now begin an editing session on the Base relation.
This session will keep track of any changes you make to Base. To
start the session, click on the Base box, and choose Start New Session
from the Relation menu (alternatively, you may simply double-click
on the Base box, or press the Start Session button.) A new window
should pop up, again containing the Base box. We will continue in this
new window.
- In order to test to see if Base is in BCNF, click on it in the new
window, and under the Relation menu choose Test/Decompose...
A nested menu will pop up; choose Test BCNF from this menu.
- A new BCNF Test window will pop up, showing any FDs which
violate BCNF. Clearly, Base was not in BCNF!
- In the BCNF Test window, note the FD Hour, Instructor->
Room.
- Explain why this FD violates
BCNF.
- Now, highlight that FD with the mouse, and press the decompose button.
Look back at the session window: note that two new boxes have appeared
(you may need to resize the session window to see this.) These two boxes
represent a decomposition of the Base relation into two relations, Base1
and Base2.
- What is the difference
between Base2 and Base?
- How did the tool choose
the columns of Base1?
(Note that the tool has automatically carried out the decomposition
technique we learned in class!)
Testing Properties of the Decomposition
Our next task is to ensure that the decomposition we just performed
is lossless-join; if it is not, we will be unable to reconstruct the original
relation!
- First, we will use the tool to test if the decomposition was lossless-join.
To do so, in the session window, click on the Base relation. Then, from
the Relation menu, choose Test/Decompose..., and from the
nested menu choose Test Lossless Join.
- Is the decomposition lossless-join?
- Prove that the answer the
tool gave for the previous question is correct, based on the columns in
Base1 and Base2, and the FDs.
Now we must check that the decomposition is dependency-preserving. If
it is not dependency-preserving, it will be expensive to maintain the dependencies.
- In the session window, click on the Base relation. Then, from the Relation
menu, choose Test/Decompose..., and from the nested menu choose
Test FD Preserving.
- Is the decomposition dependency-preserving?
- Assuming that Hour,
Teacher -> Room were the only FD in this schema, how would you
guarantee (by hand) that the decomposition was dependency-preserving?
Completion of the BCNF Decomposition
Continue using the tool in this fashion -- one FD at a time -- to decompose
the relation into BCNF. (Note: you only decompose the "leaves"
of the decomposition tree. "Internal nodes" represent relations
that have already been decomposed.) As you decompose, the tool is actually
checking automatically whether or not your decompositions are lossless-join
and dependency-preserving.
- At some point, the tool
will warn you that your decomposition is not dependency-preserving.
At that point, press the "Continue" button. What was your choice
of FD to decompose that led to the warning?
When you arrive at a schema which is in BCNF, choose Save As
from the Session menu, and when prompted for a name for the
session, type 'MyBCNF'. Then close the session window by choosing Exit
from the Session menu.
Conversion to SQL
In the main designview window, the Base relation should still be visible
in an unchanged form. However, designview has remembered your "MyBCNF"
decomposition session. To see the SQL commands that create your decomposed
tables, select the Base relation and press the Show SQL button.
Write down the resulting SQL DDL commands on scratch paper (or cut-and-paste
into your favorite editor.)
Unfortunately, designview does not provide all the SQL constraints required
to guarantee that dependencies are maintained by the system.
- Modify the SQL DDL from
designview to guarantee the dependencies.
Automatic Decomposition in Designview
In our previous session, we decomposed Base one FD at a time, to get
a BCNF decomposition. Designview can automate these individual steps. We
will try this now:
- Start a new session in the main designview window.
- In the new Session window, click on the Base relation.
- From the Relation menu, choose Test/Decompose... and
then Decompose into BCNF.
- You will receive a warning about "non-binary lossless join".
Ignore it by pressing the OK button when it pops up.
- You will then receive a warning about a non-dependency-preserving decomposition.
Ignore this by pressing Continue.
- You may then receive warnings about existing relations. For each
of these warnings, choose Replace With an Existing Relation.
- Briefly describe how (if)
the resulting schema differs from your previous BCNF schema.
Since this decomposition was not dependency-preserving, we will delete
it and consider a 3NF decomposition instead.
- To delete this BCNF decomposition, click on the Base relation in the
New Session window.
- From the Relation menu, choose Undo Decomposition.
- In the resulting dialog box, choose Delete. The session window
should revert to its previous state.
- From the Relation menu, choose Test/Decompose... and then Decompose
into 3NF.
- Handle any dialog boxes as you did above, until you get the new 3NF
decomposition.
- Is the resulting decomposition
lossless-join?
- Is it dependency-preserving?
- Briefly describe how (if)
the resulting schema differs from your original BCNF schema.
- Save this session: from the Session menu, choose Save As.
Save the session as "My3NF".
- Exit this session: from the Session menu, choose Exit.
SQL for the 3NF Decomposition: Using Multiple Saved Sessions
You now have two sessions saved: MyBCNF and My3NF. To choose the
default session, press the View Details button in the main designview
window. Three extra frames pop up: one for attributes of Base, one for
the FDs, and one with a list of Decomposition Sessions. We want to make
the My3NF decomposition be the default:
- In the Decomposition Sessions frame, click on My3NF. Then, using
the middle mouse button, click-and-hold on My3NF again. A menu pops up;
choose Set as Default. Note that My3NF is moved to the top of the
list in that frame -- the topmost session in that list is the current default
session in designview.
- Hide the details by pressing the Hide Details button.
Now you can see the SQL for the 3NF decomposition by clicking the Show
SQL button.
- Write down the resulting SQL DDL, but modify
it as needed to ensure that dependencies are preserved.
A Dependency-Preserving, BCNF Decomposition
In general, there is no dependency-preserving BCNF decomposition of
every schema. However, it happens that there is a dependency-preserving
BCNF decomposition of our schema, even though designview doesn't find
it automatically. The final part of this assignment
is to find a dependency preserving BCNF decomposition for Base.
You may want to use designview to help you, but you shouldn't rely
on it to find the decomposition for you; it won't!
- Write down the full SQL
DDL for a dependency-preserving, BCNF decomposition of the original
Base relation.
|