Database Management Systems

by Raghu Ramakrishnan and Johannes Gehrke

CS 186: Introduction to Database Systems

Homework 5: Functional Dependencies and Schema Refinement

Fall, 1996

Redundancy is at the root of several problems associated with relational schemata (schemata is the plural of schema). The problems include redundant storage and insert/delete/update anomalies. This exercise introduces you to a normalization tool called designview, which is distributed with minibase. It allows you to design schemata, identify functional dependencies, isolate sources of potential redundancy, and decide how you want to deal with the redundancy. As you work through this assignment, you will be asked questions; your answers to these questions will form your solution set for this assignment.  The questions to be answered for the assignment are numbered and highlighted in the text. Be sure to answer all of them!

Creation of a Relation

  • Open up the designview tool by typing designview.(Note: if you have modified your class account environment, you may need to add the line
             setenv PLATFORM hppa1.1-hp-hpux9.05
    to your .cshrc.)
  • Under the File menu, select New. This will open a window in which you can create a new schema.
  • In the Database Attributes frame click on the Add button. This will open a window where you can specify the information for the new attribute.
  • In the Name: box enter StudentID.
  • In the Type: box click on Int. Click the OK button.
  • In a similar manner, create attributes Course (char[20]), Grade (char[1]), Hour (datetime), Room (char[10]), and Instructor (char[20]). For the attributes with char types don't forget to enter the appropriate length in the Length: box.

Creation of Functional Dependencies

  • In the Functional Dependencies frame click on the Add button. This will open a window where you can edit the new functional dependency.
  • In the All Attributes frame click on the Course attribute.
  • In the Antecedent frame click on the Add button. Notice that Course now apears in the Antecedent frame.
  • Now click on the Instructor attribute in the All Attributes frame.
  • In the Consequence frame click on the add button.
  • Click on the OK button. Note that the FD Course ->Instructor now appears in the Functional Dependencies frame.
  • In a similar manner, create the FDs:
    • {Course,StudentID} -> {Grade}
    • {Hour,StudentID} -> {Course}
    • {Hour,Instructor} -> {Room}
    • {Hour,Room} -> {Course}

Creation of a Schema

Now that we have defined what attributes and FD's the database contains we must decide on a schema. To start with, we will be simple-minded and simply take all of the attributes in the DB as the only relation. To do this:

  • In the Source Relations frame, under the Names: box, click on the add button.
  • A new relation window will pop up, enter the name "Base". Then hit OK.
  • Click and drag over all the attributes in the Database Attributes frame.
  • Click on the Add button under the Attributes: box. All the attributes should now appear in the box.
  • We are done creating an initial schema. Go to the File menu and select Save As, and save the schema in some convenient place in your directory. (Remember the name you choose!)
  • Close the schema creation window by selecting Exit from the File menu.
  • Important: In order to do any work with the schema you have just created you must now load it. To do this select Open from the File menu of the original designview window, and load the file in which you saved the schema.

Potential problems

With the schema that you have just created there are several problems caused by the functional dependencies. The first set of questions is designed to illustrate these problems.

Consider a possible set of data that may be stored in the above schema:

Course Teacher Room Hour Student ID Grade
CS 186 Hellerstein Soda 306 12:30 TR 2154 A
CS 186 Hellerstein Soda 306 12:30 TR 1129 A
CS 186 Hellerstein Soda 306 12:30 TR 8510 AB
CS 186 Hellerstein Soda 306 12:30 TR 4243 C
CS 286 Stonebraker Cory 150 11:00 TR 9821 C
CS 286 Stonebraker Cory 150 11:00 TR 8435 A
CS 286 Stonebraker Cory 150 11:00 TR 2835 A
CS 150 Patterson Soda 505 1:00 TRF 7235 AB
CS 150 Patterson Soda 505 1:00 TRF 8510 AB
CS 150 Patterson Soda 505 1:00 TRF 2449 AB
CS 170 Blum Soda 405 2:25 MWF 9821 B
CS 170 Blum Soda 405 2:25 MWF 8510 A
CS 170 Blum Soda 405 2:25 MWF 2154 BC

Questions on the Original Schema:

  1. Give an example of an anomaly resulting from an update in the Teacher field.
  2. Give an example of an insertion anomaly that could occur.
  3. What would happen if all the students were to drop Professor Hellerstein's course?

Decomposition of the schema can eliminate the above problems!

BCNF Testing & Decomposition in Designview

We learned in class that if all of the relations in a schema are in BCNF, then no redundancies should arise. To test if the Base relation is in BCNF, do the following:

  • In the main designview window, you should see one box, representing the Base relation you created. (If you don't, perhaps you forgot to open your schema?)
  • You will now begin an editing session on the Base relation.  This session will keep track of any changes you make to Base. To start the session, click on the Base box, and choose Start New Session from the Relation menu (alternatively, you may simply double-click on the Base box, or press the Start Session button.) A new window should pop up, again containing the Base box. We will continue in this new window.
  • In order to test to see if Base is in BCNF, click on it in the new window, and under the Relation menu choose Test/Decompose...  A nested menu will pop up; choose Test BCNF from this menu.
  • A new BCNF Test window will pop up, showing any FDs which violate BCNF. Clearly, Base was not in BCNF!
  • In the BCNF Test window, note the FD Hour, Instructor-> Room.
    1. Explain why this FD violates BCNF.
  • Now, highlight that FD with the mouse, and press the decompose button.  Look back at the session window: note that two new boxes have appeared (you may need to resize the session window to see this.) These two boxes represent a decomposition of the Base relation into two relations, Base1 and Base2.
    1. What is the difference between Base2 and Base?  
    2. How did the tool choose the columns of Base1?

(Note that the tool has automatically carried out the decomposition technique we learned in class!)

Testing Properties of the Decomposition

Our next task is to ensure that the decomposition we just performed is lossless-join; if it is not, we will be unable to reconstruct the original relation!

  • First, we will use the tool to test if the decomposition was lossless-join. To do so, in the session window, click on the Base relation. Then, from the Relation menu, choose Test/Decompose..., and from the nested menu choose Test Lossless Join.
    1. Is the decomposition lossless-join?
    2. Prove that the answer the tool gave for the previous question is correct, based on the columns in Base1 and Base2, and the FDs.

Now we must check that the decomposition is dependency-preserving.  If it is not dependency-preserving, it will be expensive to maintain the dependencies.

  • In the session window, click on the Base relation. Then, from the Relation menu, choose Test/Decompose..., and from the nested menu choose Test FD Preserving.
    1. Is the decomposition dependency-preserving?
    2. Assuming that Hour, Teacher -> Room were the only FD in this schema, how would you guarantee (by hand) that the decomposition was dependency-preserving?

Completion of the BCNF Decomposition

Continue using the tool in this fashion -- one FD at a time -- to decompose the relation into BCNF. (Note: you only decompose the "leaves" of the decomposition tree.  "Internal nodes" represent relations that have already been decomposed.) As you decompose, the tool is actually checking automatically whether or not your decompositions are lossless-join and dependency-preserving.  

    1. At some point, the tool will warn you that your decomposition is not dependency-preserving. At that point, press the "Continue" button. What was your choice of FD to decompose that led to the warning?

 When you arrive at a schema which is in BCNF, choose Save As from the Session menu, and when prompted for a name for the session, type 'MyBCNF'. Then close the session window by choosing Exit from the Session menu.

Conversion to SQL

In the main designview window, the Base relation should still be visible in an unchanged form.  However, designview has remembered your "MyBCNF" decomposition session. To see the SQL commands that create your decomposed tables, select the Base relation and press the Show SQL button. Write down the resulting SQL DDL commands on scratch paper (or cut-and-paste into your favorite editor.)

Unfortunately, designview does not provide all the SQL constraints required to guarantee that dependencies are maintained by the system.

    1. Modify the SQL DDL from designview to guarantee the dependencies.  

Automatic Decomposition in Designview

In our previous session, we decomposed Base one FD at a time, to get a BCNF decomposition. Designview can automate these individual steps.  We will try this now:

  • Start a new session in the main designview window.
  • In the new Session window, click on the Base relation.
  • From the Relation menu, choose Test/Decompose... and then Decompose into BCNF.
  • You will receive a warning about "non-binary lossless join".  Ignore it by pressing the OK button when it pops up.
  • You will then receive a warning about a non-dependency-preserving decomposition.  Ignore this by pressing Continue.
  • You may then receive warnings about existing relations.  For each of these warnings, choose Replace With an Existing Relation.
    1. Briefly describe how (if) the resulting schema differs from your previous BCNF schema.

Since this decomposition was not dependency-preserving, we will delete it and consider a 3NF decomposition instead.

  • To delete this BCNF decomposition, click on the Base relation in the New Session window.
  • From the Relation menu, choose Undo Decomposition.
  • In the resulting dialog box, choose Delete. The session window should revert to its previous state.
  • From the Relation menu, choose Test/Decompose... and then Decompose into 3NF.
  • Handle any dialog boxes as you did above, until you get the new 3NF decomposition.
    1. Is the resulting decomposition lossless-join? 
    2. Is it dependency-preserving?
    3. Briefly describe how (if) the resulting schema differs from your original BCNF schema.
  • Save this session: from the Session menu, choose Save As. Save the session as "My3NF".
  • Exit this session: from the Session menu, choose Exit.

SQL for the 3NF Decomposition: Using Multiple Saved Sessions

You now have two sessions saved: MyBCNF and My3NF. To choose the default session, press the View Details button in the main designview window. Three extra frames pop up: one for attributes of Base, one for the FDs, and one with a list of Decomposition Sessions. We want to make the My3NF decomposition be the default:

  • In the Decomposition Sessions frame, click on My3NF. Then, using the middle mouse button, click-and-hold on My3NF again. A menu pops up; choose Set as Default. Note that My3NF is moved to the top of the list in that frame -- the topmost session in that list is the current default session in designview.
  • Hide the details by pressing the Hide Details button.

Now you can see the SQL for the 3NF decomposition by clicking the Show SQL button.

    1. Write down the resulting SQL DDL, but modify it as needed to ensure that dependencies are preserved.

A Dependency-Preserving, BCNF Decomposition

In general, there is no dependency-preserving BCNF decomposition of every schema. However, it happens that there is a dependency-preserving BCNF decomposition of our schema, even though designview doesn't find it automatically. The final part of this assignment is to find a dependency preserving BCNF decomposition for Base.  You may want to use designview to help you, but you shouldn't rely on it to find the decomposition for you; it won't!

    1. Write down the full SQL DDL for a dependency-preserving, BCNF decomposition of the original Base relation.