|
|
Normalization
Go to Normal Forms, Examples, BCNF Q1. What is Normalization? A1. Normalization imposes constraints on
tables to make sure that facts are not repeated in the tables, and to reduce
the possibility of inconstancies. Normalization
also makes it easier to make changes to the database design (schema). Q2. What undesirables can normalization
keep out? A2. Normalization eliminates data
redundancy and inconsistent dependency. Q3. Why is redundancy undesirable? A3. Redundancy wastes disk space and
creates maintenance problems. If data that exists in more than one place must
be changed, the data must be changed in exactly the same way in all
locations. Q4. What is an "inconsistent
dependency"? A4. It's when you have an attribute
(fact) not about the primary key, which means that fact is in
the wrong table. For example, a student table that contains the address of
the Department of his/her major. That address is in the wrong tab le, because
that address should be in the Department table. That address is not a fact
about the Student Primary Key. Q5. Why "inconsistent
dependency" undesirable? A5. Because it leads to deletion
and insertion anomalies. We have a deletion anomaly when deletion of a
record loses information we did not intend to lose. For example, if in a
Student table we had Student Advisor and Adv-Room and we dele ted a student
whose advisor advised no one else. Then we would lose the advisor's room
location (deletion anomaly). Similarly we would not know an advisor's room
number until that advisor was assigned to a student (insertion anomaly). Q6. OK, what are these constraints
(rules)? A6. . Each constraint (rule) is called a
"normal form." If the first rule is observed, the database is said
to be in "first normal form." If the first three rules are
observed, the database is considered to be in "third normal form."
Although other lev els of normalization are possible, third normal form is
considered the highest level necessary for most applications. Q7. What do these constraints (rules) do
to the original tables? A7. The constraints (rules) in general
break up larger tables into smaller tables so that each fact is a fact about
the Primary Key and nothing else. Go to Top Q8. What are these constraints (rules)? Q8. The rules are: First
|
|
|
The Boyce Codd
Q10 What is the BCNF? I heard of it, but I did not really understand it.
Q10. An example may help. Suppose a student can have more than one major and we would like to keep track of the student's major(s) and students advisors in the following table:
|
Students |
StudentId |
Major |
Advisor |
|
100 |
Math |
Hilbert |
|
|
150 |
Psychology |
Jung |
|
|
200 |
Math |
Courant |
|
|
300 |
Psychology |
Ruth |
|
|
300 |
Com. Sci. |
Vasilaky |
Since StudentId repeats it can't be the Primary Key. We can choose either StudentId and Major as the Primary Key or StudentId and Advisor as Primary Key. Say we choose StudentId and Major as the Primary Key. But that means that the remaining field Advisor is a fact about both StudentId and Advisor. So we know that Hilbert is an advisor for math majors and advises student 100.
This table is in 3NF but it still has anomalies (inconstancies). It's in 2NF because the advisor is a fact about both the student advised and the major he/she advises. It's in 3Nf because advisor is a fact only about the primary key (StudentId, Major).
Suppose student 300 drops out of school. If we delete student 300 we lose
the fact that Dr. Ruth Advises psychology. This is a deletion anomaly. Also how
can we know that Dr. Freedman advises Economics until student major in
Economics? This is an insertion anomaly. So we have inconsistent dependency.
An attribute is a determinant if it
determines another attribute. For example StudentId determines Major. Advisor
determines the major she/he advises.
BCNF
A table is in BCNF if it's in 3rd NF
and every determinant can be used as a Primary Key.
In our example Advisor attribute
determines Major but is not a possible Primary Key. StudentId and Advisor
together is a possible (candidate) Primary Key.
Normalized:
STUDENTS
|
StudentId |
Advisor |
|
100 |
Hilbert |
|
150 |
Jung |
|
200 |
Courant |
|
300 |
Ruth |
|
300 |
Vasilaky |
ADVISORS
|
Advisor |
Major |
|
Courant |
Math |
|
Vasilaky |
Comp Sci |
|
Ruth |
Psychology |
|
Hibert |
Math |
|
Jung |
Psychology |
Fourth Normal Form A table is in the 4NF if it's in BCNF and has no
attribute with mutivalued dependencies.
Suppose a Student can have more than one major
and more than one activity. For example:
|
StudentId |
Major |
Activity |
|
100 |
Music |
Swimming |
|
100 |
Accounting |
Swimming |
|
100 |
Music |
Tennis |
|
100 |
Accounting |
Tennis |
|
150 |
Math |
Jogging |
Note that all three attributes make up the
Primary Key.
Note that StudentId can be associated with
many major as well as many activities (multivalued dependency). Multivalued
dependency lead to modification anomalies. Suppose student 100 signs up for
skiing. Then we would insert (100, Music, Skiing)
This row implies that student 100 skies as
Music major but not as an accounting major, so in order to keep the data
consistent we must add one more row (100, Accounting, Skiing). This is an
insertion anomaly. Here are the tables Normalized:
Student-Major
|
StudentId |
Major |
|
100 |
Music |
|
100 |
Accounting |
|
100 |
Math |
Student-Activity
|
StudentId |
Activity |
|
100 |
Skiing |
|
100 |
Swimming |
|
100 |
Tennis |
|
150 |
Jogging |
Go to Top