Wednesday, April 4, 2012

Database Normalization - Third Normal Form

In previous two posts we discussed First Normal Form and Second Normal Form. Today, we will discuss Third Normal Form.

Third Normal Form (3NF)
This rule breaks down the table structure a bit further. For a table to be in 3NF, it must also satisfy 1NF and 2NF. To make a 2NF table satisfy 3NF, remove the columns that don't fully depend on the primary key.

Imagine you have a payroll application which records the total hours an employee worked, the pay rate and total pay for the week. The following table schema can be used...

PayRoll

  • EmployeeID
  • WorkedDate
  • HoursWorked
  • PayRateID
  • Total
Let's see if this table satisfies 1NF and 2NF. There is no duplicate data in the same row. The Payroll table is associated to Employee table via EmployeeID and to PayRates via PayRateID. It has no redundant data, has a primary key (EmployeeID and WorkedDate), and foreign keys (EmployeeID and PayRateID). Thus, all conditions for 1NF and 2NF are satisfied.

What about 3NF? Is there any field here that doesn't depend on Primary Key? HoursWorked is fully dependent on EmployeeID and WorkedDate. Different Employees could work different number of hours on different dates. Also, PayrateID is fully dependent on Primary Key as well.

At first it may appear that PayrateID doesn't depend on WorkedDate but rather only on EmployeeID because an Employees' pay will be same day after day. But what if the Employee works on a holiday when they will be paid time and a half?

What about Total column? Total column is really HoursWorked * PayRate, hence it is a computed column and as such isn't fully dependent on primary key. For this table to be in 3NF, this column must be removed. So, where do you store this column? Actually no where. This is a computed column and you should compute it on the fly when retrieving the data.

As I mentioned in the first post about Normalization, the normalization goes up to 6NF and more normalization types may still be possible, but 3NF is generally the standard for most databases.

Thank you.


No comments:

Post a Comment

As always, your comments are welcome and appreciated!