Normalize the following table.
Cust_Id | Cust_name | Cust_address | Zip_code | city | House_no | Items | Company | Price |
---|---|---|---|---|---|---|---|---|
1 | Bob | Ktm, Pok | 001 | Sanothimi | 009 | Mobile | Samsung | 50000 |
2 | John | But, Bhair | 002 | Lalitpur | 010 | TV | Sony | 80000 |
3 | Cristina | Narayangadh | 003 | Baneshwor | 111 | Camera | LG | 55000 |
4 | Katrina | Dang | 004 | New Road | 112 | Laptop | Dell | 65000 |
A: Normalize the table (Cust_Id, Cust_address, Zip_code, city, Items, Price ...) : 83
Steps for Normalization
Before answering the question, let me tell you there is a printing mistake in the question. The column 'Cust_address' should have been named as 'city' and vice-versa.
While approaching such questions in exams we have two options.
- Mention your assumptions and answer the question as it is.
- Or, correct the question, mention the changes you have done to it and answer it accordingly. This approach has been taken in this article.
In this answer, I am going with the first option i.e., answer it as it is.
Back to top1. First Normal Form(1NF)
The only column that may(or may not) be violating the First Normal Form is Cust_address as it seems to have non-atomic values ('Ktm, Pok' and 'But, Bhair'). Before deciding whether it is violating 1NF, let's examine the data. We can't see a proper co-relation between these values and the corresponding city values 'Sanothimi' and 'Lalitpur' which are single-valued. So, we consider them as values for the same location i.e., something like 'Main Road, Durbar Marg' or 'XYZ Hotel, Airport'.
In such cases, we can either:
- Consider Cust_address as a composite attribute and divide it into two columns.
- Or, we can consider it a 'single address string' and use it as it is.
We take the first option and divide it into two columns (Cust_address_1 and Cust_address_2):
1. R1
Cust_Id | Cust_name | Cust_address_1 | Cust_address_2 | Zip_code | city | House_no | Items | Company | Price |
---|---|---|---|---|---|---|---|---|---|
1 | Bob | Ktm | Pok | 001 | Sanothimi | 009 | Mobile | Samsung | 50000 |
2 | John | But | Bhair | 002 | Lalitpur | 010 | TV | Sony | 80000 |
3 | Cristina | Narayangadh | NULL | 003 | Baneshwor | 111 | Camera | LG | 55000 |
4 | Katrina | Dang | NULL | 004 | New Road | 112 | Laptop | Dell | 65000 |
Now, since there aren't any no-atomic values, table R1 is in the First Normal Form.
Back to top
2. Second Normal Form(2NF)
1. Check for Partial Dependencies
Let's check if there are any partial dependencies in R1. Currently, we have considered a single attribute Primary Key, that is 'Cust_Id'. All other attributes are derived from Cust_Id. So, there aren't any partial dependencies and R1 is in Second Normal Form.
Back to top
3. Third Normal Form(3NF)
There are two sets of transitive dependencies in the relation R1.
The first set of transitive dependencies, related to Cust_Id and Items:
Cust_Id → Items
Items → Company
i.e., Cust_Id → Items → Company
Also,
Items → Price
i.e., Cust_Id → Items → Price
Let's decompose R1 into R2 and R3 to bring it to the Third Normal Form.
Note: Here, the column Items is not suitable to be a Primary Key because item names are not unique. It is always a safer option to add a surrogate(artificial) primary key for it. So, we are adding the extra column Item_ID as a primary key.
1. R2
Item_Id | Items | Company | Price |
---|---|---|---|
1 | Mobile | Samsung | 50000 |
2 | TV | Sony | 80000 |
3 | Camera | LG | 55000 |
4 | Laptop | Dell | 65000 |
2. R3
Cust_Id | Cust_name | Cust_address_1 | Cust_address_2 | Zip_code | city | House_no | Item_Id |
---|---|---|---|---|---|---|---|
1 | Bob | Ktm | Pok | 001 | Sanothimi | 009 | 1 |
2 | John | But | Bhair | 002 | Lalitpur | 010 | 2 |
3 | Cristina | Narayangadh | NULL | 003 | Baneshwor | 111 | 3 |
4 | Katrina | Dang | NULL | 004 | New Road | 112 | 4 |
There is still another transitive dependency in the relation R3.
The second set of transitive dependency, related to Cust_Id and Zip_code:
Cust_Id → Zip_code
Zip_code → City
i.e., Cust_Id → Zip_code → City
So, let's decompose R3 into R4 and R5. We remove the column 'city' from table R4.
3. R4
Cust_Id | Cust_name | Cust_address_1 | Cust_address_2 | Zip_code | House_no | Item_Id |
---|---|---|---|---|---|---|
1 | Bob | Ktm | Pok | 001 | 009 | 1 |
2 | John | But | Bhair | 002 | 010 | 2 |
3 | Cristina | Narayangadh | NULL | 003 | 111 | 3 |
4 | Katrina | Dang | NULL | 004 | 112 | 4 |
4. R5
Zip_code | city |
---|---|
001 | Sanothimi |
002 | Lalitpur |
003 | Baneshwor |
004 | New Road |
There aren't any transitive dependencies anymore in R4 and R5. Hence, our final tables are R2, R4 and R5 which are in the Third Normal Form(3NF).
We can stop here because the question has not asked a Specific Normal Form and in most cases, the relations in 3NF are considered good enough.
Back to top