Normalize the following table(question corrected).
Cust_Id | Cust_name | city | Zip_code | Cust_address | House_no | Items | Company | Price |
---|---|---|---|---|---|---|---|---|
1 | Bob | Ktm, Pok | 001 | Sanothimi | 009 | Mobile | Samsung | 50000 |
2 | John | But, Bhair | 002 | Lalitpur | 010 | TV | Sony | 80000 |
3 | Cristina | Narayangadh | 003 | Baneshwor | 111 | Camera | LG | 55000 |
4 | Katrina | Dang | 004 | New Road | 112 | Laptop | Dell | 65000 |
A: Normalize the following table (Cust_Id, city, Zip_code, Cust_address, House_number, Items, Price). Question Corrected: 85
Steps for normlization
This is a follow-up article to the previous article where we discussed the printing mistake and ways to approach it. There we solved the question as it is, i.e., without correcting it. Here, we solve the same question with another approach. That is, we correct(i.e., swap) the columns(Cust_address and city) and answer the question accordingly.
Note: Even after correcting the columns, the data is still incomplete and non-matching. For example, the customer(Cust_Id, 1) has two cities Ktm and Pok. But the corresponding Cust_address, Zip_code, and House_no are single. Another example is, for Cust_Id 3, the city is Narayangadh and Cust_address is Baneshwor. We will ignore these issues and proceed to normalization.
Back to top1. First Normal Form(1NF)
The column city has non-atomic values ('Ktm, Pok' and 'But, Bhair'), which means multiple cities. Let's decompose the table and bring the multi-valued attribute 'city' to a different table R2 and the rest of the attributes will be in R1.
1. R1
Cust_Id | Cust_name | Zip_code | Cust_address | House_no | Items | Company | Price |
---|---|---|---|---|---|---|---|
1 | Bob | 001 | Sanothimi | 009 | Mobile | Samsung | 50000 |
2 | John | 002 | Lalitpur | 010 | TV | Sony | 80000 |
3 | Cristina | 003 | Baneshwor | 111 | Camera | LG | 55000 |
4 | Katrina | 004 | New Road | 112 | Laptop | Dell | 65000 |
2. R2
Cust_Id | city |
---|---|
1 | Ktm |
1 | Pok |
2 | But |
2 | Bhair |
3 | Narayangadh |
4 | Dang |
Now, the tables R1 and R2 are in First Normal Form since there aren't any no-atomic values.
Back to top
2. Second Normal Form(2NF)
1. Check for partial dependencies.
In R1, there is a single(non-composite) primary key. Hence R1 does NOT have partial dependencies.
In R2, there are only prime attributes. Hence, R2 does NOT have partial dependencies too.
Thus, R1 and R2 are in the Second Normal Form.
Back to top
3. Third Normal Form(3NF)
There are transitive dependencies in the relation R1.
Cust_Id → Items
Items → Company
i.e., Cust_Id → Items → Company
Also,
Items → Price
i.e., Cust_Id → Items → Price
Let's decompose R1 into R3 and R4 to bring it to the Third Normal Form.
Note: Here, the column Items is not suitable to be a Primary Key because item names are not unique. It is always a safer option to add a surrogate(artificial) primary key for it. So, we are adding the extra column Item_ID as a primary key.
1. R3
Item_Id | Items | Company | Price |
---|---|---|---|
1 | Mobile | Samsung | 50000 |
2 | TV | Sony | 80000 |
3 | Camera | LG | 55000 |
4 | Laptop | Dell | 65000 |
2. R4
Cust_Id | Cust_name | Zip_code | Cust_address | House_no | Item_Id |
---|---|---|---|---|---|
1 | Bob | 001 | Sanothimi | 009 | 1 |
2 | John | 002 | Lalitpur | 010 | 2 |
3 | Cristina | 003 | Baneshwor | 111 | 3 |
4 | Katrina | 004 | New Road | 112 | 4 |
There aren't any transitive dependencies anymore in R3 and R4. Hence, our final tables are R2, R4 and R5 which are in the Third Normal Form(3NF). The question has not targeted any particular normal form. In such cases, the relations in 3NF are considered good enough.
Back to top