Normalize the table (Cust_Id, Cust_address, Zip_code, city, Items, Price ...)

Normalize the following table.

Cust_Id Cust_name Cust_address Zip_code city House_no Items Company Price
1 Bob Ktm, Pok 001 Sanothimi 009 Mobile Samsung 50000
2 John But, Bhair 002 Lalitpur 010 TV Sony 80000
3 Cristina Narayangadh 003 Baneshwor 111 Camera LG 55000
4 Katrina Dang 004 New Road 112 Laptop Dell 65000

Before answering the question, let me tell you there is a printing mistake in the question. The column 'Cust_address' should have been named as 'city' and vice-versa.

Normalization_Original_Question

While approaching such questions in exams we have two options.

  1. Mention your assumptions and answer the question as it is.
  2. Or, correct the question, mention the changes you have done to it and answer it accordingly. This approach has been taken in this article.

 

In this answer, I am going with the first option i.e., answer it as it is.

Back to top

1. First Normal Form(1NF)

The only column that may(or may not) be violating the First Normal Form is Cust_address as it seems to have non-atomic values ('Ktm, Pok' and  'But, Bhair'). Before deciding whether it is violating 1NF, let's examine the data. We can't see a proper co-relation between these values and the corresponding city values 'Sanothimi' and 'Lalitpur' which are single-valued. So, we consider them as values for the same location i.e., something like 'Main Road, Durbar Marg' or 'XYZ Hotel, Airport'.

In such cases, we can either:

  1. Consider Cust_address as a composite attribute and divide it into two columns.
  2. Or, we can consider it a 'single address string' and use it as it is.

 

We take the first option and divide it into two columns (Cust_address_1 and Cust_address_2):

1. R1

Cust_Id Cust_name Cust_address_1 Cust_address_2 Zip_code city House_no Items Company Price
1 Bob Ktm Pok 001 Sanothimi 009 Mobile Samsung 50000
2 John But Bhair 002 Lalitpur 010 TV Sony 80000
3 Cristina Narayangadh NULL 003 Baneshwor 111 Camera LG 55000
4 Katrina Dang NULL 004 New Road 112 Laptop Dell 65000

 

Now, since there aren't any no-atomic values, table R1 is in the First Normal Form.

 

Back to top

2. Second Normal Form(2NF)

1. Check for Partial Dependencies

Let's check if there are any partial dependencies in R1. Currently, we have considered a single attribute Primary Key, that is 'Cust_Id'. All other attributes are derived from Cust_Id. So, there aren't any partial dependencies and R1 is in Second Normal Form.

 

 

Back to top

3. Third Normal Form(3NF)

There are two sets of transitive dependencies in the relation R1.

The first set of transitive dependencies, related to Cust_Id and Items:

Cust_Id Items

Items Company

i.e., Cust_Id Items Company

 

 

Also,
Items Price
i.e., Cust_Id Items Price


Let's decompose R1 into R2 and R3 to bring it to the Third Normal Form.

Note: Here, the column Items is not suitable to be a Primary Key because item names are not unique. It is always a safer option to add a surrogate(artificial) primary key for it. So, we are adding the extra column Item_ID as a primary key.

1. R2

Item_Id Items Company Price
1 Mobile Samsung 50000
2 TV Sony 80000
3 Camera LG 55000
4 Laptop Dell 65000

 

2. R3

Cust_Id Cust_name Cust_address_1 Cust_address_2 Zip_code city House_no Item_Id
1 Bob Ktm Pok 001 Sanothimi 009 1
2 John But Bhair 002 Lalitpur 010 2
3 Cristina Narayangadh NULL 003 Baneshwor 111 3
4 Katrina Dang NULL 004 New Road 112 4

 

There is still another transitive dependency in the relation R3.

The second set of transitive dependency, related to Cust_Id and Zip_code:

Cust_Id Zip_code
Zip_code City
i.e., Cust_Id Zip_code City

 

So, let's decompose R3 into R4 and R5. We remove the column 'city' from table R4.

3. R4

Cust_Id Cust_name Cust_address_1 Cust_address_2 Zip_code House_no Item_Id
1 Bob Ktm Pok 001 009 1
2 John But Bhair 002 010 2
3 Cristina Narayangadh NULL 003 111 3
4 Katrina Dang NULL 004 112 4

 

4. R5

Zip_code city
001 Sanothimi
002 Lalitpur
003 Baneshwor
004 New Road

 

There aren't any transitive dependencies anymore in R4 and R5. Hence, our final tables are R2, R4 and R5 which are in the Third Normal Form(3NF).

We can stop here because the question has not asked a Specific Normal Form and in most cases, the relations in 3NF are considered good enough.

 

 

Back to top