Explain how aggregation and generalization help to minimize redundancy while designing ER Diagram.
A: Role of aggregration and generalization in minimizing redundancy: 77
Aggregation, Generalization and Specialization are concepts introduced in EER (Extended ER Model). These concepts help database designers to design schemas for complex systems. Besides, they also minimize redundancy in Relational Model as well as ER Model. Let's explore how.
Table of Contents
1. Aggregation:
The basic ER model lacks the features to:
- Establish a relationship between two(or more) 'relationship sets'.
- Establish a relationship between a 'collection of entity sets' and 'relationship sets'.
Thus, aggregation was introduced in Extended ER Model to support such situations. Aggregation not only helps us handle such complex situations but also minimizes redundancy. It helps in minimizing redundancy by reducing the number of relationship sets in the ER diagram. To understand it better, let's see an example.
Suppose, we have the following entity sets:
Student ( sid, sname, email )
Project ( pid, pname, topic )
Teacher ( tid, tname )
Assumptions:
- A student may work on multiple projects.
- A project may involve multiple students.
- A teacher guides multiple students on various projects.
In our database, we should maintain the following information:
- Which student works on which project?
- Which teacher guides which student on which project? Note that it's not only 'which student or which project'. It is 'which teacher guides which student on which project?'.
Let's represent these entity sets and relationship sets in ER Diagram. For simplicity, the attributes have not been included.
1. ER Model without aggregation
Without aggregation, we will end up with the following relationship sets:
Student works_on Project.
Student is guided_by Teacher.
Project is guided_by Teacher.
2. Relational Model without Aggregation
Entities in the Relational Model:
Student
sid | sname |
---|
Project
pid | pname | topic |
---|
Teacher
tid | tname |
---|
Relationship sets in the Relational Model:
Student_works_on_Project
sid | pid |
---|
Student_guided_by_Teacher
sid | tid |
---|
Project_guided_by_Teacher
pid | tid |
---|
3. ER Model with aggregation
With aggregation, we will have the following relationship sets:
- 'Student' works_on 'Project'.
- 'Student_works_on_Project' (aggregated relationship) is guided_by 'Teacher'.
4. Relational Model with Aggregation
Entities in the Relational Model:
Same as before
Relationship sets in the Relational Model:
With aggregation, the previous relations have been reduced to a single relation for the relationship.
Student_Project_guided_by_Teacher
sid | pid | tid |
---|
Here, we can see clearly that the number of relationship sets has been reduced from three to two. Also, the relations have been reduced from three to one. This is how 'aggregation' minimizes redundancy in both ER Model and Relational Model.
Back to top
2. Generalization
In Generalization, several entity sets sharing common attributes are generalized into superclasses or higher-level entities. Generalization helps in minimizing redundancy in ER diagrams and Relational models. Whenever there are entity instances that belong to more than one entity set, with the help of generalization and specialization, we can save the common details of those entities in a single relation instead of multiple. To understand this more clearly, let's see it in action.
For example, consider the following entities and their attributes:
Actor ( ID, name, country, no_of_awards, first_movie )
Musician ( ID, name, country, no_of_awards, musical_instrument )
1. ER Model without Generalization
Here, the common attributes (ID, name, country, no_of_awards) are being repeated in both the relations.
2. Relational Model without Generalization
Actor
ID | name | country | no_of_awards | first_movie |
---|---|---|---|---|
1 | Khagendra | Nepal | 5 | Sirumarani |
2 | Atif Aslam | Pakistan | 10 | Bol |
3 | Ali Zafar | Pakistan | 6 | Tere Bin Laden |
4 | Al Pacino | America | 50 | Me, Natalie |
Musician
ID | name | country | no_of_awards | musical_instrument |
---|---|---|---|---|
1 | Sugam Pokhrel | Nepal | 10 | Guitar |
2 | Atif Aslam | Pakistan | 10 | Guitar |
3 | Ali Zafar | Pakistan | 6 | Keyboard |
4 | Zakir Hussain | India | 80 | Tabla |
Here, we can see that the information of Ali Zafar and Atif Aslam is being repeated in two relations/tables. This is because they are both actors and musicians. As there are common attributes between the entity sets 'Actor' and 'Musician', let's generalize them into a single entity set Artist.
The common attributes will be stored in the new schema 'Artist'.
Artist( ID, name, country, no_of_awards)
Actor( ID, first_movie )
Musician( ID, musical_instrument )
3. ER Model with Generalization
Here, the common attributes (ID, name, country, no_of_awards) have been shifted to the superclass entity-set Artist. Hence, those attributes appear only once instead of being repeated in Actor and Musician. Thus, it can be said that, by using Generalization, we have minimized the redundancy in the ER diagram.
4. Relational Model with Generalization
Based on the new ER diagram, we can create the following relations.
Artist
ID | name | country | no_of_awards |
---|---|---|---|
1 | Khagendra | Nepal | 5 |
2 | Atif Aslam | Pakistan | 10 |
3 | Ali Zafar | Pakistan | 6 |
4 | Al Pacino | America | 50 |
5 | Sugam Pokhrel | Nepal | 10 |
6 | Zakir Hussain | India | 80 |
Actor
ID | first_movie |
---|---|
1 | Sirumarani |
2 | Bol |
3 | Tere Bin Laden |
4 | Me, Natalie |
Musician
ID | musical_instrument |
---|---|
5 | Guitar |
2 | Guitar |
3 | Keyboard |
6 | Tabla |
Here, only the primary key ID is being repeated in the tables 'Actor' and 'Musician'. The value for other attributes(name, country, no_of_awards) are stored only once. Hence we can conclude that if generalization is used properly in ER diagrams, it helps in minimizing the redundancy in Relational Model too.
Back to top