Role of aggregration and generalization in minimizing redundancy

Explain how aggregation and generalization help to minimize redundancy while designing ER Diagram.

Aggregation, Generalization and Specialization are concepts introduced in EER (Extended ER Model). These concepts help database designers to design schemas for complex systems. Besides, they also minimize redundancy in Relational Model as well as ER Model. Let's explore how.

Back to top

1. Aggregation:

The basic ER model lacks the features to:

  1. Establish a relationship between two(or more) 'relationship sets'.
  2. Establish a relationship between a 'collection of entity sets' and 'relationship sets'.

 

Thus, aggregation was introduced in Extended ER Model to support such situations. Aggregation not only helps us handle such complex situations but also minimizes redundancy. It helps in minimizing redundancy by reducing the number of relationship sets in the ER diagram. To understand it better, let's see an example.

Suppose, we have the following entity sets:

Student ( sid, sname, email )
Project ( pid, pname, topic )
Teacher ( tid, tname )

 

Assumptions:

  • A student may work on multiple projects.
  • A project may involve multiple students.
  • A teacher guides multiple students on various projects.

 

In our database, we should maintain the following information:

  • Which student works on which project?
  • Which teacher guides which student on which project? Note that it's not only 'which student or which project'. It is 'which teacher guides which student on which project?'.

 

Let's represent these entity sets and relationship sets in ER Diagram. For simplicity, the attributes have not been included.

1. ER Model without aggregation

 ER_Diagram_Without_Aggregration

Without aggregation, we will end up with the following relationship sets:

Student works_on Project.
Student is guided_by Teacher.
Project is guided_by Teacher.

     

    2. Relational Model without Aggregation

     

    Entities in the Relational Model:

    Student

    sid sname email

     

    Project

    pid pname topic

     

    Teacher

    tid tname


    Relationship sets in the Relational Model:

    Student_works_on_Project

    sid pid

     

    Student_guided_by_Teacher

    sid tid

     

    Project_guided_by_Teacher

    pid tid

     

     

    3. ER Model with aggregation

     ER_Diagram_With_Aggregration

    With aggregation, we will have the following relationship sets:

    1. 'Student' works_on 'Project'.
    2. 'Student_works_on_Project' (aggregated relationship) is guided_by 'Teacher'.

     

    4. Relational Model with Aggregation

     

    Entities in the Relational Model:

    Same as before

     

    Relationship  sets in the Relational Model:

    With aggregation, the previous relations have been reduced to a single relation for the relationship.

     

    Student_Project_guided_by_Teacher

    sid pid tid

     

    Here, we can see clearly that the number of relationship sets has been reduced from three to two. Also, the relations have been reduced from three to one. This is how 'aggregation' minimizes redundancy in both ER Model and Relational Model.

     

     


     

    Back to top

    2. Generalization

    In Generalization, several entity sets sharing common attributes are generalized into superclasses or higher-level entities. Generalization helps in minimizing redundancy in ER diagrams and Relational models. Whenever there are entity instances that belong to more than one entity set, with the help of generalization and specialization, we can save the common details of those entities in a single relation instead of multiple. To understand this more clearly, let's see it in action.

    For example, consider the following entities and their attributes:

    Actor ( ID, name, country, no_of_awards, first_movie )
    Musician ( ID, name, country, no_of_awards, musical_instrument )

     

    1. ER Model without Generalization

    Without-Generalization (1).jpg

    Here, the common attributes (ID, name, country, no_of_awards) are being repeated in both the relations.

    2. Relational Model without Generalization

     

    Actor

    ID name country no_of_awards first_movie
    1 Khagendra Nepal 5 Sirumarani
    2 Atif Aslam Pakistan 10 Bol
    3 Ali Zafar Pakistan 6 Tere Bin Laden
    4 Al Pacino America 50 Me, Natalie

     

    Musician

    ID name country no_of_awards musical_instrument
    1 Sugam Pokhrel Nepal 10 Guitar
    2 Atif Aslam Pakistan 10 Guitar
    3 Ali Zafar Pakistan 6 Keyboard
    4 Zakir Hussain India 80 Tabla

    Here, we can see that the information of  Ali Zafar and Atif Aslam is being repeated in two relations/tables. This is because they are both actors and musicians. As there are common attributes between the entity sets 'Actor' and 'Musician', let's generalize them into a single entity set Artist.


    The common attributes will be stored in the new schema 'Artist'.

     

    Artist( ID, name, country, no_of_awards)
    Actor( ID, first_movie )
    Musician( ID, musical_instrument )

     

    3. ER Model with Generalization

    With-Generalization

    Here, the common attributes (ID, name, country, no_of_awards) have been shifted to the superclass entity-set Artist. Hence, those attributes appear only once instead of being repeated in Actor and Musician. Thus, it can be said that, by using Generalization, we have minimized the redundancy in the ER diagram.

    4. Relational Model with Generalization

    Based on the new ER diagram, we can create the following relations.

    Artist

    ID name country no_of_awards
    1 Khagendra Nepal 5
    2 Atif Aslam Pakistan 10
    3 Ali Zafar Pakistan 6
    4 Al Pacino America 50
    5 Sugam Pokhrel Nepal 10
    6 Zakir Hussain India 80

     

    Actor

    ID first_movie
    1 Sirumarani
    2 Bol
    3 Tere Bin Laden
    4 Me, Natalie

     

    Musician

    ID musical_instrument
    5 Guitar
    2 Guitar
    3 Keyboard
    6 Tabla

     

    Here, only the primary key ID is being repeated in the tables 'Actor' and 'Musician'. The value for other attributes(name, country, no_of_awards) are stored only once. Hence we can conclude that if generalization is used properly in ER diagrams, it helps in minimizing the redundancy in Relational Model too.

    Back to top
    Asked in Year
    2019
    Course
    BBM
    University
    TU