Database Normalization in SQL
As a database developer, we might often come across terms like normalization and denormalization of a database. Database normalization is a technique that helps to efficiently organize data in a given database. Essentially, it is a systematic approach to decompose a larger table into smaller tables that would help to get rid of data redundancy and other inserts/ update anomalies.
According to the definition in Wikipedia -
“Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by Edgar F. Codd as part of his relational model.
So why is this required? without Normalization in SQL, we may face many issues such as
- Insertion anomaly: It occurs when we cannot insert data to the table without the presence of another attribute
- Update anomaly: It is a data inconsistency that results from data redundancy and a partial update of data.
- Deletion Anomaly: It occurs when certain attributes are lost because of the deletion of other attributes.
It usually divides a large table into smaller ones, so it is more efficient. In 1970 the First Normal Form was defined by Edgar F Codd and eventually, other Normal Forms were defined.
Normalization entails organizing the columns (attributes) and tables (relations) of a database to ensure that their dependencies are properly enforced by database integrity constraints. It is accomplished by applying some formal rules either by a process of synthesis (creating a new database design) or decomposition (improving an existing database design).”
Database Normalization
Database Normalization is a set of rules that are applied to a database, such that the schema of the database ensures that all the rules are being followed. These rules are also known as Normal Forms and are widely used while designing database solutions.
The database normalization process can be divided into following types:
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form or Fourth Normal Form (BCNF of 4NF)
- Fifth Normal Form (5NF)
- Sixth Normal Form (6NF)
1. First Normal Form (1NF):
- The First Normal Form is achieved when each of the table cell contains only a single value, and also, each record needs to be unique.
- The first point seems to be clear enough that there can’t be more than one value assigned for each column in a database table.
- The second point is achieved by the usage of Primary Key (A Primary key is the single column of a table that uniquely identifies a database record.)
- A point to note here is that the Primary Key can be composed of more than one key. A Composite Key is the set of columns that can be used in conjunction with each other to uniquely identify a database record.
2. Second Normal Form (2NF):
- A table can be taken to the 2nd Normal Form only if it is fully compliant with the rules laid for being called 1st NF.
- The next rule that comes in to picture is that there should be one and only one column that should act as a Primary Key. The concept of Composite Key doesn’t allow us to a table to be in 2nd NF if it has a Composite Primary Key.
- In order to reduce the dependency of using a Composite Primary Key, the table which has a Composite Primary Key should be disintegrated into 2 different tables and use the dependencies via the newly created tables. Hence comes the concept of Foreign Key.
- A Foreign Key is a column that references the Primary Key of another table to uniquely connect your tables.
3. Third Normal Form (3NF):
- A table can be taken to the 3rd Normal Form only if it is fully compliant with the rules laid for being called 2nd NF.
- The next rule that comes into picture is that there should be no transitive functional dependencies.
- A transitive functional dependency is that scenario when changing a non-key column may affect in changing any other non-key column values.
- Ideally, a table cannot go further than this level in the process of normalization, but for the sake of knowing things, we will go through the others as well.