Database Normalization Explained Simply
Database normalization organizes tables to reduce redundancy and improve data integrity. Here’s what you actually need to know.
Why Normalize?
- Eliminates duplicate data
- Prevents update anomalies (update one place, not many)
- Prevents delete anomalies (deleting one fact doesn’t lose another)
- Ensures data consistency
First Normal Form (1NF)
Rule: Each cell contains a single value. No lists or arrays.
-- Violates 1NF (multiple phone numbers in one cell)
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(100),
phones VARCHAR(200) -- "555-0101,555-0102,555-0103"
);
-- 1NF compliant
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE customer_phones (
id INT PRIMARY KEY,
customer_id INT REFERENCES customers(id),
phone VARCHAR(20)
);Second Normal Form (2NF)
Rule: Must be in 1NF, and every non-key column must depend on the entire primary key (not just part of it).
Only relevant for tables with composite primary keys.
-- Violates 2NF (course_name depends only on course_id, not on student_id)
CREATE TABLE enrollments (
student_id INT,
course_id INT,
course_name VARCHAR(100), -- depends only on course_id
enrollment_date DATE,
PRIMARY KEY (student_id, course_id)
);
-- 2NF compliant
CREATE TABLE courses (
id INT PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE enrollments (
student_id INT,
course_id INT REFERENCES courses(id),
enrollment_date DATE,
PRIMARY KEY (student_id, course_id)
);Third Normal Form (3NF)
Rule: Must be in 2NF, and no non-key column depends on another non-key column.
-- Violates 3NF (department_name depends on department_id, not on employee_id)
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100),
department_id INT,
department_name VARCHAR(100) -- depends on department_id, not on employee id
);
-- 3NF compliant
CREATE TABLE departments (
id INT PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100),
department_id INT REFERENCES departments(id)
);Quick Reference
| Normal Form | Rule |
|---|---|
| 1NF | Each cell has one value |
| 2NF | All columns depend on the whole primary key |
| 3NF | Columns depend only on the primary key |
When to Denormalize
Normalization isn’t always the answer. Denormalize when:
- Read performance is critical — JOINs on 10+ tables are slow
- Reporting and analytics — Pre-joined tables simplify complex queries
- Caching — Store computed values to avoid expensive calculations
- High-traffic reads — Fewer JOINs = faster page loads
Real applications often use normalized schemas for writes and denormalized views/reporting tables for reads.
Related: Check our SQL cheat sheet and PostgreSQL vs MySQL guide.