Skip to content

Database Normalization Explained Simply

Database normalization organizes tables to reduce redundancy and improve data integrity. Here’s what you actually need to know.

Why Normalize?

  • Eliminates duplicate data
  • Prevents update anomalies (update one place, not many)
  • Prevents delete anomalies (deleting one fact doesn’t lose another)
  • Ensures data consistency

First Normal Form (1NF)

Rule: Each cell contains a single value. No lists or arrays.

-- Violates 1NF (multiple phone numbers in one cell)
CREATE TABLE customers (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  phones VARCHAR(200)  -- "555-0101,555-0102,555-0103"
);

-- 1NF compliant
CREATE TABLE customers (
  id INT PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE customer_phones (
  id INT PRIMARY KEY,
  customer_id INT REFERENCES customers(id),
  phone VARCHAR(20)
);

Second Normal Form (2NF)

Rule: Must be in 1NF, and every non-key column must depend on the entire primary key (not just part of it).

Only relevant for tables with composite primary keys.

-- Violates 2NF (course_name depends only on course_id, not on student_id)
CREATE TABLE enrollments (
  student_id INT,
  course_id INT,
  course_name VARCHAR(100),  -- depends only on course_id
  enrollment_date DATE,
  PRIMARY KEY (student_id, course_id)
);

-- 2NF compliant
CREATE TABLE courses (
  id INT PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE enrollments (
  student_id INT,
  course_id INT REFERENCES courses(id),
  enrollment_date DATE,
  PRIMARY KEY (student_id, course_id)
);

Third Normal Form (3NF)

Rule: Must be in 2NF, and no non-key column depends on another non-key column.

-- Violates 3NF (department_name depends on department_id, not on employee_id)
CREATE TABLE employees (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  department_id INT,
  department_name VARCHAR(100)  -- depends on department_id, not on employee id
);

-- 3NF compliant
CREATE TABLE departments (
  id INT PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE employees (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  department_id INT REFERENCES departments(id)
);

Quick Reference

Normal FormRule
1NFEach cell has one value
2NFAll columns depend on the whole primary key
3NFColumns depend only on the primary key

When to Denormalize

Normalization isn’t always the answer. Denormalize when:

  • Read performance is critical — JOINs on 10+ tables are slow
  • Reporting and analytics — Pre-joined tables simplify complex queries
  • Caching — Store computed values to avoid expensive calculations
  • High-traffic reads — Fewer JOINs = faster page loads

Real applications often use normalized schemas for writes and denormalized views/reporting tables for reads.


Related: Check our SQL cheat sheet and PostgreSQL vs MySQL guide.