Database Indexing Explained: How Indexes Speed Up Queries

Database indexes are like book indexes — they help the database find rows without scanning the entire table.

Without an Index (Sequential Scan)

SELECT * FROM users WHERE email = 'alice@example.com';

Without an index, PostgreSQL reads every row in the users table to find the matching email. For a table with 1 million rows, that’s 1 million comparisons. This is called a sequential scan or full table scan.

With an Index (Index Scan)

CREATE INDEX idx_users_email ON users (email);

SELECT * FROM users WHERE email = 'alice@example.com';

With an index, the database uses a B-tree data structure to find the row in milliseconds:

Level 1: Check if ‘alice@example.com’ is in the top, middle, or bottom third
Level 2: Narrow to one-ninth
Continue until found

A B-tree index on 1 million rows finds any value in about 20 comparisons (compared to 1 million without).

Index Types

Type	Best For
B-tree (default)	Equality and range queries (`=`, `<`, `>`, `BETWEEN`)
Hash	Equality queries only (`=`)
GIN	Arrays, JSONB, full-text search
GiST	Geospatial data, full-text search
BRIN	Large tables with natural ordering (time-series)

Creating Indexes

-- Single column
CREATE INDEX idx_users_email ON users (email);

-- Unique index (also enforces uniqueness)
CREATE UNIQUE INDEX idx_users_email ON users (email);

-- Multi-column
CREATE INDEX idx_users_name ON users (last_name, first_name);

-- Partial index
CREATE INDEX idx_active_users ON users (email) WHERE active = true;

-- Index on expression
CREATE INDEX idx_users_lower_email ON users (LOWER(email));

When to Index

Good candidates for indexing:

-- Columns used in WHERE clauses
SELECT * FROM orders WHERE user_id = 123;

-- Columns used in JOIN conditions
SELECT * FROM orders o JOIN users u ON o.user_id = u.id;

-- Columns used in ORDER BY
SELECT * FROM users ORDER BY created_at DESC;

-- Columns used in frequent UPDATE/DELETE WHERE clauses
DELETE FROM sessions WHERE expires_at < NOW();

When NOT to Index

Small tables (< 1000 rows) — a sequential scan is faster
Low cardinality columns — gender, status (few distinct values)
Columns rarely used in WHERE — the index won’t help
Heavy write workloads — indexes slow down INSERT/UPDATE/DELETE

Multi-Column Indexes

CREATE INDEX idx_users_name ON users (last_name, first_name);

This index helps queries that filter by:

WHERE last_name = 'Smith' — ✓ uses the index
WHERE last_name = 'Smith' AND first_name = 'John' — ✓ uses the index
WHERE first_name = 'John' — ✗ does NOT use the index (skipped the first column)

Order matters. Put the most selective column first.

EXPLAIN ANALYZE

Check if your index is being used:

EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'alice@example.com';

Output shows whether it’s an Index Scan, Bitmap Heap Scan, or Seq Scan:

Index Scan using idx_users_email on users  (cost=0.42..8.44 rows=1 width=42)
  Index Cond: ((email)::text = 'alice@example.com'::text)

Index Maintenance

-- Rebuild an index
REINDEX INDEX idx_users_email;

-- List all indexes for a table
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'users';

-- Remove unused index
DROP INDEX IF EXISTS idx_unused;

The Trade-off

Index benefit:  Fast reads  (1000x faster for large tables)
Index cost:     Slow writes + disk space

              No Index    With Index
SELECT *        1 ms         1 ms  (small table)
SELECT *      1000 ms        2 ms  (large table)
INSERT           1 ms        2 ms
UPDATE           2 ms        5 ms
Disk space       100 MB     110 MB

Related: Learn SQL JOINs and normalization.