Database Indexing Explained: How Indexes Speed Up Queries
Database indexes are like book indexes — they help the database find rows without scanning the entire table.
Without an Index (Sequential Scan)
SELECT * FROM users WHERE email = 'alice@example.com';Without an index, PostgreSQL reads every row in the users table to find the matching email. For a table with 1 million rows, that’s 1 million comparisons. This is called a sequential scan or full table scan.
With an Index (Index Scan)
CREATE INDEX idx_users_email ON users (email);
SELECT * FROM users WHERE email = 'alice@example.com';With an index, the database uses a B-tree data structure to find the row in milliseconds:
- Level 1: Check if ‘alice@example.com’ is in the top, middle, or bottom third
- Level 2: Narrow to one-ninth
- Continue until found
A B-tree index on 1 million rows finds any value in about 20 comparisons (compared to 1 million without).
Index Types
| Type | Best For |
|---|---|
| B-tree (default) | Equality and range queries (=, <, >, BETWEEN) |
| Hash | Equality queries only (=) |
| GIN | Arrays, JSONB, full-text search |
| GiST | Geospatial data, full-text search |
| BRIN | Large tables with natural ordering (time-series) |
Creating Indexes
-- Single column
CREATE INDEX idx_users_email ON users (email);
-- Unique index (also enforces uniqueness)
CREATE UNIQUE INDEX idx_users_email ON users (email);
-- Multi-column
CREATE INDEX idx_users_name ON users (last_name, first_name);
-- Partial index
CREATE INDEX idx_active_users ON users (email) WHERE active = true;
-- Index on expression
CREATE INDEX idx_users_lower_email ON users (LOWER(email));When to Index
Good candidates for indexing:
-- Columns used in WHERE clauses
SELECT * FROM orders WHERE user_id = 123;
-- Columns used in JOIN conditions
SELECT * FROM orders o JOIN users u ON o.user_id = u.id;
-- Columns used in ORDER BY
SELECT * FROM users ORDER BY created_at DESC;
-- Columns used in frequent UPDATE/DELETE WHERE clauses
DELETE FROM sessions WHERE expires_at < NOW();When NOT to Index
- Small tables (< 1000 rows) — a sequential scan is faster
- Low cardinality columns —
gender,status(few distinct values) - Columns rarely used in WHERE — the index won’t help
- Heavy write workloads — indexes slow down INSERT/UPDATE/DELETE
Multi-Column Indexes
CREATE INDEX idx_users_name ON users (last_name, first_name);This index helps queries that filter by:
WHERE last_name = 'Smith'— ✓ uses the indexWHERE last_name = 'Smith' AND first_name = 'John'— ✓ uses the indexWHERE first_name = 'John'— ✗ does NOT use the index (skipped the first column)
Order matters. Put the most selective column first.
EXPLAIN ANALYZE
Check if your index is being used:
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'alice@example.com';Output shows whether it’s an Index Scan, Bitmap Heap Scan, or Seq Scan:
Index Scan using idx_users_email on users (cost=0.42..8.44 rows=1 width=42)
Index Cond: ((email)::text = 'alice@example.com'::text)Index Maintenance
-- Rebuild an index
REINDEX INDEX idx_users_email;
-- List all indexes for a table
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'users';
-- Remove unused index
DROP INDEX IF EXISTS idx_unused;The Trade-off
Index benefit: Fast reads (1000x faster for large tables)
Index cost: Slow writes + disk space
No Index With Index
SELECT * 1 ms 1 ms (small table)
SELECT * 1000 ms 2 ms (large table)
INSERT 1 ms 2 ms
UPDATE 2 ms 5 ms
Disk space 100 MB 110 MBRelated: Learn SQL JOINs and normalization.