Which topics does this article cover?

It highlights SQL, database, postgresql, performance, indexing.

SQL Mistakes That Kill Your Database (And How to Fix Them)

Q: What is "SQL Mistakes That Kill Your Database (And How to Fix Them)" about?

SQL performance problems rarely come from the database itself — they come from inefficient queries. This guide covers the most common mistakes that slow production systems down, including missing indexes, N+1 queries, full table scans, bad joins, overfetching, and how to debug and optimize them properly.

Why Queries That Work Locally Break in Production

Development databases are small. You have maybe a few thousand rows, everything fits in memory, and even a terrible query runs in milliseconds. This creates a false sense of security.

Production databases are large. Queries that scan the whole table, retrieve 10x more data than you use, or trigger hundreds of additional queries per request will work fine at small scale and quietly become disaster at large scale. By the time you notice, users are already experiencing timeouts.

The good news: most database performance problems come from a small set of known mistakes. Learn to recognize them and you can fix 90% of slow queries.

Mistake 1: Not Using EXPLAIN ANALYZE

Before anything else — you can't optimize what you haven't measured. EXPLAIN ANALYZE is the PostgreSQL command that shows you how the database actually executes your query: what operations it performs, in what order, and how long each one takes.

EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 42
ORDER BY created_at DESC;

Sample output:

Sort  (cost=1250.34..1252.84 rows=1000) (actual time=89.432..89.891 rows=1000)
  Sort Key: created_at DESC
  ->  Seq Scan on orders  (cost=0.00..1200.00 rows=1000) (actual time=0.021..85.341 rows=1000)
        Filter: (user_id = 42)
        Rows Removed by Filter: 999000
Planning Time: 0.5 ms
Execution Time: 90.1 ms

The two things to look for immediately:

Seq Scan (Sequential Scan) — The database is reading every row in the table to find the ones it needs. On a large table, this is almost always a problem. You want to see Index Scan instead.

Rows Removed by Filter — Here, the database read 999,000 rows and discarded them to find 1,000 matching rows. That's 99.9% wasted work. An index on user_id would let the database jump directly to the 1,000 rows it needs.

Run EXPLAIN ANALYZE on every query you're optimizing. It's not optional — it's the only way to know what's actually happening.

Mistake 2: Missing Indexes on Filtered and Joined Columns

An index is a separate data structure that allows the database to find rows without scanning the whole table. Think of it like the index at the back of a book — instead of reading every page to find mentions of "indexes," you jump directly to the relevant pages.

Without an index on user_id:

-- Scans every row in orders, checks if user_id = 42
SELECT * FROM orders WHERE user_id = 42;
-- On 1M rows: ~200ms

With an index:

CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Jumps directly to matching rows via index
SELECT * FROM orders WHERE user_id = 42;
-- On 1M rows: ~1ms

Add indexes on columns you filter by (WHERE), join on (JOIN ... ON), and sort by (ORDER BY). These are the operations that benefit from indexes.

Columns that are good index candidates:

-- Foreign keys are almost always worth indexing
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_comments_post_id ON comments(post_id);

-- Columns used in WHERE clauses frequently
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_users_email ON users(email);

-- Columns you ORDER BY on paginated queries
CREATE INDEX idx_orders_created_at ON orders(created_at DESC);

Composite indexes cover multiple columns and are valuable when you always filter on a specific combination:

-- You always query: WHERE user_id = ? AND status = ?
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Column order matters: this index helps (user_id) and (user_id, status)
-- but NOT (status) alone

One important caveat: indexes aren't free. Each index slows down INSERT, UPDATE, and DELETE operations because the index must be updated too. Don't index every column — index the columns that are actually used in performance-critical queries.

Mistake 3: The N+1 Query Problem

N+1 is one of the most common and destructive database performance problems, and it's especially prevalent when using ORMs that hide SQL from you.

Here's what it looks like:

// Fetch all posts — 1 query
const posts = await db.query("SELECT * FROM posts LIMIT 20");

// For each post, fetch its author — 20 more queries!
for (const post of posts) {
  post.author = await db.query(
    "SELECT * FROM users WHERE id = $1", [post.userId]
  );
}
// Total: 21 queries. On 100 posts: 101 queries. On 1000 posts: 1001 queries.

This scales linearly. Fetch 1,000 posts and you make 1,001 database round-trips. Each round-trip has network latency, connection overhead, and query parsing costs. What should be a fast page load becomes a 5-second timeout.

The fix is to fetch everything you need in a single query using a JOIN, or in two queries using an IN clause:

-- Option 1: JOIN — get posts and authors in one query
SELECT
  posts.id,
  posts.title,
  posts.body,
  users.id AS author_id,
  users.name AS author_name
FROM posts
JOIN users ON users.id = posts.user_id
LIMIT 20;

// Option 2: Two queries — fetch posts, then fetch all authors at once
const posts = await db.query("SELECT * FROM posts LIMIT 20");

const authorIds = [...new Set(posts.map(p => p.userId))];
const authors = await db.query(
  "SELECT * FROM users WHERE id = ANY($1)", [authorIds]
);

// Build a lookup map
const authorMap = Object.fromEntries(authors.map(a => [a.id, a]));

// Attach authors without any additional queries
for (const post of posts) {
  post.author = authorMap[post.userId];
}
// Total: 2 queries, regardless of how many posts

In ORMs, N+1 often appears when you access a relationship that wasn't explicitly loaded. Most ORMs have an "eager loading" mechanism to prevent this — in Prisma it's include, in Sequelize it's include, in ActiveRecord it's includes. Always check what queries your ORM is generating, especially in loops.

Mistake 4: `SELECT *` — Fetching Columns You Don't Use

SELECT * retrieves every column in a table. If your users table has 30 columns but you only need id, name, and email, you're transferring 27 columns of unnecessary data across the network, through your ORM, and into memory — for every row, on every query.

On tables with large text columns, JSON blobs, or binary data, this becomes very expensive very quickly.

-- 😐 Fetches all 30 columns of every user
SELECT * FROM users WHERE active = true;

-- ✅ Fetches only what you need
SELECT id, name, email FROM users WHERE active = true;

This matters even more for joined queries across multiple tables. SELECT * on a five-table join can return 150+ columns when you need 10.

Get specific about what you select. Your queries become self-documenting (the columns tell you what the query is used for), your data transfer is minimal, and adding a column to a table later won't silently break anything that was relying on a specific column order.

Mistake 5: Filtering on a Function-Wrapped Column

This one is subtle and easy to miss: when you wrap a column in a function inside a WHERE clause, PostgreSQL can't use the index on that column.

-- 😐 Can't use index on created_at — full table scan
SELECT * FROM orders WHERE DATE(created_at) = '2024-03-15';
SELECT * FROM orders WHERE LOWER(email) = 'jane@example.com';
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2024;

The fix is to rewrite the condition so the column is unmodified:

-- ✅ Can use index on created_at
SELECT * FROM orders
WHERE created_at >= '2024-03-15'
  AND created_at < '2024-03-16';

-- ✅ For case-insensitive email search — use a functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'jane@example.com';

Functional indexes let you index the result of an expression. They're the right solution when you genuinely need to filter on a transformed value.

Mistake 6: Using OFFSET for Pagination at Scale

Offset-based pagination is the standard approach and it works fine for small datasets:

-- Page 1
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 0;
-- Page 2
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 20;
-- Page 100
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 1980;

The problem: for page 100, the database must scan and discard the first 1,980 rows before returning 20. For page 1,000, it discards 19,980 rows. The deeper you paginate, the slower it gets — OFFSET doesn't skip work, it does the work and throws it away.

Cursor-based pagination avoids this entirely:

-- First page — get the last ID from the result
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page — pass the created_at and id of the last row from previous page
SELECT id, title, created_at
FROM posts
WHERE (created_at, id) < ('2024-03-15 10:30:00', 847)
ORDER BY created_at DESC, id DESC
LIMIT 20;

With an index on (created_at DESC, id DESC), this query is fast regardless of how many pages deep you are. The database uses the index to jump directly to the right position. Page 1 and page 10,000 take the same amount of time.

The tradeoff: cursor-based pagination doesn't support jumping to an arbitrary page. You can only go forward (or backward with a reversed cursor). For most user-facing feeds and lists, this is fine — users scroll sequentially. For admin tables where jumping to page 47 matters, offset pagination is acceptable if your dataset stays manageable.

Mistake 7: Missing Partial Indexes

A regular index covers every row in a table. A partial index covers only the rows that match a condition — making it smaller, faster to scan, and cheaper to maintain.

If 95% of your orders are status = 'completed' and you only ever query for status = 'pending', a full index on status is doing a lot of work for rows you never look up:

-- 😐 Full index — covers all orders regardless of status
CREATE INDEX idx_orders_status ON orders(status);

-- ✅ Partial index — covers only pending orders
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';

The partial index is dramatically smaller (covering maybe 5% of rows instead of 100%), fits more easily in memory, and makes queries on pending orders faster.

Other good candidates for partial indexes:

-- Index only active users
CREATE INDEX idx_users_active_email ON users(email) WHERE active = true;

-- Index only unread notifications
CREATE INDEX idx_notifications_unread ON notifications(user_id, created_at)
WHERE read_at IS NULL;

-- Index only recent records
CREATE INDEX idx_events_recent ON events(created_at)
WHERE created_at > NOW() - INTERVAL '30 days';

Mistake 8: Not Using Connection Pooling

Every database connection has overhead — memory on the server, authentication handshakes, TCP setup. If your application opens a new connection for every request and closes it when the request finishes, you're paying that overhead constantly.

With 100 concurrent requests, you're trying to open 100 database connections simultaneously. PostgreSQL has a default connection limit of 100. You hit the limit, requests start queuing, and everything slows down.

Connection pooling solves this by maintaining a pool of open connections and reusing them across requests:

// ✅ With pg-pool in Node.js
import pg from 'pg';

const pool = new pg.Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,          // Maximum connections in the pool
  idleTimeoutMillis: 30000,  // Close idle connections after 30s
  connectionTimeoutMillis: 2000,  // Error if can't acquire in 2s
});

// Acquires a connection from pool, releases back when done
const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);

For larger deployments, run PgBouncer as a connection pooler in front of PostgreSQL. It handles thousands of application connections and multiplexes them onto a small number of actual PostgreSQL connections — the database sees 10 connections while your application thinks it has 1,000 available.

Quick Reference: Before You Write a Query

Run through this mental checklist for any query touching large tables:

Is there an index on every column in WHERE, JOIN ON, and ORDER BY?
Am I selecting only the columns I actually use (no SELECT *)?
If I'm in a loop, am I making one query per iteration? (N+1 check)
Are any of my WHERE conditions wrapping a column in a function?
If paginating, does OFFSET scale to the page depths users might reach?
Run EXPLAIN ANALYZE — do I see Seq Scan on large tables?

Summary

Most database performance problems are boring and fixable. Missing indexes, N+1 queries, SELECT *, and function-wrapped columns in WHERE clauses account for the vast majority of slow queries. EXPLAIN ANALYZE is your best friend — it shows you exactly what the database is doing and where the time is being spent.

The habit to build: before shipping any feature that touches the database, run EXPLAIN ANALYZE on every query it uses. On your dev data it might look fine. With production row counts estimated using SET enable_seqscan = off or a staging environment, you'll catch problems before your users do.

Why Queries That Work Locally Break in Production

Development databases are small. You have maybe a few thousand rows, everything fits in memory, and even a terrible query runs in milliseconds. This creates a false sense of security.

The good news: most database performance problems come from a small set of known mistakes. Learn to recognize them and you can fix 90% of slow queries.

Mistake 1: Not Using EXPLAIN ANALYZE

EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 42
ORDER BY created_at DESC;

Sample output:

Sort  (cost=1250.34..1252.84 rows=1000) (actual time=89.432..89.891 rows=1000)
  Sort Key: created_at DESC
  ->  Seq Scan on orders  (cost=0.00..1200.00 rows=1000) (actual time=0.021..85.341 rows=1000)
        Filter: (user_id = 42)
        Rows Removed by Filter: 999000
Planning Time: 0.5 ms
Execution Time: 90.1 ms

The two things to look for immediately:

Seq Scan (Sequential Scan) — The database is reading every row in the table to find the ones it needs. On a large table, this is almost always a problem. You want to see Index Scan instead.

Run EXPLAIN ANALYZE on every query you're optimizing. It's not optional — it's the only way to know what's actually happening.

Mistake 2: Missing Indexes on Filtered and Joined Columns

Without an index on user_id:

-- Scans every row in orders, checks if user_id = 42
SELECT * FROM orders WHERE user_id = 42;
-- On 1M rows: ~200ms

With an index:

CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Jumps directly to matching rows via index
SELECT * FROM orders WHERE user_id = 42;
-- On 1M rows: ~1ms

Add indexes on columns you filter by (WHERE), join on (JOIN ... ON), and sort by (ORDER BY). These are the operations that benefit from indexes.

Columns that are good index candidates:

-- Foreign keys are almost always worth indexing
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_comments_post_id ON comments(post_id);

-- Columns used in WHERE clauses frequently
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_users_email ON users(email);

-- Columns you ORDER BY on paginated queries
CREATE INDEX idx_orders_created_at ON orders(created_at DESC);

Composite indexes cover multiple columns and are valuable when you always filter on a specific combination:

-- You always query: WHERE user_id = ? AND status = ?
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Column order matters: this index helps (user_id) and (user_id, status)
-- but NOT (status) alone

Mistake 3: The N+1 Query Problem

N+1 is one of the most common and destructive database performance problems, and it's especially prevalent when using ORMs that hide SQL from you.

Here's what it looks like:

// Fetch all posts — 1 query
const posts = await db.query("SELECT * FROM posts LIMIT 20");

// For each post, fetch its author — 20 more queries!
for (const post of posts) {
  post.author = await db.query(
    "SELECT * FROM users WHERE id = $1", [post.userId]
  );
}
// Total: 21 queries. On 100 posts: 101 queries. On 1000 posts: 1001 queries.

The fix is to fetch everything you need in a single query using a JOIN, or in two queries using an IN clause:

-- Option 1: JOIN — get posts and authors in one query
SELECT
  posts.id,
  posts.title,
  posts.body,
  users.id AS author_id,
  users.name AS author_name
FROM posts
JOIN users ON users.id = posts.user_id
LIMIT 20;

// Option 2: Two queries — fetch posts, then fetch all authors at once
const posts = await db.query("SELECT * FROM posts LIMIT 20");

const authorIds = [...new Set(posts.map(p => p.userId))];
const authors = await db.query(
  "SELECT * FROM users WHERE id = ANY($1)", [authorIds]
);

// Build a lookup map
const authorMap = Object.fromEntries(authors.map(a => [a.id, a]));

// Attach authors without any additional queries
for (const post of posts) {
  post.author = authorMap[post.userId];
}
// Total: 2 queries, regardless of how many posts

Mistake 4: `SELECT *` — Fetching Columns You Don't Use

On tables with large text columns, JSON blobs, or binary data, this becomes very expensive very quickly.

-- 😐 Fetches all 30 columns of every user
SELECT * FROM users WHERE active = true;

-- ✅ Fetches only what you need
SELECT id, name, email FROM users WHERE active = true;

This matters even more for joined queries across multiple tables. SELECT * on a five-table join can return 150+ columns when you need 10.

Mistake 5: Filtering on a Function-Wrapped Column

This one is subtle and easy to miss: when you wrap a column in a function inside a WHERE clause, PostgreSQL can't use the index on that column.

-- 😐 Can't use index on created_at — full table scan
SELECT * FROM orders WHERE DATE(created_at) = '2024-03-15';
SELECT * FROM orders WHERE LOWER(email) = 'jane@example.com';
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2024;

The fix is to rewrite the condition so the column is unmodified:

-- ✅ Can use index on created_at
SELECT * FROM orders
WHERE created_at >= '2024-03-15'
  AND created_at < '2024-03-16';

-- ✅ For case-insensitive email search — use a functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'jane@example.com';

Functional indexes let you index the result of an expression. They're the right solution when you genuinely need to filter on a transformed value.

Mistake 6: Using OFFSET for Pagination at Scale

Offset-based pagination is the standard approach and it works fine for small datasets:

-- Page 1
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 0;
-- Page 2
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 20;
-- Page 100
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 1980;

Cursor-based pagination avoids this entirely:

-- First page — get the last ID from the result
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page — pass the created_at and id of the last row from previous page
SELECT id, title, created_at
FROM posts
WHERE (created_at, id) < ('2024-03-15 10:30:00', 847)
ORDER BY created_at DESC, id DESC
LIMIT 20;

Mistake 7: Missing Partial Indexes

A regular index covers every row in a table. A partial index covers only the rows that match a condition — making it smaller, faster to scan, and cheaper to maintain.

If 95% of your orders are status = 'completed' and you only ever query for status = 'pending', a full index on status is doing a lot of work for rows you never look up:

-- 😐 Full index — covers all orders regardless of status
CREATE INDEX idx_orders_status ON orders(status);

-- ✅ Partial index — covers only pending orders
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';

The partial index is dramatically smaller (covering maybe 5% of rows instead of 100%), fits more easily in memory, and makes queries on pending orders faster.

Other good candidates for partial indexes:

-- Index only active users
CREATE INDEX idx_users_active_email ON users(email) WHERE active = true;

-- Index only unread notifications
CREATE INDEX idx_notifications_unread ON notifications(user_id, created_at)
WHERE read_at IS NULL;

-- Index only recent records
CREATE INDEX idx_events_recent ON events(created_at)
WHERE created_at > NOW() - INTERVAL '30 days';

Mistake 8: Not Using Connection Pooling

Connection pooling solves this by maintaining a pool of open connections and reusing them across requests:

// ✅ With pg-pool in Node.js
import pg from 'pg';

const pool = new pg.Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,          // Maximum connections in the pool
  idleTimeoutMillis: 30000,  // Close idle connections after 30s
  connectionTimeoutMillis: 2000,  // Error if can't acquire in 2s
});

// Acquires a connection from pool, releases back when done
const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);

Quick Reference: Before You Write a Query

Run through this mental checklist for any query touching large tables:

Is there an index on every column in WHERE, JOIN ON, and ORDER BY?
Am I selecting only the columns I actually use (no SELECT *)?
If I'm in a loop, am I making one query per iteration? (N+1 check)
Are any of my WHERE conditions wrapping a column in a function?
If paginating, does OFFSET scale to the page depths users might reach?
Run EXPLAIN ANALYZE — do I see Seq Scan on large tables?

SQL Mistakes That Kill Your Database (And How to Fix Them)

Why Queries That Work Locally Break in Production

Mistake 1: Not Using EXPLAIN ANALYZE

Mistake 2: Missing Indexes on Filtered and Joined Columns

Mistake 3: The N+1 Query Problem

Mistake 4: `SELECT *` — Fetching Columns You Don't Use

Mistake 5: Filtering on a Function-Wrapped Column

Mistake 7: Missing Partial Indexes

Mistake 8: Not Using Connection Pooling

Quick Reference: Before You Write a Query

Summary

ZyVOP

Comments (0)

SQL Mistakes That Kill Your Database (And How to Fix Them)

Why Queries That Work Locally Break in Production

Mistake 1: Not Using EXPLAIN ANALYZE

Mistake 2: Missing Indexes on Filtered and Joined Columns

Mistake 3: The N+1 Query Problem

Mistake 4: `SELECT *` — Fetching Columns You Don't Use

Mistake 5: Filtering on a Function-Wrapped Column

Mistake 7: Missing Partial Indexes

Mistake 8: Not Using Connection Pooling

Quick Reference: Before You Write a Query

Summary

ZyVOP

Comments (0)

Related Posts

Four Ways a Refresh Token Request Fails — Only One Means Trouble

Background Jobs in NestJS with BullMQ: A Complete Walkthrough

Implementing Two-Factor Authentication (TOTP) in NestJS: With Full Source Code

IDOR Vulnerabilities in NestJS: How to Build Ownership Guards That Actually Protect Your Data

The Evolution of TypeScript Compilers: SWC vs TSC

Popular Tags

SQL Mistakes That Kill Your Database (And How to Fix Them)

Why Queries That Work Locally Break in Production

Mistake 1: Not Using EXPLAIN ANALYZE

Mistake 2: Missing Indexes on Filtered and Joined Columns

Mistake 3: The N+1 Query Problem

Mistake 4: SELECT * — Fetching Columns You Don't Use

Mistake 5: Filtering on a Function-Wrapped Column

Mistake 6: Using OFFSET for Pagination at Scale

Mistake 7: Missing Partial Indexes

Mistake 8: Not Using Connection Pooling

Quick Reference: Before You Write a Query

Summary

ZyVOP

Comments (0)

SQL Mistakes That Kill Your Database (And How to Fix Them)

Why Queries That Work Locally Break in Production

Mistake 1: Not Using EXPLAIN ANALYZE

Mistake 2: Missing Indexes on Filtered and Joined Columns

Mistake 3: The N+1 Query Problem

Mistake 4: SELECT * — Fetching Columns You Don't Use

Mistake 5: Filtering on a Function-Wrapped Column

Mistake 6: Using OFFSET for Pagination at Scale

Mistake 7: Missing Partial Indexes

Mistake 8: Not Using Connection Pooling

Quick Reference: Before You Write a Query

Summary

ZyVOP

Comments (0)

Related Posts

Four Ways a Refresh Token Request Fails — Only One Means Trouble

Background Jobs in NestJS with BullMQ: A Complete Walkthrough

Implementing Two-Factor Authentication (TOTP) in NestJS: With Full Source Code

IDOR Vulnerabilities in NestJS: How to Build Ownership Guards That Actually Protect Your Data

The Evolution of TypeScript Compilers: SWC vs TSC

Popular Tags

Mistake 4: `SELECT *` — Fetching Columns You Don't Use

Mistake 4: `SELECT *` — Fetching Columns You Don't Use