Full-Text Search in PostgreSQL: Skip Elasticsearch for the First 10 Million Rows
Build fast, typo-tolerant search in PostgreSQL using full-text search, relevance ranking, autocomplete, and fuzzy matching — without Elasticsearch.
Senior Developer

The reflex when someone asks for search is to reach for Elasticsearch or Algolia. For a lot of products, that reflex costs you operational complexity, a monthly bill, and a synchronization problem — keeping Elasticsearch in sync with your Postgres database — that you will debug at the worst possible time.
Postgres has a mature, production-capable full-text search engine built in. It handles most product search requirements up to tens of millions of rows, supports ranking, multilingual stemming, fuzzy matching, and prefix search. If your search volume fits those bounds, this is the right tool.
This guide covers everything you need: GIN indexes, tsvector, tsquery, ranking, partial matches, and the query builder that ties it together.
How Postgres Full-Text Search Works
Postgres represents a searchable document as a tsvector — a sorted list of lexemes (normalized word forms) with their positions. A search query is a tsquery — a boolean expression of lexemes.
-- A tsvector normalizes words and strips stop words
SELECT to_tsvector('english', 'The quick brown foxes are jumping');
-- Result: 'brown':3 'fox':4 'jump':6 'quick':2
-- 'The' and 'are' are stop words, removed
-- 'foxes' → 'fox', 'jumping' → 'jump' (stemming)
-- A tsquery is what you search for
SELECT to_tsquery('english', 'foxes & jumping');
-- Result: 'fox' & 'jump'
-- Match check
SELECT to_tsvector('english', 'The quick brown foxes are jumping')
@@ to_tsquery('english', 'foxes & jumping');
-- Result: trueThe @@ operator is the match operator. The @> and <@ operators check containment. This is the foundation of everything that follows.
Setting Up Your Schema
Option 1: Generated Column (Recommended)
Store the tsvector as a generated column — Postgres maintains it automatically on insert and update.
-- Products table with full-text search
ALTER TABLE products
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(description, '')), 'B') ||
setweight(to_tsvector('english', coalesce(category, '')), 'C')
) STORED;
-- GIN index on the generated column — makes search fast
CREATE INDEX CONCURRENTLY idx_products_search
ON products USING GIN(search_vector);setweight assigns importance to different fields. 'A' is highest weight, 'D' is lowest. When ranking results, name matches rank higher than description matches — which is the right behavior.
Option 2: Trigger-Maintained Column (For Complex Cases)
If your search vector needs data from joined tables or complex logic, use a trigger:
-- Add a regular tsvector column
ALTER TABLE products ADD COLUMN search_vector tsvector;
-- Function to rebuild the vector
CREATE OR REPLACE FUNCTION update_product_search_vector()
RETURNS TRIGGER AS $$
BEGIN
NEW.search_vector :=
setweight(to_tsvector('english', coalesce(NEW.name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(NEW.sku, '')), 'A') ||
setweight(to_tsvector('english', coalesce(NEW.description, '')), 'B') ||
setweight(to_tsvector('english', coalesce(NEW.brand, '')), 'C') ||
setweight(to_tsvector('english', coalesce(NEW.tags::text, '')), 'C');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER products_search_vector_update
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW EXECUTE FUNCTION update_product_search_vector();
-- Backfill existing rows
UPDATE products SET name = name;
CREATE INDEX CONCURRENTLY idx_products_search ON products USING GIN(search_vector);Basic Search Query
// src/services/searchService.ts
export async function searchProducts(params: {
query: string;
tenantId: string;
limit: number;
cursor?: string;
}) {
const { query, tenantId, limit } = params;
// Sanitize and format the query
// websearch_to_tsquery is more forgiving than to_tsquery
// It handles: "exact phrase", word1 OR word2, -excluded
// It does NOT throw on malformed input — safe for user input
const result = await db.query(`
SELECT
id, name, description, price, stock,
ts_rank(search_vector, query) AS rank,
ts_headline(
'english',
description,
query,
'MaxWords=20, MinWords=10, StartSel=<mark>, StopSel=</mark>'
) AS excerpt
FROM
products,
websearch_to_tsquery('english', $1) query
WHERE
tenant_id = $2
AND active = true
AND search_vector @@ query
ORDER BY rank DESC, name ASC
LIMIT $3
`, [query, tenantId, limit + 1]);
return result.rows;
}websearch_to_tsquery is the right function for user input. Unlike to_tsquery, it does not throw on malformed queries — a user typing a single apostrophe will not crash your endpoint.
ts_headline generates a snippet from the matching document with the search terms highlighted. The StartSel and StopSel parameters let you wrap matches in any HTML tags you want.
Prefix Search (Autocomplete)
Standard tsquery requires complete words. For autocomplete — matching as the user types — use prefix matching:
-- Standard: matches 'run' exactly
to_tsquery('english', 'run') -- matches: run, runs, running (stemming)
-- Prefix: matches anything starting with 'run'
to_tsquery('english', 'run:*') -- matches: run, runner, runaway, runningexport async function autocomplete(params: {
query: string;
tenantId: string;
limit?: number;
}) {
const { query, tenantId, limit = 10 } = params;
// Sanitize input — strip special characters that would break tsquery
const sanitized = query
.trim()
.replace(/[^a-zA-Z0-9\s-]/g, '')
.split(/\s+/)
.filter(Boolean)
.join(' & ');
if (!sanitized) return [];
// Add :* prefix to the last word for prefix matching
const tsQuery = sanitized.replace(/(\S+)$/, '$1:*');
const result = await db.query(`
SELECT
id, name, price,
ts_rank(search_vector, to_tsquery('english', $1)) AS rank
FROM products
WHERE
tenant_id = $2
AND active = true
AND search_vector @@ to_tsquery('english', $1)
ORDER BY rank DESC, name ASC
LIMIT $3
`, [tsQuery, tenantId, limit]);
return result.rows;
}Fuzzy Matching with pg_trgm
Full-text search requires the user to spell words correctly. pg_trgm adds trigram-based fuzzy matching — finding words that are "similar" even with typos.
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Trigram index for fuzzy matching on name
CREATE INDEX CONCURRENTLY idx_products_name_trgm
ON products USING GIN(name gin_trgm_ops);// Combine full-text and fuzzy search
export async function searchWithFuzzy(params: {
query: string;
tenantId: string;
limit: number;
}) {
const { query, tenantId, limit } = params;
const result = await db.query(`
SELECT
id, name, description, price,
-- Score: full-text rank + trigram similarity, combined
(
COALESCE(ts_rank(search_vector, websearch_to_tsquery('english', $1)), 0) * 2 +
similarity(name, $1)
) AS combined_score
FROM products
WHERE
tenant_id = $2
AND active = true
AND (
-- Full-text match OR fuzzy name match
search_vector @@ websearch_to_tsquery('english', $1)
OR similarity(name, $1) > 0.3
)
ORDER BY combined_score DESC
LIMIT $3
`, [query, tenantId, limit]);
return result.rows;
}The similarity() function returns a 0–1 score. 0.3 is a reasonable threshold — lower catches more typos but more false positives.
The Complete Search Endpoint
// src/routes/search.ts
import { z } from 'zod';
const SearchSchema = z.object({
q: z.string().min(1).max(200).trim(),
limit: z.coerce.number().int().min(1).max(50).default(20),
cursor: z.string().optional(),
sort: z.enum(['relevance', 'price_asc', 'price_desc', 'newest']).default('relevance'),
fuzzy: z.coerce.boolean().default(false),
});
router.get('/search', authenticate, async (req, res) => {
const parsed = SearchSchema.safeParse(req.query);
if (!parsed.success) {
return res.status(400).json({ error: 'Invalid search parameters' });
}
const { q, limit, cursor, sort, fuzzy } = parsed.data;
// Skip search for very short queries — return empty or popular items
if (q.length < 2) {
return res.json({ data: [], pagination: { hasNextPage: false } });
}
try {
const results = fuzzy
? await searchWithFuzzy({ query: q, tenantId: req.tenant.id, limit: limit + 1 })
: await searchProducts({ query: q, tenantId: req.tenant.id, limit: limit + 1, cursor });
const hasNextPage = results.length > limit;
if (hasNextPage) results.pop();
res.json({
data: results,
query: q,
pagination: { hasNextPage },
});
} catch (err: any) {
// If tsquery throws (malformed input that got through sanitization)
// return empty results rather than a 500
if (err.code === '42601') {
return res.json({ data: [], query: q, pagination: { hasNextPage: false } });
}
throw err;
}
});
// Autocomplete endpoint — fast, prefix-based
router.get('/search/suggest', authenticate, async (req, res) => {
const q = z.string().min(2).max(100).trim().safeParse(req.query.q);
if (!q.success) return res.json([]);
const suggestions = await autocomplete({
query: q.data,
tenantId: req.tenant.id,
limit: 8,
});
res.json(suggestions.map(s => ({ id: s.id, name: s.name, price: s.price })));
});When to Move to Elasticsearch
Postgres full-text search is genuinely capable for most product requirements. Consider moving when:
Your search index exceeds ~50 million rows and query time degrades noticeably
You need faceted search with real-time facet counts (category filter counts that update as you type)
You need multi-language stemming across many languages in the same index
You need ML-based semantic/vector search (though Postgres
pgvectorextension now handles this too)You need search across multiple tables/indices with different schemas simultaneously
For everything else — product search, user search, content search, orders, documents — Postgres handles it, and you get one less system to operate.
Comments (0)
Login to post a comment.