
PostgreSQL Performance in Django: Indexes That Actually Work

A practical guide to picking PostgreSQL indexes in Django that lower latency, with clear trade-offs for write cost and operational complexity.


Short answer

In Django, indexes actually work when each one is tied to a specific query shape: filter, ordering, or function-based lookup.

They stop working as a strategy when you add them in bulk without measurement, because every index increases INSERT, UPDATE, and DELETE cost.

A reliable approach is simple: measure first, add one targeted index, measure again, then decide keep/drop.

What "actually works" means in production

Treat an index as successful only if all three are true:

  • it lowers p95 latency for a real endpoint or query,
  • it reduces plan cost and/or buffer reads in EXPLAIN ANALYZE,
  • it keeps write-path overhead inside an agreed budget.

If your local benchmark improves but production writes slow down, the index is not a win.

Index selection flow (data before opinions)


This flow forces evidence-based decisions instead of checklist-driven indexing.

1) Start from expensive queries, not model fields

Use pg_stat_statements to identify high-cost statements by total_exec_time or high per-call latency by mean_exec_time.
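A report like the following is a reasonable starting point (a sketch; it assumes PostgreSQL 13+, where the timing columns are named total_exec_time and mean_exec_time, and that the pg_stat_statements extension is installed and enabled):

```python
# Sketch: top statements by cumulative execution time. Assumes PostgreSQL 13+
# (older versions name the columns total_time / mean_time) and an enabled
# pg_stat_statements extension.
TOP_STATEMENTS_SQL = """
SELECT queryid,
       calls,
       total_exec_time,
       mean_exec_time,
       LEFT(query, 120) AS query_head
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
"""

def fetch_top_statements(connection):
    # Pass django.db.connection (or any DB-API connection exposing cursor()).
    with connection.cursor() as cursor:
        cursor.execute(TOP_STATEMENTS_SQL)
        return cursor.fetchall()
```

Sorting by mean_exec_time instead surfaces statements that are slow per call rather than merely frequent.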

Then inspect the exact ORM query with QuerySet.explain().

```python
qs = (
    Order.objects
    .filter(tenant_id=tenant_id, status="PAID")
    .order_by("-created_at")
)

print(qs.explain(analyze=True, buffers=True, verbose=True))
```

The plan is your input contract for indexing decisions.

2) B-Tree first for most OLTP API paths

In PostgreSQL, B-Tree is still the default best choice for comparisons and ordering.

For multicolumn indexes, column order is decisive because of the leftmost-prefix behavior.

For WHERE tenant_id = ? AND status = ? ORDER BY created_at DESC, a strong candidate is:

```python
Index(fields=["tenant_id", "status", "-created_at"], name="ord_tenant_status_created_ix")
```

Trade-off: better read path, higher write amplification on every Order write.
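The leftmost-prefix behavior can be illustrated with a toy model (a deliberate simplification: the real planner also weighs range predicates, sort order, and table statistics):

```python
def can_use_btree_prefix(index_columns, equality_columns):
    """Toy leftmost-prefix rule: a multicolumn B-Tree index supports
    equality lookups only on an unbroken leading run of its columns."""
    usable = []
    for col in index_columns:
        if col not in equality_columns:
            break  # gap in the prefix: columns after it can't drive the lookup
        usable.append(col)
    return usable

idx = ["tenant_id", "status", "created_at"]

# WHERE tenant_id = ? AND status = ? -> both leading columns are usable,
# and created_at, next in line, can still serve the ORDER BY:
print(can_use_btree_prefix(idx, {"tenant_id", "status"}))  # ['tenant_id', 'status']

# WHERE status = ? alone skips tenant_id, so the prefix is broken:
print(can_use_btree_prefix(idx, {"status"}))  # []
```

This is why the equality columns come first and the ORDER BY column last in the candidate above.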

3) Partial index when hot traffic hits a subset

If most requests target a narrow subset (for example status='PAID'), a partial index is smaller, cheaper to maintain, and often delivers a better ROI than a full-table index.

In Django:

```python
Index(fields=["tenant_id", "-created_at"], condition=Q(status="PAID"), name="ord_paid_recent_ix")
```

Important: PostgreSQL uses a partial index only if the query predicate logically implies the index predicate. Small predicate mismatches or parameterization details can make it unusable.
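The implication requirement can be sketched for the simple single-column case (PostgreSQL's real predicate proving is more general, but the intuition carries over):

```python
def query_can_use_partial_index(query_statuses, index_statuses):
    """Simplified implication check for a status predicate: the query can
    use the partial index only if every status it may match is covered
    by the index's WHERE condition."""
    return set(query_statuses) <= set(index_statuses)

# WHERE status = 'PAID' implies the index condition status = 'PAID':
print(query_can_use_partial_index({"PAID"}, {"PAID"}))  # True

# WHERE status IN ('PAID', 'REFUNDED') does NOT imply status = 'PAID',
# so the partial index is unusable for that query:
print(query_can_use_partial_index({"PAID", "REFUNDED"}, {"PAID"}))  # False
```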

4) Expression index for function-based predicates

If your query filters on LOWER(email) or another expression, a plain index on email may not help.

Use an expression index:

```python
Index(Lower("email"), name="ord_email_lower_ix")
```

You gain read performance by avoiding per-row expression evaluation during lookup, but writes pay the maintenance cost.
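The read-side gain can be pictured with a toy contrast between a per-row LOWER() scan and a structure keyed by the precomputed expression (illustrative data only):

```python
rows = [
    {"id": 1, "email": "Alice@Example.com"},
    {"id": 2, "email": "bob@example.com"},
]

def seq_scan(target):
    # Sequential scan: LOWER(email) is re-evaluated for every row, on every lookup.
    return [r["id"] for r in rows if r["email"].lower() == target]

# "Expression index": LOWER(email) is computed once, at write time.
expr_index = {}
for r in rows:
    expr_index.setdefault(r["email"].lower(), []).append(r["id"])

def index_lookup(target):
    # Lookup cost no longer depends on per-row expression evaluation.
    return expr_index.get(target, [])

print(seq_scan("alice@example.com"))      # [1]
print(index_lookup("alice@example.com"))  # [1]
```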

5) Covering index (INCLUDE) for list endpoints

INCLUDE lets you keep non-key columns in the index payload.

This can reduce heap access and enable index-only scans in favorable conditions.

But index-only scans are not guaranteed: they depend on visibility map state, not only on index definition.

Example:

```python
Index(fields=["tenant_id", "status"], include=["total_gross", "currency"], name="ord_cover_ix")
```

On backends without covering-index support, Django ignores the non-key columns and flags this with the models.W040 system check warning.
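The mechanics can be sketched with a toy covering index, where each entry carries the INCLUDE payload so a matching lookup never touches the simulated heap (illustrative data; a real index-only scan additionally needs the visibility map to mark pages all-visible):

```python
# Simulated heap: row id -> full row.
heap = {
    101: {"tenant_id": "t1", "status": "PAID", "total_gross": "10.00", "currency": "EUR"},
    102: {"tenant_id": "t1", "status": "OPEN", "total_gross": "5.00", "currency": "EUR"},
}

# Covering index: key columns map to entries carrying the non-key payload.
covering_index = {}
for row in heap.values():
    covering_index.setdefault((row["tenant_id"], row["status"]), []).append(
        (row["total_gross"], row["currency"])  # INCLUDE columns
    )

def index_only_scan(tenant_id, status):
    # Every selected column lives in the index entry: no heap access needed.
    # This only pays off when the query selects nothing beyond key + INCLUDE columns.
    return covering_index.get((tenant_id, status), [])

print(index_only_scan("t1", "PAID"))  # [('10.00', 'EUR')]
```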

6) GIN and BRIN are specialist tools

Choose GIN for complex searchable structures: jsonb, arrays, hstore, or full-text vectors.

In Django, this is typically GinIndex for JSONField workloads.

Choose BRIN for very large append-only tables where values correlate with physical row order (often timestamp-heavy telemetry data).

For classic transactional API traffic, B-Tree is usually a safer baseline than BRIN.
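BRIN's trade-off is easiest to see in miniature: one (min, max) summary per block range, which prunes well only when values correlate with physical row order (toy numbers below):

```python
# One (min_ts, max_ts) summary per block range, as BRIN stores it.
block_summaries = [
    (0, 99),     # block range 0: timestamps 0..99
    (100, 199),  # block range 1
    (200, 299),  # block range 2
]

def ranges_to_scan(ts_from, ts_to):
    """Keep only block ranges whose summary overlaps the query window;
    everything else is skipped without reading the table."""
    return [
        i for i, (lo, hi) in enumerate(block_summaries)
        if not (ts_to < lo or ts_from > hi)
    ]

print(ranges_to_scan(150, 160))  # [1] -- two of three ranges pruned
```

If rows were inserted in random timestamp order, every summary would span nearly the full value range and nothing could be pruned.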

Django Meta.indexes example

```python
from django.contrib.postgres.indexes import GinIndex
from django.db import models
from django.db.models import Q
from django.db.models.functions import Lower

class Order(models.Model):
    tenant_id = models.UUIDField()
    status = models.CharField(max_length=20)
    created_at = models.DateTimeField()
    email = models.EmailField()
    total_gross = models.DecimalField(max_digits=12, decimal_places=2)
    currency = models.CharField(max_length=3)
    metadata = models.JSONField(default=dict)

    class Meta:
        indexes = [
            models.Index(
                fields=["tenant_id", "status", "-created_at"],
                name="ord_tenant_status_created_ix",
            ),
            models.Index(
                fields=["tenant_id", "-created_at"],
                condition=Q(status="PAID"),
                name="ord_paid_recent_ix",
            ),
            models.Index(
                Lower("email"),
                name="ord_email_lower_ix",
            ),
            models.Index(
                fields=["tenant_id", "status"],
                include=["total_gross", "currency"],
                name="ord_cover_ix",
            ),
            GinIndex(
                fields=["metadata"],
                name="ord_meta_gin_ix",
            ),
        ]
```

This set only makes sense when it matches your top query patterns. Otherwise, it is over-indexing.

Common failure patterns

  • Adding indexes to every filtered column without frequency analysis.
  • Keeping overlapping indexes with nearly identical prefixes.
  • Ignoring write-path regression after deployment.
  • Assuming partial index predicates always match generated SQL.
  • Treating INCLUDE as a guaranteed index-only scan.

Production rollout checklist

  1. Take top 5 expensive statements from pg_stat_statements over 24-72h.
  2. Record baseline for each: p95, plan, BUFFERS, and calls.
  3. Add one index per iteration, deploy outside peak traffic; on PostgreSQL, prefer CREATE INDEX CONCURRENTLY (in Django, AddIndexConcurrently in a non-atomic migration) to avoid blocking writes.
  4. Compare read and write metrics post-deploy.
  5. Drop indexes with marginal gain or unacceptable write penalty.
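The keep/drop decision in step 5 can be sketched as a simple rule (the 10% write budget is an assumed placeholder; substitute your own agreed threshold):

```python
def index_verdict(baseline_p95_ms, new_p95_ms,
                  baseline_write_ms, new_write_ms,
                  write_budget_pct=10.0):
    """Keep the index only if reads actually improved and the write-path
    regression stays inside the agreed budget (assumed 10% here)."""
    read_gain_pct = 100.0 * (baseline_p95_ms - new_p95_ms) / baseline_p95_ms
    write_cost_pct = 100.0 * (new_write_ms - baseline_write_ms) / baseline_write_ms
    if read_gain_pct <= 0 or write_cost_pct > write_budget_pct:
        return "drop"
    return "keep"

print(index_verdict(120.0, 45.0, 8.0, 8.5))    # keep: big read win, small write cost
print(index_verdict(120.0, 118.0, 8.0, 10.0))  # drop: write penalty over budget
```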

This method usually produces fewer indexes, but with higher practical ROI.

Final verdict

In Django/PostgreSQL, indexes that actually work are the smallest set that matches the most expensive real queries.

Start with B-Tree and correct column order, add partial and expression indexes only where measurements justify them, and treat GIN/BRIN as targeted tools.

An index is a long-term maintenance decision, not a benchmark trick.