API design mistakes that show up after a year in production

Seven API design mistakes that seem harmless early on but become costly after one year: compatibility, idempotency, error contracts, pagination and deprecation.

Short answer

The most expensive API design mistakes are rarely endpoint syntax issues.

They are usually missing contracts for change, retries, and deprecation.

For the first months, everything looks fine because traffic is low and coordination is manual. After a year, scale exposes those gaps as duplicate writes, client breakages, and painful migrations.

Why these failures appear after one year

Early API validation is usually focused on happy-path correctness.

After 12 months, conditions change:

  • multiple mobile app versions run in parallel,
  • retries happen at many layers (SDK, gateway, workers),
  • concurrent writes become common,
  • the data model evolves while old clients still exist.

At that point, "it works" is no longer the same as "it is safe to evolve".

1) No explicit backward-compatibility policy

A common anti-pattern is treating contract changes as internal refactors.

For API consumers, they are not internal.

What is a breaking change

In practice, any change that can break an existing client that has not been updated and redeployed is breaking, for example:

  • removing a response field,
  • changing field type (int -> string),
  • tightening request validation,
  • changing status-code semantics.

Without a written compatibility policy, every release is a gamble on which clients will break.

How to fix it

Keep the contract in OpenAPI and enforce diff checks in CI.

yaml
openapi: 3.1.2
info:
  title: Billing API
  version: 1.8.0
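paths:
  /v1/invoices/{id}:
    get:
      # Path parameters must be declared for the document to validate.
      # The string type is an assumption for illustration.
      parameters:
        - name: id
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: OK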

Decision rule:

  1. If client code must change, treat it as breaking.
  2. If many clients are affected, introduce a new version with a migration window.
  3. If the change is additive, keep the version and monitor adoption.
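
The breaking-change examples and the decision rule above can be sketched as a small CI check. This is a minimal illustration, not a replacement for a real OpenAPI diff tool: it assumes each response schema has already been flattened into a hypothetical `{field_name: type_name}` dictionary, and it only covers removed fields and changed types.

```python
from typing import Any


def breaking_changes(old: dict[str, Any], new: dict[str, Any]) -> list[str]:
    """Compare two flattened response schemas ({field_name: type_name}).

    Reports removed fields and changed field types.
    Fields that exist only in `new` are treated as additive.
    """
    problems = []
    for field, old_type in old.items():
        if field not in new:
            problems.append(f"removed field: {field}")
        elif new[field] != old_type:
            problems.append(f"type changed: {field} ({old_type} -> {new[field]})")
    return problems


def release_decision(old: dict[str, Any], new: dict[str, Any]) -> str:
    """Return 'breaking' if any client-visible break is found, else 'additive'."""
    return "breaking" if breaking_changes(old, new) else "additive"


# Hypothetical invoice schemas before and after a change.
old_schema = {"id": "integer", "total": "integer"}
new_schema = {"id": "string", "total": "integer", "currency": "string"}
print(breaking_changes(old_schema, new_schema))
# ['type changed: id (integer -> string)']
```

A CI job could fail the build whenever `release_decision` returns `"breaking"` without a corresponding version bump.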

Trade-off: versioned APIs increase maintenance overhead, but skipping versioning increases incident frequency and rollback cost.

2) Retrying mutating operations without idempotency

As systems grow, retries appear everywhere.

If POST /orders has no idempotency key, a transient timeout can create duplicate orders.

Production pattern

http
POST /v1/orders
Idempotency-Key: 4f3bb3ff-3328-4bc2-a70a-59efec9db195

The server should return the same result for the same key and an equivalent payload within the retention window.

Large payment APIs (for example Stripe) use this pattern to contain retry side effects.

Decision criteria

  • operation has side effects -> require idempotency key,
  • read-only operation -> rely on HTTP method semantics,
  • high-value writes (payments/provisioning) -> longer key retention and collision auditing.

Trade-off: key storage and payload matching add complexity, but missing idempotency creates expensive data-repair work.
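
A minimal in-memory sketch of the key storage and payload matching described above. The class and method names are hypothetical, the payload is assumed to be JSON-serializable, and the dictionary stands in for a shared store. A production version would need an atomic insert (for example a unique constraint on the key) so that two concurrent requests with the same key cannot both run the handler.

```python
import hashlib
import json
import time


class IdempotencyConflict(Exception):
    """The same key was reused with a different payload."""


class IdempotencyStore:
    def __init__(self, retention_seconds: float = 24 * 3600, clock=time.monotonic):
        self._retention = retention_seconds
        self._clock = clock
        self._entries = {}  # key -> (payload_fingerprint, response, stored_at)

    @staticmethod
    def _fingerprint(payload: dict) -> str:
        # Canonical JSON so that key order does not change the fingerprint.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def execute(self, key: str, payload: dict, handler):
        """Run handler(payload) at most once per key within the retention window."""
        now = self._clock()
        fingerprint = self._fingerprint(payload)
        entry = self._entries.get(key)
        if entry is not None and now - entry[2] < self._retention:
            if entry[0] != fingerprint:
                raise IdempotencyConflict(key)
            return entry[1]  # replay the stored result, no second side effect
        response = handler(payload)
        self._entries[key] = (fingerprint, response, now)
        return response
```

A retried `POST /v1/orders` with the same key and payload then receives the stored response instead of creating a second order, while a reused key with a different payload is rejected rather than silently replayed.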

3) Error responses without stable machine-readable semantics

If the API only returns {"message":"something went wrong"}, clients cannot reliably decide whether to retry, fix input, or escalate.

After a year, this becomes string-matching chaos across integrations.

Better contract

Standardize on RFC 9457 application/problem+json:

json
{
  "type": "https://api.example.com/problems/insufficient-quota",
  "title": "Quota exceeded",
  "status": 429,
  "detail": "Daily write quota exceeded",
  "instance": "/v1/orders/req-98f1"
}

Clients should map type to policy, not parse detail text.
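
On the client side, mapping type to policy might look like the sketch below. The policy names and the `validation-failed` type URI are hypothetical; only the `insufficient-quota` URI comes from the example above. The fallback on `status` is one possible choice for unknown types, not something RFC 9457 mandates.

```python
import json

# Hypothetical policy table keyed by problem type URI.
POLICY_BY_TYPE = {
    "https://api.example.com/problems/insufficient-quota": "retry_later",
    "https://api.example.com/problems/validation-failed": "fix_input",
}


def policy_for(problem_body: str) -> str:
    """Pick a client policy from a problem+json body without reading `detail`."""
    problem = json.loads(problem_body)
    # RFC 9457: a missing "type" is treated as "about:blank".
    problem_type = problem.get("type", "about:blank")
    if problem_type in POLICY_BY_TYPE:
        return POLICY_BY_TYPE[problem_type]
    # Unknown type: fall back to the status code class.
    status = problem.get("status")
    if not isinstance(status, int):
        return "escalate"
    if status in (429, 503):
        return "retry_later"
    if 400 <= status < 500:
        return "fix_input"
    return "escalate"
```

Because the decision depends only on `type` and `status`, the server can reword `title` and `detail` freely without breaking integrations.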

4) No concurrency contract on updates

At low traffic, concurrent updates look rare.

At scale, lost updates become routine unless concurrency is explicit in the API contract.

Contract that survives production

  • server returns ETag,
  • client sends If-Match on mutation,
  • server rejects a stale update with 412 Precondition Failed, and can reject a mutation that carries no If-Match at all with 428 Precondition Required.

http
PATCH /v1/subscriptions/sub_123
If-Match: "v17"

Trade-off: one more client responsibility, but far lower cost than silent data corruption.
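
A server-side sketch of that contract, under simplifying assumptions: the `Resource` class and `apply_patch` function are hypothetical, the ETag is derived from an integer version counter, and `If-Match` is treated as a single entity tag (the header also allows `*` and comma-separated lists, which are ignored here). In a real store the compare and the write must happen atomically.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    version: int
    data: dict

    @property
    def etag(self) -> str:
        # Strong ETag derived from the version counter, e.g. "v17".
        return f'"v{self.version}"'


def apply_patch(resource: Resource, if_match: Optional[str], changes: dict) -> tuple[int, dict]:
    """Apply changes only if the client saw the current version.

    Returns (status_code, response_headers).
    """
    if if_match is None:
        return 428, {}  # Precondition Required: client must send If-Match
    if if_match != resource.etag:
        return 412, {"ETag": resource.etag}  # Precondition Failed: stale update
    resource.data.update(changes)
    resource.version += 1
    return 200, {"ETag": resource.etag}
```

The second of two concurrent writers receives 412 and has to re-read the resource, so the lost update becomes a visible, recoverable error.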

5) Offset pagination on highly mutable collections

Offset pagination often passes early tests and fails under live write activity.

Between page requests, inserted rows shift the window, causing duplicates or skips.

More stable alternative

  • cursor pagination,
  • deterministic ordering (created_at DESC, id DESC),
  • next/prev links in payload or HTTP Link header.

http
Link: </v1/events?cursor=eyJjcmVhdGVkX2F0Ijoi...>; rel="next"

Decision criteria:

  • small mostly static list -> offset can be acceptable,
  • large/high-churn list -> cursor by default,
  • ETL consumers -> provide snapshot/cutoff semantics.
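
A minimal sketch of cursor pagination over the ordering `created_at DESC, id DESC`. The function names are hypothetical, the list of dictionaries stands in for a database table, timestamps are assumed to be ISO 8601 strings in one fixed format (so string comparison matches time order), and `limit` is assumed to be at least 1. In SQL the filter would typically be a row comparison such as `(created_at, id) < (:created_at, :id)`.

```python
import base64
import json
from typing import Optional


def encode_cursor(created_at: str, item_id: int) -> str:
    """Encode the sort key of the last returned row as an opaque cursor."""
    raw = json.dumps({"created_at": created_at, "id": item_id}, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def decode_cursor(cursor: str) -> tuple[str, int]:
    data = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return data["created_at"], data["id"]


def page(rows: list[dict], limit: int, cursor: Optional[str] = None):
    """Return (items, next_cursor) ordered by created_at DESC, id DESC."""
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]), reverse=True)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        # Keep only rows strictly after the cursor position in sort order.
        ordered = [r for r in ordered if (r["created_at"], r["id"]) < boundary]
    items = ordered[:limit]
    next_cursor = None
    if len(ordered) > limit:
        last = items[-1]
        next_cursor = encode_cursor(last["created_at"], last["id"])
    return items, next_cursor
```

Because the cursor names a position in the ordering rather than a row count, rows inserted at the head of the collection between requests do not shift later pages.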

6) No timeout/retry/backoff budget

Without retry budgets, clients optimize for individual request success and destabilize the whole system.

Under overload, this becomes retry amplification.

Minimal contract

  • explicit 429 and 503 behavior,
  • Retry-After when server can communicate safe retry timing,
  • exponential backoff with jitter guidance,
  • max attempts per request.

The Amazon Builders' Library makes this point clearly: retries are useful, but unmanaged retries can amplify outages.
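
On the client side, the minimal contract above could be sketched as follows. The function names, the base delay, the cap, and the attempt limit are illustrative assumptions, not recommended values. `send` is a hypothetical callable that returns `(status_code, retry_after_seconds_or_None)`, with `Retry-After` assumed to be already parsed into seconds.

```python
import random
import time
from typing import Optional

RETRYABLE = {429, 503}


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 0.5, cap: float = 30.0, rng=random.random) -> float:
    """Seconds to wait before the retry that follows attempt number `attempt` (1-based)."""
    if retry_after is not None:
        return max(0.0, retry_after)  # the server's hint takes precedence
    ceiling = min(cap, base * 2 ** (attempt - 1))  # exponential growth, capped
    return rng() * ceiling  # full jitter: uniform in [0, ceiling)


def call_with_retries(send, max_attempts: int = 4, sleep=time.sleep):
    """Call send() until it succeeds, fails non-retryably, or attempts run out.

    Returns (last_status, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        status, retry_after = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, attempt
        sleep(backoff_delay(attempt, retry_after))
```

The hard attempt limit is what bounds the extra load each client can add during an outage, and the jitter keeps many clients from retrying at the same instant.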

7) Deprecation without dates and migration telemetry

The costly anti-pattern is announcing "v1 is deprecated" in release notes only.

One year later, legacy endpoints still run because nobody knows which clients depend on them.

Deprecation contract

  • Deprecation header (RFC 9745),
  • Sunset header (RFC 8594),
  • Link rel="deprecation" to migration guide,
  • adoption dashboard by client/application.

http
Deprecation: @1743465600
Sunset: Wed, 30 Sep 2026 23:59:59 GMT
Link: <https://docs.example.com/migrate-v2>; rel="deprecation"

Decision rule:

  1. Migration below target threshold -> extend window.
  2. Legacy share low + communication complete -> close old version.
  3. No adoption telemetry -> do not sunset yet.
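
The headers and the decision rule above can be sketched together. The function names, the `threshold` parameter (the largest legacy traffic share at which shutdown is acceptable), and the returned labels are hypothetical. Both datetimes are assumed to be timezone-aware UTC values, which `email.utils.format_datetime` requires when `usegmt=True`.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Optional


def deprecation_headers(deprecated_at: datetime, sunset_at: datetime, guide_url: str) -> dict:
    """Build response headers for a deprecated endpoint."""
    return {
        # RFC 9745: "@" followed by a Unix timestamp.
        "Deprecation": f"@{int(deprecated_at.timestamp())}",
        # RFC 8594: HTTP-date; the weekday is computed from the date.
        "Sunset": format_datetime(sunset_at, usegmt=True),
        "Link": f'<{guide_url}>; rel="deprecation"',
    }


def sunset_decision(legacy_share: Optional[float], threshold: float,
                    communication_complete: bool) -> str:
    """Apply the three-step decision rule to adoption telemetry."""
    if legacy_share is None:
        return "do_not_sunset"  # no adoption telemetry yet
    if legacy_share <= threshold and communication_complete:
        return "close_old_version"
    return "extend_window"


headers = deprecation_headers(
    datetime(2025, 4, 1, tzinfo=timezone.utc),
    datetime(2026, 9, 30, 23, 59, 59, tzinfo=timezone.utc),
    "https://docs.example.com/migrate-v2",
)
print(headers["Deprecation"])  # @1743465600
```

Generating the date strings from datetime values, rather than typing them by hand, avoids mismatches between the weekday name and the date.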

90-day remediation checklist

  1. Define a formal breaking/non-breaking policy and enforce an OpenAPI (OAS) diff gate in CI.
  2. Require idempotency keys for side-effecting mutations.
  3. Standardize error payloads on RFC 9457 with stable type URIs.
  4. Add ETag/If-Match to critical PUT/PATCH flows.
  5. Replace offset with cursor pagination on high-churn resources.
  6. Publish retry budget and backoff requirements.
  7. Document 429/503 plus Retry-After behavior.
  8. Emit Deprecation and Sunset headers for retiring versions.
  9. Track version adoption per client id/token.
  10. Keep a clear runbook for extending or enforcing sunset.

Final verdict

The API design mistakes that hurt after a year all share one pattern: missing explicit contracts for evolution.

A healthy API is not one that returns 200 today.

It is one that can change safely under growing traffic, many clients, and continuous delivery pressure.