ADR-21: Quota Reservation for Concurrent Request Handling

| Date | Author | Repos |
| --- | --- | --- |
| 2026-02-01 | @specvital | web, worker, infra |

Context

The Concurrency Gap in Event-Based Billing

ADR-13 (Billing and Quota Architecture) established event-based usage tracking where usage events are recorded at job completion. This model ensures billing accuracy (only successful operations are charged) and aligns with the cache-first architecture (cache hits generate no events).

However, a temporal gap exists between the quota check (at request submission) and quota consumption (at job completion). Under concurrent requests, this gap creates a check-then-act race condition:

Race Condition Scenario:

| Time | Action | User State |
| --- | --- | --- |
| T0 | User has 4998/5000 used | - |
| T1 | Request A checks quota | 4998 < 5000 → Pass |
| T2 | Request B checks quota | 4998 < 5000 → Pass |
| T3 | Request A completes | 5008 used (exceeds limit) |
| T4 | Request B completes | 5018 used |
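
The timeline above can be replayed as a small deterministic simulation. This is only an illustrative sketch: `check_quota`, `LIMIT`, and `COST` are hypothetical names, and the per-request cost of 10 units is inferred from the jump from 4998 to 5008 in the table.

```python
LIMIT = 5000
COST = 10  # assumed units per request, inferred from 4998 -> 5008 in the table


def check_quota(used: int, limit: int = LIMIT) -> bool:
    """Naive check: passes while recorded usage is still under the limit."""
    return used < limit


used = 4998                    # T0: usage recorded so far
a_passed = check_quota(used)   # T1: Request A passes
b_passed = check_quota(used)   # T2: Request B sees the same value and also passes

timeline = []
if a_passed:
    used += COST               # T3: usage is recorded only when A completes
    timeline.append(used)
if b_passed:
    used += COST               # T4: B completes as well
    timeline.append(used)

print(timeline)
```

Both checks read usage before either job has recorded an event, so neither check can see the other request's pending consumption.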

Consequences of Unaddressed Race Condition:

  • Server resources wasted processing over-quota jobs
  • Billing discrepancies where actual usage exceeds limits
  • Unfair system behavior depending on request timing

Constraints

| Constraint | Rationale |
| --- | --- |
| Must block at Web level | Worker re-check wastes queue and compute resources |
| PostgreSQL-only solution | Align with ADR-04 (River uses PostgreSQL) |
| No user-visible billing anomalies | Preserve user trust in billing accuracy |
| Worker owns lifecycle | ADR-12 establishes Worker as record creator |

Decision

Adopt a quota reservation pattern with atomic transactions for concurrent request handling.

Reservation Mechanism

Introduce a quota_reservations table to track in-flight quota commitments:

| Column | Type | Purpose |
| --- | --- | --- |
| id | uuid (PK) | Primary key |
| user_id | uuid (FK) | Links to user account |
| event_type | usage_event_type | specview or analysis |
| reserved_amount | int | Anticipated quota consumption |
| job_id | bigint (UNIQUE) | Links to River job for cleanup |
| expires_at | timestamptz | 1-hour TTL for orphan cleanup |

Quota Check Formula

```sql
used + reserved + requested_amount <= limit
```

Where:

  • used: Sum of usage_events in current billing period
  • reserved: Sum of active reservations (expires_at > NOW())
  • requested_amount: Units required for current request
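
As a worked example of the formula, the sketch below evaluates it for two back-to-back requests. The function name `can_reserve` and the numbers (4985 used, 10 units per request) are assumptions chosen for illustration, not values from the ADR.

```python
def can_reserve(used: int, reserved: int, requested_amount: int, limit: int) -> bool:
    """Return True when the request fits within the remaining quota."""
    return used + reserved + requested_amount <= limit


LIMIT = 5000
used = 4985  # usage_events already recorded in this billing period

# Request A: nothing is reserved yet, 4985 + 0 + 10 = 4995 <= 5000
request_a = can_reserve(used, reserved=0, requested_amount=10, limit=LIMIT)

# Request B: A's reservation now counts, 4985 + 10 + 10 = 5005 > 5000
request_b = can_reserve(used, reserved=10, requested_amount=10, limit=LIMIT)

print(request_a, request_b)
```

Because A's reservation is included in `reserved`, B is rejected at submission even though A has not yet produced a usage event.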

Transaction Atomicity

The Web layer executes the following steps in a single PostgreSQL transaction:

  1. Check quota (including active reservations)
  2. Create reservation record
  3. Insert job into River queue (InsertTx)

Atomicity ensures that if job insertion fails, the transaction rolls back and no reservation is left behind.
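
The three steps can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's implementation: SQLite stands in for PostgreSQL, a plain `jobs` table stands in for the River queue, and `submit` and `fail_enqueue` are hypothetical names. One caveat worth stating: a single transaction does not by itself stop two concurrent transactions from reading the same totals under PostgreSQL's default READ COMMITTED isolation, so some per-user serialization (for example a row lock or an advisory lock) is assumed. In this sketch SQLite's `BEGIN IMMEDIATE` plays that role.

```python
import sqlite3

# Assumption: SQLite replaces PostgreSQL, and `jobs` replaces the River queue.
conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
conn.executescript("""
CREATE TABLE usage_events (user_id TEXT, amount INTEGER);
CREATE TABLE jobs (id INTEGER PRIMARY KEY, user_id TEXT);
CREATE TABLE quota_reservations (
    user_id TEXT, reserved_amount INTEGER,
    job_id INTEGER UNIQUE, expires_at TEXT);
""")


def submit(user_id, amount, limit, fail_enqueue=False):
    """Check, reserve, and enqueue atomically. Returns job id, or None if over quota."""
    conn.execute("BEGIN IMMEDIATE")  # takes the write lock, serializing submitters
    try:
        used = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM usage_events WHERE user_id = ?",
            (user_id,)).fetchone()[0]
        reserved = conn.execute(
            "SELECT COALESCE(SUM(reserved_amount), 0) FROM quota_reservations "
            "WHERE user_id = ? AND expires_at > datetime('now')",
            (user_id,)).fetchone()[0]
        if used + reserved + amount > limit:          # step 1: quota check
            conn.execute("ROLLBACK")
            return None
        res_id = conn.execute(                        # step 2: create reservation
            "INSERT INTO quota_reservations (user_id, reserved_amount, expires_at) "
            "VALUES (?, ?, datetime('now', '+1 hour'))",
            (user_id, amount)).lastrowid
        if fail_enqueue:                              # hypothetical failure switch
            raise RuntimeError("job insertion failed")
        job_id = conn.execute(                        # step 3: enqueue the job
            "INSERT INTO jobs (user_id) VALUES (?)", (user_id,)).lastrowid
        conn.execute("UPDATE quota_reservations SET job_id = ? WHERE rowid = ?",
                     (job_id, res_id))
        conn.execute("COMMIT")
        return job_id
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")                  # reservation disappears with the job
        raise


conn.execute("INSERT INTO usage_events VALUES ('u1', 4985)")
first = submit("u1", 10, 5000)    # accepted: 4985 + 0 + 10 <= 5000
second = submit("u1", 10, 5000)   # rejected: 4985 + 10 + 10 > 5000
try:
    submit("u2", 10, 5000, fail_enqueue=True)
except RuntimeError:
    pass
leftover = conn.execute(
    "SELECT COUNT(*) FROM quota_reservations WHERE user_id = 'u2'").fetchone()[0]
```

In this sketch the reservation is inserted before the job and linked to it afterwards, which keeps the step order listed above. A real implementation might instead obtain the job id first and insert the reservation with it directly.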

Reservation Lifecycle

Web: Check quota → Create reservation → InsertTx

Worker: Process job → Delete reservation (success or failure)

Cleanup: Expire orphaned reservations after 1 hour
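
The lifecycle above can be modeled with a small in-memory store. This is a sketch for reasoning about the three roles, not the actual schema access code; `ReservationStore` and its method names are hypothetical, and a dictionary keyed by job_id stands in for the quota_reservations table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

TTL = timedelta(hours=1)  # 1-hour TTL, as in the reservation table


@dataclass
class Reservation:
    user_id: str
    reserved_amount: int
    expires_at: datetime


class ReservationStore:
    """In-memory stand-in for the quota_reservations table, keyed by job_id."""

    def __init__(self):
        self._by_job = {}

    def reserve(self, job_id, user_id, amount, now):
        # Web: called after the quota check has passed.
        self._by_job[job_id] = Reservation(user_id, amount, now + TTL)

    def release(self, job_id):
        # Worker: called on success or failure; a missing row is not an error.
        self._by_job.pop(job_id, None)

    def expire_orphans(self, now):
        # Cleanup: delete reservations where expires_at < now; return the count.
        stale = [j for j, r in self._by_job.items() if r.expires_at < now]
        for j in stale:
            del self._by_job[j]
        return len(stale)

    def active_total(self, user_id, now):
        # Sum of reservations that still count toward the quota check.
        return sum(r.reserved_amount for r in self._by_job.values()
                   if r.user_id == user_id and r.expires_at > now)


t0 = datetime(2026, 2, 1, 12, 0, tzinfo=timezone.utc)
store = ReservationStore()
store.reserve(1, "u1", 10, t0)
store.reserve(2, "u1", 10, t0)
store.release(1)                                          # job 1 finished
after_release = store.active_total("u1", t0)              # only job 2 remains
removed = store.expire_orphans(t0 + timedelta(hours=2))   # job 2 never reported back
```

Making `release` tolerate a missing reservation means a Worker that finishes after the cleanup job has already expired its reservation does not fail on the delete.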

Options Considered

Option A: Reservation Pattern (Selected)

Mechanism: Create reservation atomically with job insertion; Worker deletes on completion.

| Aspect | Assessment |
| --- | --- |
| Quota accuracy | Guaranteed - no concurrent over-commitment |
| User experience | Transparent - reservations are internal state |
| Resource efficiency | High - over-quota blocked at Web |
| Complexity | Moderate - additional table and cleanup job |

Option B: Pre-Deduction with Refund

Mechanism: Deduct quota at submission; refund on job failure.

| Aspect | Assessment |
| --- | --- |
| Quota accuracy | Guaranteed |
| User experience | Poor - users see inflated usage during processing |
| Resource efficiency | High |
| Complexity | High - multiple failure states require refund logic |

Rejected: Creates billing anxiety and support burden. Users observing dashboards during job processing would see usage that may be refunded, undermining trust.

Option C: Worker Re-Check

Mechanism: Optimistic check at Web; authoritative check at Worker.

| Aspect | Assessment |
| --- | --- |
| Quota accuracy | Guaranteed (at Worker level) |
| User experience | Poor - job accepted then rejected asynchronously |
| Resource efficiency | Poor - over-quota jobs consume queue capacity |
| Complexity | Low |

Rejected: Violates requirement to block at Web level. Wastes queue and Worker resources on jobs that will be rejected.

Consequences

Positive

| Benefit | Impact |
| --- | --- |
| Accurate quota enforcement | Concurrent requests cannot exceed limits |
| Resource efficiency | No wasted Worker cycles on over-quota jobs |
| User fairness | Only completed work appears in billing |
| Atomic consistency | Job + reservation in single transaction |
| Clean failure handling | Reservation deleted regardless of job outcome |

Negative

| Trade-off | Mitigation |
| --- | --- |
| Additional schema complexity | Standard pattern; single table with clear purpose |
| Query overhead for reservation aggregation | Index on (user_id, event_type, expires_at) |
| Orphan cleanup requirement | Scheduled job with configurable 1-hour TTL |
| Debugging complexity | Structured logging at reserve/release points |

Technical Implications

| Aspect | Requirement |
| --- | --- |
| Transaction scope | InsertTx and CreateReservation share PostgreSQL transaction |
| Index design | Compound index for efficient reservation lookup and expiration |
| Cleanup schedule | Cron job to delete reservations where expires_at < NOW() |
| Monitoring | Alert on orphan count (indicates Worker health issues) |
| Worker modification | Delete reservation by job_id on job completion |

References
