
ADR-12: Worker-Centric Analysis Lifecycle


Date         Author         Repos
2024-12-16   @KubrickCode   web, worker

Context

Existing Architecture

ADR-03 separated the API (Web) and Worker services, and ADR-04 introduced queue-based asynchronous processing. The initial implementation of that design caused dual-ownership issues:

Previous Flow:

User Request → Web creates "pending" record → Enqueue → Worker processes → Record update

Problem: Two services manipulating the same database record

Dual Ownership Issues

Issue                    Impact
Duplicate Records        Web creates pending record, Worker may create another
State Inconsistency      Web's DB state may not match actual queue state
Complex Error Recovery   Failure requires coordination between two services
Race Conditions          Retry requests may create multiple pending records
Unclear Responsibility   Ambiguous authoritative source for record state

Root Cause

Core problem: No Single Source of Truth for analysis record lifecycle. Both Web and Worker have write access to the same record, causing synchronization complexity.

Decision

Adopt Worker-centric analysis lifecycle where Worker exclusively owns record creation, processing, and completion.

New Architecture

User Request → Web (enqueue only) → Queue → Worker (create → process → complete)
                    ↓                              ↓
              UUID generation          Single ownership of record lifecycle
              Check status from queue  Create record on job start
                                       Update on completion/failure

Core Principles:

  1. Web: Enqueue Only - Generate analysis UUID, enqueue job, no writes to analysis table
  2. Worker: Full Ownership - Create record on job start, update on completion
  3. Queue as State Source - Web checks in-progress analysis from queue, not DB
  4. Single Writer - Only Worker writes to analysis records
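
The four principles can be illustrated with a minimal in-memory sketch. Everything here is hypothetical and for illustration only: `InMemoryQueue`, `Web`, `Worker`, and the `"running"` status name are assumptions, not the project's actual types or the River API. The point is structural: `Web` never receives a handle to the record store, so it cannot write to it.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class InMemoryQueue:
    """Hypothetical stand-in for the real job queue."""
    jobs: list = field(default_factory=list)

    def enqueue(self, job: dict) -> None:
        self.jobs.append(job)

    def pop(self) -> dict:
        return self.jobs.pop(0)


class Web:
    """Enqueue only: holds no reference to the analysis records."""

    def __init__(self, queue: InMemoryQueue) -> None:
        self.queue = queue

    def request_analysis(self, owner: str, repo: str, commit_sha: str) -> str:
        analysis_id = str(uuid.uuid4())  # UUID is generated by Web
        self.queue.enqueue({
            "owner": owner,
            "repo": repo,
            "commit_sha": commit_sha,
            "analysis_id": analysis_id,
        })
        return analysis_id


class Worker:
    """Single writer: the only component that touches analysis records."""

    def __init__(self, queue: InMemoryQueue, records: dict) -> None:
        self.queue = queue
        self.records = records

    def run_once(self, analyze) -> None:
        job = self.queue.pop()
        # "running" is an assumed status name for the in-progress state.
        record = {**job, "status": "running"}
        self.records[job["analysis_id"]] = record  # create on job start
        try:
            record["result"] = analyze(job)
            record["status"] = "completed"  # update on completion
        except Exception as exc:
            record["status"] = "failed"  # update on failure
            record["error"] = str(exc)
```

Until `Worker.run_once` picks up the job, no record exists for the returned `analysis_id`, which is the "delayed visibility" trade-off discussed under Consequences.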

Options Considered

Option A: Worker-Centric Ownership (Selected)

Web enqueues analysis request with generated UUID. Worker creates record on processing start, updates on completion.

Pros:

  • Single source of truth: One service owns entire lifecycle
  • No duplicate records: Only Worker creates analysis entries
  • Clear service boundaries: Web handles HTTP, Worker handles analysis
  • Simple error handling: All failure states managed in one place
  • Better transaction consistency: Create and update in same service context
  • Independent scaling: Worker scales without affecting Web

Cons:

  • Web depends on queue for status queries
  • Queue system becomes critical infrastructure
  • Slightly more complex status checking logic

Option B: Web-Centric Ownership

Web creates and manages all records. Worker only updates existing records.

Pros:

  • Immediate DB record for tracking
  • Simple status queries (always from DB)

Cons:

  • Worker must handle "record not found" cases
  • Timing issues if the Worker picks up the job before Web's DB commit
  • More complex retry logic (must check record existence)
  • Web becomes bottleneck for record creation

Option C: Dual Ownership (Existing)

Both services can create and modify records with coordination logic.

Pros:

  • Implementation flexibility

Cons:

  • Duplicate record risk
  • Complex coordination required
  • Similar complexity to distributed transactions
  • Ambiguous source of truth

Implementation Details

Queue Payload

```json
{
  "owner": "github-org",
  "repo": "repo-name",
  "commit_sha": "abc123",
  "user_id": "uuid",
  "analysis_id": "uuid"
}
```
  • analysis_id: Pre-generated by Web
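
As a sketch of how Web might build and serialize this payload, assuming a typed wrapper is wanted: the field names come from the JSON above, while `AnalysisJobPayload`, `new_payload`, `encode`, and `decode` are hypothetical names introduced for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AnalysisJobPayload:
    """Mirrors the queue payload fields listed above."""
    owner: str
    repo: str
    commit_sha: str
    user_id: str
    analysis_id: str


def new_payload(owner: str, repo: str, commit_sha: str, user_id: str) -> AnalysisJobPayload:
    # Web pre-generates analysis_id so it can be returned to the caller
    # before any database record exists.
    return AnalysisJobPayload(
        owner=owner,
        repo=repo,
        commit_sha=commit_sha,
        user_id=user_id,
        analysis_id=str(uuid.uuid4()),
    )


def encode(payload: AnalysisJobPayload) -> str:
    """Serialize the payload to the JSON shape shown above."""
    return json.dumps(asdict(payload))


def decode(raw: str) -> AnalysisJobPayload:
    """Parse a JSON payload on the Worker side."""
    return AnalysisJobPayload(**json.loads(raw))
```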

Status Query Flow (Web)

  1. Check if active job matching owner/repo exists in queue
  2. Active job found → Return "pending" status
  3. No active job → Query DB for completed/failed analysis
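
The three steps above can be sketched as a single function. This is a minimal illustration, not the actual repository layer: `get_analysis_status` is a hypothetical name, `active_jobs` and `records` stand in for the queue and DB lookups, returning `None` when nothing is found is an assumption, and so is the convention that `records` is ordered oldest first.

```python
from typing import Optional


def get_analysis_status(
    owner: str,
    repo: str,
    active_jobs: list,
    records: list,
) -> Optional[str]:
    # Steps 1-2: an active queue job for this owner/repo means "pending".
    for job in active_jobs:
        if job["owner"] == owner and job["repo"] == repo:
            return "pending"

    # Step 3: otherwise fall back to finished analyses stored in the DB.
    finished = [
        r for r in records
        if r["owner"] == owner
        and r["repo"] == repo
        and r["status"] in ("completed", "failed")
    ]
    if not finished:
        return None  # assumption: no analysis has ever been requested
    return finished[-1]["status"]  # assumption: last entry is the most recent
```

The queue is consulted first because it is the authoritative source for in-progress work; a stale completed record must not mask a newer job that is still queued.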

Deduplication

Commit-SHA-based uniqueness prevents duplicate analysis:

  • Multiple requests for same repo+commit → Single queue job
  • River's unique constraint on (owner, repo, commit_sha)
  • Prevents unnecessary computation on concurrent requests
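
The deduplication behavior can be modeled in a few lines. This in-memory sketch only illustrates the semantics of a uniqueness key on (owner, repo, commit_sha); it is not River's API, and `DedupQueue` and its methods are hypothetical names. In the real system the constraint is enforced by the queue itself.

```python
from typing import Dict, Tuple


class DedupQueue:
    """Keeps at most one active job per (owner, repo, commit_sha)."""

    def __init__(self) -> None:
        self._active: Dict[Tuple[str, str, str], dict] = {}

    @staticmethod
    def _key(job: dict) -> Tuple[str, str, str]:
        return (job["owner"], job["repo"], job["commit_sha"])

    def enqueue(self, job: dict) -> Tuple[dict, bool]:
        """Return (job, inserted); a duplicate returns the existing job."""
        key = self._key(job)
        existing = self._active.get(key)
        if existing is not None:
            return existing, False  # concurrent request joins the existing job
        self._active[key] = job
        return job, True

    def complete(self, job: dict) -> None:
        """Release the key so a later request can enqueue again."""
        self._active.pop(self._key(job), None)
```

A second request for the same commit gets back the first job, including its `analysis_id`, so both callers can track the same analysis.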

Consequences

Positive

Data Integrity:

  • No duplicate analysis records
  • Consistent lifecycle state machine
  • Race conditions eliminated through clear ownership

Operational Simplicity:

  • Single service for debugging analysis issues (Worker)
  • Clearer logs and traces
  • Simpler monitoring (only one writer to track)

Scalability:

  • Worker scales independently based on queue depth
  • Web stays lightweight (no heavy DB writes for analysis)
  • Queue naturally buffers load

Future Compatibility:

  • Aligns with scheduled re-analysis (Worker ADR-01)
  • Same lifecycle whether user-initiated or scheduled
  • Consistent ownership model

Negative

Queue Dependency:

  • Web must query queue for status
  • Queue unavailability affects status queries
  • Mitigation: The queue (River) shares the PostgreSQL backend, so its availability matches the DB's

Delayed Visibility:

  • Analysis doesn't appear in DB until Worker starts processing
  • Short delay between enqueue and record creation
  • Mitigation: Queue status provides immediate feedback

Complexity Shift:

  • Status logic moves from simple DB query to queue inspection
  • Mitigation: Encapsulate in repository/adapter layer

Technical Implications

Aspect               Implication
Transaction Scope    Record creation + initial state in single Worker transaction
Failure Handling     All retries managed in Worker
Queue Schema         Must support owner/repo lookup for status
Monitoring           Queue metrics indicate analysis status
Scheduled Analysis   User-initiated and scheduled share same lifecycle
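
The transaction-scope row can be made concrete with a small sketch. It uses Python's built-in `sqlite3` purely as a stand-in for PostgreSQL, and the `analyses` and `analysis_events` tables, the `"running"` status, and `start_analysis` are all hypothetical. It shows the record and its initial state being written atomically: either both rows are committed or neither is.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE analyses (
    id TEXT PRIMARY KEY,
    owner TEXT NOT NULL,
    repo TEXT NOT NULL,
    commit_sha TEXT NOT NULL,
    status TEXT NOT NULL
);
CREATE TABLE analysis_events (
    analysis_id TEXT NOT NULL,
    event TEXT NOT NULL
);
"""


def start_analysis(conn: sqlite3.Connection, job: dict) -> None:
    """Create the record and its initial state in one transaction."""
    # The connection context manager commits on success and rolls back
    # if any statement inside the block raises.
    with conn:
        conn.execute(
            "INSERT INTO analyses (id, owner, repo, commit_sha, status) "
            "VALUES (?, ?, ?, ?, ?)",
            (job["analysis_id"], job["owner"], job["repo"],
             job["commit_sha"], "running"),
        )
        conn.execute(
            "INSERT INTO analysis_events (analysis_id, event) VALUES (?, ?)",
            (job["analysis_id"], "started"),
        )
```

Because only the Worker runs this code path, there is no second writer whose partial state would need to be reconciled after a rollback.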
