
ADR-08: SpecView Worker Binary Separation


| Date | Author | Repos |
| --- | --- | --- |
| 2026-01-13 | @specvital | worker |

Context

Design Intent Violation

The Specvital worker architecture follows a binary separation pattern (ADR-05) where each workload type runs as an independent process with dedicated configuration and dependencies.

The initial SpecView implementation violated this pattern by integrating SpecViewWorker into AnalyzerContainer, which caused several architectural problems.

Problems with Integrated Approach

| Problem | Impact |
| --- | --- |
| Secret Contamination | Analyzer required GEMINI_API_KEY despite not using Gemini |
| Queue Routing Failure | "Unhandled job kind" errors when jobs were routed to the wrong worker |
| Scaling Mismatch | CPU-bound parsing coupled with I/O-bound API workloads |
| Cost Unpredictability | Pay-per-token AI costs mixed with predictable parsing costs |

Workload Characteristic Asymmetry

| Concern | Analyzer | Spec-Generator |
| --- | --- | --- |
| External API | None | Gemini API |
| Secrets | ENCRYPTION_KEY | GEMINI_API_KEY |
| Scaling Profile | CPU-bound (parsing) | I/O-bound (API calls) |
| Cost Profile | Predictable (compute) | Variable (pay-per-token) |
| Timeout | Short (~30s) | Long (~10min) |
| Failure Mode | Memory exhaustion | Rate limiting, API errors |

The fundamental mismatch: test file parsing is a deterministic, local computation while spec generation is a non-deterministic, network-dependent AI task.

Decision

Separate AnalyzeWorker and SpecViewWorker into independent binaries with dedicated queues and configuration requirements.

Architecture

src/cmd/
├── analyzer/main.go       # Test file parsing (Tree-sitter, ENCRYPTION_KEY)
├── spec-generator/main.go # AI document generation (Gemini API, GEMINI_API_KEY)
├── scheduler/main.go      # Cron-based job scheduling
└── enqueue/main.go        # Manual enqueuing utility

River Queues:
├── analyze_repository     # Consumed by analyzer binary only
└── generate_spec_document # Consumed by spec-generator binary only

Binary Responsibilities

analyzer/main.go:

  • Consumes analyze_repository jobs from River queue
  • Clones repository, runs Tree-sitter parsing, extracts test metadata
  • Requires: DATABASE_URL, ENCRYPTION_KEY (for OAuth token decryption)
  • Does NOT require: GEMINI_API_KEY

spec-generator/main.go:

  • Consumes generate_spec_document jobs from River queue
  • Calls Gemini API for classification and conversion (ADR-14)
  • Requires: DATABASE_URL, GEMINI_API_KEY
  • Does NOT require: ENCRYPTION_KEY
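
The per-binary requirements above lend themselves to a fail-fast check at startup. The following is a minimal sketch using only the standard library; `requiredEnv` and `validateEnv` are hypothetical names chosen for illustration, not identifiers from the Specvital codebase.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// requiredEnv lists the variables each binary needs, as stated in the
// responsibilities above. The map itself is an illustrative assumption.
var requiredEnv = map[string][]string{
	"analyzer":       {"DATABASE_URL", "ENCRYPTION_KEY"},
	"spec-generator": {"DATABASE_URL", "GEMINI_API_KEY"},
}

// validateEnv returns an error naming every required variable that is
// unset or empty. lookup is injected so the check can be exercised
// without touching the real process environment.
func validateEnv(binary string, lookup func(string) (string, bool)) error {
	required, ok := requiredEnv[binary]
	if !ok {
		return fmt.Errorf("unknown binary %q", binary)
	}
	var missing []string
	for _, name := range required {
		if v, found := lookup(name); !found || v == "" {
			missing = append(missing, name)
		}
	}
	if len(missing) > 0 {
		return errors.New(binary + ": missing required env: " + strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	// A real cmd/<binary>/main.go would exit non-zero on error before
	// starting any queue client; this demo only prints the result.
	for _, binary := range []string{"analyzer", "spec-generator"} {
		if err := validateEnv(binary, os.LookupEnv); err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(binary + ": configuration OK")
	}
}
```

Because each binary checks only its own list, the analyzer starts without GEMINI_API_KEY and the spec-generator starts without ENCRYPTION_KEY.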

Queue Isolation

Each binary registers only its supported job kinds:

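```go
// analyzer/main.go
workers := river.NewWorkers() // river.AddWorker registers on a *river.Workers bundle
river.AddWorker(workers, &AnalyzeRepositoryWorker{})
// Only handles: analyze_repository

// spec-generator/main.go
workers := river.NewWorkers()
river.AddWorker(workers, &GenerateSpecDocumentWorker{})
// Only handles: generate_spec_document
```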
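
To illustrate why per-binary registration matters, here is a small standard-library model of kind-based dispatch. It is not River's API: `Registry`, `Handler`, and `Dispatch` are hypothetical stand-ins that only mimic the behavior of a worker process receiving a job kind it never registered.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnhandledKind models the "Unhandled job kind" failure from the Context section.
var ErrUnhandledKind = errors.New("unhandled job kind")

// Handler processes one job payload (hypothetical type, not part of River).
type Handler func(payload string) error

// Registry maps job kinds to handlers for a single binary.
type Registry struct{ handlers map[string]Handler }

func NewRegistry() *Registry { return &Registry{handlers: make(map[string]Handler)} }

// Add registers a handler for one job kind.
func (r *Registry) Add(kind string, h Handler) { r.handlers[kind] = h }

// Dispatch runs the handler for kind, or fails if this binary never registered it.
func (r *Registry) Dispatch(kind, payload string) error {
	h, ok := r.handlers[kind]
	if !ok {
		return fmt.Errorf("%w: %s", ErrUnhandledKind, kind)
	}
	return h(payload)
}

// newAnalyzerRegistry registers only the analyzer's job kind.
func newAnalyzerRegistry() *Registry {
	r := NewRegistry()
	r.Add("analyze_repository", func(payload string) error {
		fmt.Println("analyzer: parsing", payload)
		return nil
	})
	return r
}

// newSpecGeneratorRegistry registers only the spec-generator's job kind.
func newSpecGeneratorRegistry() *Registry {
	r := NewRegistry()
	r.Add("generate_spec_document", func(payload string) error {
		fmt.Println("spec-generator: generating document for", payload)
		return nil
	})
	return r
}

func main() {
	analyzer := newAnalyzerRegistry()
	fmt.Println(analyzer.Dispatch("analyze_repository", "repo-1"))
	// A job that reaches the wrong binary fails instead of being processed.
	fmt.Println(analyzer.Dispatch("generate_spec_document", "repo-1"))
}
```

In this model the error appears only when a job reaches the wrong process. Pairing each queue with exactly one consumer binary removes that possibility.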

Options Considered

Option A: Binary Separation (Selected)

Separate binaries (cmd/analyzer, cmd/spec-generator) with dedicated queues and configuration validation.

Pros:

  • Secret isolation - each binary only loads required secrets
  • Independent scaling - scale AI workloads separately from parsing
  • Cost attribution - clear separation of compute vs API costs
  • Failure isolation - Gemini rate limits don't affect test parsing
  • Queue clarity - each queue maps to exactly one consumer binary

Cons:

  • Two binaries to build, deploy, and monitor
  • Shared code must be extracted to internal packages
  • Configuration duplication for common settings

Option B: Single Binary with Runtime Mode

Single binary with --mode=analyzer or --mode=spec-generator flag.

Pros:

  • Single build artifact
  • Simpler CI/CD pipeline

Cons:

  • Binary includes all dependencies (Gemini SDK loaded even in analyzer mode)
  • Runtime misconfiguration risk
  • Required secrets depend on the selected mode, so validation becomes a conditional runtime check rather than a fixed per-binary startup check
  • Binary size bloat
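
For comparison, Option B's mode selection might look like the sketch below. `parseMode` is a hypothetical helper built on the standard `flag` package; it shows that the set of required secrets is only known after the flag is parsed, and that a mistyped mode is only caught when the process runs.

```go
package main

import (
	"flag"
	"fmt"
)

// parseMode extracts --mode from args and rejects unknown values.
// Hypothetical helper for illustration only.
func parseMode(args []string) (string, error) {
	fs := flag.NewFlagSet("worker", flag.ContinueOnError)
	mode := fs.String("mode", "", "analyzer or spec-generator")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	switch *mode {
	case "analyzer", "spec-generator":
		return *mode, nil
	default:
		return "", fmt.Errorf("invalid --mode %q", *mode)
	}
}

func main() {
	// Example invocations; a real binary would pass os.Args[1:].
	for _, args := range [][]string{
		{"--mode=analyzer"},
		{"--mode=spec-generator"},
		{"--mode=spec-generater"}, // typo: only detected at runtime
	} {
		mode, err := parseMode(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("running as", mode)
	}
}
```

With separate binaries there is no mode to mistype: the deployment either references the right executable or fails to start at all.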

Option C: Combined Process with Goroutines

Single process runs both workers as separate goroutines.

Pros:

  • Simplest deployment
  • Shared connection pools

Cons:

  • Secret exposure - every instance has both keys
  • Cannot scale independently
  • Resource contention between CPU-bound and I/O-bound tasks
  • Failure coupling
  • Violates ADR-05 pattern

Consequences

Positive

| Area | Benefit |
| --- | --- |
| Security | Analyzer never touches GEMINI_API_KEY; spec-generator never touches ENCRYPTION_KEY |
| Scaling | Scale spec-generator independently based on AI queue depth |
| Cost Visibility | Gemini API costs isolated to spec-generator service metrics |
| Reliability | Gemini outages don't affect test parsing pipeline |
| Timeout | Analyzer: 30s (fast fail), Spec-generator: 10min (AI tolerance) |
| PaaS Optimization | Different instance sizes per workload profile |
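
The per-binary timeouts in the table can be sketched with `context.WithTimeout`. The `Profile` type and the helpers below are assumptions made for illustration; how the real workers apply their timeouts is not specified in this ADR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Profile holds per-binary runtime limits (hypothetical type).
type Profile struct {
	Name    string
	Timeout time.Duration
}

// Values taken from the Timeout row above.
var (
	analyzerProfile      = Profile{Name: "analyzer", Timeout: 30 * time.Second}
	specGeneratorProfile = Profile{Name: "spec-generator", Timeout: 10 * time.Minute}
)

// runWithTimeout runs work under the profile's deadline and annotates
// the error when the deadline was the cause of failure.
func runWithTimeout(ctx context.Context, p Profile, work func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(ctx, p.Timeout)
	defer cancel()
	err := work(ctx)
	if errors.Is(err, context.DeadlineExceeded) {
		return fmt.Errorf("%s: job exceeded %s: %w", p.Name, p.Timeout, err)
	}
	return err
}

// simulate runs a fake job that needs workFor to finish, under p's timeout.
func simulate(p Profile, workFor time.Duration) error {
	return runWithTimeout(context.Background(), p, func(ctx context.Context) error {
		select {
		case <-time.After(workFor):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	})
}

func main() {
	// A shortened timeout so the demo finishes quickly.
	demo := Profile{Name: "demo", Timeout: 50 * time.Millisecond}
	fmt.Println(simulate(demo, 0))           // job finishes within the limit
	fmt.Println(simulate(demo, time.Second)) // job is cut off by the deadline
	fmt.Println(analyzerProfile.Timeout, specGeneratorProfile.Timeout)
}
```

Keeping the limit in a per-binary profile means the analyzer can fail fast without forcing the same ceiling onto long-running Gemini calls.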

Negative

| Area | Trade-off |
| --- | --- |
| Operational Complexity | Two services to monitor with separate health checks |
| Build Pipeline | Two Docker images to build and push |
| Shared Code | Must extract common utilities to internal packages |
| Debugging | Cross-service tracing for related jobs |
