
ADR-28: Dynamic Cost Estimation


| Date | Author | Repos |
| --- | --- | --- |
| 2026-02-03 | @KubrickCode | web |

Context

The Problem

The spec generation modal displays estimated usage costs before users trigger AI-powered test behavior generation. Per ADR-13: Billing and Quota Architecture, users have limited monthly quotas and cache hits do not consume quota.

However, the estimated usage remained constant regardless of cache option selection:

| User Selection | Expected Behavior | Actual Behavior |
| --- | --- | --- |
| Use Cache | Show reduced estimate (cache hits) | Showed same as fresh |
| Fresh Analysis | Show full test count | Showed same as cache |

This occurred because:

  1. Estimates were hardcoded to current_test_count
  2. No mechanism existed to predict cache hit rates
  3. The Worker's cache lookup algorithm was not accessible to the Web layer

Technical Challenge

The Worker service uses SHA256(normalize(test_name)) as the actual cache lookup key (Worker ADR-12). Replicating this algorithm in the Web layer would require:

  • Sharing normalization logic across repositories
  • Frontend computing SHA256 hashes
  • Maintaining synchronization when Worker's algorithm changes
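To make the coupling concrete, here is a rough sketch of what a Web-side copy of the key derivation might look like. The Worker's real normalize() is not shown in this ADR, so `normalizeTestName` below is a purely hypothetical stand-in (trim, collapse whitespace, lowercase), and `cacheKey` is an illustrative name. The sketch uses Node's built-in `crypto` module; a browser build would need Web Crypto or a hashing library instead.

```typescript
import { createHash } from "node:crypto";

// ASSUMPTION: stand-in for the Worker's normalize(); the real rules are defined
// in the Worker (Worker ADR-12) and may differ from this.
function normalizeTestName(testName: string): string {
  return testName.trim().replace(/\s+/g, " ").toLowerCase();
}

// Hex-encoded SHA-256 of the normalized test name.
function cacheKey(testName: string): string {
  return createHash("sha256")
    .update(normalizeTestName(testName), "utf8")
    .digest("hex");
}

// Two spellings that this stand-in normalization treats as the same test.
const keyA = cacheKey("  Should   Login ");
const keyB = cacheKey("should login");
console.log(keyA === keyB);
```

Any change to the Worker's normalization rules would have to be mirrored in a function like `normalizeTestName`, which is the synchronization cost listed above.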

Decision

Implement client-side cache hit prediction using name+path matching against previous spec behaviors.

Algorithm

For each test in current analysis:
  1. Find matching test in previous spec_document.behaviors
     Match criteria: test.name == behavior.test_name
                 AND test.file_path == behavior.file_path
  2. If match found: predict cache hit
  3. If no match: predict cache miss

estimated_cost = current_tests - predicted_cache_hits
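
The matching steps above can be sketched in TypeScript as follows. This is a minimal illustration, not the shipped implementation: the `TestCase`, `PreviousBehavior`, and `CacheEstimate` shapes and the `predictCacheHits` name are assumptions based on the field names in the pseudocode.

```typescript
// Hypothetical shapes; field names mirror the pseudocode, not a confirmed schema.
interface TestCase {
  name: string;
  filePath: string;
}

interface PreviousBehavior {
  testName: string;
  filePath: string;
}

interface CacheEstimate {
  currentTests: number;
  predictedCacheHits: number;
  estimatedCost: number;
}

// Composite key for exact name+path matching. Serializing a tuple avoids the
// collisions a plain delimiter could cause.
const matchKey = (name: string, filePath: string): string =>
  JSON.stringify([name, filePath]);

function predictCacheHits(
  currentTests: TestCase[],
  previousBehaviors: PreviousBehavior[],
): CacheEstimate {
  const previousKeys = new Set(
    previousBehaviors.map((b) => matchKey(b.testName, b.filePath)),
  );
  // A test is a predicted hit only when both name and file path match.
  const predictedCacheHits = currentTests.filter((t) =>
    previousKeys.has(matchKey(t.name, t.filePath)),
  ).length;
  return {
    currentTests: currentTests.length,
    predictedCacheHits,
    estimatedCost: currentTests.length - predictedCacheHits,
  };
}

const estimate = predictCacheHits(
  [
    { name: "logs in", filePath: "auth/login.test.ts" },
    { name: "logs out", filePath: "auth/logout.test.ts" },
    { name: "resets password", filePath: "auth/reset.test.ts" },
  ],
  [
    { testName: "logs in", filePath: "auth/login.test.ts" },
    // Same name, different path: counted as a miss by this matcher.
    { testName: "logs out", filePath: "auth/session.test.ts" },
  ],
);
console.log(estimate);
```

In this example only "logs in" matches on both fields, so one hit is predicted and the estimated cost is 2 of 3 tests. With an empty list of previous behaviors the function returns the full test count.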

Design Choices

| Choice | Rationale |
| --- | --- |
| Approximate matching | ~99% accuracy with simple implementation |
| Client-side prediction | No API round-trip; instant modal rendering |
| Disclaimer tooltip | "Actual usage may differ from estimate" |
| Graceful degradation | No previous analysis → show full test count |

Options Considered

Option A: Name+Path Prediction (Selected)

Match tests by name + file_path comparison against previous spec document behaviors.

Pros:

  • Zero latency: computed client-side from existing data
  • No API dependency: works offline
  • Simple implementation: string comparison only
  • Maintainable: independent of Worker normalization changes
  • High accuracy: ~99% correlation with actual cache hits

Cons:

  • Prediction mismatch: edge cases where name+path matches but normalized hash differs
  • False negatives: same logical test with path change appears as miss
  • No cross-repo benefit: cannot predict cache hits from other repositories

Option B: Replicate Worker's SHA256 Hash Algorithm

Implement identical normalization and hashing in Web frontend.

Pros:

  • 100% accuracy with actual cache behavior
  • Cross-repo prediction theoretically possible

Cons:

  • Code duplication: normalization logic in Worker (Go) and Web (TypeScript)
  • Sync maintenance: algorithm changes require coordinated deployments
  • Bundle size: SHA256 library adds to frontend
  • Complexity: regex-based normalization prone to drift
  • Test burden: must test parity across languages

Rejected: Coupling cost outweighs accuracy benefit.

Option C: Always Show Worst-Case Estimate

Display maximum possible cost (all tests = cache miss).

Pros:

  • Zero prediction risk
  • Simplest implementation

Cons:

  • Poor UX: cache option appears valueless
  • Conversion impact: users skip beneficial cache mode
  • Trust erosion: estimates always wrong in cached scenarios

Rejected: Defeats purpose of cache transparency.

Option D: Call Worker Service for Prediction

Add API endpoint that runs actual cache lookup without consuming quota.

Pros:

  • 100% accuracy
  • Single source of truth

Cons:

  • Additional latency: network delays modal rendering
  • API complexity: new endpoint to maintain
  • Failure mode: modal broken when Worker unavailable
  • Rate limiting: could be abused to probe cache state

Rejected: Over-engineering for prediction feature.

Implementation Details

API Endpoint

GET /api/spec-view/{analysisId}/cache-prediction?language={language}
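
As a small illustration of how a caller might build this request path, the helper below fills in the two parameters. The function name `cachePredictionPath` and the sample values are hypothetical; only the path template comes from this ADR.

```typescript
// Builds the request path for the endpoint above. Both values are percent-encoded
// so unexpected characters cannot alter the path or the query string.
function cachePredictionPath(analysisId: string, language: string): string {
  const id = encodeURIComponent(analysisId);
  const lang = encodeURIComponent(language);
  return `/api/spec-view/${id}/cache-prediction?language=${lang}`;
}

// Hypothetical sample values.
const path = cachePredictionPath("abc123", "ko");
console.log(path);
```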

Data Structure

typescript
interface CachePrediction {
  totalBehaviors: number;
  cacheableCount: number;
  estimatedCost: number;
}

SQL Query Structure

sql
WITH
  previous_spec AS (
    -- Find most recent spec for same codebase/language
    SELECT id FROM spec_documents
    WHERE ... ORDER BY created_at DESC LIMIT 1
  ),
  previous_behaviors AS (
    -- Extract behaviors with test names and file paths
    SELECT test_name, file_path FROM spec_behaviors
    WHERE spec_document_id = (SELECT id FROM previous_spec)
  ),
  current_tests AS (
    -- Get all test cases from current analysis
    SELECT name, file_path FROM test_cases
    WHERE analysis_id = $1
  ),
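  matched_behaviors AS (
    -- Count current tests whose name AND file_path match a previous behavior
    SELECT COUNT(*) AS cacheable_count FROM current_tests c
    JOIN previous_behaviors p
      ON c.name = p.test_name AND c.file_path = p.file_path
  )
SELECT
  (SELECT COUNT(*) FROM current_tests) AS total_behaviors,
  (SELECT cacheable_count FROM matched_behaviors) AS cacheable_count;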
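
One way the query result could be mapped onto the `CachePrediction` shape is sketched below. The row type and the `toCachePrediction` name are assumptions, and the cost is derived with the same subtraction as the algorithm's `estimated_cost` formula; the actual handler may differ.

```typescript
interface CachePrediction {
  totalBehaviors: number;
  cacheableCount: number;
  estimatedCost: number;
}

// Hypothetical row shape: snake_case columns as named in the final SELECT.
interface CachePredictionRow {
  total_behaviors: number;
  cacheable_count: number;
}

function toCachePrediction(row: CachePredictionRow): CachePrediction {
  // Clamp defensively: duplicate rows in the join could otherwise push the
  // match count above the number of current tests.
  const cacheableCount = Math.min(row.cacheable_count, row.total_behaviors);
  return {
    totalBehaviors: row.total_behaviors,
    cacheableCount,
    estimatedCost: row.total_behaviors - cacheableCount,
  };
}

const mapped = toCachePrediction({ total_behaviors: 120, cacheable_count: 90 });
console.log(mapped);
```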

Frontend Integration

typescript
const dynamicEstimatedCost = forceRegenerate
  ? estimatedCost
  : (cachePrediction?.estimatedCost ?? estimatedCost);
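
The fallback behavior of this expression is easier to see as a standalone function. The sketch below wraps the same logic in a hypothetical `resolveEstimatedCost` helper; the sample numbers are illustrative only.

```typescript
interface CachePrediction {
  totalBehaviors: number;
  cacheableCount: number;
  estimatedCost: number;
}

function resolveEstimatedCost(
  forceRegenerate: boolean,
  estimatedCost: number,
  cachePrediction?: CachePrediction | null,
): number {
  // Fresh analysis ignores the cache, so the full estimate applies.
  if (forceRegenerate) return estimatedCost;
  // `??` (not `||`) keeps a legitimate prediction of 0 instead of falling back.
  return cachePrediction?.estimatedCost ?? estimatedCost;
}

const prediction: CachePrediction = {
  totalBehaviors: 120,
  cacheableCount: 90,
  estimatedCost: 30,
};

const withCache = resolveEstimatedCost(false, 120, prediction);
const fresh = resolveEstimatedCost(true, 120, prediction);
const noPrediction = resolveEstimatedCost(false, 120, undefined);
console.log(withCache, fresh, noPrediction);
```

With these inputs the cached path shows 30, while both the fresh path and the missing-prediction path fall back to the full estimate of 120.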

Consequences

Positive

  • User Experience: Estimates reflect cache benefit, enabling informed decisions
  • Cost Transparency: Users see approximate savings before committing quota
  • Implementation Speed: Simple approach shipped without cross-repo coordination
  • Resilience: No external dependencies; works when Worker unavailable
  • Maintainability: Web layer isolated from Worker's cache key evolution

Negative

  • Prediction Accuracy: ~1% of cases may show incorrect estimate
  • Algorithm Divergence: Intentional mismatch with Worker's actual algorithm
  • Limited Scope: Cannot predict cross-repository cache hits
  • User Expectation: Requires disclaimer to manage estimate != actual

Edge Cases

| Scenario | Prediction | Reality | Impact |
| --- | --- | --- | --- |
| Test renamed, same path | Cache miss | Cache miss | Correct |
| Test moved, same name | Cache miss | Possible cache hit | Under-predicts savings |
| Test copied to new file | Cache miss | Cache hit | Under-predicts savings |
| Whitespace-only name change | Cache hit | Cache hit | Correct |

Edge cases result in under-prediction of savings (higher cost shown), which is safe for billing context.
