ADR-0017: Eager per-submission aggregation into a per-queue record
ACCEPTED
Context
The queue detail page shows per-field aggregates — mean/std-dev for numeric fields, value-count histograms for choice fields. The data is already shaped (Annotation.data keyed by the ADR-0015 schema fields) and apps.evaluations already has an aggregate_field helper that dispatches on value type. Two questions: when to compute (per submission / per render / scheduled) and where to store it. Dogfood queues run to hundreds of items with a handful of fields each — cheap enough to recompute eagerly, large enough that the detail page shouldn't scan all submissions per load.
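As a rough illustration of the type dispatch described above, the sketch below shows one way a helper like aggregate_field could behave. The signature, the shape of the returned dict, and the use of population standard deviation are assumptions made for illustration; the actual helper in apps.evaluations may differ. It also assumes choice values are hashable scalars.

```python
from collections import Counter
from statistics import fmean, pstdev


def aggregate_field(values):
    """Hypothetical sketch: summarise one field's submitted values."""
    # None means "no answer", so it is dropped rather than counted as zero.
    present = [v for v in values if v is not None]
    if not present:
        return {"kind": "empty", "count": 0}
    # bool is a subclass of int in Python, so keep it out of the numeric branch.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if numeric:
        return {
            "kind": "numeric",
            "count": len(present),
            "mean": fmean(present),
            "std_dev": pstdev(present),  # population std-dev (assumption)
        }
    # Everything else is treated as a choice value and counted per value.
    return {
        "kind": "choice",
        "count": len(present),
        "histogram": dict(Counter(present)),
    }
```

Calling this sketch with a numeric list returns a mean and standard deviation, while a list of choice labels returns a value-count histogram.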
Decision
Recompute eagerly on every submitted-annotation save, persisted in a denormalised AnnotationQueueAggregate:
- One row per queue: a OneToOneField to AnnotationQueue with a single aggregates JSONField, read via select_related("aggregate"). No aggregation happens on the detail-page read path.
- Compute once per submission, outside the parent transaction. After the locked block that updates review_count and item status (ADR-0016), the aggregator runs wrapped in a logging try/except, because an aggregation failure must never roll back a reviewer's submission. Same resilience stance as the score writer (ADR-0013).
- Per item, prefer the authoritative annotation, else all submitted. Once an item is resolved, the authoritative annotation (ADR-0016) is the single contributor; pre-resolution every submission counts, so the admin sees the spread.
- Text fields are excluded — free-text has no meaningful aggregate and would flood the result with unique values.
- Edits recompute too (update_or_create keyed on queue), so a revised answer never leaves stale numbers.
- None and missing keys are skipped, not zero-filled; explicit numeric zeros are kept.
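The contributor and skipping rules above can be sketched in plain Python. This is a minimal illustration, not the project's aggregator: the function name, the in-memory annotation shape, and the "authoritative" flag are hypothetical stand-ins for whatever ADR-0016 actually defines.

```python
def collect_field_values(items, aggregated_fields):
    """Hypothetical sketch: gather per-field values for one queue.

    `items` is a list of items, each a list of submitted annotations shaped
    like {"authoritative": bool, "data": {field_name: value}} (assumed shape).
    `aggregated_fields` names the numeric/choice fields; text fields are
    excluded simply by not being listed.
    """
    values = {name: [] for name in aggregated_fields}
    for annotations in items:
        authoritative = [a for a in annotations if a.get("authoritative")]
        # Resolved item: only the authoritative annotation contributes.
        # Unresolved item: every submitted annotation contributes.
        contributors = authoritative[:1] or annotations
        for annotation in contributors:
            for name in aggregated_fields:
                value = annotation["data"].get(name)
                # Missing keys and None are skipped; an explicit 0 is kept.
                if value is not None:
                    values[name].append(value)
    return values
```

Each resulting list can then be passed to the per-field aggregation helper, and the combined result written to the queue's single aggregate row.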
Consequences
- Positive: The detail page reads one cached JSONField, with no hot-path aggregation, no N+1, and no Celery roundtrip; renders are independent of queue size.
- Positive: Recomputing on every submitted save (not just new rows) keeps the aggregate consistent with reviewer edits, for the same reason as ADR-0013.
- Positive: Swallowing aggregation errors outside the parent transaction means a future bad value type can't corrupt the submission or the UI; the aggregate retries on the next save.
- Negative: Recompute is O(items × annotations) per save, which is sub-millisecond at dogfood scale but adds visible per-submission latency for a ~10,000-item queue. Incremental aggregation would need a structural change (the single JSONField holds no per-item state).
- Negative: A swallowed failure is visible only in the log, so admins see a stale aggregate with no UI warning and must rebuild manually.
- Negative: Free-text fields don't appear on the aggregate panel; reviewer comments at scale need the CSV/JSONL export.
- Negative: Any path that creates a submitted annotation outside the normal save flow (data migrations, shell) bypasses the recompute; the aggregator is callable directly but must be remembered.
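The failure behaviour described in these consequences can be illustrated with a small in-memory sketch. It assumes a dict-based store in place of the database and a hypothetical recompute callable; the real flow uses the locked transactional block and the AnnotationQueueAggregate row, and the boolean return value is added here only to make the outcome observable.

```python
import logging

logger = logging.getLogger("annotation_aggregates")


def submit_annotation(store, queue_id, annotation, recompute):
    """Hypothetical sketch of a save flow with a non-fatal aggregation step."""
    # Step 1: the submission itself. In the real flow this is the locked,
    # transactional block; here it is a plain list append.
    store["submissions"].setdefault(queue_id, []).append(annotation)

    # Step 2: aggregation, only after the submission is stored. Any error is
    # logged and swallowed, so the previous aggregate stays in place until a
    # later save or a manual rebuild succeeds.
    try:
        store["aggregates"][queue_id] = recompute(store["submissions"][queue_id])
    except Exception:
        logger.exception("Aggregate recompute failed for queue %s", queue_id)
        return False
    return True
```

In this sketch a failing recompute leaves the new submission stored and the old aggregate untouched, which mirrors the stale-but-safe state described above. Code paths that bypass the save flow (migrations, shell sessions) would need to call the recompute step themselves.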
Alternatives considered
- Compute on read: rejected because it needs a per-render scan of every submission, so page latency grows with queue size on the primary admin surface.
- Celery task via signal: rejected — operational overhead (queue, worker, retry) and harder-to-trace indirection for a sub-millisecond computation; the call site is one function if write latency ever forces a move.
- Store aggregates on AnnotationQueue: rejected, since the queue is fetched on the list page and would carry the payload everywhere; a separate OneToOneField keeps the row lean.
- Top-N for text fields: rejected for v1, since free-text rarely clusters into countable values; export covers inspection.
- Aggregate inside the parent transaction: rejected — a failure would roll back the reviewer's submission, trading a display issue for lost work (same trade-off as ADR-0013).