distant_pairs (knowledge/pairs) ~7.85s at prod scale — exceeds Theme 2 <2s target (bounded-by-design outlier) #202

Description

@mkreyman

Summary

GET /api/v1/knowledge/pairs (distant_pairs) consistently returns in ~7.85–7.97s at production scale (Second Brain, 76,107 published articles). This is the slowest knowledge endpoint by an order of magnitude and exceeds Epic 27 Theme 2's stated acceptance bar — "All vector endpoints return <2s at prod scale."

It is not the unbounded full-scan failure mode Theme 2 was created to kill (that class is fixed: search ~0.56s, combined ~0.69s, suggested_links ~0.28s, novelty ~0.63s, all live). distant_pairs is correct and bounded by design (@max_pair_candidates 1000 + statement_timeout), and US-27.7b's scale gate asserts index-correctness, not wall-clock. So this is a latency-vs-target gap on a deliberately heavy endpoint, filed as the explicit Theme 2 follow-up rather than left implicit.

Evidence (live, deployed build — loopctl.com)

GET /api/v1/knowledge/pairs?limit=10&project_id=<second-brain>
  run 1: HTTP=200  time=7.971622s
  run 2: HTTP=200  time=7.880950s
  run 3: HTTP=200  time=7.849783s

Same-corpus comparison (every other endpoint <1s):

endpoint                                   latency
knowledge/search (semantic, limit 10)      0.56s
knowledge/search (combined, limit 10)      0.69s
knowledge/articles/:id/suggested_links     0.28s
knowledge/novelty (POST)                   0.63s
knowledge/pairs (distant_pairs)            ~7.85s

Corpus at time of measurement: total_count = 77051 (meta), ~76k published.
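
To put the gap in numbers, here is a small sketch that recomputes the ratios from the measurements above. Only the timings and the 2s target come from this issue; the function and key names are illustrative.

```python
from statistics import median

# Measurements copied from this issue (seconds).
PAIRS_RUNS = [7.971622, 7.880950, 7.849783]
OTHER_ENDPOINTS = {
    "search (semantic)": 0.56,
    "search (combined)": 0.69,
    "suggested_links": 0.28,
    "novelty": 0.63,
}
THEME_2_TARGET_S = 2.0


def summarize(pairs_runs, others, target_s):
    """Compare the median distant_pairs latency with the target and with peers."""
    med = median(pairs_runs)
    return {
        "median_s": med,
        "over_target_x": med / target_s,
        "vs_slowest_other_x": med / max(others.values()),
        "vs_fastest_other_x": med / min(others.values()),
    }


summary = summarize(PAIRS_RUNS, OTHER_ENDPOINTS, THEME_2_TARGET_S)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

On these numbers the median run is roughly 4x the Theme 2 bar and more than 11x the next-slowest endpoint (combined search), which is what "an order of magnitude" refers to in the summary.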

Why it's slow (current bound)

distant_pairs samples up to @max_pair_candidates (1000) embedded published articles and computes pairwise cosine distances over that sample (lib/loopctl/knowledge.ex ~L148–151, do_distant_pairs/7). That sampling keeps the inner scan bounded (no full-corpus pairwise blowup) and index-correct, but with 1000 candidates the pairwise work still lands near the per-endpoint statement_timeout, well above the 2s Theme 2 bar.
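
The cost shape can be sketched in a few lines of Python. This illustrates sample-then-score-all-pairs only; it is not the do_distant_pairs/7 implementation. It assumes each unordered pair is scored once and that embeddings are non-zero vectors, and the function names and the in-memory dict are hypothetical.

```python
import math
import random
from itertools import combinations

MAX_PAIR_CANDIDATES = 1000  # mirrors @max_pair_candidates from this issue


def pair_count(n):
    """Number of unordered pairs among n candidates: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def distant_pairs(embeddings, limit=10, cap=MAX_PAIR_CANDIDATES, seed=0):
    """Sample up to `cap` ids, score every unordered pair, keep the most distant."""
    ids = sorted(embeddings)
    rng = random.Random(seed)
    sample = ids if len(ids) <= cap else rng.sample(ids, cap)
    scored = [
        (cosine_distance(embeddings[i], embeddings[j]), i, j)
        for i, j in combinations(sample, 2)
    ]
    scored.sort(reverse=True)  # largest distance first
    return scored[:limit]


print(pair_count(MAX_PAIR_CANDIDATES))  # 499500
demo = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
print(distant_pairs(demo, limit=1))
```

Under that assumption, the pair count grows quadratically with the cap: halving the cap to 500 cuts the pairs from 499,500 to 124,750, about a quarter of the work.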

Acceptance criteria

  • Decide the intended contract for distant_pairs: either (a) bring it under the Theme 2 <2s bar at prod scale, or (b) explicitly exempt it with a documented, higher latency budget and a scale-gate assertion that enforces that budget (so it can't silently regress further).
  • If (a): reduce the default @max_pair_candidates and/or move the pairwise distance into an index-assisted/sampled-kNN shape; re-measure < 2s at 76k.
  • If (b): document the budget in the endpoint's OpenAPI description + the Theme 2 notes, and add a per-endpoint latency assertion to the US-27.8 scale gate pinned to the agreed budget.
  • meta for distant_pairs surfaces the effective candidate cap so callers understand the sampling.
  • Live re-verification on the deployed build at prod scale (per Theme 1 discipline), not just the test DB.
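
As a rough sketch of what option (b) and the meta criterion could look like, the snippet below checks the median of several timed runs against a budget and builds a meta payload that exposes the effective cap. The 10s budget, the field names, and the embedded-article count passed in are all placeholders for illustration; none of them is taken from the codebase or agreed anywhere in this issue.

```python
from statistics import median


def check_latency_budget(samples_s, budget_s):
    """Return (ok, median_s); ok is True when the median is within budget."""
    if not samples_s:
        raise ValueError("need at least one latency sample")
    med = median(samples_s)
    return med <= budget_s, med


def pairs_meta(total_embedded, candidate_cap):
    """Hypothetical meta payload exposing the effective candidate cap."""
    used = min(total_embedded, candidate_cap)
    return {
        "candidate_cap": candidate_cap,
        "candidates_used": used,
        "sampled": total_embedded > candidate_cap,
    }


runs = [7.971622, 7.880950, 7.849783]    # the three runs measured in this issue
print(check_latency_budget(runs, 2.0))   # Theme 2 bar
print(check_latency_budget(runs, 10.0))  # placeholder budget, not an agreed value
print(pairs_meta(76107, 1000))           # count used here only as an example input
```

Gating on the median rather than on a single run keeps one slow request from failing the check, while a sustained regression past the budget still trips it.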

Notes

Refs: #175 (Epic 27 · Theme 2), US-27.7b (#193), US-27.8 (#198).
