XY-845: [ELF benchmark suite] Add retrieval quality and hierarchical routing cases#138

Merged
yvette-carlisle merged 1 commit into main from y/elf-xy-845 on Jun 9, 2026

Conversation

yvette-carlisle commented Jun 9, 2026

Member

Summary

  • add real-world retrieval fixtures for alternate phrasing, distractors, multi-hop routing, current-vs-obsolete selection, minimal sufficient context, and stage-level explainability
  • extend real_world_job reports with expected evidence recall, irrelevant context ratio, latency/cost, and trace-stage attribution while preserving memory-evolution and work-resume report counters
  • add the cargo make real-world-memory-retrieval task, plus benchmark docs/spec routing that keeps qmd/OpenViking reference-only unless adapters run
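
For readers unfamiliar with the two retrieval metrics named above, here is a minimal sketch of how they could be computed. The PR text does not define them, so the definitions are assumptions: expected evidence recall is taken as the share of expected evidence items that appear in the retrieved context, and irrelevant context ratio as the share of retrieved items that are not expected evidence. The function names and string ids are hypothetical and are not taken from elf-eval.

```rust
use std::collections::HashSet;

/// Assumed definition: fraction of expected evidence ids found in the
/// retrieved context. Returns 1.0 when nothing was expected (a convention
/// chosen for this sketch, not confirmed by the PR).
fn expected_evidence_recall(expected: &[&str], retrieved: &[&str]) -> f64 {
    if expected.is_empty() {
        return 1.0;
    }
    let expected_set: HashSet<&str> = expected.iter().copied().collect();
    let retrieved_set: HashSet<&str> = retrieved.iter().copied().collect();
    let hits = expected_set
        .iter()
        .filter(|id| retrieved_set.contains(*id))
        .count();
    hits as f64 / expected_set.len() as f64
}

/// Assumed definition: fraction of retrieved items that are not expected
/// evidence. Returns 0.0 for an empty retrieval (also a sketch convention).
fn irrelevant_context_ratio(expected: &[&str], retrieved: &[&str]) -> f64 {
    if retrieved.is_empty() {
        return 0.0;
    }
    let expected_set: HashSet<&str> = expected.iter().copied().collect();
    let irrelevant = retrieved
        .iter()
        .filter(|id| !expected_set.contains(*id))
        .count();
    irrelevant as f64 / retrieved.len() as f64
}

fn main() {
    // Hypothetical fixture: two expected evidence items, one retrieved
    // alongside three distractors.
    let expected = ["e1", "e2"];
    let retrieved = ["e1", "d1", "d2", "d3"];
    println!("recall = {}", expected_evidence_recall(&expected, &retrieved));
    println!("irrelevant = {}", irrelevant_context_ratio(&expected, &retrieved));
}
```

Under these assumed definitions, a distractor-heavy case lowers neither metric's denominator: recall is normalised by the expected set, while the irrelevant ratio is normalised by the retrieved set, so the two can move independently.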

Validation

  • cargo test -p elf-eval --test real_world_job_benchmark
  • cargo make real-world-memory
  • cargo make real-world-memory-retrieval
  • cargo make fmt
  • cargo make lint-fix
  • cargo make checks
  • ELF_BASELINE_PROJECTS=ELF cargo make baseline-live-docker

yvette-carlisle force-pushed the y/elf-xy-845 branch 3 times, most recently from cae7e14 to 1231baa on June 9, 2026 16:01
…hmark cases and report metrics","authority":"XY-845"}
yvette-carlisle merged commit c1ea4f4 into main on Jun 9, 2026
10 checks passed
yvette-carlisle deleted the y/elf-xy-845 branch on June 9, 2026 16:09
