XY-928: [ELF benchmark P1] Add OpenViking context-trajectory and hierarchy benchmark #188

Merged: yvette-carlisle merged 1 commit into main from y/elf-xy-928 on Jun 11, 2026
@yvette-carlisle

Summary

  • Add blocked OpenViking context_trajectory fixtures for staged retrieval, hierarchy selection, and recursive/context expansion.
  • Extend OpenViking live-baseline output with expected, matched, and missing evidence ids, and keep trajectory scoring blocked until same-corpus evidence ids match.
  • Update benchmark specs, reports, manifest counts, and claim-boundary tests so no ELF trajectory win/tie/loss is claimed without comparable artifacts.
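The evidence-id gate described in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `EvidenceReport` struct, its field names, the `evidence_report` function, and the sample ids are all hypothetical, and treating an empty expected set as blocked is an assumption made here rather than something the summary states.

```rust
use std::collections::BTreeSet;

/// Hypothetical report shape; field names are illustrative only.
#[derive(Debug)]
struct EvidenceReport {
    expected: BTreeSet<String>,
    matched: BTreeSet<String>,
    missing: BTreeSet<String>,
    /// True while trajectory scoring must stay blocked.
    blocked: bool,
}

/// Compare the expected evidence ids against the ids a baseline returned.
fn evidence_report(expected: &[&str], retrieved: &[&str]) -> EvidenceReport {
    let expected: BTreeSet<String> = expected.iter().map(|s| s.to_string()).collect();
    let retrieved: BTreeSet<String> = retrieved.iter().map(|s| s.to_string()).collect();

    // Expected ids that the baseline actually returned.
    let matched: BTreeSet<String> = expected.intersection(&retrieved).cloned().collect();
    // Expected ids that the baseline did not return.
    let missing: BTreeSet<String> = expected.difference(&retrieved).cloned().collect();

    // Assumption: stay blocked if any expected id is missing, or if there is
    // nothing to compare against at all.
    let blocked = expected.is_empty() || !missing.is_empty();

    EvidenceReport { expected, matched, missing, blocked }
}

fn main() {
    let report = evidence_report(&["ev-1", "ev-2"], &["ev-2", "ev-9"]);
    println!("{:?}", report);
}
```

Under these assumptions a run is only eligible for a win/tie/loss comparison once `missing` is empty, which mirrors the claim boundary described above.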

Verification

  • bash -n scripts/live-baseline-benchmark.sh
  • cargo test -p elf-eval --test real_world_job_benchmark --all-features context_trajectory_fixtures_report_blocked_openviking_gates
  • cargo make real-world-memory
  • cargo make fmt
  • cargo make lint-fix
  • cargo make checks

…tory benchmark coverage","authority":"XY-928"}
@yvette-carlisle merged commit 548cf24 into main on Jun 11, 2026
13 checks passed
@yvette-carlisle deleted the y/elf-xy-928 branch on June 11, 2026 17:12