
XY-883: [ELF benchmark P2] Repair first-generation OSS adapter real-world coverage #170

Merged
yvette-carlisle merged 1 commit into main from y/elf-xy-883 on Jun 10, 2026

Conversation

@yvette-carlisle

Member

Summary

  • isolate qmd, memsearch, and mem0 live-baseline corpus mutation with per-adapter corpus copies
  • harden claude-mem live-baseline coverage with durable SQLite lifecycle, reopen, and detail/source hydration checks
  • update the external adapter manifest, report assertions, and benchmark runbook so local OSS behavior stays distinct from hosted or not-encoded claims
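
As an illustration of the first bullet, here is a minimal sketch of per-adapter corpus isolation. It is not the code from `scripts/live-baseline-benchmark.sh`; the helper name `copy_corpus_for_adapter`, the directory layout, and the sample file are assumptions made for this example. Only the adapter names come from the summary above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the PR): give one adapter its own throwaway
# copy of a shared corpus so its writes cannot leak into another adapter's run.
copy_corpus_for_adapter() {
  local source_dir="$1" work_root="$2" adapter="$3"
  local target_dir="${work_root}/${adapter}/corpus"
  rm -rf "${target_dir}"
  mkdir -p "${target_dir}"
  # Copy the directory contents (including dotfiles), not the directory itself.
  cp -R "${source_dir}/." "${target_dir}/"
  printf '%s\n' "${target_dir}"
}

# Demo with temporary directories and a one-file corpus (illustrative only).
demo_root="$(mktemp -d)"
mkdir -p "${demo_root}/source"
printf 'original\n' > "${demo_root}/source/note.md"

for adapter in qmd memsearch mem0; do
  copy_corpus_for_adapter "${demo_root}/source" "${demo_root}/work" "${adapter}" >/dev/null
done

# Simulate one adapter mutating its corpus during a run.
printf 'mutated by qmd\n' >> "${demo_root}/work/qmd/corpus/note.md"
```

After the simulated mutation, the source corpus and the `memsearch` and `mem0` copies are unchanged, so the order in which adapters run does not affect what each one sees.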

Verification

  • bash -n scripts/live-baseline-benchmark.sh
  • jq . apps/elf-eval/fixtures/real_world_external_adapters/memory_projects_manifest.json >/dev/null
  • cargo test -p elf-eval --test real_world_job_benchmark real_world_report_includes_external_adapter_coverage_manifest --all-features
  • cargo make real-world-memory
  • ELF_BASELINE_PROJECTS=qmd,memsearch,mem0,claude-mem cargo make baseline-live-docker
  • ELF_BASELINE_PROJECTS=claude-mem cargo make baseline-live-docker
  • cargo make fmt
  • cargo make lint-fix
  • cargo make checks

…dapter benchmark coverage","authority":"XY-883"}
@yvette-carlisle yvette-carlisle merged commit dfae7e5 into main Jun 10, 2026
13 checks passed
@yvette-carlisle yvette-carlisle deleted the y/elf-xy-883 branch June 10, 2026 14:31
