XY-898: [ELF benchmark vNext P3] Promote first-generation OSS memory baselines into real-world adapters #179

Merged: yvette-carlisle merged 7 commits into main from y/elf-xy-898 on Jun 11, 2026

yvette-carlisle (Member) commented Jun 11, 2026

Summary

  • Adds first-generation OSS adapter scenario judgments with typed statuses and ELF position summaries (wins, ties, loses, untested).
  • Updates agentmemory, mem0/OpenMemory local OSS, memsearch, and claude-mem evidence so that basic baseline smoke results are kept separate from real-world, history, UI, and continuity claims.
  • Refreshes benchmark reports, measurement audit artifacts, and strength matrix wording around the 2026-06-11 scoped baseline evidence.
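
As a rough illustration of the first bullet, typed scenario statuses could be rolled up into an ELF position summary along the lines below. This is only a sketch under stated assumptions: `ScenarioStatus`, `PositionSummary`, `rank`, and `summarize` are hypothetical names, and the pass/fail ranking rule is invented for the example. None of it is taken from the `elf-eval` code in this PR; only the four bucket names (wins, ties, loses, untested) come from the summary above.

```rust
use std::cmp::Ordering;

/// Hypothetical typed status for one adapter scenario judgment.
/// The real status set in elf-eval may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScenarioStatus {
    Pass,
    Fail,
    /// The scenario was never exercised for this adapter.
    Untested,
}

/// Hypothetical roll-up of ELF's position against one competitor.
#[derive(Debug, Default, PartialEq, Eq)]
struct PositionSummary {
    wins: usize,
    ties: usize,
    loses: usize,
    untested: usize,
}

/// Assumed ranking: Pass outranks Fail; Untested has no rank at all.
fn rank(status: ScenarioStatus) -> Option<u8> {
    match status {
        ScenarioStatus::Pass => Some(1),
        ScenarioStatus::Fail => Some(0),
        ScenarioStatus::Untested => None,
    }
}

/// Each pair is (ELF status, competitor status) for one scenario.
/// A scenario counts as untested if either side lacks a judgment, so a
/// missing competitor run is never reported as an ELF win.
fn summarize(pairs: &[(ScenarioStatus, ScenarioStatus)]) -> PositionSummary {
    let mut summary = PositionSummary::default();
    for &(elf, competitor) in pairs {
        match (rank(elf), rank(competitor)) {
            (Some(e), Some(c)) => match e.cmp(&c) {
                Ordering::Greater => summary.wins += 1,
                Ordering::Equal => summary.ties += 1,
                Ordering::Less => summary.loses += 1,
            },
            _ => summary.untested += 1,
        }
    }
    summary
}
```

Calling `summarize` on a slice of `(ELF, competitor)` status pairs returns the four counters. Keeping `Untested` as its own variant, rather than folding it into `Fail`, is one way to preserve non-pass competitor states without overstating ELF's position.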

Validation

  • cargo test -p elf-eval --test real_world_job_benchmark --all-features external_adapter -- --nocapture
  • cargo make real-world-memory
  • cargo make real-world-memory-live-adapters
  • ELF_BASELINE_PROJECTS=ELF,agentmemory,mem0,memsearch,claude-mem cargo make baseline-live-docker completed and preserved typed non-pass competitor states
  • JSON/report consistency checks with jq -e, stale-token search, and git diff --check
  • cargo make fmt
  • cargo make lint-fix
  • cargo make checks

…memory baselines into scenario evidence","authority":"XY-898"}
…ter scenario evidence","authority":"XY-898"}
…apter evidence with main","authority":"XY-898"}

# Conflicts:
#	apps/elf-eval/src/bin/real_world_job_benchmark.rs
#	apps/elf-eval/tests/real_world_job_benchmark.rs
#	docs/guide/benchmarking/2026-06-11-competitor-strength-evidence-matrix.md
#	docs/guide/benchmarking/2026-06-11-elf-iteration-direction-from-competitor-benchmarks.md
#	docs/guide/benchmarking/2026-06-11-measurement-coverage-audit.md
#	docs/guide/benchmarking/index.md
#	docs/research/2026-06-11-measurement-coverage-audit.json
#	docs/research/2026-06-11-xy-897-competitor-strength-matrix.json
yvette-carlisle merged commit 8423079 into main on Jun 11, 2026
13 checks passed
yvette-carlisle deleted the y/elf-xy-898 branch on June 11, 2026 08:04