XY-850: [ELF benchmark suite] Add production ops, private corpus, and long-run scale cases #142

Merged
yvette-carlisle merged 1 commit into main from y/elf-xy-850 on Jun 9, 2026

@yvette-carlisle

Member

Summary

  • Add private production addendum, 10k backfill, guarded 100k backfill, and opt-in soak cargo-make profiles.
  • Extend live-baseline reports with latency percentiles, cost proxy, resource envelope, duplicate/resume fields, and operational case guidance.
  • Tighten production corpus manifest label validation so report-visible IDs stay sanitized.
  • Document the opt-in/guarded paths and preserve the no-private-pass boundary without an operator-owned manifest.
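
The label-validation bullet above can be illustrated with a small check. This is only a sketch under stated assumptions: the function name `is_sanitized_label` is hypothetical, and the allowed character set (lowercase letters, digits, `_`, `-`) is an assumption for illustration; the PR does not state the actual rule used by the manifest validator.

```shell
# Hypothetical sketch; the real validation rule in this PR may differ.

# is_sanitized_label LABEL
# Succeeds only if LABEL is non-empty and every character is one of the
# assumed safe characters: lowercase ASCII letters, digits, "_" or "-".
is_sanitized_label() {
  case "$1" in
    "")
      # An empty label is rejected.
      return 1
      ;;
    *[!abcdefghijklmnopqrstuvwxyz0123456789_-]*)
      # At least one character falls outside the assumed safe set.
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}
```

A caller could use it as `is_sanitized_label "$label" || exit 1`, so that a label containing path separators, spaces, or other free-form text never reaches a report.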

Verification

  • cargo make fmt
  • cargo make lint-fix
  • cargo make checks
  • docker compose -f docker-compose.baseline.yml config >/dev/null
  • cargo make baseline-production-private (fails closed without ELF_BASELINE_PRODUCTION_CORPUS_MANIFEST)
  • cargo make baseline-production-private-addendum (fails closed without ELF_BASELINE_PRODUCTION_CORPUS_MANIFEST)
  • cargo make baseline-backfill-100k-docker (fails closed without ELF_BASELINE_ENABLE_EXPENSIVE=1)
  • ELF_BASELINE_PRODUCTION_CORPUS_MANIFEST=apps/elf-eval/fixtures/production_corpus/synthetic_coding_agent_manifest.json ELF_BASELINE_PROJECTS=ELF ELF_BASELINE_ELF_TIMEOUT_SECONDS=1200 ELF_BASELINE_MAX_ELF_SECONDS=1200 cargo make baseline-production-private-addendum
  • semantic drift audit: changed docs claims matched Makefile tasks, scripts, report output, and manifest validation
  • cargo make baseline-live-docker-clean
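
The fail-closed checks in the list above can be pictured as guards that refuse to run unless the operator has supplied the required variable. A minimal sketch follows, assuming two hypothetical helpers, `require_env` and `require_flag`; the environment variable names come from the PR, but the Makefile tasks and scripts in this change may implement the guard differently.

```shell
# Hypothetical sketch of a fail-closed guard; not the PR's actual script.

# require_env NAME
# Succeeds only if the variable called NAME is set and non-empty.
require_env() {
  # POSIX-compatible indirect lookup of the variable named by $1.
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "refusing to run: $1 is not set" >&2
    return 1
  fi
}

# require_flag NAME
# Succeeds only if the variable called NAME is exactly "1" (explicit opt-in).
require_flag() {
  eval "_value=\${$1:-}"
  if [ "$_value" != "1" ]; then
    echo "refusing to run: set $1=1 to opt in" >&2
    return 1
  fi
}
```

A task wrapper might start with `require_env ELF_BASELINE_PRODUCTION_CORPUS_MANIFEST || exit 1` or `require_flag ELF_BASELINE_ENABLE_EXPENSIVE || exit 1`, so the default outcome with no configuration is a refusal rather than a silent run.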

@yvette-carlisle merged commit 1f445ff into main on Jun 9, 2026
10 checks passed
@yvette-carlisle deleted the y/elf-xy-850 branch on June 9, 2026 15:22