XY-849: [ELF benchmark suite] Add operator debugging UX cases#140

Merged: yvette-carlisle merged 1 commit into main from y/elf-xy-849 on Jun 9, 2026

@yvette-carlisle

Member

Summary

  • Add the operator_debugging_ux real-world job suite with five fixtures covering dropped evidence, rerank promotion, provider latency/failure, rebuild result changes, and misleading relation context.
  • Extend the benchmark runner and reports with operator-debug evidence, trace/viewer links, raw SQL avoidance, dropped-candidate visibility, trace completeness, and repair clarity scoring.
  • Add read-only viewer support for direct trace_id links and document the operator UX report.
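As a rough illustration of how the scoring dimensions named in the summary could be combined, here is a minimal sketch. It is not taken from the ELF codebase: the struct, its field names, the equal weighting, and the trace stage names are all assumptions made for the example.

```rust
// Hypothetical sketch only: these types and names are NOT from the ELF
// benchmark runner; they illustrate one way the listed dimensions could
// be folded into a single operator-debug score.

/// Per-fixture operator-debug signals (assumed shape).
#[derive(Debug, Clone, Copy)]
struct OperatorDebugScores {
    /// True if the operator could diagnose the case without raw SQL.
    raw_sql_avoided: bool,
    /// True if dropped candidates were visible in the report or viewer.
    dropped_candidates_visible: bool,
    /// Fraction of expected trace stages recorded, in 0.0..=1.0.
    trace_completeness: f64,
    /// Graded clarity of the suggested repair, in 0.0..=1.0.
    repair_clarity: f64,
}

impl OperatorDebugScores {
    /// Unweighted mean of the four dimensions; booleans count as 0.0 or 1.0.
    /// Equal weighting is an assumption of this sketch.
    fn overall(&self) -> f64 {
        let parts = [
            if self.raw_sql_avoided { 1.0 } else { 0.0 },
            if self.dropped_candidates_visible { 1.0 } else { 0.0 },
            self.trace_completeness.clamp(0.0, 1.0),
            self.repair_clarity.clamp(0.0, 1.0),
        ];
        parts.iter().sum::<f64>() / parts.len() as f64
    }
}

/// Fraction of expected trace stages that appear among the recorded ones.
/// An empty expectation list is treated as fully complete.
fn trace_completeness(expected: &[&str], recorded: &[&str]) -> f64 {
    if expected.is_empty() {
        return 1.0;
    }
    let present = expected
        .iter()
        .filter(|stage| recorded.contains(*stage))
        .count();
    present as f64 / expected.len() as f64
}

fn main() {
    // Stage names are invented for the example.
    let completeness = trace_completeness(
        &["retrieve", "rerank", "assemble", "respond"],
        &["retrieve", "rerank", "respond"],
    );
    let scores = OperatorDebugScores {
        raw_sql_avoided: true,
        dropped_candidates_visible: false,
        trace_completeness: completeness,
        repair_clarity: 0.25,
    };
    println!("{:?} -> overall {:.2}", scores, scores.overall());
}
```

Keeping each dimension as a separate field, rather than storing only the aggregate, would let a report show which signal dragged a fixture down; the actual runner may weight or report these differently.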

Validation

  • cargo make real-world-job-operator-ux
  • cargo make real-world-job-smoke
  • cargo test -p elf-eval --test real_world_job_benchmark
  • cargo test -p elf-api admin_viewer_is_admin_prefixed_and_read_only
  • cargo make fmt
  • cargo make lint-fix
  • cargo make checks
  • Semantic drift audit: pass

yvette-carlisle merged commit df336aa into main on Jun 9, 2026 (10 checks passed)
yvette-carlisle deleted the y/elf-xy-849 branch on June 9, 2026, 15:22