Add Anchored Review Format standard for AI review systems by dangng2004 · Pull Request #106 · ChicagoHAI/OpenAIReview

dangng2004 · 2026-07-01T06:23:42Z

Summary

Introduces standard/, an open output standard for AI paper-review systems so a conformant system plugs into the benchmark with no per-system adapter.

- Payload: a review is a list of comments, each with a verbatim quote and an explanation. Those are the only fields scoring depends on; everything else (title, severity, paragraph_index, paper metadata) is optional.
- Two integration profiles:
  - profile-cli.md: open systems the benchmark runs locally (one command, paper in, payload out).
  - profile-api.md: closed systems exposed as an async submit-and-poll HTTP API.
- validate.py: stdlib-only conformance checker (errors vs warnings, --strict, 0/1 exit).
- reference/review_client.py: stdlib-only caller for the API profile that doubles as a conformance test a system can run against its own staging endpoint.
- examples/: minimal, full, and an intentionally invalid payload.
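
To make the payload contract concrete, here is a minimal sketch of the kind of check a conformance validator might perform. It is not the validate.py shipped in standard/. The top-level `comments` key, the function names `check_payload` and `exit_code`, the warning on unrecognized fields, and the reading of --strict as "warnings also fail" are all assumptions made for illustration; only the `quote` and `explanation` requirement and the optional field names come from the description above.

```python
# Hypothetical sketch, not the PR's validate.py.
REQUIRED = ("quote", "explanation")  # the only fields scoring depends on
OPTIONAL = ("title", "severity", "paragraph_index")  # optional per the summary


def check_payload(payload):
    """Return (errors, warnings) for a review payload.

    Assumes a JSON object with a top-level "comments" list; that key
    name is an assumption, not taken from the standard.
    """
    errors, warnings = [], []
    comments = payload.get("comments") if isinstance(payload, dict) else None
    if not isinstance(comments, list):
        return ["payload must be an object with a 'comments' list"], warnings
    for i, comment in enumerate(comments):
        if not isinstance(comment, dict):
            errors.append(f"comments[{i}]: must be an object")
            continue
        # Required fields must be present and non-empty strings.
        for field in REQUIRED:
            value = comment.get(field)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"comments[{i}].{field}: required non-empty string")
        # Unknown fields only warn here (an assumption about the checker).
        for field in comment:
            if field not in REQUIRED and field not in OPTIONAL:
                warnings.append(f"comments[{i}].{field}: unrecognized field")
    return errors, warnings


def exit_code(errors, warnings, strict=False):
    """0 when conformant, 1 otherwise; strict mode also fails on warnings."""
    return 1 if errors or (strict and warnings) else 0
```

Under these assumptions, a payload whose comments each carry a non-empty `quote` and `explanation` yields no errors, and a caller would pass the result of `exit_code(...)` to `sys.exit` to get the 0/1 behaviour.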

Scope is one concern: adds only standard/. No changes to the reviewer package, the benchmark, or the adapters.

Follow-ups

Make the openaireview web backend (openaireview-web-backend) conformant so our own hosted system dogfoods the standard and the reference client runs against it end to end. It is already async submit-and-poll, but today it uses /review + /status|/results, a token field, multipart+email submission, and a nested methods.*.comments body, so it does not yet conform. The fix is a thin POST /v1/reviews + GET /v1/reviews/{id} pair that reuses the existing worker and returns the flat payload.
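
The "returns the flat payload" step could look roughly like the sketch below. The exact shape of the backend's current response is not shown in this PR, so the nested layout (`{"methods": {<name>: {"comments": [...]}}}`), the top-level `comments` key of the flat payload, and the helper name `flatten_methods_body` are assumptions for illustration only.

```python
# Hypothetical sketch: collapse a nested methods.*.comments body into a flat payload.
def flatten_methods_body(body):
    """Collect every methods.<name>.comments list into one flat comment list.

    Assumes each nested comment is already a dict; its fields are copied
    through unchanged rather than renamed or validated.
    """
    flat = []
    methods = body.get("methods", {})
    # Iterate in sorted method-name order so the output is deterministic.
    for name in sorted(methods):
        for comment in methods[name].get("comments", []):
            flat.append(dict(comment))
    return {"comments": flat}
```

Keeping this as a pure function over the worker's existing output would let the new GET /v1/reviews/{id} handler reuse the worker without touching it, which matches the "thin pair" framing above.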

An open output standard so conformant review systems plug into the benchmark with no per-system adapter. A review is a list of comments, each with a verbatim quote and an explanation (the only fields scoring depends on), with everything else optional. Includes two integration profiles (CLI for systems the benchmark runs locally, hosted API for closed systems), a stdlib-only validator, a stdlib-only reference client for the API profile, and minimal/full/invalid examples. Adds only standard/; no changes to the reviewer package, benchmark, or adapters.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dangng2004 requested a review from chenhaot July 1, 2026 06:28

dangng2004 wants to merge 1 commit into main from add-anchored-review-standard