Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
131 changes: 131 additions & 0 deletions .github/workflows/ai-review.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
name: AI Review

on:
# Seed / refresh the status when the PR is opened, pushed to, or marked ready.
# This is the trigger we've actually seen fire, so it carries the workflow even
# if check_run events don't reach it.
pull_request:
types: [ opened, synchronize, reopened, ready_for_review ]
# Re-evaluate the moment a reviewer's check changes (so we don't wait for the
# next push). Filtered to our reviewers by name in the job-level `if`.
check_run:
types: [ created, completed, rerequested ]
merge_group:

permissions:
checks: read
statuses: write

jobs:
evaluate:
# NOTE: the 'Bugbot' / 'Claude' tokens below (and the keys in the script) must
# appear in the reviewers' real check-run names. Case-insensitive substring match,
# so suffixes like "Claude Code Review / review" or casing tweaks are fine.
if: >
github.event_name == 'merge_group' ||
(github.event_name == 'pull_request' && github.event.pull_request.draft == false) ||
(github.event_name == 'check_run' &&
(contains(github.event.check_run.name, 'Bugbot') ||
contains(github.event.check_run.name, 'Claude')))
# Serialize every run for the same head SHA so the last run wins
# (commit statuses are last-write-wins).
concurrency:
group: >-
ai-review-${{ github.event.check_run.head_sha
|| github.event.pull_request.head.sha
|| github.event.merge_group.head_sha }}
cancel-in-progress: false
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
Comment thread
claude[bot] marked this conversation as resolved.
- name: Generate GitHub App token
id: generate-token
uses: actions/create-github-app-token@v2
with:
app-id: ${{ secrets.PLATFORM_CODE_AGENT_APP_ID }}
private-key: ${{ secrets.PLATFORM_CODE_AGENT_APP_PK }}
Comment thread
claude[bot] marked this conversation as resolved.
- uses: actions/github-script@v9
with:
Comment thread
cursor[bot] marked this conversation as resolved.
github-token: ${{ steps.generate-token.outputs.token }}
script: |
Comment thread
claude[bot] marked this conversation as resolved.
// Match each reviewer by a distinctive (case-insensitive) substring of
// its check-run name, so a suffix/casing tweak still resolves. Tighten a
// key if it ever collides with an unrelated check.
const REVIEWERS = [
{ label: "Cursor Bugbot", key: "bugbot" },
{ label: "Claude Code Review", key: "claude" },
]
const CONTEXT = "AI Review"
const PASSING = ["success"]
const ADVISORY = ["neutral", "skipped"] // accepted, non-blocking
const REJECTING = ["failure", "action_required"] // genuine "changes needed"

// Head SHA the reviewers ran against. For check_run events the workflow
// file is read from the default branch, but this SHA still points at the
// PR head, so the status lands in the right place.
const sha =
context.payload.check_run?.head_sha ??
context.payload.pull_request?.head?.sha ??
context.payload.merge_group?.head_sha

const targetUrl =
`${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`
const setStatus = (state, description) =>
github.rest.repos.createCommitStatus({
owner: context.repo.owner, repo: context.repo.repo, sha,
state, context: CONTEXT,
description: description.slice(0, 140), target_url: targetUrl,
})

// Merge queue: reviewers don't re-run here, so just pass.
if (context.eventName === "merge_group") {
await setStatus("success", "Merge queue — auto-pass")
return
}

// One pass over every check on the commit, then pick the latest per
// reviewer (most recently started wins).
const allRuns = await github.paginate(github.rest.checks.listForRef, {
owner: context.repo.owner, repo: context.repo.repo, ref: sha, per_page: 100,
})
const at = r => new Date(r.started_at ?? r.updated_at ?? r.created_at ?? 0)
const latestFor = key =>
allRuns.filter(r => r.name.toLowerCase().includes(key)).sort((a, b) => at(b) - at(a))[0]
const runs = REVIEWERS.map(r => latestFor(r.key))
Comment on lines +91 to +94

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The fallback chain r.started_at ?? r.updated_at ?? r.created_at ?? 0 at line 91 references fields that don't exist on the GitHub check_run REST schema — only started_at and completed_at are top-level on check_run; created_at/updated_at live on check_suite — so the chain effectively collapses to r.started_at ?? 0. When a maintainer clicks Re-run, GitHub creates a new check_run with status: queued and started_at: null, so at(newRun) = new Date(null) = epoch and the older completed run wins the descending sort; latestFor returns the stale run with its prior conclusion for the entire rerun window. Combined with the merge_group auto-pass at lines 81-84, a maintainer-triggered rerun + queue-for-merge can land the PR with a stale success before the rerun verdict arrives. Fix: prefer non-completed runs when one exists for the same reviewer, or use check_run.id (monotonically increasing per repo) as the tiebreaker.

Extended reasoning...

What the bug is

Line 91 defines the sort key for picking the latest check_run per reviewer:

const at = r => new Date(r.started_at ?? r.updated_at ?? r.created_at ?? 0)

The fallback chain was added in response to inline comment 3372143715 (and is the same shape the author proposed for the rerun-staleness problem cursor[bot] flagged in 3363611828). But per the GitHub Checks Runs REST API schema, a check_run object only has these top-level fields: id, head_sha, node_id, external_id, url, html_url, details_url, status, conclusion, started_at, completed_at, output, name, check_suite, app, pull_requests, deployment. There is no top-level created_at or updated_at — those fields exist on the nested check_suite resource. So r.updated_at and r.created_at evaluate to undefined for every run and the chain reduces to r.started_at ?? 0.

Why this matters during reruns

When a maintainer clicks Re-run on Cursor Bugbot or Claude Code Review, GitHub creates a fresh check_run with status: queued and started_at: null (per the spec, started_at is nullable on check_run; cursor[bot] confirmed this exact behavior in inline 3363611828: "a freshly rerequested run is often queued with a null started_at"). The prior completed run is still returned by listForRef. Now the sort runs:

  • at(newQueuedRerun) = new Date(null).getTime() = 0 (epoch)
  • at(oldCompletedRun) = valid started_at timestamp (≫ 0)
  • Descending sort: old completed run wins

latestFor returns the OLD run. Line 102 sees run.status === 'completed' (the old one), and the script falls through to whichever verdict branch matches the OLD run's conclusion. The newly-queued rerun is invisible to the script during its entire queued+in_progress window, and the GitHub webhook does not emit a check_run event for the queued→in_progress transition, so the workflow has no other opportunity to re-evaluate until the rerun completes.

Why existing code doesn't prevent it

The author clearly intended the fallback chain to handle this case — it was added specifically in response to the queued-started_at-is-null problem cursor[bot] flagged. The fix landed but doesn't work because the chosen fallback fields are not in the check_run schema. This is a near-miss, not a missed-case: the structural intent is correct, the field names are wrong.

Impact

Once AI Review is wired into branch protection as the consolidated required gate (the explicit goal of this PR), the rerun-window staleness has two failure modes:

  • Stale failure / action_required → gate stays red. Annoying but recoverable: the rerun completes, fires check_run.completed, the workflow re-evaluates and posts the correct verdict.
  • Stale success / neutral → gate stays green during rerun. This is the dangerous one. Lines 81-84 auto-pass on merge_group events without consulting reviewer state at all. So a maintainer who triggered a rerun and then immediately queues the PR for merge can have the merge queue fire, see AI Review = success from the stale verdict, auto-pass merge_group's own evaluation, and land the PR before the rerun's verdict arrives. If the rerun would have surfaced a regression that the prior run missed, it merges into main with the gate showing green throughout.

Step-by-step proof (stale success → unsafe merge)

  1. t=0 — Both reviewers complete success on commit X. The script writes AI Review = success.
  2. t=5m — Maintainer clicks Re-run on Cursor Bugbot via the Checks UI. GitHub creates check_run #N+1 named Cursor Bugbot on head_sha = X with status: queued, started_at: null, completed_at: null. check_run.created fires.
  3. t=5m+ε — The job runs (the [created, rerequested] subscription wakes it). listForRef returns both the old completed Bugbot run AND the new queued one. The script computes at(new) = new Date(null) = epoch and at(old) = valid started_at > 0. Descending sort: old wins. latestFor('bugbot') returns the OLD completed run.
  4. t=5m+ε — Line 102: run.status === 'completed' ✓. Line 104: REJECTING.includes('success') false. Line 106: PASSING.includes('success') ✓ — script treats it as passing. With both reviewers passing per the OLD state, line 123 runs and posts AI Review = success (or, if the pending-overwrite branch is hit instead, it remains at the prior success). Either way: gate is green.
  5. t=6m — Maintainer queues PR for merge. merge_group fires. Lines 81-84 auto-pass with setStatus("success", "Merge queue — auto-pass").
  6. t=6m+ε — Merge proceeds. Bugbot rerun has not yet completed.
  7. t=8m — Bugbot rerun completes with failure (caught a regression the prior run missed). Too late — already in main.

The runtime semantics of new Date(null).getTime() === 0 are unambiguous JS: new Date(null) equals Thu Jan 01 1970 00:00:00 GMT+0000, and Math.max(0, anyPositiveTimestamp) always picks the positive value. The trace above follows directly.

How to fix it

Several viable options, in roughly increasing order of robustness:

  1. Prefer non-completed runs when both exist for a reviewer — most directly captures the intent ("a rerun is in flight, ignore the prior verdict"):
const latestFor = key => {
  const matches = allRuns.filter(r => r.name.toLowerCase().includes(key))
  const inFlight = matches.find(r => r.status !== 'completed')
  if (inFlight) return inFlight
  return matches.sort((a, b) => new Date(b.started_at ?? 0) - new Date(a.started_at ?? 0))[0]
}
  1. Sort by check_run.id as a stable tiebreaker — check_run ids are monotonically increasing per repo, so the newest run always wins regardless of started_at nullity:
const at = r => [new Date(r.started_at ?? 0).getTime(), r.id]
const latestFor = key => allRuns
  .filter(r => r.name.toLowerCase().includes(key))
  .sort((a, b) => { const [ta, ia] = at(a), [tb, ib] = at(b); return tb - ta || ib - ia })[0]
  1. Replace the broken fallback with completed_at (which IS on the schema) — narrower fix but only useful as a tiebreaker because queued runs have completed_at: null too.

Any of these unsticks the rerun-staleness case the original fallback chain was meant to address.


const blockers = [] // posted a blocking verdict
const advisory = [] // accepted but non-blocking (neutral/skipped)
const waiting = [] // not finished, or finished without a usable verdict
REVIEWERS.forEach(({ label }, i) => {
const run = runs[i]
const c = run?.conclusion
if (!run || run.status !== "completed") {
waiting.push(`${label} (${run ? "running" : "not started"})`)
} else if (REJECTING.includes(c)) {
blockers.push(label)
} else if (PASSING.includes(c)) {
// all good
} else if (ADVISORY.includes(c)) {
advisory.push(c === "neutral" ? label : `${label} (${c})`)
} else {
waiting.push(`${label} (${c ?? "unknown"}; rerun needed)`)
}
})

// FAILURE: at least one reviewer wants changes.
if (blockers.length) {
const verb = blockers.length > 1 ? "have" : "has"
await setStatus("failure", `Nope. ${blockers.join(", ")} ${verb} blocking notes`)
return
}

// SUCCESS: everyone finished and nothing is blocking.
if (!waiting.length) {
await setStatus("success",
advisory.length ? `Not gonna block it — ${advisory.join(", ")}` : "Good to go — all clear")
return
}

// PENDING: still waiting on a reviewer.
const done = REVIEWERS.length - waiting.length
await setStatus("pending", `Review pending (${done}/${REVIEWERS.length} done) — ${waiting.join(", ")}`)
2 changes: 2 additions & 0 deletions .github/workflows/ecr.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
name: ECR

BREAK AND SEE IF AI CATCHES THIS!!!!

Check failure on line 3 in .github/workflows/ecr.yml

View check run for this annotation

Claude / Claude Code Review

ecr.yml broken by stray top-level scalar

The PR adds the literal line `BREAK AND SEE IF AI CATCHES THIS!!!!` at `.github/workflows/ecr.yml:3` between the `name:` and `on:` keys. This is a bare scalar at the document root mixed with mapping entries, so it is not valid YAML and GitHub Actions will mark `ecr.yml` as 'Invalid workflow file' — no `sei-chain` container images (including `mock-` and `mock_block_validation-` variants) will be built or pushed to ECR on push to `main`/`release/**` or via workflow_dispatch after merge. Remove lin

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental debug line in workflow

High Severity

A non-comment line BREAK AND SEE IF AI CATCHES THIS!!!! was added at the top level of .github/workflows/ecr.yml. That is not valid workflow syntax, so GitHub is likely to reject or fail to load the ECR workflow, blocking pushes to main and release/** that depend on it.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 74dd752. Configure here.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The PR adds the literal line BREAK AND SEE IF AI CATCHES THIS!!!! at .github/workflows/ecr.yml:3 between the name: and on: keys. This is a bare scalar at the document root mixed with mapping entries, so it is not valid YAML and GitHub Actions will mark ecr.yml as 'Invalid workflow file' — no sei-chain container images (including mock- and mock_block_validation- variants) will be built or pushed to ECR on push to main/release/** or via workflow_dispatch after merge. Remove lines 2–3 before merging.

Extended reasoning...

What the bug is

.github/workflows/ecr.yml line 3 contains the bare unquoted text BREAK AND SEE IF AI CATCHES THIS!!!! (no leading #), sandwiched between the mapping keys name: ECR on line 1 and on: on line 5. The document root is a YAML mapping (its keys are name, on, jobs), so a non-mapping scalar at the same indentation level is a syntax error — a YAML document cannot mix a bare scalar with mapping entries in the same context.

How it manifests

Running a YAML parser against the file fails:

$ python3 -c "import yaml; yaml.safe_load(open(".github/workflows/ecr.yml"))"
yaml.scanner.ScannerError: while scanning a simple key
  ...
could not find expected ":" in .github/workflows/ecr.yml, line 3, column 1

GitHub Actions uses the same YAML semantics at workflow load time, so when this lands on main/release/**, the Actions tab will show Invalid workflow file for ecr.yml and the workflow will refuse to dispatch.

Why existing code does not prevent it

The line is a top-level addition with no surrounding guard. The workflow is otherwise correct, but a single malformed line at the document root is enough for GitHub to reject the entire file — there is no partial parsing or "skip bad lines" fallback.

Impact

.github/workflows/ecr.yml is the workflow that produces and publishes the sei-chain Docker images to ECR. It triggers on:

  • push to main and release/** branches
  • workflow_dispatch

For each push, it builds and pushes three image tags to {ecr-registry}/sei/sei-chain:

  • <sha> (production)
  • mock-<sha> (mock balances build)
  • mock_block_validation-<sha> (mock block-validation build)

Once this PR merges, every push to main/release/** will silently fail to dispatch ecr.yml. No new container images will be built or pushed. Anything downstream that pulls {ecr-registry}/sei/sei-chain:<sha> for a post-merge commit will not find the image, and deployment/CI flows depending on those tags break until a follow-up PR removes the line.

Why this happened and how to fix

The latest commit on this PR is 74dd752 Test neagtive case with AI review. The commit message is explicit: this line was inserted intentionally to test whether the new consolidated AI Review gate this PR introduces catches such regressions. It must not land. cursor[bot] independently flagged the same line on the same commit (inline 3373932492, "Accidental debug line in workflow", High Severity).

Fix: delete lines 2–3 of .github/workflows/ecr.yml (the blank line and the BREAK line) so the file goes back to:

name: ECR

on:
  push:
    ...

Step-by-step proof

  1. GitHub event: a commit lands on main after this PR merges.
  2. GitHub Actions loads .github/workflows/ecr.yml.
  3. The YAML parser reaches line 3 (BREAK AND SEE IF AI CATCHES THIS!!!!).
  4. Line 1 (name: ECR) established the document root as a mapping. Line 3 begins at column 1 with a non-mapping token (no : follows on the same logical line; the next token is on: on line 5 after a blank line).
  5. Parser raises ScannerError: could not find expected ":" in .github/workflows/ecr.yml, line 3, column 1.
  6. GitHub Actions surfaces "Invalid workflow file" and refuses to dispatch ecr.yml for this commit.
  7. None of the three image tags (sei-chain:<sha>, sei-chain:mock-<sha>, sei-chain:mock_block_validation-<sha>) is built or pushed to ECR.
  8. The same outcome repeats for every subsequent push to main/release/** and every workflow_dispatch invocation, until a follow-up PR removes the line.

🔬 also observed by cursor


on:
push:
branches:
Expand Down
Loading