diff --git a/CHANGELOG.md b/CHANGELOG.md index 2c10534..ae031aa 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,13 @@ Please choose versions by [Semantic Versioning](http://semver.org/). * MINOR version when you add functionality in a backwards-compatible manner, and * PATCH version when you make backwards-compatible bug fixes. +## Unreleased + +- feat!: selector mode is now the DEFAULT for `/coding:pr-review` and `/coding:code-review` (was opt-in via `--selector`/`selector` token); callers passing `standard` now get selector behavior — BREAKING CHANGE for any tooling that relied on `standard` triggering per-owner Task dispatch +- feat!: standard-mode per-owner Task dispatch removed — `owners_to_spawn` computation, per-owner Task prompt template, `funnel clean — no adjudication needed` standard-mode short-circuit, and `REVIEW_TIMING` per-owner instrumentation block deleted from both commands; full mode remains the per-owner deep sweep escape hatch +- docs: scenarios/002 and scenarios/004 retired (status: outdated) — superseded by scenarios/005 and scenarios/006 respectively; footer updated with retirement reason +- chore: acceptance.sh updated to assert the new dispatch reality: Selector mode (default) routing check added (section 1), per-owner block check tightened to full-mode-only assertions (section 2) + ## v0.21.0 - fix: ast-grep-runner.sh excludes `.git/` paths from both input file list and findings output — incident 2026-06-11 surfaced 274 stale agent-auditor findings from `.git/COMMIT_EDITMSG` during selector validation diff --git a/commands/code-review.md b/commands/code-review.md index f8e7294..953797c 100644 --- a/commands/code-review.md +++ b/commands/code-review.md @@ -1,6 +1,6 @@ --- allowed-tools: Task, Bash(git diff:+), Bash(git log:+), Bash(git status:+), Bash(git ls-files:+) -argument-hint: "[short|standard|full|selector] [directory]" +argument-hint: "[short|full|selector] [directory]" description: Perform a comprehensive code review of recent changes --- @@ -19,9 +19,8 @@ Perform a code review with configurable depth based on mode. Parse the first argument to determine mode: - If first arg is `short|quick|fast` → **Short mode** (manual review only) -- If first arg is `full|comprehensive|complete` → **Full mode** (all 13 agents) -- If first arg is `--selector` or `selector` → **Selector mode** (in-session classify + adjudicate, zero sub-agent spawns) -- Otherwise → **Standard mode** (4 core agents, default) +- If first arg is `full|comprehensive|complete` → **Full mode** (all agents, per-owner dispatch) +- Otherwise (including `standard`, `selector`, `--selector`, or no token) → **Selector mode (default)** (in-session classify + adjudicate, zero sub-agent spawns) Any remaining arguments are treated as the directory path. @@ -53,7 +52,7 @@ Store result for later: **3b. Run make precommit (Full mode only)** -Running the full test suite is CI's job; the review needs the result, not a re-run. In **Standard** and **Short** mode, skip this step entirely and include in the Step 5 report: "precommit skipped (standard mode) — CI covers lint+test". +Running the full test suite is CI's job; the review needs the result, not a re-run. In **Selector** and **Short** mode, skip this step entirely and include in the Step 5 report: "precommit skipped (selector mode) — CI covers lint+test". **Full mode only**: Check if Makefile exists and has a `precommit` target: 1. Use Read tool to check if Makefile exists (will error if missing) @@ -70,11 +69,11 @@ If Makefile doesn't exist or lacks `precommit` target, skip this step. If `make ### Step 4: Dispatcher — ast-grep funnel → findings-scoped LLM adjudication -Mirrors `commands/pr-review.md` Step 4. The funnel runs first (diff-scoped), then adjudication Tasks are spawned only for owners with findings or active judgment rules. **Standard mode**: zero LLM spawns when the funnel is clean and no judgment rules are active. **Full mode**: keeps today's behavior (all relevant owners + conditional agents). +Mirrors `commands/pr-review.md` Step 4. The funnel runs first (diff-scoped), then adjudicates findings in-session. **Selector mode (the default)**: in-session classify + adjudicate, zero sub-agent spawns. **Full mode**: keeps per-owner dispatch (all relevant owners + conditional agents). **Short mode**: skips Step 4 entirely. **Short Mode**: No agents — skip to Step 5. -**Early exit (standard mode)**: if NO changed file has extension `.go` or `.py` AND none matches `CHANGELOG.md`, `go.mod`, `LICENSE*`, `README.md`, `Makefile`, `pyproject.toml`, `k8s/**`, `agents/**`, `commands/**`, `skills/**`, `docs/**` — the diff cannot match any rule. Skip Step 4 entirely; note "Step 4 skipped: no rule-relevant files changed" in the report. One glance at the diff stat decides this — no tool calls needed. +**Early exit**: if NO changed file has extension `.go` or `.py` AND none matches `CHANGELOG.md`, `go.mod`, `LICENSE*`, `README.md`, `Makefile`, `pyproject.toml`, `k8s/**`, `agents/**`, `commands/**`, `skills/**`, `docs/**` — the diff cannot match any rule. Skip Step 4 entirely; note "Step 4 skipped: no rule-relevant files changed" in the report. One glance at the diff stat decides this — no tool calls needed. - BUT: if LICENSE missing AND repo is public, add to "Should Fix": - "Missing LICENSE file" - "README missing license section" (check with Grep for `## License` in README.md) @@ -103,9 +102,7 @@ RUNNER="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/scripts Run exactly this one Bash call, once. Emits `{stats, findings_by_owner: {: [...findings]}, errors}` — read it from `/tmp/code-review-findings.json`. Do NOT spawn an agent for this step (the former `coding:ast-grep-runner` agent is deprecated). If the runner is missing or fails: note "mechanical funnel unavailable" for the Step 5 report and continue with Step 4b using judgment-rule triggers only — do NOT investigate (no `find`, no `which`, no path probing). -#### 4b: Findings-scoped adjudication (standard mode) - -Compute the active judgment-rule set and dispatch owners selectively: +#### 4b: Findings-scoped candidate computation **Step 4b-i: Active judgment rules** — compute which judgment-tier rules are triggered by the diff. Run: @@ -129,81 +126,62 @@ jq -r --arg files "$CHANGED_FILES" ' ' rules/index.json ``` -This produces a list of ` ` pairs whose trigger globs match at least one changed file. Rules with `trigger: ["@commits"]` are always included. - -**Step 4b-ii: Dispatch set** — the set of owners to spawn is: +This produces a list of ` ` pairs whose trigger globs match at least one changed file. Rules with `trigger: ["@commits"]` are always included. This output feeds both Selector mode (Steps 4c-sel/4d-sel) and Full mode (per-owner dispatch). -``` -owners_to_spawn = (keys of findings_by_owner) ∪ (owners from active judgment rules) -``` +#### Selector mode (the default): Steps 4c-sel and 4d-sel -If `owners_to_spawn` is empty AND `findings_by_owner` is empty, report "funnel clean — no adjudication needed" and proceed to Step 5. ZERO LLM spawns. +Steps 4.0, 4a, and 4b-i run unchanged. Resolve the guide and execute Steps 4c-sel and 4d-sel from it — zero sub-agent spawns. -Otherwise, spawn ONE Task per owner in `owners_to_spawn` **concurrently**. Owners NOT in the set are NOT spawned — no exceptions in standard mode. Each Task prompt: +Run exactly this one command, once: +```bash +GUIDE="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/docs/selector-mode-guide.md" +[ -f "$GUIDE" ] || GUIDE="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/marketplaces/coding/docs/selector-mode-guide.md" +[ -f "$GUIDE" ] && echo "GUIDE_OK: $GUIDE" || echo "GUIDE_MISSING" ``` -coding: agent: "TARGET_DIR=. - -Pre-filtered mechanical findings for you (from ast-grep-runner): -] JSON, or empty array if none> -Active judgment rules you own (from rules/index.json, triggered by this diff): - +If it prints `GUIDE_MISSING`: report "selector guide unavailable" as a **Must Fix toolchain failure** in Step 5 and STOP the selector path — do NOT continue with a mechanical-findings-only review presented as a complete selector review (a review without the judgment tier silently misses every judgment-tier rule; same fail-fast discipline as Step 4.0). Do NOT investigate further (no `find`, no `ls`, no path probing). -Adjudicate: for each mechanical finding, assign severity (Critical / Important / Optional), add a fix suggestion that cites the rule by ID. Drop any finding whose rule_id is not in the index — stale-walker bug, not your concern. +If it prints `GUIDE_OK`: Read the file at that path, then execute its **Step 4c-sel CLASSIFY** and **Step 4d-sel ADJUDICATE** with: -Also scan the diff for each active judgment rule listed above and report violations you find. Read only changed files relevant to those rules. +- **DIFF** = `git diff HEAD~1` (or directory diff as parsed in Step 1) +- **CANDIDATES** = the Step 4b-i ` ` output +- **MECHANICAL_FINDINGS** = `/tmp/code-review-findings.json` +- **Working directory** = the reviewed directory -Review changed code only." -``` +On the guide's short-circuit condition the report line is `selector clean — no adjudication needed`. Skip this section for short/full mode. -Run per-Owner dispatches **concurrently** — they're independent. +Include the traceability section per `docs/selector-mode-guide.md` § Traceability Report Section. -**Timing instrumentation** (mirror of `commands/pr-review.md` Step 4b): **Only when `REVIEW_TIMING=1` is set in the environment** — otherwise skip this instrumentation entirely. Record wall-time of each per-Owner dispatch as a structured event so the funnel's per-Owner ROI is measurable, not anecdotal: +#### Full mode: per-owner dispatch -```bash -ts_start=$(date +%s%3N) -# ... invoke coding: agent ... -ts_end=$(date +%s%3N) -echo "{\"event\":\"per_owner_adjudication\",\"owner\":\"\",\"findings_in\":,\"wall_ms\":$((ts_end - ts_start))}" >> /tmp/code-review-timing.jsonl -``` +**Full mode only** — skip this section in selector and short mode. -Roll-up summary after all owners return: +Compute the dispatch set from Step 4b-i and Step 4a findings: -```bash -# Filter to per_owner_adjudication events only — wc -l over-counts when the -# file has stale lines from a prior unclean run or the summary line itself. -total_ms=$(jq -s 'map(select(.event == "per_owner_adjudication") | .wall_ms) | add' /tmp/code-review-timing.jsonl) -owners_invoked=$(grep -c '"event":"per_owner_adjudication"' /tmp/code-review-timing.jsonl) -echo "{\"event\":\"per_owner_summary\",\"owners_invoked\":$owners_invoked,\"total_ms\":$total_ms}" >> /tmp/code-review-timing.jsonl +``` +owners_to_spawn = (keys of findings_by_owner) ∪ (owners from active judgment rules) ``` -Diagnostic only — operators read it to answer "is Owner X worth dispatching?" with data. Not part of the Step 5 user-facing report. `code-review.md` has no formal cleanup step (the command works against the current branch in-place; there is no review-worktree to remove), so end the run with `rm -f /tmp/code-review-timing.jsonl` to keep the file from accumulating stale lines across reviews. - -#### Selector mode: Steps 4c-sel and 4d-sel - -When the mode argument is `--selector` or `selector`: Steps 4.0, 4a, and 4b-i run unchanged. Skip Step 4b-ii. Resolve the guide and execute Steps 4c-sel and 4d-sel from it — zero sub-agent spawns. +If `owners_to_spawn` is empty AND `findings_by_owner` is empty, report "funnel clean — no adjudication needed" and proceed to Step 5. ZERO LLM spawns. -Run exactly this one command, once: +Otherwise, spawn ONE Task per owner in `owners_to_spawn` **concurrently** — they're independent. Each Task prompt: -```bash -GUIDE="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/docs/selector-mode-guide.md" -[ -f "$GUIDE" ] || GUIDE="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/marketplaces/coding/docs/selector-mode-guide.md" -[ -f "$GUIDE" ] && echo "GUIDE_OK: $GUIDE" || echo "GUIDE_MISSING" ``` +coding: agent: "TARGET_DIR=. -If it prints `GUIDE_MISSING`: report "selector guide unavailable" as a **Must Fix toolchain failure** in Step 5 and STOP the selector path — do NOT continue with a mechanical-findings-only review presented as a complete selector review (a review without the judgment tier silently misses every judgment-tier rule; same fail-fast discipline as Step 4.0). Do NOT investigate further (no `find`, no `ls`, no path probing). +Pre-filtered mechanical findings for you (from ast-grep-runner): +] JSON, or empty array if none> -If it prints `GUIDE_OK`: Read the file at that path, then execute its **Step 4c-sel CLASSIFY** and **Step 4d-sel ADJUDICATE** with: +Active judgment rules you own (from rules/index.json, triggered by this diff): + -- **DIFF** = `git diff HEAD~1` (or directory diff as parsed in Step 1) -- **CANDIDATES** = the Step 4b-i ` ` output -- **MECHANICAL_FINDINGS** = `/tmp/code-review-findings.json` -- **Working directory** = the reviewed directory +Adjudicate: for each mechanical finding, assign severity (Critical / Important / Optional), add a fix suggestion that cites the rule by ID. Drop any finding whose rule_id is not in the index — stale-walker bug, not your concern. -On the guide's short-circuit condition the report line is `selector clean — no adjudication needed`. When the mode is anything else (short/standard/full), skip this section entirely. +Also scan the diff for each active judgment rule listed above and report violations you find. Read only changed files relevant to those rules. -Include the traceability section per `docs/selector-mode-guide.md` § Traceability Report Section. +Review changed code only." +``` #### 4c: Context-specific conventions @@ -218,6 +196,8 @@ Load these conventionally when the diff matches: #### 4d: Citation validation +**Full mode only** (selector mode's own citation call lives in the guide). Walk every finding from the per-owner agent reports and verify its `rule_id` field exists in `rules/index.json`. Drop findings citing missing IDs — they're hallucinations or stale-walker references. Log dropped findings to stderr. + ```bash coding:simple-bash-runner agent: "bash scripts/validate-citations.sh " ``` diff --git a/commands/pr-review.md b/commands/pr-review.md index 012a46d..c567392 100644 --- a/commands/pr-review.md +++ b/commands/pr-review.md @@ -1,6 +1,6 @@ --- allowed-tools: Task, Bash(git diff:+), Bash(git log:+), Bash(git status:+), Bash(git ls-files:+), Bash(git fetch:+), Bash(git worktree:+), Bash(git branch:+), Bash(rm -rf:+) -argument-hint: " [short|standard|full|selector]" +argument-hint: " [short|full|selector]" description: Review current branch diff against target branch (excludes vendor/node_modules) --- @@ -79,9 +79,8 @@ cd && git worktree remove /tmp/pr-review-- --for ### Step 1: Parse Mode Argument - `short|quick|fast` → **Short mode** (manual review only) -- `full|comprehensive|complete` → **Full mode** (all agents) -- `--selector` or `selector` → **Selector mode** (in-session classify + adjudicate, zero sub-agent spawns) -- Otherwise → **Standard mode** (4 core agents, default) +- `full|comprehensive|complete` → **Full mode** (all agents, per-owner dispatch) +- Otherwise (including `standard`, `selector`, `--selector`, or no token) → **Selector mode (default)** (in-session classify + adjudicate, zero sub-agent spawns) ### Step 2: Project Detection @@ -95,7 +94,7 @@ Detect project type in `REVIEW_DIR`: **3b. Run make precommit (Full mode only)** -Running the full test suite is CI's job; the review needs the result, not a re-run. In **Standard** and **Short** mode, skip this step entirely and include in the Step 5 report: "precommit skipped (standard mode) — CI covers lint+test". +Running the full test suite is CI's job; the review needs the result, not a re-run. In **Selector** and **Short** mode, skip this step entirely and include in the Step 5 report: "precommit skipped (selector mode) — CI covers lint+test". **Full mode only**: Check if `REVIEW_DIR/Makefile` exists and has `precommit` target. If yes: ``` @@ -106,11 +105,11 @@ Include failures in report. Continue regardless. ### Step 4: Dispatcher — ast-grep funnel → findings-scoped LLM adjudication -The dispatcher runs the full mechanical+script funnel first (diff-scoped), then spawns adjudication Tasks only for owners that have findings or active judgment rules. **Standard mode**: zero LLM spawns when the funnel is clean and no judgment rules are active. **Full mode**: keeps today's behavior (all relevant owners + conditional agents). +The dispatcher runs the full mechanical+script funnel first (diff-scoped), then adjudicates findings in-session. **Selector mode (the default)**: in-session classify + adjudicate, zero sub-agent spawns. **Full mode**: keeps per-owner dispatch (all relevant owners + conditional agents). **Short mode**: skips Step 4 entirely. **Short Mode**: No agents — skip to Step 5. -**Early exit (standard mode)**: if NO changed file has extension `.go` or `.py` AND none matches `CHANGELOG.md`, `go.mod`, `LICENSE*`, `README.md`, `Makefile`, `pyproject.toml`, `k8s/**`, `agents/**`, `commands/**`, `skills/**`, `docs/**` — the diff cannot match any rule. Skip Step 4 entirely; note "Step 4 skipped: no rule-relevant files changed" in the report. One glance at the Step 0c diff stat decides this — no tool calls needed. +**Early exit**: if NO changed file has extension `.go` or `.py` AND none matches `CHANGELOG.md`, `go.mod`, `LICENSE*`, `README.md`, `Makefile`, `pyproject.toml`, `k8s/**`, `agents/**`, `commands/**`, `skills/**`, `docs/**` — the diff cannot match any rule. Skip Step 4 entirely; note "Step 4 skipped: no rule-relevant files changed" in the report. One glance at the Step 0c diff stat decides this — no tool calls needed. #### 4.0: Toolchain preflight (fail-fast) @@ -136,9 +135,7 @@ RUNNER="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/scripts Run exactly this one Bash call, once. The runner emits `{stats, findings_by_owner: {: [...findings]}, errors}` — read it from `/tmp/pr-review-findings.json`. Do NOT spawn an agent for this step (the former `coding:ast-grep-runner` agent is deprecated). If the runner is missing or fails: note "mechanical funnel unavailable" for the Step 5 report and continue with Step 4b using judgment-rule triggers only — do NOT investigate (no `find`, no `which`, no path probing). -#### 4b: Findings-scoped adjudication (standard mode) - -Compute the active judgment-rule set and dispatch owners selectively: +#### 4b: Findings-scoped candidate computation **Step 4b-i: Active judgment rules** — compute which judgment-tier rules are triggered by the diff. Run: @@ -162,79 +159,62 @@ jq -r --arg files "$CHANGED_FILES" ' ' rules/index.json ``` -This produces a list of ` ` pairs whose trigger globs match at least one changed file. Rules with `trigger: ["@commits"]` are always included. +This produces a list of ` ` pairs whose trigger globs match at least one changed file. Rules with `trigger: ["@commits"]` are always included. This output feeds both Selector mode (Steps 4c-sel/4d-sel) and Full mode (per-owner dispatch). -**Step 4b-ii: Dispatch set** — the set of owners to spawn is: +#### Selector mode (the default): Steps 4c-sel and 4d-sel -``` -owners_to_spawn = (keys of findings_by_owner) ∪ (owners from active judgment rules) -``` +Steps 4.0, 4a, and 4b-i run unchanged. Resolve the guide and execute Steps 4c-sel and 4d-sel from it — zero sub-agent spawns. -If `owners_to_spawn` is empty AND `findings_by_owner` is empty, report "funnel clean — no adjudication needed" and proceed to Step 5. ZERO LLM spawns. - -Otherwise, spawn ONE Task per owner in `owners_to_spawn` **concurrently**. Owners NOT in the set are NOT spawned — no exceptions in standard mode. Each Task prompt: +Run exactly this one command, once: +```bash +GUIDE="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/docs/selector-mode-guide.md" +[ -f "$GUIDE" ] || GUIDE="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/marketplaces/coding/docs/selector-mode-guide.md" +[ -f "$GUIDE" ] && echo "GUIDE_OK: $GUIDE" || echo "GUIDE_MISSING" ``` -coding: agent: "REVIEW_DIR=. -Pre-filtered mechanical findings for you (from ast-grep-runner): -] JSON, or empty array if none> +If it prints `GUIDE_MISSING`: report "selector guide unavailable" as a **Must Fix toolchain failure** in Step 5 and STOP the selector path — do NOT continue with a mechanical-findings-only review presented as a complete selector review (a review without the judgment tier silently misses every judgment-tier rule; same fail-fast discipline as Step 4.0). Do NOT investigate further (no `find`, no `ls`, no path probing). -Active judgment rules you own (from rules/index.json, triggered by this diff): - +If it prints `GUIDE_OK`: Read the file at that path, then execute its **Step 4c-sel CLASSIFY** and **Step 4d-sel ADJUDICATE** with: -Adjudicate: for each mechanical finding, assign severity (Critical / Important / Optional), add a fix suggestion that cites the rule by ID. Drop any finding whose rule_id is not in the index — stale-walker bug, not your concern. +- **DIFF** = the Step 0c diff (`git diff origin/...HEAD`) +- **CANDIDATES** = the Step 4b-i ` ` output +- **MECHANICAL_FINDINGS** = `/tmp/pr-review-findings.json` +- **Working directory** = `REVIEW_DIR` -Also scan the diff for each active judgment rule listed above and report violations you find. Read only changed files relevant to those rules. +On the guide's short-circuit condition the report line is `selector clean — no adjudication needed`. Skip this section for short/full mode. -Only review changed files from the diff. Exclude vendor/ and node_modules/." -``` +Include the traceability section per `docs/selector-mode-guide.md` § Traceability Report Section. -**Timing instrumentation**: **Only when `REVIEW_TIMING=1` is set in the environment** — otherwise skip this instrumentation entirely. Record the wall-time of each per-Owner dispatch as a structured event. Recommended shape — one log line per Owner before and after the agent runs: +#### Full mode: per-owner dispatch -```bash -ts_start=$(date +%s%3N) -# ... invoke coding: agent ... -ts_end=$(date +%s%3N) -echo "{\"event\":\"per_owner_adjudication\",\"owner\":\"\",\"findings_in\":,\"wall_ms\":$((ts_end - ts_start))}" >> /tmp/pr-review-timing.jsonl -``` +**Full mode only** — skip this section in selector and short mode. -After all per-Owner dispatches return, append a roll-up summary line: +Compute the dispatch set from Step 4b-i and Step 4a findings: -```bash -# Filter to per_owner_adjudication events only — wc -l over-counts when the -# file has stale lines from a prior unclean run or the summary line itself. -total_ms=$(jq -s 'map(select(.event == "per_owner_adjudication") | .wall_ms) | add' /tmp/pr-review-timing.jsonl) -owners_invoked=$(grep -c '"event":"per_owner_adjudication"' /tmp/pr-review-timing.jsonl) -echo "{\"event\":\"per_owner_summary\",\"owners_invoked\":$owners_invoked,\"total_ms\":$total_ms}" >> /tmp/pr-review-timing.jsonl +``` +owners_to_spawn = (keys of findings_by_owner) ∪ (owners from active judgment rules) ``` -The timing file `/tmp/pr-review-timing.jsonl` is purely diagnostic — it lets operators answer "is Owner X worth dispatching?" with data instead of intuition. Not part of the Step 5 user-facing report; include it in the cleanup step for the review worktree so it gets removed after Step 6. - -#### Selector mode: Steps 4c-sel and 4d-sel - -When the mode argument is `--selector` or `selector`: Steps 4.0, 4a, and 4b-i run unchanged. Skip Step 4b-ii. Resolve the guide and execute Steps 4c-sel and 4d-sel from it — zero sub-agent spawns. +If `owners_to_spawn` is empty AND `findings_by_owner` is empty, report "funnel clean — no adjudication needed" and proceed to Step 5. ZERO LLM spawns. -Run exactly this one command, once: +Otherwise, spawn ONE Task per owner in `owners_to_spawn` **concurrently**. Each Task prompt: -```bash -GUIDE="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/docs/selector-mode-guide.md" -[ -f "$GUIDE" ] || GUIDE="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/marketplaces/coding/docs/selector-mode-guide.md" -[ -f "$GUIDE" ] && echo "GUIDE_OK: $GUIDE" || echo "GUIDE_MISSING" ``` +coding: agent: "REVIEW_DIR=. -If it prints `GUIDE_MISSING`: report "selector guide unavailable" as a **Must Fix toolchain failure** in Step 5 and STOP the selector path — do NOT continue with a mechanical-findings-only review presented as a complete selector review (a review without the judgment tier silently misses every judgment-tier rule; same fail-fast discipline as Step 4.0). Do NOT investigate further (no `find`, no `ls`, no path probing). +Pre-filtered mechanical findings for you (from ast-grep-runner): +] JSON, or empty array if none> -If it prints `GUIDE_OK`: Read the file at that path, then execute its **Step 4c-sel CLASSIFY** and **Step 4d-sel ADJUDICATE** with: +Active judgment rules you own (from rules/index.json, triggered by this diff): + -- **DIFF** = the Step 0c diff (`git diff origin/...HEAD`) -- **CANDIDATES** = the Step 4b-i ` ` output -- **MECHANICAL_FINDINGS** = `/tmp/pr-review-findings.json` -- **Working directory** = `REVIEW_DIR` +Adjudicate: for each mechanical finding, assign severity (Critical / Important / Optional), add a fix suggestion that cites the rule by ID. Drop any finding whose rule_id is not in the index — stale-walker bug, not your concern. -On the guide's short-circuit condition the report line is `selector clean — no adjudication needed`. When the mode is anything else (short/standard/full), skip this section entirely. +Also scan the diff for each active judgment rule listed above and report violations you find. Read only changed files relevant to those rules. -Include the traceability section per `docs/selector-mode-guide.md` § Traceability Report Section. +Only review changed files from the diff. Exclude vendor/ and node_modules/." +``` #### 4c: Context-specific conventions (kept from prior Step 2.5) @@ -251,7 +231,7 @@ Inside the YOLO container the docs are mounted at `/home/node/.claude/plugins/ma #### 4d: Citation validation -Before consolidating in Step 5, walk every finding from 4b's agent reports and verify its `rule_id` field exists in `rules/index.json`. Drop findings citing missing IDs — they're hallucinations or stale-walker references. Log dropped findings to stderr so the post-review smoke can detect drift. +**Full mode only** (selector mode's own citation call lives in the guide). Before consolidating in Step 5, walk every finding from the per-owner agent reports and verify its `rule_id` field exists in `rules/index.json`. Drop findings citing missing IDs — they're hallucinations or stale-walker references. Log dropped findings to stderr so the post-review smoke can detect drift. ```bash coding:simple-bash-runner agent: "bash scripts/validate-citations.sh " diff --git a/scenarios/002-clean-pr-zero-findings.md b/scenarios/002-clean-pr-zero-findings.md index 6a4933f..9982de6 100644 --- a/scenarios/002-clean-pr-zero-findings.md +++ b/scenarios/002-clean-pr-zero-findings.md @@ -1,5 +1,5 @@ --- -status: active +status: outdated --- # Scenario 002: Zero-violation PR in standard mode produces empty findings, no false positives @@ -38,6 +38,4 @@ Updated for funnel v2 (PR #48): runner output now lands in `/tmp/code-review-fin --- -**Status**: updated for funnel v2 (PR #48), not yet walked. Requires a live Claude Code session; mark as walked once a session produces the stream-json transcript. Key new assertions vs v1: (1) findings path is `/tmp/code-review-findings.json` not an awk-extracted stdout block; (2) "funnel clean — no adjudication needed" message asserted; (3) zero `coding:*-assistant` subagent spawns verified from transcript. - -Selector-mode sibling journey: scenarios/005. These two files keep guarding the legacy default path until the default flip retires it. +Outdated 2026-06-12: the default flipped to selector mode and the standard-mode per-owner dispatch was removed (full mode remains the per-owner deep sweep). Superseded by scenarios/005 (resp. 006). diff --git a/scenarios/004-findings-exist-path.md b/scenarios/004-findings-exist-path.md index 853fb00..9558590 100644 --- a/scenarios/004-findings-exist-path.md +++ b/scenarios/004-findings-exist-path.md @@ -1,5 +1,5 @@ --- -status: active +status: outdated --- # Scenario 004: `/coding:pr-review` against a real PR surfaces findings + adjudicates per Owner @@ -137,4 +137,4 @@ All 7 Expected items: **7/7 PASS** Baseline tuple: `(findings_count=6, distinct_rule_ids=6, distinct_owners=4)`. -Selector-mode sibling journey: scenarios/006. These two files keep guarding the legacy default path until the default flip retires it. +Outdated 2026-06-12: the default flipped to selector mode and the standard-mode per-owner dispatch was removed (full mode remains the per-owner deep sweep). Superseded by scenarios/005 (resp. 006). diff --git a/scripts/acceptance.sh b/scripts/acceptance.sh index 0a96a06..92bf514 100755 --- a/scripts/acceptance.sh +++ b/scripts/acceptance.sh @@ -47,12 +47,12 @@ else fail "short mode skip directive missing from pr-review.md or code-review.md" fi -# Standard mode invokes the deterministic runner script (Step 4a is mandatory for +# All modes (selector/full) invoke the deterministic runner script (Step 4a is mandatory for # the dispatcher). The former coding:ast-grep-runner agent is deprecated and must # NOT be invoked by either command. if grep -qE "scripts/ast-grep-runner\.sh" commands/code-review.md && \ grep -qE "scripts/ast-grep-runner\.sh" commands/pr-review.md; then - ok "standard mode invokes scripts/ast-grep-runner.sh in both dispatcher commands" + ok "ast-grep-runner.sh invoked in both dispatcher commands" else fail "scripts/ast-grep-runner.sh not referenced in pr-review.md or code-review.md" fi @@ -76,14 +76,24 @@ else fail "full mode references only $conditional_agents legacy-path agents (< 4) in code-review.md" fi +# Default/otherwise token must route to Selector mode in both commands. +if grep -qE "Selector mode \(the default\)|Selector mode \(default\)" commands/pr-review.md && \ + grep -qE "Selector mode \(the default\)|Selector mode \(default\)" commands/code-review.md; then + ok "default/otherwise token routes to Selector mode (default) in both commands" +else + fail "Selector mode (default) routing missing from pr-review.md or code-review.md" +fi + echo "=== 2/5 Per-language routing ===" -# Per-Owner adjudication block present (Step 4b) — the routing surface. -if grep -qE "Per-Owner adjudication|coding:|findings_by_owner" commands/code-review.md && \ - grep -qE "Per-Owner adjudication|coding:|findings_by_owner" commands/pr-review.md; then - ok "Step 4b per-Owner adjudication block present in both dispatcher commands" +# Full-mode per-owner dispatch block present — findings_by_owner feeds the dispatch set and +# coding: agent prompts reference it. Per-owner dispatch lives in full mode only +# (standard/selector path removed); the check asserts what remains true post-flip. +if grep -qE "coding:|findings_by_owner" commands/code-review.md && \ + grep -qE "coding:|findings_by_owner" commands/pr-review.md; then + ok "full-mode per-owner dispatch block (findings_by_owner / coding:) present in both dispatcher commands" else - fail "Step 4b per-Owner adjudication block missing from pr-review.md or code-review.md" + fail "full-mode per-owner dispatch block missing from pr-review.md or code-review.md" fi # Every owner referenced in rules/index.json must have a matching agent file in agents/.