From 0c4ef8d77d00b445ac5db82f1e59f40ebee9cfeb Mon Sep 17 00:00:00 2001 From: Benjamin Borbe Date: Thu, 25 Jun 2026 23:01:44 +0200 Subject: [PATCH] feat: extract coding:architecture-dimensions-assistant agent + guide MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes PR #59's MAJOR finding ("dimensions pass routes to generic claude with inline agent role"). The 8-dimension behavioral review checklist is now a maintainable doc-paired agent: - docs/architecture-dimensions-guide.md — source of truth for the 8 dimensions, severity rubric, output contract - agents/architecture-dimensions-assistant.md — agent that reads the guide and applies it (no rule duplication) - commands/architecture-review.md — Agent B now a true thin wrapper, one-line prompt routing to the new agent Indexed in README, llms.txt, CLAUDE.md Doc↔Agent table. --- CHANGELOG.md | 4 + CLAUDE.md | 1 + README.md | 2 + agents/architecture-dimensions-assistant.md | 101 ++++++++++ commands/architecture-review.md | 8 +- docs/architecture-dimensions-guide.md | 205 ++++++++++++++++++++ llms.txt | 1 + 7 files changed, 317 insertions(+), 5 deletions(-) create mode 100644 agents/architecture-dimensions-assistant.md create mode 100644 docs/architecture-dimensions-guide.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 0f2cd02..569ff53 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,10 @@ Please choose versions by [Semantic Versioning](http://semver.org/). * MINOR version when you add functionality in a backwards-compatible manner, and * PATCH version when you make backwards-compatible bug fixes. +## Unreleased + +- feat: extract `coding:architecture-dimensions-assistant` agent + `docs/architecture-dimensions-guide.md` — closes the v0.24.0 `/coding:architecture-review` MAJOR finding ("dimensions pass routes to generic `claude` with inline agent role"). The 8-dimension behavioral review checklist (data flow, failure, concurrency, observability, cross-cutting consistency, config/blast radius, evolvability, drift) is now a maintainable doc-paired agent. `/coding:architecture-review` Agent B now routes to the new agent with a one-line prompt instead of an inlined 200-word checklist. Indexed in README.md (Workflows & Documentation + Go agents table), llms.txt, and CLAUDE.md Doc↔Agent table. + ## v0.24.0 - feat: add `/coding:architecture-review [directory]` — deep whole-codebase architectural review (different altitude from `/coding:code-review`'s diff-scope and `/coding:audit-architecture`'s single-agent scan). Spawns two parallel agents: top-down structural (`go-architecture-assistant` / `python-architecture-assistant`) + dimensions pass (data flow, failure, concurrency, observability, drift). Consolidates into Must Fix / Should Fix / Could Fix with file:line citations and Top 5 highest-leverage fixes. diff --git a/CLAUDE.md b/CLAUDE.md index 0def57b..6b73221 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -76,6 +76,7 @@ Each enforceable guide in `docs/` should have a matching agent in `agents/`. The | `go-boolean-combinator-pattern.md` | `go-architecture-assistant` | | `python-factory-pattern.md` | `python-architecture-assistant` | | `adr-guide.md` | `go-architecture-assistant` | +| `architecture-dimensions-guide.md` | `architecture-dimensions-assistant` | Reference-only docs (patterns, setup guides) don't need agents. diff --git a/README.md b/README.md index a9f296c..4fbc726 100644 --- a/README.md +++ b/README.md @@ -170,6 +170,7 @@ All guides live in [`docs/`](docs/) and can be read standalone without the plugi | [README Guide](docs/readme-guide.md) | README.md standards | | [PRD Guide](docs/prd-guide.md) | Product Requirements | | [ADR Guide](docs/adr-guide.md) | Architecture Decisions | +| [Architecture Dimensions](docs/architecture-dimensions-guide.md) | Whole-codebase behavioral review — 8 dimensions (data flow, failure, concurrency, observability, drift) | | [Markdown & Todos](docs/markdown-todo-guide.md) | Formatting standards | ### Claude Code Authoring @@ -219,6 +220,7 @@ Agents are invoked by commands — you rarely call them directly. Each reads its | `go-security-specialist` | `go-security-linting.md` | Vulnerabilities, OWASP | | `srp-checker` | — | Single Responsibility Principle (unit-level) | | `go-architecture-assistant` | — | Cross-unit architecture, naive extractions, layering, boundaries | +| `architecture-dimensions-assistant` | `architecture-dimensions-guide.md` | Whole-codebase behavioral review — data flow, failure, concurrency, observability, drift | | `go-version-manager` | — | Go version currency | | `go-tooling-assistant` | `go-makefile-commands.md` | Makefile, tools.go | diff --git a/agents/architecture-dimensions-assistant.md b/agents/architecture-dimensions-assistant.md new file mode 100644 index 0000000..51de441 --- /dev/null +++ b/agents/architecture-dimensions-assistant.md @@ -0,0 +1,101 @@ +--- +name: architecture-dimensions-assistant +description: Review whole-codebase behavioral architecture across eight dimensions — data flow, failure & resilience, concurrency & lifecycle, observability, cross-cutting consistency, config/secrets/blast radius, evolvability, architectural drift. Sibling to `go-architecture-assistant` / `python-architecture-assistant` (which own structural concerns). Read-only — does not modify code. +model: sonnet +effort: high +tools: Read, Grep, Glob, Bash +color: purple +allowed-tools: Bash(grep:*), Bash(find:*), Bash(awk:*), Bash(git:*) +--- + +# Purpose + +You are a whole-codebase **behavioral** architecture reviewer. Your sibling agents (`go-architecture-assistant`, `python-architecture-assistant`) own the **structural** concerns — package boundaries, dependency direction, abstraction seams. You own everything else that determines how the system behaves under load, failure, and growth. + +**Source of truth:** `docs/architecture-dimensions-guide.md`. The eight dimensions, severity rubric, output contract, and skip directives all live there. Read it before each review; do not paraphrase it from memory. + +## When invoked + +The user (typically via `/coding:architecture-review`) gives you a target directory. Apply the guide's eight dimensions to that directory's production code. + +1. **Read `docs/architecture-dimensions-guide.md`** — load the dimensions, severity rubric, output contract. +2. **Scope the review** — read the target's `docs/architecture*.md`, `docs/adr/`, root `ARCHITECTURE.md`, and `CLAUDE.md` as the drift baseline for dimension 8. +3. **Walk each dimension in order.** Skip a dimension only if it genuinely doesn't apply (e.g. dimension 4 observability for a CLI tool with no daemon state) — note the skip in the report. +4. **Cite file:line on every finding.** Drop findings you cannot cite. +5. **Tag severity per the guide** — Critical / Major / Moderate / Minor, judged architecturally (not by line count or name length). +6. **End with Top 5 highest-leverage fixes** — best (impact / cost) ratio across all severities, sequenced, each independently shippable. + +## What you do NOT do + +- **No structural review** — that's the sibling agents' job. If you find a layering violation, note it briefly and defer to the structural pass. +- **No code modification** — read-only. +- **No diff-scoped review** — whole-codebase only. For per-PR work, the user should use `/coding:code-review` instead. +- **No mechanical findings** — "function too long", "too many params", "name should be camelCase" are owned by linters and `go-quality-assistant` / `python-quality-assistant`. Your altitude is architectural. + +## Output format + +``` +## Summary +One paragraph: overall health verdict, biggest risk, biggest strength. + +## Dimension 1: Data flow end-to-end +[Critical/Major/Moderate/Minor] — (`path/to/file.go:42`) + Why: + Fix: + +## Dimension 2: Failure & resilience +... + +## Dimension N (continued for each applicable dimension) +... + +## Skipped dimensions +- Dimension X: + +## Top 5 highest-leverage fixes +1. +2. ... +``` + +## Common dimension-to-symptom mapping + +Quick reference for what to grep / inspect per dimension: + +| Dimension | Quick grep / check | +|-----------|-------------------| +| Data flow | Trace one entry point through to all writes (filesystem, DB, message bus). State distribution across components. | +| Failure | `exec.Command` without `Context`; `http.DefaultClient`; missing timeouts; bare `return err` discarding context | +| Concurrency | `go func`, `sync.WaitGroup`, `os.Chdir`, shared maps without mutex, ctx threading | +| Observability | `slog.Info/Warn/Error` call sites — count distinct field keys for the same domain object; correlation ID grep | +| Cross-cutting | `exec.Command*` count outside the canonical wrapper; `errors.New` vs `errors.Wrap` mix; HTTP client patterns | +| Config/blast radius | Container launch args (`--user`, `--memory`, `--cap-drop`, mount modes `:ro` vs `:rw`); ADR Phase markers in `TODO` comments | +| Evolvability | Pick a plausible feature; grep config field references vs runtime dispatch sites; look for asymmetric provider packages | +| Drift | Compare `CLAUDE.md` / `README.md` package inventory vs actual `pkg/` listing; ADR Phase 2 markers; `TODO`/`FIXME` from prior dates | + +## Calibration examples (from the guide's severity rubric) + +- **Critical (data flow):** State machine fanned across 5 packages, each with different interpretation of the same tuple → silent inconsistency +- **Critical (config/blast radius):** Container runs as root with read-write credential mount, no resource limits +- **Critical (evolvability):** Config field validated but runtime never reads it (dead-code dispatch) +- **Major (observability):** 368 `slog.Info` sites, same domain object logged with 4 different keys; no correlation ID across subprocess boundaries +- **Major (cross-cutting):** Subprocess spawn has 5 different patterns; one of them (`exec.Command` without ctx) can hang forever +- **Moderate (drift):** Stated 2-workflow architecture in CLAUDE.md; code has 4 workflows +- **Minor (concurrency):** Mutex protection is correct but undocumented + +When uncertain, drop one severity level. Inflated severity tags devalue the whole report. + +## Anti-patterns (don't do these) + +- ❌ Generic checklist parroting (re-listing the 8 dimensions in the output without grounded findings) +- ❌ Findings without file:line citations +- ❌ "Refactor everything" sweeping recommendations +- ❌ Re-doing the structural review the sibling agents own +- ❌ Mechanical findings dressed as architectural +- ❌ Inflating severity to look thorough + +## Related + +- `docs/architecture-dimensions-guide.md` — source of truth (read this first) +- `coding:go-architecture-assistant` / `coding:python-architecture-assistant` — sibling structural pass +- `/coding:architecture-review` — command that invokes you in parallel with a structural agent +- `coding:srp-checker` — unit-level SRP (different altitude) diff --git a/commands/architecture-review.md b/commands/architecture-review.md index 3b46baf..27c308e 100644 --- a/commands/architecture-review.md +++ b/commands/architecture-review.md @@ -56,12 +56,10 @@ Run all agents (2 or 3, per Step 2 detection) in a single message via Task tool. - subagent_type: `coding:python-architecture-assistant` - prompt: "Top-down architecture review of [directory]. Start from entry point (`__main__.py`, `cli.py`, FastAPI/Flask app factory) → top-level package → submodules. Assess: (1) entry point — composition root vs business-logic leak; (2) module/package structure — SRP at module level; flag grab-bag modules (`utils`, `common`, `helpers`); (3) layering and dependency direction (domain must not import framework/ORM/HTTP client); (4) separation of concerns at class level (real multi-responsibility, not file length); (5) import-time side effects (top-level side effects, registry mutations on import); (6) mixin/inheritance abuse (deep MRO, diamond hierarchies); (7) abstraction seams (Protocol/ABC at the right altitude? noise interfaces?); (8) notable good patterns to preserve. Read-only. file:line citations for every finding. Skip `.venv/`, `__pycache__/`, `build/`, `dist/`. Report: one-paragraph verdict + section per concern + 'what's working well' section." -**Agent B — Dimensions pass (run in parallel with A):** +**Agent B — Behavioral dimensions pass (run in parallel with A):** -The dimensions pass is open-ended behavioral exploration (data flow tracing, failure-path walks, drift detection), not rule-tier adjudication. There is no `rules/index.json` entry to cite, no judgment-tier rule set scoped to it, and no `coding:*` agent currently owns this altitude — so we route to the generic `claude` agent with a focused prompt. A future `coding:architecture-dimensions-assistant` extraction would need its own `docs/` guide + rule entries to earn its keep; until that exists, the generic route is correct. - -- subagent_type: `claude` -- prompt: "Extend a separately-running top-down architecture review of [directory] with these orthogonal dimensions. Do NOT redo top-down work. Cover: (1) data flow end-to-end — trace one critical command from entry to side-effects; state distribution; ordering guarantees; idempotency; (2) failure & resilience — walk unhappy paths; timeout/retry/backoff consistency; ctx cancellation propagation; recovery from partial failure; (3) concurrency & lifecycle — goroutine/task ownership; leak paths; shared mutable state protection; (4) observability — correlation IDs across process boundaries; structured logging consistency; debuggability in prod without redeploys; (5) cross-cutting consistency — subprocess spawn, error wrapping, retry, config loading reinvented vs centralized; (6) config/secrets/blast radius — least privilege; resource limits; credential mount modes; (7) evolvability test — count files touched to add one plausible next feature; (8) architectural drift — stated patterns (in docs/) vs implemented; permanent 'temporary' workarounds. Read-only. Severity tag per finding (Critical/Major/Moderate/Minor). End with 'Top 5 highest-leverage fixes'. file:line citations required." +- subagent_type: `coding:architecture-dimensions-assistant` +- prompt: "Apply `docs/architecture-dimensions-guide.md` to [directory]. Read the guide first; then walk each applicable dimension, citing file:line on every finding. End with Top 5 highest-leverage fixes. Read-only." ### Step 4: Consolidate into Must / Should / Could diff --git a/docs/architecture-dimensions-guide.md b/docs/architecture-dimensions-guide.md new file mode 100644 index 0000000..d2745c8 --- /dev/null +++ b/docs/architecture-dimensions-guide.md @@ -0,0 +1,205 @@ +# Architecture Dimensions Guide + +How to review a whole codebase for **behavioral** architectural quality, complementing the **structural** review owned by `go-architecture-assistant` / `python-architecture-assistant`. + +## When to apply + +- Whole-codebase audit (not a diff) +- Quarterly architectural health check +- Before scaling a service (team, traffic, complexity) +- Onboarding a codebase you didn't write + +**Do NOT use for diff review** — `/coding:code-review` and `/coding:pr-review` cover per-PR work. + +## Scope: structural vs behavioral + +| Concern | Owned by | +|---------|----------| +| Package boundaries, dependency direction, abstraction seams, SRP at package level | `go-architecture-assistant` / `python-architecture-assistant` (structural) | +| Data flow, failure paths, concurrency, observability, drift, blast radius | This guide (behavioral) | + +The two pass together cover whole-codebase architectural health. Run in parallel via `/coding:architecture-review`. + +## The eight dimensions + +Review one per pass; don't blur them. Each finding cites file:line. + +### 1. Data flow end-to-end + +Trace ONE critical command from entry point to all side-effects. Don't survey — pick one path and walk it. + +**Check:** + +- Where does state live? (filesystem, in-memory, external system — usually multiple) +- Ordering guarantees? At-least-once delivery implies duplicates; is every consumer idempotent? +- Half-state recovery on partial failure? +- Are state transitions owned by one component, or fanned across several? + +**Severity:** + +- **Critical** if state can be lost on crash; duplicate processing causes incorrect external side-effects; state machine has multiple owners with divergent interpretations. +- **Major** if recovery requires manual operator action; ordering is fragile but eventually-consistent. +- **Moderate** if idempotency is documented but not enforced. + +### 2. Failure & resilience + +Walk unhappy paths explicitly. Don't assume; check each one. + +**Check:** + +- Subprocess dies — handled, or daemon stalls? +- Network unreachable — bounded retry with backoff? +- Daemon SIGTERM mid-operation — recoverable on next start, or corrupt state? +- Timeout / retry / backoff consistency. Jitter? +- Context cancellation propagation top-to-bottom. + +**Severity:** + +- **Critical** if a partial failure leaves the system in an unrecoverable state; daemon hangs forever on common failure mode. +- **Major** if retry logic is reinvented per package with different semantics; no timeout on external call. +- **Moderate** if error is swallowed/logged but operator has no signal. + +### 3. Concurrency & lifecycle + +Goroutine / task ownership: who starts, who stops? + +**Check:** + +- Goroutine leak paths on error? +- Context threading all the way down (no `context.Background()` in business logic)? +- Shared mutable state — protected by mutex, or by goroutine ownership? Documented? +- Graceful shutdown contract — `sync.WaitGroup` / `errgroup` / equivalent? +- Process-wide state (`os.Chdir`, env vars) that races with concurrent operations? + +**Severity:** + +- **Critical** if process-wide mutable state races with concurrent goroutines; locks held across operations that can hang forever. +- **Major** if shutdown is abrupt (background work continues after exit); leak path on common error. +- **Moderate** if mutex protection is correct but undocumented. + +### 4. Observability + +Can you debug a stuck operation in production without adding logs and redeploying? + +**Check:** + +- Correlation IDs that survive cross-process / cross-Kafka / cross-subprocess hops? +- Structured logging (`slog.With` or equivalent) used consistently, or every site rebuilds attrs from scratch? +- Field-key drift across packages (same domain object logged with different keys)? +- Logger output complete (boot phase too, not just steady-state)? +- Dynamic log level control, or static at startup? +- Debug endpoints (`/debug/pprof`, per-object state dump, log tail) — present? + +**Severity:** + +- **Major** if no correlation ID across process boundaries; can't grep one request end-to-end. +- **Major** if field keys drift across packages for the same domain object. +- **Moderate** if no dynamic log-level control; investigation requires redeploy. + +### 5. Cross-cutting consistency + +Same problem solved the same way across packages, or each one reinvents? + +**Check:** + +- Subprocess spawning — one wrapper, or N raw `exec.Command*` sites? +- Error wrapping convention — one library used consistently? +- Retry logic — central helper, or per-package? +- Config loading — one path, or scattered? +- HTTP clients — shared transport + timeout, or one-off `http.DefaultClient.Do` per package? + +Especially relevant for AI-generated code — agents transcribe specs and drift from house style unless conventions are centralized. + +**Severity:** + +- **Major** if a foundational concern (subprocess, retry) has 3+ patterns coexisting. +- **Moderate** if conventions are documented but not enforced. +- **Minor** if two small patterns coexist with no operational impact. + +### 6. Configuration, secrets, blast radius + +**Check:** + +- Config layering precedence (default < global < project < env < arg) — explicit and tested? +- Secrets resolution — at load time or at use time? Documented? +- RBAC / least privilege — container runs as root? Credentials mounted read-write? +- Resource limits — `--memory`, `--cpus`, `--pids-limit` set with sane defaults? +- Filesystem hardening — `--read-only`, `--security-opt no-new-privileges`, `--cap-drop ALL`? +- Tokens scoped per-repo / per-tenant, or global? + +**Severity:** + +- **Critical** if container runs as root with read-write credential mount; no resource limits on operator-untrusted code. +- **Critical** if config validation passes but runtime ignores the field (silent misconfiguration trap). +- **Major** if secrets convention is inconsistent across the codebase. + +### 7. Evolvability test (meta-test) + +Pick one plausible next feature. Count files / packages you'd touch. + +**Check:** + +- Shotgun surgery → abstraction is at the wrong altitude. +- Touching one file → boundary is right OR abstraction is premature (does the feature actually need swap-in/swap-out?). +- Dead-code dispatch — config field exists but runtime always picks one path? (Silent misconfiguration.) +- Asymmetric abstractions — one provider has its own package, another is hardcoded in a "generic" package? + +**Severity:** + +- **Critical** if a configurable feature is dead code at runtime (silent misconfig). +- **Major** if adding a sibling implementation requires touching 10+ files; one impl exists in its own package while another leaks into "generic" code. + +### 8. Architectural drift + +Stated architecture (in `docs/`, ADRs, CLAUDE.md) vs implemented reality. + +**Check:** + +- Stated patterns the code violates? +- Documents that lie (claim a refactor shipped that didn't, claim Phase 2 done when only Phase 1 landed)? +- "Temporary" workarounds (band-aid forwarders, one-time migrations on every boot, legacy compat probes) that are now permanent? +- Package inventory in CLAUDE.md vs actual `pkg/` listing — drift? +- ADR Phase markers — Phase 1 shipped but Phase 2 unstarted with `TODO` in source? + +**Severity:** + +- **Major** if a security ADR's Phase 2 is unshipped but production defaults assume Phase 2 (e.g. "secure defaults will land later" → never landed). +- **Major** if CLAUDE.md / README describes a different architecture than the code. +- **Moderate** if "temporary" workarounds carry justifying comments but no removal plan. + +## Output contract + +Every finding MUST include: + +- **Severity tag** — Critical / Major / Moderate / Minor (architectural judgment, not "function too long") +- **file:line citation** — `pkg/foo/bar.go:42`; drop any finding without one +- **Why it matters** — one sentence +- **Real fix** — not "consider refactoring"; name the structural change + +Report ends with **Top 5 highest-leverage fixes** — best (impact / cost) ratio across all severities. The actionable shortlist, not an exhaustive checklist. + +## Anti-patterns + +| ❌ | ✅ | +|---|---| +| "Function too long" / "too many params" → mechanical, not architectural | "40-param factory call duplicated across 2 sites — silent positional drift" | +| Long checklist with no file:line citations | Specific finding with `pkg/foo/bar.go:123` | +| Generic checklist parroting | Findings rooted in actual file paths and named types from the reviewed repo | +| "Refactor everything" plan | Top 5 highest-leverage fixes, sequenced, each independently shippable | +| Cover all 8 dimensions equally even if 3 don't apply | Skip dimensions that genuinely don't apply (e.g. observability for a CLI with no long-running state) | +| Re-do the structural review | This is the BEHAVIORAL pass; structural is the sibling agent's job | + +## Skip directives + +Exclude from review: + +- `vendor/`, `node_modules/`, `mocks/`, generated files (`*_gen.go`, `*.pb.go`) +- Lock files (`go.sum`, `package-lock.json`, `uv.lock`) +- Tests (`*_test.go`, `tests/`) — review production code; test quality is a separate concern owned by `go-test-quality-assistant` / `python-quality-assistant` + +## Relationship to other agents/commands + +- `coding:go-architecture-assistant` / `coding:python-architecture-assistant` — sibling structural pass; run in parallel via `/coding:architecture-review` +- `coding:srp-checker` — unit-level SRP; this guide handles cross-cutting consistency +- `/coding:code-review` — diff-scoped per-PR review (different altitude) +- `/coding:audit-architecture` — quick single-agent structural scan (lighter alternative) diff --git a/llms.txt b/llms.txt index 61524e4..a126c28 100644 --- a/llms.txt +++ b/llms.txt @@ -85,6 +85,7 @@ - [Documentation Guide](docs/documentation-guide.md): README, docs/, PRDs, ADRs - [PRD Guide](docs/prd-guide.md): Product Requirements Documents - [ADR Guide](docs/adr-guide.md): Architecture Decision Records +- [Architecture Dimensions](docs/architecture-dimensions-guide.md): Whole-codebase behavioral review — 8 dimensions (data flow, failure, concurrency, observability, drift, blast radius, evolvability, cross-cutting consistency); paired with `architecture-dimensions-assistant` agent and `/coding:architecture-review` command - [Markdown & Todos](docs/markdown-todo-guide.md): Formatting standards ## Acceptance Scenarios