Skip to content

alonf/specrew

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2,106 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Specrew    Specrew — Governed Agentic SDLC Specrew — Governed Agentic SDLC

Specrew

License: MIT Version Status: Alpha Platforms

Governed agentic SDLC. Agents type — you decide. Specrew is a methodology layer over GitHub Spec Kit that keeps the human in the loop at every decision boundary while letting AI agents do the work between boundaries. Runs on Windows, macOS, and Linux via PowerShell 7+. Works with GitHub Copilot, Claude Code, Cursor, OpenAI Codex CLI, and Google Antigravity.

⚡ Try it now (5 min)

macOS / Linux — native shell (zsh/bash)

curl -fsSL https://raw.githubusercontent.com/alonf/specrew/main/install.sh | sh
mkdir hello-specrew && cd hello-specrew && git init
specrew init
specrew start "Build a tip calculator with a web UI"

For Mac and Linux, install.sh does it all: it auto-installs PowerShell Core as an internal dependency if it is missing (Ubuntu/Debian via the Microsoft apt repository; macOS via Homebrew), installs Specrew from the PowerShell Gallery, and puts the native specrew command on your PATH. To validate a beta instead of the stable release, append -s -- --prerelease to the curl … | sh line.

Validating an unreleased prerelease? The main URL above serves the released install.sh. When you're beta-testing a feature branch or tag that isn't merged yet, fetch the script from that ref instead — e.g. curl -fsSL https://raw.githubusercontent.com/alonf/specrew/<branch-or-tag>/install.sh | sh -s -- --prerelease.

Windows — PowerShell 7+

Install-Module Specrew -Scope CurrentUser -SkipPublisherCheck
mkdir hello-specrew; cd hello-specrew; git init
specrew init
specrew start "Build a tip calculator with a web UI"

That's it. Specrew now drives you through the spec-driven lifecycle: you'll co-author a spec with the AI, sign off on a plan, and end with working code traceable to every decision.

Prerequisites

git and one AI host CLI — GitHub Copilot, Claude Code, Cursor, Codex CLI, or Antigravity. PowerShell 7+ is the runtime: on macOS/Linux it is an internal dependency that install.sh auto-installs for you (you never invoke pwsh directly); on Windows you run Specrew from PowerShell 7+.

Installing PowerShell 7+ manually (only needed if you bypass install.sh):

  • Windows: preinstalled on Windows 11; on Windows 10, winget install Microsoft.PowerShell or microsoft.com/powershell
  • macOS: brew install --cask powershell (then run pwsh to enter)
  • Linux: official install guide (Ubuntu, Debian, Fedora, Arch all supported)

macOS/Linux fallback — module install instead of install.sh: run Install-Module Specrew -Scope CurrentUser -SkipPublisherCheck from inside pwsh, not zsh/bash (Install-Module does not exist in your login shell — running it there prints command not found). The PowerShell Gallery prompt defaults to N, so pressing Enter declines the install — choose A / Yes to All (or add -Force). The native install.sh path above avoids both pitfalls.

See docs/getting-started.md for full install steps, dependency notes (uv, npm, Node, Spec Kit), and brownfield-project bootstrap.

What just happened

Specrew gated the agent at every decision boundary: specifyclarifyplantasksimplementreview-signoffiteration-closeout. At each boundary, the agent stopped and asked you to authorize before advancing. The artifacts on disk (.specrew/, specs/<feature>/) form a complete, host-independent audit trail. You can resume the same feature tomorrow from a different AI host — the methodology lives in files, not in the agent's memory.

Why Specrew exists

Modern AI-assisted code tools optimize for throughput — finish more in less time. That works until the AI quietly decides things the human would have decided differently:

  • Picks a database without asking
  • Resolves an ambiguous requirement by guessing
  • Skips a clarifying question to save a turn
  • Crosses a planning-to-implementation boundary without authorization
  • Ships work that "looks correct" but isn't traceable to a spec

Specrew was built after observing these failures empirically and concluding that the gap is not in the agent's capability; it is in the discipline around the agent. The same agent that auto-resolves a scope decision under one tool will surface it as a question under another. The difference is the methodology layer.

Specrew encodes that methodology as four guarantees:

  1. Boundary discipline. The lifecycle has explicit approval boundaries (specify, clarify, plan, tasks, before-implement, review-signoff, retro, iteration-closeout, feature-closeout). One human authorization advances at most one boundary. No agent prose can simulate authorization. Enforcement is moving from prose to code (see Proposal 065, shipped as Feature 039).
  2. Substantive interaction. Every boundary handoff is reviewable in the console with the essence of "what I just did / why I stopped / what I need from you" visible without opening files. Status pings are not enough.
  3. Audit-trail durability. Every verdict, decision, drift event, and bypass lives in .squad/decisions.md (Copilot host) or the host-native decisions ledger with timestamps, commit hashes, and recognized verdict shapes. Sessions can be reconstructed after the fact; methodology lives in artifacts, not in agent memory.
  4. Methodology survives the host. Specrew runs on GitHub Copilot CLI (default), Claude Code, Cursor (cursor-agent), Codex CLI, or Antigravity (agy) via specrew start --host <kind> or the interactive numbered menu when --host is omitted — VS Code Chat remains a roadmap item (Proposal 071). Per-host flag translation keeps --remote / --allow-all / --autopilot uniform at the Specrew surface; canonical Crew identity lives at .specrew/team/agents/<role>.md and translates to each host's native subagent format on every specrew start. The skill-level enforcement gates are host-agnostic by design — switching hosts must not weaken the methodology.

Post-Commit Verification Protocol

Boundary handoffs are only trustworthy when the cited evidence is tied to the committed tree, not to transient working-copy state. After a boundary commit, the Crew must:

  1. replace any .squad/decisions.md authorization Commit Reference: pending value with the real boundary commit hash
  2. keep Recorded At values in canonical UTC seconds precision
  3. run a stale-reference scan against every cited file:/// review target
  4. rerun the governed validation lane on the exact committed tree before claiming boundary readiness
  5. state any remaining verification gap or defer explicitly in the handoff

This protocol is part of the substantive-interaction contract: a boundary stop must tell the human what changed, why the agent stopped, what evidence to inspect, and whether that evidence still resolves after commit.

Switch your AI host mid-feature — without losing your place

This is what governance buys you that raw CLI usage cannot.

Every AI coding host has a context window. When that window fills — or you simply close the terminal — the agent's memory of "what we just decided, why we stopped here, what the gates are, which iteration is open" is gone. Picking back up means rebuilding context in prose, paying the token cost of recap, and trusting the agent to faithfully reconstruct decisions it never explicitly recorded.

Specrew sidesteps this by treating the artifact on disk as the source of truth, not the agent's memory. The spec, plan, tasks, iteration plan, decisions ledger, drift log, and current boundary state all live in files inside your project. Any host — Copilot, Claude, Cursor, Codex, Antigravity — can be started against the same project, read the same artifacts, and continue from the exact same boundary.

A real workflow this makes possible:

Monday  — specrew start --host copilot     "specify a tip calculator"
                                              → spec.md committed, /speckit.clarify queued
Tuesday — specrew start --host claude       (no prompt — resumes at clarify)
                                              → clarifications.md committed, /speckit.plan queued
Wednesday — specrew start --host cursor     (no prompt — resumes at plan)
                                              → plan.md committed, /speckit.tasks queued
Thursday — specrew start --host codex       (no prompt — resumes at iteration scaffold)
                                              → iter-001/plan.md, ready for /speckit.implement
Friday   — specrew start --host antigravity (no prompt — resumes mid-implement)
                                              → tests + code committed, /speckit.review-signoff queued

Each specrew start on a different host:

  • Resolves to the same canonical Crew at .specrew/team/agents/<role>.md (translated to host-native format on the fly)
  • Reads the same boundary state from .specrew/state.yml and resumes at the next gate
  • Picks up the same decisions ledger, drift log, and audit trail
  • Honors the same enforcement gates (no boundary auto-advance, no scope creep, no host-specific shortcuts)

The methodology is what makes this practical. Without governed boundaries + durable artifacts, switching host mid-feature means context loss and silent decision divergence between sessions. With them, the host is interchangeable — you can chase the cheapest model, the strongest reasoner, or the host that's loaded on the machine you happen to be at, and the project doesn't care.

What Specrew is not

If you want… …use this instead
A multi-agent code library (orchestrate agents in code) CrewAI, AutoGen, LangGraph, Microsoft Agent Framework
Autopilot coding (let the agent run; check the output) GitHub Copilot Coding Agent, OpenAI Codex (cloud app), Claude Code autonomous mode
The spec-driven command surface alone (/speckit.specify, /speckit.plan, …) Spec Kit directly
A code-agent team runtime on its own (specialist agents, agent charters) — without spec-driven governance Squad CLI directly (Copilot host only), or each host's native subagent system (.claude/agents/, etc.)
A code generator None of these — Specrew is governance over agent-driven work, not a code generator

Specrew composes Spec Kit + your host's native code-agent teams into a methodology layer with enforced discipline. It is the smallest layer that keeps the human in control when agents are doing the typing.

How it differs in one paragraph

Vanilla Spec Kit ships the slash-command surface but has no orchestration or boundary enforcement. Vanilla code-agent-team runners (Squad CLI, host-native subagent systems) run multi-agent teams but don't drive a spec-driven lifecycle. Autopilot tools and multi-agent libraries optimize for throughput by letting the agent decide. Specrew goes the other direction: the spec is authoritative, drift is a first-class event, every boundary requires explicit human authorization, and the audit trail is durable. Different design point. Same agents.

Status

  • Latest stable baseline: 0.35.0 — stable promotion of the 0.33.0–0.35.0 line (Specrew Refocus, Product & Problem Domain lens, Code & Implementation lens) after beta-before-stable validation; see CHANGELOG.md for release details
  • Active development line: 0.36.0 (Feature 182 — work-kind & branch governance) — branch-ready as 0.36.0-beta1, not yet published; 0.35.0 remains the latest published stable
  • Alpha software, validated through dogfooding in this repository
  • Built for a single developer today. Multi-developer reconciliation is a roadmap item (Proposal 010); a leaner spec-first concurrent model is queued as Proposal 115.
  • Release truth lives in CHANGELOG.md, docs/versioning.md, and the v0.NN.0 tags.

What's working today

  • specrew init bootstraps Spec Kit + Specrew governance (and installs Squad CLI when Copilot is the chosen host) into a fresh or existing repo
  • specrew start launches the canonical lifecycle session with handoff artifacts refreshed
  • specrew where renders the velocity dashboard from canonical artifacts
  • The full lifecycle: specify → clarify → plan → tasks → implement → review-signoff → retro → iteration-closeout → feature-closeout — with gate-respecting boundary stops by default (Proposal 066, shipped). The last two boundaries (iteration-closeout, feature-closeout) are not decoration: they produce the per-iteration dashboard.md + per-feature closeout-dashboard.md artifacts, mark the work durably "done", and gate the next iteration / next feature from starting. Skipping them leaves the project in an in-flight state — see docs/user-guide.md "Closing iterations + features" for what these boundaries produce and the verdict shapes that advance them.
  • The Design Workshop — a facilitated, lens-driven design conversation at specify-intake and again at the design-analysis stop before planning. It opens with the product & problem domain lens (users, pain, MVP, constraints — shipped in 0.34.0), works the nine technical lenses that apply (architecture, components, NFRs, UI/UX, data, security, integration, DevOps, observability), and closes with the code & implementation lens (implementation-craft rules, auto-on for any code-writing feature — shipped in 0.35.0) — plus co-designed component maps and flows, in-band diagrams, and durable workshop artifacts. The full methodology lives in docs/methodology/design-workshop-methodology.md
  • Session-state durability across reboots, worktree switches, and boundary events
  • A per-user Crew Interaction Profile (/specrew-user-profile) — four decision-area settings (Product Strategy, UX/UI Design, Software Architecture, AI Delivery Planning) that tune how much Specrew asks, explains, recommends, and auto-decides. It resolves per current user from the loader/path rule ($env:USERPROFILE\.specrew\user-profile.yml on Windows, ~/.specrew/user-profile.yml on Unix), is surfaced as soft session guidance everywhere and hard-applied only in /speckit.specify, and lets teammates run different local profiles in the same repo with no shared-repository changes. See docs/user-guide.md "Crew Interaction Profile".
  • Slash-command catalog deployed to .claude/skills/, .github/skills/, and .agents/skills/ (Feature 024)
  • Validator memoization, parallelization, closed-iteration index, repetition detector — the v0.24.3 process-optimization bundle keeps the discipline cheap to enforce
  • Reviewer-regression routing, session-loaded file change detection, drift-log integrity
  • Pre-boundary markdown-lint auto-fix gate prevents lint round-trips at every boundary commit
  • Refocus drift recovery (/specrew-refocus) + automatic discipline injection: boundary syncs deliver the incoming stage's rules on every host; post-compaction and session-start hooks re-ground context on hook-capable hosts (Claude/Codex/Copilot/Cursor, per the verified matrix) — with a per-session circuit breaker and three kill-switch levels
  • PR-review-integration soft warning surfaces missing pr-review-resolution.md when host has automated review available

Lifecycle-adjacent Spec Kit commands

Specrew surfaces these lifecycle-adjacent Spec Kit commands at specific lifecycle points. They are additive aids — they complement the governed lifecycle and do not replace governance.

Command Lifecycle point When to use Status
/speckit.checklist before-plan Requirements-quality aid that catches vague, incomplete, inconsistent, or missing requirements before planning. Recommended for substantive work; optional for low-risk slices. Surfaced
/speckit.analyze before-implement (after a complete tasks.md) Additive cross-artifact consistency review across spec.md, plan.md, and tasks.md. Complements governance validation; does not replace it. Surfaced
/speckit.taskstoissues Known but deferred for Feature 054; not part of the default lifecycle in this slice. Deferred

What's coming (roadmap highlights)

  • F-039 Launch-Mode Boundary Enforcement — mechanical refusal of agent boundary chaining (shipped v0.25.0)
  • F-040 Multi-Host Launch Pathspecrew start --host claude|codex|copilot (shipped v0.26.0)
  • F-043 Multi-Host Onboarding + Selection Flowspecrew host list/use/status CLI surface + host-history persistence + interactive numbered menu (shipped v0.27.0)
  • F-044 Per-Host Architecture Refactor — Open-Closed host extension (registry + 4 host packages); 5th contract function Install-<Kind>CrewRuntime; canonical .specrew/team/agents/<role>.md source-of-truth; Antigravity host graduated to supported (shipped v0.27.0)
  • F-041 Cost-Aware Model Routing — discovery skill + lean cost-profile + Junior→cheap-model auto-routing (next; addresses 2026-05-30 Copilot pricing pivot)
  • F-042 Token Economy MVP — cost.yml + dashboard COST section so per-iteration spend is measurable
  • Substantive Intake Questioning (Proposal 063) — persona-driven adaptive intake at specify + clarify boundaries
  • Iteration-Level Lifecycle Enforcement (Proposal 117) — populate state.md / review.md / retro.md per iteration (closes the empty-iteration-dir gap surfaced by 2026-05-25 dice re-audit)
  • Host Autopilot Quality Profiles (Proposal 118) — surface per-host quality-defaults at selection time + per-feature overrides
  • Friction Dial (Proposal 100) — strict/default/autonomous modes for expert developers
  • Host-Native Hook Deployment (Proposal 105) — Claude Code PreToolUse hooks elevate F-039 from cooperative to runtime enforcement
  • Installed-File SDLC Audit (Proposal 099) — close the dogfooding deficit between maintainer paste-prompts and installed methodology files

See proposals/INDEX.md for the full proposal catalog (Shipped / Draft / Candidate).

Platform support

Platform Status
Windows 11 (primary) ✅ Fully validated
WSL Ubuntu ✅ Manually validated end-to-end
Linux native (Ubuntu) ✅ Path handling cross-platform; CI matrix configured
macOS 🔧 Path handling cross-platform; CI matrix configured; no in-house validation yet

Key documents

Contributing

Specrew is alpha. Reading, issues, and discussion are welcome now. External pull requests are intentionally deferred until the operating model and review boundaries stabilize. The dogfooding loop on this repository is the validation surface for every methodology change.

License

Specrew is released under the MIT License. See LICENSE for the repository license and NOTICE.md for upstream attribution covering derived Squad and Spec Kit materials.

About

No description, website, or topics provided.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors