Merged
4 changes: 2 additions & 2 deletions .agents/skills/agents-shipgate/assets/advisory-pr-comment.yml
@@ -1,3 +1,3 @@
# Advisory PR comment.
# Recommended starting point: runs the scanner on every PR, posts a summary
# comment, uploads the report as an artifact, and never fails the job.
@@ -18,9 +18,9 @@
- uses: actions/checkout@v4
with:
fetch-depth: 0
- - uses: ThreeMoonsLab/agents-shipgate@v1.0.0a1
+ - uses: ThreeMoonsLab/agents-shipgate@v0.14.0
with:
ci_mode: advisory
diff_base: target
pr_comment: 'true'
- shipgate_version: '1.0.0a1'
+ shipgate_version: '0.14.0'
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.yml
@@ -11,7 +11,7 @@ body:
id: version
attributes:
label: Agents Shipgate version
- placeholder: "v1.0.0a1"
+ placeholder: "v0.14.0"
validations:
required: true
- type: dropdown
21 changes: 0 additions & 21 deletions .gitignore
@@ -25,28 +25,7 @@ harness/adoption/artifacts/
.claude/*
!.claude/commands/

- # Launch/deck workspaces
- docs/decks/**/scratch/
- docs/decks/**/.DS_Store
- docs/decks/**/.~lock*
- docs/decks/**/~$*
- docs/decks/**/*.tmp
- docs/decks/**/qa*.jpg
- docs/decks/vc-thesis/slide-08-options/*.html
- docs/decks/vc-thesis/slide-08-options/_v3_*.png
samples/**/.agents-shipgate/baseline.json

- # Keep curated final VC deck artifacts despite the generic build/ ignore.
- !docs/decks/vc-thesis/build/
- docs/decks/vc-thesis/build/*
- !docs/decks/vc-thesis/build/fragments/
- docs/decks/vc-thesis/build/fragments/*
- !docs/decks/vc-thesis/build/_logo-mark-*.png
- !docs/decks/vc-thesis/build/contact-sheet*.png
- !docs/decks/vc-thesis/build/deck*.pdf
- !docs/decks/vc-thesis/build/deck*.pptx
- !docs/decks/vc-thesis/build/slide-*.png
- !docs/decks/vc-thesis/build/fragments/*.png

# Merged-PR miner clone cache (benchmark/miner)
.miner-work/
6 changes: 3 additions & 3 deletions .well-known/agents-shipgate.json
@@ -3,7 +3,7 @@
"name": "agents-shipgate",
"display_name": "Agents Shipgate",
"tagline": "The deterministic merge gate for AI-generated agent capability changes",
- "version": "1.0.0a1",
+ "version": "0.14.0",
"license": "Apache-2.0",
"publisher": {
"name": "Three Moons Lab",
@@ -58,12 +58,12 @@
],
"package": {
"pypi": "agents-shipgate",
- "github_action": "ThreeMoonsLab/agents-shipgate@v1.0.0a1",
+ "github_action": "ThreeMoonsLab/agents-shipgate@v0.14.0",
"github_repo": "ThreeMoonsLab/agents-shipgate"
},
"release_status": {
"track": "verify-capable release",
- "latest_release": "v1.0.0a1"
+ "latest_release": "v0.14.0"
},
"install": {
"pipx": "pipx install agents-shipgate",
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,18 @@

## Unreleased

+ ## 0.14.0 - 2026-06-30
+
+ - **Versioning: the `1.0.0-alpha` line is withdrawn; this work ships as
+ `0.14.0`.** An earlier draft of this cycle briefly carried `1.0.0a1`. That
+ label was withdrawn: the `report.json` schema (`report_schema_version:
+ "0.28"`) is still additive-versioned and not yet frozen, the package is still
+ `Development Status :: 4 - Beta`, and no real-world detection-accuracy
+ baseline has been published — none of which support a `1.0` line. `0.14.0`
+ continues the `0.x` contract line from `0.13.0` and carries the same
+ agent-controller cleanup (see
+ [STABILITY.md](STABILITY.md#migration-note-0-14-0)). A `1.0` line will begin
+ only when the report schema reaches `1.0` and holds without a breaking change.
- **Non-preview `verify` now fails closed on a missing `--config`.**
`agents-shipgate verify --workspace . --config missing.yaml --json` exits
`2` with `merge_verdict: "unknown"`, `applicability: "unknown"`, and
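A minimal Python sketch of how a CI wrapper might recognize the fail-closed result described in the changelog entry above. It assumes the object printed by `verify --json` exposes `merge_verdict` and `applicability` as top-level keys; the entry names the fields but not their nesting, and `is_unknown_result` is a hypothetical helper, not part of the Shipgate CLI.

```python
import json


def is_unknown_result(exit_code: int, stdout_text: str) -> bool:
    """Return True when a verify run should be treated as 'unknown'.

    Assumption: stdout holds one JSON object with top-level
    `merge_verdict` and `applicability` keys (layout not confirmed here).
    """
    if exit_code == 2:
        # Exit code 2 is what the changelog reports for a missing --config.
        return True
    try:
        payload = json.loads(stdout_text)
    except json.JSONDecodeError:
        # Unparseable output carries no verdict, so it is treated as unknown.
        return True
    if not isinstance(payload, dict):
        return True
    # A missing field is handled the same way as an explicit "unknown".
    verdict = payload.get("merge_verdict", "unknown")
    applicability = payload.get("applicability", "unknown")
    return verdict == "unknown" or applicability == "unknown"
```

The helper only answers "is this result unknown?"; it deliberately does not decide whether any other verdict is mergeable.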
7 changes: 3 additions & 4 deletions CONTRIBUTING.md
@@ -42,10 +42,9 @@ agents-shipgate list-checks
## Surface discipline

Read this before adding a new public surface. This project has shipped surface
- area faster than it has proven the surface it already has — the review in
- [`docs/shipgate-strategic-engineering-review.md`](docs/shipgate-strategic-engineering-review.md)
- names this directly. Until the verdict-accuracy benchmark and default-on
- activation land, the bar for new surface is deliberately high.
+ area faster than it has proven the surface it already has. Until the
+ verdict-accuracy benchmark and default-on activation land, the bar for new
+ surface is deliberately high.

A **new surface** is any of: a new CLI command or sub-app; a new
`report_schema_version` or other versioned schema; a new top-level report or
20 changes: 19 additions & 1 deletion README.md
@@ -21,6 +21,15 @@ Local-first and static by default — no agent execution, tool calls, LLM calls,

<!-- Canonical tagline: The deterministic merge gate for AI-generated agent capability changes. -->

+ > [!IMPORTANT]
+ > **Status: pre-1.0 (beta).** The decision engine is deterministic and stable,
+ > but Shipgate's real-world detection accuracy is still being validated against
+ > a labeled corpus of agent PRs — no precision/recall numbers are published yet.
+ > On heavily dynamic tool surfaces (factory-built toolsets, config-bound
+ > allowlists, runtime-assembled tools), Shipgate deliberately returns
+ > `insufficient_evidence` rather than guess. Treat it as an advisory gate while
+ > that accuracy work is in progress — see [ROADMAP.md](ROADMAP.md).
+
## 60 seconds: watch it block two PRs

Claude Code adds `stripe.create_refund` to your support agent and opens a
@@ -45,6 +54,15 @@ uvx agents-shipgate fixture run agent_weakens_gate
gate-removal checks are suppression-immune: the cheapest reward-hack is
also the most visible one.

+ **…and here's the failure mode.** These two cases are constructed fixtures with
+ a clear-cut answer, chosen to show the gate working. Real PRs are messier: when
+ a change builds its tool surface dynamically — a toolkit factory, a config-bound
+ allowlist, tools assembled at runtime — static extraction often can't enumerate
+ the result, and Shipgate returns `insufficient_evidence` and routes to a human
+ rather than emit a confident wrong verdict. That is the intended failure mode,
+ not a bug; reducing how often it fires on real dynamic code is active work (see
+ [ROADMAP.md](ROADMAP.md)).
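As a rough illustration of the routing described above, the Python sketch below reads `release_decision.decision` from an already-parsed `report.json` and flags the run for human review when the value is `insufficient_evidence`. That the string appears verbatim in that field is an assumption, and `needs_human_review` is a hypothetical helper name; the README text only states that Shipgate returns `insufficient_evidence` and routes to a human.

```python
from collections.abc import Mapping
from typing import Any


def needs_human_review(report: Mapping[str, Any]) -> bool:
    """Return True when the report should be routed to a human reviewer.

    Assumption: `report` is the parsed content of report.json and the
    decision string lives at report["release_decision"]["decision"].
    """
    release_decision = report.get("release_decision")
    if not isinstance(release_decision, Mapping):
        # No decision block at all: do not guess, escalate.
        return True
    decision = release_decision.get("decision")
    # A missing decision is treated like insufficient evidence.
    return decision is None or decision == "insufficient_evidence"
```

Treating a missing or malformed decision the same as `insufficient_evidence` mirrors the "rather than guess" stance in the text; a stricter consumer could raise an error instead.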

One engine decides (`report.json.release_decision.decision`); everything
else — `merge_verdict`, PR comments, Check Runs, Action outputs — is a
deterministic projection of it. Five-minute version:
@@ -403,7 +421,7 @@ jobs:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd
with:
fetch-depth: 0
- - uses: ThreeMoonsLab/agents-shipgate@v1.0.0a1
+ - uses: ThreeMoonsLab/agents-shipgate@v0.14.0
with:
ci_mode: advisory
diff_base: target
2 changes: 1 addition & 1 deletion ROADMAP.md
@@ -2,7 +2,7 @@

> **Naming.** This project is **Agents Shipgate** (display name) / `agents-shipgate` (package, CLI, repo). See [`AGENTS.md` § Naming (canonical)](AGENTS.md#naming-canonical) for the full convention.

- **Latest release: `v1.0.0a1`** — the **agent-native contract cleanup** cycle.
+ **Latest release: `v0.14.0`** — the **agent-native contract cleanup** cycle.

## What Agents Shipgate is

33 changes: 22 additions & 11 deletions STABILITY.md
@@ -1,20 +1,31 @@
- # Stability Contract · 1.0.0-alpha
+ # Stability Contract · 0.14.0

What agents and CI integrations can rely on across versions of Agents Shipgate.

This document is the contract. If the runtime ever diverges from what's documented here, that's a bug — please file an issue.

+ Shipgate is pre-1.0. The CLI surface, exit codes, and `contract_version`
+ described here are stable within the `0.x` line, but the `report.json` schema
+ (`report_schema_version`, currently `0.28`) is still additive-versioned and
+ not yet frozen. A `1.0` line will not begin until the report schema reaches
+ `1.0` and holds without a breaking change. Pin a version (or the Action tag)
+ for reproducible CI.
+
---

- <a id="migration-note-100-alpha"></a>
+ <a id="migration-note-0-14-0"></a>

- ## Migration Note 1.0.0 Alpha
+ ## Migration Note: 0.14.0

- `1.0.0a1` starts a new alpha contract line on top of the `0.13.0`
- release. It deliberately cleans up overlapping agent-controller contracts
- instead of preserving every `0.x` surface.
+ `0.14.0` continues the `0.x` contract line from `0.13.0`. It is a minor
+ release that nonetheless makes deliberate breaking changes to the
+ agent-controller surface — permitted under `0.x` semantics — cleaning up
+ overlapping contracts instead of preserving every earlier surface. (An
+ earlier draft of this work was briefly labelled `1.0.0-alpha`; that label was
+ withdrawn because the report schema is not yet frozen, and the same changes
+ ship here as `0.14.0`.)

- Breaking changes from the `0.x` line:
+ Breaking changes from the `0.13.0` line:

- `agents-shipgate verify` no longer writes
`agents-shipgate-reports/agent-result.json`. Agents should read
@@ -67,12 +78,12 @@ Breaking changes from the `0.x` line:
`verifier.json.merge_verdict` is the controller projection for agents and
PR automation; it is not a second release gate.

- ## What WILL NOT change in the current alpha line
+ ## What WILL NOT change in the current `0.x` line

### CLI command surface

- These commands and flags are stable across the current `1.0.0a*`
- contract line. Future alpha versions may make deliberate breaking
+ These commands and flags are stable across the current `0.14.x`
+ contract line. Future `0.x` versions may make deliberate breaking
changes only by bumping `contract_version` and updating this file.

| Command | Stable flags |
@@ -108,7 +119,7 @@ changes only by bumping `contract_version` and updating this file.
### Provisional CLI command surface

The org/fleet governance commands are preview surfaces in the current
- `1.0.0a*` line. They are documented, deterministic, local-only, and included in
+ `0.14.x` line. They are documented, deterministic, local-only, and included in
`agents-shipgate contract --json` / `.well-known/agents-shipgate.json` for
design-partner discovery, but their flags and schemas are not stable
command-contract commitments yet. They remain consumers of `verify` artifacts;
2 changes: 1 addition & 1 deletion adoption-kits/claude-code-skill/SKILL.md
@@ -1,3 +1,3 @@
---
name: agents-shipgate
description: Run prominent Agents Shipgate flows when a change touches what an AI agent can do: `shipgate check`, `agents-shipgate verify`, or `shipgate audit --host`. Use after adding or modifying MCP servers or tools, tool/function definitions (@tool, @function_tool), OpenAPI specs that describe agent tools, agent prompts, permission scopes, approval or confirmation policies, agent CI workflows, or shipgate.yaml — and before creating a PR for any such change. Also use to verify agent-related PRs, fix or triage Shipgate findings, add Shipgate to CI, or interpret Shipgate verifier/report artifacts. Triggers on phrases like "add shipgate", "verify this agent PR", "merge verdict", "release readiness for my agent", "tool-use readiness", "shipgate check", "agents-shipgate verify", "audit host grants", "shipgate.yaml", "agents-shipgate-reports/verifier.json", "agents-shipgate-reports/report.json", "fix shipgate finding".
@@ -72,7 +72,7 @@

## Stable contracts (rely on these)

- - **CLI surface** follows the current alpha contract line — see https://github.com/ThreeMoonsLab/agents-shipgate/blob/main/STABILITY.md.
+ - **CLI surface** follows the current 0.x contract line — see https://github.com/ThreeMoonsLab/agents-shipgate/blob/main/STABILITY.md.
- **Installed CLI contract**: when available, run `agents-shipgate contract --json` to verify local schema versions, capability/research surfaces, `release_decision.decision`, and manual-review signal fields. Older installs should use [`docs/agent-contract-current.md`](https://github.com/ThreeMoonsLab/agents-shipgate/blob/main/docs/agent-contract-current.md) or upgrade before automating against the local contract command.
- **Verifier JSON**: `verifier_schema_version: "0.1"`. Read `merge_verdict`, `can_merge_without_human`, `first_next_action`, `fix_task`, `capability_review.top_changes`, `trust_root_touched`, and `policy_weakened` before summarizing an AI-generated PR. `merge_verdict` is a deterministic projection; the gate remains `report.json.release_decision.decision`.
- **Verify run JSON**: `verify-run.json` uses `schema_version: "shipgate.verify_run/v1"` and records stable run identity, subject refs, input hashes, outcome, and artifact hashes. It is the reproducibility artifact for `verify`; do not treat it as a second gate.
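The Verifier JSON bullet above lists the fields to read before summarizing a PR. A minimal Python sketch of that read step follows; it assumes every named field sits at the top level of `verifier.json` except `top_changes`, which is taken to be nested under `capability_review` as the dotted name suggests. `summarize_verifier` is a hypothetical helper, not a Shipgate API.

```python
from collections.abc import Mapping
from typing import Any

EXPECTED_SCHEMA = "0.1"  # value quoted in the skill text

# Field names come from the skill text; their top-level placement is assumed.
TOP_LEVEL_FIELDS = (
    "merge_verdict",
    "can_merge_without_human",
    "first_next_action",
    "fix_task",
    "trust_root_touched",
    "policy_weakened",
)


def summarize_verifier(verifier: Mapping[str, Any]) -> dict[str, Any]:
    """Collect the controller fields from a parsed verifier.json."""
    found = verifier.get("verifier_schema_version")
    if found != EXPECTED_SCHEMA:
        # Refuse to interpret a schema this sketch was not written for.
        raise ValueError(f"unexpected verifier_schema_version: {found!r}")
    summary: dict[str, Any] = {name: verifier.get(name) for name in TOP_LEVEL_FIELDS}
    review = verifier.get("capability_review")
    summary["top_changes"] = (
        review.get("top_changes", []) if isinstance(review, Mapping) else []
    )
    return summary
```

The summary is informational only: per the text, `merge_verdict` is a projection and the gate remains `report.json.release_decision.decision`.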
@@ -1,3 +1,3 @@
# Prompt · Add Agents Shipgate to a repo

You are working in a repo that may contain an AI agent — likely one of: an MCP server tool list (`*mcp*.json` or `.agents-shipgate/*.json`), an OpenAPI spec the agent calls, a Codex plugin package (`.codex-plugin/plugin.json`) or marketplace (`.agents/plugins/marketplace.json`), a Python file with `@function_tool` / `@tool` decorators (OpenAI Agents SDK, LangChain, CrewAI), a Google ADK agent in `agent.py`, an Anthropic Messages API artifact set under `prompts/`/`tools/anthropic-tools.json`/`policies/anthropic-policy.yaml`, or an OpenAI API artifact set under `prompts/`/`tools/openai-tools.json`/`openai-config.json`.
@@ -11,16 +11,16 @@

1. **Install the tool - pin the version so a stale build can't shadow it.** This flow uses the current verifier, agent-handoff, primary-command, and Codex-boundary contracts and requires **contract v9 or newer**; an older copy lingering on `PATH` may lack the command or schema fields this prompt expects. Prefer a **pinned, zero-install** runner that fetches the exact version every time instead of trusting whatever is already on `PATH`. **Pin it into one variable and use that for every step below**, so no single command can fall through to a stale binary:
```bash
- SG="uvx agents-shipgate@1.0.0a1" # uv: ephemeral, always the pinned build
- # or: SG="pipx run agents-shipgate==1.0.0a1"
+ SG="uvx agents-shipgate@0.14.0" # uv: ephemeral, always the pinned build
+ # or: SG="pipx run agents-shipgate==0.14.0"
$SG --version # confirm the pinned runner resolves
```
Every step below calls `$SG …`; e.g. `$SG verify --preview --json` runs the verify preview through the pinned runner, never a `PATH` copy.

If you would rather install onto `PATH`, pin the floor and **fail loudly when it resolves older** — a plain `pipx install agents-shipgate` is a no-op when an older build already exists — then set `SG=agents-shipgate`:
```bash
python -m pip install -U --pre agents-shipgate
- agents-shipgate contract --json # STOP if this reports contract_version < 9 - re-run pinned via uvx agents-shipgate@1.0.0a1
+ agents-shipgate contract --json # STOP if this reports contract_version < 9 - re-run pinned via uvx agents-shipgate@0.14.0
SG=agents-shipgate # only after the line above confirms contract v9+
```

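The "STOP if this reports contract_version < 9" check in the install step above can be automated. The Python sketch below is one way to do it, assuming `agents-shipgate contract --json` prints a single JSON object with an integer-like top-level `contract_version` key; the prompt names the field, but its exact position in the output is an assumption, and `contract_is_current` is a hypothetical helper.

```python
import json


def contract_is_current(contract_json: str, floor: int = 9) -> bool:
    """Return True when the reported contract version meets the floor.

    Assumption: `contract_json` is the stdout of
    `agents-shipgate contract --json` with a top-level `contract_version`.
    """
    try:
        data = json.loads(contract_json)
    except json.JSONDecodeError:
        # Output that is not JSON cannot confirm anything: fail loudly upstream.
        return False
    version = data.get("contract_version") if isinstance(data, dict) else None
    try:
        return int(version) >= floor
    except (TypeError, ValueError):
        # Missing or non-numeric version is treated as too old.
        return False
```

A caller would capture the command's stdout (for example with `subprocess.run(..., capture_output=True, text=True)`) and fall back to the pinned `uvx` runner whenever this returns `False`.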
@@ -1,3 +1,3 @@
# Prompt · Decide whether Agents Shipgate is relevant

You are working in a repo or reviewing a PR and need to decide whether
@@ -71,13 +71,13 @@
5. **Then act.**
- If `run_shipgate: true`: run verify preview through a **version-pinned,
zero-install** runner so a stale copy already on `PATH` can't answer —
- `uvx agents-shipgate@1.0.0a1 verify --preview --json` (or
- `pipx run agents-shipgate==1.0.0a1 ...`). Only fall back to a bare
+ `uvx agents-shipgate@0.14.0 verify --preview --json` (or
+ `pipx run agents-shipgate==0.14.0 ...`). Only fall back to a bare
`agents-shipgate verify --preview --json` once `agents-shipgate --version`
confirms contract v9 or newer. Then follow
[`prompts/add-shipgate-to-repo.md`](https://github.com/ThreeMoonsLab/agents-shipgate/blob/main/prompts/add-shipgate-to-repo.md)
for the first-adoption helper flow, or point the user at the GitHub
- Action (`ThreeMoonsLab/agents-shipgate@v1.0.0a1`) if they prefer CI.
+ Action (`ThreeMoonsLab/agents-shipgate@v0.14.0`) if they prefer CI.
- If `run_shipgate: false` and `dry_run_recommended: true`: propose
a non-mutating scan only — never propose `init --write` based on a
dry-run match alone. Phrase it as "X may have shifted the tool
@@ -1,3 +1,3 @@
# Prompt · Stabilize Agents Shipgate strict mode

The user has Agents Shipgate running in **advisory** mode and wants to graduate to **strict** mode (CI fails on findings) without surprising contributors.
@@ -37,9 +37,9 @@

5. **Update the CI workflow.** Replace the existing advisory step with strict + baseline. Use [`examples/github-actions/03-strict-with-baseline.yml`](https://github.com/ThreeMoonsLab/agents-shipgate/blob/main/examples/github-actions/03-strict-with-baseline.yml) as the template:
```yaml
- - uses: ThreeMoonsLab/agents-shipgate@v1.0.0a1
+ - uses: ThreeMoonsLab/agents-shipgate@v0.14.0
with:
- shipgate_version: '1.0.0a1'
+ shipgate_version: '0.14.0'
ci_mode: strict
fail_on: critical
baseline: .agents-shipgate/baseline.json
4 changes: 2 additions & 2 deletions adoption-kits/codex-skill/assets/advisory-pr-comment.yml
@@ -18,9 +18,9 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- - uses: ThreeMoonsLab/agents-shipgate@v1.0.0a1
+ - uses: ThreeMoonsLab/agents-shipgate@v0.14.0
with:
ci_mode: advisory
diff_base: target
pr_comment: 'true'
- shipgate_version: '1.0.0a1'
+ shipgate_version: '0.14.0'
4 changes: 2 additions & 2 deletions docs/agent-contract-current.md
@@ -30,7 +30,7 @@ Downstream repos generated with
`init --agent-instructions=default` get the minimal local copy at
`.shipgate/agent-contract.json`.

- - Latest release: `v1.0.0a1` (see [pyproject.toml](../pyproject.toml) for the in-tree version)
+ - Latest release: `v0.14.0` (see [pyproject.toml](../pyproject.toml) for the in-tree version)
- Runtime contract: `9`
- Current report schema: `0.28` — [`docs/report-schema.v0.28.json`](report-schema.v0.28.json)
- Current packet schema: `0.7` — [`docs/packet-schema.v0.7.json`](packet-schema.v0.7.json)
@@ -362,7 +362,7 @@ exactly one stdout JSON object using
`schema_version: "shipgate.codex_boundary_result/v1"` and the schema in
[`codex-boundary-result-schema.v1.json`](codex-boundary-result-schema.v1.json).
The removed `--format agent-json` alias and `agent_result_v1` schema string are
- breaking 1.0.0-alpha changes; see [STABILITY.md](../STABILITY.md#migration-note-100-alpha).
+ breaking 0.14.0 changes; see [STABILITY.md](../STABILITY.md#migration-note-0-14-0).

Coding agents should switch on `decision`, `completion_allowed`, `must_stop`,
`first_next_action`, `human_review`, `repair`, and `policy`. Do not derive an agent
2 changes: 1 addition & 1 deletion docs/agent-handoff-schema.v1.json
@@ -430,7 +430,7 @@
"type": "string"
},
"version": {
- "default": "1.0.0a1",
+ "default": "0.14.0",
"title": "Version",
"type": "string"
}
2 changes: 1 addition & 1 deletion docs/agent-result-schema.v1.json
@@ -369,7 +369,7 @@
"type": "string"
},
"version": {
- "default": "1.0.0a1",
+ "default": "0.14.0",
"title": "Version",
"type": "string"
}
2 changes: 1 addition & 1 deletion docs/agents/protocol.md
@@ -63,7 +63,7 @@ The stdout object has:

Consumers must make decisions from JSON fields, never from prose or Markdown.
The stable schema is `docs/codex-boundary-result-schema.v1.json`. The
- `1.0.0-alpha` contract renamed this local boundary result away from the older
+ `0.14.0` contract renamed this local boundary result away from the older
generic `agent_result_v1` schema string. `decision`, `completion_allowed`, `must_stop`,
`first_next_action`, `human_review`, `repair`, and `policy` are the control
signals. `risk_level` is explanatory and may differ between local-check and
2 changes: 1 addition & 1 deletion docs/codex-boundary-result-schema.v1.json
@@ -369,7 +369,7 @@
"type": "string"
},
"version": {
- "default": "1.0.0a1",
+ "default": "0.14.0",
"title": "Version",
"type": "string"
}
31 changes: 0 additions & 31 deletions docs/decks/architecture-overview/README.md

This file was deleted.

Binary file removed docs/decks/architecture-overview/output/output.pptx
Binary file not shown.
4 changes: 0 additions & 4 deletions docs/decks/architecture-overview/package.json

This file was deleted.
