Add Copilot bug-reproduction agent workflow #1040

Open
wenytang-ms wants to merge 7 commits into main from add-copilot-repro-agent

@wenytang-ms

Contributor

Summary

Enable the GitHub Copilot coding agent to reproduce reported bugs when a maintainer assigns an issue to @copilot. For reproducible cases it uses the reporter's project, decides whether a UI/E2E test is needed, reproduces with AutoTest when it is, and leaves a committed regression test. Some bugs are reproduced without a UI test.

What this adds

  • .github/workflows/copilot-setup-steps.yml — prepares the agent's ephemeral environment (JDK 21, Node 20, @vscjava/vscode-autotest, Xvfb/GTK, a baseline VSIX) so it can build the extension and launch headless UI tests without a slow dependency hunt. Mirrors the Linux path of e2eUI.yml.
  • .github/skills/repro/SKILL.md — the reproduction workflow: extract the report, decide UI vs non-UI, pull in the reporter's project and distill a minimal committed fixture, reproduce (red) → fix → prove (green), report back. UI plans live in test/e2e-plans/ and are auto-discovered by CI.
  • .github/copilot-instructions.md — new Bug reproduction section routing assigned issues to the repro skill.
  • .github/ISSUE_TEMPLATE/bug_report.yml + config.yml — structured bug form that requires a minimal reproducible project, steps, expected/actual, and versions, so the agent has what it needs.

Design notes

  • Triggering is native: assign the issue to @copilot (or @-mention it). No custom trigger workflow needed.
  • UI vs non-UI decision avoids over-using UI tests: tree/menu/command/classpath bugs → AutoTest plan; pure logic/backend/build bugs → test/maven-suite or jdtls.ext test.
  • UI reproduction needs the agent firewall to allow the VS Code download + Marketplace hosts; the skill documents falling back to the non-UI path when they are blocked.
  • Runs AutoTest with --no-llm so pass/fail comes only from deterministic verifiers.
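
The UI vs non-UI decision in the design notes can be read as a small routing function. The sketch below is illustrative only: the skill itself is prose in SKILL.md, and the category labels, the function name `chooseReproPath`, and the return values are hypothetical names chosen to mirror the bullets above.

```javascript
// Hypothetical category labels lifted from the design note; not part of the PR.
const UI_CATEGORIES = new Set(["tree", "menu", "command", "classpath"]);
const NON_UI_CATEGORIES = new Set(["logic", "backend", "build"]);

/**
 * Pick a reproduction path for a bug.
 * @param {string} category          one of the labels above
 * @param {boolean} uiHostsReachable whether the firewall allows the VS Code
 *                                   download and Marketplace hosts
 * @returns {"autotest-plan" | "non-ui-test"}
 */
function chooseReproPath(category, uiHostsReachable) {
  if (NON_UI_CATEGORIES.has(category)) {
    // Pure logic/backend/build bugs: test/maven-suite or a jdtls.ext test.
    return "non-ui-test";
  }
  if (UI_CATEGORIES.has(category)) {
    // UI bugs get an AutoTest plan, unless the required hosts are blocked,
    // in which case the skill documents falling back to the non-UI path.
    return uiHostsReachable ? "autotest-plan" : "non-ui-test";
  }
  throw new Error(`unknown bug category: ${category}`);
}
```

For example, `chooseReproPath("menu", true)` selects the AutoTest plan, while `chooseReproPath("build", true)` stays on the non-UI path regardless of firewall state.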

Validation

  • git diff --check
  • YAML parses for all three new YAML files; copilot-setup-steps job name/runs-on verified.

Note: copilot-setup-steps.yml only takes effect once merged to the default branch.

Enable the Copilot coding agent to reproduce reported bugs when an issue is assigned to it:
- copilot-setup-steps.yml: preinstall JDK 21, Node 20, AutoTest, Xvfb/GTK, and a baseline VSIX so the agent can build and run UI tests without a trial-and-error dependency hunt
- repro skill: decide UI vs non-UI reproduction, pull in the reporter's project, distill a minimal committed fixture, reproduce (red), fix, prove (green), and report back
- copilot-instructions.md: add a Bug reproduction section routing assigned issues to the repro skill
- bug_report issue form + config: collect a minimal reproducible project and structured repro info for the agent

Add .github/scripts/prewarm-vscode.js and a copilot-setup-steps step that warms AutoTest's <repo>/.vscode-test cache (VS Code stable + vscjava.vscode-java-pack) before the agent firewall engages, so firewalled UI reproductions launch offline.

Refine repro/uitest guidance: separate reproduction from fix-proof (UI test's key value is red->green screenshots), require verifiers only on the decisive assertion step, and make PRs state repro method + execution status.

Keep vscodeVersion stable (=latest, no pinning): document that the run-time (dns block) on update.code.visualstudio.com is expected/non-fatal because @vscode/test-electron falls back to the pre-warmed cached build, so the agent must not abandon the UI path.

Point repro/uitest evidence at CI: e2eUI.yml already uploads full test-results (screenshots + results.json) as e2e-results-<os>-<plan> artifacts, so no manual screenshot attaching is needed. Note optional firewall allowlist for a fully clean run.

For repro-issue-<n>.yaml plans, e2eUI now rebuilds the PR base (un-fixed)
into its own VSIX and runs the plan against base AND head in one CI run,
requiring base=RED (deterministic assertion fail) and head=GREEN. This
turns 'red->green' into a machine-checked invariant instead of prose in
the PR body, closing the gap where an agent asserts a fix works without
actually reproducing the bug.

- .github/scripts/repro-gate.js: judge that reads both results.json files,
  distinguishes a genuine assertion RED from an infra crash/error, and
  exits non-zero with a clear verdict (NOT_REPRODUCED / NOT_FIXED /
  INCONCLUSIVE) plus a job-summary table.
- e2eUI.yml: discover-plans splits regression vs repro matrices; adds
  build-base-{linux,windows} + repro-gate-{linux,windows} (PR-only, only
  when a repro plan exists). After merge to main the plan is demoted to an
  ordinary green regression.
- repro/SKILL.md, copilot-instructions.md: document the gate, red-first
  local loop, and single-PR (plan+fix) flow.
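
The base=RED / head=GREEN invariant described above can be sketched as a pure function over the two results. This is not the code in repro-gate.js: the `status` field and its values, the success verdict name `REPRODUCED_AND_FIXED`, and the choice to check for infra errors first are all assumptions made for illustration. Only the three failure verdict names come from the commit message.

```javascript
/**
 * Judge a repro plan from its base (un-fixed) and head (fixed) results.
 * Assumed input shape: { status: "passed" | "failed" | "error" }, where
 * "failed" means a deterministic assertion failure and anything else that
 * is not "passed" is treated as an infra crash/error.
 */
function judge(base, head) {
  const isDecisive = (s) => s === "passed" || s === "failed";

  // An infra crash on either side proves nothing about the bug.
  if (!isDecisive(base.status) || !isDecisive(head.status)) {
    return { verdict: "INCONCLUSIVE", exitCode: 1 };
  }
  // Base must be RED: the plan has to fail on the un-fixed code.
  if (base.status === "passed") {
    return { verdict: "NOT_REPRODUCED", exitCode: 1 };
  }
  // Head must be GREEN: the same plan has to pass with the fix applied.
  if (head.status === "failed") {
    return { verdict: "NOT_FIXED", exitCode: 1 };
  }
  // Hypothetical name for the passing case.
  return { verdict: "REPRODUCED_AND_FIXED", exitCode: 0 };
}
```

Keeping the verdict logic free of file I/O makes the four outcomes easy to unit-test. A real script would read both results.json files, call something like `judge`, and pass `exitCode` to `process.exit`.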

Make clear the repro path is opt-in, not automatic for every assigned
issue. Only reproducible bugs enter the repro/uitest flow; features,
refactors, dep bumps, docs, and non-reproducible reports take an ordinary
PR with no repro-issue-*.yaml. Since the CI red->green gate only fires
when a repro plan file is present, the routing decision is made purely by
whether a plan is committed. Also clarify that lint + java-dep-* regression
E2E always run, while the gate is the additional opt-in check.
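
Because routing depends only on whether a plan file is committed, the set of checks a PR receives can be sketched as below. The check names and the function `checksForPullRequest` are hypothetical. The `test/e2e-plans/` directory and the `repro-issue-*.yaml` pattern are the ones named in this PR.

```javascript
// Matches a committed repro plan, e.g. test/e2e-plans/repro-issue-<n>.yaml.
const REPRO_PLAN = /^test\/e2e-plans\/repro-issue-[^/]+\.yaml$/;

/**
 * List the checks that apply to a PR, given the files it commits.
 * lint and the regression E2E always run; the gate is added only when
 * a repro plan is present.
 */
function checksForPullRequest(committedFiles) {
  const checks = ["lint", "regression-e2e"]; // hypothetical check names
  if (committedFiles.some((file) => REPRO_PLAN.test(file))) {
    checks.push("repro-gate");
  }
  return checks;
}
```

A docs-only PR therefore gets just the two baseline checks, and adding a single plan file is what opts a PR into the gate.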

The skill listed 'attached zip' as a repro source but never said how to
obtain it. Add a download+unzip recipe (curl -L follows the user-attachments
302 to a signed objects.githubusercontent.com URL), note that github.com +
objects.githubusercontent.com + codeload.github.com are on the coding-agent
firewall's DEFAULT allowlist (so attachment downloads and repo clones are NOT
blocked, unlike the VS Code binary), handle signed-URL expiry, point the plan
workspace at the extracted project, and treat the archive as untrusted input
(extract only, commit just the minimal distilled fixture).
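
One way to act on "treat the archive as untrusted input" is to check every entry name before extracting it, so that a crafted zip cannot write outside the extraction directory. The commit message does not say the skill does this. The helper below, including its name `isSafeEntry`, is an illustrative assumption rather than part of the documented recipe.

```javascript
const path = require("node:path");

/**
 * Return true only if extracting `entryName` under `destDir` would stay
 * inside `destDir`. Rejects absolute paths and ".." traversal.
 */
function isSafeEntry(destDir, entryName) {
  const root = path.resolve(destDir);
  const target = path.resolve(root, entryName);
  const rel = path.relative(root, target);
  if (rel === "" || path.isAbsolute(rel)) {
    return false; // the root itself, or a path on another drive/root
  }
  // Escapes show up as a leading ".." path segment.
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}
```

Entries such as `demo/pom.xml` pass, while `../../etc/passwd` or an absolute path are rejected before anything is written to disk.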

An OS-specific bug (e.g. the Windows-only Copy Relative Path drive-letter
issue) does not manifest on the other OS, so running its repro plan through
that OS's gate would spuriously report NOT_REPRODUCED. discover-plans now
routes repro plans by filename suffix:
  repro-issue-<n>-windows.yaml -> Windows gate only
  repro-issue-<n>-linux.yaml   -> Linux gate only
  repro-issue-<n>.yaml         -> both gates (OS-agnostic)
build-base-* and repro-gate-* are gated per-OS (has_repro_linux /
has_repro_windows). Documented the naming convention in repro/SKILL.md.
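
The suffix convention above maps directly onto a small parser. The sketch below is not the discover-plans implementation. It assumes `<n>` is a numeric issue number, and the function names are hypothetical. Only the three filename forms and the `has_repro_linux` / `has_repro_windows` flags come from the commit message.

```javascript
/**
 * Decide which OS gates a plan file belongs to.
 * Returns null for files that are not repro plans (ordinary regression plans).
 */
function routeReproPlan(fileName) {
  const match = /^repro-issue-(\d+)(?:-(windows|linux))?\.yaml$/.exec(fileName);
  if (!match) {
    return null;
  }
  const os = match[2]; // undefined when the plan is OS-agnostic
  return {
    issue: Number(match[1]),
    linux: os !== "windows",
    windows: os !== "linux",
  };
}

/** Compute the per-OS flags that gate build-base-* and repro-gate-*. */
function gateFlags(fileNames) {
  const routes = fileNames.map(routeReproPlan).filter(Boolean);
  return {
    has_repro_linux: routes.some((r) => r.linux),
    has_repro_windows: routes.some((r) => r.windows),
  };
}
```

With only `repro-issue-7-windows.yaml` present, `gateFlags` reports `has_repro_linux: false`, so the Linux gate never runs and cannot report a spurious NOT_REPRODUCED.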