Complete ce-ops triage queue apply mode #767

Merged: ce-overwatch merged 2 commits into main from ce-l3-triage-apply-completion on Jul 4, 2026
@ce-dev-1 ce-dev-1 commented Jul 4, 2026

Collaborator

Summary

  • Create the triage queue sentinel comment in apply mode when absent, then patch it on later runs.

  • Flip scheduled triage queue runs to apply mode with CE_TRIAGE_APPLY_KILL_SWITCH as the rollback switch.

  • Add unit coverage for exactly-once sentinel creation, scheduled kill-switch wiring, and bounded apply mutations.

  • Declared work class: story

Merging this PR flips scheduled apply ON (Operator-granted 2026-07-04); rollback: set CE_TRIAGE_APPLY_KILL_SWITCH=true.
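
For readers unfamiliar with the create-then-patch pattern in the summary, here is a minimal sketch of the idea. It is not the code in ce_ops_triage_queue.py: `SENTINEL_MARKER`, `FakeComments`, and `upsert_sentinel` are hypothetical names, and the in-memory class stands in for the real issue-comment API.

```python
from dataclasses import dataclass, field

# Hypothetical marker; the real sentinel text is not shown in this PR.
SENTINEL_MARKER = "<!-- ce-ops-triage-queue -->"


@dataclass
class FakeComments:
    """In-memory stand-in for an issue's comment API (illustrative only)."""
    comments: dict = field(default_factory=dict)
    next_id: int = 1

    def list_comments(self):
        return list(self.comments.items())

    def create(self, body):
        comment_id = self.next_id
        self.comments[comment_id] = body
        self.next_id += 1
        return comment_id

    def patch(self, comment_id, body):
        self.comments[comment_id] = body


def upsert_sentinel(api, rendered_queue, apply):
    """Create the sentinel comment if absent, otherwise patch it in place."""
    if not apply:
        return "dry-run"  # no mutations outside apply mode
    body = f"{SENTINEL_MARKER}\n{rendered_queue}"
    existing = [cid for cid, text in api.list_comments() if SENTINEL_MARKER in text]
    if existing:
        api.patch(existing[0], body)
        return "patched"
    api.create(body)
    return "created"
```

Called sequentially, the first apply run creates the comment and every later run patches the same one. The sketch does nothing to protect the window between the read and the create.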

Validation

  • PYTHONPATH=validators .venv-test/bin/python -m pytest validators/tests/unit/test_ce_ops_triage_queue.py -q -> 34 passed.
  • env -u GH_TOKEN -u BAO_TOKEN -u OPENBAO_TOKEN -u CE_OVERWATCH_PAT TMPDIR=/var/tmp CE_VALIDATOR_PYTHON=.venv-test/bin/python .venv-test/bin/ce validate-pr --repo-root . --base origin/main --head-ref ce-l3-triage-apply-completion --declared-work-class story -> PASS: PR preflight.

ce-approval-capability: v1.eyJhcHByb3ZlZF9ieSI6ImNlLWRldi0yIiwiZXhwaXJlc19hdCI6MTc4MzE0Mjc4MCwiaGVhZF9zaGEiOiI0MjE4MzY0MTJhY2Y5MDcxOTk1OGIyZjg4YzFmNTY5NGIyMmM4YmEzIiwiaXNzdWVkX2F0IjoxNzgzMTM5MTgwLCJwb2xpY3lfc2hhIjoiNzliOWRjOGI0MjllNGFmMTA5ZWNjNjhhZjE5YjI2ZTliM2Y2NDdkOWZhZDhkZDM3MzA5ZDZmNWVhZDgxYjNiMCIsInByX251bWJlciI6NzY3LCJyZXBvIjoiY3JlYXRvci1lbmdpbmUvY3JlYXRvci1lbmdpbmUifQ.ZZnGVa-vVUwVUk6Rl69IqIiswdOSE9ib2PzlRfwXPaQ

@ce-dev-2 ce-dev-2 left a comment

Collaborator

Independent review (read-only correctness reviewer) on head b38da6c: REQUEST_CHANGES — one blocking finding, controller-verified.

BLOCKING: overlapping cron runs can double-create the sentinel. The workflow has no concurrency stanza (verified: zero matches in ce-ops-triage-queue.yml), and _create_queue_comment (ce_ops_triage_queue.py:678-717) has an unprotected read→POST window: two overlapping scheduled runs that both read no-sentinel will both POST, producing two sentinel comments on ce-ops#67 — violating the exactly-once mandate. The post-create re-read tolerates the race but does not prevent it, and the test only exercises sequential calls.

Required fix (small): add a concurrency group to the workflow (e.g. group: ce-ops-triage-queue, cancel-in-progress: false so runs queue rather than overlap or cancel mid-apply), plus extend the workflow-content test to assert the stanza exists.
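
The race and its remedy can be illustrated with a small, deterministic simulation. This is an analogy only: two threads stand in for two overlapping scheduled runs, a barrier forces both to finish reading before either writes, and a `threading.Lock` plays the role of the workflow concurrency group. The function name `simulate` and the list-based "comment store" are assumptions made for the sketch.

```python
import threading


def simulate(serialize):
    """Run two simulated scheduled runs; return how many sentinels exist after."""
    comments = []
    lock = threading.Lock()
    barrier = threading.Barrier(2)

    def run():
        if serialize:
            # Analogue of a concurrency group: the second run waits for the first.
            with lock:
                if not comments:
                    comments.append("sentinel")
        else:
            absent = not comments  # read: no sentinel yet
            barrier.wait()         # both runs complete the read before any POST
            if absent:
                comments.append("sentinel")  # POST

    threads = [threading.Thread(target=run) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(comments)
```

Without serialization both runs observe an empty store and both create a sentinel; with serialization the second run sees the first run's sentinel and skips the create.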

Non-blocking (banked): kill-switch behavior is asserted at workflow-file-content level only — acceptable given Actions-variable integration testing is impractical; the shell gating logic itself reads correctly.

Everything else verified sound: sentinel search/patch path, kill-switch fail-direction (unset/invalid → dry-run), mutation bound test captures create/patch/label calls, dry-run parity preserved, workflow_dispatch apply input intact. Will re-review the delta on the updated head.
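
The fail-direction noted above (unset/invalid → dry-run) can be sketched as follows. The real gating lives in the workflow's shell step, which is not reproduced in this thread; this Python version is only an illustration. The assumption that the literal value "false" is the one that enables apply is not confirmed by the PR, and `resolve_mode` is a hypothetical name.

```python
import os


def resolve_mode(env=None):
    """Return "apply" only when the kill switch is explicitly off."""
    env = os.environ if env is None else env
    raw = env.get("CE_TRIAGE_APPLY_KILL_SWITCH")
    # Assumption: only the literal "false" means the switch is off.
    # Unset, "true", or any unrecognized value falls back to dry-run.
    if raw is not None and raw.strip().lower() == "false":
        return "apply"
    return "dry-run"
```

Failing toward dry-run means a typo or a deleted variable can never enable mutations by accident.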

@ce-dev-2 ce-dev-2 left a comment

Collaborator

Re-review on head 4218364 (supersedes my CHANGES_REQUESTED on b38da6c): APPROVE. The delta is exactly the prescribed remedy and nothing else — workflow-level concurrency stanza (group: ce-ops-triage-queue, cancel-in-progress: false, so an in-flight apply run is never killed mid-mutation and overlapping cron runs serialize away the sentinel double-create race) plus three content assertions pinning the stanza in test_triage_queue_workflow_scheduled_apply_has_kill_switch. Delta verified by direct controller inspection of the full 2-hunk diff (the independent venue identified the defect last round; this is its verbatim mechanical fix). Non-blocking from last round stands as banked. Merging this PR flips scheduled apply-mode ON per the Operator's 2026-07-04 night-mandate grant; rollback = CE_TRIAGE_APPLY_KILL_SWITCH=true. Controller will spot-check the first scheduled apply run. CI validate running on same head; daemon defers until green.
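
A content-level check of the stanza might look like the sketch below. The group name and `cancel-in-progress: false` come from the review; the surrounding workflow text, the helper name `has_concurrency_stanza`, and the choice of exactly these three substrings are assumptions, since the actual assertions in test_triage_queue_workflow_scheduled_apply_has_kill_switch are not quoted here.

```python
# Hypothetical excerpt of the workflow file; only the concurrency block
# reflects values stated in the review.
WORKFLOW_TEXT = """\
name: ce-ops-triage-queue
concurrency:
  group: ce-ops-triage-queue
  cancel-in-progress: false
"""


def has_concurrency_stanza(text):
    """Check that the workflow text pins the serializing concurrency group."""
    required = (
        "concurrency:",
        "group: ce-ops-triage-queue",
        "cancel-in-progress: false",
    )
    return all(fragment in text for fragment in required)
```

Pinning `cancel-in-progress: false` matters as much as the group itself: with `true`, a newer run would cancel an in-flight apply run mid-mutation instead of queueing behind it.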

@ce-overwatch ce-overwatch added this pull request to the merge queue Jul 4, 2026
Merged via the queue into main with commit 0ef200e Jul 4, 2026
3 checks passed
@ce-overwatch ce-overwatch deleted the ce-l3-triage-apply-completion branch July 4, 2026 04:42
3 participants