fix(release): unbreak deploy step for 1.118 release #6

Merged
Pterjudin merged 3 commits into main from fix/release-deploy-2026-05-25
May 25, 2026

@Pterjudin

Summary

All three release workflows for the 1.118 release failed on the workflow_dispatch retrigger today (2026-05-24). Same commits succeeded yesterday on push, but yesterday check_cron_or_pr.sh forced SHOULD_DEPLOY=no, so the deploy code paths never ran. Once SHOULD_DEPLOY=yes kicked in, two distinct latent bugs surfaced.
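
To illustrate why a green push run says nothing about the deploy steps, here is a minimal sketch of the gating. It is not the real check_cron_or_pr.sh, whose logic is not shown in this PR: the function name `should_deploy` and the mapping of every event other than workflow_dispatch to "no" are assumptions. Only the two cases described above (push gives no, workflow_dispatch gives yes) come from the observed runs.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the gating decision (assumption, not the real script).
should_deploy() {
  case "$1" in
    workflow_dispatch) echo "yes" ;;  # observed: the manual retrigger deployed
    *)                 echo "no" ;;   # observed for push; assumed for other events
  esac
}

# GITHUB_EVENT_NAME is set by GitHub Actions; default to "push" when run locally.
SHOULD_DEPLOY="$(should_deploy "${GITHUB_EVENT_NAME:-push}")"

if [ "$SHOULD_DEPLOY" = "yes" ]; then
  echo "deploy steps run (signing, release upload)"
else
  echo "deploy steps skipped, so bugs inside them stay hidden"
fi
```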

Nothing was published — there is no 1.118.10001 tag on cortexide-binaries (latest is still 1.106.00910 from 2025-12-29), so there is no tag collision to clean up.

| Run | Platform | Failed step | Root cause |
| --- | --- | --- | --- |
| 26356902456 | macOS x64 + arm64 | Prepare assets | sign.ts import.meta.main is undefined on Node 22 → main() never runs → .app is unsigned → codesign --verify fails with "code object is not signed at all" |
| 26356903381 | Linux x64 + arm64 | Release | gh release upload ... returns HTTP 401: Bad credentials against cortexide-binaries. The RELEASE_GITHUB_TOKEN PAT (last updated 2025-11-05) has expired or been revoked. |
| 26356902896 | Windows x64 + arm64 | Release | Same HTTP 401: Bad credentials against cortexide-binaries. |

4 of 6 failed jobs are the same expired-PAT issue. 2 of 6 are the macOS signing bug.

Fixes in this PR

1. fix(macos-sign) prepare_assets.sh (commit 12e770f), confidence: HIGH

The cortexide IDE's vscode/build/darwin/sign.ts gates main() behind if (import.meta.main). That property exists on Node 23+ behind a flag and Node 24+ unflagged — but the workflow uses actions/setup-node@v4 with node-version: '22.15.1', where it is undefined. Result: main() never runs, the app is never signed, verification fails.

We can't easily patch the cloned IDE source (vscode/ is gitignored here and the existing patches/osx/fix-codesign.patch is already silently failing — see "corrupt patch at brand.patch:254" red-herring in the mac log). The minimal robust fix is to sed the guard to true in prepare_assets.sh immediately before invoking node. This script only ever calls sign.ts as a CLI, never imports it as a module, so unconditionally entering main() is safe. The patch is a no-op once we upgrade past Node 22.
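
The swap described above can be sketched as follows. This is a minimal illustration, not the actual prepare_assets.sh change: the helper name `force_main_guard` and the throwaway demo file are hypothetical, and it assumes the guard is spelled exactly `if (import.meta.main)` in sign.ts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the entry-point guard so main() always runs.
# Assumes the guard is spelled exactly `if (import.meta.main)`.
force_main_guard() {
  local file="$1" tmp rc
  tmp="$(mktemp)" || return 1
  # Go through a temp file, then copy the content back so the original
  # file keeps its permissions.
  sed 's/if (import\.meta\.main)/if (true)/' "$file" > "$tmp" && cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Demo on a throwaway file standing in for vscode/build/darwin/sign.ts.
demo_file="$(mktemp)"
printf '%s\n' 'async function main() {}' 'if (import.meta.main) {' '  main();' '}' > "$demo_file"
force_main_guard "$demo_file"
grep -n 'if (true)' "$demo_file"   # prints: 2:if (true) {
rm -f "$demo_file"
```

Writing through a temporary file sidesteps the differing `sed -i` syntax between GNU sed and the BSD sed shipped with macOS. If the pattern is absent, the file content is left unchanged.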

This was introduced by commit 3d35f14 fix(ci): macOS sign.ts extension; ppc64le node URL via unofficial-builds — switching from sign.js (CommonJS require.main === module) to sign.ts (ESM import.meta.main) without verifying the new guard works on the runner's Node version.

2. fix(release) release.sh (commit 7aaca14), confidence: HIGH

On 401, gh release view body does not contain "release not found", so the regex falls through, we skip release creation, then the upload loop hits 401 on every retry — 10 retries × increasing sleep × per-asset × ~6 asset types per arch = ~50min on Linux and ~1h34min on Windows before exit 1.

Added an explicit auth/permission probe at the top of release.sh that fails fast in seconds with an actionable ::error:: message naming the secret to rotate.
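
The fall-through can be avoided by classifying the `gh release view` output before acting on it. The sketch below is illustrative only: the function names are hypothetical, and the matched strings ("release not found", "HTTP 401", "Bad credentials") are taken from the log messages quoted in this PR rather than from any documented gh contract.

```shell
#!/usr/bin/env bash
# Hypothetical classifier for the combined stdout/stderr of `gh release view`.
classify_release_view() {
  case "$1" in
    *"HTTP 401"*|*"Bad credentials"*) echo "auth" ;;
    *"release not found"*)            echo "missing" ;;
    *)                                echo "other" ;;
  esac
}

# Fail fast on auth errors instead of falling through to the upload loop.
handle_release_view() {
  case "$(classify_release_view "$1")" in
    auth)
      # `::error::` is the GitHub Actions workflow command for an error annotation.
      echo "::error::RELEASE_GITHUB_TOKEN was rejected (HTTP 401); rotate the secret"
      return 1
      ;;
    missing) echo "create release" ;;
    *)       echo "release exists or state unknown" ;;
  esac
}
```

Checking for the auth case first means a 401 can never be mistaken for "the release already exists", which is the path that led into the retry loop.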

3. fix(release) update_version.sh (commit f215da0), confidence: HIGH

Same probe pattern, applied to the versions-repo write step. Catches the same expired-PAT failure before the script hits a deep git push.

REQUIRED MANUAL STEP BEFORE MERGE / RE-DISPATCH

The macOS fix alone won't ship the release — the Linux + Windows failures are a credential problem only the org admin can fix.

You must rotate RELEASE_GITHUB_TOKEN in this repo's Secrets (Settings → Secrets and variables → Actions → Repository secrets → RELEASE_GITHUB_TOKEN).

Required scopes on the new PAT:

  • Classic PAT: repo (full control) — needed for cross-repo release upload and git push to the versions repo.
  • Fine-grained PAT: Contents: Read and write AND Metadata: Read-only, scoped to:
    • OpenCortexIDE/cortexide-binaries
    • OpenCortexIDE/cortexide-versions
  • Set a sensible expiry (e.g. 90 days) and add a calendar reminder to rotate.

Current secret state (from gh secret list -R OpenCortexIDE/cortexide-builder):

  • RELEASE_GITHUB_TOKEN — last updated 2025-11-05 (~6.5 months ago — likely hit its expiry)
  • No STRONGER_GITHUB_TOKEN secret exists (commented-out references in stable-linux.yml are dead refs).

Things intentionally NOT fixed in this PR

  • The stale patches/osx/fix-codesign.patch — it targets an old upstream sign.ts shape and emits a "corrupt patch" warning. It is a red herring for today's failure (build script explicitly treats patch failures as non-fatal). Fixing the patch is a separate cleanup.
  • The dead STRONGER_GITHUB_TOKEN references in .github/workflows/stable-linux.yml lines 399, 491, 584, 621 — they're either inside commented blocks or only reachable on a code path that isn't broken. Best handled in a separate, dedicated cleanup PR.
  • macOS notarytool and xcrun stapler paths in prepare_assets.sh — not exercised in today's failure (signing died before notarization). They may have latent issues, but we won't know until signing works.

Test plan

  • User rotates RELEASE_GITHUB_TOKEN per the manual step above.
  • User confirms the new PAT is in place (gh secret list -R OpenCortexIDE/cortexide-builder should show a recent Updated timestamp).
  • User marks this PR ready for review and merges.
  • User re-dispatches one of the three workflows (suggest macOS first — fastest signal) and confirms the auth probe says + token OK and the signing step now produces a verified signature.
  • On green macOS, dispatch Linux + Windows.

Hard constraints honored

  • No release workflow was re-triggered. Only read-only gh run view / gh api / gh release list / gh secret list calls.
  • No tags were deleted on cortexide-binaries (none needed deletion — no 1.118.10001 tag exists).
  • No changes outside cortexide-builder. The IDE repo (cortexide) and website repo are untouched.
  • No --no-verify commits, no secrets in source, no PAT provisioning attempted.
  • PR is draft; not marked ready, not merged.

🤖 Generated with Claude Code

Tajudeen added 3 commits May 25, 2026 03:56

Commit 12e770f:
sign.ts in the cloned cortexide IDE source gates main() behind
`if (import.meta.main)`. That property is only defined on Node 23+
(behind a flag) and Node 24+ (unflagged). The macOS release workflow
runs on actions/setup-node@v4 with node-version 22.15.1, so the guard
is undefined → falsy → main() never runs → sign() is never called.

The signing step completes in ~150ms (no actual work), the resulting
CortexIDE.app is unsigned, and the subsequent `codesign --verify`
fails with:

  CortexIDE.app: code object is not signed at all

Repro: stable-macos run 26356902456 (both x64 and arm64 jobs).

We can't edit the cloned IDE source from this repo and we don't want
to drop another patch into patches/osx/ that will go stale every time
upstream touches sign.ts (the existing fix-codesign.patch is already
showing "corrupt patch" errors). Instead, sed the guard in place
right before invoking node — it's a single-line, well-targeted swap
and the patch is a no-op once we upgrade past Node 22.

We always invoke this file directly via `node sign.ts <pwd>` and
never import it as a module, so unconditionally running main() is
safe.

Confidence: HIGH — the unsigned-app symptom matches exactly what
`import.meta.main === undefined` would produce.

Commit 7aaca14:
When RELEASE_GITHUB_TOKEN is missing/expired/lacks scope, the existing
flow fails very slowly:

  1. `gh release view ... 2>&1` returns HTTP 401 body, but the regex
     match against "release not found" is false, so we skip release
     creation.
  2. The upload loop runs and gets 401 on the first asset, then sleeps
     15s/30s/45s/.../150s between 10 retries against the same dead
     token before giving up.
  3. Final error message is buried under thousands of lines.
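
For a sense of scale, the backoff in step 2 can be totalled with a few lines of shell. This sketch assumes all ten listed sleeps (15s through 150s, in 15s steps) are taken for a single asset; the helper name is hypothetical and the real loop in release.sh may count retries differently.

```shell
#!/usr/bin/env bash
# Sum a linearly increasing backoff: step, 2*step, ..., retries*step seconds.
total_backoff_seconds() {
  local step="$1" retries="$2" total=0 i=1
  while [ "$i" -le "$retries" ]; do
    total=$((total + i * step))
    i=$((i + 1))
  done
  echo "$total"
}

secs="$(total_backoff_seconds 15 10)"
echo "${secs}s = $((secs / 60))m $((secs % 60))s per asset"   # prints: 825s = 13m 45s per asset
```

Under that assumption a single asset spends almost 14 minutes sleeping against a dead token, so a handful of assets is enough to reach the run times in the repro below.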

Repro: stable-linux 26356903381 (50min wasted), stable-windows
26356902896 (1h34min wasted) — both ultimately HTTP 401 Bad credentials
calling cortexide-binaries.

Add an explicit auth/permission probe at the top of release.sh:

  - If the token can't authenticate at all, emit an actionable
    `::error::` message naming the secret to rotate and exit 1.
  - If the token authenticates but lacks push on the binaries repo,
    say so explicitly and exit 1.

This converts a 50-90min silent failure into a 5-second hard fail
with the operator-facing remediation steps right next to it.

Confidence: HIGH — purely additive probe, can only fail-fast paths
that would already have failed slowly.

Commit f215da0:
update_version.sh runs as a separate workflow step from release.sh
(both guarded by SHOULD_DEPLOY=yes) and writes to
cortexide-versions. The same expired-PAT failure mode applies: the
existing flow would die deep inside `git clone` or `git push` with
a cryptic message after attempting the network round-trip.

Mirror the release.sh probe so both steps surface the same actionable
error pointing at the same secret to rotate.

Confidence: HIGH — additive, fail-fast, no behavior change on the
happy path.
@Pterjudin marked this pull request as ready for review May 25, 2026 03:12
@Pterjudin merged commit 14ee52e into main May 25, 2026
6 checks passed
@Pterjudin deleted the fix/release-deploy-2026-05-25 branch May 25, 2026 03:12