Add an infra debug optional job #80020

Open
roivaz wants to merge 1 commit into openshift:main from roivaz:aro-hcp-infra-debug

@roivaz
Contributor

@roivaz roivaz commented Jun 3, 2026

Summary by CodeRabbit

This pull request adds an optional debug job to the ARO-HCP CI pipeline that allows infrastructure to persist for debugging purposes without automatic cleanup.

Changes made:

  1. New debug job in CI configuration (Azure-ARO-HCP-main.yaml): Added an optional e2e-parallel-debug-infra job that runs the same aro-hcp-local-e2e workflow as the standard parallel job but configured for debugging. The job targets the ci01 environment with SKIP_DEPROVISION=true, ensuring infrastructure remains available after the test completes for manual investigation.

  2. Skip deprovision support (aro-hcp-deprovision-environment-commands.sh): Enhanced the deprovisioning step to check for a SKIP_DEPROVISION environment variable. When set to "true", the script exits early without performing Azure cleanup operations, preserving the test infrastructure.

  3. Environment variable documentation (aro-hcp-deprovision-environment-ref.yaml): Formally added the SKIP_DEPROVISION boolean parameter to the step definition, allowing it to be passed through the CI pipeline when debugging infrastructure issues.

Practical effect: Developers and CI operators can now optionally run e2e-parallel-debug-infra instead of the standard parallel job to preserve ARO-HCP test infrastructure after execution, enabling post-test debugging without needing to recreate the environment.
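
For illustration, the early-exit check described in item 2 could look roughly like the sketch below. This is not the script from the PR: the function names `should_skip_deprovision` and `deprovision`, the log messages, and the cleanup placeholder are all assumptions. Only the `SKIP_DEPROVISION` variable, its `false` default, and the exit-early behavior come from the summary above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the PR): succeeds when cleanup should be skipped.
# SKIP_DEPROVISION falls back to "false" when unset, matching the declared default.
should_skip_deprovision() {
  [[ "${SKIP_DEPROVISION:-false}" == "true" ]]
}

# Hypothetical entry point (not from the PR): guards the cleanup behind the check.
deprovision() {
  if should_skip_deprovision; then
    # Static message only, with no variable expansion in the log line.
    echo "Skipping deprovision; infrastructure is left in place for debugging."
    return 0
  fi
  # Placeholder: the real Azure cleanup commands are not shown in this conversation.
  echo "Deprovision would run here."
}

deprovision
```

Comparing against the literal string "true" means any other value, including an empty one, falls through to the normal cleanup path, which seems the safer default for a step that frees cloud resources.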

@coderabbitai
Contributor

coderabbitai Bot commented Jun 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b8294206-dec8-4b98-a2e3-b7d9c0200ad0

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd5b08 and 7be7999.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/Azure/ARO-HCP/Azure-ARO-HCP-main-presubmits.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (3)
  • ci-operator/config/Azure/ARO-HCP/Azure-ARO-HCP-main.yaml
  • ci-operator/step-registry/aro-hcp/deprovision/environment/aro-hcp-deprovision-environment-commands.sh
  • ci-operator/step-registry/aro-hcp/deprovision/environment/aro-hcp-deprovision-environment-ref.yaml

Walkthrough

This PR introduces infrastructure debugging support for ARO-HCP CI tests by adding a skip-deprovision mechanism. A new environment variable SKIP_DEPROVISION prevents cleanup operations, an optional CI job e2e-parallel-debug-infra enables this behavior, and the deprovision script checks the variable at startup to exit early.

Changes

Debug Infrastructure Control

| Layer / File(s) | Summary |
| --- | --- |
| Skip-deprovision implementation and CI integration: ci-operator/step-registry/aro-hcp/deprovision/environment/aro-hcp-deprovision-environment-ref.yaml, ci-operator/step-registry/aro-hcp/deprovision/environment/aro-hcp-deprovision-environment-commands.sh, ci-operator/config/Azure/ARO-HCP/Azure-ARO-HCP-main.yaml | Environment variable SKIP_DEPROVISION is declared with default false and documentation; deprovision script checks the variable at startup and exits early when enabled; new optional CI job e2e-parallel-debug-infra sets SKIP_DEPROVISION to true with dev cloud, ci01 environment, and 10-hour timeout to preserve infrastructure for debugging. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested labels

ok-to-test, rehearsals-ack

Suggested reviewers

  • enxebre
  • petr-muller

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a new optional CI job (e2e-parallel-debug-infra) for infrastructure debugging purposes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR contains no Ginkgo test definitions; only CI configuration YAML and shell scripts. Check for Ginkgo test names is not applicable. |
| Test Structure And Quality | ✅ Passed | PR contains only CI configuration (YAML) and shell scripts; no Ginkgo test code exists to review against the custom check criteria. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR; it only modifies CI configuration files and deprovisioning scripts to add an optional debug job. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests are added in this PR. Changes are CI pipeline YAML and shell script deprovisioning configuration only. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR modifies only CI infrastructure config (ci-operator, step-registry), not deployment manifests or operators. Topology-aware scheduling check not applicable. |
| Ote Binary Stdout Contract | ✅ Passed | PR modifies only CI YAML configuration and shell scripts; contains no Go code, OTE binaries, or test extension code subject to the JSON stdout contract requirement. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR adds CI infrastructure configuration and deprovisioning scripts only; no new Ginkgo e2e tests (It/Describe/Context/When) are added, making the IPv6/disconnected compatibility check inapplicable. |
| No-Weak-Crypto | ✅ Passed | No weak cryptography, custom crypto implementations, or non-constant-time comparisons found in PR changes affecting CI configuration and deprovision scripts. |
| Container-Privileges | ✅ Passed | No container privilege escalation indicators found in the PR. The modified files contain CI configuration and shell scripts without privileged Pod specifications. |
| No-Sensitive-Data-In-Logs | ✅ Passed | The new script logs only static messages (no variable expansion), uses --output none for Azure login, and doesn't expose passwords, tokens, keys, PII, or customer data. |

@openshift-ci openshift-ci Bot requested review from deads2k and geoberle June 3, 2026 09:47
@openshift-ci
Contributor

openshift-ci Bot commented Jun 3, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: roivaz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Jun 3, 2026
@openshift-merge-bot
Contributor

[REHEARSALNOTIFIER]
@roivaz: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-Azure-ARO-HCP-main-e2e-parallel-debug-infra | Azure/ARO-HCP | presubmit | Presubmit changed |
| pull-ci-Azure-ARO-HCP-main-e2e-parallel | Azure/ARO-HCP | presubmit | Registry content changed |

Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.
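
As a small aid for composing these comments, the sketch below builds a rehearsal request from a repository and test name. It assumes the `pull-ci-<org>-<repo>-<branch>-<test>` naming that the affected tests listed in this comment appear to follow; the helper names `presubmit_job_name` and `rehearse_comment` are hypothetical and not part of pj-rehearse or any other tool.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: joins org, repo, branch, and test into a presubmit job name,
# assuming the pull-ci-<org>-<repo>-<branch>-<test> pattern seen in this PR.
presubmit_job_name() {
  local org="$1" repo="$2" branch="$3" test_name="$4"
  printf 'pull-ci-%s-%s-%s-%s\n' "$org" "$repo" "$branch" "$test_name"
}

# Hypothetical helper: formats the comment text for one or more specific rehearsals,
# with test names separated by spaces.
rehearse_comment() {
  printf '/pj-rehearse %s\n' "$*"
}

rehearse_comment "$(presubmit_job_name Azure ARO-HCP main e2e-parallel-debug-infra)"
# prints: /pj-rehearse pull-ci-Azure-ARO-HCP-main-e2e-parallel-debug-infra
```

The printed line is the text to paste as a PR comment; posting it is left to the reader.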

@roivaz
Contributor Author

roivaz commented Jun 3, 2026

/pj-rehearse pull-ci-Azure-ARO-HCP-main-e2e-parallel-debug-infra

@openshift-merge-bot
Contributor

@roivaz: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@roivaz
Contributor Author

roivaz commented Jun 3, 2026

/pj-rehearse pull-ci-Azure-ARO-HCP-main-e2e-parallel-debug-infra

@openshift-merge-bot
Contributor

@roivaz: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@roivaz
Contributor Author

roivaz commented Jun 3, 2026

/pj-rehearse pull-ci-Azure-ARO-HCP-main-e2e-parallel-debug-infra

@openshift-merge-bot
Contributor

@roivaz: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci
Contributor

openshift-ci Bot commented Jun 3, 2026

@roivaz: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/Azure/ARO-HCP/main/e2e-parallel-debug-infra | 7be7999 | link | unknown | /pj-rehearse pull-ci-Azure-ARO-HCP-main-e2e-parallel-debug-infra |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
