Skip to content

OCPNODE-4443: Add runc upgradeable guard to block upgrades on RHEL 10 streams#5891

Open
bitoku wants to merge 2 commits into
openshift:mainfrom
bitoku:atokubi/runc-upgradeable-guard
Open

OCPNODE-4443: Add runc upgradeable guard to block upgrades on RHEL 10 streams#5891
bitoku wants to merge 2 commits into
openshift:mainfrom
bitoku:atokubi/runc-upgradeable-guard

Conversation

@bitoku

@bitoku bitoku commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Assisted-by: Claude Code https://claude.com/claude-code

- What I did

Block cluster upgrades if any MachineConfigPool targeting a RHEL 10 or CentOS 10 OS image stream has runc configured as the default container runtime. Each pool's effective stream is resolved via the OSImageStream API, so upgraded clusters that remain on RHCOS 9 are not affected.

- How to verify it

e2e, runc cluster upgrade to rhcos 10.

- Description for the changelog

Block cluster upgrades if any MachineConfigPool targeting a RHEL 10 or CentOS 10 OS image stream has runc configured as the default container runtime.

Summary by CodeRabbit

  • New Features

    • Added detection of runc container runtime configuration in machine settings
    • Added validation to prevent runc usage on RHEL 10/CentOS 10 with guidance to migrate to crun
  • Tests

    • Added comprehensive tests for runtime detection and RHEL 10 validation across various configurations

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 28, 2026
@openshift-ci-robot

openshift-ci-robot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

@bitoku: This pull request references OCPNODE-4443 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Assisted-by: Claude Code https://claude.com/claude-code

- What I did

Block cluster upgrades if any MachineConfigPool targeting a RHEL 10 or CentOS 10 OS image stream has runc configured as the default container runtime. Each pool's effective stream is resolved via the OSImageStream API, so upgraded clusters that remain on RHCOS 9 are not affected.

- How to verify it

e2e, runc cluster upgrade to rhcos 10.

- Description for the changelog

Block cluster upgrades if any MachineConfigPool targeting a RHEL 10 or CentOS 10 OS image stream has runc configured as the default container runtime.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Apr 28, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

This pull request adds RHEL 10 compatibility enforcement by preventing the runc container runtime from being used on RHEL 10 and CentOS 10 systems. It introduces runtime detection from CRI-O drop-in configuration files within MachineConfig Ignition content, stream identification helpers, and validation that rejects rendered configurations using runc on RHEL 10 pools.

Changes

RHEL 10 Runtime Validation

Layer / File(s) Summary
CRI-O Runtime Detection Infrastructure
pkg/controller/common/helpers.go
Adds TOML parsing import and DetectRuncInMachineConfig() to extract the effective container runtime from CRI-O drop-in files within Ignition, handling multiple drop-ins with "last wins" semantics and returning the MachineConfig name only if runc is detected.
CRI-O Detection Tests
pkg/controller/common/helpers_test.go
Test suite validating runtime detection across multiple drop-ins, override ordering, gzip compression, and both runc and crun scenarios.
RHEL 10 Stream Identification
pkg/osimagestream/streams.go
Helper function IsRHEL10Stream() to identify RHEL 10 and CentOS 10 OS image streams by name matching.
RHEL 10 Validation in Render Controller
pkg/controller/render/render_controller.go
Integrates validateNoRuncOnRHEL10() into syncGeneratedMachineConfig() to detect and reject runc usage on RHEL 10 pools with a migration-focused error message.
RHEL 10 Validation Tests
pkg/controller/render/render_controller_test.go
Table-driven test covering RHEL 10 validation across runtimes, streams, drop-in orderings, and nil configurations, asserting correct error presence and messaging.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 31.58% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding a guard to block upgrades when runc is detected on RHEL 10 streams, which is reflected throughout the implementation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed This PR uses Go's standard testing package, not Ginkgo. All test names are stable, descriptive strings with no dynamic information, meeting the intended purpose of the check.
Test Structure And Quality ✅ Passed Tests use table-driven pattern with single responsibility per case, use testify assertions properly, create only mock objects (no cleanup needed), and validate error messages appropriately.
Microshift Test Compatibility ✅ Passed No Ginkgo e2e tests added. PR adds standard Go unit tests only. MicroShift check applies to Ginkgo e2e tests, which are not present.
Single Node Openshift (Sno) Test Compatibility ✅ Passed The PR adds only standard Go unit tests (TestDetectRuncInMachineConfig, TestValidateNoRuncOnRHEL10), not Ginkgo e2e tests. The SNO compatibility check applies to e2e tests, not unit tests.
Topology-Aware Scheduling Compatibility ✅ Passed PR contains only validation/detection logic in controller code—no deployment manifests, pod affinity constraints, nodeSelectors, or scheduling assumptions introduced.
Ote Binary Stdout Contract ✅ Passed PR adds helper functions and unit tests with no stdout writes in process-level code, main(), init(), or test suite setup. Code is library/controller code, not OTE binary entry points.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No Ginkgo e2e tests found in PR. New tests are standard Go unit tests without IPv4 assumptions or external connectivity requirements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@bitoku

bitoku commented Apr 28, 2026

Copy link
Copy Markdown
Contributor Author

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 28, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
pkg/operator/status_test.go (1)

1487-1518: Good test coverage for RHEL 9 exemption.

This test case properly verifies that pools on RHEL 9 streams are not blocked even when runc is configured. This is critical for ensuring the upgrade guard only affects RHEL 10 / CentOS 10 targets.

Consider adding a similar test case for CentOS 10 to verify that both osimagestream.StreamNameRHEL10 and osimagestream.StreamNameCentOS10 trigger the upgrade block.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/status_test.go` around lines 1487 - 1518, Add a test parallel to
"runc not blocked on rhel-9 stream" that sets the
OSImageStream.Status.DefaultStream to osimagestream.StreamNameCentOS10 (and
optionally another for osimagestream.StreamNameRHEL10) and asserts that
optr.checkRuncUpgradeableGuard() returns upgradeBlock == true, non-empty
message, and no error; reuse the same Operator setup (mcLister with
rendered-worker config containing the runc drop-in, osImageStreamLister,
fgHandler) and reference the Operator.checkRuncUpgradeableGuard method, the
osImageStreamLister/osisIndexer setup, and mockMCPLister/mockMCLister used in
the existing test to locate where to add the new test(s).
pkg/operator/status.go (1)

506-514: Consider whether TOML parse failures should be non-fatal.

Currently, a malformed TOML drop-in in any pool's rendered MachineConfig will cause the entire detection to fail, potentially preventing the operator from evaluating runc status for all pools.

If the TOML is malformed due to a user-provided MachineConfig, consider logging a warning and continuing rather than returning an error. However, if you expect all drop-ins to be well-formed since they're generated by the controller, the current behavior is appropriate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/status.go` around lines 506 - 514, The TOML decoding failure
currently returns an error and aborts detection; change this to log a warning
and continue so one malformed CRI-O drop-in doesn't fail the whole routine: in
the block that decodes into crioRuntimeConfig using
toml.NewDecoder(bytes.NewReader(fileData)).Decode(&cfg), replace the return of
fmt.Errorf(...) with a non-fatal log (e.g., klog.Warningf or the package's
logger) that includes path and mc.Name and the decode error, then skip using cfg
for that drop-in and continue processing to allow effectiveRuntime evaluation
from other drop-ins; keep the existing behavior (return error) only if you
determine the drop-ins should always be trusted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/operator/status_test.go`:
- Around line 1487-1518: Add a test parallel to "runc not blocked on rhel-9
stream" that sets the OSImageStream.Status.DefaultStream to
osimagestream.StreamNameCentOS10 (and optionally another for
osimagestream.StreamNameRHEL10) and asserts that
optr.checkRuncUpgradeableGuard() returns upgradeBlock == true, non-empty
message, and no error; reuse the same Operator setup (mcLister with
rendered-worker config containing the runc drop-in, osImageStreamLister,
fgHandler) and reference the Operator.checkRuncUpgradeableGuard method, the
osImageStreamLister/osisIndexer setup, and mockMCPLister/mockMCLister used in
the existing test to locate where to add the new test(s).

In `@pkg/operator/status.go`:
- Around line 506-514: The TOML decoding failure currently returns an error and
aborts detection; change this to log a warning and continue so one malformed
CRI-O drop-in doesn't fail the whole routine: in the block that decodes into
crioRuntimeConfig using toml.NewDecoder(bytes.NewReader(fileData)).Decode(&cfg),
replace the return of fmt.Errorf(...) with a non-fatal log (e.g., klog.Warningf
or the package's logger) that includes path and mc.Name and the decode error,
then skip using cfg for that drop-in and continue processing to allow
effectiveRuntime evaluation from other drop-ins; keep the existing behavior
(return error) only if you determine the drop-ins should always be trusted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 326ee946-c807-400a-9eb7-82bffd5370e1

📥 Commits

Reviewing files that changed from the base of the PR and between e0916a2 and 5a0e6e8.

📒 Files selected for processing (2)
  • pkg/operator/status.go
  • pkg/operator/status_test.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/operator/status.go (1)

393-412: ⚠️ Potential issue | 🔴 Critical

The claimed expansion of runc detection logic is not implemented.

The AI summary states that cfeEvalRunc was expanded to "evaluate rendered MachineConfig Ignition data for CRI-O drop-ins" and consider OS image streams. However, the actual implementation only checks ContainerRuntimeConfig resources for DefaultRuntime == runc.

While DetectRuncInMachineConfig exists in pkg/controller/common/helpers.go and validateNoRuncOnRHEL10 exists in pkg/controller/render/render_controller.go, neither is called by cfeEvalRunc. The current implementation will flag any ContainerRuntimeConfig with runc regardless of target OS stream.

If the intent is to also detect runc from rendered MachineConfig CRI-O drop-ins and only flag runc as an evaluation concern for RHEL 10/CentOS 10 pools, this logic is missing from the status.go implementation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/operator/status.go` around lines 393 - 412, cfeEvalRunc currently only
checks ContainerRuntimeConfig.DefaultRuntime for runc but must also inspect
rendered MachineConfig CRI-O drop-ins and restrict findings to RHEL10/CentOS10
pools; update cfeEvalRunc to (1) keep the existing crcLister loop for
ContainerRuntimeConfig, (2) iterate rendered MachineConfigs (use
DetectRuncInMachineConfig from pkg/controller/common/helpers.go) to detect runc
in CRI-O drop-ins, and (3) apply the same OS-stream restriction/logic used by
validateNoRuncOnRHEL10 (call or mirror that check) so the function only returns
true for pools targeting RHEL10/CentOS10; reference cfeEvalRunc,
DetectRuncInMachineConfig, validateNoRuncOnRHEL10, ContainerRuntimeConfig and
rendered MachineConfig resources when making the changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pkg/operator/status.go`:
- Around line 393-412: cfeEvalRunc currently only checks
ContainerRuntimeConfig.DefaultRuntime for runc but must also inspect rendered
MachineConfig CRI-O drop-ins and restrict findings to RHEL10/CentOS10 pools;
update cfeEvalRunc to (1) keep the existing crcLister loop for
ContainerRuntimeConfig, (2) iterate rendered MachineConfigs (use
DetectRuncInMachineConfig from pkg/controller/common/helpers.go) to detect runc
in CRI-O drop-ins, and (3) apply the same OS-stream restriction/logic used by
validateNoRuncOnRHEL10 (call or mirror that check) so the function only returns
true for pools targeting RHEL10/CentOS10; reference cfeEvalRunc,
DetectRuncInMachineConfig, validateNoRuncOnRHEL10, ContainerRuntimeConfig and
rendered MachineConfig resources when making the changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d132fa61-d68c-42df-87d4-3cfa7203dbea

📥 Commits

Reviewing files that changed from the base of the PR and between 5a0e6e8 and 5e4727d.

📒 Files selected for processing (7)
  • pkg/controller/common/helpers.go
  • pkg/controller/common/helpers_test.go
  • pkg/controller/render/render_controller.go
  • pkg/controller/render/render_controller_test.go
  • pkg/operator/status.go
  • pkg/operator/status_test.go
  • pkg/osimagestream/streams.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/controller/render/render_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/operator/status_test.go

@bitoku bitoku force-pushed the atokubi/runc-upgradeable-guard branch from 5e4727d to 73f6f26 Compare May 11, 2026 16:01
@bitoku

bitoku commented May 11, 2026

Copy link
Copy Markdown
Contributor Author

/verified by @bitoku

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 11, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@bitoku: This PR has been marked as verified by @bitoku.

Details

In response to this:

/verified by @bitoku

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@bitoku

bitoku commented May 11, 2026

Copy link
Copy Markdown
Contributor Author

/unhold

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 11, 2026
@openshift-ci-robot

openshift-ci-robot commented May 11, 2026

Copy link
Copy Markdown
Contributor

@bitoku: This pull request references OCPNODE-4443 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Assisted-by: Claude Code https://claude.com/claude-code

- What I did

Block cluster upgrades if any MachineConfigPool targeting a RHEL 10 or CentOS 10 OS image stream has runc configured as the default container runtime. Each pool's effective stream is resolved via the OSImageStream API, so upgraded clusters that remain on RHCOS 9 are not affected.

- How to verify it

e2e, runc cluster upgrade to rhcos 10.

- Description for the changelog

Block cluster upgrades if any MachineConfigPool targeting a RHEL 10 or CentOS 10 OS image stream has runc configured as the default container runtime.

Summary by CodeRabbit

  • New Features

  • Cluster upgrades are blocked when runc is detected as the effective container runtime for RHEL 10 / CentOS 10 node pools, with guidance to migrate to crun.

  • Runtime detection now considers rendered CRI-O drop-in configs and pool-specific OS image streams.

  • Added an exported helper to identify RHEL 10 / CentOS 10 image streams.

  • Tests

  • Added unit tests covering runtime detection, drop-in override ordering, and the upgrade-gating validation.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/controller/common/helpers.go`:
- Around line 1538-1559: GetIgnitionFileDataByPath currently returns raw bytes
which may be gzip-compressed; before attempting toml.Decode you must detect and
decompress Ignition gzip payloads so TOML parsing and runc detection won't fail.
In the loop that reads fileData (after calling GetIgnitionFileDataByPath and
before toml.NewDecoder(...).Decode(&cfg)), check for gzip magic bytes
(0x1f,0x8b) and, if present, create a gzip.Reader to decompress fileData into a
new byte slice and use that decompressed data for toml decoding; keep using the
same crioRuntimeConfig and error wrapping semantics if decompression or decoding
fails.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 59d53499-18c5-4cef-983c-09366c3b72d5

📥 Commits

Reviewing files that changed from the base of the PR and between 5e4727d and 73f6f26.

📒 Files selected for processing (7)
  • pkg/controller/common/helpers.go
  • pkg/controller/common/helpers_test.go
  • pkg/controller/render/render_controller.go
  • pkg/controller/render/render_controller_test.go
  • pkg/operator/status.go
  • pkg/operator/status_test.go
  • pkg/osimagestream/streams.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • pkg/osimagestream/streams.go
  • pkg/controller/render/render_controller_test.go
  • pkg/controller/common/helpers_test.go
  • pkg/operator/status.go
  • pkg/controller/render/render_controller.go

Comment thread pkg/controller/common/helpers.go Outdated
@bitoku bitoku force-pushed the atokubi/runc-upgradeable-guard branch from 73f6f26 to 0fd053c Compare May 11, 2026 16:17
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label May 11, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/controller/common/helpers_test.go (1)

1955-1968: ⚡ Quick win

Add a reversed-input ordering case to lock in “lexical last drop-in wins.”

Current override tests pass files in both insertion and lexical order, so an implementation bug that relies on slice order could still pass. Add one case with reversed file order to pin the expected precedence.

Proposed test case
 		{
 			name: "crun overridden by runc in later drop-in",
 			mc: helpers.NewMachineConfig("test-runc-override", nil, "", []ign3types.File{
 				helpers.CreateEncodedIgn3File("/etc/crio/crio.conf.d/00-default", makeCRIODropIn("crun"), 0644),
 				helpers.CreateEncodedIgn3File("/etc/crio/crio.conf.d/01-ctrcfg", makeCRIODropIn("runc"), 0644),
 			}),
 			expectedMC: "test-runc-override",
 		},
+		{
+			name: "lexical order wins even if input slice is reversed",
+			mc: helpers.NewMachineConfig("test-runc-lexical", nil, "", []ign3types.File{
+				helpers.CreateEncodedIgn3File("/etc/crio/crio.conf.d/01-ctrcfg", makeCRIODropIn("runc"), 0644),
+				helpers.CreateEncodedIgn3File("/etc/crio/crio.conf.d/00-default", makeCRIODropIn("crun"), 0644),
+			}),
+			expectedMC: "test-runc-lexical",
+		},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/common/helpers_test.go` around lines 1955 - 1968, The test
suite in helpers_test.go relies on file insertion order and doesn't assert
lexical-last-wins for CRI-O drop-ins; add a new test case that uses
helpers.NewMachineConfig and helpers.CreateEncodedIgn3File with the same two
drop-in contents from makeCRIODropIn("runc") and makeCRIODropIn("crun") but
provide the files in reversed slice order (e.g., put the "01-ctrcfg" entry
before "00-default") and assert expectedMC equals the name that should win under
lexical ordering (same as the earlier test where lexical last wins); place this
new case alongside the existing "runc overridden by crun..." and "crun
overridden by runc..." cases so the test verifies precedence is determined by
filename, not insertion order.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/controller/common/helpers_test.go`:
- Around line 1955-1968: The test suite in helpers_test.go relies on file
insertion order and doesn't assert lexical-last-wins for CRI-O drop-ins; add a
new test case that uses helpers.NewMachineConfig and
helpers.CreateEncodedIgn3File with the same two drop-in contents from
makeCRIODropIn("runc") and makeCRIODropIn("crun") but provide the files in
reversed slice order (e.g., put the "01-ctrcfg" entry before "00-default") and
assert expectedMC equals the name that should win under lexical ordering (same
as the earlier test where lexical last wins); place this new case alongside the
existing "runc overridden by crun..." and "crun overridden by runc..." cases so
the test verifies precedence is determined by filename, not insertion order.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2d6683e1-1475-4dc9-9684-b7ea885c3b8c

📥 Commits

Reviewing files that changed from the base of the PR and between 73f6f26 and 0fd053c.

📒 Files selected for processing (7)
  • pkg/controller/common/helpers.go
  • pkg/controller/common/helpers_test.go
  • pkg/controller/render/render_controller.go
  • pkg/controller/render/render_controller_test.go
  • pkg/operator/status.go
  • pkg/operator/status_test.go
  • pkg/osimagestream/streams.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • pkg/controller/render/render_controller.go
  • pkg/controller/common/helpers.go
  • pkg/controller/render/render_controller_test.go
  • pkg/osimagestream/streams.go
  • pkg/operator/status.go
  • pkg/operator/status_test.go

@cpmeadors

Copy link
Copy Markdown

/cc @asahay19

@openshift-ci openshift-ci Bot requested a review from asahay19 May 11, 2026 19:16
@bitoku

bitoku commented May 12, 2026

Copy link
Copy Markdown
Contributor Author

/pipeline required

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aws-ovn
/test e2e-aws-ovn-upgrade
/test e2e-gcp-op-ocl-part1
/test e2e-gcp-op-ocl-part2
/test e2e-gcp-op-part1
/test e2e-gcp-op-part2
/test e2e-gcp-op-single-node
/test e2e-hypershift

@bitoku

bitoku commented May 12, 2026

Copy link
Copy Markdown
Contributor Author

/retest

// effective default container runtime is runc. Drop-ins are processed in
// lexicographic order (matching CRI-O's layering behavior) so the last file's
// default_runtime wins.
func DetectRuncInMachineConfig(mc *mcfgv1.MachineConfig) (string, error) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to do this, but we do have the potential optimization of just checking the paths of the MC. We know what the file name is when we use ctrcfg, as well as what it is when it was an upgraded drop in. technically, they can use MC to drop in a file, but that's unsupported from node team perspective so if they break themselves... 🤷 .

I don't think this needs optimizing per-se, but it does make clear what cases we're supporting and may cut down the code a bit

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes complete sense, but I'd still like to keep this check.
Even if it's unsupported, user complaints will ultimately fall on us to handle.
We can just be more explicit in the comments about what is and isn't supported.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this as is as well, technically, if we want to be absolutely sure, we'd be checking the on-disk crio config per node to see if anything is using runc. Practically speaking that's probably overkill.

Out of curiosity, if the user does bypass this check (let's say, they overrode /etc/crio/crio.conf instead of using a dropin), what would happen to the node post upgrade to R10? Does crio fail upon reboot?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRI-O fails upon reboot, yes, so I guess the node stalls NotReady.
It's unsupported operation, and not going to check that though.

@rphillips

Copy link
Copy Markdown
Contributor

Are we modifying the Cluster Fleet Evaluations? I'm not seeing new code in this area. By enabling a new check in the CFE code path it will set MCO as not upgradeable.

@bitoku

bitoku commented May 20, 2026

Copy link
Copy Markdown
Contributor Author

@rphillips

Are we modifying the Cluster Fleet Evaluations? I'm not seeing new code in this area. By enabling a new check in the CFE code path it will set MCO as not upgradeable.

Ah thanks, no we aren't.
I misunderstood how this works. I thought it just flags, and didn't understand it sets upgradeable=false.
That's not what I meant. I'll revert it.

Sorry I read it wrong.
We don't want to block MCO upgrade or OCP upgrade, they should be allowed until runc is removed.
What I want to block is to upgrade to RHCOS10 with runc. They are both configured per pool.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aws-ovn
/test e2e-aws-ovn-upgrade
/test e2e-gcp-op-ocl-part1
/test e2e-gcp-op-ocl-part2
/test e2e-gcp-op-part1
/test e2e-gcp-op-part2
/test e2e-gcp-op-single-node
/test e2e-hypershift

@bitoku

bitoku commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-gcp-ovn openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

@bitoku

bitoku commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/de265370-633b-11f1-9c47-870e6f6dcba6-0

@bitoku

bitoku commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/67ece2a0-634f-11f1-8aab-94564898c66c-0

@bitoku

bitoku commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2 periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

@bitoku: An error was encountered. No known errors were detected, please see the full error message for details.

Full error message. unable to get additional pr info from string: periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2: string: periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 doesn't match expected format: org/repo#number

Please contact an administrator to resolve this issue.

@bitoku

bitoku commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b9a6bab0-63dd-11f1-828a-621d5ba8e722-0

@bitoku

bitoku commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c1104ff0-63dd-11f1-9f15-5905ea882eb2-0

bitoku and others added 2 commits June 9, 2026 13:29
Add render-time validation that rejects generated MachineConfigs when
runc is set as the default container runtime and the pool targets a
RHEL 10 or CentOS 10 OS image stream, since runc is not available on
RHCOS 10. Introduce DetectRuncInMachineConfig() to parse CRI-O drop-in
files from Ignition configs and determine the effective default runtime
by processing files in lexicographic order. Add IsRHEL10Stream() helper
to identify RHEL 10 / CentOS 10 streams.

Assisted-by: Claude Code <https://claude.com/claude-code>
Move validateNoRuncOnRHEL10 into generateRenderedMachineConfig so all
callers are protected, including RunBootstrap.

Co-authored-by: Ryan Phillips <rphillips@redhat.com>
Assisted-by: Claude Code <https://claude.com/claude-code>
@bitoku bitoku force-pushed the atokubi/runc-upgradeable-guard branch from af088bb to 50a5088 Compare June 9, 2026 13:30
@bitoku

bitoku commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8921abf0-6407-11f1-9686-c8cdaefe3b6e-0

@bitoku

bitoku commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

/retest-required

@bitoku

bitoku commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

@bitoku: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

@bitoku

bitoku commented Jun 10, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/38c35e00-64b2-11f1-8870-ba2c7e80b8b9-0

@bitoku

bitoku commented Jun 10, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1af69240-64d3-11f1-991c-fc8c3045c5fc-0

@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@bitoku: This PR was included in a payload test run from openshift/origin#31266
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/289b6a20-657c-11f1-8b2c-d2cc517bfd33-0

@bitoku

bitoku commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/324d1dc0-6644-11f1-9791-c9a6e2bd4d91-0

@bitoku

bitoku commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2 openshift/origin#31266

@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b8439fe0-665c-11f1-8102-7cdaa8f39c7e-0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants