
fix(gpu): select single CDI GPU defaults #1675

Draft
elezar wants to merge 1 commit into feat/gpu-request-spec from fix/1477-single-cdi-gpu-defaults-elezar


elezar (Member) commented Jun 2, 2026

Summary

Builds on #1156 to implement Docker and Podman CDI GPU selection for resource_requirements.gpu.

Default GPU requests select one concrete NVIDIA CDI device. Requests that set count select that many concrete devices. Explicit device_ids pass through unchanged.
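
The three request modes above can be sketched as follows. This is a minimal illustration, not the code in this PR: the `GpuRequest` enum, `resolve_devices`, and `take_first` are hypothetical names, and the inventory is assumed to be an ordered list of CDI device names such as `nvidia.com/gpu=0`.

```rust
/// Hypothetical request shape; the real spec type in this PR may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum GpuRequest {
    /// No count and no explicit devices were given.
    Default,
    /// The caller asked for a number of GPUs.
    Count(usize),
    /// The caller named devices explicitly.
    DeviceIds(Vec<String>),
}

/// Resolve a request against an ordered inventory of concrete CDI device names.
pub fn resolve_devices(
    request: &GpuRequest,
    inventory: &[String],
) -> Result<Vec<String>, String> {
    match request {
        // Explicit IDs pass through unchanged and are not checked here.
        GpuRequest::DeviceIds(ids) => Ok(ids.clone()),
        // A default request picks exactly one concrete device.
        GpuRequest::Default => take_first(inventory, 1),
        // A count request picks that many concrete devices.
        GpuRequest::Count(n) => take_first(inventory, *n),
    }
}

/// Take the first `count` devices, failing if the inventory is too small.
fn take_first(inventory: &[String], count: usize) -> Result<Vec<String>, String> {
    if count == 0 {
        return Err("GPU count must be at least 1".to_string());
    }
    if inventory.len() < count {
        return Err(format!(
            "requested {count} GPU(s) but only {} CDI device(s) are available",
            inventory.len()
        ));
    }
    Ok(inventory[..count].to_vec())
}
```

In this sketch, a default request never expands to an "all GPUs" device; it always resolves to a single named device from the inventory.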

Related Issue

Closes #1477

Blocked by #1156.

Changes

  • Adds shared CDI inventory and round-robin selection with count-aware checks.
  • Docker uses daemon CDI inventory from DiscoveredDevices.
  • Podman maps local /dev/nvidiaN devices to CDI IDs.
  • Updates GPU docs and optional GPU e2e expectations.
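
As a rough illustration of the first and third bullets, the sketch below shows one way round-robin selection with a count check and a /dev/nvidiaN to CDI ID mapping could look. It is not taken from this PR: `RoundRobin`, `select`, and `cdi_id_for_device_path` are hypothetical names, and the mapping assumes that the N in /dev/nvidiaN matches the index in the `nvidia.com/gpu=N` CDI device name, which may not hold on every system.

```rust
/// Hypothetical round-robin selector over a fixed CDI inventory.
pub struct RoundRobin {
    inventory: Vec<String>,
    /// Index of the next device to hand out.
    next: usize,
}

impl RoundRobin {
    pub fn new(inventory: Vec<String>) -> Self {
        Self { inventory, next: 0 }
    }

    /// Select `count` devices starting at the cursor, wrapping around the inventory.
    /// The count check rejects zero and anything larger than the inventory,
    /// so one selection never contains the same device twice.
    pub fn select(&mut self, count: usize) -> Result<Vec<String>, String> {
        let len = self.inventory.len();
        if count == 0 || count > len {
            return Err(format!(
                "cannot select {count} device(s) from an inventory of {len}"
            ));
        }
        let picked: Vec<String> = (0..count)
            .map(|i| self.inventory[(self.next + i) % len].clone())
            .collect();
        // Advance the cursor so the next request starts after this one.
        self.next = (self.next + count) % len;
        Ok(picked)
    }
}

/// Map a device node such as /dev/nvidia0 to a CDI device name.
/// Assumption: the node index equals the CDI index.
/// Control nodes such as /dev/nvidiactl or /dev/nvidia-uvm return None.
pub fn cdi_id_for_device_path(path: &str) -> Option<String> {
    let index = path.strip_prefix("/dev/nvidia")?;
    if index.is_empty() || !index.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    Some(format!("nvidia.com/gpu={index}"))
}
```

With a three-device inventory, successive calls to `select(1)`, `select(2)`, `select(2)` in this sketch would hand out devices 0, then 1 and 2, then 0 and 1, spreading requests across the inventory instead of always returning the first device.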

Testing

  • cargo check -p openshell-core -p openshell-cli -p openshell-server -p openshell-driver-kubernetes -p openshell-driver-docker -p openshell-driver-podman -p openshell-driver-vm --tests
  • cargo test -p openshell-core -p openshell-cli -p openshell-server -p openshell-driver-kubernetes -p openshell-driver-docker -p openshell-driver-podman -p openshell-driver-vm gpu --tests
  • mise run pre-commit

Checklist

  • Follows Conventional Commits
  • Documentation updated

github-actions Bot commented Jun 2, 2026

elezar marked this pull request as draft June 3, 2026 13:55
copy-pr-bot Bot commented Jun 3, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

elezar changed the base branch from main to feat/gpu-request-spec June 3, 2026 13:58
elezar force-pushed the fix/1477-single-cdi-gpu-defaults-elezar branch from de229a4 to 7fa3cc6 June 3, 2026 14:03

Development

Successfully merging this pull request may close these issues.

fix(gpu): select one CDI GPU by default for Docker and Podman

1 participant