test(qemu): replace Fedora Cloud image with Alpine Linux tiny image by mangelajo · Pull Request #809 · jumpstarter-dev/jumpstarter

mangelajo · 2026-06-18T10:35:51Z

Motivation

Fedora Cloud qcow2 images used in QEMU driver tests weigh ~556 MB (x86_64) + ~519 MB (aarch64) = ~1.1 GB total, taking ~1 minute to download on every CI run. Alpine Linux 3.22.4 UEFI tiny images reduce that to ~278 MB total — a 74% reduction — and download in ~6 s on GitHub Actions runners, making a cache unnecessary.

	Fedora 43	Alpine 3.22.4 tiny	Reduction
x86_64	556 MB	127 MB	77%
aarch64	519 MB	151 MB	71%
Total	~1,076 MB	~278 MB	74%

Alpine's nocloud tiny images support UEFI boot on both architectures and use tiny-cloud (a minimal cloud-init alternative) that reads the standard NoCloud CIDATA vfat volume the QEMU driver already generates.

Driver changes (`driver.py`)

tiny-cloud differs from full cloud-init in two ways:

Hostname: tiny-cloud reads @hostname from the YAML key hostname in meta-data, not local-hostname. Add hostname alongside local-hostname so both implementations set it correctly.
Password: tiny-cloud ignores plain_text_passwd in the users stanza. Add a runcmd entry to set the password via chpasswd. Credentials are escaped with shlex.quote() so special characters are safe. Full cloud-init images are unaffected.

hostport=0 support: Hostfwd now allows hostport=0, which tells QEMU's user-mode networking to pick a free port automatically. After QEMU starts, the driver queries the actual assigned port via QMP human-monitor-command info usernet, logs it at INFO level:

hostfwd 'ssh': resolved port 0 -> 127.0.0.1:<port> (guest port 22)

and stores it in _resolved_hostports so get_hostfwd_port() can return it to the client side.

Client changes (`client.py`)

shell() now checks for an ssh hostfwd entry by calling get_hostfwd_port('ssh'). If present, it opens a direct fabric.Connection to 127.0.0.1:<resolved_port>, bypassing the jumpstarter streaming layer entirely. If no ssh hostfwd is configured (KeyError), it falls back to the original FabricAdapter path over vsock.

This is necessary because Alpine uses OpenRC, not systemd, so systemd-ssh-generator never runs and sshd only listens on TCP. Fedora cloud images ship systemd-ssh-generator (systemd ≥ 256) which automatically binds sshd to AF_VSOCK port 22 inside VMs — that is why the original vsock path worked with Fedora but not Alpine.

Test changes (`driver_test.py`)

Pass hostfwd={'ssh': {'hostport': 0, 'guestport': 22}} so qemu.shell() uses the TCP path with a QEMU-assigned port; no probe socket needed.
Wait for bootstrap_complete: done on the serial console before attempting login, guaranteeing tiny-cloud has set the password and sshd is ready (avoids the race that existed with Fedora too, where the extra setenforce console round-trip happened to provide enough delay).
Press Enter if the GRUB countdown appears to skip the 10 s wait.
Update post-login prompt from bash [user@host ~]$ → Alpine ash host:~$.
Remove sudo setenforce 0 (no SELinux on Alpine).
Relax uname -r assertion to a non-empty check.

CI changes (`python-tests.yaml`)

Remove the actions/cache steps entirely. Alpine images download in ~6 s on GitHub runners (measured from CI logs), which is faster than cache restore overhead for files this size.

coderabbitai · 2026-06-18T10:37:42Z

Warning

Review limit reached

@mangelajo, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 18 minutes and 36 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 720f4279-6480-428a-ab0a-72658840bcf0

📥 Commits

Reviewing files that changed from the base of the PR and between 83153d4 and 9a25775.

📒 Files selected for processing (4)

.github/workflows/python-tests.yaml
python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py
python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py
python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver_test.py

📝 Walkthrough

Walkthrough

Migrates the QEMU integration test from Fedora Cloud to Alpine 3.22.4 UEFI tiny while introducing dynamic SSH forwarding port resolution. The driver's cidata() gains hostname and password-initialization support via chpasswd runcmd for Alpine compatibility. Host port resolution is added via QEMU info usernet parsing, allowing hostport=0 to request ephemeral port assignment. The SSH client switches to direct TCP connections using resolved ports. The test uses port-0 ephemeral forwarding, Alpine images with cache fallback, and GRUB-aware console automation. The CI workflow downloads Alpine artifacts.

Changes

Alpine Migration with Dynamic SSH Port Resolution

Layer / File(s)	Summary
Hostfwd port-0 support and dynamic resolution `python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py`	`Hostfwd.hostport` validation relaxed to allow `0` for ephemeral assignment. `Qemu._resolved_hostports` dict field added to store actual assigned ports. `QemuPower.on()` parses QEMU `info usernet` to resolve port-0 entries. `Qemu.get_hostfwd_port()` consults resolved mapping before configured values.
Cloud-init Alpine compatibility: hostname and chpasswd runcmd `python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py`	`shlex` import added for safe credential quoting. `cidata()` meta-data gains `hostname` key from `self.hostname`. `user-data` gains `runcmd` that pipes `username:password` into `chpasswd` to initialize user password at boot for Alpine cloud-init.
SSH client connection via direct TCP forwarding `python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py`	`fabric.Connection` imported. `QemuClient.shell()` prefers direct TCP connection to `127.0.0.1:<port>` after resolving SSH hostfwd port; falls back to `FabricAdapter` tunneling on `KeyError`.
Test: Alpine image and ephemeral SSH port `python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver_test.py`	Configures `hostfwd` with `hostport=0` for ephemeral SSH port selection. Switches flashed image from Fedora Cloud qcow2 to Nocloud Alpine 3.22.4 qcow2 with cached-image-first fallback to CDN download. Adds GRUB countdown detection (sends Enter on "automatically in …"); removes SELinux `setenforce 0` and exact kernel version assertion; replaces with non-empty `uname -r` validation after login.
CI: Alpine cloud image caching and download `.github/workflows/python-tests.yaml`	Replaces Fedora Cloud cache/download steps with Alpine `3.22.4-uefi-tiny` cache key and downloads `nocloud_alpine-3.22.4-${arch}-uefi-tiny-r0.qcow2` for `aarch64` and `x86_64` from Alpine CDN; cache save excluded from pull requests.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

jumpstarter-dev/jumpstarter#321: Originally introduced the QEMU driver and integration test scaffolding that this PR extends with hostfwd port resolution and Alpine cloud-init compatibility.
jumpstarter-dev/jumpstarter#402: Previously implemented hostfwd support; this PR extends it by allowing port-0 and adding runtime port resolution from QEMU's usernet state.

Poem

🐇 From Fedora's shore I bid farewell,
To Alpine's peaks where port-zero dwell.
Through GRUB's countdown, a keypress sent,
With fabric's TCP, SSH's ascent.
chpasswd whispers the password's song,
And dynamic ports guide users along! 🏔️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: replacing Fedora Cloud images with Alpine Linux tiny images in QEMU driver tests.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, explaining motivation, image replacement details, driver changes, test changes, and SSH connectivity updates.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch use-alpine-image-for-qemu-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/python-tests.yaml:
- Around line 123-129: The combined actions/cache@v5 action for the Alpine cloud
images step creates a cache poisoning risk because it saves artifacts from pull
request events to a shared static cache key that's used across all branches.
Split this into two separate steps: one using actions/cache/restore@v5 that runs
on all events including pull requests, and another using actions/cache/save@v5
that only runs on trusted events (add a condition like if: github.event_name !=
'pull_request'). Apply this to the cache step with the key
alpine-3.22.4-uefi-tiny and path python/packages/jumpstarter-driver-qemu/images
to ensure the cache is restored in PRs but only saved by trusted workflows.

In
`@python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver_test.py`:
- Around line 64-69: The current port allocation in the socket.bind() call at
lines 66-68 creates a race condition where the port is released immediately
after reservation, allowing another process to claim it before QEMU attempts to
bind for hostfwd. Implement a retry mechanism that attempts to reserve a port
with socket.bind() and then immediately use it with QEMU; if the bind fails due
to address already in use, catch the exception and retry with a new port
allocation. Repeat this process until a port is successfully reserved and used
without collision.

In `@python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py`:
- Around line 536-538: The `runcmd` entry is constructing a shell command by
directly interpolating `self.username` and `self.password` into a single-quoted
string, which can be broken or exploited if these values contain single quotes,
newlines, or other special characters. Use Python's `shlex.quote()` function to
properly escape both `self.username` and `self.password` values before inserting
them into the shell command string in the runcmd list. This ensures that any
special characters in the credentials are safely handled as literal values
rather than being interpreted as shell syntax.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1c4c2088-1fb5-49cd-8495-779f5d8a927e

📥 Commits

Reviewing files that changed from the base of the PR and between ada5fa8 and 23a0c05.

📒 Files selected for processing (3)

.github/workflows/python-tests.yaml
python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py
python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver_test.py

raballew

On a high level, I don't like hardcoding this in so many different places plus we are completely missing out the renovate process here. I think we will need to ensure that this image gets constantly updated as well.

@hostname

## Motivation Fedora Cloud qcow2 images used in QEMU driver tests weigh ~556 MB (x86_64) + ~519 MB (aarch64) = ~1.1 GB total, taking ~1 minute to download on every CI run. Alpine Linux 3.22.4 UEFI tiny images reduce that to ~127 MB + ~151 MB = ~278 MB — a 74% reduction — and download in ~6 s on GitHub Actions runners, making a cache unnecessary. ## Image replacement x86_64: 556 MB → 127 MB (77%) aarch64: 519 MB → 151 MB (71%) Total: ~1076 MB → ~278 MB (74%) Alpine's nocloud tiny images support UEFI boot on both architectures and use tiny-cloud (a minimal cloud-init alternative) that reads the standard NoCloud CIDATA vfat volume the QEMU driver already generates. ## Driver changes (driver.py) tiny-cloud differs from full cloud-init in two ways: 1. Hostname: tiny-cloud reads @hostname from the YAML key 'hostname' in meta-data, not 'local-hostname'. Add 'hostname' alongside 'local-hostname' so both implementations set it correctly. 2. Password: tiny-cloud ignores plain_text_passwd in the users stanza. Add a runcmd entry to set the password via chpasswd. Credentials are escaped with shlex.quote() so special characters are safe. Full cloud-init images are unaffected. hostport=0 support: Hostfwd now allows hostport=0, which tells QEMU's user-mode networking to pick a free port automatically. After QEMU starts, the driver queries the actual assigned port via QMP 'human-monitor-command info usernet', logs it at INFO level: hostfwd 'ssh': resolved port 0 -> 127.0.0.1:<port> (guest port 22) and stores it in _resolved_hostports so get_hostfwd_port() can return it to the client side. ## Client changes (client.py) shell() now checks for an 'ssh' hostfwd entry by calling get_hostfwd_port('ssh'). If present, it opens a direct fabric Connection to 127.0.0.1:<resolved_port>, bypassing the jumpstarter streaming layer entirely. If no ssh hostfwd is configured (KeyError), it falls back to the original FabricAdapter path over vsock. This is necessary because Alpine uses OpenRC, not systemd, so systemd-ssh-generator never runs and sshd only listens on TCP. Fedora cloud images ship systemd-ssh-generator (systemd >= 256) which automatically binds sshd to AF_VSOCK port 22 inside VMs — that is why the original vsock path worked with Fedora but not Alpine. ## Test changes (driver_test.py) - Pass hostfwd={'ssh': {hostport=0, guestport=22}} so qemu.shell() uses the TCP path with a QEMU-assigned port; no probe socket needed. - Wait for 'bootstrap_complete: done' on the serial console before attempting login, guaranteeing tiny-cloud has set the password and sshd is ready (avoids the race that existed with Fedora too, where the extra setenforce console round-trip happened to provide enough delay). - Press Enter if the GRUB countdown appears to skip the 10 s wait. - Update post-login prompt from bash '[user@host ~]$' to Alpine ash 'host:~$'. - Remove 'sudo setenforce 0' (no SELinux on Alpine). - Relax 'uname -r' assertion to a non-empty check. ## CI changes (python-tests.yaml) Remove the actions/cache steps entirely. Alpine images download in ~6 s on GitHub runners (measured), which is faster than cache restore overhead for files this size.

mangelajo · 2026-06-18T12:38:01Z

On a high level, I don't like hardcoding this in so many different places plus we are completely missing out the renovate process here. I think we will need to ensure that this image gets constantly updated as well.

no real need for this, this does not go into production or anything, is just an image that we boot to run the tests. But that's an existing problem from the past, nothing new.

mangelajo · 2026-06-18T12:38:28Z

        run: |
          brew install renode/tap/renode

-      - name: Cache Fedora Cloud images


the cache was actually slower than the download.

mangelajo · 2026-06-18T12:39:14Z

+        # If the driver has an 'ssh' hostfwd entry, fetch the actual host port
+        # (resolving any port=0 assignment) and connect directly over TCP.
+        # Otherwise fall back to tunnelling through the jumpstarter stream (vsock).
+        try:


we had to implement this because alpine does not setup ssh over vhost, and we didn't want to pick a port that could potentially conflict with whatever is on the testing host.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py`:
- Around line 82-97: The try/except block in the context manager is too broad
and will catch KeyError exceptions raised by the caller's code inside the with
block, causing unintended fallback behavior. Restructure this using
try/except/else pattern: wrap only the self.call("get_hostfwd_port", "ssh") call
in the try block, move the Connection creation and yield statement to an else
block (which executes only on success), and keep the FabricAdapter fallback in
the except KeyError block. This ensures that only KeyError from the
get_hostfwd_port call triggers the fallback, while caller errors are properly
propagated.
- Line 7: The hardcoded Connection(host="127.0.0.1") usage in the SSH shell
connection block (around lines 84-90) ignores the custom Hostfwd.hostaddr
setting and breaks for non-local clients. Replace this hardcoded approach by
resolving into the ssh child which is a TcpNetwork instance at line 489 that
already contains the correct hostaddr and port configuration. Reference the
FabricAdapter fallback pattern at lines 92-97 as an example of the correct
approach for properly tunneling through jumpstarter's stream instead of creating
direct connections to 127.0.0.1.

In `@python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py`:
- Around line 385-403: After the for loop that processes usernet lines in the
hostfwd resolution block, add validation to ensure all zero-port entries were
successfully resolved. Create a set of unresolved ports by checking which
entries in zero_fwds are not present in self.parent._resolved_hostports, and
raise a RuntimeError with the list of missing keys if any are found.
Additionally, in the get_hostfwd_port() method (not shown in this diff), add a
check that raises RuntimeError if the hostport value equals 0, preventing silent
failures when unresolved ports are returned to callers.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6d8109b7-9f5f-4cf2-bbc8-e2b176dad19a

📥 Commits

Reviewing files that changed from the base of the PR and between fcc4776 and 83153d4.

📒 Files selected for processing (4)

.github/workflows/python-tests.yaml
python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py
python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py
python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver_test.py

💤 Files with no reviewable changes (1)

.github/workflows/python-tests.yaml

🚧 Files skipped from review as they are similar to previous changes (1)

python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver_test.py

coderabbitai · 2026-06-18T12:41:59Z

 from contextlib import contextmanager

 import click
+from fabric import Connection


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd jumpstarter-driver-qemu -type d

Repository: jumpstarter-dev/jumpstarter

Length of output: 242

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py | head -100

Repository: jumpstarter-dev/jumpstarter

Length of output: 4055

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py | tail -20

Repository: jumpstarter-dev/jumpstarter

Length of output: 913

🏁 Script executed:

grep -n "Connection\|127.0.0.1\|Hostfwd" python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 172

🏁 Script executed:

fd jumpstarter-driver-network -type d

Repository: jumpstarter-dev/jumpstarter

Length of output: 242

🏁 Script executed:

find . -name "*jumpstarter-driver-network*" -type d 2>/dev/null | head -5

Repository: jumpstarter-dev/jumpstarter

Length of output: 116

🏁 Script executed:

find . -name "FabricAdapter*" -o -path "*/jumpstarter_driver_network/*" -name "*.py" 2>/dev/null | head -10

Repository: jumpstarter-dev/jumpstarter

Length of output: 979

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-network/jumpstarter_driver_network/adapters/fabric.py | head -60

Repository: jumpstarter-dev/jumpstarter

Length of output: 1419

🏁 Script executed:

grep -n "Hostfwd\|hostaddr" python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py | head -20

Repository: jumpstarter-dev/jumpstarter

Length of output: 884

🏁 Script executed:

find . -name "*.py" -type f | xargs grep -l "Hostfwd\|hostaddr" 2>/dev/null | head -10

Repository: jumpstarter-dev/jumpstarter

Length of output: 228

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py | grep -A 5 -B 5 "hostfwd\|Hostfwd" | head -50

Repository: jumpstarter-dev/jumpstarter

Length of output: 2610

🏁 Script executed:

grep -n "TcpNetwork\|get_hostfwd_port" python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 398

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py | sed -n '571,590p'

Repository: jumpstarter-dev/jumpstarter

Length of output: 871

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py | sed -n '485,495p'

Repository: jumpstarter-dev/jumpstarter

Length of output: 443

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py | sed -n '455,475p'

Repository: jumpstarter-dev/jumpstarter

Length of output: 1112

🏁 Script executed:

grep -B5 -A10 "def get_hostfwd_port" python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 604

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py | sed -n '78,98p'

Repository: jumpstarter-dev/jumpstarter

Length of output: 1024

🏁 Script executed:

grep -n "export\|get_hostfwd_port\|get_hostaddr" python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 480

🏁 Script executed:

grep -n "def get_\|`@export`" python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py | grep -A1 "def get_"

Repository: jumpstarter-dev/jumpstarter

Length of output: 305

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py | sed -n '569,610p'

Repository: jumpstarter-dev/jumpstarter

Length of output: 1847

🏁 Script executed:

grep -B5 -A10 "def get_hostfwd" python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 604

🏁 Script executed:

cat -n python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py | sed -n '431,436p'

Repository: jumpstarter-dev/jumpstarter

Length of output: 327

Use the network child for SSH shell connections instead of hardcoding 127.0.0.1.

The Connection(host="127.0.0.1") at lines 84-90 ignores the custom Hostfwd.hostaddr setting and fails for non-local clients. When the client runs on a different machine, it connects to the client's loopback instead of the server's configured host address. The proper path is to resolve into the ssh child (a TcpNetwork instance at line 489) which already has the correct hostaddr and port. The fallback pattern using FabricAdapter at lines 92-97 demonstrates the correct approach for tunneling through jumpstarter's stream.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py` at line 7, The hardcoded Connection(host="127.0.0.1") usage in the SSH shell connection block (around lines 84-90) ignores the custom Hostfwd.hostaddr setting and breaks for non-local clients. Replace this hardcoded approach by resolving into the ssh child which is a TcpNetwork instance at line 489 that already contains the correct hostaddr and port configuration. Reference the FabricAdapter fallback pattern at lines 92-97 as an example of the correct approach for properly tunneling through jumpstarter's stream instead of creating direct connections to 127.0.0.1.

coderabbitai · 2026-06-18T12:41:59Z

+        try:
+            port = int(self.call("get_hostfwd_port", "ssh"))
+            with Connection(
+                host="127.0.0.1",
+                port=port,
+                user=self.username,
+                connect_kwargs={"password": self.password},
+            ) as conn:
+                yield conn
+        except KeyError:
+            with FabricAdapter(
+                client=self.ssh,
+                user=self.username,
+                connect_kwargs={"password": self.password},
+            ) as conn:
+                yield conn


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the file find . -type f -name "client.py" | grep jumpstarter-driver-qemu

Repository: jumpstarter-dev/jumpstarter

Length of output: 147

🏁 Script executed:

# Read the client.py file to see the full context cat -n "python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py" | head -150

Repository: jumpstarter-dev/jumpstarter

Length of output: 4941

🏁 Script executed:

# Check the imports to see if contextmanager is imported head -20 "python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py"

Repository: jumpstarter-dev/jumpstarter

Length of output: 618

Narrow the try block to avoid catching caller KeyErrors in this context manager.

In a @contextmanager, exceptions from the caller's with body resume at the yield statement. A caller that raises KeyError inside the context will be caught by the current broad try block, triggering the fallback path unintentionally and potentially masking the original error. Move the except handler to wrap only the get_hostfwd_port call using try/except/else:

Proposed fix

+ try: + port = int(self.call("get_hostfwd_port", "ssh")) + except KeyError: + with FabricAdapter( + client=self.ssh, + user=self.username, + connect_kwargs={"password": self.password}, + ) as conn: + yield conn + else: with Connection( host="127.0.0.1", port=port, user=self.username, connect_kwargs={"password": self.password}, ) as conn: yield conn - except KeyError: - with FabricAdapter( - client=self.ssh, - user=self.username, - connect_kwargs={"password": self.password}, - ) as conn: - yield conn

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/client.py` around lines 82 - 97, The try/except block in the context manager is too broad and will catch KeyError exceptions raised by the caller's code inside the with block, causing unintended fallback behavior. Restructure this using try/except/else pattern: wrap only the self.call("get_hostfwd_port", "ssh") call in the try block, move the Connection creation and yield statement to an else block (which executes only on success), and keep the FabricAdapter fallback in the except KeyError block. This ensures that only KeyError from the get_hostfwd_port call triggers the fallback, while caller errors are properly propagated.

coderabbitai · 2026-06-18T12:41:59Z

+        # Resolve any hostport=0 hostfwd entries to the actual port QEMU chose.
+        # Parse 'info usernet': lines look like "TCP[HOST_FORWARD] fd addr port addr port ..."
+        # Store resolved ports on the parent so get_hostfwd_port() can return them to clients.
+        zero_fwds = {k: v for k, v in self.parent.hostfwd.items() if v.hostport == 0}
+        if zero_fwds:
+            usernet = await qmp.execute("human-monitor-command", {"command-line": "info usernet"})
+            self.logger.debug("info usernet output:\n%s", usernet)
+            for line in usernet.splitlines():
+                parts = line.split()
+                if len(parts) >= 6 and "HOST_FORWARD" in parts[0]:
+                    # parts: Protocol[State] fd hostaddr hostport guestaddr guestport ...
+                    actual_hostaddr, actual_hostport, actual_guestport = parts[2], int(parts[3]), int(parts[5])
+                    for k, v in zero_fwds.items():
+                        if v.hostaddr == actual_hostaddr and v.guestport == actual_guestport:
+                            self.logger.info(
+                                "hostfwd '%s': resolved port 0 -> %s:%d (guest port %d)",
+                                k, actual_hostaddr, actual_hostport, actual_guestport,
+                            )
+                            self.parent._resolved_hostports[k] = actual_hostport


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "driver.py" | grep jumpstarter-driver-qemu

Repository: jumpstarter-dev/jumpstarter

Length of output: 147

🏁 Script executed:

wc -l python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 149

🏁 Script executed:

sed -n '380,410p' python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 1849

🏁 Script executed:

sed -n '560,580p' python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 765

🏁 Script executed:

sed -n '430,440p' python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 382

🏁 Script executed:

sed -n '460,470p' python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

Repository: jumpstarter-dev/jumpstarter

Length of output: 553

🏁 Script executed:

rg "_resolved_hostports" python/packages/jumpstarter-driver-qemu/

Repository: jumpstarter-dev/jumpstarter

Length of output: 623

🏁 Script executed:

rg "get_hostfwd_port" python/packages/jumpstarter-driver-qemu/

Repository: jumpstarter-dev/jumpstarter

Length of output: 498

🏁 Script executed:

rg "get_hostfwd_port" --type py

Repository: jumpstarter-dev/jumpstarter

Length of output: 498

Fail fast when a dynamic hostfwd port is unresolved.

Currently, if info usernet parsing misses a hostport=0 entry, get_hostfwd_port() returns 0 to the caller, causing SSH and other services to fail silently on an invalid port. Validate that every zero-port entry resolves during VM startup, and explicitly reject unresolved ports in get_hostfwd_port().

The proposed fix is necessary and should be applied:

Accumulate resolved ports in a local dict and validate all zero_fwds entries are resolved; raise RuntimeError with the list of missing keys if any are unresolved.

In get_hostfwd_port(), check if hostport == 0 and raise RuntimeError instead of returning the invalid port.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py` around lines 385 - 403, After the for loop that processes usernet lines in the hostfwd resolution block, add validation to ensure all zero-port entries were successfully resolved. Create a set of unresolved ports by checking which entries in zero_fwds are not present in self.parent._resolved_hostports, and raise a RuntimeError with the list of missing keys if any are found. Additionally, in the get_hostfwd_port() method (not shown in this diff), add a check that raises RuntimeError if the hostport value equals 0, preventing silent failures when unresolved ports are returned to callers.

mangelajo · 2026-06-19T10:31:45Z

I need to review the qemu/ssh changes more carefully

raballew · 2026-06-19T07:59:41Z

+                port=port,
+                user=self.username,
+                connect_kwargs={"password": self.password},
+            ) as conn:
+                yield conn
+        except KeyError:


Okay this one is a little backwards but hear me out: I think shell() method catches KeyError to fall back from tcp to vsock, but this never actually works. On the driver side, get_hostfwd_port() raises KeyError, but the gRPC transport catches all exceptions and reraises them as DriverError via StatusCode.UNKNOWN but DriverError doesn't inherit from KeyError so VMs without SSH hostfwd will crash with an unhandled DriverError instead of falling back to vsock.

raballew · 2026-06-19T07:59:41Z

        Path(self.parent._pty).unlink(missing_ok=True)
        Path(self.parent._pty).symlink_to(pty)

+        # Resolve any hostport=0 hostfwd entries to the actual port QEMU chose.
+        # Parse 'info usernet': lines look like "TCP[HOST_FORWARD] fd addr port addr port ..."
+        # Store resolved ports on the parent so get_hostfwd_port() can return them to clients.
+        zero_fwds = {k: v for k, v in self.parent.hostfwd.items() if v.hostport == 0}
+        if zero_fwds:
+            usernet = await qmp.execute("human-monitor-command", {"command-line": "info usernet"})
+            self.logger.debug("info usernet output:\n%s", usernet)
+            for line in usernet.splitlines():
+                parts = line.split()
+                if len(parts) >= 6 and "HOST_FORWARD" in parts[0]:
+                    # parts: Protocol[State] fd hostaddr hostport guestaddr guestport ...
+                    actual_hostaddr, actual_hostport, actual_guestport = parts[2], int(parts[3]), int(parts[5])
+                    for k, v in zero_fwds.items():
+                        if v.hostaddr == actual_hostaddr and v.guestport == actual_guestport:
+                            self.logger.info(
+                                "hostfwd '%s': resolved port 0 -> %s:%d (guest port %d)",
+                                k, actual_hostaddr, actual_hostport, actual_guestport,
+                            )
+                            self.parent._resolved_hostports[k] = actual_hostport
+
        await qmp.execute("system_reset")
        await qmp.disconnect()



A ValueError from malformed output or a QMP error leaves the connection open and the QEMU process orphaned. You could wrap the QMP operation block in try/finally to guarantee await qmp.disconnect() runs.

raballew · 2026-06-19T07:59:41Z

 class Hostfwd(BaseModel):
    protocol: Literal["tcp"] = "tcp"
    hostaddr: str = "127.0.0.1"
-    hostport: int = Field(ge=1, le=65535)
+    hostport: int = Field(ge=0, le=65535)  # 0 = let QEMU pick a free port
    guestport: int = Field(ge=1, le=65535)




If on() fails after process creation but before usernet resolution completes, get_hostfwd_port() returns a stale port. I think adding self.parent._resolved_hostports.clear() at the beginning of on() before starting the new QEMU process, and also in off() alongside the existing cleanup should resolve this.

raballew · 2026-06-19T07:59:41Z

    protocol: Literal["tcp"] = "tcp"
    hostaddr: str = "127.0.0.1"
-    hostport: int = Field(ge=1, le=65535)
+    hostport: int = Field(ge=0, le=65535)  # 0 = let QEMU pick a free port


[HIGH] hostaddr: str = "127.0.0.1" has no format validation. The value is interpolated unsanitized into a comma-separated QEMU -netdev argument. Since QEMU uses commas to delimit netdev options, a crafted hostaddr like 127.0.0.1,hostfwd=tcp:0.0.0.0:4444-:4444 would inject an additional hostfwd rule. The hostport and guestport fields have Field(ge=..., le=...) constraints, but hostaddr accepts arbitrary strings.

Suggested fix: add a Pydantic field_validator using ipaddress.ip_address() to ensure hostaddr is a valid IP address.

AI-generated, human reviewed

raballew · 2026-06-19T07:59:41Z

+    def get_hostfwd_port(self, key: str) -> int:
+        """Return the actual host port for a hostfwd entry (resolves port 0 assignments)."""
+        if key in self._resolved_hostports:
+            return self._resolved_hostports[key]
+        if key in self.hostfwd:
+            return self.hostfwd[key].hostport
+        raise KeyError(f"hostfwd key {key!r} not found")


[HIGH] get_hostfwd_port() is a new exported method with three code paths (resolved port, configured port, KeyError), and the info usernet parsing logic uses positional string splitting with int(parts[3]) and int(parts[5]). Neither has any unit test. The only coverage is the full integration test that boots a real VM. Extracting the parsing logic into a standalone function (e.g., _parse_usernet_hostfwd(usernet_output, zero_fwds)) would make it straightforward to add parameterized unit tests covering all three get_hostfwd_port() code paths as well as edge cases in the usernet output format.

AI-generated, human reviewed

raballew · 2026-06-19T07:59:41Z

    flash_timeout: int = field(default=30 * 60)  # 30 minutes

    _tmp_dir: TemporaryDirectory = field(init=False, default_factory=TemporaryDirectory)
+    # Maps hostfwd key -> actual host port after QEMU resolves port 0 assignments
+    _resolved_hostports: dict[str, int] = field(init=False, default_factory=dict)

    @classmethod
    def client(cls) -> str:


Defer TcpNetwork child creation for hostport=0 entries until after port resolution in on(), or update the child's port after resolution. Use a different key namespace for hostfwd children to avoid overwriting vsock children.

raballew · 2026-06-19T07:59:42Z

+            for line in usernet.splitlines():
+                parts = line.split()
+                if len(parts) >= 6 and "HOST_FORWARD" in parts[0]:
+                    # parts: Protocol[State] fd hostaddr hostport guestaddr guestport ...
+                    actual_hostaddr, actual_hostport, actual_guestport = parts[2], int(parts[3]), int(parts[5])
+                    for k, v in zero_fwds.items():
+                        if v.hostaddr == actual_hostaddr and v.guestport == actual_guestport:
+                            self.logger.info(
+                                "hostfwd '%s': resolved port 0 -> %s:%d (guest port %d)",
+                                k, actual_hostaddr, actual_hostport, actual_guestport,
+                            )
+                            self.parent._resolved_hostports[k] = actual_hostport


The info usernet parser uses int(parts[3]) and int(parts[5]) without try/except. A malformed line would raise `ValueError

raballew · 2026-06-19T07:59:42Z

 from contextlib import contextmanager

 import click
+from fabric import Connection


from fabric import Connection is a new import added by this PR, but fabric is not listed in dependencies.

raballew · 2026-06-19T07:59:42Z

+                port=port,
+                user=self.username,
+                connect_kwargs={"password": self.password},
+            ) as conn:
+                yield conn
+        except KeyError:
+            with FabricAdapter(
+                client=self.ssh,
+                user=self.username,
+                connect_kwargs={"password": self.password},
+            ) as conn:
+                yield conn


Narrow the try/except to only wrap the self.call("get_hostfwd_port", "ssh") call, then branch on the result.

raballew · 2026-06-19T07:59:42Z

        with qemu.shell() as s:
-            assert s.run("uname -r").stdout.strip() == f"6.17.1-300.fc43.{arch}"
+            assert s.run("uname -r").stdout.strip() != ""


Also assert the return code, e.g., result = s.run("uname -r"); assert result.ok; assert result.stdout.strip() != "".

mangelajo requested a review from raballew June 18, 2026 10:36

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

Comment thread .github/workflows/python-tests.yaml Outdated

Comment thread python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver_test.py Outdated

Comment thread python/packages/jumpstarter-driver-qemu/jumpstarter_driver_qemu/driver.py

mangelajo force-pushed the use-alpine-image-for-qemu-tests branch from 0001659 to e7cceca Compare June 18, 2026 11:56

mangelajo changed the title ~~test: replace Fedora Cloud image with Alpine tiny image in QEMU tests~~ test(qemu): replace Fedora Cloud image with Alpine Linux tiny image Jun 18, 2026

mangelajo force-pushed the use-alpine-image-for-qemu-tests branch 4 times, most recently from cb01714 to 8e61e54 Compare June 18, 2026 12:15

raballew requested changes Jun 18, 2026

View reviewed changes

mangelajo force-pushed the use-alpine-image-for-qemu-tests branch 3 times, most recently from b2962b3 to 83153d4 Compare June 18, 2026 12:33

mangelajo force-pushed the use-alpine-image-for-qemu-tests branch from 83153d4 to 9a25775 Compare June 18, 2026 12:37

mangelajo commented Jun 18, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

mangelajo requested a review from raballew June 18, 2026 12:44

mangelajo marked this pull request as draft June 19, 2026 10:31

raballew requested changes Jun 19, 2026

View reviewed changes

Conversation

mangelajo commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Driver changes (driver.py)

Client changes (client.py)

Test changes (driver_test.py)

CI changes (python-tests.yaml)

Uh oh!

coderabbitai Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

raballew left a comment

Choose a reason for hiding this comment

Uh oh!

mangelajo commented Jun 18, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mangelajo Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

mangelajo commented Jun 19, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mangelajo commented Jun 18, 2026 •

edited

Loading

Driver changes (`driver.py`)

Client changes (`client.py`)

Test changes (`driver_test.py`)

CI changes (`python-tests.yaml`)

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading

mangelajo Jun 18, 2026 •

edited

Loading