Skip to content

virtio/net: fix ENAMETOOLONG on long gvproxy socket paths (macOS)#697

Open
arewm wants to merge 1 commit into
containers:mainfrom
arewm:fix-unixgram-local-socket-path
Open

virtio/net: fix ENAMETOOLONG on long gvproxy socket paths (macOS)#697
arewm wants to merge 1 commit into
containers:mainfrom
arewm:fix-unixgram-local-socket-path

Conversation

@arewm
Copy link
Copy Markdown

@arewm arewm commented May 29, 2026

Assisted-by: Claude Code (Sonnet 4.6)

Problem

On macOS, Unixgram::open derives the local bind address by appending -krun.sock to the peer (gvproxy) socket path:

let local_addr = UnixAddr::new(&PathBuf::from(format!("{}-krun.sock", path.display())))

macOS enforces a 104-byte unix socket path limit. The gvproxy peer path can be long enough that appending any suffix pushes the local path over this limit. UnixAddr::new returns ENAMETOOLONG, which propagates as BadActivate and panics the vCPU thread. krunkit stays alive but the VM is frozen at 0% CPU indefinitely.

This affects any deployment where the gvproxy socket path is long enough that a 9-byte suffix would exceed the limit — common with default podman machine naming on macOS.

Fix

Place the local bind socket in the same directory as the peer, using a short PID + per-process atomic counter filename (krun-net-<pid>-<n>.sock). The peer filename always contains the machine name, making it longer than our fixed-format name for any reasonably-named machine. PID ensures uniqueness across concurrent krunkit processes; the counter handles multiple virtio-net devices within a single process. The existing unlink() before bind() handles stale sockets from crashed previous runs.

Test

Adds net-gvproxy-long-path: exercises the gvproxy backend with a peer socket path long enough that the old suffix-based approach would have exceeded the 104-byte macOS limit. Verifies the VM starts and achieves network connectivity.

Fixes: #689

@slp
Copy link
Copy Markdown
Collaborator

slp commented May 29, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a potential ENAMETOOLONG error on macOS when creating local unixgram socket paths by using a combination of the process ID and a global atomic counter instead of appending a suffix to the peer path. It also adds a test case to verify this behavior with long socket paths. Feedback on the changes highlights a security vulnerability (CWE-377) caused by hardcoding /tmp for the temporary socket path, which could lead to a local Denial of Service (DoS). It is recommended to use std::env::temp_dir() instead to leverage secure, user-specific temporary directories.

Comment thread src/devices/src/virtio/net/unixgram.rs Outdated
@arewm arewm force-pushed the fix-unixgram-local-socket-path branch 3 times, most recently from 473b4db to 5166a30 Compare May 29, 2026 13:06
macOS imposes a 104-byte limit on unix socket paths. The gvproxy peer
socket path can be long enough that appending a suffix to derive the
local bind address exceeds this limit, triggering ENAMETOOLONG. This
surfaced as BadActivate from the net device, panicking the vCPU thread
and leaving the VM frozen at 0% CPU while krunkit stayed alive.

Place the local bind socket in the same directory as the peer using a
short PID + per-process atomic counter filename (krun-net-<pid>-<n>.sock).
The peer filename always contains the machine name, making it longer than
our fixed-format name for any reasonably-named machine. PID + counter
guarantees uniqueness across concurrent krunkit processes and multiple
virtio-net devices within a single process.

Add a net-gvproxy-long-path integration test that constructs a peer
socket path long enough that the old suffix-based approach would exceed
the 104-byte limit, and verifies that the VM starts and achieves network
connectivity through gvproxy.

Assisted-by: Claude Code (Sonnet 4.6)
Signed-off-by: arewm <arewm@users.noreply.github.com>
@arewm arewm force-pushed the fix-unixgram-local-socket-path branch from 5166a30 to d45b2a7 Compare May 29, 2026 13:07
Copy link
Copy Markdown
Member

@jakecorrenti jakecorrenti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@slp
Copy link
Copy Markdown
Collaborator

slp commented Jun 2, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a per-process counter and process ID to generate unique, shorter local unixgram socket filenames, preventing ENAMETOOLONG errors on macOS when the peer socket path is long. It also adds a test case (net-gvproxy-long-path) to verify this behavior. The review feedback highlights two key improvements: first, implementing the Drop trait for Unixgram to prevent socket file leaks, as the current unlink call before binding no longer cleans up the dynamically named sockets; second, adding a fallback to the temporary directory if the constructed path still exceeds the 104-byte limit due to an exceptionally long parent directory.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/devices/src/virtio/net/unixgram.rs
Comment on lines +95 to +98
let local_path = path
.parent()
.map(|dir| dir.join(&socket_name))
.unwrap_or_else(|| std::env::temp_dir().join(&socket_name));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the parent directory of the peer socket is extremely long, the constructed local_path might still exceed the 104-byte limit on macOS, even with the shorter krun-net-{pid}-{counter}.sock filename. To make this more robust, we can check the length of the constructed path and fall back to the temporary directory if it exceeds 103 bytes.

Suggested change
let local_path = path
.parent()
.map(|dir| dir.join(&socket_name))
.unwrap_or_else(|| std::env::temp_dir().join(&socket_name));
let local_path = {
let p = path
.parent()
.map(|dir| dir.join(&socket_name))
.unwrap_or_else(|| std::env::temp_dir().join(&socket_name));
if p.to_str().map_or(0, |s| s.len()) >= 104 {
std::env::temp_dir().join(&socket_name)
} else {
p
}
};

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, please just always create the socket in std::env::temp_dir().

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had that in a previous version of this PR but I thought it would be better to keep all of the sockets closer to the context where they are used. Always using std::env::temp_dir() would certainly simplify implementation.

Copy link
Copy Markdown
Collaborator

@slp slp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! Just a minor change requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

virtio-net unixgram backend silently kills vCPU on macOS: local socket path exceeds 104-byte OS limit

3 participants