Skip to content

Release branch v1.5.3#318

Merged
cuonglm merged 13 commits into
v1.0from
release-branch-v1.5.3
Jun 22, 2026
Merged

Release branch v1.5.3#318
cuonglm merged 13 commits into
v1.0from
release-branch-v1.5.3

Conversation

@cuonglm

@cuonglm cuonglm commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Minor Release

This contains new features, significant performance improvements, and bug fixes.

Security

  • Upgraded quic-go to v0.59.1 to address CVE-2026-40898

  • Rejected oversized upstream DNS responses on the DoH, DoH3, and DoQ paths — these previously used io.ReadAll on attacker-controlled responses before enforcing any protocol-level limit, allowing a malicious or compromised upstream to force unbounded buffering. Bodies are now capped at dns.MaxMsgSize (and non-200 DoH error bodies are bounded as well)

  • Validated DNS-over-QUIC response framing (RFC 9250) — the resolver previously assumed at least two bytes were present and could panic on truncated or malicious replies; the length prefix is now validated and framing failures retire the connection from the pool

  • Rate-limited PIN attempts on the control socket to provide defense-in-depth against brute-force if an attacker gains socket access

  • Switched temp file creation to os.CreateTemp for symlink-safe writes, preventing symlink attacks on systems without fs.protected_symlinks (e.g. embedded routers)

  • Switched internal/router/dnsmasq to text/template instead of html/template, since the generated config is plain text

Improved

  • Shared a single QUIC transport and UDP socket across DoQ dials so parallel dial and reconnect churn no longer allocate a socket per attempt or leak sockets; the query stream's send side is now closed before reading the response per RFC 9250 §4.2

  • Updated the Docker base image to bookworm

Fixed

  • Refreshed macOS VPN DNS after pf stabilization

  • Allowed intercept fallback for the default listener

  • Flushed pf states after a forced DNS intercept reload

cuonglm and others added 13 commits June 16, 2026 14:46
Go's default is already TLS 1.2+ (since Go 1.18), but making this
explicit satisfies RFC 7858/9250 recommendations and makes the security
intent clear for auditors.
Since this is a plain-text config, not html.
Current code writes to a predictable path, which on systems without
`fs.protected_symlinks` (e.g. embedded routers) could allow a local
attacker with API compromise to perform symlink attacks.
Currently there is no limit on PIN attempts, allowing unlimited
brute force if an attacker gains socket access. While the socket is
root-only by default, rate limiting is cheap defense-in-depth.
DoQ responses are length-prefixed per RFC 9250. The resolver previously
assumed the stream always contained at least two bytes and unpacked from
buf[2:], which could panic on truncated or malicious replies.

Validate the prefix against the bytes read, return a clear error, and
retire the connection from the pool on framing failure. Unpack only the
slice declared by the prefix so a short read cannot be misinterpreted as
a full message.

Add regression coverage with a small test server that returns malformed
raw payloads (empty, one byte, prefix-only, prefix larger than payload).
DoQ pools now keep a single quic.Transport and UDP socket for all dials,
so parallel dial and reconnect churn no longer allocate a new socket per
attempt or leak the winner's UDP conn when the caller owns the packet
conn.

quicParallelDialer accepts an optional transport: when set, dials use
Transport.DialEarly on that socket; when nil, behavior matches the old
per-dial ListenUDP path (losers close their sockets).

Per RFC 9250 §4.2, close the query stream's send side before reading the
response so strict upstreams see STREAM FIN before answering.

CloseIdleConnections closes the shared transport and underlying UDP
conn so checked-out connections and the OS socket are torn down.

Add a FIN-strict test server, coverage for bootstrap vs parallel-dial
paths, and a Linux-only FD churn regression test.
DoH, DoH3, and DoQ response paths previously used io.ReadAll on
attacker-controlled upstream responses before enforcing any protocol-level
size limit. A malicious or compromised upstream could return an oversized
body or stream and force ctrld to buffer unbounded data before eventually
failing DNS parsing.

Cap DoH/DoH3 response bodies at dns.MaxMsgSize and cap DoQ streams at the
2-byte length prefix plus dns.MaxMsgSize. Also limit non-200 DoH error
bodies so error formatting cannot consume large upstream responses.
Stick to go1.25 for now, since using go1.26 causing a runtime panic when
building arm platforms.
The test:windows CI job intermittently failed to clean up .testbin with
"Access to the path '...cmd_cli.test.exe' is denied". This was previously
attributed to Windows Defender scanning the large unsigned test binaries,
and mitigated with Defender exclusions and cleanup retries. That was
treating a symptom.

Root cause: performUpgrade() self-upgrades by running
exec.Command(os.Executable(), "upgrade", "prod", "-vv") as a detached,
windowless child. In the real ctrld binary this re-execs ctrld and is
correct. Under `go test`, os.Executable() is the test binary itself, and
`go test` stops flag parsing at the first positional arg ("upgrade") and
ignores the rest -- so the child silently re-runs the entire test suite.
That child hits the upgrade tests again and spawns more detached children,
recursively: a fork bomb of hidden processes that pins the runner's
CPU/memory and keeps the test binary's image file locked. Windows refuses
to delete the image of a running process, hence the "Access is denied"
during after_script. Whether any children are still alive when cleanup
runs is a timing race, which is why the failure was flaky.

Two tests reached this path: Test_performUpgrade (directly) and
Test_selfUpgradeCheck (via selfUpgradeCheck -> performUpgrade on the
"upgrade allowed" case).

Fix:
- prog.go: extract the command construction into a package-level
  newUpgradeCmd var. Production behavior is unchanged.
- main_test.go: stub newUpgradeCmd once in TestMain so the whole test
  binary self-execs with `-test.run=^$` (matches no tests, exits
  immediately) instead of re-running the suite. This covers every test
  that reaches performUpgrade, present and future, while still exercising
  the cmd.Start() success path.
@cuonglm cuonglm changed the title [WIP] Release branch v1.5.3 Release branch v1.5.3 Jun 22, 2026
@cuonglm cuonglm merged commit 5bf26da into v1.0 Jun 22, 2026
3 checks passed
@cuonglm cuonglm deleted the release-branch-v1.5.3 branch June 22, 2026 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant