
fix: decode split chunks after readable setEncoding #5394

Open
marko1olo wants to merge 1 commit into nodejs:main from marko1olo:fix-readable-setencoding-utf8

Conversation

@marko1olo

Contributor

Fixes #5002.

Summary

  • route BodyReadable.setEncoding() through Readable.prototype.setEncoding() so Node installs its StringDecoder
  • preserve the previous invalid-encoding no-op guard
  • add a regression for a three-byte UTF-8 character split across response body chunks

Tests

  • NODE_PATH=C:\hades\oss\undici\node_modules node --test test\readable.js
  • NODE_PATH=C:\hades\oss\undici\node_modules C:\hades\oss\undici\node_modules\.bin\eslint.cmd --no-cache lib\api\readable.js test\readable.js
  • git diff --check

Signed-off-by: marko1olo <barsukdana@gmail.com>
@marko1olo force-pushed the fix-readable-setencoding-utf8 branch from cb16258 to 04f3c29 on June 8, 2026 14:25
@marko1olo

Contributor Author

Follow-up on the red CI after the amended 04f3c290 push:

  • The readable/setEncoding regressions pass locally:
    • NODE_PATH=C:\hades\oss\undici\node_modules node --test test\readable.js -> 14 passed
    • NODE_PATH=C:\hades\oss\undici\node_modules node --test test\client-request.js --test-name-pattern "request multibyte" -> 47 passed
    • scoped eslint and git diff --check passed
  • The current failing CI cluster is test/interceptors/cache.js (vary headers are present in revalidation request, max-stale, stale-if-error). I reproduced the same 49 passed / 3 failed result on a detached origin/main worktree at 6ea54ef8, without this PR patch.

So the current cache-interceptor failures look like a baseline/main failure rather than a regression from this readable change.



Development

Successfully merging this pull request may close these issues.

setEncoding('utf8') on response body corrupts multi-byte UTF-8 characters at chunk boundaries
