Size the buffer to the input for one-shot Base-N encode by nishantmehta · Pull Request #439 · apache/commons-codec

nishantmehta · 2026-06-29T00:02:12Z

Summary

BaseNCodec.encode(byte[], int, int) — which backs encode(byte[]), encodeToString and the static Base64/Base32 helpers — created a Context whose buffer was lazily allocated by ensureBufferSize at max(size, 8192). For a one-shot encode of a small input this allocated the full 8192-byte default streaming buffer regardless of the actual output size.

The exact output size is already computable via getEncodedLength, so this change pre-sizes the context buffer to that length before encoding. The streaming path (Base64OutputStream etc.) is unchanged and still grows from the default size. When the encoded length does not fit in an int, the code falls back to the streaming buffer (such an output cannot be returned as a single array anyway). getEncodedLength(byte[]) is refactored to delegate to a private length-based helper so that the one-shot encode can size the buffer from the requested range length.
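To make the sizing arithmetic concrete, here is a minimal standalone sketch. It is not the code from this PR: the class and method names are hypothetical, and it assumes padded output with no line chunking. It computes the encoded length from the input length and the codec block sizes, then picks an initial buffer size, falling back to the 8192-byte default when the length does not fit in an int.

```java
public final class EncodedLengthSketch {

    /** Default streaming buffer size cited in the description above. */
    static final int DEFAULT_BUFFER_SIZE = 8192;

    /**
     * Hypothetical helper: padded encoded length, ignoring line separators.
     * Base64 uses blocks of 3 input bytes -> 4 output bytes,
     * Base32 uses blocks of 5 input bytes -> 8 output bytes.
     */
    static long encodedLength(int inputLength, int unencodedBlockSize, int encodedBlockSize) {
        // Round up to whole input blocks; compute in long to avoid int overflow.
        long blocks = ((long) inputLength + unencodedBlockSize - 1) / unencodedBlockSize;
        return blocks * encodedBlockSize;
    }

    /** Hypothetical helper: exact size when it fits in an int, else the default. */
    static int initialBufferSize(long encodedLength) {
        return encodedLength <= Integer.MAX_VALUE ? (int) encodedLength : DEFAULT_BUFFER_SIZE;
    }

    public static void main(String[] args) {
        // 25-byte input, the size used in the measurement.
        System.out.println(encodedLength(25, 3, 4)); // Base64: 9 blocks * 4 = 36
        System.out.println(encodedLength(25, 5, 8)); // Base32: 5 blocks * 8 = 40
        System.out.println(initialBufferSize(encodedLength(25, 3, 4))); // 36
    }
}
```

Under these assumptions a 25-byte input needs a 36-byte (Base64) or 40-byte (Base32) buffer instead of 8192 bytes.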

Measurement

ThreadMXBean allocation driver, 200k warmed ops, 25-byte input:

| call | before | after |
| --- | --- | --- |
| `Base64.encodeToString` | 8479 B/op | 345 B/op (−96%) |
| `Base32.encodeToString` | 8592 B/op | 458 B/op (−95%) |
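The driver used for these numbers is not included in the PR. The following is a rough sketch of how a per-thread allocation measurement of this kind could look, assuming a HotSpot JVM where the platform bean can be cast to `com.sun.management.ThreadMXBean`. The class and method names are hypothetical, and the JDK's `java.util.Base64` is used as a stand-in so the example has no external dependency; it will not reproduce the figures in the table.

```java
import java.lang.management.ManagementFactory;
import java.util.Base64;
import java.util.function.Supplier;

public final class AllocationDriverSketch {

    /** Consumes results so the JIT cannot discard the encode calls. */
    static long sink;

    /** Hypothetical helper: average bytes allocated per call on the current thread. */
    static double bytesPerOp(Supplier<String> op, int warmupOps, int measuredOps) {
        // Assumption: HotSpot, where the platform bean implements the com.sun extension.
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long threadId = Thread.currentThread().getId();

        // Warm up so JIT compilation and one-time initialization are excluded.
        for (int i = 0; i < warmupOps; i++) {
            sink += op.get().length();
        }

        long before = bean.getThreadAllocatedBytes(threadId);
        for (int i = 0; i < measuredOps; i++) {
            sink += op.get().length();
        }
        long after = bean.getThreadAllocatedBytes(threadId);
        return (after - before) / (double) measuredOps;
    }

    public static void main(String[] args) {
        byte[] input = new byte[25]; // same input size as the measurement above
        // Stand-in codec: java.util.Base64, not commons-codec.
        double perOp = bytesPerOp(() -> Base64.getEncoder().encodeToString(input), 200_000, 200_000);
        System.out.printf("%.0f B/op%n", perOp);
    }
}
```

To measure commons-codec itself, the supplier would call `org.apache.commons.codec.binary.Base64.encodeBase64String(input)` with the library on the classpath.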

Testing

Base64Test, Base32Test, Base16Test, BaseNCodecTest and the Base64 input/output stream tests pass unchanged (360 tests). The streaming buffer growth path is exercised by the existing stream tests.

BaseNCodec.encode(byte[], int, int) (which backs encode(byte[]), encodeToString and the static Base64/Base32 helpers) created a Context whose buffer was lazily allocated by ensureBufferSize at max(size, 8192). For a one-shot encode of a small input this allocated the full 8192-byte default streaming buffer regardless of the actual output size.

The exact output size is already computable via getEncodedLength, so pre-size the context buffer to it before encoding. The streaming path (Base64OutputStream etc.) is unchanged and still grows from the default size. When the encoded length does not fit an int the code falls back to the streaming buffer; such an output cannot be returned as a single array anyway. getEncodedLength(byte[]) is refactored to delegate to a private length-based helper so the one-shot encode can size from the requested range length.

Measured with a ThreadMXBean allocation driver (200k warmed ops, 25-byte input):

Base64.encodeToString 8479 B/op -> 345 B/op (-96%)
Base32.encodeToString 8592 B/op -> 458 B/op (-95%)

Base64Test, Base32Test, Base16Test, BaseNCodecTest and the Base64 input/output stream tests pass unchanged (360 tests).

Signed-off-by: Nishant Mehta <nishantmehta.n@gmail.com>

garydgregory · 2026-06-29T02:30:08Z

Closing: change for the sake of change and not worth the extra complications and bloat.

garydgregory closed this Jun 29, 2026

nishantmehta wants to merge 1 commit into apache:master from nishantmehta:pr/base-n-buffer-size
