perf: configurable pool buffer retention limit via SetPoolBufferLimit by xe-nvdk · Pull Request #74 · Basekick-Labs/msgpack

xe-nvdk · 2026-06-12T20:57:54Z

Summary

Closes #62.

PutEncoder/PutDecoder drop internal buffers above 32KiB, so workloads with consistently larger messages re-grow the buffer on every pooled use. This adds SetPoolBufferLimit(n) to raise the retention limit. Of the three options in the issue, the configurable limit was chosen over (1) raising the default — which would silently change memory retention for all users — and (3) adaptive sizing — which adds racy bookkeeping for marginal benefit.

Design notes:

Default (32 KiB) unchanged. Values below the default clamp to it; n <= 0 restores it.
Limit stored in an atomic.Int64; the Put path checks the constant default first (poolBufOversized), so small buffers — the common case — never pay for the atomic load. Verified inlinable.
Applied to both encoder wbuf and decoder buf, per the issue.

Benchmarks (Apple M3 Max, benchstat, count=8)

100KB payloads, limit raised to 256 KiB vs default:

benchmark	default limit	256k limit	delta
MarshalAppendLarge100KB	9.31µs, 104KiB, 3 allocs	4.61µs, 24B, 1 alloc	-50% ns/op
DecodeLargeStream100KB	16.14µs, 208KiB, 3 allocs	10.23µs, 104KiB, 1 alloc	-37% ns/op

No regression at default settings vs v6 (the atomic is not touched for small buffers):

benchmark	v6	branch	delta
StructMarshal	368.5n	363.6n	-1.33%
StructUnmarshal	338.8n	338.9n	~

Testing

pool_test.go (first internal package msgpack test file — needed to inspect unexported wbuf/buf): retention under raised limit, dropping under default, clamping behavior.
go test ./... and go test -race ./... pass.
Internal review pass (correctness+security, quality+perf) completed; only finding was the missing CHANGELOG entry, fixed in the follow-up commit.

PutEncoder/PutDecoder drop internal buffers whose capacity exceeds 32KiB, so workloads with consistently larger messages re-grow the buffer on every pooled use. Add SetPoolBufferLimit(n) to raise the retention limit for such workloads. The default (32KiB) is unchanged; values below the default are clamped to it, n <= 0 restores it. The limit is stored atomically, and the Put path checks the constant default first so small buffers (the common case) never pay for the atomic load — small-message benchmarks are unchanged vs v6: StructMarshal 368.5n ± 0% 363.6n ± 1% -1.33% StructUnmarshal 338.8n ± 0% 338.9n ± 0% ~ 100KB payloads with the limit raised to 256KiB (count=8): MarshalAppendLarge100KB 9.31µs → 4.61µs (-50%) 104KiB → 24B/op 3 → 1 allocs DecodeLargeStream100KB 16.14µs → 10.23µs (-37%) 208KiB → 104KiB 3 → 1 allocs Closes #62

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request introduces a configurable pooled buffer retention limit via SetPoolBufferLimit(n) to allow workloads with large messages to retain grown buffers in the encoder/decoder pools instead of re-allocating. It includes benchmarks, implementation using atomic operations, and unit tests. Feedback points out that the unit tests in pool_test.go rely on the non-deterministic behavior of sync.Pool, which can lead to flaky test failures during garbage collection or goroutine rescheduling. It is recommended to test the underlying poolBufOversized function directly to ensure deterministic test results.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-12T20:59:00Z

+func TestPoolBufferLimitEncoder(t *testing.T) {
+	const big = 100 * 1024
+
+	// Default limit: an oversized wbuf is dropped on Put.
+	enc := GetEncoder()
+	enc.wbuf = make([]byte, big)
+	PutEncoder(enc)
+	enc = GetEncoder()
+	if cap(enc.wbuf) > defaultPoolBufferLimit {
+		t.Fatalf("wbuf cap=%d retained above default limit", cap(enc.wbuf))
+	}
+	PutEncoder(enc)
+
+	// Raised limit: the same buffer is retained.
+	SetPoolBufferLimit(256 * 1024)
+	defer SetPoolBufferLimit(0)
+
+	enc = GetEncoder()
+	enc.wbuf = make([]byte, big)
+	PutEncoder(enc)
+	enc = GetEncoder()
+	if cap(enc.wbuf) < big {
+		t.Fatalf("wbuf cap=%d, want >= %d retained under raised limit", cap(enc.wbuf), big)
+	}
+	PutEncoder(enc)
+}
+
+func TestPoolBufferLimitDecoder(t *testing.T) {
+	const big = 100 * 1024
+
+	dec := GetDecoder()
+	dec.buf = make([]byte, big)
+	PutDecoder(dec)
+	dec = GetDecoder()
+	if cap(dec.buf) > defaultPoolBufferLimit {
+		t.Fatalf("buf cap=%d retained above default limit", cap(dec.buf))
+	}
+	PutDecoder(dec)
+
+	SetPoolBufferLimit(256 * 1024)
+	defer SetPoolBufferLimit(0)
+
+	dec = GetDecoder()
+	dec.buf = make([]byte, big)
+	PutDecoder(dec)
+	dec = GetDecoder()
+	if cap(dec.buf) < big {
+		t.Fatalf("buf cap=%d, want >= %d retained under raised limit", cap(dec.buf), big)
+	}
+	PutDecoder(dec)
+}


These tests rely on sync.Pool returning the exact same encoder/decoder instance that was just put back. However, sync.Pool behavior is non-deterministic: the pool can be cleared at any time by the garbage collector (GC), or it may return a new instance if the goroutine is rescheduled onto a different P or thread. When this happens, GetEncoder() or GetDecoder() will return a newly allocated instance with a nil buffer, causing the test to fail with a false positive (e.g., wbuf cap=0, want >= 102400 retained under raised limit).

To make the tests 100% deterministic and robust, we should test the underlying poolBufOversized function directly with various capacities and configured limits, rather than relying on sync.Pool side effects.

func TestPoolBufferLimit(t *testing.T) { // Test default limit (32 KiB) if poolBufOversized(16 * 1024) { t.Error("16 KiB buffer should not be oversized under default limit") } if poolBufOversized(32 * 1024) { t.Error("32 KiB buffer should not be oversized under default limit") } if !poolBufOversized(33 * 1024) { t.Error("33 KiB buffer should be oversized under default limit") } // Test raised limit (256 KiB) SetPoolBufferLimit(256 * 1024) defer SetPoolBufferLimit(0) if poolBufOversized(100 * 1024) { t.Error("100 KiB buffer should not be oversized under 256 KiB limit") } if poolBufOversized(256 * 1024) { t.Error("256 KiB buffer should not be oversized under 256 KiB limit") } if !poolBufOversized(257 * 1024) { t.Error("257 KiB buffer should be oversized under 256 KiB limit") } }

Ignacio Van Droogenbroeck and others added 2 commits June 12, 2026 14:53

docs: add SetPoolBufferLimit to CHANGELOG (review follow-up)

7215fbb

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

gemini-code-assist Bot reviewed Jun 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: configurable pool buffer retention limit via SetPoolBufferLimit#74

perf: configurable pool buffer retention limit via SetPoolBufferLimit#74
xe-nvdk wants to merge 2 commits into
v6from
perf/pool-buffer-limit

xe-nvdk commented Jun 12, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

xe-nvdk commented Jun 12, 2026

Summary

Benchmarks (Apple M3 Max, benchstat, count=8)

Testing

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant