PLT-460: Distribution samplers (uniform, zipfian θ) #46
Implement the real index samplers behind the frozen Distribution wire format from PLT-455, bound to the PLT-456 seeded sub-streams.

Uniform: Stream.Uint64N(n) seeded, rand.Uint64N(n) unbound.

Zipfian: YCSB precomputed-zeta generator. zeta(n, theta) is computed once per keyspace size n in O(n) and cached (summed smallest-term-first for numerical stability at n=1e6); each draw is O(1). n arrives at sample time, so the cache fills lazily and recomputes only if n changes.

Stream binding mirrors GasPicker.SetStream: Distribution.SetStream type-switches the delegate; a nil stream falls back to the global RNG. bindDistributionStreams wires the per-scenario streams in the generator.

FROZEN contract: added stream ids "dist:%d:key" / "dist:%d:size" (append-only; documented in rng.go input #2 and streams.go).

Tests: chi-square uniformity; top-1% mass rising monotonically with theta (1% -> 9.5% -> 42% -> 53%); per-stream determinism; seeded != unseeded binding guard; n=0 error; NaN/range sweep across theta; n=1e6 init+1000 draws bounded (~50ms).

go build + go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
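The precompute-once, O(1)-per-draw design described in this commit can be sketched roughly as follows. This is a minimal illustration of the published YCSB zipfian draw rule, not the PR's code: the type `zipfSketch`, the helpers `rebuild` and `topMass`, the PCG seeding, and the assumption that theta lies in [0, 1) are all hypothetical. Only `zeta`, `eta`, `halfPowTheta`, and `SampleIndex` echo names that appear in the commit messages.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"math/rand/v2"
)

// zipfSketch is a hypothetical stand-in for the PR's zipfian sampler.
// Assumption: theta is in [0, 1); the closed form divides by 1-theta.
type zipfSketch struct {
	theta float64
	rng   *rand.Rand // nil falls back to the global RNG

	n            uint64  // keyspace size the cached values were built for
	zetaN        float64 // zeta(n, theta)
	halfPowTheta float64 // 0.5^theta
	alpha, eta   float64
}

// zeta returns sum_{i=1..n} 1/i^theta, adding the smallest terms first.
func zeta(n uint64, theta float64) float64 {
	sum := 0.0
	for i := n; i >= 1; i-- {
		sum += 1 / math.Pow(float64(i), theta)
	}
	return sum
}

// rebuild does the O(n) precompute for a new keyspace size.
func (z *zipfSketch) rebuild(n uint64) {
	z.n = n
	z.zetaN = zeta(n, z.theta)
	z.halfPowTheta = math.Pow(0.5, z.theta)
	z.alpha = 1 / (1 - z.theta)
	z.eta = 0 // pinned for n <= 2: never read there, and 0/0 at n == 2
	if n > 2 {
		zeta2 := 1 + z.halfPowTheta
		z.eta = (1 - math.Pow(2/float64(n), 1-z.theta)) / (1 - zeta2/z.zetaN)
	}
}

// SampleIndex returns an index in [0, n); index 0 is the most popular.
func (z *zipfSketch) SampleIndex(n uint64) (uint64, error) {
	if n == 0 {
		return 0, errors.New("zipf: n must be positive")
	}
	if n != z.n {
		z.rebuild(n) // lazy: runs only when n changes
	}
	var u float64
	if z.rng != nil {
		u = z.rng.Float64()
	} else {
		u = rand.Float64()
	}
	uz := u * z.zetaN
	if uz < 1 {
		return 0, nil
	}
	if uz < 1+z.halfPowTheta {
		return 1, nil
	}
	idx := uint64(float64(n) * math.Pow(z.eta*u-z.eta+1, z.alpha))
	if idx >= n {
		idx = n - 1 // clamp against floating-point rounding at the top edge
	}
	return idx, nil
}

// topMass draws from a seeded sketch and returns the share of draws that
// land in the first 1% of the keyspace.
func topMass(theta float64, n uint64, draws int, seed uint64) float64 {
	z := &zipfSketch{theta: theta, rng: rand.New(rand.NewPCG(seed, 0))}
	hot := 0
	for i := 0; i < draws; i++ {
		idx, _ := z.SampleIndex(n)
		if idx < n/100 {
			hot++
		}
	}
	return float64(hot) / float64(draws)
}

func main() {
	for _, theta := range []float64{0.5, 0.9, 0.99} {
		fmt.Printf("theta=%.2f top-1%% mass=%.3f\n", theta, topMass(theta, 1000, 50000, 42))
	}
}
```

Calling `SampleIndex` with alternating values of n would trigger the O(n) rebuild on every draw in this sketch, which is the hazard the n-stability contract is meant to prevent.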
Guard eta against the n<=2 NaN (denom==0) by pinning it to 0; it is provably never read at n<=2, but a NaN in cached state is a refactor hazard. Document the n-stability contract on SampleIndex so PLT-465 cannot silently trigger per-draw O(n) zeta recomputes. Rename thetaPow2 -> halfPowTheta (it holds 0.5^theta), inline the RNG fallback to mirror UniformDistribution/gas.go, and mark the struct not copy-safe.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PR Summary
Medium Risk
Overview
Uniform draws in … The generator calls … Tests add determinism, seeded vs unseeded, chi-square uniform check, zipfian skew vs θ, init cost, and numerical edge cases.
Reviewed by Cursor Bugbot for commit f638310.
- Drop the pointless JSON unmarshal into UniformDistribution, which has no JSON fields (its only field, the seeded stream, is bound via SetStream, not JSON): SA9005.
- Remove the pointless math.IsNaN on a uint64-derived float in the zipfian test; the in-range assertion is the real guard against a NaN-derived draw: SA4015.

Caught by golangci-lint (CI gate); local build+test+vet did not run it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move the dense distribution narrative (zipfian zeta math, precompute-once design, numerical stability, frozen wire format, seeded-stream reproducibility) into a new package-level doc.go and trim distribution.go's comments to terse pointers. No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…parallelism
- config/doc.go: 'sub-stream' was split across lines, rendering as 'sub- stream' under go doc; reflow to 'substream'.
- distribution.go: restore the '(no Name)' gloss on SampleIndex for parallelism with SetStream and the package doc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Implements PLT-460: the real SampleIndex samplers behind the PLT-455 Distribution type, drawing from PLT-456 seeded sub-streams.

What
[0,n). Distribution.SetStream (mirrors GasPicker.SetStream); new dist:%d:key / dist:%d:size stream ids added to the FROZEN contract (append-only, non-perturbing). Unbound → global RNG (preserves the unseeded path).

Review (systems + idiom)
-race clean. eta NaN at n≤2; documented the n-must-be-stable contract; renamed a builtin-shadowing helper + a misleading field; marked the mutex-bearing struct copy-unsafe.

Decision brief: designs/sei-load-workload-modeler/PLT-460-distribution-samplers.md.

🤖 Generated with Claude Code