perf(encode): reuse the LDM producer table across dict frames by polaz · Pull Request #440 · structured-world/structured-zstd

polaz · 2026-06-21T15:06:43Z

Summary

Two compress-dict / LDM follow-ups on the reused-encoder path (the CoordiNode
per-label-dictionary shape):

Reuse the LDM producer table across frames. On the reused-encoder
compress-dict path, reset built a fresh LdmProducer every frame,
re-allocating its hash table (large at btultra2), while upstream zstd reuses
its ldmState_t. Reuse the existing allocation when the derived LDM params
are unchanged (clear() re-zeros without allocating; only a params change or
the first frame builds a new one).
Run the wasm dict bench over all benched levels. The wasm dictionary
bench looped [3, 19] while the non-dict bench covers [1, 3, 19, 22], so
levels 1 and 22 had no dict coverage. Aligned to LEVELS.

Impact

compress-dict/level_22_btultra2_ldm timing: −3.84% (criterion, i9;
"Performance has improved", 31.66 → 30.45 µs vs base). Byte-identical output.
wasm: levels 1 and 22 now exercised for dict compress/decompress; verified
L1/L22 dict round-trip on the wasm build (L22 gives the best ratio).

Testing

cargo nextest run -p structured-zstd --features "hash dict_builder" — 823/823
pass, including the dict-primed byte-identity and reused-encoder tests.
wasm dict round-trip verified for all four levels against the simd128 payload.

Summary by CodeRabbit

Tests
- Expanded dictionary compression benchmarks to cover all compression levels for more comprehensive testing.
Refactor
- Optimized compression engine efficiency by improving memory allocation reuse across operations, reducing overhead.

On the reused-encoder compress-dict path, reset built a fresh LdmProducer every frame, re-allocating its hash table (large at btultra2). The producer is designed for reuse (clear() re-zeros without allocating); reuse the existing allocation when the derived LDM params are unchanged, matching upstream zstd's ldmState_t lifecycle. clear() keeps the table correct for the new frame.

The dictionary bench looped [3, 19] while the non-dict bench covers [1, 3, 19, 22], so levels 1 and 22 had no dict coverage. Align to LEVELS; verified L1/L22 dict round-trip on the wasm build.

coderabbitai · 2026-06-21T15:06:50Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 020fa367-a116-47ea-b879-be668155a05c

📥 Commits

Reviewing files that changed from the base of the PR and between aed82fc and ab93dc9.

📒 Files selected for processing (2)

zstd-wasm/bench/bench.mjs
zstd/src/encoding/match_generator.rs

📝 Walkthrough

Walkthrough

The PR makes two changes: the dictionary benchmark in bench.mjs now iterates over the shared LEVELS constant (replacing a hardcoded [3, 19] subset) in both the computation and reporting loops, and MatchGeneratorDriver::reset in match_generator.rs is updated to reuse an existing LdmProducer allocation when its params() match the newly derived LDM parameters, calling clear() instead of always constructing a new producer.

Changes

Dictionary LDM reuse and benchmark coverage

Layer / File(s)	Summary
Dictionary benchmark level iteration `zstd-wasm/bench/bench.mjs`	Both the `dictRows` computation loop and the per-scenario results reporting loop now iterate over `LEVELS` instead of the hardcoded `[3, 19]` array.
LdmProducer conditional reuse in `MatchGeneratorDriver::reset` `zstd/src/encoding/match_generator.rs`	In the `HashChain` backend path, derived LDM parameters are computed first; the existing producer is extracted via `hc.take_ldm_producer()`, its `params()` are compared to the derived value, and either `clear()` is called to reuse it or a fresh `LdmProducer::new(p)` is constructed before storing back via `hc.set_ldm_producer`.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

perf(encode): L22 small-input compress speed (optimal parser + per-frame table zeroing) #337: The match_generator.rs change directly implements per-frame LDM producer reuse to avoid redundant allocation overhead during dictionary-based compression, which matches the objective described in this issue.
perf(encode): close the L6-L9 Row/lazy dictionary speed gap #434: The benchmark change to iterate over all LEVELS enables measuring dictionary performance across the full level range, which is the gap identified in this issue; the LDM producer reuse is a likely contributing optimization.

Possibly related PRs

structured-world/structured-zstd#397: Both PRs modify MatchGeneratorDriver::reset in match_generator.rs around HashChain LDM producer creation and reset during dictionary (primed) restore, making the LDM state management logic directly overlap.

Poem

🐇 Hop, hop, no more waste!
Old producer, same params? Clear and paste!
LEVELS now run full and wide,
Dict benchmarks bloom on every side.
No fresh alloc when old will do—
This bunny optimized for you! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and accurately reflects the main performance optimization introduced in the PR: reusing the LDM producer table across dictionary-compressed frames instead of allocating a new one per frame.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch perf/compress-dict-ldm-realloc

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-06-21T15:09:04Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

greptile-apps · 2026-06-21T15:16:22Z

Greptile Summary

This PR improves performance on the reused-encoder compress-dict path by reusing the LdmProducer allocation across frames when the derived LDM params are unchanged, mirroring upstream zstd's ldmState_t reuse. It also extends the wasm dict benchmark to cover all four compression levels (1, 3, 19, 22) instead of only 3 and 19.

match_generator.rs: Moves LDM param resolution outside the if let MatcherStorage::HashChain block (needed to satisfy the borrow checker), then reuses the existing producer via take/clear/set when params match, or falls back to LdmProducer::new on a first frame or params change. clear() is called explicitly for correctness even though BtMatcher::reset already did it, making it a harmless double-clear.
bench.mjs: Replaces the hardcoded [3, 19] loop in both dict-bench loops with the shared LEVELS = [1, 3, 19, 22] constant, giving dict compress/decompress coverage at levels 1 and 22.

Confidence Score: 5/5

Safe to merge; both changes are narrow and well-reasoned, and the Rust logic is backed by matching test coverage (823/823 pass) and byte-identical output verification.

The Rust refactor correctly handles all cases (first frame, params-unchanged reuse, params change, LDM disabled, non-BT strategy) and the borrow-checker split is a textbook Rust pattern. The double clear() is redundant but harmless. The bench change is a one-line constant substitution with no risk of regression.

No files require special attention.

Important Files Changed

Filename	Overview
zstd/src/encoding/match_generator.rs	Restructures the LDM producer wiring block to reuse the existing allocation when derived params are unchanged, avoiding per-frame re-allocation of the large LDM hash table on the reused-encoder dict path.
zstd-wasm/bench/bench.mjs	Aligns the dict bench loop variable from a hardcoded `[3, 19]` to the shared `LEVELS` constant so levels 1 and 22 receive dict-compress/decompress coverage.

Comments Outside Diff (1)

zstd/src/encoding/match_generator.rs, line 2635-2641 (link)

The surrounding comment still says "installing here gives the new frame a fresh producer," but the new behaviour may reuse the existing cleared allocation rather than always installing a freshly-allocated one. The semantic outcome is the same (a correctly initialised producer), but the comment is slightly misleading after this change.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

_{Reviews (1): Last reviewed commit: "test(wasm-bench): run dict compress/deco..." | Re-trigger Greptile}

polaz · 2026-06-21T15:59:02Z

@coderabbitai review

coderabbitai · 2026-06-21T15:59:08Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

polaz added 2 commits June 21, 2026 17:15

test(wasm-bench): run dict compress/decompress over all benched levels

ab93dc9

The dictionary bench looped [3, 19] while the non-dict bench covers [1, 3, 19, 22], so levels 1 and 22 had no dict coverage. Align to LEVELS; verified L1/L22 dict round-trip on the wasm build.

polaz merged commit dc9e3f7 into main Jun 21, 2026
28 checks passed

polaz deleted the perf/compress-dict-ldm-realloc branch June 21, 2026 16:16

sw-release-bot Bot mentioned this pull request Jun 21, 2026

chore: release v0.0.43 #438

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(encode): reuse the LDM producer table across dict frames#440

perf(encode): reuse the LDM producer table across dict frames#440
polaz merged 2 commits into
mainfrom
perf/compress-dict-ldm-realloc

polaz commented Jun 21, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 21, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented Jun 21, 2026

Uh oh!

greptile-apps Bot commented Jun 21, 2026 •

edited

Loading

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Comments Outside Diff (1)

Uh oh!

polaz commented Jun 21, 2026

Uh oh!

coderabbitai Bot commented Jun 21, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

polaz commented Jun 21, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Impact

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented Jun 21, 2026

Codecov Report

Uh oh!

greptile-apps Bot commented Jun 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Comments Outside Diff (1)

Uh oh!

polaz commented Jun 21, 2026

Uh oh!

coderabbitai Bot commented Jun 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

polaz commented Jun 21, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 21, 2026 •

edited

Loading

greptile-apps Bot commented Jun 21, 2026 •

edited

Loading

coderabbitai Bot commented Jun 21, 2026 •

edited

Loading