Removed ruleset, dense option#67
Conversation
There was a problem hiding this comment.
Code Review
This pull request replaces the static %dense directive with a dynamic table layout strategy (TableLayout), allowing the parser to automatically choose between DenseState and SparseState based on estimated memory footprints and a configurable dense_limit. Additionally, it removes the unused ruleset fields and related serialization logic from the parser states, simplifying the generated code. Feedback on the changes suggests guarding against a potential division-by-zero error in the dynamic layout calculation and refining the memory footprint heuristic to account for collection and struct overheads.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| // Dense table memory estimation: | ||
| // Each slot is an Option<ShiftTarget<StateIndex>>, which takes ~4 bytes (assuming 16-bit state index). | ||
| total_dense_size += (term_span + nonterm_span + reduce_span) * 4; | ||
|
|
||
| // Sparse table memory estimation: | ||
| // Each slot stores a tuple (Symbol, ShiftTarget), taking ~8 bytes. | ||
| total_sparse_size += (term_count + nonterm_count + reduce_count) * 8; |
There was a problem hiding this comment.
The memory footprint estimation for both DenseState and SparseState is a helpful heuristic, but it currently omits several significant overheads:
- Dense Table Overhead: It only estimates the size of the transition/reduce tables (
term_span,nonterm_span,reduce_span), but ignores the memory occupied by the key vectors (shift_goto_map_class_keysandshift_goto_map_nonterm_keys) and the heap/struct overhead of the threeVecs (which is 24 bytes perVecon 64-bit systems, totaling 72 bytes per state). - Sparse Table Overhead: It estimates 8 bytes per slot, but
HashMaphas a much larger overhead per entry (typically 16-32 bytes due to bucket metadata and padding) plus theHashMapstruct overhead itself (48 bytes per map, totaling 144 bytes per state).
While this is a heuristic, the absolute check total_dense_size <= self.dense_limit might be significantly off from the actual heap allocation size. Consider documenting these omissions or refining the heuristic to account for these overheads.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Implemented compile-time memory footprint estimation to dynamically decide between DenseState (direct indexing, fast O(1)) and SparseState (binary search O(log K), memory efficient).
Auto-detection uses a user-configurable limit (defaulting to 32KB) that determines if the dense representation fits cleanly in L1/L2 caches.
Replaced
--densewith--layout <auto|dense|sparse>and added--dense-limit <limit>inarg.rs/main.rsAdded
.layout()and.dense_limit()builder methods inrusty_lr_buildscript
while retaining a deprecated
.dense()forwarding method for backward compatibility.Removed
%denseoption