Code Review
This pull request introduces support for the box keyword to automatically box large semantic values (such as %tokentype or non-terminal RuleTypes) in the generated parser's data stack enum, optimizing memory usage. The parser generator automatically handles wrapping (Box::new) and unwrapping (*val) during reduce actions. Feedback on the pull request suggests optimizing the parsing of rule typenames in rusty_lr_parser/src/grammar.rs to avoid redundant cloning and parsing of the TokenStream by combining the checks for boxing and placeholder types.
      let is_boxed = if let Some(rt) = &rules_arg.typename {
          let (boxed, _) = check_and_strip_box(rt.clone());
          boxed
      } else {
          false
      };

      let ruletype = if rules_arg.typename.is_none() {
          None
      } else if is_placeholder_type(&rules_arg.typename) {
          let placeholder_name = format_ident!("__rustylr_placeholder_{}", rules_arg.name.value());
          Some(quote! { #placeholder_name })
      } else {
    -     rules_arg.typename.clone()
    +     let (_, stripped) = check_and_strip_box(rules_arg.typename.clone().unwrap());
    +     Some(stripped)
      };
The current implementation clones and parses the TokenStream multiple times for each rule's typename (once for is_boxed, once for is_placeholder_type, and once in the else branch). This is redundant and inefficient.
We can optimize this by calling check_and_strip_box once and checking for the placeholder pattern directly on the stripped stream.
    let (is_boxed, ruletype) = if let Some(rt) = &rules_arg.typename {
        let (boxed, stripped) = check_and_strip_box(rt.clone());
        let is_placeholder = {
            let mut it = stripped.clone().into_iter();
            if let Some(proc_macro2::TokenTree::Ident(ident)) = it.next() {
                ident.to_string() == "_" && it.next().is_none()
            } else {
                false
            }
        };
        if is_placeholder {
            let placeholder_name = format_ident!("__rustylr_placeholder_{}", rules_arg.name.value());
            (boxed, Some(quote! { #placeholder_name }))
        } else {
            (boxed, Some(stripped))
        }
    } else {
        (false, None)
    };
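The suggestion depends on check_and_strip_box returning both the boxed flag and the stripped type in one call. That helper's body is not shown in this conversation, so the following is only a std-only sketch of the contract it appears to have, using string slices in place of a TokenStream. The function bodies, the is_placeholder helper, and resolve_rule_type are all hypothetical stand-ins, not the PR's code.

```rust
/// Hypothetical stand-in for `check_and_strip_box`: reports whether the type
/// starts with the `box` keyword and returns the tokens that follow it.
fn check_and_strip_box<'a, 'b>(tokens: &'a [&'b str]) -> (bool, &'a [&'b str]) {
    match tokens.split_first() {
        Some((first, rest)) if *first == "box" => (true, rest),
        _ => (false, tokens),
    }
}

/// A placeholder type is exactly one `_` token (assumed from the review snippet).
fn is_placeholder(tokens: &[&str]) -> bool {
    matches!(tokens, ["_"])
}

/// Single pass: strip `box` once, then test the stripped tokens for `_`.
fn resolve_rule_type(name: &str, typename: Option<&[&str]>) -> (bool, Option<String>) {
    let Some(tokens) = typename else {
        return (false, None);
    };
    let (boxed, stripped) = check_and_strip_box(tokens);
    if is_placeholder(stripped) {
        (boxed, Some(format!("__rustylr_placeholder_{name}")))
    } else {
        (boxed, Some(stripped.join(" ")))
    }
}

fn main() {
    let boxed_type = ["box", "MyLargeASTNode"];
    let (is_boxed, ruletype) = resolve_rule_type("Expr", Some(&boxed_type[..]));
    println!("boxed = {is_boxed}, ruletype = {ruletype:?}");
}
```

Each typename is inspected once here, which is the property the review asks for; the real code would do the same on a proc_macro2 TokenStream.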
Description
This PR introduces support for the box keyword in front of %tokentype and NonTerminal RuleType definitions (e.g., %tokentype box MyToken; or Expr(box MyLargeASTNode)).

Motivation
In RustyLR, all semantic values are stored in a single unified Data enum representing the parser's stack. Because a Rust enum's size is dictated by its largest variant, any large type inflates the footprint of every stack slot, adding memory and performance overhead. Wrapping those types in Box by hand solves this but is tedious, since it requires manually wrapping and unwrapping the Box inside reduce actions.

This change automates the process:
- Prefixing %tokentype or a rule's return type with box generates ::std::boxed::Box<Type> internally in the Data stack enum.
- Boxed values are automatically unboxed (*val) when popped, exposing the raw unboxed types to reduce actions.
- Rule outputs are automatically wrapped in ::std::boxed::Box::new(...) when pushed back onto the stack.
- […] pop_start) is also automatically unboxed.

This achieves minimal data stack enum layout size with zero manual boilerplate code.
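The size argument in the motivation can be checked with std::mem::size_of. This is a minimal sketch, not RustyLR's generated code: LargeAstNode, DataInline, and DataBoxed are hypothetical types invented for the comparison.

```rust
use std::mem::size_of;

/// Hypothetical large semantic value (256 bytes of payload).
#[allow(dead_code)]
struct LargeAstNode {
    payload: [u64; 32],
}

/// Stack enum holding the large type inline: every slot pays for it.
#[allow(dead_code)]
enum DataInline {
    Token(u8),
    Expr(LargeAstNode),
}

/// Stack enum holding the large type behind a Box: slots stay pointer-sized.
#[allow(dead_code)]
enum DataBoxed {
    Token(u8),
    Expr(Box<LargeAstNode>),
}

fn main() {
    println!("inline: {} bytes", size_of::<DataInline>());
    println!("boxed:  {} bytes", size_of::<DataBoxed>());

    // The inline enum is at least as large as its largest variant.
    assert!(size_of::<DataInline>() >= size_of::<LargeAstNode>());
    // The boxed enum needs at most a pointer plus a discriminant word.
    assert!(size_of::<DataBoxed>() <= 2 * size_of::<usize>());
}
```

The trade-off is one heap allocation per boxed value, which is why boxing is opt-in per type rather than applied to every variant.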
Changes
rusty_lr_parser
- nonterminal_info.rs: Added the ruletype_boxed flag to NonTerminalInfo.
- grammar.rs:
  - Added is_tokentype_boxed to Grammar.
  - Added the check_and_strip_box helper to detect and strip the box keyword from types.
  - Updated Grammar::from_grammar_args and is_placeholder_type to parse and strip the box prefix.
  - Added test_box_keyword_parsing to verify metadata parsing and code emission assertions.
- pattern.rs: Updated NonTerminalInfo helper rules to default ruletype_boxed to false.
- emit.rs:
  - Emits ::std::boxed::Box<Type> within the generated Data enum.
  - Emits unboxing (*val) when popping boxed data stack values.
  - Emits boxing (::std::boxed::Box::new(...)) when pushing boxed rule outputs.

Documentation
- SYNTAX.md: Updated the Memory Optimization with Box section to document the new box keyword syntax and provide usage examples.
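The emit.rs behavior described above (box on push, dereference on pop, plain types inside the reduce action) can be pictured with a hand-written sketch. This is not the code RustyLR generates: Data, Expr, pop_expr, reduce_add, and reduce_add_on_stack are hypothetical names chosen for illustration.

```rust
#[derive(Debug, PartialEq)]
struct Expr {
    value: i64,
}

/// Hypothetical data stack enum with a boxed rule type.
enum Data {
    Token(char),
    Expr(Box<Expr>),
}

/// User-written reduce action: it only ever sees the unboxed `Expr`.
fn reduce_add(lhs: Expr, rhs: Expr) -> Expr {
    Expr { value: lhs.value + rhs.value }
}

/// Pop a boxed value and dereference it (`*val`) before handing it out.
fn pop_expr(stack: &mut Vec<Data>) -> Expr {
    match stack.pop() {
        Some(Data::Expr(val)) => *val,
        _ => panic!("expected an Expr on the data stack"),
    }
}

/// Glue for `Expr: Expr '+' Expr`: unbox inputs, run the action, re-box output.
fn reduce_add_on_stack(stack: &mut Vec<Data>) {
    let rhs = pop_expr(stack);
    match stack.pop() {
        Some(Data::Token('+')) => {}
        _ => panic!("expected '+' on the data stack"),
    }
    let lhs = pop_expr(stack);
    stack.push(Data::Expr(Box::new(reduce_add(lhs, rhs))));
}

fn main() {
    let mut stack = vec![
        Data::Expr(Box::new(Expr { value: 2 })),
        Data::Token('+'),
        Data::Expr(Box::new(Expr { value: 3 })),
    ];
    reduce_add_on_stack(&mut stack);
    println!("{:?}", pop_expr(&mut stack));
}
```

Keeping the Box handling in the glue function is what lets reduce actions stay free of wrapping and unwrapping boilerplate.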