fix: map-left/right arg swap, typeless () columns, empty-filter LIST drop#254
Merged
Conversation
The auto-swap heuristic in map-left/map-right fired when the iterated slot was scalar and the fixed slot was a vector, and in that branch each function performed the OTHER variant's semantics (the author's own comment flagged it: "but we want fn(fixed=scalar, elem)"). The plain paths already match the documented convention (map-left fixes the left arg and iterates the right; map-right fixes the right and iterates the left), and map_iterate handles scalar broadcast via its scalar early-return. Removing the auto-swap blocks makes the two ops clean mirror images with no scalar/vector special-casing.
An empty generic list () used as a column was coerced to I64 in ray_table_fn, so the column rejected any non-integer atom on insert (float/sym/str/date → error: type). q/kdb treats () as a typeless column that adopts the type of the first inserted value. - ray_table_fn: an empty () list column is now stored as an empty RAY_LIST (typeless) instead of an empty I64 vector. - insert: when the target column is a typeless empty RAY_LIST, derive the storage type from the inserted value (atom → its type; typed vec or generic atom-list → element type with I64→F64 promotion; nested payload → stays LIST). The table has 0 rows in this case, so there is nothing to copy and the derived type drives the new column. Also corrects test/rfl/table/query.rfl:11, whose expected empty-select result wrote (list) for the STR Tape column; that only matched before because () was coerced to an empty I64 vec and empty vectors compared equal regardless of element type. The column is STR, so its empty form is (as 'STR []). Adds regression tests in table/update.rfl for first-insert type adoption across F64/SYM/STR/DATE/TIME and repeated typed inserts.
sel_compact's none-pass branch (a WHERE that matches 0 rows) built each empty result column with ray_vec_new(ct, 0). But RAY_LIST == 0 and ray_vec_new rejects type <= 0, returning an error; the 'if (nc && !RAY_IS_ERR(nc))' guard then silently skipped the column. So an empty select/where over a table with a nested LIST column dropped that column from the 0-row result, while a non-empty result (which goes through the gather path that handles RAY_LIST explicitly) kept it — the result schema depended on how many rows matched. Use ray_list_new(0) for LIST columns, mirroring the gather path. Adds regression tests in ops/filter.rfl asserting the result schema is identical for empty and non-empty filters, including through a by-group. Pre-existing bug, surfaced while auditing empty-column handling.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Three related correctness fixes found while auditing iterator and empty-column handling.
1.
map-left/map-rightargument swap (53a8174b)The auto-swap heuristic in
map-left/map-rightfired when the iterated slot was scalar and the fixed slot a vector, and in that branch each function performed the other variant's semantics (the author's own comment flagged it:but we want fn(fixed=scalar, elem)). The plain paths already match the documented convention (map-leftfixes the left arg and iterates the right;map-rightfixes the right and iterates the left), so the auto-swap blocks are removed. The two ops are now clean mirror images.2. Typeless empty
()column adopts type on first insert (bfc0a1e0)An empty
()list used as a column was coerced to I64 inray_table_fn, so it rejected any non-integer atom on insert (float/sym/str/date→error: type). Now an empty()column is kept as a typelessRAY_LISTand adopts the type of the first inserted value (q/kdb semantics). Also corrects atable/query.rflexpectation that only matched because()used to be coerced to an empty I64 vec.3. Empty-filter drops LIST columns (
1ecdf90d)sel_compact's none-pass branch (aWHEREmatching 0 rows) built empty columns withray_vec_new(ct, 0); sinceRAY_LIST == 0andray_vec_newrejects type ≤ 0, the column was silently dropped — so the 0-row result schema differed from the non-empty one. Fixed to useray_list_new(0)for LIST columns. Pre-existing bug.Tests
Full suite green: 3443/3445 passed (2 skipped, 0 failed). Added regression tests in
table/update.rfl(first-insert type adoption) andops/filter.rfl(empty-filter schema preservation).