Skip to content

fix(fast-path): recurse into child fields when computing next_fid#46

Open
FANNG1 wants to merge 1 commit into
daft-engine:mainfrom
FANNG1:fix/fast-path-next-fid-child-ids
Open

fix(fast-path): recurse into child fields when computing next_fid#46
FANNG1 wants to merge 1 commit into
daft-engine:mainfrom
FANNG1:fix/fast-path-next-fid-child-ids

Conversation

@FANNG1

@FANNG1 FANNG1 commented Jun 26, 2026

Copy link
Copy Markdown

Fixes #41

Problem

FastPathFragmentWriter computed the field ID for new columns as:

next_fid = max(f.id() for f in self.lance_ds.lance_schema.fields()) + 1

lance_schema.fields() returns only top-level fields. A struct column with M children occupies M+1 consecutive Lance field IDs (1 for the parent + M for the children). The top-level-only scan misses the child IDs, so next_fid can collide with an existing child.

Example for a table with id (fid=0) and a struct meta (fid=1) with children meta.a (fid=2) and meta.b (fid=3):

Scan max fid next_fid Effect
Top-level only 1 2 New file's "fields": [2] → Lance maps to meta.a
Recursive (fix) 3 4 Correct field ID for the new column

With the wrong next_fid, the committed fragment metadata maps the new file to the wrong schema field. Lance finds no file for the new column's actual field ID and returns null for every row.

The same problem occurs with any nested type: fixed_size_list, blob-v2, maps, etc.

Fix

Replace the flat max() with a recursive _max_field_id() that walks all descendants via f.children():

def _max_field_id(fields) -> int:
    best = -1
    for f in fields:
        best = max(best, f.id())
        children = f.children() if callable(getattr(f, "children", None)) else []
        if children:
            best = max(best, _max_field_id(children))
    return best

next_fid = _max_field_id(self.lance_ds.lance_schema.fields()) + 1

Test

TestRegressions::test_next_fid_skips_nested_child_field_ids — creates a table with a struct column (occupying 3 field IDs), merges a new scalar column via fast path, and asserts the column reads back as non-null with correct values.

lance_schema.fields() returns only top-level fields. Nested types (struct,
fixed_size_list, blob-v2, etc.) have child fields with their own IDs.
max(f.id() for top-level f) skips these, so next_fid can collide with an
existing child field ID. The committed fragment metadata then maps the new
column file to the wrong schema field, causing lance to read null for the
new column.

Fix: use recursive _max_field_id() that walks all descendants via f.children().

Fixes daft-engine#41
@FANNG1 FANNG1 force-pushed the fix/fast-path-next-fid-child-ids branch from e070c3e to 9233c54 Compare June 26, 2026 09:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(fast-path): next_fid collides with nested struct child field IDs, causing new columns to read back as null

1 participant