Preserve null-valued keys in map literals (#2391)#2412
Conversation
|
Hi @jrgemignani , @MuhammadTahaNaveed could you pls review this.. |
f53fddc to
df92b80
Compare
Sorry for the delay |
There was a problem hiding this comment.
Pull request overview
This PR fixes Cypher map literal semantics so that keys with null values are preserved (e.g., RETURN {a: null} returns {"a": null} instead of {}), aligning AGE behavior with openCypher/Neo4j.
Changes:
- Set the grammar-created
cypher_mapnodes to preservenullvalues by default (keep_null = true). - Add regression tests covering
null-valued map entries (including nested maps andkeys()/coalesce()behavior), plus controls for CREATE property stripping. - Update expected regression outputs impacted by the corrected semantics (including an orderability test containing a nested map with a
nullentry).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/backend/parser/cypher_gram.y |
Changes map literal default to preserve null values via keep_null = true. |
regress/sql/expr.sql |
Adds dedicated regression coverage for Issue #2391 and controls for CREATE behavior. |
regress/expected/expr.out |
Updates expected outputs for preserved null keys and new Issue #2391 test block. |
regress/expected/agtype.out |
Updates orderability expected output affected by preserved nested null map entry. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
df92b80 to
c1a3cba
Compare
gregfelice
left a comment
There was a problem hiding this comment.
Reviewed the grammar change and traced every keep_null reference. The core fix is correct, with one finding worth addressing before merge.
Core fix — correct. Flipping the map: rule default from false to true gives bare map literals the right openCypher semantics. The consumer at cypher_expr.c:1072 chooses agtype_build_map_nonull (strip) vs agtype_build_map (keep) off keep_null, and the write paths all set it explicitly, so the default flip only affects nodes that relied on it:
- CREATE props (
cypher_clause.c:7049) →false(still strips) ✓ - MATCH node/rel patterns (
5205,5318) →true✓ - SET (
2375) →is_add(+=keeps,=strips) ✓
So bare/nested map expressions are the only affected cases — exactly the bug. Nice and minimal.
Finding — the change reaches a bit further than the summary states. The CREATE/SET overrides set keep_null = false on the top-level property map node only; they don't recurse. A nested map value inside a CREATE/SET property is its own cypher_map node, which now inherits the new true default. So:
CREATE (n {a: {b: null}})
-- before: stores a -> {}
-- after: stores a -> {"b": null}That's a real change in stored data for CREATE/SET, whereas the description says CREATE/SET behaviour is unchanged. I think the new behaviour is arguably more consistent (a nested map literal should preserve its keys regardless of context), so I'm not asking you to suppress it — but two small things:
- Add regression coverage for the nested case under a write — e.g.
CREATE (n {a: {b: null}}) RETURN nandMATCH (n) SET n = {a: {b: null}} RETURN n— so the nested-preserve behaviour is pinned and a future refactor can't silently revert it. - Tighten the wording from "leaves CREATE / SET behaviour unchanged" to something like "top-level CREATE/SET property null-stripping is unchanged" to reflect the nested-value change.
Before merge (independent): I don't see a CI run on this branch — the gate wants a green PG18 cassert build, so it'd be good to get CI to run. Locally-reported 32/33 with the known pre-existing age_upgrade failure matches master.
Otherwise this looks good — clean fix, clear root-cause analysis, solid coverage on the expression paths.
|
@gregfelice Thanks for the review. To enable CI running in PR flow, someone from maintainers group who has permission to do that, can trigger the CI, henc it did not start the CI yet. |
Map literals such as RETURN {a: null} previously dropped keys whose
values were null, producing {} instead of {"a": null}. This
diverged from the openCypher / Neo4j semantics where map literals
preserve every key the user wrote, including those bound to null.
Root cause: cypher_map.keep_null defaulted to false (zero-initialised),
so the grammar-produced node fed agtype_build_map_nonull, which strips
null entries. Call sites that legitimately need strip-null semantics
(CREATE node/edge property maps and SET = assignments) already set
keep_null=false explicitly, and the MATCH pattern path sets it to true
explicitly. Flipping the grammar default to true therefore only affects
the cases that were buggy (bare map expressions and nested map values),
and leaves CREATE/SET behaviour unchanged.
Two preexisting tests encoded the old buggy output and are updated:
expr.out (bare RETURN maps now keep the null value) and agtype.out
(a nested map inside an orderability test no longer drops its null
entry, shifting one row in the ORDER BY result). Dedicated regression
coverage for apache#2391 is added to regress/sql/expr.sql.
- Move 'End of tests' to after the Issue 2391 test block so it marks the actual end of the file. - Remove 'in order' from the mixed-values comment since map key ordering is not guaranteed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Map literals now build with agtype_build_map (keep_null) instead of agtype_build_map_nonull, so the accessor-optimization EXPLAIN plan in expr.out must reflect the new function name. Also strip a stray trailing CRLF on the last line of expr.sql/expr.out that leaked a carriage return into the regression output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
A reviewer noted the keep_null=true default reaches further than the
PR summary stated: CREATE / SET = only override keep_null=false on the
top-level property map, not recursively. A nested map value is its own
cypher_map node, so it now inherits the new default and preserves its
null-valued keys (e.g. CREATE (n {a: {b: null}}) stores a -> {"b": null}).
- regress/sql/expr.sql, regress/expected/expr.out: pin the nested case
under a write with CREATE (n:Nested {a: {b: null}}), MATCH ... SET
n = {a: {b: null}}, and a MATCH verify.
- src/backend/parser/cypher_gram.y: clarify the map: rule comment to
state the CREATE / SET override is top-level only and nested map
values keep the null-preserving default.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bf51abf to
71f8d44
Compare
|
Thanks @gregfelice — good catch, and you're right on both counts. I traced it the same way: the CREATE ( CREATE (n {a: {b: null}})
-- before: a -> {}
-- after: a -> {"b": null}Agreed this is the more consistent behaviour, so I've pinned it rather than suppressed it. Addressed in 71f8d44:
On CI: full |
Summary
Fixes #2391.
Map literals such as
RETURN {a: null}previously dropped keys whose values were null, producing{}instead of{"a": null}. This diverged from the openCypher / Neo4j semantics where map literals preserve every key the user wrote, including those bound to null.Root cause
cypher_map.keep_nulldefaulted tofalse(zero-initialised), so the grammar-produced node fedagtype_build_map_nonull, which strips null entries. Call sites that legitimately need strip-null semantics (CREATE node/edge property maps and SET=assignments) already setkeep_null = falseexplicitly on the top-level property map, and the MATCH pattern path sets it totrueexplicitly. Flipping the grammar default totruetherefore only affects the cases that were buggy (bare map expressions and nested map values).Top-level CREATE / SET property null-stripping is unchanged — those call sites override
keep_null = falseon the top-level property map node. Note the override is not recursive: a nested map value inside a CREATE / SET property is its owncypher_mapnode, so it now inherits the newtruedefault and preserves its null-valued keys (e.g.CREATE (n {a: {b: null}})storesa -> {"b": null}rather thana -> {}). This is the intended, more consistent behaviour — a nested map literal should preserve its keys regardless of context — and is pinned by new regression tests.Changes
src/backend/parser/cypher_gram.y— themap:rule now setsn->keep_null = truewith an explanatory comment noting the top-level-only nature of the CREATE / SET override.regress/sql/expr.sql— dedicated regression coverage for Map literals may drop keys whose values arenull. #2391 (single-key null, multiple nulls,keys(),coalesce, nested map values, mixed null / non-null, empty map, top-level CREATE strip control, MATCH verify, and nested map values under CREATE / SET=).regress/expected/expr.out— two preexisting tests that encoded the old buggy output now show the preserved"z": null; plus expected output for the new Map literals may drop keys whose values arenull. #2391 block.regress/expected/agtype.out— nested{bool: true, i: null}inside an orderability test no longer drops its null entry, shifting one row in theORDER BYresult.Behaviour before / after
CREATE and SET
=still strip top-level null-valued keys (unchanged), matching openCypher semantics for property writes. Nested map literals inside a write now preserve their null keys, e.g.CREATE (n {a: {b: null}})storesa -> {"b": null}.Test results
make installcheckagainst PostgreSQL 18: 32 / 33 pass. The only failure isage_upgrade, which is a preexisting failure unrelated to this change.