Skip to content

Preserve null-valued keys in map literals (#2391)#2412

Merged
jrgemignani merged 4 commits into
apache:masterfrom
crprashant:fix/2391-map-literal-null-keys
Jun 29, 2026
Merged

Preserve null-valued keys in map literals (#2391)#2412
jrgemignani merged 4 commits into
apache:masterfrom
crprashant:fix/2391-map-literal-null-keys

Conversation

@crprashant

@crprashant crprashant commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #2391.

Map literals such as RETURN {a: null} previously dropped keys whose values were null, producing {} instead of {"a": null}. This diverged from the openCypher / Neo4j semantics where map literals preserve every key the user wrote, including those bound to null.

Root cause

cypher_map.keep_null defaulted to false (zero-initialised), so the grammar-produced node fed agtype_build_map_nonull, which strips null entries. Call sites that legitimately need strip-null semantics (CREATE node/edge property maps and SET = assignments) already set keep_null = false explicitly on the top-level property map, and the MATCH pattern path sets it to true explicitly. Flipping the grammar default to true therefore only affects the cases that were buggy (bare map expressions and nested map values).

Top-level CREATE / SET property null-stripping is unchanged — those call sites override keep_null = false on the top-level property map node. Note the override is not recursive: a nested map value inside a CREATE / SET property is its own cypher_map node, so it now inherits the new true default and preserves its null-valued keys (e.g. CREATE (n {a: {b: null}}) stores a -> {"b": null} rather than a -> {}). This is the intended, more consistent behaviour — a nested map literal should preserve its keys regardless of context — and is pinned by new regression tests.

Changes

  • src/backend/parser/cypher_gram.y — the map: rule now sets n->keep_null = true with an explanatory comment noting the top-level-only nature of the CREATE / SET override.
  • regress/sql/expr.sql — dedicated regression coverage for Map literals may drop keys whose values are null. #2391 (single-key null, multiple nulls, keys(), coalesce, nested map values, mixed null / non-null, empty map, top-level CREATE strip control, MATCH verify, and nested map values under CREATE / SET =).
  • regress/expected/expr.out — two preexisting tests that encoded the old buggy output now show the preserved "z": null; plus expected output for the new Map literals may drop keys whose values are null. #2391 block.
  • regress/expected/agtype.out — nested {bool: true, i: null} inside an orderability test no longer drops its null entry, shifting one row in the ORDER BY result.

Behaviour before / after

RETURN {a: null} AS m
-- before: {}
-- after:  {"a": null}

RETURN keys({a: null}) AS ks
-- before: []
-- after:  ["a"]

CREATE and SET = still strip top-level null-valued keys (unchanged), matching openCypher semantics for property writes. Nested map literals inside a write now preserve their null keys, e.g. CREATE (n {a: {b: null}}) stores a -> {"b": null}.

Test results

make installcheck against PostgreSQL 18: 32 / 33 pass. The only failure is age_upgrade, which is a preexisting failure unrelated to this change.

@crprashant

Copy link
Copy Markdown
Contributor Author

Hi @jrgemignani , @MuhammadTahaNaveed could you pls review this..

@crprashant crprashant force-pushed the fix/2391-map-literal-null-keys branch 2 times, most recently from f53fddc to df92b80 Compare April 27, 2026 18:59
@jrgemignani jrgemignani requested a review from Copilot June 8, 2026 18:33
@jrgemignani

Copy link
Copy Markdown
Contributor

Hi @jrgemignani , @MuhammadTahaNaveed could you pls review this..

Sorry for the delay

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes Cypher map literal semantics so that keys with null values are preserved (e.g., RETURN {a: null} returns {"a": null} instead of {}), aligning AGE behavior with openCypher/Neo4j.

Changes:

  • Set the grammar-created cypher_map nodes to preserve null values by default (keep_null = true).
  • Add regression tests covering null-valued map entries (including nested maps and keys()/coalesce() behavior), plus controls for CREATE property stripping.
  • Update expected regression outputs impacted by the corrected semantics (including an orderability test containing a nested map with a null entry).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
src/backend/parser/cypher_gram.y Changes map literal default to preserve null values via keep_null = true.
regress/sql/expr.sql Adds dedicated regression coverage for Issue #2391 and controls for CREATE behavior.
regress/expected/expr.out Updates expected outputs for preserved null keys and new Issue #2391 test block.
regress/expected/agtype.out Updates orderability expected output affected by preserved nested null map entry.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread regress/sql/expr.sql Outdated
Comment thread regress/sql/expr.sql Outdated
@crprashant crprashant force-pushed the fix/2391-map-literal-null-keys branch from df92b80 to c1a3cba Compare June 8, 2026 21:25

@gregfelice gregfelice left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the grammar change and traced every keep_null reference. The core fix is correct, with one finding worth addressing before merge.

Core fix — correct. Flipping the map: rule default from false to true gives bare map literals the right openCypher semantics. The consumer at cypher_expr.c:1072 chooses agtype_build_map_nonull (strip) vs agtype_build_map (keep) off keep_null, and the write paths all set it explicitly, so the default flip only affects nodes that relied on it:

  • CREATE props (cypher_clause.c:7049) → false (still strips) ✓
  • MATCH node/rel patterns (5205, 5318) → true
  • SET (2375) → is_add (+= keeps, = strips) ✓

So bare/nested map expressions are the only affected cases — exactly the bug. Nice and minimal.

Finding — the change reaches a bit further than the summary states. The CREATE/SET overrides set keep_null = false on the top-level property map node only; they don't recurse. A nested map value inside a CREATE/SET property is its own cypher_map node, which now inherits the new true default. So:

CREATE (n {a: {b: null}})
-- before: stores  a -> {}
-- after:  stores  a -> {"b": null}

That's a real change in stored data for CREATE/SET, whereas the description says CREATE/SET behaviour is unchanged. I think the new behaviour is arguably more consistent (a nested map literal should preserve its keys regardless of context), so I'm not asking you to suppress it — but two small things:

  1. Add regression coverage for the nested case under a write — e.g. CREATE (n {a: {b: null}}) RETURN n and MATCH (n) SET n = {a: {b: null}} RETURN n — so the nested-preserve behaviour is pinned and a future refactor can't silently revert it.
  2. Tighten the wording from "leaves CREATE / SET behaviour unchanged" to something like "top-level CREATE/SET property null-stripping is unchanged" to reflect the nested-value change.

Before merge (independent): I don't see a CI run on this branch — the gate wants a green PG18 cassert build, so it'd be good to get CI to run. Locally-reported 32/33 with the known pre-existing age_upgrade failure matches master.

Otherwise this looks good — clean fix, clear root-cause analysis, solid coverage on the expression paths.

@crprashant

Copy link
Copy Markdown
Contributor Author

@gregfelice Thanks for the review. To enable CI running in PR flow, someone from maintainers group who has permission to do that, can trigger the CI, henc it did not start the CI yet.
@jrgemignani can do that i believe.

crprashant and others added 3 commits June 26, 2026 12:10
Map literals such as RETURN {a: null} previously dropped keys whose
values were null, producing {} instead of {"a": null}. This
diverged from the openCypher / Neo4j semantics where map literals
preserve every key the user wrote, including those bound to null.

Root cause: cypher_map.keep_null defaulted to false (zero-initialised),
so the grammar-produced node fed agtype_build_map_nonull, which strips
null entries. Call sites that legitimately need strip-null semantics
(CREATE node/edge property maps and SET = assignments) already set
keep_null=false explicitly, and the MATCH pattern path sets it to true
explicitly. Flipping the grammar default to true therefore only affects
the cases that were buggy (bare map expressions and nested map values),
and leaves CREATE/SET behaviour unchanged.

Two preexisting tests encoded the old buggy output and are updated:
expr.out (bare RETURN maps now keep the null value) and agtype.out
(a nested map inside an orderability test no longer drops its null
entry, shifting one row in the ORDER BY result). Dedicated regression
coverage for apache#2391 is added to regress/sql/expr.sql.
- Move 'End of tests' to after the Issue 2391 test block so it marks
  the actual end of the file.
- Remove 'in order' from the mixed-values comment since map key
  ordering is not guaranteed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Map literals now build with agtype_build_map (keep_null) instead of
agtype_build_map_nonull, so the accessor-optimization EXPLAIN plan in
expr.out must reflect the new function name. Also strip a stray trailing
CRLF on the last line of expr.sql/expr.out that leaked a carriage return
into the regression output.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
A reviewer noted the keep_null=true default reaches further than the
PR summary stated: CREATE / SET = only override keep_null=false on the
top-level property map, not recursively. A nested map value is its own
cypher_map node, so it now inherits the new default and preserves its
null-valued keys (e.g. CREATE (n {a: {b: null}}) stores a -> {"b": null}).

- regress/sql/expr.sql, regress/expected/expr.out: pin the nested case
  under a write with CREATE (n:Nested {a: {b: null}}), MATCH ... SET
  n = {a: {b: null}}, and a MATCH verify.
- src/backend/parser/cypher_gram.y: clarify the map: rule comment to
  state the CREATE / SET override is top-level only and nested map
  values keep the null-preserving default.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@crprashant crprashant force-pushed the fix/2391-map-literal-null-keys branch from bf51abf to 71f8d44 Compare June 26, 2026 19:19
@crprashant

Copy link
Copy Markdown
Contributor Author

Thanks @gregfelice — good catch, and you're right on both counts. I traced it the same way: the CREATE (cypher_clause.c:7049) and SET = (cypher_clause.c:2378) overrides only set keep_null = false on the top-level property map node and don't recurse, so a nested map value is its own cypher_map node that now inherits the true default and preserves its null keys. Confirmed empirically:

CREATE (n {a: {b: null}})
-- before: a -> {}
-- after:  a -> {"b": null}

Agreed this is the more consistent behaviour, so I've pinned it rather than suppressed it. Addressed in 71f8d44:

  1. Nested write coverage — added to the Issue 2391 block in expr.sql/expr.out: CREATE (n:Nested {a: {b: null}}) RETURN n, MATCH (n:Nested) SET n = {a: {b: null}} RETURN n, and a MATCH verify, so the nested-preserve behaviour can't silently revert.
  2. Wording — tightened the PR description and the map: rule comment in cypher_gram.y from "leaves CREATE / SET behaviour unchanged" to "top-level CREATE / SET property null-stripping is unchanged", and the comment now notes nested map values keep the null-preserving default.

On CI: full make installcheck against PostgreSQL 18 passes locally (all tests green, including age_upgrade) on the rebased branch. The branch is now rebased onto current master.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

@jrgemignani jrgemignani merged commit 54668e1 into apache:master Jun 29, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Map literals may drop keys whose values are null.

4 participants