
perf: fuse chained std.map calls into ComposedMappedArr #902

Closed
He-Pin wants to merge 2 commits into databricks:master from He-Pin:perf/map-map-view-fusion

He-Pin (Contributor) commented Jun 6, 2026

Motivation

When std.map(f, std.map(g, arr)) is evaluated, two MappedArr views are nested. Each element access traverses two cache layers and two levels of indirection. This pattern is common in Jsonnet when splitting transformation logic across multiple map calls for readability.
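The nesting described above can be modeled with a small sketch. Everything here (`LazyArr`, `SourceArr`, `MappedView`, `NestedViews`) is a hypothetical simplification over `Int` elements, not sjsonnet's actual classes; it only illustrates why each access to the outer view passes through two caches and two hops before reaching the source array.

```scala
// Hypothetical, simplified model; these are NOT sjsonnet's real classes.
trait LazyArr {
  def length: Int
  def at(i: Int): Int
}

// A plain, already-evaluated array.
final class SourceArr(values: Array[Int]) extends LazyArr {
  def length: Int = values.length
  def at(i: Int): Int = values(i)
}

// A lazy view: applies `f` on first access and caches the result per index.
final class MappedView(source: LazyArr, f: Int => Int) extends LazyArr {
  private val cache = new Array[Int](source.length)
  private val filled = new Array[Boolean](source.length)

  def length: Int = source.length

  def at(i: Int): Int = {
    if (!filled(i)) {
      // One hop into the underlying array; if that array is itself a
      // MappedView, it performs its own cache check and its own hop.
      cache(i) = f(source.at(i))
      filled(i) = true
    }
    cache(i)
  }
}

object NestedViews {
  // Models std.map(f, std.map(g, arr)): the outer view wraps the inner view,
  // so a cold access touches two caches before reaching `arr`.
  def nested(arr: Array[Int], f: Int => Int, g: Int => Int): LazyArr =
    new MappedView(new MappedView(new SourceArr(arr), g), f)
}
```

In this model, `NestedViews.nested(arr, f, g).at(i)` fills the inner cache with `g(arr(i))` and the outer cache with `f(g(arr(i)))`, which is the double bookkeeping the PR aims to remove.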

Modification

In Arr.mapped(), detect when the source is already a MappedArr with live state and create a ComposedMappedArr that applies both functions in sequence from the original source array. This eliminates one cache layer and one level of indirection per element access.

Key design decisions:

  • Store separate outerCallPos and innerCallPos for correct error stack frames
  • Guard on inner.source != null to avoid using a fully-consumed (released) inner view
  • Caching is inherited from LazyViewArr — repeated access to the same index returns the cached result without recomputation
  • MappedArr fields (source, func, callPos) widened to private[Val] for extraction
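A minimal sketch of the fusion described above, under stated assumptions: the class names follow the PR text, but the bodies, the `Int` element type, the `String` call positions, and the `Strict` and `Fusion` helpers are hypothetical and do not reproduce sjsonnet's real signatures.

```scala
// Hypothetical sketch; names follow the PR text, bodies are assumptions.
trait Arr {
  def length: Int
  def at(i: Int): Int
}

final class Strict(values: Array[Int]) extends Arr {
  def length: Int = values.length
  def at(i: Int): Int = values(i)
}

// Shared per-index caching, standing in for what the PR calls LazyViewArr.
abstract class LazyViewArr(val length: Int) extends Arr {
  private val cache = new Array[Int](length)
  private val filled = new Array[Boolean](length)

  protected def compute(i: Int): Int

  final def at(i: Int): Int = {
    if (!filled(i)) { cache(i) = compute(i); filled(i) = true }
    cache(i)
  }
}

// `source` is a var so a fully consumed view can release it (set it to null).
final class MappedArr(var source: Arr, val func: Int => Int, val callPos: String)
    extends LazyViewArr(source.length) {
  protected def compute(i: Int): Int = func(source.at(i))
}

// Applies inner then outer directly on the original source: one cache, one hop.
final class ComposedMappedArr(
    source: Arr,
    inner: Int => Int,
    outer: Int => Int,
    val innerCallPos: String,
    val outerCallPos: String
) extends LazyViewArr(source.length) {
  protected def compute(i: Int): Int = outer(inner(source.at(i)))
}

object Fusion {
  // Fuse only when the inner view still holds its source (the "live" guard).
  def mapped(src: Arr, f: Int => Int, callPos: String): Arr = src match {
    case m: MappedArr if m.source != null =>
      new ComposedMappedArr(m.source, m.func, f, m.callPos, callPos)
    case _ =>
      new MappedArr(src, f, callPos)
  }
}
```

One consequence visible in this sketch (it may or may not hold for the real implementation): the fused view bypasses the inner view's cache, so if the inner view is also read directly elsewhere, the inner function can run more than once for the same index.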

Result

Scala Native A/B (Scala Native 0.5.12, macOS arm64, hyperfine --warmup 3):

| Benchmark | Baseline | PR | Delta |
|---|---|---|---|
| lazy_array_sparse_indexing | 81.2 ms | 35.1 ms | 2.31x faster |

JVM JMH (JDK 21, G1GC, -Xmx4G, @Fork(1) @Warmup(1) @Measurement(1)):

| Benchmark | Baseline | PR | Delta |
|---|---|---|---|
| lazy_array_sparse_indexing | 4.52 ms | 4.45 ms | -1.4% (noise) |
| lazy_array_comprehension | 24.1 ms | 45.8 ms | +90% |
| lazy_array_reverse_sparse | 3.66 ms | 9.01 ms | +146% |

Note: JVM JMH shows regressions on lazy_array_comprehension and lazy_array_reverse_sparse. The ComposedMappedArr fusion adds overhead when the fused view is consumed through comprehension or reverse paths. The Scala Native hyperfine result on lazy_array_sparse_indexing (the primary target workload) shows clear improvement.

Comprehensive test coverage added in lazy_array_views.jsonnet: full materialization, repeated indexed access, reverse, and foldl on fused chains.

References

He-Pin marked this pull request as draft June 6, 2026 20:29
He-Pin marked this pull request as ready for review June 6, 2026 23:24
He-Pin marked this pull request as draft June 7, 2026 09:09
He-Pin added 2 commits June 9, 2026 10:58
Motivation:
When `std.map(f, std.map(g, arr))` is evaluated, two MappedArr views are
nested. Each element access traverses two cache layers and two indirection
levels. Common in Jsonnet when splitting logic across multiple map calls.

Modification:
In `Arr.mapped()`, detect when the source is already a `MappedArr` with
live state and create a `ComposedMappedArr` that applies both functions
in sequence from the original source, eliminating one cache layer and
one level of indirection per element access.

Result:
Native: lazy_array_sparse_indexing 20.5 → 19.7ms (-3.9%)
Existing test coverage: lazy_array_views.jsonnet tests chained maps.

Motivation:
ComposedMappedArr was using the outer map's callPos for both inner and
outer function calls, causing error stack frames to point to the wrong
source position when the inner function fails.

Modification:
- Store separate outerCallPos and innerCallPos
- Make MappedArr.callPos accessible as private[Val] for extraction
- Add comprehensive fused map chain tests (full materialization,
  repeated indexed access, reverse, foldl)

Result:
Correct error positions when inner function of a fused map chain throws.
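
The effect of keeping two call positions can be illustrated with a small model. `PosError`, `CallPos.callAt`, and `CallPos.fused` are hypothetical names invented for this sketch, and positions are plain strings; sjsonnet's real error and position types are not shown here.

```scala
// Hypothetical model of position-tagged errors; NOT sjsonnet's Error type.
final class PosError(val pos: String, cause: Throwable)
    extends RuntimeException(s"error at $pos: ${cause.getMessage}", cause)

object CallPos {
  // Runs f(x) and attributes any failure to the call site `pos`.
  def callAt(pos: String, f: Int => Int, x: Int): Int =
    try f(x)
    catch { case e: Exception => throw new PosError(pos, e) }

  // Fused application: the inner call is evaluated first and reports
  // innerPos; only a failure inside `outer` itself reports outerPos.
  def fused(
      inner: Int => Int,
      innerPos: String,
      outer: Int => Int,
      outerPos: String
  ): Int => Int =
    x => callAt(outerPos, outer, callAt(innerPos, inner, x))
}
```

If both calls shared the outer position, as in the first commit, a failure in the inner function would be reported at the outer `std.map` call site; with separate positions each failure is attributed to the call that raised it.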
He-Pin force-pushed the perf/map-map-view-fusion branch from 4d40b1d to 225dfbf June 9, 2026 02:59
He-Pin (Contributor, Author) commented Jun 9, 2026

I think this scenario should theoretically be very rare, so I'll disable it for now.

He-Pin closed this Jun 9, 2026