perf: bulk numeric array literal parser bypassing fastparse by He-Pin · Pull Request #904 · databricks/sjsonnet

He-Pin · 2026-06-06T21:42:42Z

Motivation

Parsing large numeric array literals like [76,111,114,...] goes through fastparse combinators per element — each number traverses the full expression grammar (expr -> exprSuffix -> atomExpr -> number) before resolving to a simple literal. On Scala Native without JIT, this combinator chain overhead is significant (~1.8ms parse time for a ~2000-element array).

Modification

Commit 1: Bulk numeric array parser

- Add `tryBulkNumericArray`, which detects arrays of pure numeric literals and parses them with a hand-written scanner, bypassing the fastparse expression chain entirely
- Supports integers, floats, negative numbers, scientific notation, and trailing commas
- Falls back to regular parsing for arrays containing non-numeric elements, comprehensions, or comments
- Guards against identifier-like suffixes (`123abc`) to avoid misparsing
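
The scan-or-fall-back behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the signature, the `Option` return type, and the helpers `scanNumber` and `skipWs` are assumptions made for the example, and the real `tryBulkNumericArray` works against sjsonnet's parser state rather than a plain `String`.

```scala
object BulkNumericArraySketch {
  private def isDigit(c: Char): Boolean = c >= '0' && c <= '9'

  // Skips spaces, tabs and newlines only. Comments are deliberately not skipped,
  // so any comment character makes the scan give up and fall back.
  private def skipWs(s: String, from: Int): Int = {
    var i = from
    while (i < s.length && " \t\r\n".indexOf(s.charAt(i)) >= 0) i += 1
    i
  }

  // Returns the index just past a JSON-style number starting at `from`, or -1.
  private def scanNumber(s: String, from: Int): Int = {
    val n = s.length
    var i = from
    if (i < n && s.charAt(i) == '-') i += 1
    val intStart = i
    while (i < n && isDigit(s.charAt(i))) i += 1
    if (i == intStart) return -1                                 // no digits at all
    if (s.charAt(intStart) == '0' && i - intStart > 1) return -1 // leading zero
    if (i < n && s.charAt(i) == '.') {
      val fracStart = i + 1
      i = fracStart
      while (i < n && isDigit(s.charAt(i))) i += 1
      if (i == fracStart) return -1                              // "1." has no fraction digits
    }
    if (i < n && (s.charAt(i) == 'e' || s.charAt(i) == 'E')) {
      i += 1
      if (i < n && (s.charAt(i) == '+' || s.charAt(i) == '-')) i += 1
      val expStart = i
      while (i < n && isDigit(s.charAt(i))) i += 1
      if (i == expStart) return -1                               // "1e" has no exponent digits
    }
    i
  }

  /** Hypothetical stand-in for the PR's tryBulkNumericArray.
    * Some((values, indexAfterClosingBracket)) on success; None tells the caller
    * to re-parse the array with the regular expression grammar.
    */
  def tryBulkNumericArray(src: String, start: Int): Option[(Array[Double], Int)] = {
    val n = src.length
    if (start >= n || src.charAt(start) != '[') return None
    val out = Array.newBuilder[Double]
    var i = skipWs(src, start + 1)
    while (i < n && src.charAt(i) != ']') {
      val end = scanNumber(src, i)
      if (end < 0) return None
      out += java.lang.Double.parseDouble(src.substring(i, end))
      i = skipWs(src, end)
      // A number must be followed by ',' or ']'. This also rejects
      // identifier-like suffixes such as 123abc.
      if (i < n && src.charAt(i) == ',') i = skipWs(src, i + 1)
      else if (i >= n || src.charAt(i) != ']') return None
    }
    if (i >= n) None else Some((out.result(), i + 1))
  }
}
```

In this sketch every element still goes through `Double.parseDouble(substring)`, which matches the first commit as described; the integer fast path belongs to the second commit. Returning `None` rather than throwing keeps the fallback cheap, since the caller can simply retry from `start` with the normal parser.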

Commit 2: SWAR integer parsing + Val.cachedNum

- For simple integers (no decimal point or exponent): parse directly using a 4-digits-at-a-time technique inspired by jsoniter-scala and PR #897 (perf: optimize parseInt with 4-digits-at-a-time parsing), avoiding the `String.substring` allocation and `Double.parseDouble` overhead
- Use `Val.cachedNum` for values 0-255, reusing pre-allocated instances instead of creating new `Val.Num` objects
- Float/exponent numbers still fall back to `Double.parseDouble(substring)`
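
A rough sketch of the integer fast path and the small-value cache described above. The names `parseDigits`, `numFor`, and `cache` are hypothetical; the cache only imitates what `Val.cachedNum` is described as doing, using boxed doubles instead of `Val.Num`. The loop below consumes four characters per iteration with plain arithmetic, which is simpler than the packed-word SWAR tricks used in jsoniter-scala.

```scala
object FastIntSketch {
  // Hypothetical stand-in for Val.cachedNum: one pre-allocated instance per value 0..255.
  private val cache: Array[java.lang.Double] =
    Array.tabulate(256)(i => java.lang.Double.valueOf(i.toDouble))

  /** Parses the ASCII digits in s[from, until) without allocating a substring.
    * Assumes the caller has already checked that the range contains only
    * '0'..'9' and is at most 18 digits long, so the Long cannot overflow.
    */
  def parseDigits(s: String, from: Int, until: Int): Long = {
    var i = from
    var acc = 0L
    // Main loop: fold four digits into the accumulator per iteration.
    while (i + 4 <= until) {
      val d0 = s.charAt(i) - '0'
      val d1 = s.charAt(i + 1) - '0'
      val d2 = s.charAt(i + 2) - '0'
      val d3 = s.charAt(i + 3) - '0'
      acc = acc * 10000 + (d0 * 1000 + d1 * 100 + d2 * 10 + d3)
      i += 4
    }
    // Tail: the remaining zero to three digits, one at a time.
    while (i < until) {
      acc = acc * 10 + (s.charAt(i) - '0')
      i += 1
    }
    acc
  }

  /** Reuses a cached instance for 0..255 and allocates for everything else. */
  def numFor(value: Long): java.lang.Double =
    if (value >= 0 && value <= 255) cache(value.toInt)
    else java.lang.Double.valueOf(value.toDouble)
}
```

For byte-like arrays such as `[76,111,114,...]` every element falls in the cached range, so under these assumptions the fast path allocates nothing per element.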

Result

JVM JMH (JDK 21, G1GC, -Xmx4G, `@Fork(1) @Warmup(1) @Measurement(1)`, 3 runs averaged):

Benchmark	Master	PR	Delta
member (~500 int elements)	0.561 ms	0.074 ms	-86.8% (7.6x)
base64_byte_array (~500 int elements)	0.643 ms	0.136 ms	-78.8% (4.7x)
setDiff (~1700 int elements)	0.436 ms	0.229 ms	-47.6% (1.9x)
setUnion (~1700 int elements)	0.492 ms	0.267 ms	-45.7% (1.8x)
comparison	0.045 ms	0.031 ms	-31.1%
escapeStringJson	0.037 ms	0.030 ms	-18.9%
setInter (~1700 int elements)	0.312 ms	0.310 ms	-0.6%
parseInt	0.031 ms	0.032 ms	+3.2%
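
For reference, the Delta column above appears to combine a relative change with a speedup factor. A small sketch of that arithmetic, with hypothetical helper names:

```scala
object BenchDeltaSketch {
  /** Relative change of the PR time against master, in percent (negative means faster). */
  def deltaPercent(masterMs: Double, prMs: Double): Double =
    (prMs - masterMs) / masterMs * 100.0

  /** Speedup factor: how many times faster the PR run is than the master run. */
  def speedup(masterMs: Double, prMs: Double): Double =
    masterMs / prMs
}
```

Applied to the `member` row (0.561 ms on master, 0.074 ms with the PR), these give roughly -86.8% and 7.6x, matching the table.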

Noise verification (3 runs, Run1 was outlier for all):

Benchmark	M:R1	PR:R1	M:R2	PR:R2	M:R3	PR:R3	3-run avg	Verdict
stripChars	0.060	0.125	0.059	0.061	0.060	0.069	+6.6%	Noise (Run1 outlier)
array_copy_views	7.97	10.40	7.63	7.60	8.05	7.91	+0.5%	Noise
lazy_array_comprehension	19.6	29.4	19.0	20.9	20.1	20.2	+5.1%	Marginal
lazy_array_reverse_sparse	3.22	4.42	3.14	3.40	3.26	3.37	+5.1%	Marginal
realistic2	43.3	51.6	41.9	46.1	43.8	52.5	+12.7%	Marginal
bench.02	28.5	32.4	27.6	30.7	26.8	26.9	+5.6%	Marginal

Run1 was a cold-start outlier for all noise benchmarks (PR values 30-108% higher). Run2 and Run3 converge to +0-7% range. No confirmed regression beyond JMH single-iteration variance.

Scala Native A/B (Scala Native 0.5.12, macOS arm64, hyperfine --warmup 3):

Benchmark	Baseline	PR	Delta
setInter	14.0 ms	12.6 ms	1.11x faster
setUnion	12.8 ms	11.9 ms	1.07x faster
setDiff	12.9 ms	11.7 ms	1.11x faster

References

Tracked in #666 (performance optimization)
SWAR technique from PR #897 (perf: optimize parseInt with 4-digits-at-a-time parsing) and jsoniter-scala
Scala Native parseInt upstream: scala-native/scala-native#4943 (perf: optimize parseInt/parseLong with radix-10 fast path)

Motivation:
Parsing large numeric array literals like `[76,111,114,...]` goes through fastparse combinators per element: each number traverses the full expression grammar (expr → exprSuffix → atomExpr → number) before resolving. On Scala Native without JIT, this combinator overhead is significant.

Modification:
Add `tryBulkNumericArray`, which detects arrays of pure numeric literals and parses them with a hand-written scanner, skipping the fastparse expression chain entirely for each element. Falls back to regular parsing for arrays containing non-numeric elements, comprehensions, or comments.

Result:
Native A/B (member benchmark with ~2000-element numeric array): baseline 8.6ms → optimized 6.2ms (-28%). Ratio vs jrsonnet: 2.12x → 1.53x.

Motivation:
The bulk numeric array parser (previous commit) used Double.parseDouble(data.substring()) for each element, creating a substring allocation and invoking the full double parser even for simple integers.

Modification:
- Parse simple integers directly using 4-digits-at-a-time technique (inspired by jsoniter-scala/PR databricks#897), avoiding substring allocation and Double.parseDouble overhead entirely
- Use Val.cachedNum for values 0-255, reusing pre-allocated instances instead of creating new Val.Num objects
- Float/exponent numbers still fall back to Double.parseDouble

Result:
Native A/B (member): 7.1ms → 5.9ms (-17.4%). Combined with bulk parser: member gap vs jrsonnet 1.97x → 1.42x. Also improves base64_byte_array: 11.3ms → 10.4ms (-7.9%).

…cision loss

Motivation:
The bulk numeric array parser used Double for integer accumulation, which loses precision for integers beyond 2^53 (e.g., 12345678901234567890 would differ from Double.parseDouble by up to 2048).

Modification:
- Changed accumulator from Double to Long with overflow detection.
- When acc > Long.MaxValue/10000 (or /10), set isSimpleInt=false and fall back to Double.parseDouble for that number.
- Convert Long to Double before negation to preserve -0.0 sign bit.
- Added regression test for large integer precision.

Result:
Bulk parser now produces identical results to Double.parseDouble for all integer magnitudes, including beyond 2^53.
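
The Long accumulator, overflow fallback, and sign handling from this commit can be illustrated with a one-digit-at-a-time sketch. The function name `parseInteger` and its `String` input are assumptions for the example; only `isSimpleInt` and `acc` echo names from the commit message. The guard uses `>=` as a deliberately conservative bound chosen for this sketch, so the multiply-and-add can never wrap.

```scala
object OverflowSafeIntSketch {
  /** Parses an optionally negative decimal integer literal to a Double.
    * Assumes `s` has already been validated as "-"? followed by one or more digits.
    */
  def parseInteger(s: String): Double = {
    val negative = s.startsWith("-")
    var i = if (negative) 1 else 0
    var acc = 0L
    var isSimpleInt = true
    while (isSimpleInt && i < s.length) {
      if (acc >= Long.MaxValue / 10) {
        // acc * 10 + digit might exceed Long.MaxValue: stop accumulating.
        isSimpleInt = false
      } else {
        acc = acc * 10 + (s.charAt(i) - '0')
        i += 1
      }
    }
    if (!isSimpleInt) {
      // Too large for the fast path: let the full double parser handle it.
      java.lang.Double.parseDouble(s)
    } else {
      // Convert first, negate second: -(0L.toDouble) is -0.0,
      // whereas (-0L).toDouble would lose the sign and give 0.0.
      val d = acc.toDouble
      if (negative) -d else d
    }
  }
}
```

Because a Long holds the integer exactly and the Long-to-Double conversion rounds to nearest, the fast path should agree with `Double.parseDouble` for values between 2^53 and the overflow threshold as well.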

He-Pin · 2026-06-09T05:16:48Z

Even though the numbers look OK, this kind of optimization is still not very generic.

He-Pin marked this pull request as draft June 6, 2026 21:53

He-Pin marked this pull request as ready for review June 6, 2026 23:24

He-Pin marked this pull request as draft June 9, 2026 02:56

He-Pin added 2 commits June 9, 2026 10:58

He-Pin force-pushed the perf/bulk-numeric-array-parser branch from b9bb525 to b2df550 Compare June 9, 2026 02:59

He-Pin marked this pull request as ready for review June 9, 2026 04:13

He-Pin marked this pull request as draft June 9, 2026 04:18

He-Pin closed this Jun 9, 2026

He-Pin wants to merge 3 commits into databricks:master from He-Pin:perf/bulk-numeric-array-parser
