
Optimize XML Codec #1231

Open
adwsingh wants to merge 1 commit into main from adwsingh/xml-perf-final

@adwsingh

Contributor

What behavior changes?

  • New native (StAX-free) XML serializer/deserializer, enabled via Builder.useNative(true) or -Dsmithy-java.xml-provider=smithy. StAX remains the default.
  • New codec-commons module with shared NumberCodec, TimestampCodec, and StripedPool. The JSON, XML, and Query codecs now use these instead of inlined copies.
  • QueryFormSerializer rewritten to use an inline byte buffer and StripedPool, replacing FormUrlEncodedSink.
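
For readers unfamiliar with the pattern, here is a minimal sketch of what a striped buffer pool can look like. It is not the StripedPool from this PR: the class name `StripedBufferPoolSketch`, its methods, and the thread-hash striping are all assumptions made for illustration. The idea is that each thread maps to one of a small number of slots, so reusing a buffer rarely contends with other threads.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Hypothetical illustration only; not the StripedPool added in this PR. */
final class StripedBufferPoolSketch {
    private final AtomicReferenceArray<byte[]> slots;
    private final int mask;
    private final int bufferSize;

    StripedBufferPoolSketch(int stripes, int bufferSize) {
        // Round the stripe count up to a power of two so an index is a cheap mask.
        int n = 1;
        while (n < stripes) {
            n <<= 1;
        }
        this.slots = new AtomicReferenceArray<>(n);
        this.mask = n - 1;
        this.bufferSize = bufferSize;
    }

    /** Takes the buffer cached in this thread's stripe, or allocates a fresh one. */
    byte[] acquire() {
        byte[] cached = slots.getAndSet(stripeIndex(), null);
        return cached != null ? cached : new byte[bufferSize];
    }

    /** Returns a buffer to this thread's stripe if the slot is currently empty. */
    void release(byte[] buffer) {
        if (buffer.length != bufferSize) {
            return; // Buffers of a different size are left to the garbage collector.
        }
        slots.compareAndSet(stripeIndex(), null, buffer);
    }

    private int stripeIndex() {
        // Spread the thread's hash so nearby hash values land in different stripes.
        int h = Thread.currentThread().hashCode();
        h ^= h >>> 16;
        return h & mask;
    }
}
```

A serializer would call `acquire()` before writing and `release()` once the bytes have been copied out. When a stripe is empty the pool simply allocates, so correctness never depends on a buffer being available.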

Why is this change needed?

Performance. StAX imposes per-element overhead from its stream state machines and String allocations. The native codec parses and writes bytes inline, using pre-computed element name bytes and a speculative in-order member lookup.
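
To illustrate the speculative in-order lookup, here is a rough sketch under stated assumptions. The class `InOrderMemberLookupSketch` and its linear-scan fallback are hypothetical and may differ from the native codec. It first guesses that the next element is the next member in schema order and compares raw bytes against the pre-computed name; only a wrong guess falls back to scanning all members.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; the PR's lookup may be structured differently. */
final class InOrderMemberLookupSketch {
    private final byte[][] names; // UTF-8 element names, pre-computed in schema order
    private int expected;         // index of the member guessed to come next

    InOrderMemberLookupSketch(String... memberNames) {
        names = new byte[memberNames.length][];
        for (int i = 0; i < memberNames.length; i++) {
            names[i] = memberNames[i].getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Returns the member index for the element name in buf[start, end), or -1 if unknown. */
    int lookup(byte[] buf, int start, int end) {
        // Fast path: the document lists members in schema order, so the guess hits.
        if (expected < names.length && matches(names[expected], buf, start, end)) {
            return expected++;
        }
        // Slow path: out-of-order or unknown element, so scan every member.
        for (int i = 0; i < names.length; i++) {
            if (matches(names[i], buf, start, end)) {
                expected = i + 1;
                return i;
            }
        }
        return -1;
    }

    private static boolean matches(byte[] name, byte[] buf, int start, int end) {
        // Range-based comparison avoids allocating a String for the element name.
        return Arrays.equals(name, 0, name.length, buf, start, end);
    }
}
```

With in-order input every lookup is a single byte comparison. Out-of-order input still resolves correctly but pays for the scan, which would be consistent with the OutOfOrder benchmarks below being somewhat slower than their in-order counterparts.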

How was this validated?

  • DifferentialXmlFuzzTest: the native codec and the StAX reference produce identical output on fuzz inputs
  • GeneratedModelSerdeTest: all Smithy types through code-generated shapes, parameterized over both providers
  • NumberCodecTest, TimestampCodecTest, and their fuzz counterparts
  • JMH benchmarks on m7i.xlarge (JDK 25, G1GC, 1G heap)
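
As a rough picture of the differential approach, the sketch below runs a reference and a candidate implementation over the same seeded random inputs and counts disagreements. Everything in it is assumed for illustration: the class name, the toy XML text escapers standing in for the two codecs, and the input alphabet. It is not the code in DifferentialXmlFuzzTest.

```java
import java.util.Random;
import java.util.function.Function;

/** Hypothetical illustration only; toy escapers stand in for the StAX and native codecs. */
final class DifferentialFuzzSketch {
    /** Reference implementation: simple and obviously correct. */
    static String escapeReference(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Candidate implementation: single pass, the kind of code being optimized. */
    static String escapeCandidate(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Feeds identical seeded inputs to both functions and counts differing outputs. */
    static int countMismatches(Function<String, String> reference,
                               Function<String, String> candidate,
                               long seed, int iterations) {
        Random random = new Random(seed);
        String alphabet = "ab<>&\"' \n"; // weighted toward characters that need escaping
        int mismatches = 0;
        for (int i = 0; i < iterations; i++) {
            int length = random.nextInt(32);
            StringBuilder input = new StringBuilder(length);
            for (int j = 0; j < length; j++) {
                input.append(alphabet.charAt(random.nextInt(alphabet.length())));
            }
            String text = input.toString();
            if (!reference.apply(text).equals(candidate.apply(text))) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

Fixing the seed keeps any failure reproducible, and a real harness would typically report the first mismatching input rather than just a count.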

Benchmark Results

Environment: m7i.xlarge, OpenJDK 25, G1GC, 1G heap, nanosecond precision

Summary

Protocol / Direction          Median Improvement
─────────────────────────────────────────────────────────────────────
XML Deserialization           ████████████████████████████████████  -84%
XML Serialization             ██████████████████████████            -63%
Query Serialization           ██████████████████████                -54%
Query Deserialization         ████████████████████████████████████  -85%
JSON (codec-commons)          ████                                  -10%
CBOR (codec-commons)          ████                                  -10%

restXml

Benchmark                        Baseline    After     Change
────────────────────────────────────────────────────────────────────────────────
CopyObjectOutput_M                 8,169     1,258     ████████████████████ -85%
CopyObjectOutput_OutOfOrder       10,179     1,726     ████████████████████ -83%
PutObject_S                          580       207     ███████████████      -64%
PutObject_M                          603       218     ███████████████      -64%
PutObject_L                          598       225     ███████████████      -62%
GetObject_S                          508       459     ██                   -10%
GetObject_M                          486       456     █                     -6%
GetObject_L                          492       490     ░                     -1%
CopyObjectOutput_Baseline            150       140     █                     -7%
CopyObjectRequest_Baseline           252       237     █                     -6%
CopyObjectRequest_M                1,252     1,187     █                     -5%

awsQuery — Deserialization (XML response parsing)

Benchmark                        Baseline    After     Change
────────────────────────────────────────────────────────────────────────────────
GetMetricDataResponse_S           25,133     2,596     █████████████████████ -90%
GetMetricDataResponse_M           49,116     7,295     ████████████████████  -85%
GetMetricDataResponse_OutOfOrder  50,996     7,921     ████████████████████  -84%
GetMetricDataResponse_L          424,828    78,873     ███████████████████   -81%

awsQuery — Serialization (form-urlencoded request)

Benchmark                        Baseline    After     Change
────────────────────────────────────────────────────────────────────────────────
PutMetricDataRequest_M             5,074     1,911     ███████████████      -62%
GetMetricDataRequest_S             1,098       468     █████████████        -57%
PutMetricDataRequest_S               609       287     ████████████         -53%
PutMetricDataRequest_L            53,696    26,308     ████████████         -51%
GetMetricDataRequest_M             5,364     3,430     ████████             -36%
GetMetricDataRequest_L             9,337     7,506     ████                 -20%
PutMetricDataRequest_Baseline        109        99     █                     -9%

awsJson1.0

Benchmark                        Baseline    After     Change
────────────────────────────────────────────────────────────────────────────────
PutItemRequest_ShallowMap_L       12,186     8,067     ████████             -34%
PutItemRequest_MixedItem_S           656       569     ███                  -13%
PutItemRequest_ShallowMap_S          653       568     ███                  -13%
PutItemRequest_BinaryData_S          294       256     ███                  -13%
GetItemOutput_S                      831       736     ██                   -11%
GetItemOutputBinary_M              3,634     3,233     ██                   -11%
PutItemRequest_MixedItem_M         2,950     2,645     ██                   -10%
PutItemRequest_Nested_L            1,203     1,082     ██                   -10%
GetItemOutput_M                    3,198     2,887     ██                   -10%
GetItemOutputBinary_S                905       821     ██                    -9%
GetItemOutputBinary_L             25,236    22,983     ██                    -9%
GetItemOutput_L                   21,601    19,877     █                     -8%
PutItemRequest_Baseline              225       206     █                     -9%
PutItemRequest_Nested_M              446       416     █                     -7%
PutItemRequest_MixedItem_L         8,108     8,810     ░                    +9%

restJson1

Benchmark                        Baseline    After     Change
────────────────────────────────────────────────────────────────────────────────
GetObject_M                          512       450     ██                   -12%
CopyObjectOutput_M                 1,269     1,145     ██                   -10%
GetObject_S                          505       456     ██                   -10%
CopyObjectOutput_OutOfOrder        1,827     1,661     ██                    -9%
CopyObjectRequest_M                1,328     1,219     █                     -8%
PutObject_L                          958       888     █                     -7%
PutObject_S                          945       880     █                     -7%
PutObject_M                          943       880     █                     -7%
CopyObjectRequest_Baseline           252       238     █                     -6%
CopyObjectOutput_Baseline            163       155     █                     -5%
GetObject_L                          507       484     █                     -4%

rpcv2Cbor

Benchmark                        Baseline    After     Change
────────────────────────────────────────────────────────────────────────────────
PutItemRequest_Nested_L            1,160       947     ████                 -18%
PutItemRequest_Baseline              236       197     ████                 -17%
GetItemOutput_M                    2,165     1,843     ███                  -15%
PutItemRequest_MixedItem_S           608       525     ███                  -14%
GetItemOutput_L                   13,377    11,832     ██                   -12%
PutItemRequest_ShallowMap_L        8,039     7,195     ██                   -11%
GetItemOutputBinary_S                497       451     ██                    -9%
PutItemRequest_ShallowMap_S          571       521     ██                    -9%
PutItemRequest_ShallowMap_M        2,599     2,397     █                     -8%
PutItemRequest_Nested_M              434       408     █                     -6%
PutItemRequest_BinaryData_M        1,164     1,100     █                     -6%
GetItemOutput_OutOfOrder           2,290     2,172     █                     -5%
GetItemOutputBinary_M              1,931     1,834     █                     -5%
GetItemOutputBinary_L             12,028    11,553     █                     -4%
PutItemRequest_MixedItem_L         6,989     6,715     █                     -4%
PutItemRequest_BinaryData_L       19,267    18,612     ░                     -3%
GetItemOutput_S                      476       467     ░                     -2%
PutItemRequest_BinaryData_S          239       241     ░                    +1%
PutItemRequest_MixedItem_M         2,497     2,619     ░                    +5%
GetItemOutput_Baseline                52        57     ░                   +11%

What should reviewers focus on?


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@adwsingh force-pushed the adwsingh/xml-perf-final branch from 28c7b19 to ca55468 on June 10, 2026 06:30
@adwsingh force-pushed the adwsingh/xml-perf-final branch from ca55468 to 5900a1d on June 10, 2026 06:47