Skip to content

Fix stack overflow and precision loss in toFloatList() conversion#2451

Open
gregfelice wants to merge 1 commit into
apache:masterfrom
gregfelice:fix/tofloatlist-sprintf-overflow
Open

Fix stack overflow and precision loss in toFloatList() conversion#2451
gregfelice wants to merge 1 commit into
apache:masterfrom
gregfelice:fix/tofloatlist-sprintf-overflow

Conversation

@gregfelice

Copy link
Copy Markdown
Contributor

Summary

toFloatList()'s AGTV_FLOAT branch formatted each element with sprintf(buffer, "%f", ...) into a fixed 64-byte stack buffer and re-parsed the string back into a float. That had two defects:

  1. Stack overflow (query-reachable). "%f" prints the full integer part with no width limit, so a large magnitude overflows the 64-byte buffer. RETURN toFloatList([1.0e308]) needs ~317 bytes (309 integer digits + .000000) and smashes the stack. This is the issue reported in Critical: fix stack overflow from unbounded sprintf() #2410.
  2. Precision loss. "%f" emits only 6 fractional digits, so the format-and-reparse round trip was lossy — toFloatList([0.123456789]) returned 0.123457.

The element is already a float8, so the whole format/reparse step is unnecessary. This assigns elem->val.float_value directly, which removes the stack buffer entirely (no magic buffer size to justify) and fixes both the overflow and the precision loss in one move.

Also hardens toStringList(): its "%.*g"/"%ld" conversions use bounded formats and were never overflow-prone, but they're switched from sprintf to snprintf as defensive depth.

Relationship to #2410

This supersedes #2410, which fixed the overflow by enlarging the buffer and switching to snprintf. That approach worked but left the precision round-trip in place and introduced a magic buffer size. Direct assignment is the smaller, more correct end state. The original overflow report and fix are David Christensen's — credited via Co-authored-by — and #2410 will be closed as superseded.

Tests

Adds regression coverage in regress/sql/expr.sql for both the large-magnitude case (no overflow) and precision preservation. Full regression suite passes locally against PostgreSQL 18 (41/41).

  • toFloatList([1.0e308, -1.0e308])[1e+308, -1e+308]
  • toFloatList([0.123456789])[0.123456789]

toFloatList()'s AGTV_FLOAT branch formatted each element with
sprintf(buffer, "%f", ...) into a fixed 64-byte stack buffer and then
re-parsed the string back into a float. This had two defects:

  1. Stack overflow. "%f" prints the full integer part with no width
     limit, so a large magnitude overflows the 64-byte buffer. The value
     is query-reachable: RETURN toFloatList([1.0e308]) needs ~317 bytes
     (309 integer digits + ".000000") and smashes the stack. This is the
     issue reported in apache#2410.

  2. Precision loss. "%f" emits only 6 fractional digits, so the
     format-and-reparse round trip was lossy -- toFloatList([0.123456789])
     returned 0.123457.

The element is already a float8, so the whole format/reparse step is
unnecessary. Assign elem->val.float_value directly. This removes the
stack buffer entirely (no magic buffer size to justify) and fixes both
the overflow and the precision loss at once.

Also harden toStringList(): its "%.*g"/"%ld" conversions use bounded
formats and were never overflow-prone, but switch them from sprintf to
snprintf as defensive depth.

Add regression coverage to regress/sql/expr.sql for both the large
magnitude case (no overflow) and precision preservation.

This reimplements the fix originally proposed by David Christensen in
apache#2410, whose report identified the sprintf overflow.

Co-authored-by: David Christensen <david.christensen@snowflake.com>
@gregfelice gregfelice force-pushed the fix/tofloatlist-sprintf-overflow branch from 5a1929e to 19d421f Compare July 2, 2026 02:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant