Extended-protocol fixes: Flush handling + prepared-statement client-cache consistency#953
Open
ikeboy003 wants to merge 4 commits into
Open
Conversation
added 4 commits
May 26, 2026 19:16
postgres.js (and likely other drivers) sends Flush between Parse/Describe and Bind/Execute when describeFirst is enabled (the default for parameterized non-prepared queries). pgcat was logging 'Unexpected code: H' and dropping the message, leaving the client hung waiting for the ParameterDescription/RowDescription response that never arrived. Add a Flush arm that drains the buffered extended-protocol messages into self.buffer (same as Sync), appends the Flush byte, sends to the server, forwards the server response, and keeps the server checked out so the subsequent Bind/Execute/Sync can complete on the same backend.
The previous commit added a Flush arm in the client loop but reused
send_and_receive_loop, which calls Server::recv() — that recv loop
only terminates on ReadyForQuery ('Z'). Servers do NOT send
ReadyForQuery in response to Flush, so the read hung forever.
Add Server::recv_flush_response() that breaks on RowDescription,
NoData, ErrorResponse, or BindComplete (the terminal describe /
bind response markers). Wire it into the Flush arm via a direct
send + recv_flush_response call path.
Verified locally: postgres.js with prepare:false + params now
completes in ~500ms through pgcat (previously hung).
- recv_flush_response: skip body parse for non-S/non-1 codes (5–N byte protocol frames; framing byte is enough to dispatch). - recv_flush_response: replace self.buffer.clone() with std::mem::take to avoid one memcpy+alloc per Flush response. - Merge separate saw_terminal flag into single matches!() check on the framing byte. Net effect on cloudrest describe-heavy workload: removes one full-buffer clone and one redundant get_u8/get_i32/read_string pass per Flush. At intra-DC RTT (~0.3ms) and 10k+ qps the saved CPU is the dominant win.
…re-prepare failure When a checked-out backend rejected re-Parse of a previously-prepared statement (e.g. after schema-change cleanup, DEALLOCATE, or transient backend state), the old code removed the entry from the *client* prepared_statements HashMap and returned Ok(). This left tokio-postgres clients in an unrecoverable state: the client still believed the statement was valid (it never received a Close), so its very next Bind for that name missed in buffer_bind and the whole TCP conn dropped with 'Prepared statement sN doesn't exist'. For long-lived ETL clients (pgraft's tokio-postgres prepares many shapes via execute() and caches them per-conn by SQL), this killed every write inflight on that conn and orphaned any open COPY-IN, leaving AccessExclusiveLock on temp staging tables until session termination. New behavior: keep the client cache intact (it mirrors the client's own view), mark the misbehaving server bad so the pool replaces it, and propagate PreparedStatementError so the caller can fail this one operation and retry on a fresh backend.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Four commits. Two distinct fixes that share a theme: pgcat losing extended-protocol state in ways the client cannot detect or recover from.
1. Flush ('H') handling — 3 commits
Drivers using
describeFirst(e.g. postgres.js withprepare: false+ params) sendFlushbetweenParse/DescribeandBind/Execute. pgcat loggedUnexpected code: Hand dropped it, hanging the client.Server::recv()only breaks on'Z', which the server never sends after Flush. Addsrecv_flush_response()that breaks on the actual terminal markers (RowDescription / NoData / ErrorResponse / BindComplete).2. Prepared-statement client-cache desync — 1 commit
ensure_prepared_statement_is_on_serveronPreparedStatementError:prepared_statementsmapOk(())The client never received
Close('S')so it still believed the statement was valid. The nextBindfor that name missed inbuffer_bindand pgcat dropped the entire TCP conn withClientError(\"Prepared statement sN doesn't exist\").For drivers that cache prepared statements per logical connection (tokio-postgres, asyncpg, pgx) and reuse them across many transactions, this is unrecoverable — the client has no way to know its view diverged from pgcat's.
Fix:
Closeis a protocol-level lie.PreparedStatementErrorso the caller fails this one op cleanly and retries on a fresh backend.Test plan
cargo buildcleancargo fmt --checkprepare: false+ params: completes in ~500ms (was hanging)