postgres_connection_string() produces malformed URL when PGHOST is a Unix socket path (CNPG / Kubernetes environments)
Summary
When running pg_durable inside a CloudNativePG (CNPG) managed PostgreSQL cluster, the background worker fails to start and all workflow instances remain in pending state. The likely root cause is that CNPG sets PGHOST to a Unix socket directory path (e.g. /controller/run) rather than a TCP hostname. The postgres_connection_string() function in src/types.rs appears to insert PGHOST verbatim into a URL-format connection string, which yields an invalid URL and triggers an "empty host" error.
Environment
- pg_durable (latest main)
- PostgreSQL 17 (Debian Bookworm)
- CloudNativePG (CNPG) v0.28.3 on Kubernetes (Minikube v1.35.1)
- CNPG sets PGHOST=/controller/run (Unix socket directory)
Steps to Reproduce
- Deploy a CNPG cluster with pg_durable loaded via shared_preload_libraries.
- Create the extension:
CREATE EXTENSION IF NOT EXISTS pg_durable;
- Start a simple workflow:
SELECT df.start(
'SELECT ''step 1 done''' ~> 'SELECT ''step 2 done'''
);
- Check the instance status: it stays pending.
- Inspect the connection string used by the extension:
SELECT * FROM df.debug_connection();
Actual Behavior
debug_connection() returns:
postgres://postgres@/controller/run:5432/postgres
The PostgreSQL pod logs report:
The background worker cannot connect, so no workflow ever advances beyond pending.
Expected Behavior
The connection string should be valid regardless of whether PGHOST is a TCP hostname/IP or a Unix socket directory path. Possible correct forms for a Unix socket path:
- TCP fallback: postgres://postgres@127.0.0.1:5432/postgres
- libpq keyword/value: host=/controller/run port=5432 user=postgres dbname=postgres
- Percent-encoded URL: postgres://postgres@%2Fcontroller%2Frun:5432/postgres
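As an illustration of the percent-encoded form listed above, here is a minimal std-only sketch. The helper name percent_encode_host is hypothetical and not part of pg_durable; it encodes every byte outside the RFC 3986 unreserved set, so it also covers socket directories that contain characters other than /.

```rust
/// Hypothetical helper (not part of pg_durable): percent-encodes a value so
/// it can be placed in the host component of a postgres:// URL.
/// Bytes in the RFC 3986 "unreserved" set are kept; all others become %XX.
fn percent_encode_host(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for byte in raw.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Everything else, including '/', is written as an uppercase hex escape.
            other => out.push_str(&format!("%{other:02X}")),
        }
    }
    out
}
```

Whether the client library used by the background worker decodes a percent-encoded host into a socket directory is an assumption that would need to be checked against that library.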
Root Cause
It looks like postgres_connection_string() in src/types.rs unconditionally interpolates PGHOST into a postgres:// URL:
pub fn postgres_connection_string() -> String {
    let host = std::env::var("PGHOST").unwrap_or_else(|_| "127.0.0.1".to_string());
    let port = unsafe { pgrx::pg_sys::PostPortNumber };
    let user = get_worker_role();
    let database = get_database();
    format!("postgres://{user}@{host}:{port}/{database}")
}
When PGHOST is /controller/run, this produces:
postgres://postgres@/controller/run:5432/postgres
This is not a valid connection URL: the text after @ starts with /, so a URL parser treats /controller/run:5432/postgres as the URL path and leaves the host component empty. The connection string is then rejected with an "empty host" error.
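To make the empty host concrete, the following std-only sketch splits a URL-style connection string the way a generic URL parser delimits the authority section. The function names authority and host_of are hypothetical, and the splitting is deliberately simplified (for example, it does not handle bracketed IPv6 hosts), so it illustrates the failure mode and does not reproduce the parser the extension actually uses.

```rust
/// Returns the authority part (`user@host:port`) of a URL-style connection
/// string: everything between "://" and the next '/', '?' or '#'.
fn authority(url: &str) -> Option<&str> {
    let rest = url.split_once("://")?.1;
    let end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    Some(&rest[..end])
}

/// Returns the host part of an authority: what follows the last '@',
/// with any trailing ":port" removed. Simplified: no IPv6 bracket handling.
fn host_of(authority: &str) -> &str {
    let host_port = authority.rsplit_once('@').map_or(authority, |(_, hp)| hp);
    host_port.rsplit_once(':').map_or(host_port, |(host, _)| host)
}
```

With the string from this report, authority returns "postgres@" and host_of returns an empty string, because the first / of the socket path already terminates the authority. With the percent-encoded variant the host comes out as %2Fcontroller%2Frun.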
The same issue affects get_host():
pub fn get_host() -> String {
    std::env::var("PGHOST").unwrap_or_else(|_| "127.0.0.1".to_string())
}
Suggested Fix
A guard is needed to detect whether PGHOST starts with / (indicating a Unix socket directory) and handle it differently from a regular TCP hostname. The following is one possible approach to illustrate the idea; I am not requesting this exact implementation and trust the maintainers to choose the most appropriate solution:
pub fn postgres_connection_string() -> String {
    let host = std::env::var("PGHOST").unwrap_or_else(|_| "127.0.0.1".to_string());
    let port = unsafe { pgrx::pg_sys::PostPortNumber };
    let user = get_worker_role();
    let database = get_database();
    if host.starts_with('/') {
        // Unix socket: encode the path or use keyword/value format
        // Option A – percent-encode the socket directory in the URL host
        let encoded = host.replace('/', "%2F");
        format!("postgres://{user}@{encoded}:{port}/{database}")
        // Option B – fall back to loopback TCP (simpler, always available)
        // format!("postgres://{user}@127.0.0.1:{port}/{database}")
    } else {
        format!("postgres://{user}@{host}:{port}/{database}")
    }
}
The key point is that when PGHOST is a socket path, the current string interpolation produces an invalid URL. Any approach that avoids inserting a raw socket path into the host component of a postgres:// URL would resolve the issue. Option A preserves the socket path via percent-encoding; Option B falls back to TCP loopback, which should be reachable within the same pod provided listen_addresses and pg_hba.conf permit local TCP connections.
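A third route, the libpq keyword/value form mentioned under Expected Behavior, avoids URL syntax altogether. The sketch below is only an illustration: keyword_value_conninfo and quote_conninfo_value are hypothetical names, the u16 port type is an assumption, and it is unverified whether the client library used by the background worker accepts keyword/value strings. The quoting follows libpq's documented rule that values may be wrapped in single quotes, with embedded single quotes and backslashes escaped by a backslash.

```rust
/// Hypothetical helper: quotes a value for a libpq keyword/value connection
/// string by wrapping it in single quotes and backslash-escaping ' and \.
fn quote_conninfo_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('\'');
    for ch in value.chars() {
        if ch == '\'' || ch == '\\' {
            out.push('\\');
        }
        out.push(ch);
    }
    out.push('\'');
    out
}

/// Hypothetical helper: builds a keyword/value connection string. A socket
/// directory such as /controller/run can be passed as `host` unchanged.
fn keyword_value_conninfo(host: &str, port: u16, user: &str, database: &str) -> String {
    format!(
        "host={} port={} user={} dbname={}",
        quote_conninfo_value(host),
        port,
        quote_conninfo_value(user),
        quote_conninfo_value(database)
    )
}
```

Quoting every value unconditionally keeps the helper short and also covers role or database names that contain spaces.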
Additional Context
CNPG (and possibly other Kubernetes-native PostgreSQL operators) seems to configure PGHOST as a socket directory path rather than a hostname by default. It also appears that modifying this environment variable from within the cluster may be restricted, making it difficult to work around the issue on the user side.