Add label_filter, keyset pagination, and timestamps to df.list_instances #278

Open
crprashant wants to merge 1 commit into
microsoft:main from
crprashant:crprashant/list-instances-filter-pagination

@crprashant crprashant commented Jun 29, 2026

What

Extend df.list_instances() with label filtering, keyset pagination, and timestamp columns — the first-class label→instance→status path agreed on #167.

This is delivered as a new overload of df.list_instances, not a change to the existing function:

  • Existing 2-arg function — unchanged. df.list_instances(status_filter text, limit_count integer) still returns its 6 columns (instance_id, label, function_name, status, execution_count, output). All existing positional calls (df.list_instances(), df.list_instances('Running'), df.list_instances('Running', 50)) keep resolving to it.
  • New 4-arg overload df.list_instances(status_filter, limit_count, label_filter, after_cursor) returns the same 6 columns plus three trailing ones: created_at, completed_at, next_cursor.

-- Page 1: newest 50 instances labeled 'nightly-etl'
SELECT instance_id, status, created_at, next_cursor
FROM df.list_instances(NULL, 50, 'nightly-etl');

-- Page 2: pass the prior page's next_cursor as after_cursor
SELECT instance_id, status, created_at, next_cursor
FROM df.list_instances(NULL, 50, 'nightly-etl', '<next_cursor from page 1>');

label_filter matches df.instances.label exactly (NULL = no label filter). Ordering is created_at DESC, id ASC, served by the (created_at DESC, id) indexes added in PR 2. Label-filtered listing is served by a dedicated partial index (label, created_at DESC, id) WHERE label IS NOT NULL, so a label-scoped page stays O(rows-for-label) instead of scanning unrelated instances. next_cursor is an opaque token encoding the last row's (created_at, id); passing it as after_cursor returns the next page via an index-served keyset range predicate (no OFFSET scan). A malformed after_cursor raises a clear error rather than silently returning wrong rows.
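To make the cursor contract above concrete, here is a minimal Rust sketch of an opaque token that round-trips the last row's (created_at, id) and rejects malformed input with an error. Everything in it is an assumption for illustration: the `Cursor` struct, the `encode_cursor` / `decode_cursor` names, the hex encoding, and modeling created_at as microseconds are hypothetical and are not taken from src/monitoring.rs, whose actual token format is not described here.

```rust
/// Hypothetical cursor payload: the last row's (created_at, id).
/// created_at is modeled as microseconds since the epoch (an assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct Cursor {
    pub created_at_micros: i64,
    pub id: String,
}

/// Encode "<micros>|<id>" as lowercase hex so clients treat the token as opaque.
pub fn encode_cursor(c: &Cursor) -> String {
    let raw = format!("{}|{}", c.created_at_micros, c.id);
    raw.bytes().map(|b| format!("{:02x}", b)).collect()
}

/// Decode a token, returning a descriptive error for anything malformed
/// instead of guessing a position and returning wrong rows.
pub fn decode_cursor(token: &str) -> Result<Cursor, String> {
    let err = || format!("invalid after_cursor: {:?}", token);

    // Must be a non-empty, even-length string of hex digits.
    let well_formed = !token.is_empty()
        && token.len() % 2 == 0
        && token.bytes().all(|b| b.is_ascii_hexdigit());
    if !well_formed {
        return Err(err());
    }

    // Hex digits are ASCII, so slicing two bytes at a time is safe.
    let mut bytes = Vec::with_capacity(token.len() / 2);
    for i in (0..token.len()).step_by(2) {
        let byte = u8::from_str_radix(&token[i..i + 2], 16).map_err(|_| err())?;
        bytes.push(byte);
    }

    let raw = String::from_utf8(bytes).map_err(|_| err())?;
    let (ts, id) = raw.split_once('|').ok_or_else(|| err())?;
    let created_at_micros = ts.parse::<i64>().map_err(|_| err())?;
    if id.is_empty() {
        return Err(err());
    }
    Ok(Cursor { created_at_micros, id: id.to_string() })
}
```

Under these assumptions, `decode_cursor(&encode_cursor(&c))` returns the original cursor, while a hand-written string such as `"not-a-cursor"` yields `Err(..)` rather than a silently wrong page.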

Files: src/monitoring.rs (cursor helpers + new overload), sql/pg_durable--0.2.3--0.2.4.sql, src/lib.rs (adds the label index + refines index comments), docs, e2e.

Why

This is PR 4 of the agreed, incremental monitoring-functions plan on #167. It closes #87 (label_filter on list_instances) and addresses #146 (efficient paginated listing for external clients). Keyset pagination keeps page cost constant regardless of how deep into the result set a client reads, and the new overload gives an ergonomic label → instance → status path without forking the monitoring API surface into a separate status_by_label function.

Design note — overload by arity (preserves B1)

Rather than changing the existing function in place (which would alter its tuple shape from 6 to 9 columns), the new capability is a separate Rust function exported under the same SQL name. This keeps the released .so's list_instances_wrapper symbol returning 6 columns, so a customer who loads the new .so but has not yet run ALTER EXTENSION UPDATE (Scenario B1) still gets a correct result from their old 6-column SQL declaration.

The two overloads never collide: the old function's two params both default (matches arity 0–2); the new function defaults only after_cursor, giving it a minimum arity of 3 (matches arity 3–4). The arities are disjoint, so PostgreSQL never reports "function is not unique". pgrx derives the C symbols from the Rust function names, so the two overloads bind to distinct symbols (list_instances_wrapper / list_instances_paged_wrapper).
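The arity argument can be checked mechanically. The sketch below is a simplified model, not PostgreSQL's resolver: it considers only positional calls, where a function with `params` parameters and `defaults` trailing defaults accepts between `params - defaults` and `params` arguments. The helper names `arity_range` and `ranges_overlap` are hypothetical.

```rust
/// Inclusive (min, max) number of positional arguments that can match a
/// function with `params` parameters, the last `defaults` of which have defaults.
pub fn arity_range(params: u32, defaults: u32) -> (u32, u32) {
    assert!(defaults <= params, "cannot default more parameters than exist");
    (params - defaults, params)
}

/// True when some argument count could match both functions, which is the
/// situation that risks a "function is not unique" error.
pub fn ranges_overlap(a: (u32, u32), b: (u32, u32)) -> bool {
    a.0 <= b.1 && b.0 <= a.1
}
```

With the values from this PR, `arity_range(2, 2)` is `(0, 2)` for the frozen function and `arity_range(4, 1)` is `(3, 4)` for the new overload, and the two do not overlap. If label_filter were also given a default, the new overload would become `arity_range(4, 2)`, i.e. `(2, 4)`, and a two-argument call could match either function, which is why only after_cursor defaults.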

created_at and completed_at are existing df.instances columns (already populated by the worker); this PR only surfaces them. completed_at is set on transition to completed and is NULL for failed/cancelled.

Upgrade & compatibility

  • Scenario A (fresh vs upgrade parity): a fresh 0.2.4 install emits both overloads; the upgrade path keeps the unchanged 6-column function from the 0.2.3 base and the upgrade script adds only the new overload. The CREATE FUNCTION in the upgrade script is byte-identical to the pgrx-generated fresh-install DDL, so the Scenario A schema snapshot matches.
  • Scenario B1 (binary backward compat): the old function is frozen, so its symbol still returns 6 columns against pre-0.2.4 schemas. The new overload reads only columns present in every prior df.instances schema (id, label, status, created_at, completed_at), so it also runs correctly against an un-upgraded schema — just without the new index serving the keyset path.
  • One index added. completed_at already exists in the base schema, so the upgrade script adds no ALTER TABLE. This PR adds a single partial index — idx_instances_label(label, created_at DESC, id) WHERE label IS NOT NULL — to both the fresh-install DDL (src/lib.rs) and the upgrade script, byte-identical so the Scenario A snapshot matches. The other two df.instances index definitions are unchanged.

0.2.4 is still unreleased, so there is no version bump.

Testing

  • scripts/test-upgrade.sh: 36/36 (Scenario A byte-identical incl. the new idx_instances_label, B1 against 0.2.2 and 0.2.3, B2 data survival). The zero-arg df.list_instances() B1 check confirms the frozen 6-column path.
  • scripts/test-e2e-local.sh: 38/38. tests/e2e/sql/05_monitoring_and_explain.sql adds coverage for: label filtering, the new timestamp columns, multi-page keyset pagination via next_cursor/after_cursor, malformed-cursor errors, and completed_at being NULL for a failed instance.
  • cargo fmt / cargo clippy --features pg17 clean.

This branch is rebased on the latest main (includes #276).

Scope

label_filter + keyset pagination + timestamps only. The truncation-policy GUC is PR 5 per the plan. Deferred as a small follow-up (out of this PR's minimal scope): aligning df.instance_info()'s timestamp surface with the new columns.

A performance review of this PR motivated two access-path hardenings now folded in: (1) the partial idx_instances_label above, so label-filtered pages don't scan unrelated instances; and (2) a redundant created_at <= $ts leading conjunct on the keyset predicate, giving the btree a sargable upper bound so deep pages stay seek-based rather than degrading to offset-like scans. The cursor result set is unchanged — both disjuncts of the existing predicate already imply that bound.
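The claim that the extra bound leaves the result set unchanged can be illustrated with an in-memory model of the keyset predicate. This is a sketch under assumptions: the `Row` struct, the integer timestamps, the function names, and the choice to return the last row's key as the next cursor on every non-empty page are hypothetical, and the real predicate runs as SQL against the btree rather than over a Vec.

```rust
/// Hypothetical in-memory row; listing order is created_at DESC, id ASC.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub created_at: i64,
    pub id: &'static str,
}

/// Keyset predicate: the row sorts strictly after the cursor (ts, id).
pub fn after_cursor(row: &Row, ts: i64, id: &str) -> bool {
    row.created_at < ts || (row.created_at == ts && row.id > id)
}

/// Same predicate with the redundant leading bound `created_at <= ts`.
/// Both disjuncts above already imply it, so the result is identical.
pub fn after_cursor_bounded(row: &Row, ts: i64, id: &str) -> bool {
    row.created_at <= ts && after_cursor(row, ts, id)
}

/// Return one page and the key of its last row (None if the page is empty).
pub fn page(
    rows: &[Row],
    cursor: Option<(i64, &str)>,
    limit: usize,
) -> (Vec<Row>, Option<(i64, &'static str)>) {
    let mut sorted: Vec<Row> = rows.to_vec();
    // created_at DESC, then id ASC as the tie-breaker.
    sorted.sort_by(|a, b| b.created_at.cmp(&a.created_at).then(a.id.cmp(b.id)));
    let out: Vec<Row> = sorted
        .into_iter()
        .filter(|r| match cursor {
            Some((ts, id)) => after_cursor_bounded(r, ts, id),
            None => true,
        })
        .take(limit)
        .collect();
    let next = out.last().map(|r| (r.created_at, r.id));
    (out, next)
}
```

Feeding each page's returned key back in as the cursor walks the rows in order without skipping or repeating any, including rows that share a created_at value, and `after_cursor` and `after_cursor_bounded` agree for every row and cursor pair.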

cc @pinodeca

Add a new arity-disjoint overload of df.list_instances exposing label_filter, opaque keyset cursors (after_cursor/next_cursor), and created_at/completed_at, while leaving the existing 6-column function frozen to preserve binary backward compatibility (Scenario B1). Closes microsoft#87, addresses microsoft#146.

Add a partial idx_instances_label(label, created_at DESC, id) WHERE label IS NOT NULL so label-filtered pages stay seek-based instead of scanning unrelated instances, and a redundant created_at <= $ts leading conjunct on the keyset predicate for a sargable btree bound (result set unchanged). Both index definitions stay byte-identical between the fresh-install DDL and the upgrade script for Scenario A.
@crprashant crprashant force-pushed the crprashant/list-instances-filter-pagination branch from 4392cb8 to 5bb3732 Compare June 30, 2026 00:40