v0.6.92: enrichment table column type, table run fixes, scheduled jitter, hosted-key queueing#4756
TheodoreSpeaks commented May 27, 2026
- feat(tables): Add enrichment table column type (#4752)
- fix(tables): workflow-column run fixes + bounded run-N-rows (#4754)
- improvement(schedules): jitter scheduled execution starts by 0-30s (#4750)
- feat(tools): queue hosted-key tool calls instead of failing with 429 (#4416)
- log(db): Add db failure cause log message (#4749)
…4416)
* Add queueing for hosted keys
* feat(rate-limiter): FIFO queue for hosted-key per-workspace fairness
Replace the per-call distributed lock with a Redis-backed FIFO queue so callers
within a workspace get strict ordering instead of racing the bucket. Adds
heartbeat-based crash recovery and dead-head reaping in a single Lua script.
Bumps Exa search hosted RPM from 5 to 60.
* fix(rate-limiter): bound hosted-key queue wait to execution budget; fix heartbeat + telemetry
Tie the per-workspace hosted-key queue wait to the surrounding execution budget
instead of a flat 5-minute cap. acquireKey now accepts the execution AbortSignal
(threaded from ExecutionContext): when present, the wait is bounded by the run's
actual plan timeout / cancellation, with the enterprise async ceiling as a
backstop; when absent it falls back to MAX_QUEUE_WAIT_MS. This lets long-running
async (Trigger.dev) runs use their full budget while no longer letting a single
queued call burn a short sync run's entire budget.
Also addresses Greptile review:
- P1: share one lastHeartbeatAt across all wait phases and cap every sleep to
HEARTBEAT_REFRESH_INTERVAL_MS so a long low-RPM retryAfterMs can no longer let
the head's heartbeat lapse mid-wait and break FIFO ordering.
- P2: derive hostedKeyQueueWaited telemetry reason from the actual bottleneck
(queue_position / dimension / actor_requests) instead of hardcoding it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(rate-limiter): make hosted-key queue waits abort-interruptible
Replace the plain capped sleeps in the queue-head and bucket-capacity wait loops
with an interruptibleSleep that resolves early when the execution AbortSignal
fires (timeout or cancellation), cleaning up its own timer and listener.
Previously a cancelled/timed-out run could overshoot by up to the heartbeat cap
(~10s) before the loop re-checked its budget; now it wakes within a tick. The cap
remains for heartbeat renewal.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…4750) Cron schedules all fire on the same boundary (e.g. every :00), stampeding the Postgres connection pool at the top of each minute/hour. Spread each due schedule's start across a [0, 30s) window via trigger.dev's delay option (no compute billed during the delay). Wires the previously-unused EnqueueOptions.delayMs through the trigger.dev backend.
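A minimal sketch of the jitter described above, assuming the delay is drawn uniformly from [0, 30s) and passed through `delayMs`. The helper name `jitterDelayMs`, the constant `MAX_JITTER_MS`, and the reduced `EnqueueOptions` shape are hypothetical; only `EnqueueOptions.delayMs` is named in the commit message.

```typescript
// Upper bound of the jitter window, taken from the commit message: [0, 30s).
const MAX_JITTER_MS = 30_000

// Hypothetical, reduced shape: only `delayMs` is named in the commit message.
interface EnqueueOptions {
  delayMs?: number
}

// Hypothetical helper: picks an integer delay in [0, MAX_JITTER_MS).
// `random` is injectable so the function can be exercised deterministically.
function jitterDelayMs(random: () => number = Math.random): number {
  return Math.floor(random() * MAX_JITTER_MS)
}

// Each due schedule would get its own independent delay.
const enqueueOptions: EnqueueOptions = { delayMs: jitterDelayMs() }
```

Because `Math.random()` returns a value in [0, 1), the resulting delay never reaches the full 30 seconds, which matches the half-open window in the commit message.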
* feat(tables): native enrichments sidebar + workflow input mapping
Add a Clay-style enrichments catalog to the table view and wire per-row
input mapping into workflow-backed columns.
- New "Enrichments" entry in the New-column dropdown opens a sliding panel
listing curated enrichment templates; picking one swaps to the workflow
config in-place (no cross-slide) with a back button.
- Type the workflow sidebar as manual | enrichment; enrichment hides the
launch + add-column-inputs affordances.
- Add a "Workflow inputs" advanced panel mapping Start-block input fields to
table columns (left-of-workflow columns only), with name-match auto-fill
and collapsible input-mapping-style rows.
- Persist type + inputMappings on the workflow group (types, contract, route,
service, hook) — jsonb, no migration.
- Consume inputMappings at run time: when present, feed Start-block fields
from the mapped columns; otherwise fall back to name-match spread.
- Clean up inputMappings on column rename/delete (stripGroupDeps + renameColumn).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
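The run-time rule in the last two bullets (use `inputMappings` when present, otherwise match Start-block fields to columns by name) might look roughly like the sketch below. The `InputMappings` shape (field name to column name), the function name `buildStartInputs`, and the reading of "name-match spread" as "field name equals column name" are assumptions, not the PR's actual types.

```typescript
type Row = Record<string, unknown>

// Hypothetical shape: Start-block field name -> table column name.
type InputMappings = Record<string, string>

// Builds the Start-block inputs for one row.
function buildStartInputs(row: Row, fields: string[], inputMappings?: InputMappings): Row {
  // Treat an empty mapping object the same as no mappings at all.
  const mappings =
    inputMappings !== undefined && Object.keys(inputMappings).length > 0
      ? inputMappings
      : undefined

  const inputs: Row = {}
  for (const field of fields) {
    // Mapped mode: look up the column for this field.
    // Fallback mode: assume the column has the same name as the field.
    const column: string | undefined = mappings ? mappings[field] : field
    if (column !== undefined && column in row) {
      inputs[field] = row[column]
    }
  }
  return inputs
}
```

In mapped mode a field with no mapping is simply left out, which is one possible policy; the PR does not say how unmapped fields are treated.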
* refactor(emcn): extract CollapsibleCard and reuse for input mapping
Pull the collapsible field-card markup (surface-4 header + surface-2 body,
click/keyboard toggle, truncated title + optional badge) into a shared
`CollapsibleCard` emcn component, and use it in the workflow-builder input
mapping rows and the table sidebar's input-mapping panel.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(tables): code-defined enrichment registry run directly per row
Enrichments are now TS configs in apps/sim/enrichments/ (registry, like
connectors) that run directly per table row via the existing run/dispatch/
cell-write rails — no workflow execution.
- enrichments/{types,registry} + work-email (heuristic) and phone-number (stub).
- WorkflowGroup gains enrichmentId; WorkflowGroupOutput gains outputId
(workflowId/blockId/path kept required, '' for enrichment groups).
- Executor branches on group.type === 'enrichment' → maps inputMappings →
enrich() → outputs by outputId → cell-write. Missing required inputs skip
(blank cell) instead of erroring.
- Sidebar lists the registry; enrichment-config panel maps inputs to columns
and creates the enrichment group (no workflow UI).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(enrichments): provider fallback cascade + hosted-key usage source
Replace each enrichment's single enrich() with an ordered providers[]
fallback cascade. Providers are plain data ({ id, label, toolId,
buildParams, mapOutput }) so the catalog stays client-safe; the
server-only runner (run.ts) calls executeTool per provider, first
non-empty result wins, misses/errors fall through, all-miss = blank cell.
Wire four enrichments on the hosted-safe providers (Hunter, PDL):
- Work Email (fullName, companyDomain): Hunter -> PDL
- Phone Number (fullName, companyDomain): PDL
- Company Domain (companyName): PDL
- Company Info (domain): PDL -> Hunter
Person enrichments take a single canonical fullName (Clay-style); Hunter
gets first/last via splitName(), PDL takes name directly.
Add 'enrichment' to usage_log_source enum (+ migration) so hosted-key
tool cost from these per-row calls can be billed to the table owner.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
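The fallback cascade described in this commit (first non-empty result wins, misses and errors fall through, all-miss yields a blank cell) can be sketched as below. This is a simplified, synchronous illustration: the `Provider.run` callback stands in for the real `executeTool` call, which is presumably asynchronous, and the names `runCascade` and `isNonEmpty` are hypothetical.

```typescript
type Output = Record<string, string>

// Hypothetical, simplified provider. The real entries are plain data
// ({ id, label, toolId, buildParams, mapOutput }) run server-side via executeTool.
interface Provider {
  id: string
  run: (input: Record<string, string>) => Output // may throw on provider failure
}

// A result counts as a hit when at least one output value is non-blank.
function isNonEmpty(output: Output): boolean {
  return Object.values(output).some((value) => value.trim() !== "")
}

// Tries providers in order and returns the first non-empty output.
function runCascade(providers: Provider[], input: Record<string, string>): Output | null {
  for (const provider of providers) {
    try {
      const output = provider.run(input)
      if (isNonEmpty(output)) return output // first non-empty result wins
    } catch {
      // Errors fall through to the next provider.
    }
  }
  return null // all-miss: the caller would write a blank cell
}
```

Providers after the first hit are never called, which matters here because each call may be billable.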
* feat(enrichments): bill hosted-key cost; surface provider errors; abort safety
- runEnrichment now returns { result, cost, error }: accumulates hosted-key
cost across the cascade, and sets `error` only when every provider that ran
errored (auth/rate-limit/outage) vs a clean miss.
- Executor records the cost to the table owner (createdBy) via recordUsage
(source 'enrichment'); billing failures are logged, never error the cell.
- F1: all-providers-errored now writes status 'error' instead of a blank
'completed' cell that looked like "no data found".
- F2: re-check the abort signal after the cascade so a cancel mid-tool-call
isn't recorded as a completed empty cell.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(tables): present enrichment columns as first-class in the grid
- Meta-header shows the enrichment's name + icon (Mail/Phone/Globe/Building2)
instead of "Workflow" + a color chip.
- Per-column header icon uses the enrichment's icon (via columnSourceInfo)
instead of the generic play icon.
- Hide "View execution" for enrichment cells in both the row context menu and
the action bar (no workflow execution exists to open); also hide the
meta-menu "View workflow" item for enrichment groups.
- Clicking an enrichment column header now opens the enrichments sidebar in
edit mode (pre-filled input mappings, Update via useUpdateWorkflowGroup)
instead of the workflow "Configure workflow" sidebar.
- Enrichment config lets the user name each output column (editable per-output,
deduped defaults) since enrichments can produce multiple columns.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(tables): enrichment columns use type icon; output names editable
- Drop the per-column enrichment icon (it duplicated the meta-header icon).
Enrichment output columns now render the standard column-type icon (Text,
etc.) — the enrichment's icon stays only on the group meta-header.
- Make output column names editable in the enrichment config edit mode too;
changed names rename their columns via useUpdateColumn (the rename cascades
into the group's output refs server-side). Validation excludes the output's
own current name.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(tables): wrap enrichment catalog descriptions instead of truncating
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(tables): edit enrichment output columns via the plain column editor
Edit column on an enrichment output now opens the normal column-config sidebar
(rename / type / unique) instead of the workflow 'Configure output column'
panel, which showed workflow-only fields and blocked a simple rename.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(copilot): list_enrichments + add_enrichment table tools
Let the copilot enumerate the code-defined enrichment registry and add an
enrichment column to a table (validating required input mappings against the
table's columns), backed by the same workflow-group machinery the UI uses.
* fix(enrichments): address PR review feedback
- Guard the enrichment cell path on `enrichmentId` so a group typed
'enrichment' without a registry id falls through to the workflow path
instead of erroring.
- Clear stale output values when skipping a row for missing required inputs,
so the auto cascade re-enriches once inputs return (was left completed+filled).
- Write a terminal state on abort in the enrichment path (matches the workflow
path) so a cancel between run and terminal-write can't leave the cell running.
- Edit mode: apply the group update (mappings/deps/auto-run) before column
renames so the primary edit lands even if a rename fails.
- Disable Save once validation has surfaced a missing required input.
- Use the workflowGroupById map instead of O(n) find in the context-menu and
action-bar hot paths.
* chore(commands): add /add-enrichment command
Guides adding a code-defined table enrichment to the registry, with a required
step to verify each provider tool has hosted-key support and chain to
/add-hosted-key when it doesn't.
* fix(enrichments): address second-pass PR review
- updateWorkflowGroup output diff now keys on outputId (falling back to
blockId::path) so enrichment outputs — which share empty blockId/path —
no longer collapse to one key and drop sibling columns.
- Enrichment terminal write now clears output columns absent from the result,
so a partial/empty re-run doesn't leave stale values.
- Editing a group whose enrichment was removed from the registry shows an
explanatory panel instead of silently falling through to the new-enrichment
catalog.
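The first bullet above explains why keying the output diff on `blockId::path` alone collapses enrichment outputs. A small sketch of the keying rule, assuming a reduced `WorkflowGroupOutput` shape and a hypothetical helper name `outputKey`:

```typescript
// Reduced, hypothetical shape. Per the earlier commit, blockId/path stay
// required and are '' for enrichment groups.
interface WorkflowGroupOutput {
  outputId?: string
  blockId: string
  path: string
}

// Key on outputId when present, otherwise fall back to blockId::path.
function outputKey(output: WorkflowGroupOutput): string {
  return output.outputId ? output.outputId : `${output.blockId}::${output.path}`
}

// Two enrichment outputs that share empty blockId/path.
const emailOutput: WorkflowGroupOutput = { outputId: "email", blockId: "", path: "" }
const phoneOutput: WorkflowGroupOutput = { outputId: "phone", blockId: "", path: "" }

// Keyed on blockId::path alone, both would map to "::" and one column would be dropped.
const keys = new Set([outputKey(emailOutput), outputKey(phoneOutput)])
```

With the `outputId` key the set holds two distinct entries, so sibling enrichment columns survive the diff.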
* feat(tables): show "Not found" badge for empty completed enrichment cells
An enrichment that runs to completion but matches nothing now renders a gray
"Not found" badge (like the Queued/Waiting cell states) instead of a blank
cell, so a real miss is distinguishable from an unrun cell. Scoped to
enrichment output columns; an empty string no longer counts as a value.
* fix(enrichments): don't re-run completed no-match enrichments on auto cascade
A completed enrichment with empty outputs is a real no-match result, not an
unfinished run. Eligibility now treats an enrichment's completed status as
terminal (regardless of output fill), so the auto cascade stops re-invoking
billable provider calls on every no-match row each dispatch. Input changes
still clear the exec entry, so genuine re-runs are unaffected; manual Run all
still re-runs.
* fix(enrichments): treat provider 404 as no-match, not a cell error
Providers like People Data Labs signal 'no record found' with HTTP 404, which
executeTool surfaces as a failed ToolResponse (status on output.status). The
cascade now treats a 404 as a clean miss — falls through to the next provider
and lets the cell render 'Not found' — instead of marking the cell errored.
Auth/rate-limit/5xx still propagate as real errors.
* fix(tools): surface HTTP status on error ToolResponse output
executeTool's catch handled Error instances in its first branch and only
extracted status/statusText/data for non-Error object throws — so HTTP errors
(thrown as Error instances carrying .status) lost their status on the returned
output. Surface it for Error instances too, so callers can branch on the
status (e.g. the enrichment cascade treating a provider 404 as a no-match).
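To illustrate the two commits above together, here is a hedged sketch of reading a numeric `status` off a thrown value regardless of whether it is an `Error` instance, then classifying a 404 as a no-match. The function names `extractStatus` and `classifyFailure` are hypothetical, and the real code surfaces the status on the `ToolResponse` output rather than returning it directly.

```typescript
// Reads a numeric `status` off any thrown value. Error instances are objects
// too, so an Error carrying `.status` is handled by the same branch.
function extractStatus(thrown: unknown): number | undefined {
  if (typeof thrown !== "object" || thrown === null) return undefined
  const status = (thrown as { status?: unknown }).status
  return typeof status === "number" ? status : undefined
}

type FailureKind = "miss" | "error"

// A provider 404 means "no record found" and is treated as a clean miss;
// anything else (auth, rate limit, 5xx, unknown) stays a real error.
function classifyFailure(thrown: unknown): FailureKind {
  return extractStatus(thrown) === 404 ? "miss" : "error"
}
```

A failure without any status is classified as an error here, which is the conservative choice: it avoids showing "Not found" for what may be an outage.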
* fix lint
* Revert ff
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(db): disable statement_timeout for migrations
* fix(ci): route migration workflow through guarded migrate.ts
* feat(tables): workflow-column run fixes + bounded "run N rows"
- Pass group.autoRun as the add-group dispatch flag so an autoRun=false
column no longer opens a no-op dispatch that flashes the run-count badge.
- Scope the context-menu re-run to the right-clicked workflow cell's group
(cascading to dependents) instead of every group on the row.
- Add an extensible per-dispatch row cap (DispatchLimit { type:'rows', max })
surfaced as "Run 10 / 1,000 empty rows" in the group header; dispatcher
stops after N eligible rows. New limit/processed_count columns on
table_run_dispatches.
- Fix stranded "Queued" cells: the cascade owner now treats a queued marker
(orphan pre-stamp) as a manual run so autoRun=false requested groups are
picked up, and drains late markers before releasing the row lock.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
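The per-dispatch row cap can be pictured with the sketch below. Only the `DispatchLimit { type:'rows', max }` shape comes from the commit message; the function `selectRowsToDispatch` and its eligibility callback are hypothetical stand-ins for the dispatcher's real loop.

```typescript
// Shape taken from the commit message; a union leaves room for other limit types.
type DispatchLimit = { type: "rows"; max: number }

// Walks rows in order and stops once `max` eligible rows have been selected.
function selectRowsToDispatch<T>(
  rows: T[],
  isEligible: (row: T) => boolean,
  limit?: DispatchLimit
): T[] {
  const selected: T[] = []
  for (const row of rows) {
    if (limit && limit.type === "rows" && selected.length >= limit.max) break
    if (isEligible(row)) selected.push(row)
  }
  return selected
}
```

Note that the cap counts eligible rows, not scanned rows, so ineligible rows in between do not consume it.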
* chore(db): regenerate dispatch limit migration on staging chain (0214)
Re-numbers the table_run_dispatches limit/processed_count columns from the
collided 0212 to 0214 after merging staging (which added its own 0212/0213).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(tables): lint formatting
* fix(tables): address PR review on dispatch cap + cascade drain
- Don't consume the row cap when batchEnqueueAndWait fails; a transient
failure no longer completes a capped dispatch with zero rows started.
- Outer cascade-drain loop only re-drives a genuine queued marker, not any
eligible group, so an empty-output group can't re-run forever.
- completeDispatch forwards limit on the terminal SSE event.
- Extract shared LIMITED_RUN_PRESETS for the Run-N-rows menu items.
* chore(lint): format generated tool-schemas-v1
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR Summary
High Risk
Overview
Workflow / run behavior is tightened: optional
Ops / polish: scheduled executions get 0–30s random enqueue jitter; Copilot gains
Reviewed by Cursor Bugbot for commit 92fd17c.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Greptile Summary
This PR bundles five distinct changes: a new enrichment table column type (registry-backed provider cascade with billing), workflow-column run fixes (bounded dispatches,
Confidence Score: 4/5
Safe to merge; all three findings are edge cases unlikely to trigger in normal usage and do not affect correctness of the happy path.
The queue list TTL is only refreshed on new enqueues, not during the active head waiter's heartbeat cycle. Under a very long wait with no new callers the list can expire, causing all waiters to bypass FIFO ordering simultaneously. The interruptibleSleep TOCTOU window is real but bounded to one extra sleep period by design. The stale enrichmentId field carried across outer-loop iterations is currently harmless but could mislead future code that inspects the payload.
apps/sim/lib/core/rate-limiter/hosted-key/queue.ts (queue list TTL refresh) and apps/sim/background/workflow-column-execution.ts (stale enrichmentId in re-drive payload)
Sequence Diagram
sequenceDiagram
participant Caller as Tool Caller
participant RL as HostedKeyRateLimiter
participant Q as HostedKeyQueue (Redis)
participant Bucket as Token Bucket
Caller->>RL: acquireKey(provider, prefix, config, workspaceId, signal)
RL->>Q: enqueue(provider, workspaceId, ticketId)
Q-->>RL: position and enabled
loop Poll until head
RL->>Q: checkHead(provider, workspaceId, ticketId)
Note over Q: Lua EVAL reap dead head check position
Q-->>RL: waiting or head or missing
RL->>Q: refreshHeartbeat every 10s
end
RL->>Bucket: waitForActorCapacity loop until token available
Bucket-->>RL: capacity granted
RL-->>Caller: success true key
Note over RL,Q: finally dequeue ticketId
RL->>Q: dequeue(provider, workspaceId, ticketId)
Reviews (1): Last reviewed commit: "fix(tables): workflow-column run fixes +..."
/**
 * TTL on the queue list itself. Set on every enqueue. Prevents abandoned queues
 * (whole workspace went silent) from sticking around forever in Redis.
 */
const QUEUE_LIST_TTL_SECONDS = 600
Queue list TTL not extended by active head waiter
QUEUE_LIST_TTL_SECONDS (600 s) is only refreshed via pipeline.expire inside enqueue. The refreshHeartbeat method extends the per-ticket key (30 s TTL, refreshed every 10 s) but never touches the queue list key. If the head waiter's execution budget exceeds 10 minutes (ABSOLUTE_MAX_QUEUE_WAIT_MS derives from getMaxExecutionTimeout(), which can be much longer for enterprise async) and no new caller arrives to refresh the list TTL, the queue list expires. Every waiter's next checkHead call sees "missing" (list gone → LINDEX returns nil → script returns "missing") and all proceed to the bucket simultaneously, collapsing FIFO ordering into concurrent bucket racing. Extending the queue list TTL inside refreshHeartbeat (alongside the ticket key) would close this gap.
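The suggested fix could be sketched as follows: on each heartbeat, extend the queue list key alongside the ticket key. This is only an illustration of which keys would be touched. The helper `heartbeatExpireCommands`, the `ExpireCommand` shape, the constant name `TICKET_TTL_SECONDS`, and the example key strings are hypothetical; the 600 s and 30 s values come from the review text above.

```typescript
// From the quoted source: TTL on the queue list itself.
const QUEUE_LIST_TTL_SECONDS = 600
// From the review text: the per-ticket key has a 30 s TTL (name is hypothetical).
const TICKET_TTL_SECONDS = 30

interface ExpireCommand {
  key: string
  seconds: number
}

// Hypothetical helper: lists the EXPIRE commands a single heartbeat would issue.
// Including the queue list key keeps the list alive while a head waiter is active,
// even if no new caller enqueues for longer than QUEUE_LIST_TTL_SECONDS.
function heartbeatExpireCommands(ticketKey: string, queueListKey: string): ExpireCommand[] {
  return [
    { key: ticketKey, seconds: TICKET_TTL_SECONDS },
    { key: queueListKey, seconds: QUEUE_LIST_TTL_SECONDS },
  ]
}

// Example key names are made up for illustration.
const commands = heartbeatExpireCommands("hosted-key:ticket:t1", "hosted-key:queue:ws1")
```

In the real `refreshHeartbeat`, each entry would presumably become one expire call in the same Redis pipeline that already refreshes the ticket key.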
 */
const QUEUE_HEAD_POLL_MS = 200

/**
 * Sleep for `ms`, resolving early if `signal` aborts. Cleans up its own timer and listener
 * so neither leaks. Callers don't need to distinguish an early (aborted) return from a normal
 * one — the surrounding wait loop re-checks its budget immediately after and bails when the
 * signal has fired. Falls back to a plain sleep when no signal is provided.
 */
function interruptibleSleep(ms: number, signal?: AbortSignal): Promise<void> {
  if (!signal) return sleep(ms)
  if (signal.aborted) return Promise.resolve()
  return new Promise<void>((resolve) => {
    const onAbort = () => {
      clearTimeout(timer)
      resolve()
    }
    const timer = setTimeout(() => {
      signal.removeEventListener('abort', onAbort)
      resolve()
    }, ms)
    signal.addEventListener('abort', onAbort, { once: true })
  })
}
TOCTOU window in interruptibleSleep between signal.aborted check and addEventListener
If the signal fires between the if (signal.aborted) guard at the top and the signal.addEventListener('abort', onAbort, { once: true }) call at the bottom, the abort event will not be delivered to the listener (it already fired before the listener was registered). The sleep then runs to full ms duration instead of resolving early. The call-site comment notes callers re-check their budget after each sleep, so the practical impact is bounded to one extra sleep period (at most HEARTBEAT_REFRESH_INTERVAL_MS = 10 s in the bucket-wait path). Adding a signal.aborted re-check immediately after addEventListener would eliminate the window entirely.
currentPayload = {
  ...currentPayload,
  groupId: next.id,
  workflowId: next.workflowId,
  executionId: generateId(),
}
Stale enrichmentId carried forward when re-driving a workflow group after an enrichment group
The outer re-drive loop spreads ...currentPayload when building the next iteration's payload, which means enrichmentId from a prior enrichment group persists if the next group is a workflow group (next.enrichmentId is undefined and is never explicitly cleared). The current runWorkflowAndWriteTerminal implementation routes on group.type === 'enrichment' && group.enrichmentId (using the schema-fresh group object, not payload.enrichmentId), so this stale field does not cause incorrect routing today. However, if any future code path reads payload.enrichmentId as a signal for "this is an enrichment run", the mis-attribution would silently trigger the wrong branch. Explicitly setting enrichmentId: next.enrichmentId (or enrichmentId: undefined) when constructing currentPayload would keep the payload consistent with the target group.
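A sketch of the suggested change, assuming reduced `Payload` and `Group` shapes and a hypothetical wrapper function `buildNextPayload`. The point is only that `enrichmentId` is assigned explicitly from the target group, so a value left over from a previous enrichment group cannot survive the spread.

```typescript
// Reduced, hypothetical shapes for illustration.
interface Payload {
  groupId: string
  workflowId: string
  executionId: string
  enrichmentId?: string | undefined
}

interface Group {
  id: string
  workflowId: string
  enrichmentId?: string | undefined
}

// Builds the payload for the next re-drive iteration.
function buildNextPayload(current: Payload, next: Group, generateId: () => string): Payload {
  return {
    ...current,
    groupId: next.id,
    workflowId: next.workflowId,
    // Explicit assignment overwrites any stale value carried by the spread.
    enrichmentId: next.enrichmentId,
    executionId: generateId(),
  }
}
```

When `next` is a workflow group, `next.enrichmentId` is undefined, so the resulting payload no longer claims to be an enrichment run.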
Greptile and Bugbot issues are legit but not actually that impactful. Will take these in a follow-up PR.
