fix: exclude wiped dataclips from work order body search #4889
When a dataclip is wiped (per-run wipe or the project data-retention job), its body/request are cleared and wiped_at is stamped, but its full-text search_vector is left intact, so wiped dataclips stayed searchable by the exact erased content via work order body search. Guard the body full-text match on is_nil(input_dataclip.wiped_at) so erased dataclip content is no longer discoverable, regardless of any stale search_vector. The guard targets the input_dataclip binding the body search already uses, so it does not affect other search fields. Fixes OpenFn#4824
The raw Repo.query! binding passed the dataclip id as a 36-char string to a $1::uuid parameter, which Postgrex rejects (expects a 16-byte binary). Wrap it in Ecto.UUID.dump!/1, matching the existing pattern in runs_test.exs.
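To illustrate the size mismatch described above, here is a minimal sketch in plain Elixir. The `UuidSketch.to_binary/1` helper is hypothetical and only mimics, for a well-formed UUID string, the 16-byte result that `Ecto.UUID.dump!/1` produces; the test itself uses `Ecto.UUID.dump!/1`.

```elixir
# Hypothetical helper, for illustration only: strips the hyphens from a
# canonical UUID string and decodes the remaining 32 hex characters into
# the 16 raw bytes that a $1::uuid parameter expects.
defmodule UuidSketch do
  def to_binary(uuid_string) do
    uuid_string
    |> String.replace("-", "")
    |> Base.decode16!(case: :mixed)
  end
end

# Made-up example id, not taken from the PR.
id = "123e4567-e89b-12d3-a456-426614174000"

IO.inspect(byte_size(id))                        # prints 36 (string form)
IO.inspect(byte_size(UuidSketch.to_binary(id)))  # prints 16 (binary form)
```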
Description
This PR fixes a data-retention hole where wiped dataclips remained searchable by their erased content.
When a dataclip is wiped (per-run wipe with save_dataclips: false, or the project data-retention job), its body/request are cleared to NULL and wiped_at is stamped, but its full-text search_vector column is left intact (the indexing trigger/worker only runs on insert, never on the wipe UPDATE). The work order body search matches that stale vector with no wiped_at guard, so a wiped dataclip stays searchable by the exact content that was meant to be erased.

The fix adds an is_nil(input_dataclip.wiped_at) guard to the :body branch of Invocation.build_search_fields_where/2, ANDed onto the full-text match. This targets the :input_dataclip binding that body search already uses, so erased dataclip content is no longer discoverable regardless of any stale search_vector, and other search fields (id / log / dataclip_name / status) are unaffected.

This mirrors the read-side guard already present in search_workorders_for_retry/2 via exclude_wiped_dataclips/1, but applied at the precise binding the body full-text predicate matches rather than the work order's own dataclip.

Closes #4824
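The effect of the guard can be sketched with an in-memory model in plain Elixir. This is not the Ecto query from the PR: the `BodySearchSketch` module, the map fields, and the substring check standing in for the full-text match are all assumptions made for illustration.

```elixir
# In-memory model of the guarded :body predicate. Everything here is a
# hypothetical stand-in; the real code builds an Ecto query instead.
defmodule BodySearchSketch do
  # Stand-in for the full-text match: a naive substring check against the
  # (possibly stale) search_vector text.
  defp vector_matches?(%{search_vector: nil}, _term), do: false

  defp vector_matches?(%{search_vector: vector}, term),
    do: String.contains?(vector, term)

  # The guard: the vector match is ANDed with is_nil(wiped_at), so a stale
  # vector on a wiped dataclip can no longer produce a hit.
  def body_match?(dataclip, term) do
    vector_matches?(dataclip, term) and is_nil(dataclip.wiped_at)
  end

  def search(dataclips, term), do: Enum.filter(dataclips, &body_match?(&1, term))
end

live = %{id: 1, body: ~s({"secret":"alpha"}), search_vector: "secret alpha", wiped_at: nil}

# Wiped dataclip: body cleared and wiped_at stamped, but the vector is stale.
wiped = %{id: 2, body: nil, search_vector: "secret alpha", wiped_at: ~U[2024-01-01 00:00:00Z]}

[live, wiped]
|> BodySearchSketch.search("alpha")
|> Enum.map(& &1.id)
|> IO.inspect()  # prints [1]
```

Without the `is_nil/1` check, both entries would match, which is the behaviour this PR removes.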
Validation steps
search_fields: ["body"]) returns it.
Lightning.Runs.wipe_dataclips/1, or the project data-retention job).
search_vector is still populated.
The two new tests in test/lightning/invocation_test.exs (describe "search_workorders/3") cover the regression and the non-wiped match case, including a positive control proving the body vector is populated before the negative assertion.
Additional notes for the reviewer
search_vector workers, or any migration.
search_vector in Query.wipe_dataclips/1 was dropped: search_vector is not a declared schema field (it is managed by raw-SQL workers), so a schema update_all referencing it would not be valid.
AI Usage
Please disclose whether you've used AI anywhere in this PR (it's cool, we just
want to know!):
You can read more details in our
Responsible AI Policy
Pre-submission checklist
/review with Claude Code)
(e.g., :owner, :admin, :editor, :viewer): n/a, this is a read-side search filter with no authorization surface.