feat(validator): delete Bigtable prediction rows on Postgres soft-delete#293
feat(validator): delete Bigtable prediction rows on Postgres soft-delete#293Thykof wants to merge 2 commits into
Conversation
Density tapering and cleanup_old_history now return the affected bigtable_keys and delete those rows after the Postgres commit; per-table GC remains the backstop for deletes that don't land. Adds a one-off backfill script for rows tombstoned before this change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
9cc94c1 to
596e8b1
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates the validator’s Postgres soft-delete flows to also (best-effort) delete the corresponding Bigtable prediction rows immediately, rather than waiting for Bigtable GC max-age, and adds a one-off backfill script for previously tombstoned rows.
Changes:
- Add
BigtablePredictionStorage.delete_predictions(time_length, keys)with chunkedmutate_rowsdeletes and failure signaling (for logging by callers). - Update
density_tapering_predictionsandcleanup_old_historytoRETURNING bigtable_keyand delete those keys from Bigtable after the Postgres transaction commits. - Add tests covering Bigtable delete routing/chunking/failure behavior and handler behavior; add a backfill script for historical tombstones.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
synth/validator/miner_data_handler.py |
Collects bigtable_key values on soft-delete and triggers post-commit Bigtable deletes. |
synth/validator/bigtable_prediction_storage.py |
Introduces delete_predictions and documents best-effort delete + GC backstop behavior. |
tests/test_miner_data_handler.py |
Adds handler-level tests to assert Bigtable deletions happen (or are skipped/handled) appropriately. |
tests/test_bigtable_prediction_storage.py |
Adds unit tests for delete routing, chunking, and failure handling. |
scripts/backfill_bigtable_tombstone_deletes.py |
Adds an idempotent, dry-run-by-default script to delete Bigtable rows for already-tombstoned Postgres predictions. |
Comments suppressed due to low confidence (1)
synth/validator/miner_data_handler.py:901
- The exception log message here references
prune_redundant_predictions, but this code is insidedensity_tapering_predictions. This makes logs misleading when Bigtable deletion (or the SQL) fails and complicates debugging/alerting.
)
except Exception as e:
bt.logging.exception(
f"in prune_redundant_predictions (got an exception): {e}"
)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Addressed the Copilot review in 82dbece: |
What
When
density_tapering_predictionsorcleanup_old_historysoft-deletes miner predictions in Postgres, the corresponding Bigtable rows are now deleted too, instead of waiting for the per-table GC max-age.BigtablePredictionStorage.delete_predictions(time_length, keys): oneDeleteFromRowmutation per key, sent viamutate_rowsin count-bounded chunks; raises on any unconfirmed status (same contract as writes).bigtable_keys viaRETURNINGand delete them after the Postgres commit — a rollback can never orphan-delete blobs, and a failed Bigtable delete is only logged; those rows age out via GC as before.scripts/backfill_bigtable_tombstone_deletes.py: one-off, idempotent backfill for rows tombstoned before this change (dry-run by default,--executeto delete).Tests
delete_predictions: table routing, chunking, failure statuses.bigtable_keyare skipped, and a failing Bigtable delete does not roll back the Postgres tombstoning.🤖 Generated with Claude Code