Skip to content

feat(validator): delete Bigtable prediction rows on Postgres soft-delete#293

Open
Thykof wants to merge 2 commits into
mainfrom
bigtable-delete-on-soft-delete
Open

feat(validator): delete Bigtable prediction rows on Postgres soft-delete#293
Thykof wants to merge 2 commits into
mainfrom
bigtable-delete-on-soft-delete

Conversation

@Thykof

@Thykof Thykof commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator

What

When density_tapering_predictions or cleanup_old_history soft-deletes miner predictions in Postgres, the corresponding Bigtable rows are now deleted too, instead of waiting for the per-table GC max-age.

  • BigtablePredictionStorage.delete_predictions(time_length, keys): one DeleteFromRow mutation per key, sent via mutate_rows in count-bounded chunks; raises on any unconfirmed status (same contract as writes).
  • Both soft-delete paths collect the affected bigtable_keys via RETURNING and delete them after the Postgres commit — a rollback can never orphan-delete blobs, and a failed Bigtable delete is only logged; those rows age out via GC as before.
  • scripts/backfill_bigtable_tombstone_deletes.py: one-off, idempotent backfill for rows tombstoned before this change (dry-run by default, --execute to delete).

Tests

  • Unit tests for delete_predictions: table routing, chunking, failure statuses.
  • Handler tests (testcontainers Postgres): thinned/erased rows' keys are deleted, keeper and in-retention rows are not, rows without a bigtable_key are skipped, and a failing Bigtable delete does not roll back the Postgres tombstoning.

🤖 Generated with Claude Code

@Thykof Thykof requested a review from Copilot July 3, 2026 11:52
Density tapering and cleanup_old_history now return the affected
bigtable_keys and delete those rows after the Postgres commit;
per-table GC remains the backstop for deletes that don't land.
Adds a one-off backfill script for rows tombstoned before this change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@Thykof Thykof force-pushed the bigtable-delete-on-soft-delete branch from 9cc94c1 to 596e8b1 Compare July 3, 2026 11:53

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the validator’s Postgres soft-delete flows to also (best-effort) delete the corresponding Bigtable prediction rows immediately, rather than waiting for Bigtable GC max-age, and adds a one-off backfill script for previously tombstoned rows.

Changes:

  • Add BigtablePredictionStorage.delete_predictions(time_length, keys) with chunked mutate_rows deletes and failure signaling (for logging by callers).
  • Update density_tapering_predictions and cleanup_old_history to RETURNING bigtable_key and delete those keys from Bigtable after the Postgres transaction commits.
  • Add tests covering Bigtable delete routing/chunking/failure behavior and handler behavior; add a backfill script for historical tombstones.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
synth/validator/miner_data_handler.py Collects bigtable_key values on soft-delete and triggers post-commit Bigtable deletes.
synth/validator/bigtable_prediction_storage.py Introduces delete_predictions and documents best-effort delete + GC backstop behavior.
tests/test_miner_data_handler.py Adds handler-level tests to assert Bigtable deletions happen (or are skipped/handled) appropriately.
tests/test_bigtable_prediction_storage.py Adds unit tests for delete routing, chunking, and failure handling.
scripts/backfill_bigtable_tombstone_deletes.py Adds an idempotent, dry-run-by-default script to delete Bigtable rows for already-tombstoned Postgres predictions.
Comments suppressed due to low confidence (1)

synth/validator/miner_data_handler.py:901

  • The exception log message here references prune_redundant_predictions, but this code is inside density_tapering_predictions. This makes logs misleading when Bigtable deletion (or the SQL) fails and complicates debugging/alerting.
                )
        except Exception as e:
            bt.logging.exception(
                f"in prune_redundant_predictions (got an exception): {e}"
            )

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread synth/validator/bigtable_prediction_storage.py Outdated
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@Thykof

Thykof commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed the Copilot review in 82dbece: delete_predictions now builds the DirectRows per chunk (equivalent to the suggested changeset, without the rows = keys alias), and the exception log in density_tapering_predictions no longer references the old function name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants