Apply distance constraints in rescore KNN queries (fixes #308) #309

Open
stumpylog wants to merge 1 commit into asg017:main from stumpylog:fix/308-rescore-distance-constraints

Conversation

@stumpylog

Summary

Fixes #308. On vec0 tables created with INDEXED BY rescore(...), KNN queries silently ignored distance <op> ? constraints:

SELECT rowid, distance FROM t
WHERE embedding MATCH :vector AND k = :limit AND distance <= :threshold
ORDER BY distance

This returned the k nearest rows by distance, including rows whose distance was above the threshold.
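To make the difference concrete, here is a small pure-Python model of the two behaviours. It is only an illustration: the function names and the sample distances are hypothetical, and it does not call sqlite-vec.

```python
import heapq

def knn_ignoring_threshold(distances, k):
    """Model of the reported behaviour: top-k by distance, threshold ignored."""
    return heapq.nsmallest(k, distances.items(), key=lambda kv: kv[1])

def knn_with_threshold(distances, k, threshold):
    """Model of the intended behaviour: keep distance <= threshold, then top-k."""
    within = {rowid: d for rowid, d in distances.items() if d <= threshold}
    return heapq.nsmallest(k, within.items(), key=lambda kv: kv[1])

# Hypothetical rowid -> distance pairs for a single query vector.
distances = {1: 0.10, 2: 0.35, 3: 0.80, 4: 1.20}

print(knn_ignoring_threshold(distances, k=3))             # rows 1, 2 and 3
print(knn_with_threshold(distances, k=3, threshold=0.5))  # rows 1 and 2 only
```

With k = 3 and a threshold of 0.5, the first call still returns row 3 (distance 0.80), which is the behaviour reported in #308.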

Root cause

The standard chunk-scan path filters candidates against the distance constraints parsed from idxStr/argv. Rescore columns dispatch to rescore_knn, which received those constraints but never applied them.

Fix

Apply the constraints in phase 2 of rescore_knn, after the exact float distances are computed and before top-k selection, so that they target the final rescored distance (the value the distance column reports) rather than the coarse quantized distance from phase 1. Candidates failing a GE/GT/LE/LT constraint are dropped, with an early-out when none survive.
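The ordering described above can be sketched in Python. This is not the C code from the PR: `rescore_phase2`, the `OPS` table, the use of L2 distance, and the sample vectors are all assumptions made for illustration. Only the GE/GT/LE/LT constraint kinds and the "filter on exact distance, then select top-k" ordering come from the description.

```python
import math
import operator

# Hypothetical mapping from the constraint kinds named in the PR to comparisons.
OPS = {"GE": operator.ge, "GT": operator.gt, "LE": operator.le, "LT": operator.lt}

def rescore_phase2(query, candidates, k, constraints):
    """Rescore phase-1 candidates with exact distances, filter, then take top-k.

    candidates:  {rowid: float vector} that survived the coarse phase-1 scan.
    constraints: list of (op_name, value) pairs applied to the exact distance.
    """
    scored = []
    for rowid, vec in candidates.items():
        d = math.dist(query, vec)  # exact float distance (L2 assumed here)
        if all(OPS[op](d, value) for op, value in constraints):
            scored.append((d, rowid))
    if not scored:
        return []  # early-out: no candidate satisfies the constraints
    scored.sort()
    return [(rowid, d) for d, rowid in scored[:k]]

query = [0.0, 0.0]
candidates = {1: [3.0, 4.0], 2: [0.0, 1.0], 3: [6.0, 8.0], 4: [0.0, 2.0]}

# distance > 1.0 AND distance <= 5.0, k = 2
print(rescore_phase2(query, candidates, 2, [("GT", 1.0), ("LE", 5.0)]))
```

In this sketch, filtering before top-k selection matters for lower-bound constraints: selecting the two nearest rows first would pick rows 2 and 4, and the `GT 1.0` filter would then leave only row 4, whereas filtering first yields rows 4 and 1.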

Testing

  • Added test_knn_distance_constraint_le and test_knn_distance_constraint_lt_gt to tests/test-rescore.py; both fail on main and pass with this change.
  • Full loadable suite: 484 passed, 132 skipped.

🤖 Generated with Claude Code

KNN queries on vec0 tables using INDEXED BY rescore(...) silently ignored
`distance <op> ?` constraints. The standard chunk-scan path filters
candidates against parsed distance constraints, but the rescore dispatch
(rescore_knn) never applied them.

Apply the constraints to the rescored float distances in phase 2, before
top-k selection, so they target the same distance the `distance` column
reports. Handle the now-possible zero-surviving-candidates case.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Distance threshold query silently fails when using bit quantization
