Multi-shard search: Rust merges#3752
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3752 +/- ##
==========================================
- Coverage 85.47% 82.97% -2.50%
==========================================
Files 563 565 +2
Lines 48861 49809 +948
Branches 14351 15127 +776
==========================================
- Hits 41763 41330 -433
- Misses 6463 7846 +1383
+ Partials 635 633 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
| LOCAL => shard_search::search(Arc::clone(&self.index_cache), message.clone()).await, | ||
| REMOTE => search | ||
|
|
||
| if message.shard_ids.len() == 1 { |
There was a problem hiding this comment.
Maybe this should only be for the merge itself?
like, always iterate all queries and then do an if to run the merge code or just return the first result?
| // Distributed query across multiple shards | ||
| let mut responses = Vec::with_capacity(message.shard_ids.len()); | ||
|
|
||
| for shard_id in &message.shard_ids { |
There was a problem hiding this comment.
We probably want to run all task concurrently (with a JoinSet or similar). Otherwise the query time is linear to the number of shards. I think Python already sent all queries in parallel, so this would be a regression.
There was a problem hiding this comment.
Next PR. We can merge them and have it all here
Description
This is the first part of a series of PRs to improve how internal gRPC search requests are done through the cluster. We currently send 1 gRPC request per shard, but we could instead send 1 per searcher (containing a KB shard).
In this PR, we implement multi-shard requests and shard response merging in Rust, although Python will still request 1 per shard.
The change on the proto is bw/c compatible (see https://protobuf.dev/programming-guides/proto3/#wire-safe-changes). Python sends 1 shard id string as a string/list and nidx parses 1 shard id string as a string/list.
How was this PR tested?
New Rust tests on merging