feat(collections): concurrent upload_documents with bounded parallelism#166
feat(collections): concurrent upload_documents with bounded parallelism#166ryanda9910 wants to merge 1 commit into
Conversation
…lelism Sequential document uploads were a scaling bottleneck (issue xai-org#77). Add upload_documents() to sync + async clients: a bounded ThreadPoolExecutor (configurable max_workers, conservative default) that overlaps upload latency, preserves result order, and re-raises the first error without swallowing others. Backward compatible (upload_document unchanged). Idea suggested by @nicsins in xai-org#77.
|
@ryanda9910 Solid PR — LGTM from a rogue contributor! Bounded concurrency with order preservation and clean error handling is spot on. In Dragonscale we use persistent shared knowledge on a dedicated server with unicast patterns — lets us often push higher effective concurrency without the usual bottlenecks. Quick question: default max_workers=4 feels pretty conservative. Any chance we could bump the default to 8 (or let maintainers decide after some load testing)? Thanks for tackling #77 and the shoutout! 🚀 |
|
Thanks for taking a look! The |
|
refined_xai_collections_batch.py try this it prevents throwing errors using some slickness and cunning😎 lol !hopefully it might be a much better solution than i offered earlier as looked through it again and added some things i saw missing good luck |
Problem
Uploading many documents to a collection is currently a serial loop of
upload_documentcalls (issue #77). Each upload is an independent, blocking sequence of gRPC round-trips, so the wall-clock time grows linearly with the number of documents — a scaling bottleneck for collections-management workflows.Approach
Add
upload_documents(...)to both the sync (Client) and async (AsyncClient) collections clients. It's a thin convenience wrapper that overlaps the network latency of independent uploads with bounded concurrency:ThreadPoolExecutor(sync) /asynciogather with a semaphore (async), capped by a configurablemax_workerswith a conservative default to respect server-side rate limits.upload_documentis unchanged; this is purely additive.__context__(avoids the silent error-swallowing anti-pattern).ValueErroron empty input / non-positivemax_workers).Benchmark (mocked 100ms/upload latency, N=40)
Real-world speedup depends on per-upload latency and server limits; methodology is mocked I/O latency so the numbers isolate the concurrency win.
Tests
Sync suite 98 passed, async suite 91 passed (incl. new cases covering parallel success, order preservation, and error propagation).
ruff checkclean.Question for maintainers
What default
max_workersdo you want? I set a conservative default to avoid tripping server-side rate limits — happy to adjust to whatever the service comfortably accommodates.Idea originally suggested by @nicsins in the issue thread. Fixes #77.
This is a draft for review — feedback on the concurrency default and async approach welcome.