[Story] EmbeddingsRanker reusing CodeIndexManager provider with BM25 fallback #576

@edelauna

Description

Context

When the user has an embedding provider configured, embedding-based ranking should give better recall than BM25, especially for paraphrased queries (e.g. "fetch latest tickets" → `jira.getIssues`). Zoo Code's `CodeIndexManager` already wires up the Ollama, OpenAI-compatible, and Semble embedders, so users don't need to configure anything twice.

Depends on #575 (Bm25Ranker) being merged so the `Ranker` interface exists.

Developer Notes

  • New `src/services/tools/EmbeddingsRanker.ts` implementing the same `Ranker` interface introduced in #575 ([Story] Bm25Ranker for MCP tool search (pure-JS, no external deps)).
  • Reuse the embedding provider exposed by `CodeIndexManager` — add a thin `embedText(text: string): Promise<number[]>` facade if one doesn't already exist.
  • Maintain an in-memory `Map<toolKey, vector>` cache. Invalidate when the `ToolDoc[]` input set changes.
  • Cosine similarity for ranking. Top-K by score.
  • Factory in `src/services/tools/index.ts`: return `EmbeddingsRanker` if `CodeIndexManager.hasEmbeddingProvider()` is true; fall back to `Bm25Ranker` otherwise, or if any embedding call throws.
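The cache, invalidation, and cosine ranking described above could look roughly like the sketch below. The `ToolDoc`, `RankedTool`, and `Ranker` shapes are assumptions made for illustration (the real ones come from #575 and may differ), and `EmbedText` stands in for the proposed `embedText` facade rather than any existing `CodeIndexManager` API.

```typescript
// Hypothetical shapes: the real `ToolDoc` and `Ranker` are defined in #575 and may differ.
interface ToolDoc { key: string; text: string }
interface RankedTool { key: string; score: number }
interface Ranker {
  rank(query: string, docs: ToolDoc[], topK: number): Promise<RankedTool[]>
}

// Stand-in for the proposed `embedText` facade on the embedding provider.
type EmbedText = (text: string) => Promise<number[]>

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch")
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB)
  return denom === 0 ? 0 : dot / denom
}

class EmbeddingsRanker implements Ranker {
  private embedText: EmbedText
  private cache = new Map<string, number[]>() // toolKey -> vector
  private fingerprint = ""

  constructor(embedText: EmbedText) {
    this.embedText = embedText
  }

  async rank(query: string, docs: ToolDoc[], topK: number): Promise<RankedTool[]> {
    // Drop the whole cache whenever the set of tool docs (keys or text) changes.
    const fp = docs.map((d) => `${d.key}\u0000${d.text}`).sort().join("\u0001")
    if (fp !== this.fingerprint) {
      this.cache.clear()
      this.fingerprint = fp
    }

    const queryVec = await this.embedText(query)
    const scored: RankedTool[] = []
    for (const doc of docs) {
      let vec = this.cache.get(doc.key)
      if (!vec) {
        vec = await this.embedText(doc.text)
        this.cache.set(doc.key, vec)
      }
      scored.push({ key: doc.key, score: cosineSimilarity(queryVec, vec) })
    }
    // Highest score first, then keep the top K.
    return scored.sort((a, b) => b.score - a.score).slice(0, topK)
  }
}
```

Clearing the entire cache on any change keeps the sketch simple; a finer-grained variant could evict only keys whose text changed, at the cost of more bookkeeping.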

Acceptance Criteria

  • `EmbeddingsRanker` implements `Ranker` interface
  • Unit test with mock embedding provider returning known vectors — cosine ranking is correct
  • Integration test: embedding error → the ranker returned by the factory falls back to BM25 results, no user-visible failure
  • Factory correctly selects ranker based on provider availability
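One way to satisfy the fallback and selection criteria is to have the factory wrap the embeddings ranker so that any thrown error is answered by BM25 instead. The sketch below is only an illustration under assumptions: the `Ranker` and `ToolDoc` shapes, the `FallbackRanker` wrapper, and the `createRanker` signature are all hypothetical, and the boolean argument stands in for `CodeIndexManager.hasEmbeddingProvider()`.

```typescript
// Hypothetical shapes: the real `ToolDoc` and `Ranker` are defined in #575 and may differ.
interface ToolDoc { key: string; text: string }
interface RankedTool { key: string; score: number }
interface Ranker {
  rank(query: string, docs: ToolDoc[], topK: number): Promise<RankedTool[]>
}

// Hypothetical wrapper: tries the primary ranker, answers with the fallback if it throws.
class FallbackRanker implements Ranker {
  private primary: Ranker
  private fallback: Ranker

  constructor(primary: Ranker, fallback: Ranker) {
    this.primary = primary
    this.fallback = fallback
  }

  async rank(query: string, docs: ToolDoc[], topK: number): Promise<RankedTool[]> {
    try {
      return await this.primary.rank(query, docs, topK)
    } catch {
      // Swallow the embedding error so the caller still gets a ranking.
      return this.fallback.rank(query, docs, topK)
    }
  }
}

// Hypothetical factory: `hasEmbeddingProvider` stands in for
// `CodeIndexManager.hasEmbeddingProvider()`.
function createRanker(
  hasEmbeddingProvider: boolean,
  makeEmbeddingsRanker: () => Ranker,
  bm25: Ranker,
): Ranker {
  return hasEmbeddingProvider ? new FallbackRanker(makeEmbeddingsRanker(), bm25) : bm25
}

// Stubs that a test could use in place of the real rankers.
const stubBm25: Ranker = { rank: async () => [{ key: "from-bm25", score: 1 }] }
const failingEmbeddings: Ranker = {
  rank: async () => {
    throw new Error("embedding provider unavailable")
  },
}
```

With these stubs, `createRanker(false, () => failingEmbeddings, stubBm25)` returns `stubBm25` itself, while `createRanker(true, () => failingEmbeddings, stubBm25)` returns a wrapper whose `rank` resolves with the BM25 result rather than rejecting.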

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
