[Tracking] Dynamic MCP Tool Loading — Implementation Roadmap #584

# Tech Brief: Dynamic MCP Tool Loading

## Problem

Zoo Code currently loads **all** MCP tool definitions (name, description, full JSON `inputSchema`) into the LLM context on every task. This is implemented in `getMcpServerTools()` (`src/core/prompts/tools/native-tools/mcp_server.ts`), which runs eagerly at task-build time.

With a modest MCP setup (e.g. GitHub MCP with ~27 tools), this costs ~18k tokens before the conversation begins. At 7 servers, users report 67k tokens consumed on tool definitions alone, about 33% of a 200k context budget. As the MCP ecosystem grows, this only gets worse.

A user on a model with a 128-tool hard limit (e.g. GPT-5.4 via OpenAI) hits API failures with just a few tool-heavy MCP servers active.

This is a known pain point across MCP-supporting tools:
- microsoft/vscode#282699
- anthropics/claude-code#11364

## Answer to "does Zoo Code do dynamic or static tool loading?"

**Static.** `McpHub.connectToServer()` calls `tools/list` on connect and caches full definitions. Every task injects all definitions into the context via `getMcpServerTools()`. There is no lazy loading or dynamic discovery today.

## Proposed Solution

A layered approach: first add scoping primitives so users can reduce the candidate pool, then add a ranked dynamic loader that injects only the most relevant tools at task start and lets the model request more via an `mcp_load` tool.

### Architecture

```
Task start
  └─ getMcpServerTools()
       └─ (#571) onlyAllow filter per server
       └─ (#572) blockedMcpServers filter per mode
       └─ existing allowedMcpServers filter (already implemented)
  └─ if scopedTools.length > threshold (#578)
       └─ ToolRouter.attachInitial(taskText, scopedTools, initialK)  ← #577
            └─ EmbeddingsRanker (#576) or Bm25Ranker (#575) fallback
       └─ API tools = nativeTools + attachedMcpTools + mcp_load  ← #579
  └─ rolling window: turns 1–N re-rank and union into attachedMcpToolNames  ← #580
```
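
The threshold gate in the diagram above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the planned implementation: `McpTool`, `GateOptions`, `RankFn`, `selectInitialTools`, and `overlapRank` are hypothetical names, and the ranker is passed in as a plain function rather than going through the `ToolRouter` from #577.

```typescript
// Hypothetical shapes for illustration; the real Zoo Code types may differ.
interface McpTool {
  server: string;
  toolName: string;
  description: string;
}

interface GateOptions {
  threshold: number; // the gate opens only when more scoped tools than this exist
  initialK: number; // how many ranked tools to attach at task start
}

// Stand-in for a ranker: returns tools ordered from most to least relevant.
type RankFn = (taskText: string, tools: McpTool[]) => McpTool[];

function selectInitialTools(
  taskText: string,
  scopedTools: McpTool[],
  rankFn: RankFn,
  opts: GateOptions,
): { attached: McpTool[]; dynamic: boolean } {
  // At or below the threshold, keep the static behaviour: attach everything.
  if (scopedTools.length <= opts.threshold) {
    return { attached: scopedTools, dynamic: false };
  }
  // Above the threshold, attach only the top-K ranked tools.
  // When `dynamic` is true the caller would also expose `mcp_load`.
  const rankedTools = rankFn(taskText, scopedTools);
  return { attached: rankedTools.slice(0, opts.initialK), dynamic: true };
}

// Toy ranker for the demo only: counts words shared with the task text.
const overlapRank: RankFn = (taskText, tools) => {
  const words = new Set(taskText.toLowerCase().split(/\W+/).filter(Boolean));
  const score = (t: McpTool) =>
    t.description.toLowerCase().split(/\W+/).filter((w) => words.has(w)).length;
  return [...tools].sort((a, b) => score(b) - score(a));
};

const demoTools: McpTool[] = [
  { server: "github", toolName: "create_issue", description: "Create a GitHub issue" },
  { server: "fs", toolName: "read_file", description: "Read a file from disk" },
  { server: "db", toolName: "run_query", description: "Run a SQL query" },
];

const gated = selectInitialTools("read the config file", demoTools, overlapRank, {
  threshold: 2,
  initialK: 1,
});
```

With three scoped tools and a threshold of 2, the gate opens and only the best-ranked tool is attached; raising the threshold above the pool size falls back to attaching all tools.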

### Ranker tiers

| Tier | Implementation | When used |
|------|---------------|-----------|
| Tier 1 | `EmbeddingsRanker` — cosine similarity on provider vectors | When `CodeIndexManager` has an embedding provider configured |
| Tier 2 | `Bm25Ranker` — pure-JS BM25 (k1=1.5, b=0.75) | Always available; automatic fallback if Tier 1 fails |
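
For reference, a pure-JS BM25 scorer with the parameters from the table (k1=1.5, b=0.75) might look something like the sketch below. The names `ToolDoc`, `tokenize`, and `bm25Rank` are hypothetical, the tokenizer is deliberately naive, and the IDF variant (with `+1` inside the logarithm to keep scores non-negative) is an assumption; the brief does not say which variant `Bm25Ranker` will use.

```typescript
interface ToolDoc {
  id: string; // e.g. "server/tool" (hypothetical naming)
  text: string; // tool name and description concatenated
}

interface ScoredTool {
  id: string;
  score: number;
}

const K1 = 1.5;
const B = 0.75;

// Naive tokenizer: lowercase, split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function bm25Rank(query: string, docs: ToolDoc[]): ScoredTool[] {
  const entries = docs.map((d) => ({ id: d.id, tokens: tokenize(d.text) }));
  const n = entries.length;
  const totalLen = entries.reduce((sum, e) => sum + e.tokens.length, 0);
  const avgLen = n > 0 ? totalLen / n : 0;

  // Document frequency: how many docs contain each term at least once.
  const df = new Map<string, number>();
  for (const e of entries) {
    new Set(e.tokens).forEach((term) => df.set(term, (df.get(term) ?? 0) + 1));
  }

  const queryTerms = Array.from(new Set(tokenize(query)));

  const scored = entries.map((e) => {
    // Term frequency within this document.
    const tf = new Map<string, number>();
    for (const t of e.tokens) tf.set(t, (tf.get(t) ?? 0) + 1);

    let score = 0;
    for (const term of queryTerms) {
      const f = tf.get(term) ?? 0;
      if (f === 0) continue;
      const nq = df.get(term) ?? 0;
      const idf = Math.log(1 + (n - nq + 0.5) / (nq + 0.5));
      const lengthNorm = 1 - B + B * (e.tokens.length / avgLen);
      score += idf * ((f * (K1 + 1)) / (f + K1 * lengthNorm));
    }
    return { id: e.id, score };
  });

  // Highest score first.
  return scored.sort((a, b) => b.score - a.score);
}

const sampleDocs: ToolDoc[] = [
  { id: "github/create_issue", text: "create_issue Create a new issue in a GitHub repository" },
  { id: "fs/read_file", text: "read_file Read the contents of a file from disk" },
  { id: "db/run_query", text: "run_query Run a SQL query against the database" },
];

const rankedTools = bm25Rank("open a github issue for this bug", sampleDocs);
```

Because the index is just token counts over tool names and descriptions, it needs no provider or network access, which is what makes it suitable as the always-available tier.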

### Rolling window (union-only)

For the first N turns (default 7), ranking re-runs against the latest user message and *unions* results into `attachedMcpToolNames`. Tools are only ever appended — never removed — preserving prompt cache stability. After turn N the set is frozen; the model uses `mcp_load` to fetch additional tools.
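
The union-only update described above could be sketched as follows. `RollingState`, `RankNames`, and `updateAttached` are hypothetical names, and the per-turn cut-off `k` is an assumption; the brief only specifies the window length N and the append-only rule.

```typescript
interface RollingState {
  attachedMcpToolNames: string[]; // ordered, append-only
  turn: number; // 1-based index of the current user turn
}

// Hypothetical ranker signature: returns tool names, most relevant first.
type RankNames = (message: string) => string[];

function updateAttached(
  state: RollingState,
  userMessage: string,
  rankNames: RankNames,
  searchTurns: number, // N in the text above (default 7)
  k: number, // assumed cap on candidates considered per turn
): string[] {
  // After turn N the set is frozen; only mcp_load may extend it.
  if (state.turn > searchTurns) return state.attachedMcpToolNames;

  const result = [...state.attachedMcpToolNames];
  const seen = new Set(result);
  for (const toolName of rankNames(userMessage).slice(0, k)) {
    // Union: skip anything already attached, append the rest in rank order.
    if (!seen.has(toolName)) {
      seen.add(toolName);
      result.push(toolName);
    }
  }
  // Existing entries keep their original positions; nothing is removed.
  return result;
}

const fakeRank: RankNames = () => ["search_issues", "create_issue", "read_file"];

const afterTurn1 = updateAttached(
  { attachedMcpToolNames: ["read_file"], turn: 1 },
  "find the bug report",
  fakeRank,
  7,
  2,
);

const afterTurn8 = updateAttached(
  { attachedMcpToolNames: ["read_file"], turn: 8 },
  "find the bug report",
  fakeRank,
  7,
  2,
);
```

In this sketch the first call appends the two new top-ranked names after `read_file`, while the call on turn 8 returns the list unchanged because the window has closed.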

## Stories

### Wave 0 — Scoping Primitives (independent, no prereqs)
| Issue | Description |
|-------|-------------|
| #571 | `onlyAllow` per-server field (complement to `disabledTools`) |
| #572 | `blockedMcpServers` per-mode field (`allowedMcpServers` allowlist already done) |
| #573 | Override-able tool descriptions per server |
| #574 | Detailed context-usage panel showing per-section token cost |
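
To make the scoping primitives concrete, here is one way the per-server and per-mode filters could compose. The field names come from the issue titles, but the config shapes, the `scopeTools` helper, and the rule that `onlyAllow` and `disabledTools` both apply when present are assumptions made for this sketch.

```typescript
// Hypothetical config shapes; exact types and semantics are assumptions.
interface ServerConfig {
  disabledTools?: string[]; // denylist of tool names
  onlyAllow?: string[]; // proposed in #571: allowlist of tool names
}

interface ModeConfig {
  allowedMcpServers?: string[]; // existing allowlist of server names
  blockedMcpServers?: string[]; // proposed in #572: denylist of server names
}

interface ScopedTool {
  server: string;
  toolName: string;
}

function scopeTools(
  tools: ScopedTool[],
  servers: Record<string, ServerConfig | undefined>,
  mode: ModeConfig,
): ScopedTool[] {
  return tools.filter((t) => {
    // Per-mode server filters.
    if (mode.allowedMcpServers && !mode.allowedMcpServers.includes(t.server)) return false;
    if (mode.blockedMcpServers && mode.blockedMcpServers.includes(t.server)) return false;

    // Per-server tool filters.
    const cfg: ServerConfig = servers[t.server] ?? {};
    if (cfg.onlyAllow && !cfg.onlyAllow.includes(t.toolName)) return false;
    if (cfg.disabledTools && cfg.disabledTools.includes(t.toolName)) return false;
    return true;
  });
}

const allTools: ScopedTool[] = [
  { server: "github", toolName: "create_issue" },
  { server: "github", toolName: "delete_repo" },
  { server: "fs", toolName: "read_file" },
  { server: "slack", toolName: "post_message" },
];

const scoped = scopeTools(
  allTools,
  { github: { onlyAllow: ["create_issue"] } },
  { blockedMcpServers: ["slack"] },
);
```

Here `github` is narrowed to a single tool, `slack` is blocked for the mode, and `fs` passes through untouched because it has no configuration.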

### Wave 1 — Ranker + Dynamic Loader
| Issue | Description | Depends on |
|-------|-------------|------------|
| #575 | Bm25Ranker (pure-JS, always available) | |
| #576 | EmbeddingsRanker with BM25 fallback | #575 |
| #577 | ToolRouter singleton: index lifecycle and search | #575, #576 |
| #578 | Threshold gate + `Task.attachedMcpToolNames` | #577 |
| #579 | `mcp_load` native tool | #578 |
| #580 | Rolling-search window across first N turns | #578, #579 |
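
As a rough picture of what #579 involves, the sketch below shows a possible `mcp_load` definition and handler. The brief does not specify the tool's input schema, so the `names` parameter, the `LoadResult` shape, and `handleMcpLoad` are all hypothetical.

```typescript
// Hypothetical tool definition; the real mcp_load schema is not given in the brief.
const mcpLoadDefinition = {
  name: "mcp_load",
  description: "Attach additional MCP tools to this task by name.",
  inputSchema: {
    type: "object",
    properties: {
      names: { type: "array", items: { type: "string" } },
    },
    required: ["names"],
  },
} as const;

interface LoadResult {
  attached: string[]; // newly attached by this call
  alreadyAttached: string[]; // requested but already present
  unknown: string[]; // not in the scoped candidate pool
}

function handleMcpLoad(
  requested: string[],
  scopedToolNames: ReadonlySet<string>,
  attachedMcpToolNames: string[], // mutated in place, append-only
): LoadResult {
  const result: LoadResult = { attached: [], alreadyAttached: [], unknown: [] };
  for (const toolName of requested) {
    if (!scopedToolNames.has(toolName)) {
      // Scoping filters still apply: the model cannot load a filtered-out tool.
      result.unknown.push(toolName);
    } else if (attachedMcpToolNames.includes(toolName)) {
      result.alreadyAttached.push(toolName);
    } else {
      attachedMcpToolNames.push(toolName);
      result.attached.push(toolName);
    }
  }
  return result;
}

const taskAttached = ["read_file"];
const loadResult = handleMcpLoad(
  ["read_file", "create_issue", "no_such_tool"],
  new Set(["read_file", "create_issue", "run_query"]),
  taskAttached,
);
```

Returning the three buckets separately gives the model enough feedback to correct a misspelled name instead of retrying blindly.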

### Wave 2 — Polish
| Issue | Description | Depends on |
|-------|-------------|------------|
| #581 | Replace `TooManyToolsWarning` with `DynamicToolsStatus` | #578 |
| #582 | Telemetry for dynamic tool loading | #578, #579, #580 |
| #583 | Settings UI + documentation | #578, #579, #580 |

## Delivery Order

```
#571, #572, #573, #574 (all parallel, no prereqs)

#575 ──► #576 ──► #577 ──► #578 ──► #579 ──► #580 ──► #582, #583
                            └──► #581
```

Wave 0 issues can land at any time and are good first-contributor issues.
The #575 → #576 → #577 → #578 → #579 chain is the critical path; #580 and the Wave 2 issues build on it.

## Key Design Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Ranker fallback | BM25 always available; embeddings when provider configured | No setup cost for most users |
| Union-only growth | Tools only appended to `attachedMcpToolNames`, never removed | Preserves prompt cache stability across turns |
| Rolling window | Re-rank for first N turns, then freeze | Handles topic drift in early turns without unbounded growth |
| Kill-switch | `searchTurns: 0` disables dynamic loading | Simple opt-out without removing infrastructure |
| `mcp_load` escape hatch | Model-initiated tool attachment | Lets the model recover from a bad initial ranking |
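
The ranker fallback decision can be illustrated with a small wrapper that tries embeddings first and drops to a lexical ranker when no provider is configured or the call fails. Everything here is a sketch: `Ranker`, `EmbedFn`, `cosine`, and `makeRanker` are hypothetical names, and the embedding call is kept synchronous for brevity even though a real provider call would be asynchronous.

```typescript
// Returns candidates ordered from most to least relevant.
type Ranker = (query: string, candidates: string[]) => string[];

// Hypothetical embedding function; undefined when no provider is configured.
type EmbedFn = ((texts: string[]) => number[][]) | undefined;

function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat zero vectors as having no similarity to anything.
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function makeRanker(embed: EmbedFn, lexicalRanker: Ranker): Ranker {
  return (query, candidates) => {
    if (embed) {
      try {
        // Tier 1: embed the query and all candidates in one batch.
        const vectors = embed([query, ...candidates]);
        const queryVec = vectors[0];
        return candidates
          .map((candidate, i) => ({ candidate, score: cosine(queryVec, vectors[i + 1]) }))
          .sort((x, y) => y.score - x.score)
          .map((x) => x.candidate);
      } catch {
        // Tier 1 failed: fall through to Tier 2.
      }
    }
    // Tier 2: lexical ranking, always available.
    return lexicalRanker(query, candidates);
  };
}
```

Usage would be along the lines of `makeRanker(embedOrUndefined, bm25BasedRanker)`, so callers never need to know which tier answered.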

## Scope Boundaries

**In scope:** VS Code extension only.

**Out of scope:**
- Standalone CLI experience
- Automatic tool eviction (tools are union-only within a task)
- Per-tool embedding caching across restarts (in-memory only for v1)
