Tech Brief: Dynamic MCP Tool Loading
Problem
Zoo Code currently loads all MCP tool definitions (name, description, full JSON inputSchema) into the LLM context on every task. This is implemented in getMcpServerTools() (src/core/prompts/tools/native-tools/mcp_server.ts), which runs eagerly at task-build time.
With a modest MCP setup (e.g. GitHub MCP with ~27 tools) this costs ~18k tokens before the conversation begins. At 7 servers, users report 67k tokens consumed — 33% of a 200k budget — on tool definitions alone. As the MCP ecosystem grows this only gets worse.
A user on a model with a 128-tool hard limit (e.g. GPT-5.4 via OpenAI) hits API failures with just a few tool-heavy MCP servers active.
This is a known pain point across MCP-supporting tools.
Answer to "does Zoo Code do dynamic or static tool loading?"
Static. McpHub.connectToServer() calls tools/list on connect and caches full definitions. Every task injects all definitions into the context via getMcpServerTools(). There is no lazy loading or dynamic discovery today.
Proposed Solution
A layered approach: first add scoping primitives so users can reduce the candidate pool, then add a ranked dynamic loader that injects only the most relevant tools at task start and lets the model request more via an mcp_load tool.
Architecture
Task start
  └─ getMcpServerTools()
       └─ (#571) onlyAllow filter per server
       └─ (#572) blockedMcpServers filter per mode
       └─ existing allowedMcpServers filter (already implemented)
       └─ if scopedTools.length > threshold (#578)
            └─ ToolRouter.attachInitial(taskText, scopedTools, initialK) ← #577
                 └─ EmbeddingsRanker (#576) or Bm25Ranker (#575) fallback
            └─ API tools = nativeTools + attachedMcpTools + mcp_load ← #579
            └─ rolling window: turns 1–N re-rank and union into attachedMcpToolNames ← #580
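The filter order in the tree above can be sketched as a single pass over the tool list. This is a minimal TypeScript sketch under stated assumptions, not Zoo Code's implementation: the McpTool, ServerConfig and ModeConfig shapes, the scopeTools and needsDynamicLoading names, and the rule that a blocked server wins over an allowed one are hypothetical. Only the field names onlyAllow, disabledTools, allowedMcpServers and blockedMcpServers come from the brief.

```typescript
// Hypothetical shapes: the brief names these fields but not their types.
interface McpTool {
  server: string;
  name: string;
}

interface ServerConfig {
  onlyAllow?: string[]; // #571: if present, only these tools pass
  disabledTools?: string[]; // existing per-server denylist
}

interface ModeConfig {
  allowedMcpServers?: string[]; // existing per-mode allowlist
  blockedMcpServers?: string[]; // #572: per-mode denylist
}

// Apply every scoping primitive in one pass and return the candidate pool.
function scopeTools(
  tools: McpTool[],
  servers: Record<string, ServerConfig>,
  mode: ModeConfig,
): McpTool[] {
  return tools.filter((tool) => {
    const cfg: ServerConfig = servers[tool.server] || {};
    if (cfg.onlyAllow && cfg.onlyAllow.indexOf(tool.name) === -1) return false;
    if (cfg.disabledTools && cfg.disabledTools.indexOf(tool.name) !== -1) return false;
    // Assumption: a blocked server is excluded even if it is also allowed.
    if (mode.blockedMcpServers && mode.blockedMcpServers.indexOf(tool.server) !== -1) return false;
    if (mode.allowedMcpServers && mode.allowedMcpServers.indexOf(tool.server) === -1) return false;
    return true;
  });
}

// Threshold gate (#578): ranking only runs when the scoped pool is too large.
function needsDynamicLoading(scoped: McpTool[], threshold: number): boolean {
  return scoped.length > threshold;
}
```

When the pool stays at or below the threshold, every scoped tool would be sent as today, so small setups see no behavior change.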
Ranker tiers
| Tier | Implementation | When used |
|---|---|---|
| Tier 1 | EmbeddingsRanker: cosine similarity on provider vectors | When CodeIndexManager has an embedding provider configured |
| Tier 2 | Bm25Ranker: pure-JS BM25 (k1=1.5, b=0.75) | Always available; automatic fallback if Tier 1 fails |
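To make the Tier 2 row concrete, here is a minimal sketch of BM25 scoring over tool names and descriptions, using the k1=1.5 and b=0.75 values from the table. It is not the planned Bm25Ranker: the ToolDoc shape, the tokenize and bm25Rank names, the simple tokenizer, and the choice of the non-negative IDF variant ln(1 + (N - n + 0.5) / (n + 0.5)) are assumptions made for illustration.

```typescript
// Hypothetical document shape: one entry per MCP tool.
interface ToolDoc {
  name: string;
  description: string;
}

// Lowercase and split on non-alphanumerics; a real tokenizer may differ.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Score every tool against the query and return them best first.
function bm25Rank(
  query: string,
  docs: ToolDoc[],
  k1: number = 1.5,
  b: number = 0.75,
): { name: string; score: number }[] {
  const indexed = docs.map((d) => ({
    name: d.name,
    tokens: tokenize(d.name + " " + d.description),
  }));
  const totalLen = indexed.reduce((sum, d) => sum + d.tokens.length, 0);
  const avgLen = indexed.length > 0 ? totalLen / indexed.length : 0;

  // Document frequency: number of tools whose text contains each term.
  const df = new Map<string, number>();
  indexed.forEach((d) => {
    new Set(d.tokens).forEach((t) => df.set(t, (df.get(t) || 0) + 1));
  });

  const queryTerms = Array.from(new Set(tokenize(query)));
  const scored = indexed.map((d) => {
    let score = 0;
    queryTerms.forEach((term) => {
      const tf = d.tokens.filter((t) => t === term).length;
      if (tf === 0) return; // term absent: contributes nothing
      const n = df.get(term) || 0;
      const idf = Math.log(1 + (indexed.length - n + 0.5) / (n + 0.5));
      const lengthNorm = 1 - b + (b * d.tokens.length) / avgLen;
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    });
    return { name: d.name, score };
  });
  return scored.sort((x, y) => y.score - x.score);
}
```

Because the scorer needs only the tool text already cached from tools/list, it fits the "always available" role in the table without any provider setup.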
Rolling window (union-only)
For the first N turns (default 7), ranking re-runs against the latest user message and unions results into attachedMcpToolNames. Tools are only ever appended — never removed — preserving prompt cache stability. After turn N the set is frozen; the model uses mcp_load to fetch additional tools.
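The union-only window described above can be sketched as a small stateful helper. This is an illustrative sketch, not the planned Task code: the AttachedToolSet class, the Ranker signature, and the k parameter are hypothetical. Only attachedMcpToolNames, the default of 7 turns, and the append-only rule come from the brief.

```typescript
// Hypothetical ranker signature: returns up to k tool names, best first.
type Ranker = (message: string, k: number) => string[];

class AttachedToolSet {
  // Stands in for Task.attachedMcpToolNames; order is insertion order.
  private readonly names: string[] = [];
  private readonly rank: Ranker;
  private readonly k: number;
  private readonly searchTurns: number;

  constructor(rank: Ranker, k: number, searchTurns: number = 7) {
    this.rank = rank;
    this.k = k;
    this.searchTurns = searchTurns;
  }

  // Call once per user turn (1-based). Re-ranks only inside the window.
  onUserTurn(turn: number, message: string): string[] {
    if (turn <= this.searchTurns) {
      this.rank(message, this.k).forEach((name) => this.attach(name));
    }
    // After the window closes the set only changes through attach().
    return this.names.slice();
  }

  // Append-only: existing entries keep their position, nothing is removed.
  attach(name: string): void {
    if (this.names.indexOf(name) === -1) this.names.push(name);
  }
}
```

Since new names are only pushed to the end, the list sent on turn t is always a prefix of the list sent on turn t+1, which is the property that keeps a cached prompt prefix valid. With searchTurns set to 0 the condition is never true, so no ranking runs in this sketch.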
Stories
Wave 0 — Scoping Primitives (independent, no prereqs)
| Issue | Description |
|---|---|
| #571 | onlyAllow per-server field (complement to disabledTools) |
| #572 | blockedMcpServers per-mode field (allowedMcpServers allowlist already done) |
| #573 | Override-able tool descriptions per server |
| #574 | Detailed context-usage panel showing per-section token cost |
Wave 1 — Ranker + Dynamic Loader
| Issue | Description | Depends on |
|---|---|---|
| #575 | Bm25Ranker (pure-JS, always available) | |
| #576 | EmbeddingsRanker with BM25 fallback | #575 |
| #577 | ToolRouter singleton: index lifecycle and search | #575, #576 |
| #578 | Threshold gate + Task.attachedMcpToolNames | #577 |
| #579 | mcp_load native tool | #578 |
| #580 | Rolling-search window across first N turns | #578, #579 |
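As an illustration of how #579 might sit on top of #577 and #578, the sketch below handles an mcp_load call by searching the scoped pool and appending any new matches. The brief does not specify the tool's input schema, so everything here is an assumption: the McpLoadInput and McpLoadResult shapes, the free-text query parameter, the default limit of 5, and the handleMcpLoad name are hypothetical.

```typescript
// Hypothetical input: the model describes what it needs in free text.
interface McpLoadInput {
  query: string;
  limit?: number;
}

// Hypothetical result reported back to the model.
interface McpLoadResult {
  attached: string[]; // newly attached by this call
  alreadyAttached: string[]; // matched but already present
}

// search stands in for a ToolRouter lookup: returns up to k names, best first.
function handleMcpLoad(
  input: McpLoadInput,
  search: (query: string, k: number) => string[],
  attachedMcpToolNames: string[],
): McpLoadResult {
  const limit = input.limit !== undefined ? input.limit : 5;
  const result: McpLoadResult = { attached: [], alreadyAttached: [] };
  search(input.query, limit).forEach((name) => {
    if (attachedMcpToolNames.indexOf(name) !== -1) {
      result.alreadyAttached.push(name);
    } else {
      attachedMcpToolNames.push(name); // union-only: append, never remove
      result.attached.push(name);
    }
  });
  return result;
}
```

Taking a query rather than exact tool names is a design guess: the model cannot see the names of tools that were never attached, so a search-style input gives it a way to ask for them.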
Wave 2 — Polish
TooManyToolsWarning with DynamicToolsStatus
Delivery Order
#571, #572, #573, #574 (all parallel, no prereqs)
#575 ──► #576 ──► #577 ──► #578 ──► #579 ──► #580
└──────────────────► #581
#582
#583
Wave 0 issues can land at any time and are good first-contributor issues.
The #575 → #576 → #577 → #578 → #579 chain is the critical path.
Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Ranker fallback | BM25 always available; embeddings when provider configured | No setup cost for most users |
| Union-only growth | Tools only appended to attachedMcpToolNames, never removed | Preserves prompt cache stability across turns |
| Rolling window | Re-rank for first N turns, then freeze | Handles topic drift in early turns without unbounded growth |
| Kill-switch | searchTurns: 0 disables dynamic loading | Simple opt-out without removing infrastructure |
| mcp_load escape hatch | Model-initiated tool attachment | Lets the model recover from a bad initial ranking |
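The ranker-fallback decision can be sketched as a wrapper that tries embeddings first and drops to the lexical ranker when no provider is configured or the provider throws. This is a sketch under stated assumptions rather than the planned EmbeddingsRanker: the ToolInfo, EmbedFn, LexicalFn and RankedResult types and the cosine and rankWithFallback names are hypothetical, and the embedding call is kept synchronous for brevity where a real provider call would be asynchronous.

```typescript
interface ToolInfo {
  name: string;
  description: string;
}

// Hypothetical provider hook: one vector per input text, same order.
type EmbedFn = (texts: string[]) => number[][];
// Hypothetical Tier 2 hook (for example a BM25 ranker): names, best first.
type LexicalFn = (query: string, tools: ToolInfo[]) => string[];

interface RankedResult {
  tier: 1 | 2;
  names: string[];
}

// Cosine similarity; returns 0 when either vector has zero length or norm.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    dot += x * (b[i] || 0);
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function rankWithFallback(
  query: string,
  tools: ToolInfo[],
  lexical: LexicalFn,
  embed?: EmbedFn,
): RankedResult {
  if (embed) {
    try {
      // Embed the query and every tool description in one batch.
      const vectors = embed([query].concat(tools.map((t) => t.description)));
      const queryVec = vectors[0] || [];
      const scored = tools.map((t, i) => ({
        name: t.name,
        score: cosine(queryVec, vectors[i + 1] || []),
      }));
      scored.sort((x, y) => y.score - x.score);
      return { tier: 1, names: scored.map((s) => s.name) };
    } catch {
      // Tier 1 failed: fall through to the always-available ranker.
    }
  }
  return { tier: 2, names: lexical(query, tools) };
}
```

Returning the tier alongside the names is optional, but it would let a status UI or log show which ranker actually produced the attached set.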
Scope Boundaries
In scope: VS Code extension only.
Out of scope:
- Standalone CLI experience
- Automatic tool eviction (tools are union-only within a task)
- Per-tool embedding caching across restarts (in-memory only for v1)