[Tracking] Dynamic MCP Tool Loading — Implementation Roadmap #584

@edelauna

Tech Brief: Dynamic MCP Tool Loading

Problem

Zoo Code currently loads all MCP tool definitions (name, description, full JSON inputSchema) into the LLM context on every task. This is implemented in getMcpServerTools() (src/core/prompts/tools/native-tools/mcp_server.ts) which runs eagerly at task-build time.

With a modest MCP setup (e.g. GitHub MCP with ~27 tools) this costs ~18k tokens before the conversation begins. At 7 servers, users report 67k tokens consumed — 33% of a 200k budget — on tool definitions alone. As the MCP ecosystem grows this only gets worse.

A user on a model with a hard 128-tool limit (e.g. GPT-5.4 via OpenAI) hits API failures with only a few tool-rich MCP servers active.

This is a known pain point across MCP-supporting tools.

Answer to "does Zoo Code do dynamic or static tool loading?"

Static. McpHub.connectToServer() calls tools/list on connect and caches full definitions. Every task injects all definitions into the context via getMcpServerTools(). There is no lazy loading or dynamic discovery today.

Proposed Solution

A layered approach: first add scoping primitives so users can reduce the candidate pool, then add a ranked dynamic loader that injects only the most relevant tools at task start and lets the model request more via an mcp_load tool.

Architecture

Task start
  └─ getMcpServerTools()
       └─ (#571) onlyAllow filter per server
       └─ (#572) blockedMcpServers filter per mode
       └─ existing allowedMcpServers filter (already implemented)
  └─ if scopedTools.length > threshold (#578)
       └─ ToolRouter.attachInitial(taskText, scopedTools, initialK)  ← #577
            └─ EmbeddingsRanker (#576) or Bm25Ranker (#575) fallback
       └─ API tools = nativeTools + attachedMcpTools + mcp_load  ← #579
  └─ rolling window: turns 1–N re-rank and union into attachedMcpToolNames  ← #580
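
The threshold gate and initial attachment in the diagram above could look roughly like the sketch below. All type and function names here (McpTool, Ranker, LoaderOptions, selectInitialTools, overlapRanker) are hypothetical, and the below-threshold behaviour (attach everything, omit mcp_load) is an assumption read off the diagram, not a confirmed design.

```typescript
// Hypothetical shapes; the real types in Zoo Code may differ.
interface McpTool {
  server: string;
  name: string;
  description: string;
}

interface Ranker {
  // Returns tools ordered from most to least relevant for the query.
  rank(query: string, tools: McpTool[]): McpTool[];
}

interface LoaderOptions {
  threshold: number; // rank only when more tools than this are in scope
  initialK: number; // how many ranked tools to attach at task start
}

// Decide which MCP tools go into the first API request.
function selectInitialTools(
  taskText: string,
  scopedTools: McpTool[],
  ranker: Ranker,
  opts: LoaderOptions,
): { attached: McpTool[]; includeMcpLoad: boolean } {
  if (scopedTools.length <= opts.threshold) {
    // Assumption: small setups keep the current static behaviour.
    return { attached: scopedTools, includeMcpLoad: false };
  }
  const attached = ranker.rank(taskText, scopedTools).slice(0, opts.initialK);
  return { attached, includeMcpLoad: true };
}

// Stand-in ranker for illustration only: counts shared words.
const overlapRanker: Ranker = {
  rank(query, tools) {
    const words = new Set(query.toLowerCase().split(/\s+/));
    const score = (t: McpTool): number =>
      t.description.toLowerCase().split(/\s+/).filter((w) => words.has(w)).length;
    return tools.slice().sort((a, b) => score(b) - score(a));
  },
};
```

In the actual design the ranker slot would be filled by EmbeddingsRanker or Bm25Ranker via ToolRouter; the word-overlap ranker only keeps the sketch self-contained.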

Ranker tiers

| Tier | Implementation | When used |
| --- | --- | --- |
| Tier 1 | EmbeddingsRanker: cosine similarity on provider vectors | When CodeIndexManager has an embedding provider configured |
| Tier 2 | Bm25Ranker: pure-JS BM25 (k1=1.5, b=0.75) | Always available; automatic fallback if Tier 1 fails |
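
For reference, a minimal sketch of what a pure-JS BM25 ranker over tool descriptions might look like, using the k1=1.5 and b=0.75 parameters from the table. The class name Bm25Sketch, the ToolDoc shape, the tokenizer, and the choice of the non-negative "plus one" IDF variant are all assumptions made for illustration; #575 may choose differently.

```typescript
// Hypothetical document shape: one entry per MCP tool.
interface ToolDoc {
  name: string;
  text: string; // e.g. tool name plus description
}

function tokenize(s: string): string[] {
  return s.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

class Bm25Sketch {
  private readonly k1: number;
  private readonly b: number;
  private readonly docs: { name: string; tf: Map<string, number>; len: number }[] = [];
  private readonly df = new Map<string, number>(); // term -> number of docs containing it
  private readonly avgLen: number;

  constructor(tools: ToolDoc[], k1 = 1.5, b = 0.75) {
    this.k1 = k1;
    this.b = b;
    let totalLen = 0;
    for (const tool of tools) {
      const tokens = tokenize(tool.text);
      const tf = new Map<string, number>();
      for (const tok of tokens) tf.set(tok, (tf.get(tok) ?? 0) + 1);
      tf.forEach((_count, tok) => this.df.set(tok, (this.df.get(tok) ?? 0) + 1));
      this.docs.push({ name: tool.name, tf, len: tokens.length });
      totalLen += tokens.length;
    }
    this.avgLen = tools.length > 0 ? totalLen / tools.length : 0;
  }

  // Returns up to k tools with a positive score, best first.
  search(query: string, k: number): { name: string; score: number }[] {
    const terms = Array.from(new Set(tokenize(query)));
    const n = this.docs.length;
    const scored = this.docs.map((doc) => {
      let score = 0;
      for (const term of terms) {
        const f = doc.tf.get(term) ?? 0;
        if (f === 0) continue;
        const df = this.df.get(term) ?? 0;
        const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
        const norm = 1 - this.b + this.b * (doc.len / this.avgLen);
        score += (idf * f * (this.k1 + 1)) / (f + this.k1 * norm);
      }
      return { name: doc.name, score };
    });
    return scored
      .filter((s) => s.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, k);
  }
}

// Usage with made-up tool descriptions.
const bm25 = new Bm25Sketch([
  { name: "github_create_issue", text: "github_create_issue Create a new issue in a GitHub repository" },
  { name: "slack_post_message", text: "slack_post_message Post a message to a Slack channel" },
  { name: "fs_read_file", text: "fs_read_file Read a file from disk" },
]);
const bm25Top = bm25.search("open a github issue", 2);
```

Because the index is built once from the tool texts and queries only touch term maps, re-ranking on every early turn stays cheap even without an embedding provider.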

Rolling window (union-only)

For the first N turns (default 7), ranking re-runs against the latest user message and unions results into attachedMcpToolNames. Tools are only ever appended — never removed — preserving prompt cache stability. After turn N the set is frozen; the model uses mcp_load to fetch additional tools.
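
The union-only rule can be sketched as a small append-only set. The class and method names below (AttachedToolSet, onUserTurn, load, list) are hypothetical; only the behaviour (append during the first N turns, freeze afterwards, mcp_load still able to add) is taken from the description above.

```typescript
// Append-only set of attached tool names for one task (illustrative sketch).
class AttachedToolSet {
  private readonly names = new Set<string>();
  private readonly searchTurns: number;

  constructor(searchTurns = 7) {
    this.searchTurns = searchTurns;
  }

  // Union this turn's ranked names into the set.
  // Returns only the names that were newly attached.
  onUserTurn(turn: number, rankedNames: string[]): string[] {
    if (turn > this.searchTurns) return []; // window closed: set is frozen
    const added: string[] = [];
    for (const name of rankedNames) {
      if (!this.names.has(name)) {
        this.names.add(name);
        added.push(name);
      }
    }
    return added;
  }

  // Model-initiated attachment (the mcp_load path) works after the freeze too.
  load(name: string): boolean {
    if (this.names.has(name)) return false;
    this.names.add(name);
    return true;
  }

  // Set iteration follows insertion order, so earlier entries never move.
  list(): string[] {
    return Array.from(this.names);
  }
}
```

Keeping insertion order means each turn's tool list is a prefix-extension of the previous one, which is the property that matters for prompt cache stability.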

Stories

Wave 0 — Scoping Primitives (independent, no prereqs)

| Issue | Description |
| --- | --- |
| #571 | onlyAllow per-server field (complement to disabledTools) |
| #572 | blockedMcpServers per-mode field (allowedMcpServers allowlist already done) |
| #573 | Override-able tool descriptions per server |
| #574 | Detailed context-usage panel showing per-section token cost |
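
To make the intended filter semantics of #571 and #572 concrete, here is a rough sketch of how the scoping fields might combine. The config shapes and the function scopeTools are hypothetical, and the precedence when several fields are set at once (every filter must pass) is an assumption, not something the issues specify.

```typescript
// Hypothetical config shapes built from the field names in the table above.
interface ServerConfig {
  onlyAllow?: string[]; // #571: if set, only these tools pass
  disabledTools?: string[]; // existing: these tools never pass
}

interface ModeConfig {
  allowedMcpServers?: string[]; // existing allowlist
  blockedMcpServers?: string[]; // #572: blocklist
}

interface ToolRef {
  server: string;
  name: string;
}

// A tool stays in the candidate pool only if every configured filter lets it through.
function scopeTools(
  tools: ToolRef[],
  servers: Record<string, ServerConfig>,
  mode: ModeConfig,
): ToolRef[] {
  return tools.filter((tool) => {
    if (mode.allowedMcpServers && !mode.allowedMcpServers.includes(tool.server)) return false;
    if (mode.blockedMcpServers && mode.blockedMcpServers.includes(tool.server)) return false;
    const cfg: ServerConfig = servers[tool.server] ?? {};
    if (cfg.onlyAllow && !cfg.onlyAllow.includes(tool.name)) return false;
    if (cfg.disabledTools && cfg.disabledTools.includes(tool.name)) return false;
    return true;
  });
}
```

For example, with onlyAllow set to ["create_issue"] on a GitHub server and blockedMcpServers set to ["jira"] for the current mode, only GitHub's create_issue and tools from other unblocked servers would reach the ranker.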

Wave 1 — Ranker + Dynamic Loader

| Issue | Description | Depends on |
| --- | --- | --- |
| #575 | Bm25Ranker (pure-JS, always available) | none |
| #576 | EmbeddingsRanker with BM25 fallback | #575 |
| #577 | ToolRouter singleton: index lifecycle and search | #575, #576 |
| #578 | Threshold gate + Task.attachedMcpToolNames | #577 |
| #579 | mcp_load native tool | #578 |
| #580 | Rolling-search window across first N turns | #578, #579 |

Wave 2 — Polish

| Issue | Description | Depends on |
| --- | --- | --- |
| #581 | Replace TooManyToolsWarning with DynamicToolsStatus | #578 |
| #582 | Telemetry for dynamic tool loading | #578, #579, #580 |
| #583 | Settings UI + documentation | #578, #579, #580 |

Delivery Order

#571, #572, #573, #574 (all parallel, no prereqs)

#575 ──► #576 ──► #577 ──► #578 ──► #579 ──► #580
                                └──────────────────► #581
                                                     #582
                                                     #583

Wave 0 issues can land at any time and are good first-contributor issues.
The #575 → #576 → #577 → #578 → #579 chain is the critical path.
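
As a cross-check on the ordering, the "Depends on" columns from the Wave 1 and Wave 2 tables can be fed through a simple topological sort. This is only an illustrative sketch; the function name deliveryOrder is made up, and ties within a round are broken by issue number purely for determinism.

```typescript
// Dependencies copied from the Wave 1 and Wave 2 tables: issue -> prerequisites.
const dependsOn: Record<number, number[]> = {
  575: [],
  576: [575],
  577: [575, 576],
  578: [577],
  579: [578],
  580: [578, 579],
  581: [578],
  582: [578, 579, 580],
  583: [578, 579, 580],
};

// Kahn-style topological sort: repeatedly emit issues with no open prerequisites.
function deliveryOrder(deps: Record<number, number[]>): number[] {
  const remaining = new Map<number, Set<number>>();
  for (const key of Object.keys(deps)) {
    remaining.set(Number(key), new Set(deps[Number(key)]));
  }
  const order: number[] = [];
  while (remaining.size > 0) {
    const ready = Array.from(remaining.keys())
      .filter((issue) => (remaining.get(issue) as Set<number>).size === 0)
      .sort((a, b) => a - b);
    if (ready.length === 0) throw new Error("dependency cycle");
    for (const issue of ready) {
      order.push(issue);
      remaining.delete(issue);
    }
    // Mark the emitted issues as satisfied for everything still waiting.
    remaining.forEach((prereqs) => ready.forEach((issue) => prereqs.delete(issue)));
  }
  return order;
}

const order = deliveryOrder(dependsOn);
// order: [575, 576, 577, 578, 579, 581, 580, 582, 583]
```

The output shows #581 becoming unblocked as soon as #578 lands, while #582 and #583 have to wait for #580.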

Key Design Decisions

| Decision | Choice | Rationale |
| --- | --- | --- |
| Ranker fallback | BM25 always available; embeddings when provider configured | No setup cost for most users |
| Union-only growth | Tools only appended to attachedMcpToolNames, never removed | Preserves prompt cache stability across turns |
| Rolling window | Re-rank for first N turns, then freeze | Handles topic drift in early turns without unbounded growth |
| Kill-switch | searchTurns: 0 disables dynamic loading | Simple opt-out without removing infrastructure |
| mcp_load escape hatch | Model-initiated tool attachment | Lets the model recover from a bad initial ranking |
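
The "Ranker fallback" decision could be sketched as follows: try cosine similarity over provider embeddings, and drop to the lexical ranker if no provider is configured or the call fails. The EmbedFn signature, the RankTool shape, and rankWithFallback are hypothetical stand-ins; the real embedding provider interface exposed through CodeIndexManager is not described in this brief.

```typescript
// Hypothetical provider signature: one vector per input text, same order.
type EmbedFn = (texts: string[]) => Promise<number[][]>;

interface RankTool {
  name: string;
  text: string;
}

function cosine(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat zero vectors as having no similarity instead of dividing by zero.
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Tier 1 when an embedding function is available and succeeds, otherwise Tier 2.
async function rankWithFallback(
  query: string,
  tools: RankTool[],
  embed: EmbedFn | undefined,
  lexicalRank: (query: string, tools: RankTool[]) => string[],
): Promise<{ tier: 1 | 2; names: string[] }> {
  if (embed) {
    try {
      const vectors = await embed([query, ...tools.map((t) => t.text)]);
      if (vectors.length !== tools.length + 1) throw new Error("unexpected embedding count");
      const queryVec = vectors[0];
      const names = tools
        .map((t, i) => ({ name: t.name, score: cosine(queryVec, vectors[i + 1]) }))
        .sort((a, b) => b.score - a.score)
        .map((s) => s.name);
      return { tier: 1, names };
    } catch {
      // Any provider failure falls through to the lexical ranker.
    }
  }
  return { tier: 2, names: lexicalRank(query, tools) };
}
```

Passing the lexical ranker in as a function keeps the fallback path free of any provider dependency, which matches the "always available" requirement for Tier 2.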

Scope Boundaries

In scope: VS Code extension only.

Out of scope:

  • Standalone CLI experience
  • Automatic tool eviction (tools are union-only within a task)
  • Per-tool embedding caching across restarts (in-memory only for v1)

Metadata

Labels: enhancement (New feature or request)
