feat: curated MCP tool surface over the REST API #29
The MCP catalog is built from a small curated FastAPI app (mcp_app) with 24 well-described tools, instead of auto-generating one tool per REST route (95 of them), which floods agent context and hurts tool selection. Related routes fold behind a dataset/kind selector, and an optional code-mode tool stays gated off by default (FINDATA_MCP_CODE_MODE). The 95 REST routes are untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
📝 Walkthrough
Added a curated MCP FastAPI app, wired it into the main
Changes
Curated MCP Surface
Sequence Diagram(s)
sequenceDiagram
participant Client
participant findata.api.app
participant FastApiMCP
participant findata.api.mcp_app
participant findata.registry.lookup
Client->>findata.api.app: /mcp request
findata.api.app->>FastApiMCP: handle mounted transport
FastApiMCP->>findata.api.mcp_app: dispatch /registry/lookup
findata.api.mcp_app->>findata.registry.lookup: lookup(q, limit)
findata.registry.lookup-->>findata.api.mcp_app: matches
findata.api.mcp_app-->>FastApiMCP: tool response
FastApiMCP-->>Client: MCP result
sequenceDiagram
participant Client
participant findata.api.mcp_app
participant asyncio.create_subprocess_exec
participant Python_I_child_process
Client->>findata.api.mcp_app: POST /run-code
findata.api.mcp_app->>asyncio.create_subprocess_exec: launch python -I
asyncio.create_subprocess_exec->>Python_I_child_process: execute snippet
Python_I_child_process-->>findata.api.mcp_app: stdout, stderr, exit code
findata.api.mcp_app-->>Client: structured result
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Code Review
This pull request introduces a curated Model Context Protocol (MCP) surface to consolidate the 94 fine-grained REST routes into 24 well-described tools, optimizing the catalog size for AI agents. It also adds configuration settings, documentation, and comprehensive tests to guard this new surface. The feedback highlights critical improvements: ensuring subprocesses in the optional code-execution tool are terminated on cancellation to prevent resource leaks, validating that both start and end dates are provided for PTAX range queries, and adding defensive checks for null values in the ANBIMA debentures endpoint.
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except TimeoutError:
        proc.kill()
        await proc.wait()
        return {"timed_out": True, "exit_code": None, "output": f"(killed: exceeded {timeout}s)"}
If the request is cancelled (e.g., due to client disconnect) or another exception occurs during proc.communicate(), the subprocess will be orphaned and continue running in the background. This can lead to resource leaks and high CPU/memory usage. Wrapping the execution in a try...except BaseException block ensures that the subprocess is properly terminated and reaped under all circumstances.
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except TimeoutError:
        try:
            proc.kill()
        except ProcessLookupError:
            pass
        await proc.wait()
        return {"timed_out": True, "exit_code": None, "output": f"(killed: exceeded {timeout}s)"}
    except BaseException:
        try:
            proc.kill()
        except ProcessLookupError:
            pass
        await proc.wait()
        raise
    if start is not None and end is not None:
        if currency.upper() != "USD":
            raise HTTPException(400, "Range queries are USD-only; use `date` for other currencies")
        return await ptax.get_ptax_usd_period(start, end)
If only one of start or end is provided, the range query condition start is not None and end is not None evaluates to False. The function then silently falls through to a single-date query, ignoring the provided parameter. To prevent unexpected behavior, we should explicitly validate that both parameters are provided if either is present.
    if start is not None or end is not None:
        if start is None or end is None:
            raise HTTPException(400, "Both start and end must be provided for range queries")
        if currency.upper() != "USD":
            raise HTTPException(400, "Range queries are USD-only; use date for other currencies")
        return await ptax.get_ptax_usd_period(start, end)
    if dataset == "debentures":
        rows = await anbima_src.get_debentures(data)
        if emissor:
            needle = emissor.upper()
            rows = [r for r in rows if needle in r.emissor.upper()]
        return rows[:limit]
If get_debentures returns None (e.g., due to missing or empty external data for the given date), slicing or iterating over rows will raise a TypeError. Additionally, if any debenture has a missing or None issuer name, calling r.emissor.upper() will raise an AttributeError. Adding defensive checks prevents potential 500 Internal Server Errors.
    if dataset == "debentures":
        rows = await anbima_src.get_debentures(data)
        if not rows:
            return []
        if emissor:
            needle = emissor.upper()
            rows = [r for r in rows if r.emissor and needle in r.emissor.upper()]
        return rows[:limit]
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/MCP_SURFACE.md`:
- Around line 111-115: The example flow for findata_run_code is describing it as
a sandbox, which conflicts with the security guidance. Update the Example flows
section in MCP_SURFACE.md to present findata_run_code as an opt-in child
interpreter/execution environment instead of a sandbox, and adjust the
surrounding example wording so it matches the security section while keeping
registry_lookup and bcb_ptax unchanged.
- Around line 58-72: The fenced block in the MCP surface document is missing a
language tag, which triggers MD040. Update the fence around the registry and
source list to use a plain text tag so the snippet remains tooling-friendly, and
keep the surrounding content unchanged.
In `@src/findata/api/app.py`:
- Around line 151-167: The public /mcp transport mounted via
FastApiMCP.mount_http on app must not expose code execution when
findata_run_code is enabled in mcp_app. Move code mode off the public transport
entirely or gate it behind a separate authenticated/private-admin mount, and
ensure the FastApiMCP setup for mcp_app does not allow arbitrary payload.code
execution through the public app.
In `@src/findata/api/mcp_app.py`:
- Around line 831-860: _execute_code currently uses tempfile.gettempdir() as a
shared cwd, which lets separate runs interfere with each other’s temp files.
Update _execute_code in src/findata/api/mcp_app.py to create and use a unique
per-request working directory for each invocation, pass that directory as cwd to
asyncio.create_subprocess_exec, and ensure it is cleaned up after
proc.communicate() finishes or on timeout; keep the change localized to
_execute_code and any helper you add for managing the temporary directory.
- Around line 131-147: In bcb_ptax, partial range inputs are currently accepted
because only the case where both start and end are set is handled, so a request
with just one of them silently falls back to the single-day path. Update
bcb_ptax to explicitly detect when exactly one of start or end is provided and
raise an HTTPException(400) before calling ptax.get_ptax_usd_period, while
keeping the existing USD-only range behavior and single-day logic unchanged.
In `@tests/test_mcp_surface.py`:
- Around line 102-105: The current test for the cvm fund holdings endpoint only
verifies the missing cnpj path, so it does not enforce the month requirement
named in test_cvm_fund_holdings_requires_cnpj_and_month. Update the test to
include a second request through TestClient(mcp_app).get("/cvm/fund", ...) with
dataset=holdings and cnpj present but month omitted, and assert it returns 400
with a detail mentioning month so both required parameters are covered.
📒 Files selected for processing (6)
- AGENTS.md
- docs/MCP_SURFACE.md
- pyproject.toml
- src/findata/api/app.py
- src/findata/api/mcp_app.py
- tests/test_mcp_surface.py
```
registry_lookup ← start here: CNPJ / ticker / code / name → entities

bcb_series bcb_ptax bcb_focus (BCB: 12 → 3)
cvm_company cvm_financials cvm_fund cvm_structured_fund (CVM: 22 → 4)
b3_quote b3_cotahist b3_index (B3: 9 → 3)
tesouro_bonds tesouro_siconfi (Tesouro: 6 → 2)
ibge_indicator ibge_ipca_breakdown (IBGE: 4 → 2)
ipea_series ipea_search (IPEA: 4 → 2)
anbima (ANBIMA: 3 → 1)
openfinance_directory (Open Finance: 15 → 1)
basedosdados_search basedosdados_sql (BdD: 7 → 2)
receita_arrecadacao aneel_leiloes susep_empresas
findata_run_code (code mode, opt-in)
```
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Add a language tag to this fenced block.
This trips MD040 and makes the snippet less tooling-friendly. `text` would be enough here.
Suggested patch
-```
+```text
registry_lookup ← start here: CNPJ / ticker / code / name → entities
@@
findata_run_code (code mode, opt-in)
-```
+```
As per coding guidelines, `**/*.md`: Keep repository-facing Markdown disciplined and functional.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 58-58: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
Sources: Coding guidelines, Linters/SAST tools
 from findata.api.mcp_app import mcp_app

 # The MCP tool catalog is built from the *curated* `mcp_app` (a separate
 # FastAPI app, ~24 well-described tools), not from the public `app` — that
 # would expose one near-duplicate tool per REST route (~94) and bloat every
 # agent's context. `mount_http(router=app)` serves the /mcp transport on the
 # public app, while the tools are generated from and executed against
 # `mcp_app` (via its ASGI transport). The 94 REST routes stay untouched.
 _mcp = FastApiMCP(
-    app,
+    mcp_app,
     name=_PROJECT_SLUG,
     description=(
         f"{_PROJECT_STATEMENT} MCP para BCB, CVM, B3, IBGE, IPEA, "
         "Tesouro, Base dos Dados, Open Finance e gráficos experimentais."
     ),
 )
-_mcp.mount_http()  # Serves MCP at /mcp (fastapi-mcp >=0.4)
+_mcp.mount_http(router=app)  # Serves MCP at /mcp (fastapi-mcp >=0.4)
🔒 Security & Privacy | 🔴 Critical | 🏗️ Heavy lift
The public /mcp mount becomes unauthenticated RCE when code mode is enabled.
Because this transport is mounted on the public app, enabling findata_run_code in src/findata/api/mcp_app.py exposes arbitrary payload.code execution to any MCP client that can reach /mcp. FINDATA_MCP_CODE_MODE is only a feature flag; it is not an access-control boundary.
Please keep code mode off the public transport entirely, or put it behind an explicit auth/private-admin mount before merge.
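One way to read the second option, sketched under stated assumptions: treat the flag as necessary but not sufficient, and also require a shared secret before code mode is reachable. `code_mode_allowed` and the `FINDATA_MCP_CODE_TOKEN` variable are hypothetical names, and the set of truthy flag values is a guess; only `FINDATA_MCP_CODE_MODE` comes from the PR.

```python
import hmac
import os
from typing import Mapping, Optional

# Assumption: which strings count as "enabled" is not specified in the PR.
_TRUTHY = {"1", "true", "yes"}


def code_mode_allowed(
    presented_token: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return True only if the feature flag is on AND the caller knows the secret."""
    env = os.environ if env is None else env

    # The feature flag alone is not an access-control boundary.
    if env.get("FINDATA_MCP_CODE_MODE", "").lower() not in _TRUTHY:
        return False

    # Hypothetical variable holding a secret configured by the operator.
    expected = env.get("FINDATA_MCP_CODE_TOKEN", "")
    if not expected or not presented_token:
        return False

    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(presented_token.encode(), expected.encode())
```

A handler could call such a check before dispatching to `_execute_code` and respond with 403 otherwise; a separate private mount, as the comment also suggests, would avoid exposing the tool on the public transport at all.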
async def _execute_code(code: str, timeout_s: int) -> dict[str, Any]:
    """Run ``code`` in an isolated child interpreter, capturing combined output.

    PROTOTYPE — this is NOT a security sandbox: the child runs arbitrary Python
    with full library and network access. It is gated off by default and intended
    for trusted, local/agent use only.
    """
    timeout = max(1, min(timeout_s, _CODE_TIMEOUT_MAX))
    proc = await asyncio.create_subprocess_exec(
        sys.executable,
        "-I",  # isolated mode: ignore env vars and user site, don't add cwd to path
        "-c",
        code,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
        cwd=tempfile.gettempdir(),
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except TimeoutError:
        proc.kill()
        await proc.wait()
        return {"timed_out": True, "exit_code": None, "output": f"(killed: exceeded {timeout}s)"}
    text = stdout.decode("utf-8", errors="replace")
    return {
        "timed_out": False,
        "exit_code": proc.returncode,
        "truncated": len(text) > _CODE_OUTPUT_CAP,
        "output": text[:_CODE_OUTPUT_CAP],
    }
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Use a per-request working directory for code execution.
tempfile.gettempdir() points every run at the same shared cwd. One invocation can read or clobber another invocation's temp artifacts, which breaks the isolation this feature is trying to provide.
Suggested patch
 async def _execute_code(code: str, timeout_s: int) -> dict[str, Any]:
@@
     timeout = max(1, min(timeout_s, _CODE_TIMEOUT_MAX))
-    proc = await asyncio.create_subprocess_exec(
-        sys.executable,
-        "-I",  # isolated mode: ignore env vars and user site, don't add cwd to path
-        "-c",
-        code,
-        stdout=asyncio.subprocess.PIPE,
-        stderr=asyncio.subprocess.STDOUT,
-        cwd=tempfile.gettempdir(),
-    )
-    try:
-        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
-    except TimeoutError:
-        proc.kill()
-        await proc.wait()
-        return {"timed_out": True, "exit_code": None, "output": f"(killed: exceeded {timeout}s)"}
+    with tempfile.TemporaryDirectory(prefix="findata-mcp-") as workdir:
+        proc = await asyncio.create_subprocess_exec(
+            sys.executable,
+            "-I",  # isolated mode: ignore env vars and user site, don't add cwd to path
+            "-c",
+            code,
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.STDOUT,
+            cwd=workdir,
+        )
+        try:
+            stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
+        except TimeoutError:
+            proc.kill()
+            await proc.wait()
+            return {"timed_out": True, "exit_code": None, "output": f"(killed: exceeded {timeout}s)"}
📝 Committable suggestion
async def _execute_code(code: str, timeout_s: int) -> dict[str, Any]:
    """Run ``code`` in an isolated child interpreter, capturing combined output.

    PROTOTYPE — this is NOT a security sandbox: the child runs arbitrary Python
    with full library and network access. It is gated off by default and intended
    for trusted, local/agent use only.
    """
    timeout = max(1, min(timeout_s, _CODE_TIMEOUT_MAX))
    with tempfile.TemporaryDirectory(prefix="findata-mcp-") as workdir:
        proc = await asyncio.create_subprocess_exec(
            sys.executable,
            "-I",  # isolated mode: ignore env vars and user site, don't add cwd to path
            "-c",
            code,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,
            cwd=workdir,
        )
        try:
            stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
        except TimeoutError:
            proc.kill()
            await proc.wait()
            return {"timed_out": True, "exit_code": None, "output": f"(killed: exceeded {timeout}s)"}
    text = stdout.decode("utf-8", errors="replace")
    return {
        "timed_out": False,
        "exit_code": proc.returncode,
        "truncated": len(text) > _CODE_OUTPUT_CAP,
        "output": text[:_CODE_OUTPUT_CAP],
    }
def test_cvm_fund_holdings_requires_cnpj_and_month() -> None:
    r = TestClient(mcp_app).get("/cvm/fund", params={"dataset": "holdings", "year": 2024})
    assert r.status_code == 400
    assert "cnpj" in r.json()["detail"]
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Cover the missing month validation explicitly.
This only proves /cvm/fund?dataset=holdings rejects requests without cnpj. If the handler stopped requiring month, this test would still pass. Add a second case with cnpj present and month omitted so the test matches its name.
…urface Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- cvm_fund dataset=returns, cvm_company dataset=search and anbima dataset=ima now apply the documented [:limit] like their sibling branches, so an agent cannot pull the whole-market dataset by omitting a filter.
- b3_quote install hint said 'findata-br[b3]'; the package is 'openfindata', matching the REST router and source message.
- cvm_fund dataset=periods uses the public list_periods re-export instead of reaching into the private _directory module.
- blocks and market_codes splits drop empty elements (trailing comma) like the tickers split already did.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
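The trailing-comma fix in the last bullet can be pictured with a small sketch. `split_csv` is a hypothetical helper name used only for illustration; the PR does not show how the real splits are written.

```python
def split_csv(raw: str) -> list[str]:
    """Split a comma-separated parameter, dropping empty or whitespace-only parts."""
    # "PETR4,VALE3," splits into ["PETR4", "VALE3", ""]; the filter removes the "".
    return [part.strip() for part in raw.split(",") if part.strip()]
```

Without the filter, a trailing comma would pass an empty string downstream as if it were a ticker, block, or market code.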
- tesouro_siconfi rejects an RGF period outside 1-3 (the quadrimestre range) with a 400 instead of querying SICONFI with a bad period.
- bcb_focus rejects panel=top5 with horizon=monthly (Top-5 is annual-only) instead of silently downgrading to annual.
- cvm_structured_fund rejects a dataset for kind=fip (FIP has no facet).
- cvm_fund product is now a Literal, so the schema rejects unknown values.
- code-mode: the child runs in its own process group and a timeout kills the whole tree (start_new_session + killpg), so a spawned grandchild cannot orphan past the timeout. Reading output fully before the cap stays a documented limit.
- _MIN_YEAR_B3_COTAHIST replaces the mislabeled _MIN_YEAR_BCB_SGS for the cotahist lower bound.
- 3 offline tests for the new validations.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
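The process-group behaviour described in the code-mode bullet might look roughly like the sketch below. It is not the PR's implementation: `run_with_tree_kill` is a hypothetical name, the return shape is simplified, and the approach assumes a POSIX host, since `os.killpg` is not available on Windows.

```python
import asyncio
import os
import signal
import sys
from typing import Any


async def run_with_tree_kill(code: str, timeout: float) -> dict[str, Any]:
    """Run `code` in a child interpreter; on timeout, kill its whole process group."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable,
        "-I",
        "-c",
        code,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
        # The child calls setsid(), so it leads a new process group whose
        # id equals its own pid; grandchildren inherit that group.
        start_new_session=True,
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        try:
            # Signal the group, not just the child, so grandchildren die too.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group already exited between the timeout and the kill
        await proc.wait()  # reap the child so it does not linger as a zombie
        return {"timed_out": True, "exit_code": None}
    return {
        "timed_out": False,
        "exit_code": proc.returncode,
        "output": stdout.decode("utf-8", errors="replace"),
    }
```

Calling `asyncio.run(run_with_tree_kill("import time; time.sleep(30)", 0.5))` should come back after roughly half a second with `timed_out` set, instead of waiting for the sleep to finish.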
Curated MCP surface
Replaces the 1:1 auto-generated MCP catalog (one tool per REST route, 95 tools) with 24 hand-curated tools that dispatch to the same `findata.sources.*` functions the REST routers use. Related routes fold behind a `dataset`/`kind` selector; an optional `findata_run_code` tool is gated off by default (`FINDATA_MCP_CODE_MODE`).

Why: `FastApiMCP(app)` turned all 95 REST routes into 95 near-duplicate tools, loading roughly 21k tokens of `tools/list` before the first call and hurting tool selection. The curated catalog is about 7k tokens and easier to pick from.

Changes
- `src/findata/api/mcp_app.py`: the curated tool catalog (24 tools).
- `app.py`: `FastApiMCP(mcp_app).mount_http(router=app)` serves `/mcp` on the public app; the 95 REST routes stay unchanged.
- `tests/test_mcp_surface.py`: 10 offline tests (catalog size, REST integrity, dispatch validation, code-mode gating).
- `docs/MCP_SURFACE.md`: design write-up.

Validation
`ruff format --check`, `ruff check`, `mypy src/findata`, and `pytest -m "not integration"` (249 passed) green locally.

🤖 Generated with Claude Code