feat(tools): audit, fix three bugs, add multi_edit / glob_files / todo_write by Pterjudin · Pull Request #52 · OpenCortexIDE/cortexide

Pterjudin · 2026-05-25T04:30:01Z

Summary

Outcome of a full read-through audit of the agent tool surface. Fixes three verified bugs and adds three missing tools that close gaps versus Claude Code / Cursor / OpenCode.

Bug fixes

search_pathnames_only silently dropped `include_pattern` — validator destructured `params.search_in_folder` (which the model never sends) while the LLM-facing parameter name is `include_pattern`. Every include_pattern argument was discarded, returning unfiltered results.
terminalToolService silent degradation on missing CommandDetection — when shell-integration capability failed to mount within 10s, the resolve path was skipped, leaving `waitUntilDone` as a dead promise. Commands appeared to complete but returned stale buffer contents. Now throws a clear error instead.
Removed dead pathname_search comment block in prompts.ts — easy to mistake for a live definition while reading the file.

New tools

`multi_edit` — atomic multi-block search/replace on a single file. Pre-checks every `old_string` exists before applying anything; if any miss, no edits are written. Supports per-edit `replace_all`. Approval class: `edits`. Builds on the existing applySearchReplaceBlocks engine.
`glob_files` — file listing by glob pattern, sorted by modification time newest-first. Returns mtime + size per file. Limit clamped 1–1000. Closes the gap with Claude Code's Glob — `search_pathnames_only` is substring-only and unsorted.
`todo_write` — model-managed task list, per-session. Enforces a single `in_progress` task at a time. Stored array exposed via `ToolsService.getLatestTodos()` for future UI rendering.
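
The all-or-nothing behavior of `multi_edit` and the clamp-and-sort behavior of `glob_files` can be sketched as two pure functions. This is a minimal illustration under stated assumptions, not the code in `toolsService.ts`: `EditBlock`, `GlobMatch`, `applyMultiEdit`, and `rankGlobMatches` are hypothetical names, and the real tool delegates the replacement itself to `applySearchReplaceBlocks`.

```typescript
// Hypothetical shapes for illustration; the real contracts live in toolsServiceTypes.ts.
interface EditBlock {
  old_string: string;
  new_string: string;
  replace_all?: boolean;
}

interface GlobMatch {
  path: string;
  mtimeMs: number;
  size: number;
}

// Returns the edited text, or throws without producing any partial result.
// The caller writes to disk only when this function returns.
function applyMultiEdit(original: string, edits: EditBlock[]): string {
  // Pass 1: every old_string must exist in the untouched file.
  for (let i = 0; i < edits.length; i++) {
    const needle = edits[i].old_string;
    if (needle === "" || original.indexOf(needle) === -1) {
      throw new Error(`edit ${i}: old_string not found; no edits applied`);
    }
  }
  // Pass 2: apply in order to an in-memory copy.
  let text = original;
  for (let i = 0; i < edits.length; i++) {
    const { old_string, new_string, replace_all } = edits[i];
    const at = text.indexOf(old_string);
    if (at === -1) {
      throw new Error(`edit ${i}: old_string removed by an earlier edit; no edits applied`);
    }
    text = replace_all
      ? text.split(old_string).join(new_string)
      : text.slice(0, at) + new_string + text.slice(at + old_string.length);
  }
  return text;
}

// Clamps limit to 1..1000 and returns the newest files first.
function rankGlobMatches(matches: GlobMatch[], limit: number): GlobMatch[] {
  const clamped = Math.min(1000, Math.max(1, Math.floor(limit)));
  return matches
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.mtimeMs - a.mtimeMs)
    .slice(0, clamped);
}
```

Splitting validation from application keeps the failure path trivial: a thrown error means the file on disk was never touched.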

Audit deliverable

Companion audit document at `TOOLS_AUDIT_2026-05-25.md` in repo root (not committed in this PR — intended as a working reference, not project doc). It includes the full 29-tool inventory, competitor matrix, design issues with the four "actor" tools (`extract_function` / `rename_symbol` / `automated_code_review` / `generate_tests` describe taking actions but only return data), and recommended follow-ups.

Verification

`npm run compile-check-ts-native` — clean
`npm run valid-layers-check` — no new violations (pre-existing IMainProcessService warnings are unchanged)
Manual trace: each new tool has a type entry, validator, impl, stringOfResult, prompt description, and approval-class entry where applicable

Follow-ups (not in this PR)

Unit tests for validators — requires extracting validators out of the `ToolsService` class so they can be exercised without full DI. Refactor preferred as separate PR.
Honest descriptions for `extract_function` / `rename_symbol` / `automated_code_review` / `generate_tests` — these tools describe actions they don't perform. Either rewrite descriptions or implement the actions. Discussion needed.
SSRF guard on `browse_url`: the tool does not block private or loopback IP ranges; covered by the separate security review.
`run_nl_command` YOLO substring heuristic — naive and trivially bypassable; recommend removing until a real classifier exists.
UI surface for `todo_write` — sidebar widget consuming `getLatestTodos()`.

The validator destructured params.search_in_folder, but the LLM-facing parameter name (per prompts.ts) is include_pattern. Every call that specified include_pattern was silently ignored: the destructured value was undefined and the validator returned includePattern: null.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
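
The mismatch described in this commit is easy to reproduce in isolation. The sketch below assumes a simplified validator shape: `RawParams`, `SearchPathnamesParams`, `validateBuggy`, and `validateFixed` are hypothetical names, and only the two parameter keys come from the commit message.

```typescript
// Hypothetical, simplified shapes; the real validator in toolsService.ts may differ.
type RawParams = Record<string, unknown>;

interface SearchPathnamesParams {
  query: string;
  includePattern: string | null;
}

// Normalizes an unknown value to a string filter or null.
function toPattern(value: unknown): string | null {
  return typeof value === "string" && value.length > 0 ? value : null;
}

// Before the fix: reads a key the model never sends, so the filter is always null.
function validateBuggy(params: RawParams): SearchPathnamesParams {
  const { query, search_in_folder: pattern } = params;
  return { query: String(query), includePattern: toPattern(pattern) };
}

// After the fix: reads the LLM-facing key documented in prompts.ts.
function validateFixed(params: RawParams): SearchPathnamesParams {
  const { query, include_pattern: pattern } = params;
  return { query: String(query), includePattern: toPattern(pattern) };
}

// A call as the model would send it.
const modelCall: RawParams = { query: "toolsService", include_pattern: "src/**" };
```

Because the destructured key is simply absent, nothing throws; the only symptom is an unfiltered result set, which is why the bug went unnoticed.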

…meout

If CommandDetection capability doesn't mount within 10s, the previous code skipped the resolve path entirely, leaving waitUntilDone as a dead promise. The only resolver was waitUntilInterrupt (idle timeout). Commands then appeared to complete but returned stale buffer contents read after the inactivity timeout fired. Surface a clear error so the model can diagnose and retry, instead of silently returning incorrect output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
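
One way to turn a capability that never mounts into an explicit failure is to race it against a timer. The helper below is a sketch of that pattern only; `withTimeout` is a hypothetical name, and how `terminalToolService` actually waits for CommandDetection is not shown in this PR page.

```typescript
// Rejects with a descriptive error if `ready` has not settled within `ms`.
// Hypothetical helper for illustration; not the code in terminalToolService.
function withTimeout<T>(ready: Promise<T>, ms: number, what: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${what} did not become available within ${ms} ms`));
    }, ms);
    ready.then(
      (value) => {
        clearTimeout(timer); // settled in time: cancel the pending rejection
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

A caller would wrap the capability wait, for example `withTimeout(capabilityReady, 10_000, "CommandDetection")` where `capabilityReady` is whatever promise signals the mount, and let the rejection propagate to the model as a tool error rather than falling back to the idle-timeout path.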

A stale comment block for an unimplemented pathname_search tool was left above search_pathnames_only. It served no purpose and was easy to mistake for a live definition while reading the file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three new agent tools to close gaps versus Claude Code / Cursor / OpenCode:

- multi_edit: atomic multi-block search/replace on a single file. Pre-checks every old_string exists in the file before applying anything; if any block would miss, no edits are written. Supports replace_all per edit. Builds on the existing applySearchReplaceBlocks engine. Approval class: 'edits'.
- glob_files: file listing by glob pattern, sorted by modification time newest-first. Returns mtime + size per file. Limit clamped 1–1000. Closes the gap with Claude Code's Glob; search_pathnames_only is substring-only and unsorted.
- todo_write: model-managed task list, per-session. Enforces a single in_progress task at a time. Returns acknowledgment + count; the stored array is exposed via ToolsService.getLatestTodos() for future UI rendering (separate PR).

All three include validators that accept either array values or JSON-string values (LLMs sometimes serialize structured params as strings). Type contracts in toolsServiceTypes.ts, descriptions in prompts.ts, validators / impls / stringOfResult in toolsService.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
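
The "array or JSON string" tolerance and the single-`in_progress` rule can be sketched together. Everything here is an assumption apart from the `in_progress` status and the two behaviors named in the commit message: `coerceArray`, `validateTodos`, the `Todo` shape, and the `pending` / `completed` statuses are hypothetical.

```typescript
// Hypothetical types; only the "in_progress" status is named in the PR.
type TodoStatus = "pending" | "in_progress" | "completed";

interface Todo {
  content: string;
  status: TodoStatus;
}

const STATUSES: readonly string[] = ["pending", "in_progress", "completed"];

function isStatus(value: unknown): value is TodoStatus {
  return typeof value === "string" && STATUSES.indexOf(value) !== -1;
}

// Accepts a real array, or a string that JSON-parses to an array.
function coerceArray(value: unknown, paramName: string): unknown[] {
  let parsed: unknown = value;
  if (typeof value === "string") {
    try {
      parsed = JSON.parse(value);
    } catch {
      throw new Error(`${paramName} must be an array or a JSON-encoded array`);
    }
  }
  if (!Array.isArray(parsed)) {
    throw new Error(`${paramName} must be an array or a JSON-encoded array`);
  }
  return parsed;
}

// Validates each item and rejects lists with more than one in_progress task.
function validateTodos(value: unknown): Todo[] {
  const todos = coerceArray(value, "todos").map((item, i) => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`todos[${i}] must be an object`);
    }
    const { content, status } = item as { content?: unknown; status?: unknown };
    if (typeof content !== "string" || !isStatus(status)) {
      throw new Error(`todos[${i}] needs a string content and a valid status`);
    }
    return { content, status };
  });
  const active = todos.filter((t) => t.status === "in_progress").length;
  if (active > 1) {
    throw new Error(`only one task may be in_progress at a time (found ${active})`);
  }
  return todos;
}
```

Coercing before validating means the per-item checks run once, regardless of which form the model sent.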

Tajudeen and others added 4 commits May 25, 2026 05:17

Pterjudin marked this pull request as ready for review May 25, 2026 04:32

Pterjudin merged commit 65a340f into main May 25, 2026
12 of 23 checks passed

Pterjudin merged 4 commits into main from feat/tools-audit-2026-05-25
