feat(tools): audit, fix three bugs, add multi_edit / glob_files / todo_write #52

Merged
Pterjudin merged 4 commits into main from feat/tools-audit-2026-05-25 on May 25, 2026

@Pterjudin

Summary

Outcome of a full read-through audit of the agent tool surface. Fixes three verified bugs and adds three missing tools that close gaps versus Claude Code / Cursor / OpenCode.

Bug fixes

  • search_pathnames_only silently dropped `include_pattern` — validator destructured `params.search_in_folder` (which the model never sends) while the LLM-facing parameter name is `include_pattern`. Every include_pattern argument was discarded, returning unfiltered results.
  • terminalToolService silent degradation on missing CommandDetection — when shell-integration capability failed to mount within 10s, the resolve path was skipped, leaving `waitUntilDone` as a dead promise. Commands appeared to complete but returned stale buffer contents. Now throws a clear error instead.
  • Removed dead pathname_search comment block in prompts.ts — easy to mistake for a live definition while reading the file.
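
The first fix can be illustrated with a minimal sketch. The shapes and names here (`RawParams`, `SearchPathnamesParams`, `validateBuggy`, `validateFixed`, and the `query` field) are hypothetical simplifications, not the actual code in toolsService.ts; the only details taken from this PR are that the validator read `search_in_folder`, that the model sends `include_pattern`, and that the result was `includePattern: null`.

```typescript
// Hypothetical, simplified shapes; the real validator may look different.
type RawParams = Record<string, unknown>;

interface SearchPathnamesParams {
  query: string;
  includePattern: string | null;
}

// Before: reads a key the model never sends, so the filter is always null.
function validateBuggy(params: RawParams): SearchPathnamesParams {
  const { query, search_in_folder } = params;
  return {
    query: String(query ?? ""),
    includePattern: typeof search_in_folder === "string" ? search_in_folder : null,
  };
}

// After: reads the LLM-facing parameter name declared in the prompt.
function validateFixed(params: RawParams): SearchPathnamesParams {
  const { query, include_pattern } = params;
  return {
    query: String(query ?? ""),
    includePattern: typeof include_pattern === "string" ? include_pattern : null,
  };
}

// The same model call, validated both ways.
const call: RawParams = { query: "toolsService", include_pattern: "src/**" };
const before = validateBuggy(call); // filter silently lost
const after = validateFixed(call);  // filter preserved
```

Because the buggy version fails silently rather than throwing, the mismatch only shows up as unfiltered results, which is why a validator unit test is listed as a follow-up.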

New tools

  • `multi_edit` — atomic multi-block search/replace on a single file. Pre-checks every `old_string` exists before applying anything; if any miss, no edits are written. Supports per-edit `replace_all`. Approval class: `edits`. Builds on the existing applySearchReplaceBlocks engine.
  • `glob_files` — file listing by glob pattern, sorted by modification time newest-first. Returns mtime + size per file. Limit clamped 1–1000. Closes the gap with Claude Code's Glob — `search_pathnames_only` is substring-only and unsorted.
  • `todo_write` — model-managed task list, per-session. Enforces a single `in_progress` task at a time. Stored array exposed via `ToolsService.getLatestTodos()` for future UI rendering.
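
The all-or-nothing behaviour described for `multi_edit` could look roughly like the sketch below. It does not use the real applySearchReplaceBlocks engine; `Edit`, `new_string`, and `applyMultiEdit` are assumed names for illustration, and only `old_string` and `replace_all` come from this PR.

```typescript
// Hypothetical edit shape; new_string is an assumed field name.
interface Edit {
  old_string: string;
  new_string: string;
  replace_all?: boolean;
}

// Returns the edited text, or throws without producing any partial result.
function applyMultiEdit(original: string, edits: Edit[]): string {
  // Phase 1: every old_string must exist in the untouched file.
  const missing = edits.filter(
    (e) => e.old_string === "" || !original.includes(e.old_string),
  );
  if (missing.length > 0) {
    throw new Error(
      `multi_edit: ${missing.length} old_string value(s) not found; no edits applied`,
    );
  }

  // Phase 2: apply sequentially to a working copy, never to the original.
  let text = original;
  for (const e of edits) {
    if (!text.includes(e.old_string)) {
      throw new Error("multi_edit: an earlier edit removed a later old_string; no edits applied");
    }
    text = e.replace_all
      ? text.split(e.old_string).join(e.new_string)
      : text.replace(e.old_string, () => e.new_string); // callback avoids "$" patterns
  }
  return text;
}
```

The caller writes the returned string to disk only if the function returns, so a miss in any block leaves the file unchanged.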

Audit deliverable

Companion audit document at `TOOLS_AUDIT_2026-05-25.md` in repo root (not committed in this PR; intended as a working reference, not a project doc). It includes the full 29-tool inventory, a competitor matrix, design issues with the four "actor" tools (`extract_function` / `rename_symbol` / `automated_code_review` / `generate_tests`, whose descriptions claim to take actions although the tools only return data), and recommended follow-ups.

Verification

  • `npm run compile-check-ts-native` — clean
  • `npm run valid-layers-check` — no new violations (pre-existing IMainProcessService warnings are unchanged)
  • Manual trace: each new tool has a type entry, validator, impl, stringOfResult, prompt description, and approval-class entry where applicable

Follow-ups (not in this PR)

  • Unit tests for validators — requires extracting validators out of the `ToolsService` class so they can be exercised without full DI. Refactor preferred as separate PR.
  • Honest descriptions for `extract_function` / `rename_symbol` / `automated_code_review` / `generate_tests` — these tools describe actions they don't perform. Either rewrite descriptions or implement the actions. Discussion needed.
  • SSRF guard on `browse_url`: requests to private IP ranges are not blocked; covered by the separate security review.
  • `run_nl_command` YOLO substring heuristic — naive and trivially bypassable; recommend removing until a real classifier exists.
  • UI surface for `todo_write` — sidebar widget consuming `getLatestTodos()`.

Tajudeen and others added 4 commits May 25, 2026 05:17
The validator destructured params.search_in_folder, but the LLM-facing
parameter name (per prompts.ts) is include_pattern. Every call that
specified include_pattern was silently ignored — the destructured value
was undefined and the validator returned includePattern: null.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…meout

If CommandDetection capability doesn't mount within 10s, the previous code
skipped the resolve path entirely, leaving waitUntilDone as a dead promise.
The only resolver was waitUntilInterrupt (idle timeout). Commands then
appeared to complete but returned stale buffer contents read after the
inactivity timeout fired.

Surface a clear error so the model can diagnose and retry, instead of
silently returning incorrect output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
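
The fail-loudly behaviour in this commit can be sketched as a promise raced against a timer. This is an illustration only: `withTimeout`, `neverMounts`, and `runCommand` are hypothetical names, and the real terminalToolService wiring is not shown in this PR text.

```typescript
// Hypothetical helper: reject if `work` does not settle within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number, what: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${what} did not become available within ${ms}ms`)),
      ms,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Stand-in for a capability that never mounts.
function neverMounts(): Promise<string> {
  return new Promise<string>(() => { /* intentionally never settles */ });
}

// Stand-in for the command path: without the capability there is no reliable
// completion signal, so surface an error instead of reading a stale buffer.
async function runCommand(mount: Promise<string>, timeoutMs: number): Promise<string> {
  const capability = await withTimeout(mount, timeoutMs, "CommandDetection");
  return `ran with ${capability}`;
}
```

In the PR the wait is 10s; the timeout is a parameter here so the failure path can be exercised quickly.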
A stale comment block for an unimplemented pathname_search tool was left
above search_pathnames_only. It served no purpose and was easy to mistake
for a live definition while reading the file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three new agent tools to close gaps versus Claude Code / Cursor / OpenCode:

- multi_edit — atomic multi-block search/replace on a single file. Pre-checks
  every old_string exists in the file before applying anything; if any block
  would miss, no edits are written. Supports replace_all per edit. Builds on
  the existing applySearchReplaceBlocks engine. Approval class: 'edits'.

- glob_files — file listing by glob pattern, sorted by modification time
  newest-first. Returns mtime + size per file. Limit clamped 1–1000. Closes
  the gap with Claude Code's Glob — search_pathnames_only is substring-only
  and unsorted.

- todo_write — model-managed task list, per-session. Enforces a single
  in_progress task at a time. Returns acknowledgment + count; the stored
  array is exposed via ToolsService.getLatestTodos() for future UI rendering
  (separate PR).
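
A rough sketch of the glob_files ordering and limit handling follows, assuming the glob match itself has already produced a list of entries. `FileEntry`, `clampLimit`, `newestFirst`, and the default limit of 100 are assumptions for illustration; only the 1–1000 clamp and newest-first ordering come from this commit.

```typescript
// Hypothetical per-file record: path plus the mtime and size that the tool returns.
interface FileEntry {
  path: string;
  mtimeMs: number;
  size: number;
}

// Clamp a model-supplied limit into [1, 1000]; the fallback of 100 is an assumption.
function clampLimit(limit: number | undefined, fallback = 100): number {
  const n = Number.isFinite(limit) ? Math.floor(limit as number) : fallback;
  return Math.min(1000, Math.max(1, n));
}

// Sort a copy by modification time, newest first, then truncate to the limit.
function newestFirst(entries: FileEntry[], limit?: number): FileEntry[] {
  return [...entries]
    .sort((a, b) => b.mtimeMs - a.mtimeMs)
    .slice(0, clampLimit(limit));
}

const sample: FileEntry[] = [
  { path: "a.ts", mtimeMs: 1000, size: 10 },
  { path: "b.ts", mtimeMs: 3000, size: 20 },
  { path: "c.ts", mtimeMs: 2000, size: 30 },
];
const top2 = newestFirst(sample, 2).map((e) => e.path);
```

Sorting a copy keeps the caller's array untouched, and clamping before slicing means an out-of-range limit from the model degrades to a valid one instead of failing the call.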

All three include validators that accept either array values or JSON-string
values (LLMs sometimes serialize structured params as strings). Type contracts
in toolsServiceTypes.ts, descriptions in prompts.ts, validators / impls /
stringOfResult in toolsService.ts.
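
The array-or-JSON-string handling, together with the single `in_progress` rule for todo_write, might be sketched as below. All names (`coerceArray`, `validateTodos`, `Todo`, the `content` field) are hypothetical, and the `pending` / `completed` statuses are assumptions; the commit only names `in_progress`.

```typescript
// Assumed status set; only "in_progress" is named in the commit message.
type TodoStatus = "pending" | "in_progress" | "completed";

interface Todo {
  content: string;
  status: TodoStatus;
}

function isStatus(v: unknown): v is TodoStatus {
  return v === "pending" || v === "in_progress" || v === "completed";
}

// Accept a real array or a JSON string that parses to one.
function coerceArray(value: unknown, name: string): unknown[] {
  let parsed: unknown = value;
  if (typeof value === "string") {
    try {
      parsed = JSON.parse(value);
    } catch {
      throw new Error(`${name} must be an array or a JSON-encoded array`);
    }
  }
  if (!Array.isArray(parsed)) {
    throw new Error(`${name} must be an array or a JSON-encoded array`);
  }
  return parsed;
}

function validateTodos(raw: unknown): Todo[] {
  const todos = coerceArray(raw, "todos").map((item, i) => {
    const t = item as { content?: unknown; status?: unknown } | null;
    if (t === null || typeof t !== "object" || typeof t.content !== "string" || !isStatus(t.status)) {
      throw new Error(`todos[${i}] is malformed`);
    }
    return { content: t.content, status: t.status };
  });
  // Enforce at most one task in progress at a time.
  if (todos.filter((t) => t.status === "in_progress").length > 1) {
    throw new Error("todos may contain only one in_progress task");
  }
  return todos;
}
```

Coercing once at the validator boundary means the implementation behind it only ever sees a typed array, whichever form the model sent.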

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Pterjudin Pterjudin marked this pull request as ready for review May 25, 2026 04:32
@Pterjudin Pterjudin merged commit 65a340f into main May 25, 2026
12 of 23 checks passed