Skip to content

engine: rename pageindex→treewalk + auto strategy default (HAL-106)#35

Merged
hallelx2 merged 8 commits into
mainfrom
halleluyaholudele/hal-106-treewalk-rename
Jun 6, 2026
Merged

engine: rename pageindex→treewalk + auto strategy default (HAL-106)#35
hallelx2 merged 8 commits into
mainfrom
halleluyaholudele/hal-106-treewalk-rename

Conversation

@hallelx2

@hallelx2 hallelx2 commented Jun 6, 2026

Copy link
Copy Markdown
Owner

Strips the borrowed PageIndex name from the engine and makes treewalk the canonical strategy.

What's in here

  • Rename pageindex → treewalk across code, API/docs (openapi.yaml), and config examples (config.example*.yaml), including env-var references. Zero pageindex references remain.
  • New auto strategy as the default: small docs (leaf content within the call budget) → single-pass; larger/complex docs with valid page metadata → treewalk; non-paged docs fall back to single-pass.
  • Hardened AutoStrategy selection paths to be nil-safe (pick/Select/SelectWithCost) and added tests for non-paged fallback + nil-safety.
  • Validation/tests/docs alignment updates for auto (config validation coverage and request-override docs).

Verification

  • go build ./... — OK
  • go vet — OK
  • package tests (retrieval / config / api / handler) — all green

Not in here (tracked separately)

  • Retiring agentic/chunked-tree + folding the dedicated endpoint → HAL-108
  • Unifying the section tree + TOC into one canonical map → HAL-109
  • HyDE → candidate-questions rename → HAL-107

Summary by CodeRabbit

  • New Features

    • Added TreeWalk retrieval strategy with dedicated /v1/answer/treewalk endpoint
    • Added auto strategy for dynamic retrieval method selection
  • Changes

    • Replaced PageIndex retrieval configuration with TreeWalk configuration
    • Default retrieval strategy changed to "auto"
    • Updated environment variable names for retrieval settings

Summary by CodeRabbit

  • New Features

    • Added POST /v1/answer/treewalk with streaming (SSE) and reasoning traces; introduced "treewalk" and a new "auto" retrieval strategy.
  • Behavior / Request Overrides

    • Per-request overrides supported (max_hops, max_pages_per_fetch, model) and stream=true for SSE.
  • Configuration Changes

    • Retrieval config renamed pageindex → treewalk; default strategy now "auto". Env vars moved to VLE_/VLS_RETRIEVAL_TREEWALK_*.
  • Chores / Bug Fixes

    • CI toolchain bumped to Go 1.25.11; improved resource cleanup to ignore close errors where appropriate.

hallelx2 added 2 commits June 5, 2026 22:42
Strip the borrowed PageIndex name — treewalk is now the native identifier
for Vectorless's tree-navigation strategy across the strategy id, Go types,
config keys, the /v1/answer/treewalk endpoint, and all tests. Pure rename;
go build, go vet, and package tests all green.

Remaining HAL-106 sub-tasks: promote treewalk to default strategy, dedup
the agentic overlap, and fold the dedicated endpoint into /v1/answer.
Smart default selection: route small documents (total leaf content within
the call budget) to single-pass for one fast cheap call, and larger or more
complex documents to treewalk for agentic navigation. New AutoStrategy in
pkg/retrieval with unit tests; wired into config (default value + validation)
and both binaries' strategy builders + the selectable-strategy set.

go build, go vet, and package tests all green.
Copilot AI review requested due to automatic review settings June 6, 2026 12:39

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @hallelx2, your pull request is larger than the review limit of 150000 diff characters

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Replaces PageIndex with TreeWalk across config, OpenAPI, server/router, handlers, and retrieval strategy; adds AutoStrategy; rewrites endpoint tests to /v1/answer/treewalk; applies many best‑effort Close() changes and serializes pdftable calls.

Changes

TreeWalk migration and Auto strategy

Layer / File(s) Summary
Config, defaults and OpenAPI
pkg/config/config.go, internal/config/config.go, internal/config/config_treewalk_test.go, pkg/config/config_test.go, openapi.yaml, config.example.yaml, config.server.example.yaml
Replaces PageIndex config with TreeWalkBlock, changes default retrieval.strategy"auto", adds TreeWalk env overrides and tests, updates OpenAPI to POST /v1/answer/treewalk and TreeWalk schemas.
Core retrieval implementations
pkg/retrieval/treewalk_strategy.go, pkg/retrieval/treewalk_strategy_test.go, pkg/retrieval/auto.go, pkg/retrieval/auto_test.go, pkg/retrieval/strategy.go
Implements TreeWalkStrategy, TreeWalkAction parser, TreeWalkEvent, computeTreeWalkTraceToken, and adds AutoStrategy to pick small vs large sub-strategies by token/budget; tests updated/renamed accordingly.
API handlers and router
internal/api/treewalk.go, internal/handler/answer_treewalk.go, internal/handler/router.go, internal/api/server.go, internal/handler/router_parity_test.go
Introduces /v1/answer/treewalk handler (SSE streaming and non-streaming), per-request strategy cloning/overrides, swaps deps to TreeWalkStrategy/TreeWalk, and updates router wiring and parity tests.
Server/engine startup and builders
cmd/server/main.go, cmd/engine/main.go
Constructs and injects TreeWalkStrategy when enabled & LLM present, updates buildStrategy/buildStrategySet to support "treewalk" and "auto" (SinglePass + TreeWalk), wires storage-backed page loader and optional DB TOC provider, and logs configured caps.
Handler tests and end-to-end coverage
internal/api/treewalk_test.go, internal/handler/query_strategy_test.go
Rewrites endpoint tests to exercise /v1/answer/treewalk, adds scripted LLM for TreeWalk tests, covers reasoning trace, SSE streaming, citation dedup/cap, per-request overrides, TOC fallback, error mappings (400/404/501), and replay persistence checks.
Best-effort resource-close and misc maintenance
many files (internal/*, pkg/*, cmd/*, pkg/parser/pdf.go, .github/workflows/ci.yml, .golangci.yml)
Converts many deferred Close() calls to best-effort patterns that discard Close() errors, serializes pdftable calls with a mutex, updates CI toolchain pin and indirect deps, and small lint/middleware/test adjustments.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

"I hopped through code with nimble feet,
Replaced old pages with a TreeWalk beat,
Auto picks the path, events light the way,
Citations tidy, tests now play,
A tiny rabbit cheers this tidy feat." 🐇

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch halleluyaholudele/hal-106-treewalk-rename

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes the engine-wide rename from the borrowed “PageIndex” terminology to “TreeWalk”, updates the dedicated answer endpoint to /v1/answer/treewalk, and introduces an auto retrieval strategy as the new default to route small documents to single-pass and larger documents to tree-walking.

Changes:

  • Renames PageIndex → TreeWalk across retrieval strategy types, identifiers, config blocks, handlers, and tests.
  • Adds AutoStrategy and switches the default retrieval.strategy to auto.
  • Rewires both binaries and routers to mount /v1/answer/treewalk and to expose treewalk/auto as selectable strategies.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
pkg/tree/tree.go Updates TOCNode docs to reference TreeWalk terminology.
pkg/retrieval/treewalk_strategy.go Renames PageIndexStrategy/Event/Action → TreeWalkStrategy/Event/Action and updates identifiers/prompts/log strings.
pkg/retrieval/treewalk_strategy_test.go Renames and updates tests to exercise TreeWalkStrategy.
pkg/retrieval/strategy.go Updates Result.PagesRead comment to cite treewalk as the page-based example.
pkg/retrieval/auto.go Introduces AutoStrategy routing between “Small” and “Large” strategies.
pkg/retrieval/auto_test.go Adds unit tests for AutoStrategy routing and naming.
pkg/ingest/toc_builder.go Updates TOC builder docs/comments to reference TreeWalk pipeline naming.
pkg/ingest/minimal_mode_test.go Updates minimal-mode test commentary to reference TreeWalk coverage.
pkg/ingest/ingest.go Updates ingest docs/comments referencing the page-based strategy endpoint and TOC stage naming.
pkg/config/config.go Renames retrieval.pageindex → retrieval.treewalk, switches default retrieval.strategy to auto, updates env overrides and validation.
pkg/config/config_test.go Updates defaults/validation/env override tests for TreeWalk + auto default.
internal/handler/router.go Renames wiring from /answer/pageindex to /answer/treewalk and updates Deps fields accordingly.
internal/handler/router_parity_test.go Updates deployed router parity assertions for /v1/answer/treewalk.
internal/handler/query.go Updates strategy override docs/comments and keeps override resolution by map lookup.
internal/handler/query_strategy_test.go Updates override acceptance test from pageindex → treewalk.
internal/handler/answer_treewalk.go Renames the multi-tenant deployed handler to AnswerTreeWalkHandler and rewires event tracing/citations.
internal/config/config.go Forwards TreeWalk env overrides into embedded engine config.
internal/config/config_treewalk_test.go Adds tests for TreeWalk max-citations env pass-through behavior.
internal/config/config_pageindex_test.go Removes PageIndex env forwarding tests (superseded by TreeWalk equivalents).
internal/api/treewalk.go Renames standalone engine handler and supporting helpers to TreeWalk equivalents.
internal/api/treewalk_test.go Updates standalone API tests for /v1/answer/treewalk behavior.
internal/api/server.go Mounts /v1/answer/treewalk and renames Deps fields for TreeWalk.
cmd/server/main.go Switches default strategy to auto; wires selectable strategies set including treewalk and auto; rewires dedicated answer endpoint strategy instance.
cmd/engine/main.go Switches default strategy to auto; rewires dedicated answer endpoint strategy instance and strategy builder cases.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/retrieval/auto.go
Comment on lines +52 to +56
func (a *AutoStrategy) pick(t *tree.Tree, budget ContextBudget) Strategy {
threshold := a.SinglePassMaxTokens
if threshold <= 0 {
threshold = budget.Available()
}
Comment thread pkg/retrieval/auto.go
Comment on lines +70 to +73
// Select implements Strategy.
func (a *AutoStrategy) Select(ctx context.Context, t *tree.Tree, query string, budget ContextBudget) ([]tree.SectionID, error) {
return a.pick(t, budget).Select(ctx, t, query, budget)
}
Comment thread pkg/retrieval/auto.go
Comment on lines +78 to +82
func (a *AutoStrategy) SelectWithCost(ctx context.Context, t *tree.Tree, query string, budget ContextBudget) (*Result, error) {
s := a.pick(t, budget)
if cs, ok := s.(CostStrategy); ok {
return cs.SelectWithCost(ctx, t, query, budget)
}
Comment thread internal/handler/query.go
Comment on lines 78 to 82
// Strategy optionally overrides the configured retrieval strategy
// for THIS request only. One of: chunked-tree, pageindex, agentic,
// for THIS request only. One of: chunked-tree, treewalk, agentic,
// single-pass. Empty uses the server default. This lets a caller
// (e.g. the benchmark harness) A/B strategies against the same
// running engine without a redeploy. Unknown values return 400.
Comment thread pkg/config/config_test.go Outdated
Comment on lines 655 to 656
for _, s := range []string{"single-pass", "chunked-tree", "agentic", "treewalk"} {
cfg := Default()
Comment thread pkg/config/config.go
Comment on lines 1209 to 1213
switch c.Retrieval.Strategy {
case "single-pass", "chunked-tree", "agentic", "pageindex":
case "auto", "single-pass", "chunked-tree", "agentic", "treewalk":
default:
return fmt.Errorf("unknown retrieval.strategy: %q", c.Retrieval.Strategy)
}

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/api/treewalk.go (1)

472-508: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove unused helper or add test coverage.

The treeWalkTraceTokenFromCitations function is unused according to static analysis (U1000) and the comment admits it's "unused at the moment." Unused code increases maintenance burden and can drift from the actual implementation.

Either:

  1. Add test coverage that uses this function to verify trace token computation, or
  2. Remove it (can always restore from git history when needed)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/api/treewalk.go` around lines 472 - 508, The function
treeWalkTraceTokenFromCitations is flagged as unused; either remove it or add
test coverage that calls it: if removing, delete the entire function (including
its references to tree.DocumentID, model, ranges, retrieval.SystemPromptVersion)
and run static checks; if keeping, add a unit test that imports the same
package, calls treeWalkTraceTokenFromCitations with representative DocumentID,
model and ranges inputs, and asserts the returned hex string matches a
precomputed expected value (or verifies deterministic properties), ensuring the
test exercises the sha256/hex path so the function is no longer unused.

Source: Linters/SAST tools

internal/handler/answer_treewalk.go (1)

277-299: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist replay data on the SSE path too.

The non-stream flow stores the final payload through writeJSONWithReplay, but the streaming flow only emits the final "answer" event. That leaves streamed trace_token values orphaned in practice, because nothing ever writes the replay entry behind them.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/handler/answer_treewalk.go` around lines 277 - 299, The streaming
path currently only calls emitSSE("answer", final) and never persists the replay
entry, so trace_token values from the stream are never written; update the
streaming end in the handler (where buildCitations produces citations and final
is assembled) to call the same persistence helper used by the non-stream flow
(writeJSONWithReplay) with the final payload (including trace_token) before or
immediately after emitSSE so the replay entry is created; ensure you pass the
same context/body identifiers and any required replay metadata to
writeJSONWithReplay so replay lookups for trace_token succeed.
🧹 Nitpick comments (1)
internal/config/config.go (1)

329-333: ⚡ Quick win

Clarify fallback wording to match actual precedence behavior.

On Line 333, the comment says garbled/negative input “leaves the engine default (3)”. In this function, parse failure/no-op preserves the current configured value (which can come from YAML), not always default 3.

Suggested comment-only fix
-	// VLE_; a garbled or negative value leaves the engine default (3).
+	// VLE_; a garbled or negative value leaves the current configured
+	// value unchanged (default is 3 when unset elsewhere).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/config/config.go` around lines 329 - 333, Update the comment on the
TreeWalk final-citation cap to accurately reflect precedence and fallback: state
that VLS_ overrides VLE_, and that a garbled or negative environment value is a
no-op that preserves the currently configured value (which may come from YAML or
a prior config), rather than always reverting to the engine default of 3; locate
the comment near the TreeWalk cap handling in internal/config/config.go
(references: VLS_ and VLE_) and revise the wording accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/handler/answer_treewalk.go`:
- Around line 260-262: serveStream currently forwards the whole
retrieval.TreeWalkEvent via strat.OnEvent/emitSSE regardless of
body.IncludeReasoning, leaking Reasoning/Answer; change the strat.OnEvent
handler (and the emit path) to check the request flag (body.IncludeReasoning)
and, when false, send a redacted payload that omits or zeroes sensitive fields
(e.g., TreeWalkEvent.Reasoning, TreeWalkEvent.Answer) or build a separate
minimal event type to emit; ensure the decision is done inside the strat.OnEvent
closure (or before calling emitSSE) so raw TreeWalkEvent is never streamed
unless IncludeReasoning is true.

In `@pkg/retrieval/auto.go`:
- Around line 52-68: The pick method on AutoStrategy can return nil if both
a.Small and a.Large are nil, causing Select/SelectWithCost to panic; fix by
validating strategies at construction or adding defensive checks: update NewAuto
to require non-nil Small or Large and return an error (preferred), or change
pick to return a non-nil fallback (e.g., prefer a.Small then a.Large) and update
Select and SelectWithCost to detect a nil return and return an error instead of
dereferencing; reference symbols: AutoStrategy, pick, NewAuto, Select,
SelectWithCost.

---

Outside diff comments:
In `@internal/api/treewalk.go`:
- Around line 472-508: The function treeWalkTraceTokenFromCitations is flagged
as unused; either remove it or add test coverage that calls it: if removing,
delete the entire function (including its references to tree.DocumentID, model,
ranges, retrieval.SystemPromptVersion) and run static checks; if keeping, add a
unit test that imports the same package, calls treeWalkTraceTokenFromCitations
with representative DocumentID, model and ranges inputs, and asserts the
returned hex string matches a precomputed expected value (or verifies
deterministic properties), ensuring the test exercises the sha256/hex path so
the function is no longer unused.

In `@internal/handler/answer_treewalk.go`:
- Around line 277-299: The streaming path currently only calls emitSSE("answer",
final) and never persists the replay entry, so trace_token values from the
stream are never written; update the streaming end in the handler (where
buildCitations produces citations and final is assembled) to call the same
persistence helper used by the non-stream flow (writeJSONWithReplay) with the
final payload (including trace_token) before or immediately after emitSSE so the
replay entry is created; ensure you pass the same context/body identifiers and
any required replay metadata to writeJSONWithReplay so replay lookups for
trace_token succeed.

---

Nitpick comments:
In `@internal/config/config.go`:
- Around line 329-333: Update the comment on the TreeWalk final-citation cap to
accurately reflect precedence and fallback: state that VLS_ overrides VLE_, and
that a garbled or negative environment value is a no-op that preserves the
currently configured value (which may come from YAML or a prior config), rather
than always reverting to the engine default of 3; locate the comment near the
TreeWalk cap handling in internal/config/config.go (references: VLS_ and VLE_)
and revise the wording accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eaee24cb-1189-4825-8bba-e42087fe803f

📥 Commits

Reviewing files that changed from the base of the PR and between f150e5b and 610b265.

📒 Files selected for processing (24)
  • cmd/engine/main.go
  • cmd/server/main.go
  • internal/api/server.go
  • internal/api/treewalk.go
  • internal/api/treewalk_test.go
  • internal/config/config.go
  • internal/config/config_pageindex_test.go
  • internal/config/config_treewalk_test.go
  • internal/handler/answer_treewalk.go
  • internal/handler/query.go
  • internal/handler/query_strategy_test.go
  • internal/handler/router.go
  • internal/handler/router_parity_test.go
  • pkg/config/config.go
  • pkg/config/config_test.go
  • pkg/ingest/ingest.go
  • pkg/ingest/minimal_mode_test.go
  • pkg/ingest/toc_builder.go
  • pkg/retrieval/auto.go
  • pkg/retrieval/auto_test.go
  • pkg/retrieval/strategy.go
  • pkg/retrieval/treewalk_strategy.go
  • pkg/retrieval/treewalk_strategy_test.go
  • pkg/tree/tree.go
💤 Files with no reviewable changes (1)
  • internal/config/config_pageindex_test.go

Comment on lines +260 to 262
strat.OnEvent = func(ev retrieval.TreeWalkEvent) {
emitSSE(ev.Type, ev)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't stream raw TreeWalk events unless reasoning was requested.

serveStream ignores body.IncludeReasoning and forwards the full retrieval.TreeWalkEvent as the SSE payload. That exposes intermediate Reasoning/Answer fields to any streaming caller, which breaks the endpoint's opt-in reasoning contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/handler/answer_treewalk.go` around lines 260 - 262, serveStream
currently forwards the whole retrieval.TreeWalkEvent via strat.OnEvent/emitSSE
regardless of body.IncludeReasoning, leaking Reasoning/Answer; change the
strat.OnEvent handler (and the emit path) to check the request flag
(body.IncludeReasoning) and, when false, send a redacted payload that omits or
zeroes sensitive fields (e.g., TreeWalkEvent.Reasoning, TreeWalkEvent.Answer) or
build a separate minimal event type to emit; ensure the decision is done inside
the strat.OnEvent closure (or before calling emitSSE) so raw TreeWalkEvent is
never streamed unless IncludeReasoning is true.

Comment thread pkg/retrieval/auto.go
… rename)

Addresses the Copilot + CodeRabbit review on #35:
- auto: non-paged docs (no PageStart/PageEnd — Markdown/HTML/DOCX/TXT) now route
  to single-pass, never treewalk (which navigates by page range) + nil-safe
  pick/Select/SelectWithCost so a misconfigured Auto can't panic.
- complete the rename in non-Go files: openapi.yaml + config.example*.yaml
  (incl. the uppercase env-var refs) — zero pageindex remains anywhere.
- query.go: list 'auto' in the strategy-override doc comment.
- config_test: exercise 'auto' as a valid Validate() strategy.
- tests: non-paged fallback + nil-safety.

Out of scope (filed separately): treewalk SSE streams reasoning regardless of
IncludeReasoning (pre-existing behaviour, not introduced here).
@hallelx2

hallelx2 commented Jun 6, 2026

Copy link
Copy Markdown
Owner Author

Thanks @copilot-pull-request-reviewer and @coderabbitai — addressed in dd6697b:

Fixed (in scope):

  • auto.go — non-paged docs (no PageStart/PageEnd) now route to single-pass, never treewalk (treewalk navigates by page range); pick/Select/SelectWithCost are nil-safe. Added tests for both.
  • Rename completed in non-Go files — openapi.yaml + config.example*.yaml (incl. uppercase env-var refs). Zero pageindex remains anywhere now.
  • query.go — strategy-override comment now lists auto.
  • config_test.go — Validate() test now exercises auto.

Deferred (out of scope for this rename PR, tracked as HAL-115):

  • treewalk serveStream forwarding reasoning regardless of IncludeReasoning is pre-existing behaviour, not introduced here. Filed as its own issue to fix deliberately with tests.

Build + go vet + package tests all green.

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

@hallelx2 Thanks for the detailed summary! The in-scope fixes look solid — particularly the auto.go improvements (non-paged docs routing to single-pass, nil-safe pick/Select/SelectWithCost) and completing the rename across the non-Go config/OpenAPI files. Good call deferring the serveStream + IncludeReasoning behaviour to a dedicated issue with proper test coverage.

Would you like me to:

  1. Open a GitHub issue for HAL-115 (treewalk serveStream forwards reasoning events regardless of IncludeReasoning) as a follow-up so it's tracked publicly in this repo?
  2. Trigger a re-review of the latest commit (dd6697b) to verify everything looks good?

Just let me know!

🐇✨

…ump, lint

- toc_builder: fix the data race in verifyTitlesConcurrent — each goroutine
  now accumulates a local Usage and folds it under the mutex, instead of
  sharing &localUse across errgroup goroutines (concurrent usage.add writes).
- deps: golang.org/x/net v0.53.0 -> v0.55.0 (GO-2026-5030); add toolchain
  go1.25.11 + pin CI go-version (stdlib GO-2026-5039 / GO-2026-5037).
- lint: //lint:ignore U1000 the four staged helpers with reasons; drop the
  genuinely-unused idempotencyCache.mu field.

These CI failures were pre-existing (surfaced on PR #35, not caused by the
rename). go build + staticcheck green locally; -race verified in CI.

Closes HAL-116.
@hallelx2

hallelx2 commented Jun 6, 2026

Copy link
Copy Markdown
Owner Author

Pushed 45edc12 — greens the pre-existing CI baseline (these failures were on main, not introduced by the rename; tracked as HAL-116):

  • test data race — fixed a real race in toc_builder.verifyTitlesConcurrent (per-goroutine Usage merged under the lock).
  • govulncheckx/net→v0.55.0 + toolchain go1.25.11 for the stdlib CVEs.
  • lint//lint:ignore U1000 the 4 staged helpers (with reasons); removed the unused idempotencyCache.mu.

go build + staticcheck clean locally; -race verified by CI. Closes HAL-116.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
openapi.yaml (1)

306-310: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Finish the rename in the endpoint description.

Line 310 still says page-index responses uniformly, so the spec does not actually complete the PageIndex → TreeWalk rename.

Suggested diff
-            /v1/replay accepts page-index responses uniformly.
+            /v1/replay accepts TreeWalk responses uniformly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openapi.yaml` around lines 306 - 310, The endpoint description still uses the
old term "page-index responses uniformly" — update the wording in the
Differentiators block for the /v1/replay endpoint to complete the PageIndex →
TreeWalk rename (e.g., replace "page-index responses uniformly" with "TreeWalk
responses uniformly" or another consistent "TreeWalk" phrasing), and scan the
same description for any remaining PageIndex occurrences to ensure all
references (PageIndex, page-index) are renamed consistently to TreeWalk across
that endpoint's documentation.
config.example.yaml (1)

98-113: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The example config still opts users out of the new default.

This block still advertises chunked-tree as the default and leaves strategy: "chunked-tree" in place. Anyone bootstrapping from this file will miss the new auto routing behavior the PR is introducing.

Suggested diff
-  # strategy: single-pass | chunked-tree | agentic | treewalk
+  # strategy: auto | single-pass | chunked-tree | agentic | treewalk
   #
+  #   auto:         choose single-pass for small/non-paged docs and
+  #                 treewalk for larger paged docs.
   #   single-pass:  whole tree in one LLM call; fastest, smallest docs.
   #   chunked-tree: split the tree, reason over slices in parallel, merge.
-  #                 The default. Scales to any tree size by trading
+  #                 Scales to any tree size by trading
   #                 context for parallelism.
   #   agentic:      iterative outline → expand → read → done loop.
   #                 Picks per-section IDs via a tool-using model.
   #   treewalk:    TreeWalk-style page-based agentic loop. Three
   #                 tools (get_document_structure / get_pages / done);
   #                 the model navigates by INCLUSIVE PAGE RANGE
   #                 rather than by section ID. Best for paginated
   #                 documents (SEC filings, academic PDFs) where the
   #                 per-section interface is too noisy.
-  strategy: "chunked-tree"
+  strategy: "auto"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config.example.yaml` around lines 98 - 113, Update the example config to use
the new default routing by replacing the strategy value "chunked-tree" with
"auto" and adjust the surrounding comment to describe the "auto" behavior (and
that it is now the default) so users bootstrapping from config.example.yaml opt
into the new automatic routing; locate the strategy key (strategy:
"chunked-tree") and the explanatory comment block listing strategy options and
update both the value and the description text to reference "auto" as the
default and briefly summarize what "auto" does.
pkg/config/config_test.go (1)

783-807: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Isolate the bad-env test from the VLS_ alias too.

applyEnvOverrides() reads VLE_RETRIEVAL_TREEWALK_MAX_CITATIONS before VLS_RETRIEVAL_TREEWALK_MAX_CITATIONS. If the VLE_ value here is garbage but the ambient VLS_ alias is set, this test stops observing the default 3 and becomes environment-dependent.

Suggested diff
 func TestTreeWalkEnvOverrideRejectsBad(t *testing.T) {
 	prevHops := os.Getenv("VLE_RETRIEVAL_TREEWALK_MAX_HOPS")
 	prevLimit := os.Getenv("VLE_RETRIEVAL_TREEWALK_PAGE_CONTENT_LIMIT")
 	prevCits := os.Getenv("VLE_RETRIEVAL_TREEWALK_MAX_CITATIONS")
+	prevVLSCits := os.Getenv("VLS_RETRIEVAL_TREEWALK_MAX_CITATIONS")
 	defer func() {
 		os.Setenv("VLE_RETRIEVAL_TREEWALK_MAX_HOPS", prevHops)
 		os.Setenv("VLE_RETRIEVAL_TREEWALK_PAGE_CONTENT_LIMIT", prevLimit)
 		os.Setenv("VLE_RETRIEVAL_TREEWALK_MAX_CITATIONS", prevCits)
+		os.Setenv("VLS_RETRIEVAL_TREEWALK_MAX_CITATIONS", prevVLSCits)
 	}()
 
 	os.Setenv("VLE_RETRIEVAL_TREEWALK_MAX_HOPS", "abc")
 	os.Setenv("VLE_RETRIEVAL_TREEWALK_PAGE_CONTENT_LIMIT", "not-a-number")
 	os.Setenv("VLE_RETRIEVAL_TREEWALK_MAX_CITATIONS", "lots")
+	os.Unsetenv("VLS_RETRIEVAL_TREEWALK_MAX_CITATIONS")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/config/config_test.go` around lines 783 - 807,
TestTreeWalkEnvOverrideRejectsBad depends on the ambient VLS_ alias and
applyEnvOverrides reads VLE_ before VLS_, so when VLE_ is garbage but VLS_ is
set the test is flaky; update the test to also capture and restore the VLS_
environment variables (e.g. VLS_RETRIEVAL_TREEWALK_MAX_HOPS,
VLS_RETRIEVAL_TREEWALK_PAGE_CONTENT_LIMIT, VLS_RETRIEVAL_TREEWALK_MAX_CITATIONS)
like the existing VLE_ saves, and explicitly unset them (or set to empty) before
calling Default() and applyEnvOverrides() so the test always exercises
applyEnvOverrides with only the VLE_ garbage values.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openapi.yaml`:
- Around line 666-668: The OpenAPI enum for QueryResponse.strategy omits the
newly supported "auto" strategy; update the schema so the enum for the
"strategy" property on QueryResponse includes "auto" (i.e., enum: [single-pass,
chunked-tree, agentic, treewalk, auto]) to match server behavior and avoid
client/schema validation errors when the handler emits "auto".

---

Outside diff comments:
In `@config.example.yaml`:
- Around line 98-113: Update the example config to use the new default routing
by replacing the strategy value "chunked-tree" with "auto" and adjust the
surrounding comment to describe the "auto" behavior (and that it is now the
default) so users bootstrapping from config.example.yaml opt into the new
automatic routing; locate the strategy key (strategy: "chunked-tree") and the
explanatory comment block listing strategy options and update both the value and
the description text to reference "auto" as the default and briefly summarize
what "auto" does.

In `@openapi.yaml`:
- Around line 306-310: The endpoint description still uses the old term
"page-index responses uniformly" — update the wording in the Differentiators
block for the /v1/replay endpoint to complete the PageIndex → TreeWalk rename
(e.g., replace "page-index responses uniformly" with "TreeWalk responses
uniformly" or another consistent "TreeWalk" phrasing), and scan the same
description for any remaining PageIndex occurrences to ensure all references
(PageIndex, page-index) are renamed consistently to TreeWalk across that
endpoint's documentation.

In `@pkg/config/config_test.go`:
- Around line 783-807: TestTreeWalkEnvOverrideRejectsBad depends on the ambient
VLS_ alias and applyEnvOverrides reads VLE_ before VLS_, so when VLE_ is garbage
but VLS_ is set the test is flaky; update the test to also capture and restore
the VLS_ environment variables (e.g. VLS_RETRIEVAL_TREEWALK_MAX_HOPS,
VLS_RETRIEVAL_TREEWALK_PAGE_CONTENT_LIMIT, VLS_RETRIEVAL_TREEWALK_MAX_CITATIONS)
like the existing VLE_ saves, and explicitly unset them (or set to empty) before
calling Default() and applyEnvOverrides() so the test always exercises
applyEnvOverrides with only the VLE_ garbage values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ca0e5052-7ad9-42c3-ab14-a52aac0e4eca

📥 Commits

Reviewing files that changed from the base of the PR and between 610b265 and 45edc12.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (15)
  • .github/workflows/ci.yml
  • config.example.yaml
  • config.server.example.yaml
  • go.mod
  • internal/api/treewalk.go
  • internal/handler/query.go
  • internal/middleware/cors.go
  • internal/middleware/idempotency.go
  • openapi.yaml
  • pkg/config/config_test.go
  • pkg/ingest/ingest.go
  • pkg/ingest/toc_builder.go
  • pkg/parser/pdf.go
  • pkg/retrieval/auto.go
  • pkg/retrieval/auto_test.go
✅ Files skipped from review due to trivial changes (5)
  • config.server.example.yaml
  • internal/middleware/cors.go
  • internal/handler/query.go
  • pkg/parser/pdf.go
  • pkg/ingest/ingest.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/retrieval/auto.go
  • internal/api/treewalk.go

Comment thread openapi.yaml
Comment on lines 666 to +668
strategy:
type: string
enum: [single-pass, chunked-tree, agentic, pageindex]
enum: [single-pass, chunked-tree, agentic, treewalk]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add auto to QueryResponse.strategy.

auto is now a supported first-class retrieval strategy, but the response enum still excludes it. That leaves the published contract narrower than the server/config behavior and can break generated clients or schema validation if the handler serializes the selected strategy name directly.

Suggested diff
-          enum: [single-pass, chunked-tree, agentic, treewalk]
+          enum: [auto, single-pass, chunked-tree, agentic, treewalk]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
strategy:
type: string
enum: [single-pass, chunked-tree, agentic, pageindex]
enum: [single-pass, chunked-tree, agentic, treewalk]
strategy:
type: string
enum: [auto, single-pass, chunked-tree, agentic, treewalk]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openapi.yaml` around lines 666 - 668, The OpenAPI enum for
QueryResponse.strategy omits the newly supported "auto" strategy; update the
schema so the enum for the "strategy" property on QueryResponse includes "auto"
(i.e., enum: [single-pass, chunked-tree, agentic, treewalk, auto]) to match
server behavior and avoid client/schema validation errors when the handler emits
"auto".

hallelx2 added 2 commits June 6, 2026 15:28
- pkg/parser: serialize pdftable.OpenBytes behind a package mutex (openPDFBytes).
  pdftable mutates package-level state on open and is not concurrency-safe; the
  -race detector flagged concurrent opens (real in prod — ingest workers parse
  in parallel). Stopgap; the proper fix belongs in pdftable (Foundational Libs).
- convert the unused-helper suppressions from staticcheck //lint:ignore to
  golangci-lint //nolint:unused,staticcheck (CI lint uses golangci-lint).

go build + go vet + parser tests green.
…heck (HAL-117)

Completes the errcheck/ineffassign cleanup across the repo, including the
renamed files (treewalk.go / answer_treewalk.go) the off-main pass couldn't reach:
- errcheck: _, _ = fmt.Fprintf on the SSE writers; unchecked returns handled
  across cmd/benchmark, handlers, connecthandler, parser, storage, queue, db, ingest.
- ineffassign: drop the ineffectual 'written += remaining' before break in
  treewalk_strategy.
- staticcheck: real TLS-disabled assertion (was an empty branch) in config_test;
  De Morgan's rewrite in retrieval_test; drop embedded-field selector in decompose_test.
- misspell: ignore-rule for 'strat' (our 'strategy' abbreviation) in .golangci.yml.
- formatting normalized via the golangci formatters.

golangci-lint run: 0 issues. go build + unit tests green. Closes HAL-117.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/benchmark/main.go`:
- Around line 104-110: Benchmark currently records durations for failed requests
(e.g., the REST call via restGET and gRPC via healthClient.Check) which skews
latency stats; change each measured block (places using restGET,
healthClient.Check and similar REST/gRPC calls that append to
r.rest/r.grpc/r.rerpc etc.) to capture and check the error/result and only
append time.Since(start) when the call succeeded, otherwise increment a failure
counter (e.g., r.restFailures/r.grpcFailures or a shared successes/failures map)
so latency arrays contain only successful-sample durations and failures are
tracked separately; update all similar blocks (the other ranges mentioned) to
follow the same pattern using the existing variables like ctx,
connect.NewRequest, r.rest and r.grpc to locate the spots to change.
- Line 80: The json.Unmarshal call in cmd/benchmark/main.go is ignoring errors
which can leave sectionID empty and cause the "Get Section" benchmark to be
silently skipped; change the unmarshalling of treeResp into treeData (the
json.Unmarshal(treeResp, &treeData) call) to check the returned error, and on
failure log an explicit setup error (including the error detail) and mark the
benchmark run as failed or abort further section selection (e.g., return a
non-zero exit or set a setupFailure flag) so the missing sectionID is not
treated as a successful run; ensure downstream code that uses sectionID (section
selection / "Get Section" benchmark) respects this failure signal.

In `@internal/api/server.go`:
- Around line 395-399: The ReadAll result is used even when io.ReadAll returns
an error; update the rc handling (the block after d.Storage.Get returning err ==
nil where variables rc, raw, content are used) to check the error returned by
io.ReadAll, close rc best-effort, and discard the section by setting content =
"" (or skip adding the section) when ReadAll fails so truncated/corrupted bytes
aren’t used; apply the same fix to the other similar blocks referenced (the
occurrences around lines 538-542, 835-839, 1077-1081) to consistently handle
ReadAll errors.

In `@internal/handler/answer.go`:
- Around line 169-173: The code ignores the error returned by io.ReadAll after
h.storage.Get, which can yield truncated content; change the block around
h.storage.Get / io.ReadAll to check the read error (the second return from
io.ReadAll), ensure rc is closed (best-effort) even on read failure, and if
io.ReadAll returns an error do not use the partial raw content — either skip
adding this section to content or return/propagate an error; update the handling
where variables rc, raw, getErr are used so only successfully-read data is
assigned to content.

In `@internal/handler/documents.go`:
- Around line 428-431: The code currently ignores errors from io.ReadAll and may
return truncated section content; change the read to capture the error (raw,
readErr := io.ReadAll(rc)) and if readErr != nil then ensure rc is closed and
propagate a read/storage error instead of serializing partial bytes — e.g.,
log/report readErr and return an appropriate HTTP error (or skip including the
section body) so that h.storage.Get / sec.ContentRef reads never produce a 200
with truncated data; update uses of rc.Close() accordingly to always close in
both success and error paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 89368ae7-3f90-426d-899e-b14165d27cc8

📥 Commits

Reviewing files that changed from the base of the PR and between 45edc12 and 7407704.

📒 Files selected for processing (42)
  • .golangci.yml
  • cmd/benchmark/main.go
  • cmd/engine/main.go
  • cmd/server/main.go
  • internal/api/replay_test.go
  • internal/api/server.go
  • internal/api/treewalk.go
  • internal/api/treewalk_test.go
  • internal/connecthandler/documents.go
  • internal/connecthandler/query.go
  • internal/handler/answer.go
  • internal/handler/answer_treewalk.go
  • internal/handler/documents.go
  • internal/handler/query.go
  • internal/handler/query_multi.go
  • internal/handler/query_strategy_test.go
  • internal/handler/query_stream.go
  • internal/handler/query_stream_multi.go
  • internal/middleware/cors.go
  • internal/middleware/idempotency.go
  • pkg/cache/cache_test.go
  • pkg/config/config_test.go
  • pkg/db/db.go
  • pkg/ingest/hyde.go
  • pkg/ingest/ingest.go
  • pkg/ingest/minimal_mode_test.go
  • pkg/ingest/toc_builder.go
  • pkg/parser/chunk_test.go
  • pkg/parser/docx.go
  • pkg/parser/pdf.go
  • pkg/queue/asynq.go
  • pkg/queue/asynq_test.go
  • pkg/queue/qstash.go
  • pkg/queue/river_test.go
  • pkg/retrieval/agentic.go
  • pkg/retrieval/decompose_test.go
  • pkg/retrieval/retrieval_test.go
  • pkg/retrieval/single_pass.go
  • pkg/retrieval/treewalk_strategy.go
  • pkg/storage/local.go
  • pkg/storage/s3_test.go
  • pkg/tree/compact_test.go
💤 Files with no reviewable changes (1)
  • pkg/retrieval/treewalk_strategy.go
✅ Files skipped from review due to trivial changes (13)
  • .golangci.yml
  • pkg/retrieval/agentic.go
  • internal/handler/query_multi.go
  • pkg/retrieval/single_pass.go
  • pkg/queue/qstash.go
  • pkg/tree/compact_test.go
  • internal/middleware/cors.go
  • pkg/queue/asynq.go
  • internal/handler/query_stream_multi.go
  • pkg/ingest/hyde.go
  • pkg/parser/docx.go
  • internal/connecthandler/query.go
  • pkg/parser/chunk_test.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • internal/handler/query.go
  • pkg/ingest/minimal_mode_test.go
  • internal/middleware/idempotency.go
  • pkg/ingest/toc_builder.go
  • internal/handler/query_strategy_test.go
  • internal/api/treewalk_test.go
  • cmd/engine/main.go
  • internal/api/treewalk.go
  • pkg/config/config_test.go

Comment thread cmd/benchmark/main.go
} `json:"sections"`
}
json.Unmarshal(treeResp, &treeData)
_ = json.Unmarshal(treeResp, &treeData) // benchmark best-effort parse

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle tree parse failure explicitly before section benchmark selection.

If json.Unmarshal fails, sectionID remains empty and “Get Section” benchmarking is silently skipped. Log/track this as a failed setup step so results aren’t misread as complete.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/benchmark/main.go` at line 80, The json.Unmarshal call in
cmd/benchmark/main.go is ignoring errors which can leave sectionID empty and
cause the "Get Section" benchmark to be silently skipped; change the
unmarshalling of treeResp into treeData (the json.Unmarshal(treeResp, &treeData)
call) to check the returned error, and on failure log an explicit setup error
(including the error detail) and mark the benchmark run as failed or abort
further section selection (e.g., return a non-zero exit or set a setupFailure
flag) so the missing sectionID is not treated as a successful run; ensure
downstream code that uses sectionID (section selection / "Get Section"
benchmark) respects this failure signal.

Comment thread cmd/benchmark/main.go
Comment on lines +104 to 110
_, _ = restGET(base + "/v1/health")
r.rest = append(r.rest, time.Since(start))

// gRPC (Connect)
start = time.Now()
healthClient.Check(ctx, connect.NewRequest(&v1.HealthCheckRequest{}))
_, _ = healthClient.Check(ctx, connect.NewRequest(&v1.HealthCheckRequest{}))
r.grpc = append(r.grpc, time.Since(start))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don’t include failed calls in latency samples.

All benchmark paths discard request errors, so network/protocol failures are still timed and recorded as “successful” latency. This skews median/p95 and can invert REST vs gRPC conclusions. Track failures separately and only append durations for successful calls (or report success-rate alongside latency).

Also applies to: 121-126, 137-144, 156-163, 185-194

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/benchmark/main.go` around lines 104 - 110, Benchmark currently records
durations for failed requests (e.g., the REST call via restGET and gRPC via
healthClient.Check) which skews latency stats; change each measured block
(places using restGET, healthClient.Check and similar REST/gRPC calls that
append to r.rest/r.grpc/r.rerpc etc.) to capture and check the error/result and
only append time.Since(start) when the call succeeded, otherwise increment a
failure counter (e.g., r.restFailures/r.grpcFailures or a shared
successes/failures map) so latency arrays contain only successful-sample
durations and failures are tracked separately; update all similar blocks (the
other ranges mentioned) to follow the same pattern using the existing variables
like ctx, connect.NewRequest, r.rest and r.grpc to locate the spots to change.

Comment thread internal/api/server.go
Comment on lines 395 to 399
rc, _, err := d.Storage.Get(r.Context(), sec.ContentRef)
if err == nil {
raw, _ := io.ReadAll(rc)
rc.Close()
_ = rc.Close() // best-effort close
content = string(raw)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Discard section content when storage reads fail.

These paths ignore io.ReadAll errors and still use whatever bytes were returned. A short read will quietly surface truncated text in /v1/sections, /v1/query, /v1/answer, and /v1/query/multi, which can ground answers on corrupted evidence instead of failing or dropping the section.

Representative fix
- raw, _ := io.ReadAll(rc)
- _ = rc.Close() // best-effort close
- content = string(raw)
+ raw, readErr := io.ReadAll(rc)
+ _ = rc.Close() // best-effort close
+ if readErr == nil {
+     content = string(raw)
+ }

Also applies to: 538-542, 835-839, 1077-1081

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/api/server.go` around lines 395 - 399, The ReadAll result is used
even when io.ReadAll returns an error; update the rc handling (the block after
d.Storage.Get returning err == nil where variables rc, raw, content are used) to
check the error returned by io.ReadAll, close rc best-effort, and discard the
section by setting content = "" (or skip adding the section) when ReadAll fails
so truncated/corrupted bytes aren’t used; apply the same fix to the other
similar blocks referenced (the occurrences around lines 538-542, 835-839,
1077-1081) to consistently handle ReadAll errors.

Comment on lines 169 to 173
rc, _, getErr := h.storage.Get(r.Context(), sec.ContentRef)
if getErr == nil {
raw, _ := io.ReadAll(rc)
rc.Close()
_ = rc.Close() // best-effort close
content = string(raw)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't synthesize from partially read section bodies.

io.ReadAll errors are ignored here, so a short storage read still becomes citation / answer input. That can return a grounded answer from truncated evidence instead of skipping the section or failing the request.

Representative fix
- raw, _ := io.ReadAll(rc)
- _ = rc.Close() // best-effort close
- content = string(raw)
+ raw, readErr := io.ReadAll(rc)
+ _ = rc.Close() // best-effort close
+ if readErr != nil {
+     h.logger.Warn("answer: section content read failed", "section_id", sec.ID, "err", readErr)
+     continue
+ }
+ content = string(raw)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/handler/answer.go` around lines 169 - 173, The code ignores the
error returned by io.ReadAll after h.storage.Get, which can yield truncated
content; change the block around h.storage.Get / io.ReadAll to check the read
error (the second return from io.ReadAll), ensure rc is closed (best-effort)
even on read failure, and if io.ReadAll returns an error do not use the partial
raw content — either skip adding this section to content or return/propagate an
error; update the handling where variables rc, raw, getErr are used so only
successfully-read data is assigned to content.

Comment on lines 428 to +431
rc, _, getErr := h.storage.Get(r.Context(), sec.ContentRef)
if getErr == nil {
raw, _ := io.ReadAll(rc)
rc.Close()
_ = rc.Close() // best-effort close

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid returning partial section content on read errors.

This handler ignores io.ReadAll errors and still serializes whatever bytes were fetched. A short storage read becomes a 200 with truncated section text, which is indistinguishable from a real section body to clients.

Representative fix
- raw, _ := io.ReadAll(rc)
- _ = rc.Close() // best-effort close
- content = string(raw)
+ raw, readErr := io.ReadAll(rc)
+ _ = rc.Close() // best-effort close
+ if readErr == nil {
+     content = string(raw)
+ }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/handler/documents.go` around lines 428 - 431, The code currently
ignores errors from io.ReadAll and may return truncated section content; change
the read to capture the error (raw, readErr := io.ReadAll(rc)) and if readErr !=
nil then ensure rc is closed and propagate a read/storage error instead of
serializing partial bytes — e.g., log/report readErr and return an appropriate
HTTP error (or skip including the section body) so that h.storage.Get /
sec.ContentRef reads never produce a 200 with truncated data; update uses of
rc.Close() accordingly to always close in both success and error paths.

hallelx2 added 2 commits June 6, 2026 17:24
…6/117)

- lint: the CI 'lint' job runs BOTH golangci-lint and standalone staticcheck
  (honnef), and the latter reads //lint:ignore, not golangci's //nolint. Rather
  than juggle two directive formats, delete the 4 genuinely-unused funcs
  (treeWalkTraceTokenFromCitations, originMatches, isTimeout, isEncryptedPDFError)
  — both linters agree they're dead. (Re-add with callers when HAL-73/HAL-112 land.)
- race: pdftable.ExtractTables also mutates package-level state — a second pdftable
  race distinct from OpenBytes; serialize it on the same mutex. Stopgap; the root
  fix is HAL-118 (make pdftable concurrency-safe).

golangci-lint + standalone staticcheck: 0 issues. go build + unit tests green.
The ubuntu-only 'Verify go.mod/go.sum are tidy' CI step requires the canonical
format go mod tidy emits: a blank line between the 'go' and 'toolchain'
directives. My earlier manual insertion of the toolchain line omitted it.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/parser/pdf.go (1)

1605-1637: ⚠️ Potential issue | 🔴 Critical

Fix: safeExtractTables can deadlock the global pdftableOpenMu on timeout/hang or panic

  • safeExtractTables locks pdftableOpenMu immediately before page.ExtractTables(settings), but on timeout the caller returns while the goroutine keeps running; if ExtractTables wedges, the global lock stays held and blocks later openPDFBytes/table extraction.
  • If page.ExtractTables panics, the goroutine’s recover runs but the mutex unlock is not deferred, so pdftableOpenMu.Unlock() is skipped and the mutex remains permanently locked.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/parser/pdf.go` around lines 1605 - 1637, safeExtractTables currently
locks pdftableOpenMu and then calls page.ExtractTables without deferring the
unlock, which can permanently lock the mutex on panic or timeout; fix by moving
pdftableOpenMu.Lock() into the goroutine and immediately calling defer
pdftableOpenMu.Unlock() (before invoking page.ExtractTables) so the mutex is
always released on return/recover, and ensure the goroutine still sends result
on the done channel; keep references to pdftableOpenMu, safeExtractTables,
page.ExtractTables, and the done/result channel when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pkg/parser/pdf.go`:
- Around line 1605-1637: safeExtractTables currently locks pdftableOpenMu and
then calls page.ExtractTables without deferring the unlock, which can
permanently lock the mutex on panic or timeout; fix by moving
pdftableOpenMu.Lock() into the goroutine and immediately calling defer
pdftableOpenMu.Unlock() (before invoking page.ExtractTables) so the mutex is
always released on return/recover, and ensure the goroutine still sends result
on the done channel; keep references to pdftableOpenMu, safeExtractTables,
page.ExtractTables, and the done/result channel when making this change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 26e5083c-3a99-44a3-a0fd-78159420f3dd

📥 Commits

Reviewing files that changed from the base of the PR and between 7407704 and 18b5c1d.

📒 Files selected for processing (5)
  • go.mod
  • internal/api/treewalk.go
  • internal/middleware/cors.go
  • pkg/ingest/ingest.go
  • pkg/parser/pdf.go
💤 Files with no reviewable changes (3)
  • internal/middleware/cors.go
  • pkg/ingest/ingest.go
  • internal/api/treewalk.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod

@hallelx2 hallelx2 merged commit 2051693 into main Jun 6, 2026
8 checks passed
@hallelx2 hallelx2 deleted the halleluyaholudele/hal-106-treewalk-rename branch June 6, 2026 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants