refactor: act on the OCH duplication scan (providers, defineTool, cross-package single-sourcing)#280
Merged
Merged
Conversation
Complete the single-file SQLite migration cleanup: delete dead code left by the two-backend removal and scrub every prior-backend name from shipping source, so `duckdb`/`ladybug`/`lbug`/`kuzu` appear only in decision history (ADRs, CHANGELOGs) and removal prose. Dead code deleted (~864 LOC): - storage/src/schema-ddl.ts: whole file, superseded by inline DDL, zero callers - cli analyze.ts: 345-LOC wide-column row decoder, superseded by payload-JSON rehydration through the store's typed listNodes/listEdges finders - storage/src/test-utils/conformance.ts: 448-LOC assertIGraphStoreConformance suite with zero callers; drop the false "SqliteStore opts into this suite" claim in interface.ts. The load-bearing assertGraphParity/rebuildFromStore parity harness is kept (sqlite-parity.test.ts uses it). Traces scrubbed to zero in live .ts source (9 files) plus 3 CI/acceptance scripts and 8 per-package READMEs/docs that misdescribed current behavior. Tests: fix one dead assertion (interface.test.ts store-path values), migrate ~40 fixture path literals and comments across 20 test files, and keep the sqlite-adapter.test.ts guard that asserts no .lbug/.duckdb sidecar ever reappears. No test file deleted. Enforcement: check-banned-strings.sh now hard-bans the four prior-backend names in packages/*/src (excluding tests + test-utils), fixes its own stale "LadybugDB is the default backend" comments, so the traces cannot regrow. Also fixes pack-determinism-audit.sh gating on the nonexistent .codehub/duck.db (acceptance gate 16 was a silent permanent SKIP; now gates on store.sqlite). Carries an in-flight fix: re-ingest cached scan.sarif on the analyze scan-skip fast-path so a replace-mode bulkLoad no longer wipes prior findings to zero (+ analyze-findings-survival.test.ts regression test). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e guard
Two correctness bugs from the tech-debt audit (both "green tests, wrong
production behavior"):
B1 [P0] — MCP pack_codebase shipped a hollow pack + divergent packHash.
callRealPackEngine called generatePack({...}, { store }), omitting the
provenance bundle (commit, repoOriginUrl, chunkerFiles, grammarCommits) the
CLI wires. The manifest preimage binds commit/repoOriginUrl/fileHash/
chonkie_version/grammar_commits, so an MCP-triggered pack produced an empty
ast-chunks.jsonl, a byte-range-free context-bom, commit="", and a packHash
that silently diverged from the CLI's for the same repo+commit. Fix: move
resolvePackProvenance from cli/commands/code-pack.ts into @opencodehub/pack
(new provenance.ts) so both entry points call it; MCP now passes ...provenance.
provenance.test.ts pins that omitting provenance changes the packHash and that
equal provenance yields an equal packHash across entry points.
B2 [P1] — traverse recursive CTE used an unanchored instr(path, id) cycle
guard. Node ids have no disambiguating suffix, so one id being a SUBSTRING of
another already on the path (Class:a.ts:Foo in Class:a.ts:FooBar) was read as
a revisit and pruned, dropping the node and its whole subtree — silently
under-reporting blast radius in impact/api_impact/verdict. Fix: anchor both
operands on comma delimiters so only a whole id counts as a revisit.
B2b [P1] — the final SELECT grouped by node_id with a bare path column, so
SQLite picked an arbitrary tied row on equal-depth paths (diamond graphs),
making the reported predecessor/path nondeterministic. Fix: rank by
(depth, path) via ROW_NUMBER and keep rank 1 (lexicographically smallest).
traverse-substring-id.test.ts covers both; verified it fails against the
pre-fix query and passes after. B5 (pack-determinism-audit.sh gating on the
nonexistent duck.db) was fixed in the prior commit.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-of-concept
Introduce @opencodehub/core-ops — a transport-free "capability" layer the MCP
tools and CLI commands share, so the byte-identical resolve/finder/filter/
projection logic (audit findings D4/D7) lives once and each surface is a thin
adapter that only maps the plain Output into its own transport.
New package @opencodehub/core-ops (deps: analysis + storage + core-types; leaf,
no cycle — both cli and mcp already depend on all three):
- capability.ts: Capability<Input,Output> = { id, execute(input, ctx) }, where
ctx carries the already-open { graph, temporal } store + resolved repoName.
Input validation and repo/store lifecycle stay at each surface's boundary for
now (the resolvers differ — MCP carries AMBIGUOUS_REPO the CLI does not);
unifying them behind a StoreProvider + defineTool/defineCommand factories is
the next step. CapabilityStore is the single seam the deferred A1 accessor
collapse will flip.
- string-or.ts: the one canonical stringOr (kills the D7 copy in both surfaces).
- caps/findings.ts: findingsCapability — the shared reader (listFindings push-
down of severity+ruleId), TS post-finder (severity="none", scanner, filePath
substring), and row projection. Zod-free: each surface keeps its own schema
(MCP raw-shape for the SDK, CLI commander flags), which was never the
duplicated part.
- caps/findings.test.ts: 6 unit tests over a fake CapabilityStore — the first
isolated coverage of that shared logic.
Rewire both surfaces to call findingsCapability, unchanged public behavior:
- cli/commands/findings.ts 111->63 LOC; keeps runFindings(opts) + storeFactory
seam; findings.test.ts stays green (3/3).
- mcp/tools/list-findings.ts 191->147 LOC; keeps runListFindings /
registerListFindingsTool signatures + the full next_steps/staleness envelope.
Verified: core-ops + cli findings tests green; every edited file typecheck-clean
and biome-clean. (The mcp package cannot be test-run in this sandbox — its zod
install is corrupted, unrelated to this change; list-findings.ts itself
typechecks clean.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Records the new workspace package (its analysis/storage/core-types workspace links + typescript/@types/node dev deps) in pnpm-lock.yaml. Companion to 51ba06e, which added the package but not its lockfile entry. Run via `pnpm install --no-frozen-lockfile`; no other dependency changed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ix D15 schema staleness Lift NODE_COLUMNS + RELATION_COLUMNS to a new core-types/node-columns.ts and have storage/column-encode.ts and mcp/resources/repo-schema.ts import the one copy; point storage/relations.ts at core-types RELATION_TYPES. Fixes the D15 staleness bug where the MCP schema resource advertised only 26 of 73 logical node columns to SQL-authoring agents (now 73). Adds drift-guard tests asserting storage rosters deep-equal core-types. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed capabilities Fold the register-tool + withStore + try/catch + next-steps/staleness envelope boilerplate (audit A8) into one defineTool factory, parameterized by a capability + args-projector + presenter. Convert dependencies/license_audit/project_profile to thin adapters over new core-ops capabilities; retire their local stringOr copies. Wire names, tool count (29), and annotations unchanged, so the server contract test stays green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extract the near-identical extractCalls/extractDefinitions/extractHeritage loop shells into extractCallsGeneric + extractDefinitionsGeneric + extractHeritageRefBased in extract-helpers.ts, each config-driven per provider (receiver strategies, kind resolvers, promote/exported rules, heritage rules). Hoist findNameInside + qualifiedForCapture (13 copies). python defs + bespoke heritage (csharp/go/ts/java) kept custom; extractImports left per-language. Adds a characterization harness (16 providers x 4 extractors, value-locking golden) so graphHash byte-identity is enforced on every conversion. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…apability Lift the byte-identical fetchProcessParticipation reader out of the CLI and MCP context implementations into a shared contextCapability. resolveTarget + CALLS traversal stay per-surface (they genuinely differ); the MCP presenter (owner/cochange/confidence/buckets) stays MCP-side. Output envelopes unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nguage Replace the duplicated inferLanguageFromFile extension switch in cross-file/mro/ accesses with the existing detectLanguage helper (fixes a latent bug where cross-file's local switch silently omitted cobol) and extract the cross-file/mro incremental carry-forward loop into incremental-helper.carryForwardEdges. Adds an extension-parity pin + cobol/multi-dot determinism fixtures. Documents in DART_QUERY why dart calls are intentionally uncaptured (grammar has no invocation node). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ount Seven durable lessons from the duplication-scan remediation (characterization harness, config-factory collapse, grammar-reachability, jscpd overcounting, dead-guard unification, template-literal DSL comment). Correct ROADMAP edge-kind count 23->25. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
packages/core-ops landed without its scope being enumerated, so commitlint rejects the two core-ops-scoped commits already on this branch. Add the scope per the config's own 'extend when new workspace packages land' note. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Merged
theagenticguy
pushed a commit
that referenced
this pull request
Jul 4, 2026
🤖 Automated release via release-please --- <details><summary>root: 0.10.8</summary> ## [0.10.8](root-v0.10.7...root-v0.10.8) (2026-07-04) ### Refactoring * act on the OCH duplication scan (providers, defineTool, cross-package single-sourcing) ([#280](#280)) ([337dc5e](337dc5e)) </details> <details><summary>cli: 0.10.8</summary> ## [0.10.8](cli-v0.10.7...cli-v0.10.8) (2026-07-04) ### Refactoring * act on the OCH duplication scan (providers, defineTool, cross-package single-sourcing) ([#280](#280)) ([337dc5e](337dc5e)) </details> --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Acts on the OCH duplication scan (
jscpd— 3,450 dup lines, 4.38%). Net −1,110 source lines across the ingestion providers, the MCP tool layer, the pipeline phases, and cross-package rosters — withgraphHashbyte-identity preserved throughout (a new characterization harness enforces it) and every scan finding addressed.What changed, by finding
extractCalls/extractDefinitions/extractHeritageloop shells intoextractCallsGeneric/extractDefinitionsGeneric/extractHeritageRefBasedinextract-helpers.ts, each config-driven per provider (receiver strategies, kind resolvers, promote/exported rules, heritage rules). Hoisted the two byte-identical private helpers (13 copies).pythondefs and the bespoke heritage providers (csharp/go/ts/java) stay custom;extractImportsis irreducibly per-language and left alone.defineToolfactory folding the register +withStore+ try/catch + next-steps/staleness boilerplate (audit A8). Converteddependencies/license_audit/project_profile(and shared thecontextPROCESS_STEP reader) to@opencodehub/core-opscapabilities. Wire names, tool count (29), and annotations unchanged.cross-file/mro/accessesthrough the existing canonicaldetectLanguage(fixes a latent bug: cross-file's local switch silently omitted cobol) and extracted the incremental carry-forward loop intocarryForwardEdges.NODE_COLUMNS+ the relation roster intocore-types— fixes the D15 staleness bug where the MCP schema resource advertised only 26 of 73 logical node columns to SQL-authoring agents (now 73).Safety
characterization.test.ts): 16 providers × 4 extractors, value-locking golden, negative-self-check-proven. Gated every provider conversion.incremental-determinism(graphHash full-vs-incremental) 4/4 throughout;mise run checkgreen; two independent adversarial reviews found no drift.DART_QUERY: dart's tree-sitter grammar has no invocation node, so a sound single-query capture is impossible and an unsound one would pollute CALLS-derived features with field-read false positives. Same class as the roadmap's Rust/Swift SCIP gaps.Test plan
mise run check(lint + typecheck + all package tests + banned-strings) — exit 0characterization.test.ts65/65 ·incremental-determinism.test.ts4/4server.test.ts29-tool roster contract intact🤖 Generated with Claude Code