chore(types): prune unused bundled generated modules#960
Conversation
The generated_poc/bundled tree is built from self-contained inlined JSON schemas, so each bundled message re-emits its own copy of the shared provenance/verification/asset sub-schemas. That accounts for ~92% of the generated tree (~27.8k of ~30k class definitions), yet only one bundled module is imported anywhere: protocol/get_adcp_capabilities_response (via adcp.types.capabilities). consolidate_exports.py already excludes bundled from the public namespace, and the tree is not loaded at import adcp. Remove the 95 unreferenced bundled modules and add a post-generation prune step to generate_types.py so regeneration does not recreate them. Generation runs unchanged; the step only deletes unreferenced output, leaving every retained module byte-identical. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Right cleanup, correctly fenced. The bundled tree is generated dead weight that consolidate_exports.py already excludes from the public namespace — deleting it changes the committed source tree without touching the wire or the public API.
Things I checked
- Only one bundled module has a real importer, and it survives.
src/adcp/types/capabilities.py(L38–L128) imports solely frombundled/protocol/get_adcp_capabilities_response.py, which is inBUNDLED_KEEP. That module inlines its full$refgraph — zero intra-bundledimports — so deleting its 95 siblings can't dangle it. - No other runtime importer of
bundled.*.client.py:267and_generated.py:1888importget_adcp_capabilities_requestfrom the non-bundledprotocol/dir, not the deletedbundled/protocol/one. Thetests/test_migrate_v3_to_v4.py:142,781hits are codemod string fixtures (bundled.x), not imports. - Retained module is byte-identical.
get_adcp_capabilities_response.pyis not in the diff — only deletions plus the script. The "generation runs unchanged" claim holds; this is prune-after-generate, not a hand-edit to generated source. - Prune logic is sound.
relative_to(bundled_dir) in BUNDLED_KEEPis exact, parts-anchoredPathequality — no prefix false-match over-deletes the kept module. Empty-dir cleanup is deepest-first (key=len(d.parts), reverse=True);bundled/__init__.pyis kept so the root never empties. Idempotent, with anexists()guard.code-reviewer: clean. - Ordering is correct. Prune runs after post-generation fixes, before consolidation.
consolidate_exports.pyfiltersparts[0] != "bundled"(L234) and prefers non-bundled in its first-seen dedup, so removing the dead copies first can only reduce work — nothing downstream reads a pruned module. - No public-API surface change, no semver signal needed.
chore(types):is the right prefix.
Follow-ups (non-blocking — file as issues)
- Generation→prune integration not exercised end-to-end. Per the PR's own scope note,
make regenerate-schemashasn't been run to validate the generate-then-prune pass; the end-state and the prune function were verified directly (suite green, idempotent). The shipped artifact — the committed tree — is validated; the generator path that reproduces it is not. Run it once before the next regen relies on it. - One-time
SCHEMA_DELTAS.mdnoise.diff_generated_types.snapshot()walks the whole tree, so the first post-merge regeneration diffs the old full-bundled committed tree against the freshly-pruned one and emits a large one-time "removed" delta. Self-corrects on every subsequent run. Worth a line in the PR body so the next regenerator isn't alarmed. bundled/media_buyis still generated then pruned (PR already flags this). That split-generation pass is now dead weight inmake regenerate-schemasand can be skipped.
Minor nits (non-blocking)
- Stale docstring cross-ref.
src/adcp/decisioning/refine.py:57has a:class:reference toadcp.types.generated_poc.bundled.media_buy, which this PR deletes. Docstring only — no runtime impact — but the Sphinx xref now points at nothing. Repoint it at the non-bundledRefinementApplied.statussource.
27,791 fewer class definitions for one retained module is a notable ratio to have been carrying. Approving on the strength of the import audit plus code-reviewer clean.
The @adcp/sdk adcp-3.1 npm dist-tag is not published, so every storyboard runner job fails at install with ETARGET "No matching version found". Track the latest stable release via the latest dist-tag and keep the sticky adcp-3.0 leg as a backwards-compat floor. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Clean fix. Deletes 27.8k generated class definitions that were already excluded from the public namespace and never loaded at import adcp — the right cleanup, and it's data-driven (one bundled module is actually consumed, the other 95 are dead weight).
Things I checked
- Only one bundled module is load-bearing.
src/adcp/types/capabilities.py:38is the sole importer of the bundled tree, and it imports exactlybundled/protocol/get_adcp_capabilities_response— whichBUNDLED_KEEPretains. That module is self-contained (imports onlyadcp.types.base,_str_enum,pydantic— no intra-bundled refs), so deleting its 95 siblings can't break it. - Consolidation output is unchanged.
consolidate_exports.py:234and:270hard-filterparts[0] != "bundled", so the bundled tree never entered the consolidated namespace, the collision map, or the first-seen dedup ordering. Pruning beforeconsolidate_exports.pyruns is a no-op for its output. - Prune ordering is correct.
prune_unused_bundled_modules()runs afterapply_post_generation_fixes()and before consolidation (scripts/generate_types.py:610-614).post_generate_fixes.pypatchesbundled/media_buy/*, so it must run before the prune deletes them — it does. - Prune is idempotent and safe. Empty-dir cleanup only
rmdirs dirs wherenot any(directory.iterdir());bundled/andbundled/protocol/keep retained files and survive.restore_unchanged_files()skips non-existent paths, so it won'tgit checkout-resurrect pruned output. Re-running over a pruned tree removes 0. - No public-API / semver impact. Nothing reachable from
adcp.*changes — bundled was never on the public surface.chore(types):is the right prefix; no!needed. _generated.pydoes not import bundled — confirms the tree was never loaded at import time, as the PR claims.
Follow-ups (non-blocking — file as issues)
latestdist-tag is a moving target. Switching the storyboard matrixadcp-3.1→latestmeans a new@adcp/sdkpublish can turn the blockingstoryboard-required-gatered with no change in this repo. Intended (the comment says so) and the stickyadcp-3.0leg stays a reproducible floor — but worth a line in the PR body acknowledging the external-publish coupling.- Dead-weight split-generation pass. Per the PR body, regen still generates
bundled/media_buythen prunes it. Skipping that split-generation pass would speed upmake regenerate-schemas. Good follow-up.
Minor nits (non-blocking)
- Dangling docstring cross-ref.
src/adcp/decisioning/refine.py:57has a:class:ref toadcp.types.generated_poc.bundled.media_buy'sRefinementApplied.status— a package this PR deletes. There's no Sphinx config in the repo so nothing resolves it at build or runtime; purely cosmetic. Repoint it atRefinementStatusor the wire-enum doc on a future pass.
Test-plan note
The PR body is honest that a full make regenerate-schemas round-trip (generation→prune integration) hasn't been run end-to-end — only the end-state deletion plus a green suite and idempotent-prune were verified directly. The prune is the kind of thing that only bites on a clean regen, so exercising that path before the next schema bump is the real validation. Non-blocking: the retained tree is byte-identical and 5,824/5,824 pass.
Approving on the strength of the consolidation-exclusion invariant plus the self-contained kept module — the deletion can't reach anything live.
bokelley
left a comment
There was a problem hiding this comment.
Reviewed the generated bundled-module pruning and live import surface. Local focused checks passed: tests/test_code_generation.py, tests/test_decisioning_capabilities_submodule.py, and tests/test_asset_aliases_stable.py all passed (89 tests), plus a syntax/import smoke for _generated.py and the retained bundled capabilities module. I also attempted a full local scripts/generate_types.py run; it remained active in datamodel-code-generator for ~13 minutes, so I stopped it and relied on the successful CI schema-validation job for the full regeneration environment.
What this does, in plain terms
The library auto-generates Python type files from JSON schemas into
src/adcp/types/generated_poc/. One sub-folder,bundled/, had grown todominate that tree: ~92% of all generated classes lived there.
That folder is generated from "bundled" schemas, which inline their entire
reference graph. So every bundled message file re-emitted its own private copy
of the same shared sub-schemas (provenance, verification, assets, etc.) over and
over (e.g.
VerifyAgentappeared 3,762 times).The catch: almost none of it is used. Of ~97 bundled modules, exactly one
is imported by real code:
bundled/protocol/get_adcp_capabilities_response.py(via
adcp.types.capabilities, whose typed sub-models only exist in the inlinedform).
consolidate_exports.pyalready excludes the bundled tree from the publicnamespace, and it isn't even loaded at
import adcp.So this PR deletes the 95 unreferenced bundled modules and teaches the generator
not to recreate them.
What changed
bundled/modules (kept only the one consumed module +its package
__init__files).prune_unused_bundled_modules()step toscripts/generate_types.py(runs after post-generation fixes, before consolidation). Generation itself is
unchanged; the step only removes the unreferenced output afterward, so every
retained module stays byte-identical. Keep-set lives in the
BUNDLED_KEEPconstant.
Impact
make lint,make typecheck, and the fullpytestsuite all pass.Scope notes
import adcpmemory (the bundled tree was lazy and never loaded). Import-timeRSS is a separate concern handled by the
defer_buildwork in the el-pasoworktree.
bundled/media_buyand then prunes it. That split-generation pass is now dead weight and can be
skipped to speed up
make regenerate-schemas.make regenerate-schemasrun to exercise the generation→pruneintegration end-to-end has not been run yet; the end-state and the prune
function were verified directly (deletion + full suite green, idempotent prune).
🤖 Generated with Claude Code