Add minimal wheel build mode by JacobSzwejbka · Pull Request #19899 · pytorch/executorch

JacobSzwejbka · 2026-05-30T05:25:23Z

Adds an opt-in EXECUTORCH_BUILD_MINIMAL=1 wheel build mode for packaging the Python EXIR export path and flatc without runtime pybindings, kernels, backend packages, headers, examples, or devtools.

Also adds a pull-request CI smoke test that builds the minimal wheel, checks that excluded runtime/backend content is absent, installs it in a clean venv, and exports MobileNetV2 to a .pte.
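The "excluded content is absent" check described above can be sketched in a few lines of Python. This is only an illustration, not the contents of `.ci/scripts/test_minimal_wheel.sh`: the deny-list prefixes and both helper names are assumptions made for the example, since the script itself is not shown in this thread.

```python
import io
import zipfile

# Hypothetical deny-list; the prefixes the real smoke test checks are not shown here.
FORBIDDEN_PREFIXES = (
    "executorch/backends/",
    "executorch/kernels/",
    "executorch/devtools/",
    "executorch/examples/",
)


def forbidden_entries(wheel, prefixes=FORBIDDEN_PREFIXES):
    """Return archive members whose path starts with any forbidden prefix."""
    with zipfile.ZipFile(wheel) as zf:
        # str.startswith accepts a tuple of prefixes.
        return [name for name in zf.namelist() if name.startswith(prefixes)]


def make_fake_wheel(names):
    """Build an in-memory zip with empty members, standing in for a real wheel."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, "")
    buf.seek(0)
    return buf


clean = make_fake_wheel(["executorch/exir/__init__.py", "executorch/data/bin/flatc"])
dirty = make_fake_wheel(["executorch/exir/__init__.py", "executorch/backends/xnnpack/__init__.py"])
print(forbidden_entries(clean))
print(forbidden_entries(dirty))
```

A wheel is a zip archive, so `forbidden_entries` can be pointed at a real `.whl` path as well as at the in-memory stand-in used here.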

Local result: the minimal macOS arm64 wheel is 2.02 MiB compressed (~9 MiB uncompressed); the MobileNetV2 .pte export is 13,995,880 bytes (13.35 MiB). The same MobileNetV2 export using the published executorch 1.3.1 wheel also produced a 13,995,880-byte .pte.
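For readers checking the numbers, the byte-to-MiB conversion is a division by 1024 × 1024. The helper name below is made up for the example.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


pte_bytes = 13_995_880  # size of the exported .pte reported above
print(f"{to_mib(pte_bytes):.2f} MiB")  # prints "13.35 MiB"
```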

Authored with Claude.

pytorch-bot · 2026-05-30T05:25:27Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19899

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job, 4 Unrelated Failures, 1 Unclassified Failure

As of commit 6322992 with merge base fb3f6eb:

NEW FAILURE - The following job has failed:

Lint / link-check / lint-urls (gh)
Process completed with exit code 1.

UNCLASSIFIED FAILURE - DrCI could not classify the following job because the workflow did not run on the merge base. The failure may be pre-existing on trunk or introduced by this PR:

trunk / test-arm-backend-vkml (test_pytest_models_vkml) / linux-job (gh) (this job did not run on the merge base, so DrCI cannot tell whether the failure is pre-existing)
RuntimeError: Command docker exec -t 14fe4342f02ecf2fec1d775d34d0c0fbf4a53482a81add5f1690d311e089b9d9 /exec failed with exit code 1

CANCELLED JOB - The following job was cancelled. Please retry:

trunk / test-arm-backend-ethos-u (test_pytest_models_ethos_u85) / linux-job (gh)
##[error]The operation was canceled.

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / test-static-llama-qnn-linux (stories_110m) / linux-job (gh) (trunk failure)
test_llama_stories_110m
pull / unittest / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / unittest-editable / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.
trunk / unittest-release / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-05-30T05:26:12Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

JacobSzwejbka · 2026-05-30T05:44:04Z

@claude review this PR

claude · 2026-05-30T05:44:48Z

Claude finished @JacobSzwejbka's task in 2m 7s

Reviewing PR #19899: Add minimal wheel build mode

Read changed files and gather context
Review setup.py changes
Review .ci/scripts/test_minimal_wheel.sh
Review .github/workflows/pull.yml changes
Review exir/_serialize/_flatbuffer.py changes
Review README-wheel.md changes
Post review feedback

Summary

Well-structured PR that introduces an opt-in EXECUTORCH_BUILD_MINIMAL=1 wheel build mode. The separation of concerns is clean — minimal build disables all backends/kernels/pybindings and only builds flatbuffers_ep, then the package list is trimmed to just the export path. The smoke test is thorough (content verification + end-to-end MobileNetV2 export). A few observations below.

Feedback

1. Potential race in CI: `cmake_cache.is_enabled()` after minimal build (setup.py:889–917)

After --target flatbuffers_ep is built, the code falls through to additional cmake_cache.is_enabled(...) checks (lines 889–917). Since _minimal_cmake_flags() explicitly sets all those options to OFF, this is safe in practice — but the if minimal_build branch only covers the build target selection, not the subsequent target additions. Consider adding an early return or else guard to make it immediately clear that minimal builds skip all subsequent target additions:

if minimal_build:
    cmake_build_args += ["--target", "flatbuffers_ep"]
else:
    if cmake_cache.is_enabled("EXECUTORCH_BUILD_PYBIND"):
        cmake_build_args += ["--target", "portable_lib"]
        ...
    # rest of target additions

This would prevent any future additions below the if/elif from accidentally applying to minimal builds.

2. `_is_minimal_build()` called at module level (setup.py:932)

setup_kwargs = {}
if _is_minimal_build():
    setup_kwargs["packages"] = _minimal_packages()

This is evaluated at import time. If someone imports setup.py for inspection (unlikely but possible in tooling), the env var state at import time determines the result. This is fine for the setup.py bdist_wheel use case, just noting the coupling.
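One way to loosen that coupling, sketched here under assumptions, is to compute the keyword arguments in a function that takes the environment as a parameter. The function names, the rule that only the value `"1"` enables the mode, and the placeholder package list are all illustrative; the PR's actual `_is_minimal_build()` and `_minimal_packages()` are not reproduced in this thread.

```python
import os


def is_minimal_build(environ=None):
    """Return True when the opt-in variable is set to "1" (assumed parsing rule)."""
    env = os.environ if environ is None else environ
    return env.get("EXECUTORCH_BUILD_MINIMAL", "0") == "1"


def build_setup_kwargs(environ=None, minimal_packages=("executorch.exir",)):
    """Compute extra setup() keyword arguments on demand instead of at import time."""
    kwargs = {}
    if is_minimal_build(environ):
        # Placeholder package list; the real allowlist lives in setup.py.
        kwargs["packages"] = list(minimal_packages)
    return kwargs


print(build_setup_kwargs({"EXECUTORCH_BUILD_MINIMAL": "1"}))
print(build_setup_kwargs({}))
```

Passing the environment explicitly also makes both branches easy to exercise in a unit test without mutating `os.environ`.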

3. Test script: missing cleanup on failure (.ci/scripts/test_minimal_wheel.sh:15)

The script does rm -rf of the venvs at the start but doesn't trap for cleanup on failure. In CI this doesn't matter since the runner is ephemeral, but for local development it leaves .venv-minimal-build and .venv-minimal-test behind. Minor — not blocking.

4. Test script: `setup.py bdist_wheel` is deprecated (.ci/scripts/test_minimal_wheel.sh:36)

EXECUTORCH_BUILD_MINIMAL=1 python setup.py bdist_wheel

setup.py bdist_wheel is deprecated as of setuptools 70+ and will eventually be removed. The modern equivalent is pip wheel --no-build-isolation --no-deps . or python -m build --wheel. Since the rest of ExecuTorch CI may still use setup.py bdist_wheel, this is consistent with existing practice — but worth a note for future migration. The README documentation says pip wheel or pip install which is good.
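As a rough sketch of what the `pip wheel` variant could look like if driven from Python, the helper below only assembles the command and environment; it does not run a build. The helper name and the `dist` output directory are assumptions for the example.

```python
import os
import sys


def minimal_wheel_command(source_dir=".", wheel_dir="dist"):
    """Return (argv, env) for building the minimal wheel with pip instead of setup.py."""
    cmd = [
        sys.executable, "-m", "pip", "wheel",
        "--no-build-isolation",  # reuse the build deps already installed in this env
        "--no-deps",             # build only this project, not its dependencies
        "--wheel-dir", wheel_dir,
        source_dir,
    ]
    # Copy the current environment and switch on the opt-in build mode.
    env = dict(os.environ, EXECUTORCH_BUILD_MINIMAL="1")
    return cmd, env


cmd, env = minimal_wheel_command()
print(" ".join(cmd[1:]))
```

To actually build, the pair could be handed to `subprocess.run(cmd, env=env, check=True)` from the repository root.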

5. `_flatbuffer.py`: silent fallback order could mask packaging bugs (exir/_serialize/_flatbuffer.py:272–274)

for package, resource_name in (
    (__package__, _FLATC_RESOURCE_NAME),
    ("executorch.data.bin", "flatc"),
):

The fallback to executorch.data.bin is the right approach for the minimal wheel (where the old flatbuffers-flatc resource doesn't exist in the _serialize package). However, if both locations somehow contain a flatc binary (e.g. a packaging mistake), the first one always wins silently. This is fine behavior, just noting that the ordering is important — existing full wheels get their existing behavior unchanged, and the minimal wheel uses the second path.
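If the silent "first one wins" behavior ever becomes a concern, a variant that keeps the same ordering but logs when more than one candidate exists might look like the sketch below. It is not the code in `_flatbuffer.py`: the function name, the injected `is_present` predicate, and the first candidate's package and resource name are assumptions for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def pick_resource(candidates, is_present):
    """Return the first present (package, resource) pair, warning if several exist."""
    hits = [c for c in candidates if is_present(*c)]
    if not hits:
        raise FileNotFoundError(f"none of {list(candidates)} is available")
    if len(hits) > 1:
        # Same precedence as before, but the ambiguity is no longer silent.
        logger.warning("multiple flatc candidates found: %s; using %s", hits, hits[0])
    return hits[0]


# Hypothetical candidate list; the first entry stands in for (__package__, _FLATC_RESOURCE_NAME).
CANDIDATES = (
    ("executorch.exir._serialize", "flatbuffers-flatc"),
    ("executorch.data.bin", "flatc"),
)

# Simulate a minimal wheel, where only the second location exists.
only_data_bin = lambda package, name: package == "executorch.data.bin"
print(pick_resource(CANDIDATES, only_data_bin))
```

In real use, `is_present` could be backed by something like `importlib.resources.files(package).joinpath(name).is_file()`, wrapped to return False when the package cannot be imported.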

6. CI trigger conditions: `extension/flat_tensor` path check (.github/workflows/pull.yml:58)

The trigger uses contains(needs.changed-files.outputs.changed-files, 'extension/flat_tensor') — this will also match extension/flat_tensor_test or any other path containing that prefix. Since contains() is a substring match on the full file list, this is probably fine because there's unlikely to be a path that matches incorrectly, but extension/flat_tensor/ (with trailing slash) would be slightly more precise.
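The difference between the two patterns is easy to see with plain substring checks in Python, which behave like the substring match described above. The file list is invented for the example, and `extension/flat_tensor_test` is the hypothetical sibling path mentioned in the comment.

```python
def triggers(changed_files, needle):
    """Mimic a substring match over the whole changed-files string."""
    return needle in changed_files


# Invented changed-files string containing only the hypothetical sibling directory.
changed_files = "extension/flat_tensor_test/foo.py docs/README.md"

print(triggers(changed_files, "extension/flat_tensor"))   # True: prefix also matches the sibling
print(triggers(changed_files, "extension/flat_tensor/"))  # False: trailing slash excludes it
```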

7. `_minimal_packages()`: `extension.flat_tensor` included but CMake flag is OFF (setup.py:115, 151–152)

_minimal_cmake_flags() sets EXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=OFF but _minimal_packages() includes executorch.extension.flat_tensor and executorch.extension.flat_tensor.*. This means the Python source files for flat_tensor are shipped without the C++ extension being built. If the Python code in extension/flat_tensor imports native bindings, this could cause import errors. If it's pure Python or only used for AoT serialization, this is fine. Worth verifying that users of the minimal wheel won't hit an ImportError when this module is imported transitively.
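One low-cost way to do that verification is to try importing the shipped modules inside the clean test venv and collect any failures. The helper below is a generic sketch with a made-up name; the demo call uses standard-library and nonexistent module names so that it runs anywhere.

```python
import importlib


def import_failures(module_names):
    """Try to import each module and return {name: error message} for the failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            failures[name] = str(exc)
    return failures


# Demo with one importable and one deliberately missing module.
print(import_failures(["json", "no_such_module_for_demo"]))
```

In the smoke test's venv, the list would instead name the packages the minimal wheel ships, for example `executorch.extension.flat_tensor`, and the check would fail if the returned dictionary is non-empty.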

Overall

This is a clean, well-tested PR. The approach of gating on a single env var and explicitly disabling all non-essential CMake flags is the right pattern. The smoke test validates both negative (excluded content absent) and positive (export works end-to-end) properties. The 2 MiB wheel size vs. the full wheel is a nice win for export-only use cases.

The items above are minor — none are blocking. The most actionable one is item 7 (verifying flat_tensor doesn't pull in native deps when imported from the minimal wheel).

shoumikhin · 2026-05-30T06:26:50Z

Thanks for putting this together. The overall approach looks right: it is opt-in, defined upstream, and uses an allowlist of what to keep instead of a list of what to delete, so it will not quietly grow again on future versions. The keep set is correct, including the two parts that are easy to miss (extension/pytree and extension/flat_tensor), and reproducing the exact same .pte bytes is good evidence it works.

A few things I think we should improve:

1. The wheel still declares all the heavy dependencies. The minimal build trims the package files but not the dependency list in pyproject (coremltools, scikit-learn, pandas, hydra-core, omegaconf, plus the test-only deps expecttest/hypothesis/kgb/parameterized). So a normal pip install of the minimal wheel still pulls all of those in, which is most of the real install size. The CI test only passes because it uses pip install --no-deps. We already know the true minimal set, since it is the list installed in test_minimal_wheel.sh (flatbuffers, numpy, packaging, pyyaml, ruamel.yaml, sympy, tabulate, typing-extensions, torch). Could we set install_requires to that in minimal mode via the existing setup_kwargs, drop --no-deps from the test, and add a check that the wheel metadata does not list the heavy deps?
2. flatc keeps the wheel platform specific. The minimal build still ships flatc and compiles the flatbuffers target, so the wheel is not pure Python and the build still needs cmake and a compiler. On main the .pte serialization is pure Python, and the MobileNetV2 test never actually calls flatc (it is only needed for external-weight .ptd export). If we drop flatc, the minimal wheel could be pure Python (py3-none-any), so one wheel would work for every platform including Jetson, buildable with no toolchain. The tradeoff is losing external .ptd export. Could go either way, but it would be good to confirm the intent.
3. This is a build mode, not a published package. The minimal wheel has the same name and version as the full one, so pip install torch_tensorrt will still resolve to the full wheel on PyPI. For downstream consumers to get the slim version automatically we would need either a separate published distribution or for the consumer to build and bundle this. Probably worth a tracking issue.
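The metadata check suggested in the first point could be sketched roughly as follows. Wheel `METADATA` files use an email-header style layout with one `Requires-Dist` field per dependency, so the standard-library parser is enough. The heavy-dependency set is copied from the comment above; the function names and the sample metadata text are invented for the example.

```python
import re
from email.parser import Parser

# Heavy runtime dependencies named in the review comment above.
HEAVY_DEPS = {"coremltools", "scikit-learn", "pandas", "hydra-core", "omegaconf"}


def declared_requirements(metadata_text):
    """Return the normalized project names listed in Requires-Dist fields."""
    msg = Parser().parsestr(metadata_text)
    names = set()
    for req in msg.get_all("Requires-Dist") or []:
        # Keep only the project name, dropping version specifiers, extras and markers.
        name = re.split(r"[\s;<>=!~(\[]", req.strip(), maxsplit=1)[0]
        names.add(name.lower().replace("_", "-"))
    return names


def heavy_deps_declared(metadata_text):
    """Return the heavy dependencies that the metadata still declares."""
    return sorted(declared_requirements(metadata_text) & HEAVY_DEPS)


# Invented sample; a real check would read *.dist-info/METADATA from the built wheel.
SAMPLE = (
    "Metadata-Version: 2.1\n"
    "Name: executorch\n"
    "Version: 0.0.0\n"
    "Requires-Dist: numpy>=1.0\n"
    "Requires-Dist: pandas (>=2.0) ; extra == 'x'\n"
)
print(heavy_deps_declared(SAMPLE))
```

An empty result would be the passing condition for a minimal wheel once `install_requires` is trimmed.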

Minor:

The new CI job is Linux only (no aarch64/Jetson), and because it installs with --no-deps the dependency resolution is never tested.
The smoke test exports through exir directly. Running it once through the real torch_tensorrt output_format="executorch" path would cover the actual consumer.
The README could note that the minimal wheel currently still installs the full set of Python dependencies and is platform specific, so readers do not assume "minimal" means a small install yet.
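Whether a given wheel is platform specific can be read off its filename, whose last three dash-separated fields are the Python, ABI, and platform tags. The sketch below uses made-up helper names and illustrative filenames; they are not artifacts produced by this PR.

```python
def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) parsed from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last three fields are always the python, abi and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def is_pure_python(filename):
    """A pure-Python wheel has ABI tag 'none' and platform tag 'any'."""
    _, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


# Illustrative filenames only.
print(is_pure_python("executorch-1.3.1-py3-none-any.whl"))                    # True
print(is_pure_python("executorch-1.3.1-cp312-cp312-macosx_14_0_arm64.whl"))  # False
```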

None of this blocks the direction. Item 1 is small and the highest value.

JacobSzwejbka · 2026-05-30T17:26:50Z

> This is a build mode, not a published package. The minimal wheel has the same name and version as the full one, so pip install torch_tensorrt

I thought they wanted the minimal build for custom packages where they build the entire thing themselves, so building from source is fine.

Add export-only wheel build mode

058efc3

JacobSzwejbka requested a review from larryliu0820 as a code owner May 30, 2026 05:25

meta-cla Bot added the CLA Signed label May 30, 2026

JacobSzwejbka added 2 commits May 29, 2026 22:31

Rename export-only wheel mode to minimal

f074e3c

Fix minimal wheel smoke test

6322992

JacobSzwejbka marked this pull request as draft May 30, 2026 05:38

JacobSzwejbka changed the title ~~Add export-only wheel build mode~~ Add minimal wheel build mode May 30, 2026

JacobSzwejbka marked this pull request as ready for review May 30, 2026 05:44

JacobSzwejbka requested review from mergennachin and shoumikhin May 30, 2026 05:44

JacobSzwejbka wants to merge 3 commits into main from export-only-wheel