Add minimal wheel build mode #19899

Open
JacobSzwejbka wants to merge 3 commits into main from export-only-wheel

@JacobSzwejbka
Contributor

@JacobSzwejbka JacobSzwejbka commented May 30, 2026

Adds an opt-in EXECUTORCH_BUILD_MINIMAL=1 wheel build mode for packaging the Python EXIR export path and flatc without runtime pybindings, kernels, backend packages, headers, examples, or devtools.

Also adds a pull-request CI smoke test that builds the minimal wheel, checks that excluded runtime/backend content is absent, installs it in a clean venv, and exports MobileNetV2 to a .pte.
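The "excluded content is absent" step can be pictured as a scan over the wheel's file list. The following is a minimal sketch, not the PR's actual script: the path prefixes and the `find_excluded_entries` and `make_fake_wheel` helpers are hypothetical, and the wheel is a made-up in-memory zip so the example runs without a real build.

```python
import io
import zipfile

# Hypothetical path prefixes standing in for the excluded categories;
# the real CI script may check different paths.
EXCLUDED_PREFIXES = (
    "executorch/backends/",
    "executorch/kernels/",
    "executorch/devtools/",
    "executorch/examples/",
)


def find_excluded_entries(wheel_file, prefixes=EXCLUDED_PREFIXES):
    """Return the archive members whose path starts with an excluded prefix."""
    with zipfile.ZipFile(wheel_file) as wheel:
        return [name for name in wheel.namelist() if name.startswith(tuple(prefixes))]


def make_fake_wheel(member_names):
    """Build an in-memory zip with empty members, standing in for a real wheel."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        for name in member_names:
            wheel.writestr(name, "")
    buffer.seek(0)
    return buffer


# Made-up file lists: one that looks like a minimal wheel, one that leaks a backend.
clean = make_fake_wheel(["executorch/exir/__init__.py", "executorch/data/bin/flatc"])
leaky = make_fake_wheel(["executorch/exir/__init__.py", "executorch/backends/demo/__init__.py"])

print(find_excluded_entries(clean))
print(find_excluded_entries(leaky))
```

A check like this fails the job when the returned list is non-empty, which makes any leaked package visible by name in the CI log.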

Local result: the minimal macOS arm64 wheel is 2.02 MiB compressed (~9 MiB uncompressed); the MobileNetV2 .pte export is 13,995,880 bytes (13.35 MiB). The same MobileNetV2 export using the published executorch 1.3.1 wheel also produced a 13,995,880-byte .pte.

Authored with Claude.

@pytorch-bot

pytorch-bot Bot commented May 30, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19899

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job, 4 Unrelated Failures, 1 Unclassified Failure

As of commit 6322992 with merge base fb3f6eb (image):

NEW FAILURE - The following job has failed:

UNCLASSIFIED FAILURE - DrCI could not classify the following job because the workflow did not run on the merge base. The failure may be pre-existing on trunk or introduced by this PR:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label May 30, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@JacobSzwejbka JacobSzwejbka marked this pull request as draft May 30, 2026 05:38
@JacobSzwejbka JacobSzwejbka changed the title Add export-only wheel build mode Add minimal wheel build mode May 30, 2026
@JacobSzwejbka
Contributor Author

@claude review this PR

@JacobSzwejbka JacobSzwejbka marked this pull request as ready for review May 30, 2026 05:44
@claude

claude Bot commented May 30, 2026

Claude finished @JacobSzwejbka's task in 2m 7s


Reviewing PR #19899: Add minimal wheel build mode

  • Read changed files and gather context
  • Review setup.py changes
  • Review .ci/scripts/test_minimal_wheel.sh
  • Review .github/workflows/pull.yml changes
  • Review exir/_serialize/_flatbuffer.py changes
  • Review README-wheel.md changes
  • Post review feedback

Summary

Well-structured PR that introduces an opt-in EXECUTORCH_BUILD_MINIMAL=1 wheel build mode. The separation of concerns is clean: the minimal build disables all backends, kernels, and pybindings and builds only flatbuffers_ep, and the package list is then trimmed to just the export path. The smoke test is thorough (content verification plus an end-to-end MobileNetV2 export). A few observations below.


Feedback

1. Fall-through to cmake_cache.is_enabled() checks after minimal build (setup.py:889–917)

After --target flatbuffers_ep is built, the code falls through to additional cmake_cache.is_enabled(...) checks (lines 889–917). Since _minimal_cmake_flags() explicitly sets all those options to OFF, this is safe in practice — but the if minimal_build branch only covers the build target selection, not the subsequent target additions. Consider adding an early return or else guard to make it immediately clear that minimal builds skip all subsequent target additions:

if minimal_build:
    cmake_build_args += ["--target", "flatbuffers_ep"]
else:
    if cmake_cache.is_enabled("EXECUTORCH_BUILD_PYBIND"):
        cmake_build_args += ["--target", "portable_lib"]
        ...
    # rest of target additions

This would prevent any future additions below the if/elif from accidentally applying to minimal builds.

2. _is_minimal_build() called at module level (setup.py:932)

setup_kwargs = {}
if _is_minimal_build():
    setup_kwargs["packages"] = _minimal_packages()

This is evaluated at import time. If someone imports setup.py for inspection (unlikely but possible in tooling), the env var state at import time determines the result. This is fine for the setup.py bdist_wheel use case, just noting the coupling.

3. Test script: missing cleanup on failure (.ci/scripts/test_minimal_wheel.sh:15)

The script does rm -rf of the venvs at the start but doesn't trap for cleanup on failure. In CI this doesn't matter since the runner is ephemeral, but for local development it leaves .venv-minimal-build and .venv-minimal-test behind. Minor — not blocking.

4. Test script: setup.py bdist_wheel is deprecated (.ci/scripts/test_minimal_wheel.sh:36)

EXECUTORCH_BUILD_MINIMAL=1 python setup.py bdist_wheel

setup.py bdist_wheel is deprecated as of setuptools 70+ and will eventually be removed. The modern equivalent is pip wheel --no-build-isolation --no-deps . or python -m build --wheel. Since the rest of ExecuTorch CI may still use setup.py bdist_wheel, this is consistent with existing practice — but worth a note for future migration. The README documentation says pip wheel or pip install which is good.

5. _flatbuffer.py: silent fallback order could mask packaging bugs (exir/_serialize/_flatbuffer.py:272–274)

for package, resource_name in (
    (__package__, _FLATC_RESOURCE_NAME),
    ("executorch.data.bin", "flatc"),
):

The fallback to executorch.data.bin is the right approach for the minimal wheel (where the old flatbuffers-flatc resource doesn't exist in the _serialize package). However, if both locations somehow contain a flatc binary (e.g. a packaging mistake), the first one always wins silently. This is fine behavior, just noting that the ordering is important — existing full wheels get their existing behavior unchanged, and the minimal wheel uses the second path.
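The first-match-wins behavior described here can be reproduced in isolation. This is a minimal sketch under stated assumptions: `find_first_resource`, `make_package`, and `demo` are hypothetical helpers, the `demo_serialize` and `demo_data_bin` packages are throwaway stand-ins for the two real locations, and the resource name `flatbuffers-flatc` is assumed from the description above rather than read from the source file.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def find_first_resource(candidates):
    """Return the first (package, resource_name) pair that exists, or None."""
    for package, resource_name in candidates:
        try:
            resource = importlib.resources.files(package).joinpath(resource_name)
        except ModuleNotFoundError:
            continue  # this package is not part of the installed layout
        if resource.is_file():
            return package, resource_name
    return None


def make_package(root, name, resources=()):
    """Create a throwaway top-level package holding empty resource files."""
    package_dir = Path(root) / name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    for resource_name in resources:
        (package_dir / resource_name).write_text("")


def demo():
    """Simulate a minimal-wheel layout, then a layout where both locations hold flatc."""
    order = [("demo_serialize", "flatbuffers-flatc"), ("demo_data_bin", "flatc")]
    with tempfile.TemporaryDirectory() as root:
        make_package(root, "demo_serialize")  # no bundled flatc here
        make_package(root, "demo_data_bin", ["flatc"])
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            minimal_layout = find_first_resource(order)
            # Add a copy to the first location: it now wins without any warning.
            (Path(root) / "demo_serialize" / "flatbuffers-flatc").write_text("")
            both_present = find_first_resource(order)
        finally:
            sys.path.remove(root)
            for name in ("demo_serialize", "demo_data_bin"):
                sys.modules.pop(name, None)
    return minimal_layout, both_present


minimal_layout, both_present = demo()
print(minimal_layout)
print(both_present)
```

If a packaging mistake of this kind ever needs to be caught, one option is to collect all matches instead of returning on the first and log when more than one location resolves.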

6. CI trigger conditions: extension/flat_tensor path check (.github/workflows/pull.yml:58)

The trigger uses contains(needs.changed-files.outputs.changed-files, 'extension/flat_tensor'), which also matches extension/flat_tensor_test or any other path containing that string. Because contains() is a substring match over the full file list, this is probably fine in practice, since no existing path is likely to match by accident, but extension/flat_tensor/ (with a trailing slash) would be slightly more precise.
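The difference the trailing slash makes is easy to see with a plain substring test. This is a sketch that mimics the string form of the workflow `contains()` in Python (a case-insensitive substring match); the file paths and the space-separated list format are made up for illustration and are not taken from the workflow.

```python
def contains(search, item):
    """Mimic the string form of the workflow contains(): case-insensitive substring test."""
    return item.lower() in search.lower()


# Hypothetical changed-file list; only a sibling directory is touched.
changed_files = "extension/flat_tensor_test/test_util.py exir/schema.py"

without_slash = contains(changed_files, "extension/flat_tensor")
with_slash = contains(changed_files, "extension/flat_tensor/")

print(without_slash)  # True: the sibling directory also matches
print(with_slash)     # False: only paths inside extension/flat_tensor/ match
```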

7. _minimal_packages(): extension.flat_tensor included but CMake flag is OFF (setup.py:115, 151–152)

_minimal_cmake_flags() sets EXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=OFF but _minimal_packages() includes executorch.extension.flat_tensor and executorch.extension.flat_tensor.*. This means the Python source files for flat_tensor are shipped without the C++ extension being built. If the Python code in extension/flat_tensor imports native bindings, this could cause import errors. If it's pure Python or only used for AoT serialization, this is fine. Worth verifying that users of the minimal wheel won't hit an ImportError when this module is imported transitively.
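One way to verify this is to import the module inside the clean venv that only has the minimal wheel installed and report any ImportError. The sketch below is an assumption about how such a check could look, not part of the PR: `find_import_failures` is a hypothetical helper, and the demo uses a standard-library module and a deliberately missing name so it runs anywhere.

```python
import importlib


def find_import_failures(module_names):
    """Import each module and return {name: error message} for those that fail."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as error:  # also covers ModuleNotFoundError
            failures[name] = str(error)
    return failures


# In the minimal-wheel venv the list would be ["executorch.extension.flat_tensor"].
# Here a stdlib module and a made-up missing module stand in for it.
failures = find_import_failures(["json", "no_such_native_binding_demo"])
print(sorted(failures))
```

This only catches failures raised at import time. A native binding that is imported lazily inside a function would still need to be exercised by a call, as the MobileNetV2 export does for the serialization path.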


Overall

This is a clean, well-tested PR. The approach of gating on a single env var and explicitly disabling all non-essential CMake flags is the right pattern. The smoke test validates both negative (excluded content absent) and positive (export works end-to-end) properties. The 2 MiB wheel size vs. the full wheel is a nice win for export-only use cases.

The items above are minor — none are blocking. The most actionable one is item 7 (verifying flat_tensor doesn't pull in native deps when imported from the minimal wheel).

@shoumikhin
Contributor

Thanks for putting this together. The overall approach looks right: it is opt-in, defined upstream, and uses an allowlist of what to keep instead of a list of what to delete, so it will not quietly grow again on future versions. The keep set is correct, including the two parts that are easy to miss (extension/pytree and extension/flat_tensor), and reproducing the exact same .pte bytes is good evidence it works.

A few things I think we should improve:

  1. The wheel still declares all the heavy dependencies. The minimal build trims the package files but not the dependency list in pyproject (coremltools, scikit-learn, pandas, hydra-core, omegaconf, plus the test-only deps expecttest/hypothesis/kgb/parameterized). So a normal pip install of the minimal wheel still pulls all of those in, which is most of the real install size. The CI test only passes because it uses pip install --no-deps. We already know the true minimal set, since it is the list installed in test_minimal_wheel.sh (flatbuffers, numpy, packaging, pyyaml, ruamel.yaml, sympy, tabulate, typing-extensions, torch). Could we set install_requires to that in minimal mode via the existing setup_kwargs, drop --no-deps from the test, and add a check that the wheel metadata does not list the heavy deps?

  2. flatc keeps the wheel platform specific. The minimal build still ships flatc and compiles the flatbuffers target, so the wheel is not pure Python and the build still needs cmake and a compiler. On main the .pte serialization is pure Python, and the MobileNetV2 test never actually calls flatc (it is only needed for external-weight .ptd export). If we drop flatc, the minimal wheel could be pure Python (py3-none-any), so one wheel would work for every platform including Jetson, buildable with no toolchain. The tradeoff is losing external .ptd export. Could go either way, but it would be good to confirm the intent.

  3. This is a build mode, not a published package. The minimal wheel has the same name and version as the full one, so pip install torch_tensorrt will still resolve to the full wheel on PyPI. For downstream consumers to get the slim version automatically we would need either a separate published distribution or for the consumer to build and bundle this. Probably worth a tracking issue.
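The metadata check suggested in item 1 could be sketched as follows. It reads the `Requires-Dist` headers from the wheel's `*.dist-info/METADATA` file and intersects them with the heavy dependencies named above. The helper names are hypothetical, and the wheel is a fabricated in-memory archive with placeholder metadata, not output from a real build.

```python
import io
import re
import zipfile
from email.parser import Parser

# Heavy and test-only dependencies listed in item 1.
HEAVY_DEPS = {
    "coremltools", "scikit-learn", "pandas", "hydra-core", "omegaconf",
    "expecttest", "hypothesis", "kgb", "parameterized",
}


def normalize(name):
    """Normalize a project name so that '-', '_' and '.' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def declared_requirements(wheel_file):
    """Return the normalized project names from the wheel's Requires-Dist headers."""
    with zipfile.ZipFile(wheel_file) as wheel:
        metadata_name = next(
            name for name in wheel.namelist() if name.endswith(".dist-info/METADATA")
        )
        metadata = Parser().parsestr(wheel.read(metadata_name).decode("utf-8"))
    names = set()
    for requirement in metadata.get_all("Requires-Dist") or []:
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
        if match:
            names.add(normalize(match.group(0)))
    return names


def heavy_deps_declared(wheel_file):
    """Return the heavy dependencies that the wheel still declares, sorted."""
    return sorted(declared_requirements(wheel_file) & {normalize(d) for d in HEAVY_DEPS})


def make_fake_wheel(requirements):
    """Build an in-memory wheel-like zip whose METADATA lists the given requirements."""
    lines = ["Metadata-Version: 2.1", "Name: executorch", "Version: 0.0.0"]
    lines += [f"Requires-Dist: {requirement}" for requirement in requirements]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr("executorch-0.0.0.dist-info/METADATA", "\n".join(lines) + "\n")
    buffer.seek(0)
    return buffer


print(heavy_deps_declared(make_fake_wheel(["numpy", "pandas>=2"])))  # ['pandas']
print(heavy_deps_declared(make_fake_wheel(["numpy", "ruamel.yaml"])))  # []
```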

Minor:

  • The new CI job is Linux only (no aarch64/Jetson), and because it installs with --no-deps the dependency resolution is never tested.
  • The smoke test exports through exir directly. Running it once through the real torch_tensorrt output_format="executorch" path would cover the actual consumer.
  • The README could note that the minimal wheel currently still installs the full set of Python dependencies and is platform specific, so readers do not assume "minimal" means a small install yet.
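Whether a built wheel is platform specific (item 2 and the README note) can be read directly from its filename, which ends in a python tag, an ABI tag, and a platform tag. A small sketch, with hypothetical helper names and made-up filenames:

```python
def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after the version
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    return tuple(parts[-3:])


def is_pure_python(filename):
    """A wheel tagged with ABI 'none' and platform 'any' installs on every platform."""
    _python_tag, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


# Made-up filenames for illustration; real build output may be named differently.
print(is_pure_python("executorch-1.3.1-py3-none-any.whl"))
print(is_pure_python("executorch-1.3.1-cp312-cp312-macosx_11_0_arm64.whl"))
```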

None of this blocks the direction. Item 1 is small and the highest value.
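Item 1 could look roughly like the sketch below. It is an illustration only: `minimal_setup_kwargs` is a hypothetical helper, the package list passed in is a placeholder for whatever `_minimal_packages()` returns, and the dependency names are the ones listed in item 1 with no version pins.

```python
# Dependency names taken from item 1 above; version constraints are omitted
# here and would need to match the project's real pins.
MINIMAL_INSTALL_REQUIRES = [
    "flatbuffers",
    "numpy",
    "packaging",
    "pyyaml",
    "ruamel.yaml",
    "sympy",
    "tabulate",
    "typing-extensions",
    "torch",
]


def minimal_setup_kwargs(minimal_build, packages):
    """Return extra setup() keyword arguments for the minimal build, else an empty dict."""
    kwargs = {}
    if minimal_build:
        kwargs["packages"] = list(packages)
        kwargs["install_requires"] = list(MINIMAL_INSTALL_REQUIRES)
    return kwargs


# Placeholder package list for illustration.
print(minimal_setup_kwargs(True, ["executorch.exir"])["install_requires"])
print(minimal_setup_kwargs(False, ["executorch.exir"]))
```

One caveat, stated as an assumption about this repository's configuration: if the dependencies are declared statically in pyproject.toml, setuptools may ignore `install_requires` passed from setup.py unless `dependencies` is listed as dynamic there.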

@JacobSzwejbka
Contributor Author

> This is a build mode, not a published package. The minimal wheel has the same name and version as the full one, so pip install torch_tensorrt

I thought they wanted the minimal build for the custom packages where they build the entire thing themselves. So building from source is fine.

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

2 participants