File-backed mmap for XNNPACK packed weights (#19862)
@doggeral has exported this pull request. If you are a Meta employee, you can view the originating Diff in D106673663.
Summary:
Add file-backed mmap support to `XNNWeightsCache` so that packed weight allocations go to a `MAP_SHARED` file instead of dirty heap. After `msync(MS_ASYNC)`, pages become clean file-backed and drop out of iOS `phys_footprint`.

## How it works

1. `set_packed_cache_path()` configures the cache file path via `BackendOptions`.
2. `initialize_for_runtime()` opens the cache file.
3. Each `reserve_space()` call extends the file via `ftruncate` and creates a `MAP_SHARED` mmap region, so XNNPACK packs weights directly into file-backed pages.
4. `finalize_for_runtime()` calls `msync(MS_ASYNC)` on newly added regions only (incremental sync), making pages clean.
5. On Windows, mmap is unavailable, so all code paths fall back to heap allocation automatically (`packed_file_fd_` stays -1).

## Expected savings

~400 MB of packed weights move from dirty heap to clean file-backed pages (0 `phys_footprint` on iOS).

Differential Revision: D106673663
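For readers who want to see the mechanics of steps 2 to 5 in isolation, here is a minimal POSIX-only sketch. It is not the code from this PR: the class `PackedWeightArena` and its methods `reserve()`, `finalize()`, and `file_backed()` are hypothetical names, and the choices of `O_TRUNC`, mode `0600`, and rounding each region up to a whole page are assumptions made only to keep the example self-contained. The heap fallback is taken whenever the descriptor is -1, mirroring the behavior described for `packed_file_fd_`.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical illustration of the flow described above; not XNNWeightsCache.
class PackedWeightArena {
 public:
  // Step 2: open (or create) the cache file. An empty path or a failed
  // open() leaves fd_ at -1, which selects the heap fallback (step 5).
  explicit PackedWeightArena(const std::string& path) {
    if (!path.empty()) {
      fd_ = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
    }
  }
  PackedWeightArena(const PackedWeightArena&) = delete;
  PackedWeightArena& operator=(const PackedWeightArena&) = delete;

  ~PackedWeightArena() {
    for (const Region& r : regions_) {
      if (r.file_backed) {
        ::munmap(r.ptr, r.len);
      } else {
        std::free(r.ptr);
      }
    }
    if (fd_ >= 0) ::close(fd_);
  }

  // Step 3: grow the file with ftruncate and map the new tail MAP_SHARED,
  // so writes through the returned pointer land in file-backed pages.
  void* reserve(std::size_t n) {
    assert(n > 0);
    if (fd_ >= 0) {
      const std::size_t page = static_cast<std::size_t>(::sysconf(_SC_PAGESIZE));
      // Assumption: round up to whole pages so every mmap offset stays aligned.
      const std::size_t len = (n + page - 1) / page * page;
      if (::ftruncate(fd_, static_cast<off_t>(file_size_ + len)) == 0) {
        void* p = ::mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                         fd_, static_cast<off_t>(file_size_));
        if (p != MAP_FAILED) {
          file_size_ += len;
          regions_.push_back({p, len, true});
          return p;
        }
      }
    }
    // Step 5: heap fallback when no file is available or mapping failed.
    void* p = std::malloc(n);
    if (p != nullptr) regions_.push_back({p, n, false});
    return p;
  }

  // Step 4: incremental sync. Only regions added since the previous call
  // are passed to msync(MS_ASYNC). Returns how many regions were synced.
  std::size_t finalize() {
    std::size_t synced = 0;
    for (; next_unsynced_ < regions_.size(); ++next_unsynced_) {
      const Region& r = regions_[next_unsynced_];
      if (r.file_backed && ::msync(r.ptr, r.len, MS_ASYNC) == 0) ++synced;
    }
    return synced;
  }

  bool file_backed() const { return fd_ >= 0; }

 private:
  struct Region {
    void* ptr;
    std::size_t len;
    bool file_backed;
  };
  int fd_ = -1;
  std::size_t file_size_ = 0;
  std::size_t next_unsynced_ = 0;
  std::vector<Region> regions_;
};
```

A caller would construct the arena with the configured cache path, call `reserve()` once per packed buffer, write the packed weights through the returned pointer, and call `finalize()` after each batch of packing. Tracking `next_unsynced_` is what keeps the sync incremental: regions that were already handed to `msync` are skipped on later calls.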