fix: stream uploads from disk instead of holding binaries in memory#9
Merged
Conversation
The upload pipeline materialized the whole binary up to three times:
prepareFileForUpload read it into a File (with .app dirs zipped in
memory first), the TUS path copied it again via Buffer.from(await
file.arrayBuffer()), and the Backblaze simple path made a third copy.
A 500 MB binary peaked at 2.18 GB RSS; large iOS zips could OOM CI
runners.
Everything now streams from disk via a small UploadSource descriptor
({ diskPath, name, size, contentType }):
- .app dirs are zipped to a temp file through yazl's output stream
(cleaned up in a finally) instead of being buffered
- SHA-256 is computed from a read stream
- the Supabase TUS upload feeds tus-js-client a createReadStream with
uploadSize — it buffers one 6 MB chunk at a time
- the Backblaze large path uses the existing readFileChunk disk reads
(previously dead code — the in-memory File always shadowed it)
- the Backblaze simple path keeps one transient buffer, bounded by the
server only choosing that strategy for small binaries (S3 pre-signed
PUTs reject chunked encoding, ruling out a stream body)
Peak RSS uploading a 500 MB binary: 2.18 GB before, 165 MB after.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fourth deferred item from the #3 review: the upload pipeline held the entire binary in memory up to three times simultaneously —
prepareFileForUploadread it into aFile(zipping.appdirs in memory first), the Supabase TUS path copied it again viaBuffer.from(await file.arrayBuffer())(ironic for a "resumable/chunked" protocol), and the Backblaze simple path made a third copy. Large iOS zips could OOM constrained CI runners.Everything now streams from disk via a small
UploadSourcedescriptor ({ diskPath, name, size, contentType }):.appdirectories are zipped to a temp file through yazl's output stream (removed in afinally) instead of being buffered wholecreateReadStreamwithuploadSize— it buffers one 6 MB chunk at a timereadFileChunkdisk reads, which were dead code before (the in-memoryFilealways shadowed them)Measured result
Peak RSS uploading a 500 MB binary (
/usr/bin/time -l, same machine, same mock API):Test plan
pnpm test122/122 passing (covers the dedup short-circuit and the full--ignore-sha-checkupload attempt)pnpm buildandpnpm lintclean; also verified clean under--strictso it composes with fix: enable TypeScript strict mode and type-check tests in CI #7🤖 Generated with Claude Code