feat(importer): store blog media as gitsheets attachments #109
Merged
Capture each referenced blog media item's bytes at import time, store as a gitsheets attachment scoped to the owning blog-posts record, rewrite the body's media URLs to /api/attachments/:key. Filename format: <caption-slug>-<MediaID>.<ext> or image-<MediaID>.<ext> when caption is empty. Runtime resizing deferred to #108. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this PR, blog post bodies referenced media via `https://codeforphilly.org/thumbnail/<id>/<dim>`, so every image would break at cutover when laddr is decommissioned. This PR captures the original bytes at import time as attachments scoped to the owning blog post and rewrites body references to /api/attachments/blog-posts/<slug>/<filename>.

Filename derivation:
- caption non-empty: slugify(caption).slice(0, 80) + '-' + mediaId + '.' + ext (caption-derived, readable)
- caption empty: 'image-' + mediaId + '.' + ext

Ext comes from the response Content-Type (image/jpeg → .jpg, image/png → .png, etc.). The /api/attachments/:key route already infers Content-Type from the file extension at serve time, so the extension is load-bearing.

Translator changes:
- translateBlogPost now returns { record, mediaAssets } rather than just BlogPost. mediaAssets describes the planned attachments (mediaId, captionSlug, ownerSlug, sourceUrl).
- Markdown body emits placeholder URLs (`cfp-media:<mediaId>`) for Media items; the importer's pre-fetch phase substitutes them with final URLs once the file extension is known.
- Embed HTML gets regex-scanned for codeforphilly.org/(thumbnail|media)/<id>/... URLs; those become placeholders too. Third-party URLs (YouTube iframes, external sites) pass through untouched.

Importer changes:
- After translation, fetches every distinct (mediaId, owner) pair in parallel with concurrency=4. Failed fetches log a warning and drop the asset; the post still imports with the rest of its body.
- Inside the transact callback, for each post: BlobObject.writes each artifact's bytes into the git object DB, calls tx['blog-posts'].setAttachments(record, blobs), then upserts the record (with placeholder URLs already substituted).

Runtime thumbnail resizing is deferred to #108. This PR stores originals only; downsizing on demand is the SPA's problem today.
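The Embed scan described under the translator changes could look roughly like the sketch below. This is not the PR's code: the function name `placeholderizeEmbed`, the exact regex, and the assumption that media IDs are purely numeric are all illustrative. It only shows how first-party thumbnail/media URLs can be swapped for `cfp-media:<mediaId>` placeholders while third-party URLs are left alone.

```typescript
// Hypothetical sketch; the regex and names are assumptions, not taken from the PR.
// Matches codeforphilly.org/(thumbnail|media)/<id> plus any trailing path segment,
// stopping at whitespace, quotes, angle brackets, or a closing parenthesis.
const FIRST_PARTY_MEDIA =
  /https?:\/\/codeforphilly\.org\/(?:thumbnail|media)\/(\d+)(?:\/[^\s"'<>)]*)?/g;

function placeholderizeEmbed(html: string): { html: string; mediaIds: string[] } {
  const mediaIds: string[] = [];
  const out = html.replace(FIRST_PARTY_MEDIA, (_match: string, id: string) => {
    // Record each media ID once, in order of first appearance.
    if (mediaIds.indexOf(id) === -1) mediaIds.push(id);
    return `cfp-media:${id}`;
  });
  return { html: out, mediaIds };
}
```

Because the pattern is anchored on the codeforphilly.org host, a YouTube iframe `src` or any other external URL never matches and passes through unchanged.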
Tests: 11 translator cases including placeholder emission, caption fallback, interleaved item classes, Order sorting, unknown class warning, Embed URL scan + third-party pass-through. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All 6 validation checkboxes ticked. Notes covers the translator return-shape refactor ripple, the doubled payload from include=*, the cross-post MediaID dedup via git object DB hashing, and the split-join substitution choice. Follow-ups: runtime thumbnail service (#108), featuredImageKey wiring (deferred), lazy body loading (#45). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
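For readers unfamiliar with the split-join substitution mentioned in the notes, here is a minimal sketch of the idea. The function name, the `{ mediaId, url }` input shape, and the longest-ID-first ordering are assumptions for illustration; the PR does not say how (or whether) it handles one placeholder being a prefix of another.

```typescript
// Hypothetical sketch of split-join placeholder substitution (names are assumptions).
interface FinalUrl {
  mediaId: string;
  url: string;
}

function substitutePlaceholders(body: string, finalUrls: ReadonlyArray<FinalUrl>): string {
  // Replace longer IDs first so `cfp-media:12` cannot clobber part of `cfp-media:123`.
  const ordered = finalUrls.slice().sort((a, b) => b.mediaId.length - a.mediaId.length);
  let out = body;
  for (const { mediaId, url } of ordered) {
    // split/join treats both the placeholder and the URL as literal strings.
    out = out.split(`cfp-media:${mediaId}`).join(url);
  }
  return out;
}
```

One general advantage of split/join over `String.prototype.replace` is that nothing needs regex escaping and `$`-sequences in the replacement string are not interpreted; whether that was the PR's motivation is not stated here.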
This was referenced May 30, 2026
Summary
Before this PR, blog post bodies referenced media via `https://codeforphilly.org/thumbnail/<id>/<dim>`, so every image breaks at cutover when laddr is decommissioned. This PR captures the original bytes at import time as attachments scoped to the owning blog post and rewrites body references to `/api/attachments/blog-posts/<slug>/<filename>`.
Filenames are human-readable when captions are present:
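As a rough illustration of the naming rule (`<caption-slug>-<MediaID>.<ext>`, or `image-<MediaID>.<ext>` when the caption is empty), the sketch below derives a filename from a caption, a media ID, and a response Content-Type. The `slugify` helper is a stand-in, not the project's own; only the jpeg and png mappings come from the commit message, and the gif/webp entries, the `null` return for unknown types, and the fallback to `image` when a caption slugifies to nothing are assumptions.

```typescript
// Hypothetical sketch; slugify and the extension table are assumptions.
const EXT_BY_TYPE = new Map<string, string>([
  ["image/jpeg", "jpg"], // stated in the commit message
  ["image/png", "png"],  // stated in the commit message
  ["image/gif", "gif"],  // assumed
  ["image/webp", "webp"], // assumed
]);

// Stand-in slugifier: lowercase, collapse non-alphanumerics to "-", trim dashes.
function slugify(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function attachmentFilename(
  caption: string,
  mediaId: number,
  contentType: string,
): string | null {
  // Drop parameters such as "; charset=..." before looking up the extension.
  const mime = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  const ext = EXT_BY_TYPE.get(mime);
  if (ext === undefined) return null; // assumed behavior for unknown types
  const slug = slugify(caption).slice(0, 80);
  const stem = slug.length > 0 ? slug : "image";
  return `${stem}-${mediaId}.${ext}`;
}
```

Under these assumptions, `attachmentFilename("Hack Night at Comcast!", 1234, "image/jpeg")` yields `hack-night-at-comcast-1234.jpg`, and an empty caption with a PNG response yields `image-<MediaID>.png`.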
Runtime thumbnail resizing (so a 200×200 card doesn't pull a full original) is deferred to #108.
How it works
Failed media fetches log a warning and drop just the asset — the post still imports with the rest of its body.
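The fetch-and-drop behavior can be sketched as a small worker pool. Everything named here (`MediaAsset`, `FetchedAsset`, `fetchAllMedia`, the injected `fetchOne` and `warn` callbacks) is hypothetical; the commit message only states that distinct (mediaId, owner) pairs are fetched with concurrency 4 and that failures log a warning and drop the asset.

```typescript
// Hypothetical sketch of a concurrency-limited fetch that drops failed assets.
interface MediaAsset {
  mediaId: number;
  ownerSlug: string;
  sourceUrl: string;
}

interface FetchedAsset extends MediaAsset {
  bytes: Uint8Array;
  contentType: string;
}

async function fetchAllMedia(
  assets: MediaAsset[],
  fetchOne: (asset: MediaAsset) => Promise<FetchedAsset>,
  concurrency = 4,
  warn: (message: string) => void = console.warn,
): Promise<FetchedAsset[]> {
  // Keep one entry per distinct (owner, mediaId) pair.
  const seen = new Set<string>();
  const queue = assets.filter((a) => {
    const key = `${a.ownerSlug}/${a.mediaId}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });

  const results: FetchedAsset[] = [];

  // Each worker pulls from the shared queue until it is empty.
  async function worker(): Promise<void> {
    let asset = queue.shift();
    while (asset !== undefined) {
      try {
        results.push(await fetchOne(asset));
      } catch (err) {
        // A failed fetch drops only this asset; the caller still imports the post.
        warn(`media ${asset.mediaId} for ${asset.ownerSlug} failed, dropping: ${String(err)}`);
      }
      asset = queue.shift();
    }
  }

  const workerCount = Math.min(concurrency, queue.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // completion order, not input order
}
```

Injecting `fetchOne` keeps the sketch independent of any particular HTTP client; a real importer would wrap its own download call and pass it in.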
Test plan
Ship plan
After merge:
🤖 Generated with Claude Code