importer: scrape blog-post bodies from laddr's HTML (JSON endpoint doesn't surface Body)

## What's wrong

The cutover-blog importer landed in #84 / PR #101 and successfully imports 138 blog posts (titles, authors, summaries, publish dates), but **bodies come through empty**.

Root cause: laddr's `/blog?format=json` endpoint doesn't include a `Body` field at all: not in the list response, not with `?include=Body`, not on single-post fetches via `/blog/<id>?format=json`. The body only exists in the **rendered HTML** at `https://codeforphilly.org/blog/<numeric-id>`, inside `<div class="article-body">`.

Verified 2026-05-30 against the live laddr site:

```
$ curl -s 'https://codeforphilly.org/blog?format=json&limit=1' | jq '.data[0] | keys'
[
  "AuthorID", "Class", "ContextClass", "ContextID", "Created",
  "CreatorID", "Handle", "ID", "Modified", "ModifierID",
  "Published", "RevisionID", "Status", "Summary", "Title", "Visibility"
]
```

No `Body`. Same for `?include=Body` and single-post fetches.

## Fix shape

`translateBlogPost` in `apps/api/scripts/import-laddr/translators.ts` needs a second pass after the JSON fetch:

1. For each imported post, GET `https://<source-host>/blog/<legacyId>`. Use the numeric-id URL, which returns 200; the slug URL returns 404 because laddr's slugs use underscores while the importer emits kebab-case.
2. Parse the HTML and extract the content of `<div class="article-body">`.
3. Run that HTML through a markdown converter (turndown or similar) so the content-typed sheet stays in its declared markdown format.
4. Assign the result to `record.body`.
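
The steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the importer's actual code: `legacyPostUrl`, `extractArticleBody`, and `fetchBodyMarkdown` are hypothetical names, the fetcher and the HTML-to-markdown converter are injected as parameters, and the regex-based extraction assumes the `class` attribute is double-quoted as shown in this issue. A real HTML parser would be sturdier against unusual markup.

```typescript
// Hypothetical helpers; only the URL shape and the "article-body" class come from the issue.
type TextResponse = { ok: boolean; text(): Promise<string> };
type Fetcher = (url: string) => Promise<TextResponse>;

/** Step 1: build the numeric-id URL (the form reported to return 200). */
function legacyPostUrl(sourceHost: string, legacyId: number): string {
  return `https://${sourceHost}/blog/${legacyId}`;
}

/**
 * Step 2: return the inner HTML of the first <div> whose class list contains
 * "article-body", or null if it is missing or the markup is unbalanced.
 * Nested <div> tags are counted so an inner </div> does not end the match early.
 */
function extractArticleBody(html: string): string | null {
  const openDiv = /<div\b([^>]*)>/gi;
  let start = -1;
  for (let m = openDiv.exec(html); m !== null; m = openDiv.exec(html)) {
    const cls = /\bclass\s*=\s*"([^"]*)"/i.exec(m[1]);
    if (cls !== null && cls[1].split(/\s+/).indexOf("article-body") !== -1) {
      start = m.index + m[0].length;
      break;
    }
  }
  if (start < 0) return null;

  const anyDiv = /<(\/?)div\b[^>]*>/gi;
  anyDiv.lastIndex = start;
  let depth = 1;
  for (let m = anyDiv.exec(html); m !== null; m = anyDiv.exec(html)) {
    depth += m[1] === "/" ? -1 : 1;
    if (depth === 0) return html.slice(start, m.index).trim();
  }
  return null;
}

/** Steps 1-3 combined: fetch the page, extract the body, convert it to markdown. */
async function fetchBodyMarkdown(
  sourceHost: string,
  legacyId: number,
  fetchPage: Fetcher,
  toMarkdown: (html: string) => string,
): Promise<string | null> {
  const res = await fetchPage(legacyPostUrl(sourceHost, legacyId));
  if (!res.ok) return null;
  const bodyHtml = extractArticleBody(await res.text());
  return bodyHtml === null ? null : toMarkdown(bodyHtml);
}
```

In the importer, the global `fetch` could be passed as `fetchPage`, and `(html) => turndownService.turndown(html)` as `toMarkdown`, assuming a turndown instance is available. Step 4 is then an assignment of the returned string to `record.body`. Returning `null` rather than throwing leaves it to the caller to decide whether a missing body should fail the import or only be logged.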

Alternative: import bodies as raw HTML rather than markdown — either store as a plain TOML string (drop the markdown content-type) or embed HTML blocks within markdown (legal in CommonMark).
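
For the embed-in-markdown variant, one detail matters: in CommonMark, an HTML block opened by a tag such as `<div>` ends at the first blank line, so blank lines inside the scraped HTML would hand the remainder back to the markdown parser. A minimal sketch of a wrapper that avoids this, assuming a hypothetical `toHtmlBlock` helper and a renderer that passes raw HTML through unsanitized:

```typescript
/**
 * Wrap scraped HTML in a single CommonMark HTML block.
 * Blank lines are dropped because a blank line terminates a <div>-style HTML block.
 * Caveat: this also removes blank lines inside <pre> content.
 */
function toHtmlBlock(html: string): string {
  const lines = html.split(/\r?\n/).filter((line) => line.trim() !== "");
  return ['<div class="article-body">', ...lines, "</div>"].join("\n");
}
```

When the result is placed in a markdown document, it should be separated from surrounding paragraphs by a blank line before and after.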

## Impact today

The deployed sandbox shows the blog index + detail screens with titles, authors, summaries, and post dates — the *structure* is correct and visitors can see what posts exist. The body region renders the summary then an empty area. Acceptable as a v1 deploy; not acceptable as the cutover state for production.

## Out of scope here

The slug mismatch between the importer's kebab-case slugs and laddr's underscore slugs is a separate redirect concern: the legacy URL redirects currently don't cover `/blog/<slug>`. Worth tracking, but it doesn't block this body-fetch work.

importer: scrape blog-post bodies from laddr's HTML (JSON endpoint doesn't surface Body) #106
