fix(ci): decode+unzip OLRC archive before transforming in sync (closes #199) #204
Merged
OlrcFetcher.fetchXml() returns base64-encoded ZIP bytes (the archive containing the USLM .xml), but the sync workflow's "Transform statutes" step passed that value straight into transformer.transformToFiles(), which expects raw XML. Every title failed to parse, so the weekly sync transformed 0 sections and never committed anything.

Add a tested, pure-Node `extractXmlFromZip(zip: Buffer)` to the fetcher package (lifted from the proven logic in scripts/fetch-title.ts; uses inflateRawSync for stored/deflate entries) and wire the workflow to decode base64 → unzip → transform, failing the title cleanly if the archive has no .xml entry.

New unit tests cover stored (method 0), deflate (method 8 round-trip), and no-.xml-entry cases. fetcher: 126 tests pass; full monorepo builds.

Closes #199

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
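For readers unfamiliar with the ZIP layout, the local-file-header walk described in this commit message might look roughly like the sketch below. It is not the code from `packages/fetcher/src/zip.ts`: the names `extractXmlSketch` and `makeEntry` are hypothetical, and the sketch assumes entry sizes are recorded in the local header (no trailing data descriptor) and that the first `.xml` entry is the one wanted.

```typescript
import { deflateRawSync, inflateRawSync } from "node:zlib";

const LOCAL_FILE_HEADER = 0x04034b50; // "PK\x03\x04" read as a little-endian uint32
const HEADER_SIZE = 30; // fixed-length part of a ZIP local file header

// Sketch only (hypothetical name): walk local file headers in order and
// return the text of the first entry whose name ends in ".xml".
// Assumes compressed sizes are present in the local header.
function extractXmlSketch(zip: Buffer): string | null {
  let offset = 0;
  while (
    offset + HEADER_SIZE <= zip.length &&
    zip.readUInt32LE(offset) === LOCAL_FILE_HEADER
  ) {
    const method = zip.readUInt16LE(offset + 8); // 0 = stored, 8 = deflate
    const compressedSize = zip.readUInt32LE(offset + 18);
    const nameLength = zip.readUInt16LE(offset + 26);
    const extraLength = zip.readUInt16LE(offset + 28);

    const nameStart = offset + HEADER_SIZE;
    const name = zip.toString("utf8", nameStart, nameStart + nameLength);
    const dataStart = nameStart + nameLength + extraLength;
    const dataEnd = dataStart + compressedSize;
    if (dataEnd > zip.length) {
      throw new Error(`truncated ZIP entry: ${name}`);
    }

    if (name.toLowerCase().endsWith(".xml")) {
      const data = zip.subarray(dataStart, dataEnd);
      if (method === 0) return data.toString("utf8");
      if (method === 8) return inflateRawSync(data).toString("utf8");
      throw new Error(`unsupported compression method ${method} for ${name}`);
    }
    offset = dataEnd; // skip to the next local file header
  }
  return null; // no .xml entry found
}

// Hypothetical fixture helper: builds a single entry (header + name + payload).
// The CRC-32 field is left as zero because the walker above never checks it.
function makeEntry(name: string, content: string, method: 0 | 8): Buffer {
  const raw = Buffer.from(content, "utf8");
  const payload = method === 8 ? deflateRawSync(raw) : raw;
  const nameBytes = Buffer.from(name, "utf8");
  const header = Buffer.alloc(HEADER_SIZE);
  header.writeUInt32LE(LOCAL_FILE_HEADER, 0);
  header.writeUInt16LE(method, 8);
  header.writeUInt32LE(payload.length, 18); // compressed size
  header.writeUInt32LE(raw.length, 22); // uncompressed size
  header.writeUInt16LE(nameBytes.length, 26);
  return Buffer.concat([header, nameBytes, payload]);
}
```

In the workflow, the input would be `Buffer.from(base64String, "base64")`; a `null` return corresponds to the "no .xml entry" case that fails the title.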
williamzujkowski added a commit that referenced this pull request Jun 23, 2026
…214)

* chore: remove unused @civic-source/pipeline and observability packages

Both packages were built, tested, and documented but had zero importers: nothing in the repo (workflows, scripts, apps, other packages) imported @civic-source/pipeline or @civic-source/observability. The live sync-law.yml inlines its own fetch→transform→annotate logic and never called orchestrate(); pipeline didn't even depend on observability.

Per the repo's YAGNI principle, remove the speculative packages rather than carry the maintenance/build surface. orchestrate() also still carried the pre-#204 base64-ZIP bug and lacked the data-repo git/commit/push half, so it was not a drop-in for the workflow anyway.

- Delete packages/pipeline and packages/observability (+ their tests).
- Drop their rows from README.md and ARCHITECTURE.md; relabel the ARCHITECTURE data-flow diagram's orchestration box "Pipeline" → "Workflow" (the GitHub Actions sync now fills that role).
- Regenerate pnpm-lock.yaml.

Build drops 8→6 packages; full build + tests green; frozen-lockfile install clean.

Closes #208

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: de-stale README dev section after package removal

Removing pipeline/observability made "267 tests across 8 packages" inaccurate; replace the brittle hardcoded count with a non-numeric note. Also correct the toolchain line (Node 22.x/pnpm 9.x → Node 24.x/pnpm 11.x) to match the #183 migration, mirroring the earlier CLAUDE.md fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes #199.
Problem
`OlrcFetcher.fetchXml()` returns base64-encoded ZIP bytes (comment: "caller will extract XML from the ZIP"), but `sync-law.yml`'s "Transform statutes" step fed that string straight into `transformer.transformToFiles()`, which expects raw USLM XML. Result: every title failed to parse → `sections=0`, nothing ever committed. The weekly sync was effectively inert (this was masked until #194 fixed the import resolution that prevented the step from running at all).

Fix
- `packages/fetcher/src/zip.ts` → `extractXmlFromZip(zip: Buffer): string | null`. Pure-Node ZIP local-file-header walker lifted from the proven logic in `scripts/fetch-title.ts`; returns stored (method 0) bytes directly and inflates deflate (method 8) via `inflateRawSync`. Exported from the package index.
- `sync-law.yml` Transform step now: `extractXmlFromZip(Buffer.from(xmlResult.value, 'base64'))` → null-guards (counts a failed title) → `transformToFiles(xml)`. Unchanged-skip, failure counting, and output writing all preserved. The existing per-title `try/catch` still covers any malformed-archive throw.

Tests
New `packages/fetcher/src/__tests__/zip.test.ts` builds ZIPs in memory and covers:
- stored (method 0) entry
- deflate (method 8) round-trip via `deflateRawSync`/`inflateRawSync`
- archive with no `.xml` entry → `null`

Verification:
`pnpm --filter @civic-source/fetcher test` → 126 passed; `pnpm build` → 8/8.

Follow-up
A CI smoke test that transforms a real fixture end-to-end (proposed in #194) would guard against both this and the earlier resolution bug. Tracked there.
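As a rough illustration of what such a smoke check could assert, the sketch below runs decode → unzip → transform for one title and fails loudly on either of the two silent outcomes this PR dealt with (no XML extracted, or zero sections produced). Everything here is hypothetical: `smokeCheckTitle`, the `ExtractXml` and `TransformXml` shapes, and the assumption that the transform yields one item per section. The real transformer API may differ.

```typescript
// Hypothetical shapes: stand-ins for the fetcher's extractor and the transformer.
type ExtractXml = (zip: Buffer) => string | null;
type TransformXml = (xml: string) => readonly unknown[]; // assumed: one item per section

// Sketch only: decode base64 -> unzip -> transform, throwing instead of
// silently reporting sections=0.
function smokeCheckTitle(
  base64Zip: string,
  extractXml: ExtractXml,
  transformXml: TransformXml,
): number {
  const xml = extractXml(Buffer.from(base64Zip, "base64"));
  if (xml === null) {
    throw new Error("archive has no .xml entry");
  }
  const sections = transformXml(xml);
  if (sections.length === 0) {
    throw new Error("transform produced 0 sections");
  }
  return sections.length;
}
```

Passing the extractor and transformer in as parameters keeps the check independent of package import paths, which is the other failure mode (#194) the smoke test is meant to catch at the call site.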
🤖 Generated with Claude Code