Fix MAS Data-Not-Collected misclassification (fixes #1, pending live verification) #5
Draft
adamXbot wants to merge 1 commit into
A Mac App Store app that declared "Data Not Collected" was rendered as
"developer has not provided any details" — the opposite state.
Root cause: in AppStorePrivacyLabelFetcher.parse(html:), a positive
".provided" result was only returned when one of three rigid
structured-JSON paths in privacyTypeItems(in:) matched. A sparse
"Data Not Collected" payload that those paths miss fell through to a
whole-page regex (hasNoDetailsCopy) that matches "No Details Provided",
a phrase that also appears in App Store page chrome, yielding a
false .noDetailsProvided. There was no positive path for the
"Data Not Collected" declaration.
Fix (additive, so the working happy path can't regress). New ordering
in parse(html:): structured items -> deep JSON sweep -> positive
"data not collected" text -> hasNoDetailsCopy -> parseFailure.
1. privacyTypeItemsDeep(in:): a graph-wide fallback that recursively
walks the decoded JSON for any object whose `identifier` is one of
the four canonical privacy-type ids AND that carries an item-shaped
key (title/detail/categories/purposes), so a bare enum/schema
listing can't be mistaken for a declaration. Deduped by identifier.
2. hasDataNotCollectedCopy(html:): a positive text fallback, checked
BEFORE hasNoDetailsCopy, matching the specific phrase "does not
collect any data from this app" — specific enough to avoid
colliding with page chrome, unlike the terse "Data Not Collected"
heading.
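The deep sweep in item 1 could look roughly like the sketch below. It is not the code from this commit: the three identifiers other than DATA_NOT_COLLECTED are assumed here (the PR sources the real set from PrivacyLabels.TypeIdentifier.allCases), and the return type is simplified to raw dictionaries.

```swift
import Foundation

// Assumed identifier set; only DATA_NOT_COLLECTED is named in this PR.
let canonicalTypeIDs: Set<String> = [
    "DATA_USED_TO_TRACK_YOU", "DATA_LINKED_TO_YOU",
    "DATA_NOT_LINKED_TO_YOU", "DATA_NOT_COLLECTED",
]
// Keys that mark an object as a real declaration rather than a bare enum entry.
let itemShapedKeys: Set<String> = ["title", "detail", "categories", "purposes"]

// Recursively walks decoded JSON and collects every object whose `identifier`
// is a canonical privacy-type id AND that carries at least one item-shaped key.
// Results are deduplicated by identifier (first match wins).
func privacyTypeItemsDeep(in json: Any) -> [[String: Any]] {
    var seen = Set<String>()
    var found: [[String: Any]] = []

    func walk(_ node: Any) {
        if let object = node as? [String: Any] {
            if let id = object["identifier"] as? String,
               canonicalTypeIDs.contains(id),
               !itemShapedKeys.isDisjoint(with: object.keys),
               seen.insert(id).inserted {
                found.append(object)
            }
            // Keep descending: declarations may be nested at any depth.
            object.values.forEach(walk)
        } else if let array = node as? [Any] {
            array.forEach(walk)
        }
    }

    walk(json)
    return found
}

// Synthetic payload: one bare schema listing (ignored) and a duplicated
// declaration (collapsed to a single item).
let sample = """
{"schema": {"enum": [{"identifier": "DATA_NOT_COLLECTED"}]},
 "page": {"shelf": [
   {"identifier": "DATA_NOT_COLLECTED", "title": "Data Not Collected"},
   {"identifier": "DATA_NOT_COLLECTED", "title": "Data Not Collected"}]}}
"""
let decoded = try! JSONSerialization.jsonObject(with: Data(sample.utf8))
let items = privacyTypeItemsDeep(in: decoded)
```

Dictionary traversal order is unspecified, so when several distinct identifiers match, the order of the returned items should not be relied on.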
No UI change needed; the Dashboard already renders a lone
DATA_NOT_COLLECTED label correctly once the fetcher returns it.
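The new ordering in parse(html:), including the text fallback from item 2, can be sketched as follows. Everything here is illustrative: ParseOutcome, classify, and the exact hasNoDetailsCopy pattern are assumptions for this sketch, and the structured and deep results are reduced to plain identifier lists.

```swift
import Foundation

// Hypothetical result type for this sketch; the real fetcher's type may differ.
enum ParseOutcome: Equatable {
    case provided(typeIdentifiers: [String])
    case noDetailsProvided
    case parseFailure
}

// Positive text fallback: matches the specific sentence, not the terse heading.
func hasDataNotCollectedCopy(html: String) -> Bool {
    html.range(of: "does not collect any data from this app",
               options: .caseInsensitive) != nil
}

// Assumed shape of the pre-existing whole-page check.
func hasNoDetailsCopy(html: String) -> Bool {
    html.range(of: #"No\s+Details\s+Provided"#,
               options: [.regularExpression, .caseInsensitive]) != nil
}

// `structured` and `deep` stand in for the results of privacyTypeItems(in:)
// and privacyTypeItemsDeep(in:). Checks run from most to least specific.
func classify(html: String, structured: [String]?, deep: [String]?) -> ParseOutcome {
    if let ids = structured ?? deep, !ids.isEmpty {
        return .provided(typeIdentifiers: ids)
    }
    if hasDataNotCollectedCopy(html: html) {
        return .provided(typeIdentifiers: ["DATA_NOT_COLLECTED"])
    }
    if hasNoDetailsCopy(html: html) {
        return .noDetailsProvided
    }
    return .parseFailure
}

// Synthetic version of the reported failure: a not-collected sentence plus
// "No Details Provided" boilerplate elsewhere on the page.
let reportedCase = """
<p>The developer does not collect any data from this app.</p>
<footer>No Details Provided</footer>
"""
let outcome = classify(html: reportedCase, structured: nil, deep: nil)
```

Because the positive check runs before hasNoDetailsCopy, the boilerplate phrase can no longer override an explicit not-collected declaration, while a page containing only the boilerplate still falls through to the no-details result.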
Tests: new AppStorePrivacyLabelFetcherTests covering structured labels,
deep-JSON not-collected, text-fallback not-collected, not-collected
winning over "No Details Provided" boilerplate (the exact reported
failure), genuine no-details, and unknown-layout parse failure.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixes #1.
The bug
A Mac App Store app that declared "Data Not Collected" was rendered as "developer has not provided any details" — the opposite state. Apps with genuinely-populated labels rendered correctly, so the regression was isolated to the sparse not-collected payload.
Root cause
In `AppStorePrivacyLabelFetcher.parse(html:)`, a positive `.provided` result was only returned when one of three rigid structured-JSON paths in `privacyTypeItems(in:)` matched. A sparse "Data Not Collected" payload that those paths miss fell through to a whole-page regex (`hasNoDetailsCopy`) that matches the phrase "No Details Provided", which also appears in App Store page chrome, yielding a false `.noDetailsProvided`. There was no positive path for the "Data Not Collected" declaration at all.
The fix (additive, so the working happy path can't regress)
New ordering in `parse(html:)`: structured items -> deep JSON sweep -> positive "data not collected" text -> `hasNoDetailsCopy` -> `parseFailure`.
1. `privacyTypeItemsDeep(in:)`: a graph-wide fallback that recursively walks the decoded JSON for any object whose `identifier` is one of the four canonical privacy-type ids (sourced from `PrivacyLabels.TypeIdentifier.allCases`) and that carries an item-shaped key (title/detail/categories/purposes), so a bare enum/schema listing can't be mistaken for a real declaration. Deduped by identifier. The structured check becomes `privacyTypeItems(in:) ?? privacyTypeItemsDeep(in:)`.
2. `hasDataNotCollectedCopy(html:)`: a positive text fallback, checked before `hasNoDetailsCopy`, matching the specific phrase "does not collect any data from this app", which is specific enough to avoid colliding with page chrome, unlike the terse "Data Not Collected" heading.
No UI change is needed: the Dashboard already renders a lone `DATA_NOT_COLLECTED` label correctly once the fetcher returns it.
Tests
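New `Tests/privacycommandCoreTests/AppStorePrivacyLabelFetcherTests.swift` drives the pure `parse(html:)` against canned HTML, covering all four outcomes:
| Scenario | Expected outcome |
| --- | --- |
| Structured labels | `.provided` |
| Deep-JSON not-collected | `.provided` + `isExplicitlyNotCollected` |
| Text-fallback not-collected | `.provided` + `isExplicitlyNotCollected` |
| Not-collected winning over "No Details Provided" boilerplate | `.provided` + `isExplicitlyNotCollected` (this is the exact reported failure) |
| Genuine no-details | `.noDetailsProvided` |
| Unknown-layout parse failure | `.parseFailure` |
Verification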
`swift build` ✅
`DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer swift test` ✅ All 48 tests pass (6 new), run from the `privacycommand/` package dir (CLT lacks XCTest, so the full Xcode toolchain is required).
These tests use synthetic HTML mirroring Apple's documented JSON shapes. The live 2026 App Store JSON shape for a real "Data Not Collected" app is NOT pinned by any fixture in this PR. The deep-JSON and text fallbacks are designed defensively, but they're validated against shapes I constructed, not against Apple's current bytes.
Before merging / closing #1, please capture a real page for a known not-collected MAS app and add a fixture test from the actual bytes:
Then feed those bytes through `AppStorePrivacyLabelFetcher.parse(html:)` in a new test asserting `.provided` + `isExplicitlyNotCollected`. That closes the loop between "matches the shapes I expect" and "matches what Apple actually ships."
🤖 Generated with Claude Code