views-platform · Polichinel · Jun 3, 2026 · Jun 2, 2026 · Jun 2, 2026 · Jun 2, 2026
diff --git a/docs/falsification_campaign.md b/docs/falsification_campaign.md
@@ -0,0 +1,198 @@
+# Falsification Campaign: Business Logic Verification
+
+## Purpose
+
+Before making any further changes to this repository, we must prove that we:
+1. Understand the business logic
+2. Can demonstrate the code works correctly
+3. Have sufficient tests (unit, functional, regression, integration, end-to-end) to catch drift and breakage
+
+This document defines a sequence of falsifiable claims. Each claim is audited via `/falsify`, findings are registered via `/register-risk`, and then we move to the next claim.
+
+## CRITICAL CONSTRAINT: READ-ONLY CAMPAIGN
+
+**This campaign does NOT modify source code.** Its purpose is to **catalogue what we know, what we don't know, and what is broken** — not to fix anything.
+
+- **Do NOT modify** `mapping.py`, `unfao.py`, or any source file
+- **Do NOT modify** existing tests to make them pass
+- **Do** write NEW test stubs (falsification stubs are documentation of gaps, not fixes)
+- **Do** update the risk register (that IS the catalogue)
+- **Do** update CICs/ADRs if a finding reveals the documentation is wrong about the code's behavior — the documentation should describe what the code DOES, not what we wish it did
+- **"Fix" means:** register the finding, write a failing test stub, update docs to match reality. That's it.
+- **Source code fixes happen AFTER the campaign**, informed by the complete catalogue of findings
+
+If a claim is FALSIFIED, we register the findings, note them in the progress tracker, and move to the next claim. We do NOT enter a fix-and-reaudit loop. The goal is a complete inventory, not incremental repair.
+
+## Execution Protocol
+
+For each claim:
+1. Run `/falsify <claim text>`
+2. Confirm the proposed probes
+3. After the verdict, run `/register-risk` to capture any findings
+4. Update the progress tracker with the verdict
+5. Move to the next claim regardless of verdict (SURVIVED, CONTESTED, or FALSIFIED)
+
+## Campaign Structure
+
+The claims are organized in 4 layers. Each layer builds on the previous. A falsification in an earlier layer blocks later layers.
+
+---
+
+## Layer 1 — Do we understand the business logic?
+
+### Claim 1.1
+```
+The PriogridCountryMapper CIC (sections 3, 4, 5, and 6) accurately describes
+the mapper's actual behavior for all documented operations: the largest-overlap
+algorithm, the make_valid preprocessing, the zero-area guards, the cache modes,
+the spatial index usage, and the return types of find_country_for_gid,
+find_admin1_for_gid, find_admin2_for_gid, and enrich_dataframe_with_pg_info.
+```
+
+### Claim 1.2
+```
+The UNFAOPostProcessorManager CIC accurately describes the pipeline's actual
+behavior: the 4-stage sequence (read, transform, validate, save), the two data
+sources (ViewsER for historical, Appwrite for forecast), the enrichment via
+PriogridCountryMapper, the null validation with log-before-raise, and the
+upload to the UN FAO Appwrite bucket with enrichment_description metadata.
+```
+
+### Claim 1.3
+```
+ADR-011's description of the area-majority algorithm, the sequential allocation
+rule, the FAO-confirmed constraints, and the precomputed lookup table strategy
+is consistent with what the code in mapping.py implements and what the FAO
+release notes (Pre-release Note 02 Topic A, Release Note 01 Topic C, Release
+Note 02 Topic A) specify. No contradiction exists between the three sources.
+```
+
+---
+
+## Layer 2 — Does the code produce correct results?
+
+### Claim 2.1
+```
+For PRIO-GRID cells that are entirely within a single country (no border
+overlap), the mapper assigns them to that country with an overlap_ratio
+very close to 1.0, and the method field reads "largest overlap".
+```
+
+### Claim 2.2
+```
+For PRIO-GRID cells that span two countries, the mapper assigns them to
+the country with the larger area share. The overlap_ratio in the result
+reflects the actual geometric proportion of the cell covered by the
+assigned country. The cell is NOT assigned to the country containing
+the centroid if a different country has more area overlap.
+```
+
+### Claim 2.3
+```
+The sequential allocation invariant holds in the current code: every cell's
+Admin 1 and Admin 2 assignments are within the country assigned at Admin 0.
+find_admin1_for_gid first calls find_country_for_gid, then filters admin
+regions to that country before computing overlap. No cross-border subnational
+assignment is possible.
+```
+
+### Claim 2.4
+```
+The enrichment output schema produced by enrich_dataframe_with_pg_info
+(column names and types) matches exactly what _append_metadata in unfao.py
+expects in its filter_cols list. The 3-step derivation (shapefile column
+to dict key to prefixed DataFrame column) produces the correct names for
+all 9 metadata columns.
+```
+
+---
+
+## Layer 3 — Do we have sufficient tests?
+
+### Claim 3.1
+```
+The unit tests in test_mapping.py cover all 7 CIC guarantees from
+PriogridCountryMapper section 3 and all 6 CIC failure modes from section 6,
+with at least one test per guarantee and one test per failure mode. No CIC
+promise is untested.
+```
+
+### Claim 3.2
+```
+The validation tests in test_validation.py verify that the validation logic
+rejects every category of invalid input (missing columns, null values in
+each of the 9 required columns, NaN values) and accepts valid input. The
+test logic matches the actual _validate() implementation in unfao.py — if
+the real code diverges from the test replica, a developer would notice.
+```
+
+### Claim 3.3
+```
+A regression test exists for every code fix applied in this session (C-01
+null validation, C-18 warning suppression removal, C-20 zero-area guards,
+make_valid preprocessing, batch failure tracking) such that reverting any
+single fix would cause at least one test to fail.
+```
+
+### Claim 3.4
+```
+An integration test exists that enriches a DataFrame through the mapper
+(enrich_dataframe_with_pg_info) and then checks the output against the
+manager's filter_cols list and null validation rules — testing the full
+mapper-to-manager data path in a single test.
+```
+
+### Claim 3.5
+```
+An end-to-end test exists or can be constructed that exercises the full
+pipeline from _read() through _save(), verifying that the output parquet
+file has the correct schema (all 9 metadata columns present and non-null)
+and that the upload metadata includes the enrichment_description timestamp.
+```
+
+---
+
+## Layer 4 — Can we detect drift and breakage?
+
+### Claim 4.1
+```
+The CI pipeline (GitHub Actions run_pytest.yml) runs all tests on every
+push to development and main, and the pytest step is active (not commented
+out). A test failure would block the workflow.
+```
+
+### Claim 4.2
+```
+If a future contributor changes the assignment algorithm from area-majority
+to centroid-based (e.g., replacing the overlap loop with a point-in-polygon
+check), at least one existing test in test_mapping.py fails.
+```
+
+### Claim 4.3
+```
+If the enrichment output schema changes (a column in filter_cols is renamed,
+added, or removed in either the mapper's _process_pg_batch or the manager's
+_append_metadata), at least one existing test fails.
+```
+
+---
+
+## Progress Tracker
+
+| Claim | Status | Round | Findings | Date |
+|-------|--------|-------|----------|------|
+| 1.1 | FALSIFIED | 1 | 3H/1S: cache guarantee contradicts C-05, return types underspecified (~170 keys not 4), CIC says WARNING code uses DEBUG, admin1 return has undocumented keys. All are CIC-doc issues — underlying code already in register. | 2026-06-02 |
+| 1.2 | FALSIFIED | 1 | 2H: 3/7 raises lack logger.error (CIC claims ADR-008 compliance), 18 os.getenv() pass None without validation (CIC claims boundary failure). Already in C-19, C-13. | 2026-06-02 |
+| 1.3 | SURVIVED | 1 | clean — ADR-011 consistent with code and FAO notes, schema divergence covered in prerequisite section | 2026-06-02 |
+| 2.1 | SURVIVED | 1 | clean — interior cells: correct country, overlap_ratio=1.0, method="largest overlap" | 2026-06-02 |
+| 2.2 | SURVIVED | 1 | clean — border cell: correct country (AAA=60%), overlap_ratio=0.60. Observation: synthetic data can't produce centroid-area disagreement. | 2026-06-02 |
+| 2.3 | SURVIVED | 1 | clean — all 4 cells: admin1/admin2 consistent with admin0 country. No cross-border subnational assignment. | 2026-06-02 |
+| 2.4 | SURVIVED | 1 | clean — all 9 filter_cols present in enriched output. 15 extra columns also present but filtered by _append_metadata. | 2026-06-02 |
+| 3.1 | CONTESTED | 1 | 0H/5S: no test for make_valid, spatial index, invalid geometry, zero-area cell, missing columns. 8 of 13 CIC items tested, 5 gaps. | 2026-06-02 |
+| 3.2 | CONTESTED | 1 | 0H/2S: no wrong-type test, validation tests use replica not real _validate(). Good coverage otherwise (36 parametrized tests). | 2026-06-02 |
+| 3.3 | FALSIFIED | 1 | 4H: no regression test for C-18 (warning removal), C-20 (zero-area), make_valid, C-21 (batch tracking). Only C-01 (null validation) has regression coverage. | 2026-06-02 |
+| 3.4 | FALSIFIED | 1 | 1H: no integration test exists for mapper→filter_cols→validate path. Mapper and manager tested in isolation only. | 2026-06-02 |
+| 3.5 | FALSIFIED | 1 | 1H: no E2E pipeline test. views-pipeline-core not in test env blocks direct instantiation. | 2026-06-02 |
+| 4.1 | CONTESTED | 1 | 0H/1S: pytest step active, runs on push. But cannot verify it's a required status check for PR merges (GitHub settings, not repo). | 2026-06-02 |
+| 4.2 | SURVIVED | 1 | clean — test_cell_fully_in_country_a asserts method=="largest overlap", would fail if algorithm changed to centroid. overlap_ratio key existence also sensitive. | 2026-06-02 |
+| 4.3 | CONTESTED | 1 | 0H/2S: enrichment test checks 5 of 9 filter_cols, validation replica can drift from real code. Rename of existing column IS caught. | 2026-06-02 |
diff --git a/docs/falsification_loop_prompt.md b/docs/falsification_loop_prompt.md
@@ -0,0 +1,41 @@
+# Falsification Campaign Loop Prompt
+
+Copy the block below and paste it as a `/loop` prompt in Claude Code.
+It processes one claim at a time from `docs/falsification_campaign.md`.
+
+---
+
+## Loop Prompt
+
+```
+Read docs/falsification_campaign.md. Find the first claim with status PENDING in the Progress Tracker.
+
+If no PENDING claims remain, say "Campaign complete — all claims processed" and stop.
+
+Otherwise, for the current claim:
+
+CRITICAL: This is a READ-ONLY campaign. Do NOT modify mapping.py, unfao.py, or any source file. Do NOT modify existing tests. The ONLY files you may create or modify are: falsification test stub files (new), the risk register, CICs/ADRs (to correct documentation that misdescribes actual code behavior), and the progress tracker.
+
+1. Run /falsify with the claim text (the content inside the ``` block under that claim heading). Follow the full falsification protocol: present probes, wait for confirmation, execute, classify, generate test stubs, report verdict.
+
+2. After the /falsify verdict, immediately run /register-risk to register any findings from the falsification. Follow the full registration protocol: extract, deduplicate, assign tiers, append, report.
+
+3. Update the Progress Tracker row for this claim in docs/falsification_campaign.md:
+   - Set Status to the /falsify verdict (SURVIVED, CONTESTED, or FALSIFIED)
+   - Set Round to 1
+   - Set Findings to a short summary (e.g., "2H/1S" for 2 hard + 1 soft, or "clean" for SURVIVED)
+   - Set Date to today
+
+4. Do NOT fix source code. Do NOT re-run the claim. Do NOT enter a fix-and-reaudit loop. Register the findings, update the tracker, and STOP this iteration. The next loop invocation will pick up the next PENDING claim.
+
+5. Report what was found and what the verdict was.
+
+Important:
+- Process ONE claim per loop iteration
+- NEVER modify source code (mapping.py, unfao.py, __init__.py, conftest.py, test_mapping.py, test_validation.py)
+- Always present probes and wait for user confirmation before executing
+- Always register risks after the falsification
+- Always update the progress tracker
+- The claim text is inside the ``` code block under each claim heading in falsification_campaign.md
+- "Fix" means: register finding + write test stub + update docs to match reality. NOT change source code.
+```
diff --git a/reports/technical_risk_register.md b/reports/technical_risk_register.md
@@ -30,11 +30,12 @@
 **Highest tier:** 1 (C-16)
 **Fix strategy:** Extract `CacheStrategy` interface with disk/memory implementations. Thread-lock in memory impl. Shapefile hash in disk cache keys.
 **Resolution scope:** Full
+**⚠ CONTINGENT ON ADR-011:** If precomputed lookup table replaces mapping.py, this entire cluster is eliminated. Defer until ADR-011 is executed.
 
 ### Cluster B: Silent error hiding architecture
 **Root cause:** The codebase suppresses problem signals at three levels — global warning filter, DEBUG-level exception logging with `continue`, and raises without preceding logs. The impact propagates through a delivery chain with no correction mechanism.
-**Entries:** C-12, C-18, C-19, C-20, C-21, C-22, D-03, D-04
-**Highest tier:** 1 (C-12, C-20)
+**Entries:** C-12, C-19, C-20, C-21, C-22, D-03 (resolved), D-04 (resolved), C-18 (resolved)
+**Highest tier:** 2 (C-12, C-21)
 **Fix strategy (5/9 done):** ✅ Replace global warning suppression with targeted filter. ◻ Promote geometry errors from DEBUG to WARNING. ✅ Add `make_valid()` preprocessing. ◻ Narrow exception scope. ✅ Zero-area guard clause (all 7 sites). ◻ `logger.error` before all raises (3 of ~23 done). ✅ Surface batch failures to caller (both methods). ◻ Enrichment provenance in upload (timestamp added, no shapefile version). ◻ Post-delivery correction procedure (C-22).
 **Resolution scope:** Full (code mechanisms) + Partial (operational impact — C-22 requires process documentation). **Note:** If D-05 resolves toward mapper elimination, remaining code fixes become moot.
 
@@ -44,13 +45,22 @@
 **Highest tier:** 2 (C-02)
 **Fix strategy:** Lazy initialization or removal of module-level call. Add `mapper` constructor parameter to manager.
 **Resolution scope:** Full
+**⚠ CONTINGENT ON ADR-011:** If precomputed lookup table replaces mapping.py, this entire cluster is eliminated. Defer until ADR-011 is executed.
 
 ### Cluster D: Mapper-manager boundary contract
 **Root cause:** No explicit contract declares what columns the mapper produces and the manager consumes.
 **Entries:** C-04, C-17
 **Highest tier:** 2 (C-04)
 **Fix strategy:** Define `ENRICHMENT_SCHEMA` constant. Harmonize forward/reverse thresholds. Add end-to-end integration test.
 **Resolution scope:** Full
+**⚠ CONTINGENT ON ADR-011:** If precomputed lookup table replaces mapping.py, the boundary simplifies to a Parquet schema. C-17 is eliminated; C-04 is eliminated (no runtime forward/reverse divergence).
+
+### Cluster F: CIC-code drift (documentation describes aspirational, not actual behavior)
+**Root cause:** CICs were written as design contracts and never validated against the code. Multiple guarantees are false.
+**Entries:** Campaign findings 1.1, 1.2 — affecting CIC PriogridCountryMapper §3/§5/§6 and CIC UNFAOPostProcessorManager §3/§6
+**Highest tier:** Not a code risk — documentation accuracy risk
+**Fix strategy:** Update CICs to describe actual code behavior. Specifically: (1) cache guarantee needs C-05 caveat, (2) return types need full key listing, (3) §6 log level should say DEBUG not WARNING, (4) ADR-008 compliance claim needs qualifying, (5) env var boundary validation claim needs qualifying. Pure documentation, no code changes.
+**Resolution scope:** Full
 
 ### Cluster E: Replace runtime mapper with precomputed lookup table
 **Root cause:** The area-majority algorithm is a confirmed FAO requirement (D-05 resolved), but it doesn't need 3,100 lines of geopandas runtime code — a one-time precomputation produces a ~65K-row Parquet lookup table that replaces the entire mapper with a dictionary join.
@@ -222,7 +232,7 @@ See also C-06 (code duplication within the same class amplifies the SRP violatio
 | Field | Value |
 |-------|-------|
 | ID | C-12 |
-| Tier | 1 |
+| Tier | 2 |
 | Source | `expert-review` (2026-06-02) |
 | Trigger | When Natural Earth or GAUL shapefiles contain an invalid polygon for a country that is the correct assignment for a PRIO-GRID cell, verify that the geometry error is surfaced — currently the cell is silently assigned to the next-best country |
 | Location | `views_postprocessing/unfao/mapping/mapping.py:658-660` (also duplicated at ~line 1192, 1428 in admin1/admin2 branches) |
@@ -235,7 +245,9 @@ The mapper's own `_validate_naturalearth_data` (line 473-477) detects invalid ge
 
 Critically, the manager's `_validate()` method — the only safety gate before upload — is structurally incapable of catching Cluster B's primary failure mode. `_validate()` checks column presence and null counts. A misassigned cell has a valid ISO code (just the wrong one), non-null values, and all required columns present. Validation passes. The safety net has a hole shaped exactly like the failure mode it should catch: wrong-but-valid geographic assignments are invisible to every automated check in the pipeline.
 
-See also C-08 (a different mechanism for wrong-country at high latitudes), D-05 (if mapper eliminated, this concern is moot).
+Tier recalibrated from 1 to 2 during post-campaign review-rr (2026-06-02): `make_valid()` now applied at load time mitigates the root cause. With valid geometries, intersection failures are limited to precision edge cases. The 7 DEBUG handlers remain but are defense-in-depth, not the primary failure path. Campaign Claim 2.1-2.4 confirmed correct algorithm behavior.
+
+See also C-08 (a different mechanism for wrong-country at high latitudes), D-05 (if mapper eliminated, this concern is moot). Contingent on ADR-011.
 
 ---
 
@@ -432,7 +444,7 @@ See also C-17 (implicit column naming between mapper and manager — a different
 | ID | D-01 |
 | Source | `expert-review` (2026-06-02) |
 | Perspectives | Martin/GoF (extract Strategy pattern now), Feathers/Beck (characterize disk-cache branch with tests first) |
-| Resolution | Unresolved — resolve before starting Cluster A cache strategy extraction. **Contingent on D-05:** if mapper eliminated, this disagreement is moot. |
+| Resolution | **⚠ CONTINGENT ON ADR-011.** If precomputed lookup table replaces mapping.py, no cache to refactor — this disagreement is moot. Only resolve if mapper is kept. |
 
 ---
 
@@ -443,7 +455,7 @@ See also C-17 (implicit column naming between mapper and manager — a different
 | ID | D-02 |
 | Source | `expert-review` (2026-06-02) |
 | Perspectives | Nygard (threading.Lock), Beck (remove ThreadPoolExecutor), Hickey (sequential is simpler), Ousterhout (keep threading for speed) |
-| Resolution | Unresolved — resolve before addressing C-16. **Contingent on D-05:** if mapper eliminated, this disagreement is moot. |
+| Resolution | **⚠ CONTINGENT ON ADR-011.** If precomputed lookup table replaces mapping.py, no threading — this disagreement is moot. Only resolve if mapper is kept. |
 
 ---