Context
The UNFAOPostProcessorManager._append_metadata() method (unfao.py:140-163) calls PriogridCountryMapper.enrich_dataframe_with_pg_info(), selects the 9 metadata columns (plus the two index columns, month_id and priogrid_gid) via filter_cols, re-indexes, and joins the result back onto the original dataset. This is the single most important boundary in the codebase: the mapper produces enrichment data, and the manager consumes it.
Currently, the mapper and manager are tested in isolation:
- test_mapping.py tests the mapper's spatial logic with synthetic shapefiles
- test_validation.py tests the validation logic with a replicated function
No test verifies the boundary between them. If the mapper changes a column name (e.g., the 3-step derivation: shapefile ISO_A3 → dict key iso_a3 → prefixed country_iso_a3), the mapper tests pass, the validation tests pass, and production breaks.
This is the #1 priority test identified by the falsification campaign (Claim 3.4, FALSIFIED) and the prioritization review.
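The column-naming risk described above can be illustrated with a minimal sketch. The helper names (derive_enriched_column, missing_columns), the lowercase-then-prefix rule, and the renamed field ISO_A3_NEW are assumptions made for illustration, not the mapper's actual implementation; only the names ISO_A3, iso_a3 and country_iso_a3 come from the description above.

```python
def derive_enriched_column(shapefile_field: str, prefix: str) -> str:
    """Hypothetical model of the 3-step derivation:
    shapefile field -> dict key -> prefixed column name."""
    dict_key = shapefile_field.lower()   # ISO_A3 -> iso_a3
    return f"{prefix}_{dict_key}"        # iso_a3 -> country_iso_a3


def missing_columns(produced: list[str], expected: list[str]) -> list[str]:
    """Return the expected columns that the producer no longer emits."""
    produced_set = set(produced)
    return [col for col in expected if col not in produced_set]


# The manager side selects this name via filter_cols.
expected = ["country_iso_a3"]

# Today: the derivation yields exactly the name the manager selects.
produced_today = [derive_enriched_column("ISO_A3", "country")]

# Hypothetical upstream change: the shapefile field gets a different name.
produced_after_rename = [derive_enriched_column("ISO_A3_NEW", "country")]

print(missing_columns(produced_today, expected))         # []
print(missing_columns(produced_after_rename, expected))  # ['country_iso_a3']
```

Neither side's isolated tests would see the second case, because each side only checks its own view of the name. Only a test that feeds real mapper output into the manager's column selection would catch it.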
Why now
ADR-011 plans to replace the runtime mapper with a precomputed Parquet lookup table. When that happens, this integration test becomes the acceptance test for the new enricher — it verifies that the replacement produces the same output the manager expects. Writing it BEFORE the replacement ensures we have a safety net during the transition.
Requirements
Test: test_integration_mapper_to_manager_filter_cols
Setup:
- Use the existing mapper fixture from conftest.py (synthetic shapefiles)
- Create a DataFrame with the columns _append_metadata expects as input: a multi-index of (month_id, priogrid_gid) with at least 3 rows covering cells in different countries
Act:
- Call mapper.enrich_dataframe_with_pg_info() with the same parameters _append_metadata uses:
  - pg_id_col=<entity_id_column_name>
  - time_id_col=<time_id_column_name>
  - only_metadata=True
  - batch_size=1000
- Select the exact filter_cols list from the enriched result (copy the list from unfao.py:146-157):
filter_cols = [
"month_id", "priogrid_gid",
"pg_xcoord", "pg_ycoord", "country_iso_a3",
"admin1_gaul1_code", "admin1_gaul1_name",
"admin1_gaul0_code", "admin1_gaul0_name",
"admin2_gaul2_code", "admin2_gaul2_name",
]
Assert:
- All 9 metadata columns (excluding the index columns) are present in the enriched DataFrame
- None of the 9 columns contain null values for GIDs that exist in the synthetic shapefiles
- country_iso_a3 contains valid ISO codes from the synthetic data ("AAA" or "BBB")
- admin1_gaul1_code is an integer (not float, not string)
- pg_xcoord and pg_ycoord are floats
- The enriched DataFrame has the same row count as the input
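A minimal sketch of how the assertions above could be written. Because the real mapper fixture lives in conftest.py, the sketch substitutes a hypothetical StubMapper that returns hand-written values. The stub, its positional df argument, the GIDs, the "pred" column, and the assumption that the enriched result exposes month_id and priogrid_gid as ordinary columns are all illustrative. In the real test the mapper fixture replaces the stub, and the column list is taken from unfao.py rather than retyped.

```python
import pandas as pd
from pandas.api import types as ptypes

INDEX_COLS = ["month_id", "priogrid_gid"]
METADATA_COLS = [
    "pg_xcoord", "pg_ycoord", "country_iso_a3",
    "admin1_gaul1_code", "admin1_gaul1_name",
    "admin1_gaul0_code", "admin1_gaul0_name",
    "admin2_gaul2_code", "admin2_gaul2_name",
]
FILTER_COLS = INDEX_COLS + METADATA_COLS


class StubMapper:
    """Hypothetical stand-in for the conftest.py `mapper` fixture."""

    def enrich_dataframe_with_pg_info(self, df, pg_id_col, time_id_col,
                                      only_metadata=True, batch_size=1000):
        out = df.reset_index()[[time_id_col, pg_id_col]].copy()
        n = len(out)
        out["pg_xcoord"] = [0.25 + i for i in range(n)]
        out["pg_ycoord"] = [0.75 + i for i in range(n)]
        # Last row lands in the second synthetic country.
        out["country_iso_a3"] = ["AAA" if i < n - 1 else "BBB" for i in range(n)]
        for col in METADATA_COLS:
            if col.endswith("_code"):
                out[col] = list(range(1, n + 1))
            elif col.endswith("_name"):
                out[col] = [f"{col}_{i}" for i in range(n)]
        return out


def make_input() -> pd.DataFrame:
    """Three rows on a (month_id, priogrid_gid) multi-index; values are made up."""
    index = pd.MultiIndex.from_tuples(
        [(500, 1001), (500, 1002), (500, 2001)], names=INDEX_COLS
    )
    return pd.DataFrame({"pred": [0.1, 0.2, 0.3]}, index=index)


def check_filter_cols_contract(mapper) -> None:
    df = make_input()
    enriched = mapper.enrich_dataframe_with_pg_info(
        df,
        pg_id_col="priogrid_gid",
        time_id_col="month_id",
        only_metadata=True,
        batch_size=1000,
    )
    selected = enriched[FILTER_COLS]  # raises KeyError if a column was renamed
    assert not selected[METADATA_COLS].isna().any().any()
    assert set(selected["country_iso_a3"]) <= {"AAA", "BBB"}
    assert ptypes.is_integer_dtype(selected["admin1_gaul1_code"])
    assert ptypes.is_float_dtype(selected["pg_xcoord"])
    assert ptypes.is_float_dtype(selected["pg_ycoord"])
    assert len(selected) == len(df)


check_filter_cols_contract(StubMapper())
```

In a pytest module, the last line would become a test function such as test_integration_mapper_to_manager_filter_cols(mapper) that passes the fixture to check_filter_cols_contract.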
Test: test_integration_enrichment_then_validation
Setup:
- Same as above — enrich a DataFrame through the mapper
Act:
- After enrichment, run the validation logic (the same checks _validate() performs) against the enriched DataFrame
Assert:
- All 9 required metadata columns are present
- No null values in any required column
- If this test passes, it means the mapper's output would pass _validate() in production
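The validation step could be sketched as a small replicated check, in the spirit of the replicated function test_validation.py already uses. The function name validate_enriched and its messages are hypothetical. The real _validate() may perform additional checks, so any replicated logic has to be kept in sync with it.

```python
import pandas as pd

REQUIRED_METADATA_COLS = [
    "pg_xcoord", "pg_ycoord", "country_iso_a3",
    "admin1_gaul1_code", "admin1_gaul1_name",
    "admin1_gaul0_code", "admin1_gaul0_name",
    "admin2_gaul2_code", "admin2_gaul2_name",
]


def validate_enriched(df: pd.DataFrame,
                      required=REQUIRED_METADATA_COLS) -> list[str]:
    """Return a list of problems; an empty list means the frame passes."""
    problems = []
    missing = [col for col in required if col not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    # Only check nulls in columns that actually exist.
    present = [col for col in required if col in df.columns]
    null_cols = [col for col in present if df[col].isna().any()]
    if null_cols:
        problems.append(f"null values in: {null_cols}")
    return problems


# Made-up frames standing in for mapper output.
good = pd.DataFrame({col: [1, 2] for col in REQUIRED_METADATA_COLS})
with_null = good.assign(country_iso_a3=["AAA", None])
renamed = good.rename(columns={"country_iso_a3": "iso_a3"})

print(validate_enriched(good))       # []
print(validate_enriched(with_null))  # one "null values" problem
print(validate_enriched(renamed))    # one "missing columns" problem
```

Returning a list of problems rather than raising on the first one is a design choice for the sketch: a failing test then reports every broken column at once.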
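Acceptance criteria
- Tests live in tests/test_integration.py
- The filter_cols list is copied verbatim from unfao.py (not hardcoded separately), or imported from a shared constant if one is created
- Tests use the existing mapper session fixture, not a new mapper instance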
References
- Falsification campaign Claim 3.4 (FALSIFIED): docs/falsification_campaign.md
- Test stub: tests/test_falsification_campaign_3_4.py
- Risk register: C-17 (implicit column naming contract), Cluster D (mapper-manager boundary)
- CIC: docs/CICs/PriogridCountryMapper.md §5, docs/CICs/UNFAOPostProcessorManager.md §4
- ADR-011: precomputed lookup table (this test becomes the acceptance test for the transition)