Context
enrich_dataframe_with_pg_info and enrich_dataframe_with_country_info track failed batches via a failed_batches counter and a summary ERROR log. But:
- failed_batches and failed_gids are NOT included in the return value
- The caller receives a partial DataFrame with NaN rows and no programmatic signal
- This does not satisfy ADR-003's fail-loud requirement
The downstream _validate() catches the NaN, but its error message says "null values in country_iso_a3", which points the operator toward spatial mapping issues rather than the actual cause, a batch computation failure.
Requirements
Either:
- Option A: Raise an exception when failed_batches > 0 (strict fail-loud)
- Option B: Return failed_gids alongside the DataFrame (e.g., as a tuple or DataFrame attribute)
- Option C: Add a strict=True parameter: when True, raise on batch failure; when False, log only (current behavior)
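A minimal sketch of how Option C could look, which also covers Option A as the strict=True case. It is not the existing implementation: enrich_in_batches, BatchEnrichmentError, the lookup callable, and the "gid" column name are hypothetical stand-ins for illustration. Only failed_batches, failed_gids, and country_iso_a3 come from this issue.

```python
import logging
from typing import Callable, Dict, List, Sequence

import pandas as pd

logger = logging.getLogger(__name__)


class BatchEnrichmentError(RuntimeError):
    """Hypothetical exception that carries the gids whose batches failed."""

    def __init__(self, failed_batches: int, failed_gids: Sequence[int]):
        super().__init__(
            f"{failed_batches} enrichment batch(es) failed; "
            f"{len(failed_gids)} gid(s) left unenriched"
        )
        self.failed_batches = failed_batches
        self.failed_gids = list(failed_gids)


def enrich_in_batches(
    df: pd.DataFrame,
    lookup: Callable[[List[int]], Dict[int, str]],  # assumed per-batch mapper
    batch_size: int = 1000,
    strict: bool = True,
    gid_col: str = "gid",  # assumed column name
    out_col: str = "country_iso_a3",
) -> pd.DataFrame:
    out = df.copy()
    out[out_col] = None  # object column; rows stay null if their batch fails
    failed_batches = 0
    failed_gids: List[int] = []

    gids = out[gid_col].drop_duplicates().tolist()
    for start in range(0, len(gids), batch_size):
        batch = gids[start:start + batch_size]
        try:
            mapping = lookup(batch)
        except Exception:
            # Record the failure and keep going so the summary is complete.
            logger.exception("Batch starting at offset %d failed", start)
            failed_batches += 1
            failed_gids.extend(batch)
            continue
        mask = out[gid_col].isin(batch)
        out.loc[mask, out_col] = out.loc[mask, gid_col].map(mapping)

    if failed_batches:
        logger.error("%d batch(es) failed, %d gid(s) affected",
                     failed_batches, len(failed_gids))
        if strict:
            # Fail loud: the caller gets the failed gids programmatically.
            raise BatchEnrichmentError(failed_batches, failed_gids)
    return out
```

With strict=False the function keeps the current log-only behavior and returns the partial DataFrame. With strict=True the caller can catch BatchEnrichmentError and read failed_gids, so the failure is reported as a batch failure rather than surfacing later as null values in _validate().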
Risk Register
C-21 (Tier 2). Part of Cluster B.
Note
If ADR-011 replaces the mapper with a Parquet lookup, batch failures become impossible (a merge can't fail per-batch). This concern becomes moot.
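For illustration only, a lookup-based enrichment might be sketched as a single left merge. The function name enrich_by_lookup, the "gid" key, and the in-memory lookup table are assumptions; in practice the table would presumably be loaded with pd.read_parquet, and ADR-011 may specify something different.

```python
import pandas as pd


def enrich_by_lookup(df: pd.DataFrame, lookup: pd.DataFrame, gid_col: str = "gid"):
    """Hypothetical merge-based enrichment: one operation, no per-batch state."""
    merged = df.merge(
        lookup,
        on=gid_col,
        how="left",
        validate="many_to_one",  # raises if the lookup has duplicate gids
        indicator=True,          # adds a _merge column marking unmatched rows
    )
    # Rows with no match in the lookup table; there are no failed batches to track.
    unmatched = merged.loc[merged["_merge"] == "left_only", gid_col].tolist()
    return merged.drop(columns="_merge"), unmatched


# Usage with a tiny in-memory table standing in for the Parquet file.
lookup = pd.DataFrame({"gid": [1, 2], "country_iso_a3": ["AAA", "BBB"]})
enriched, unmatched = enrich_by_lookup(pd.DataFrame({"gid": [1, 2, 3]}), lookup)
```

In this shape any remaining null in country_iso_a3 corresponds to a gid missing from the lookup table, so the existing "null values" message from _validate() would no longer hide a batch failure.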