Stage 0 — Baseline as-usual run on africa_me_legacy (production machine)
Part of the FAO global delivery plan — umbrella: #20. No code changes in this issue.
Why this exists
Two reasons, and the second is the one that makes this more than "just a test":
- Prove the infrastructure works as it usually does. This run exercises everything the development environment cannot: real Git-LFS shapefiles, the datafactory zarr fetch over HTTP, Appwrite credentials, and the full _read → _transform → _validate → _save pipeline at the current africa_me_legacy coverage (13,110 cells). Green here means any later failure is caused by our changes, not by drifted infrastructure.
- Produce the ground-truth baseline for the engine swap. The enriched output of this run — produced by the current runtime mapper on real input — is the exact artifact the new lookup enricher will be diffed against in Stage 2 (#LINK_E). A real production output is the best possible verification data; this run creates it for free.
What to run
The pipeline unchanged, on the production machine, region africa_me_legacy, exactly as a normal delivery run — with one decision:
- Recommended: skip the final Appwrite upload (or treat it as a normal upload if operationally simpler — the bucket has no retention either way, see register C-25 narrative). Record which choice was made here.
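Since this issue allows no code changes, one way to capture wall-clock runtime and peak memory is to launch the usual delivery command through an external wrapper. The sketch below is only an illustration under assumptions: run_and_measure is a hypothetical helper, the command passed to it is whatever the production run normally uses, and the resource module is available on Unix-like systems only. Per-stage timings would still have to come from the pipeline's own logs.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(cmd):
    """Run cmd as a child process and report wall-clock time and peak RSS.

    cmd is the normal delivery command as a list of strings; its exact
    form is not specified here and is left to the operator.
    """
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    wall_seconds = time.perf_counter() - start
    # ru_maxrss covers waited-for children and reports the largest single
    # process, not a sum. Units: kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return {
        "command": list(cmd),
        "returncode": completed.returncode,
        "wall_seconds": round(wall_seconds, 3),
        "peak_rss_raw": peak_rss,
        "peak_rss_unit": "bytes" if sys.platform == "darwin" else "kilobytes",
    }
```

Calling run_and_measure with the usual command and pasting the returned dictionary into the baseline notes would cover the runtime and memory figures. If the pipeline spawns several worker processes, the memory figure reflects only the largest of them.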
What to archive (the baseline artifact)
Create a baseline/ directory (local to the production machine or committed as a release artifact — record where) containing:
1. The enriched historical output parquet (post-_transform, post-_validate)
2. The enriched forecast output parquet
3. A baseline_schema.md recording, for both dataframes: exact column list, dtypes per column, index structure (MultiIndex names and dtypes), row counts
4. Operational numbers: wall-clock runtime of the full run, peak memory if observable, and per-stage timings if logs allow
Item 3 is the acceptance reference for Stage 3 (#LINK_F: "schema identical to baseline") and Stage 4 (faoapi column-list verification). Item 4 is the comparison point for the global dry run (register C-32).
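The schema record could be generated from the archived dataframes rather than written by hand, which keeps it exact. The following is a minimal sketch assuming pandas is available on the production machine; describe_frame and write_schema are hypothetical helpers, and the layout of the generated file is a suggestion, not a format the plan prescribes.

```python
import pandas as pd


def describe_frame(name, df):
    """Return markdown lines recording row count, index and column schema."""
    lines = [f"## {name}", "", f"- rows: {len(df)}", "- index levels:"]
    for level in range(df.index.nlevels):
        values = df.index.get_level_values(level)
        lines.append(f"  - {values.name}: {values.dtype}")
    lines.append("- columns:")
    # Column order is preserved, so the list doubles as the exact column list.
    for column, dtype in df.dtypes.items():
        lines.append(f"  - {column}: {dtype}")
    return lines


def write_schema(frames, path="baseline_schema.md"):
    """Write one section per dataframe; frames maps a label to a DataFrame."""
    lines = ["# Baseline schema", ""]
    for name, df in frames.items():
        lines += describe_frame(name, df) + [""]
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines))
```

With the two parquets loaded through pd.read_parquet (file names are up to whoever archives them), write_schema({"historical": hist_df, "forecast": fc_df}) would produce a file that later stages can diff line by line.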
Definition of done
- baseline/ artifacts archived; location recorded here and on the umbrella issue #20 ([UMBRELLA] FAO global delivery: lookup-based enrichment (ADR-011) + global coverage)
- baseline_schema.md committed to the repo (the schema reference is small and belongs in git even if the parquets do not)
Notes
- If this run is NOT green, stop the plan and diagnose here first; that is this issue doing its job. Likely suspects are environment drift (credentials, LFS state, zarr reachability), not code: the same code delivered successfully before.
- The 5 africa_me ocean cells (gids 62356, 94776, 99027, 107733, 107742) have historically been part of the 13,110-cell input. If validation passes today, the current mapper assigns them somehow (Natural Earth detail). It is worth noting in the baseline what values they carry, since the lookup-based future will exclude them via the land_gaul region story (views-platform/views-datafactory#159).
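To record what the five ocean cells carry in the baseline, the matching rows can be pulled from the enriched output. This is a sketch under assumptions: the gid is taken to live in an index level or column called priogrid_gid, which is a hypothetical name and should be replaced with whatever the baseline schema actually shows.

```python
import pandas as pd

# The five africa_me ocean cells listed in the note above.
OCEAN_GIDS = [62356, 94776, 99027, 107733, 107742]


def ocean_cell_rows(df, gid_level="priogrid_gid"):
    """Return the rows of df that belong to the five ocean cells.

    gid_level is an assumed name; it is looked up among the index
    levels first and among the columns otherwise.
    """
    if gid_level in df.index.names:
        gids = df.index.get_level_values(gid_level)
    else:
        gids = df[gid_level]
    return df[gids.isin(OCEAN_GIDS)]
```

Saving the result next to the parquets, or pasting a short summary into the baseline notes, would give Stage 2 something concrete to compare when those cells disappear from the lookup-based output.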