Idea
The case-optimization CI builds (5 benchmark variants per cluster/interface) differ only in the baked-in case-constants module: each --case-optimization variant regenerates the case fpp and recompiles the whole simulation target into its own hash-named staging dir. Under AMD flang — by far the slowest compiler in the matrix — this serial recompilation is the dominant cost of the "Case Opt | Frontier (AMD)" job (see the prebuild sharding PR #1582, which works around it by parallelizing across SLURM jobs).
A compiler cache (ccache, which supports clang-derived frontends; whether that extends cleanly to flang is part of the spike below) should cut this cost in two ways:
- Cross-variant: within one run, the overwhelming majority of translation units are identical across the 5 variants (only files that include the case-constants module differ), so variants 2..5 should be mostly cache hits.
- Cross-run: the self-hosted Frontier runners have persistent storage, so a warm cache survives between CI runs; PRs that don't touch src/simulation would rebuild almost nothing.
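As a rough way to size the cross-variant benefit before touching the build, the sketch below compares the generated sources in two variant staging directories and reports the fraction that are byte-identical. The function names and the suffix list (`.f90`, `.fpp`) are assumptions for illustration, not part of the existing toolchain. The result is only an upper bound on the hit rate: a unit that uses the case-constants module can have unchanged text and still must be recompiled.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path, suffixes=(".f90", ".fpp")) -> dict[str, str]:
    """Map each source file (path relative to root) to the SHA-256 of its bytes."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def shared_fraction(variant_a: Path, variant_b: Path) -> float:
    """Fraction of variant_a's sources that are byte-identical in variant_b."""
    a = digest_tree(variant_a)
    b = digest_tree(variant_b)
    if not a:
        return 0.0
    same = sum(1 for rel, digest in a.items() if b.get(rel) == digest)
    return same / len(a)
```

Pointing `shared_fraction` at two of the hash-named staging dirs from one CI run would give a quick sanity check on the "overwhelming majority" estimate above.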
Caveats / needed spike
- Fortran + ccache is not as turnkey as C/C++: .mod files are compiler outputs that ccache does not manage natively, and flang's module hashing/reproducibility needs verification. A small spike should confirm (a) cache hits actually occur across case-opt variants under flang, and (b) .mod staleness can't poison a build (worst case must be a cache miss, never a wrong binary).
- Cache key must include compiler version + module environment so Frontier programming-environment updates invalidate cleanly.
- CMake integration is easy (CMAKE_Fortran_COMPILER_LAUNCHER=ccache), but the toolchain would need to expose it (or set it for CI builds only).
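One possible shape for the cache-key caveat is sketched below. It derives a short digest from the compiler's `--version` output plus the `LOADEDMODULES` variable that Lmod and Environment Modules set, and passes it to ccache through `CCACHE_COMPILERCHECK=string:<value>`. The function names and the default compiler name `flang` are hypothetical; the actual driver on Frontier would need to be substituted, and none of this has been tested against flang.

```python
import hashlib
import os
import subprocess


def toolchain_key(compiler_version: str, loaded_modules: str) -> str:
    """Short, stable digest of the compiler version text plus the module list."""
    # Module order is kept as-is: a reordering changes the key, which costs
    # at most a cache miss and never a stale hit.
    payload = compiler_version.strip() + "\n" + loaded_modules.strip()
    return hashlib.sha256(payload.encode()).hexdigest()[:16]


def ccache_env(compiler: str = "flang") -> dict[str, str]:
    """Copy of the current environment with ccache tied to this toolchain."""
    # Assumption: the compiler driver prints its version on `--version`.
    version = subprocess.run(
        [compiler, "--version"], capture_output=True, text=True, check=True
    ).stdout
    key = toolchain_key(version, os.environ.get("LOADEDMODULES", ""))
    env = dict(os.environ)
    # ccache hashes this string in place of its default compiler check.
    env["CCACHE_COMPILERCHECK"] = f"string:{key}"
    return env
```

A CI-only wrapper could pass the returned environment to the CMake configure and build steps together with `-DCMAKE_Fortran_COMPILER_LAUNCHER=ccache`, leaving developer builds untouched.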
Relation to other work
- m_rhs decomposition (#1577, "Decompose the remaining giant simulation modules (m_rhs first) using the validated split pattern"): smaller translation units mean finer-grained cache hits and less recompilation per touched file.