test: always-run canary smoke set spanning physics modules#1590
test: always-run canary smoke set spanning physics modules#1590sbryngelson wants to merge 2 commits into
Conversation
Coverage-based test selection + sharding can skip a feature's tests on a given lane -- this is how #1556's silent viscous-GPU regression merged green. Tag one cheap, feature-dominant regression case per major physics module (viscous, non-Newtonian, surface tension, bubbles EE/EL, MHD/HLLD, hypoelasticity, chemistry, IBM, phase change, acoustic, body forces) as canary=True so select_tests always includes them on every PR/lane. Tagging existing cases means no new golden files. list_cases validates the canary trace set, so a renamed trace is a loud error rather than a silently dropped canary. Adds a coverage unit test for the bypass. Hyperelasticity has no cheap regression case yet (follow-up). Refs #1587, #1589.
There was a problem hiding this comment.
Pull request overview
This PR hardens the Python toolchain’s coverage-based test selection by introducing an always-run “canary” smoke set: a curated subset of existing regression cases (one per major physics module) that should be executed on every lane even when normal coverage overlap would skip them.
Changes:
- Add a
canary: boolflag to test case definitions/builders and tag a curated set of traces as canaries. - Update coverage-based selection to never skip canary cases during per-test selection.
- Add a unit test ensuring a canary case survives a coverage-miss scenario that would otherwise skip it.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| toolchain/mfc/test/test_coverage_unit.py | Adds a unit test validating that canary-tagged cases are selected even when their coverage does not overlap changes. |
| toolchain/mfc/test/coverage.py | Ensures canary cases are always included during per-test (rungs 5–7) selection. |
| toolchain/mfc/test/cases.py | Defines the canary trace set and validates/tag them in list_cases() (loud failure if a canary trace is missing). |
| toolchain/mfc/test/case.py | Adds canary to TestCase/TestCaseBuilder and propagates it when building runnable cases. |
| override_tol: Optional[float] = None | ||
| restart_check: bool = False | ||
| kind: str = "golden" | ||
| convergence_spec: Optional[dict] = None | ||
| canary: bool = False |
There was a problem hiding this comment.
Good catch — fixed in a335da2 (the convergence branch now forwards canary=self.canary too). For the record it isn't a live bug: no convergence case is in _CANARY_TRACES, and select_tests reads canary off the TestCaseBuilder before to_case() runs (selection happens prior to the [_.to_case() for _ in cases] conversion). But you're right that it's an inconsistent footgun, so the branches now match.
PR review (Copilot): to_case() forwarded canary on the golden branch but not the convergence branch. Not a live bug -- no convergence case is a canary, and select_tests reads canary off the builder before to_case() conversion -- but it removes a footgun: a convergence case tagged canary would otherwise silently lose the flag.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1590 +/- ##
=======================================
Coverage 60.94% 60.94%
=======================================
Files 82 82
Lines 19922 19922
Branches 2924 2924
=======================================
Hits 12141 12141
Misses 5805 5805
Partials 1976 1976 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Addresses the selection-robustness half of #1587 / #1589 (the fix itself is #1588).
Why
#1556's silent viscous-GPU regression merged green because CI's coverage-based test
selection (
--select-enforce) + sharding never ran a viscous case on the AMD-flang lanefor that diff — even though the suite already has very loud viscous tests (
Re = 1e-4). Thegap is selection, not loudness or missing tests: any feature that fails silently (viscosity,
surface tension, bubbles, MHD, elasticity, chemistry, IBM, …) can slip through if its tests
aren't in the selected/sharded subset.
What
A curated always-run "canary" smoke set — one cheap, feature-dominant existing regression
case per major physics module — tagged so
select_testsalways includes them, regardless ofcoverage overlap. A silent regression in any covered module then trips on every PR/lane.
test/case.py: add acanary: boolfield toTestCase/TestCaseBuilder.test/coverage.py:select_tests()always keeps canary cases (never rung-7 skipped).test/cases.py:_CANARY_TRACES(12 traces, one per module) tagged inlist_cases(), witha validation guard so a renamed/removed trace is a loud error, not a silently dropped
canary.
test/test_coverage_unit.py: unit test that a canary survives a coverage-miss that wouldotherwise skip it. (44/44 unit tests pass.)
Tagging existing cases → no new golden files. Net +60/−3 across 4 toolchain files.
The set (cheapest 1D/2D variant per module):
m_viscous)1D -> 1 Fluid(s) -> Viscousm_hb_function)1D -> 1 Fluid(s) -> Non-Newtonian -> nn=0.5m_surface_tension)2D -> 2 Fluid(s) -> capillary=T -> model_eqns=3m_qbmm)1D -> Bubbles -> QBMMm_bubbles_EL)2D -> Lagrange Bubbles -> One-way Couplingm_riemann_solver_hlld)1D -> MHD -> HLLDm_hypoelastic)1D -> Hypoelasticity -> 1 Fluid(s)1D -> Chemistry -> Perfect Reactorm_ibm)2D -> 1 Fluid(s) -> IBM -> Circle -> slipm_pressure_relaxation)1D -> Phase Change model 5 -> 2 Fluid(s) -> model equation -> 3m_acoustic_src)1D -> Acoustic Source -> Sine -> Frequencym_body_forces)1D -> BodyforcesVerification
./mfc.sh test -l(runs the canary validation) green; 12/12 cases tagged../mfc.sh precheckgreen (all 7 checks, incl. toolchain unit tests).lanes — the real end-to-end check. Worth a glance at total added CI time; the set is easy to
trim if needed (it's one
frozenset).Notes / follow-ups
coverage gap, tracked for follow-up rather than blocking here.
is its own CI job and all run, so canaries are always executed.
GPU_DECLAREstatic-array precheckrule) are separate.