test: always-run canary smoke set spanning physics modules by sbryngelson · Pull Request #1590 · MFlowCode/MFC

sbryngelson · 2026-06-13T17:01:04Z

Addresses the selection-robustness half of #1587 / #1589 (the fix itself is #1588).

Why

#1556's silent viscous-GPU regression merged green because CI's coverage-based test
selection (--select-enforce) + sharding never ran a viscous case on the AMD-flang lane
for that diff — even though the suite already has very loud viscous tests (Re = 1e-4). The
gap is selection, not loudness or missing tests: any feature that fails silently (viscosity,
surface tension, bubbles, MHD, elasticity, chemistry, IBM, …) can slip through if its tests
aren't in the selected/sharded subset.

What

A curated always-run "canary" smoke set — one cheap, feature-dominant existing regression
case per major physics module — tagged so select_tests always includes them, regardless of
coverage overlap. A silent regression in any covered module then trips on every PR/lane.

test/case.py: add a canary: bool field to TestCase / TestCaseBuilder.
test/coverage.py: select_tests() always keeps canary cases (never rung-7 skipped).
test/cases.py: _CANARY_TRACES (12 traces, one per module) tagged in list_cases(), with
a validation guard so a renamed/removed trace is a loud error, not a silently dropped
canary.
test/test_coverage_unit.py: unit test that a canary survives a coverage-miss that would
otherwise skip it. (44/44 unit tests pass.)

Tagging existing cases → no new golden files. Net +60/−3 across 4 toolchain files.

The set (cheapest 1D/2D variant per module):

module	canary trace
viscosity (`m_viscous`)	`1D -> 1 Fluid(s) -> Viscous`
non-Newtonian (`m_hb_function`)	`1D -> 1 Fluid(s) -> Non-Newtonian -> nn=0.5`
surface tension (`m_surface_tension`)	`2D -> 2 Fluid(s) -> capillary=T -> model_eqns=3`
bubbles EE/QBMM (`m_qbmm`)	`1D -> Bubbles -> QBMM`
bubbles EL (`m_bubbles_EL`)	`2D -> Lagrange Bubbles -> One-way Coupling`
MHD + HLLD (`m_riemann_solver_hlld`)	`1D -> MHD -> HLLD`
hypoelasticity (`m_hypoelastic`)	`1D -> Hypoelasticity -> 1 Fluid(s)`
chemistry	`1D -> Chemistry -> Perfect Reactor`
IBM (`m_ibm`)	`2D -> 1 Fluid(s) -> IBM -> Circle -> slip`
phase change / 6-eq (`m_pressure_relaxation`)	`1D -> Phase Change model 5 -> 2 Fluid(s) -> model equation -> 3`
acoustic source (`m_acoustic_src`)	`1D -> Acoustic Source -> Sine -> Frequency`
body forces (`m_body_forces`)	`1D -> Bodyforces`

Verification

./mfc.sh test -l (runs the canary validation) green; 12/12 cases tagged.
./mfc.sh precheck green (all 7 checks, incl. toolchain unit tests).
The canaries are existing, already-passing cases; this PR's CI exercises them across all
lanes — the real end-to-end check. Worth a glance at total added CI time; the set is easy to
trim if needed (it's one frozenset).

Notes / follow-ups

Hyperelasticity has no cheap dedicated regression case (only example-driven) — a genuine
coverage gap, tracked for follow-up rather than blocking here.
Sharding partitions before selection, so each canary runs in exactly one shard; every shard
is its own CI job and all run, so canaries are always executed.
Remaining Harden AMD gpu-omp CI coverage + lint against silent declare-target regressions (follow-up to #1588) #1589 items (metamorphic feature-on≠off check; GPU_DECLARE static-array precheck
rule) are separate.

Coverage-based test selection + sharding can skip a feature's tests on a given lane -- this is how #1556's silent viscous-GPU regression merged green. Tag one cheap, feature-dominant regression case per major physics module (viscous, non-Newtonian, surface tension, bubbles EE/EL, MHD/HLLD, hypoelasticity, chemistry, IBM, phase change, acoustic, body forces) as canary=True so select_tests always includes them on every PR/lane. Tagging existing cases means no new golden files. list_cases validates the canary trace set, so a renamed trace is a loud error rather than a silently dropped canary. Adds a coverage unit test for the bypass. Hyperelasticity has no cheap regression case yet (follow-up). Refs #1587, #1589.

Copilot

Pull request overview

This PR hardens the Python toolchain’s coverage-based test selection by introducing an always-run “canary” smoke set: a curated subset of existing regression cases (one per major physics module) that should be executed on every lane even when normal coverage overlap would skip them.

Changes:

Add a canary: bool flag to test case definitions/builders and tag a curated set of traces as canaries.
Update coverage-based selection to never skip canary cases during per-test selection.
Add a unit test ensuring a canary case survives a coverage-miss scenario that would otherwise skip it.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
toolchain/mfc/test/test_coverage_unit.py	Adds a unit test validating that canary-tagged cases are selected even when their coverage does not overlap changes.
toolchain/mfc/test/coverage.py	Ensures canary cases are always included during per-test (rungs 5–7) selection.
toolchain/mfc/test/cases.py	Defines the canary trace set and validates/tag them in `list_cases()` (loud failure if a canary trace is missing).
toolchain/mfc/test/case.py	Adds `canary` to `TestCase`/`TestCaseBuilder` and propagates it when building runnable cases.

sbryngelson · 2026-06-13T17:22:24Z

    override_tol: Optional[float] = None
    restart_check: bool = False
    kind: str = "golden"
    convergence_spec: Optional[dict] = None
+    canary: bool = False


Good catch — fixed in a335da2 (the convergence branch now forwards canary=self.canary too). For the record it isn't a live bug: no convergence case is in _CANARY_TRACES, and select_tests reads canary off the TestCaseBuilder before to_case() runs (selection happens prior to the [_.to_case() for _ in cases] conversion). But you're right that it's an inconsistent footgun, so the branches now match.

PR review (Copilot): to_case() forwarded canary on the golden branch but not the convergence branch. Not a live bug -- no convergence case is a canary, and select_tests reads canary off the builder before to_case() conversion -- but it removes a footgun: a convergence case tagged canary would otherwise silently lose the flag.

codecov · 2026-06-13T19:30:07Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.94%. Comparing base (ac5774e) to head (a335da2).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1590   +/-   ##
=======================================
  Coverage   60.94%   60.94%           
=======================================
  Files          82       82           
  Lines       19922    19922           
  Branches     2924     2924           
=======================================
  Hits        12141    12141           
  Misses       5805     5805           
  Partials     1976     1976

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot AI review requested due to automatic review settings June 13, 2026 17:01

Copilot started reviewing on behalf of sbryngelson June 13, 2026 17:01 View session

Copilot AI reviewed Jun 13, 2026

View reviewed changes

sbryngelson mentioned this pull request Jun 13, 2026

Fix silent viscosity loss on AMD flang GPU (host-capture Re_size in the Riemann solvers) #1588

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: always-run canary smoke set spanning physics modules#1590

test: always-run canary smoke set spanning physics modules#1590
sbryngelson wants to merge 2 commits into
masterfrom
test-canary-smoke-set

sbryngelson commented Jun 13, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

sbryngelson Jun 13, 2026

Uh oh!

codecov Bot commented Jun 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

Conversation

sbryngelson commented Jun 13, 2026

Why

What

Verification

Notes / follow-ups

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

sbryngelson Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Jun 13, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants