Skip to content

test: always-run canary smoke set spanning physics modules#1590

Open
sbryngelson wants to merge 2 commits into
masterfrom
test-canary-smoke-set
Open

test: always-run canary smoke set spanning physics modules#1590
sbryngelson wants to merge 2 commits into
masterfrom
test-canary-smoke-set

Conversation

@sbryngelson

Copy link
Copy Markdown
Member

Addresses the selection-robustness half of #1587 / #1589 (the fix itself is #1588).

Why

#1556's silent viscous-GPU regression merged green because CI's coverage-based test
selection
(--select-enforce) + sharding never ran a viscous case on the AMD-flang lane
for that diff — even though the suite already has very loud viscous tests (Re = 1e-4). The
gap is selection, not loudness or missing tests: any feature that fails silently (viscosity,
surface tension, bubbles, MHD, elasticity, chemistry, IBM, …) can slip through if its tests
aren't in the selected/sharded subset.

What

A curated always-run "canary" smoke set — one cheap, feature-dominant existing regression
case per major physics module — tagged so select_tests always includes them, regardless of
coverage overlap. A silent regression in any covered module then trips on every PR/lane.

  • test/case.py: add a canary: bool field to TestCase / TestCaseBuilder.
  • test/coverage.py: select_tests() always keeps canary cases (never rung-7 skipped).
  • test/cases.py: _CANARY_TRACES (12 traces, one per module) tagged in list_cases(), with
    a validation guard so a renamed/removed trace is a loud error, not a silently dropped
    canary.
  • test/test_coverage_unit.py: unit test that a canary survives a coverage-miss that would
    otherwise skip it. (44/44 unit tests pass.)

Tagging existing cases → no new golden files. Net +60/−3 across 4 toolchain files.

The set (cheapest 1D/2D variant per module):

module canary trace
viscosity (m_viscous) 1D -> 1 Fluid(s) -> Viscous
non-Newtonian (m_hb_function) 1D -> 1 Fluid(s) -> Non-Newtonian -> nn=0.5
surface tension (m_surface_tension) 2D -> 2 Fluid(s) -> capillary=T -> model_eqns=3
bubbles EE/QBMM (m_qbmm) 1D -> Bubbles -> QBMM
bubbles EL (m_bubbles_EL) 2D -> Lagrange Bubbles -> One-way Coupling
MHD + HLLD (m_riemann_solver_hlld) 1D -> MHD -> HLLD
hypoelasticity (m_hypoelastic) 1D -> Hypoelasticity -> 1 Fluid(s)
chemistry 1D -> Chemistry -> Perfect Reactor
IBM (m_ibm) 2D -> 1 Fluid(s) -> IBM -> Circle -> slip
phase change / 6-eq (m_pressure_relaxation) 1D -> Phase Change model 5 -> 2 Fluid(s) -> model equation -> 3
acoustic source (m_acoustic_src) 1D -> Acoustic Source -> Sine -> Frequency
body forces (m_body_forces) 1D -> Bodyforces

Verification

  • ./mfc.sh test -l (runs the canary validation) green; 12/12 cases tagged.
  • ./mfc.sh precheck green (all 7 checks, incl. toolchain unit tests).
  • The canaries are existing, already-passing cases; this PR's CI exercises them across all
    lanes — the real end-to-end check. Worth a glance at total added CI time; the set is easy to
    trim if needed (it's one frozenset).

Notes / follow-ups

Coverage-based test selection + sharding can skip a feature's tests on a given lane -- this is how #1556's silent viscous-GPU regression merged green. Tag one cheap, feature-dominant regression case per major physics module (viscous, non-Newtonian, surface tension, bubbles EE/EL, MHD/HLLD, hypoelasticity, chemistry, IBM, phase change, acoustic, body forces) as canary=True so select_tests always includes them on every PR/lane. Tagging existing cases means no new golden files. list_cases validates the canary trace set, so a renamed trace is a loud error rather than a silently dropped canary. Adds a coverage unit test for the bypass. Hyperelasticity has no cheap regression case yet (follow-up). Refs #1587, #1589.
Copilot AI review requested due to automatic review settings June 13, 2026 17:01

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the Python toolchain’s coverage-based test selection by introducing an always-run “canary” smoke set: a curated subset of existing regression cases (one per major physics module) that should be executed on every lane even when normal coverage overlap would skip them.

Changes:

  • Add a canary: bool flag to test case definitions/builders and tag a curated set of traces as canaries.
  • Update coverage-based selection to never skip canary cases during per-test selection.
  • Add a unit test ensuring a canary case survives a coverage-miss scenario that would otherwise skip it.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
toolchain/mfc/test/test_coverage_unit.py Adds a unit test validating that canary-tagged cases are selected even when their coverage does not overlap changes.
toolchain/mfc/test/coverage.py Ensures canary cases are always included during per-test (rungs 5–7) selection.
toolchain/mfc/test/cases.py Defines the canary trace set and validates/tag them in list_cases() (loud failure if a canary trace is missing).
toolchain/mfc/test/case.py Adds canary to TestCase/TestCaseBuilder and propagates it when building runnable cases.

Comment on lines 375 to +379
override_tol: Optional[float] = None
restart_check: bool = False
kind: str = "golden"
convergence_spec: Optional[dict] = None
canary: bool = False

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in a335da2 (the convergence branch now forwards canary=self.canary too). For the record it isn't a live bug: no convergence case is in _CANARY_TRACES, and select_tests reads canary off the TestCaseBuilder before to_case() runs (selection happens prior to the [_.to_case() for _ in cases] conversion). But you're right that it's an inconsistent footgun, so the branches now match.

PR review (Copilot): to_case() forwarded canary on the golden branch but not the convergence branch. Not a live bug -- no convergence case is a canary, and select_tests reads canary off the builder before to_case() conversion -- but it removes a footgun: a convergence case tagged canary would otherwise silently lose the flag.
@codecov

codecov Bot commented Jun 13, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.94%. Comparing base (ac5774e) to head (a335da2).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1590   +/-   ##
=======================================
  Coverage   60.94%   60.94%           
=======================================
  Files          82       82           
  Lines       19922    19922           
  Branches     2924     2924           
=======================================
  Hits        12141    12141           
  Misses       5805     5805           
  Partials     1976     1976           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants