ci: split slow integration test jobs to reduce CI wall-clock by ~17% #3570
PR Summary
Low Risk
Overview: The workflow replaces monolithic script invocations with narrower entries: EVM Module becomes Compat (
Reviewed by Cursor Bugbot for commit 72d923e.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #3570 +/- ##
==========================================
- Coverage 59.17% 58.33% -0.84%
==========================================
Files 2215 2136 -79
Lines 183370 174701 -8669
==========================================
- Hits 108512 101919 -6593
+ Misses 65047 63696 -1351
+ Partials 9811 9086 -725
Verified the split preserves coverage: the six new scripts run exactly the same test files as the three they replace (EVM Module 4≡4, Interop 10≡10, dApp 3≡3), nothing dropped or duplicated. Wall-clock improvement looks real too.

One thing worth confirming before merge: the old scripts ran each group sequentially against a single chain, whereas the split now runs them on separate fresh clusters. So if any test depended on on-chain state left behind by a sibling that ran earlier in the same script (e.g. AssociateTest/SeiEndpoints after EVMCompatability, or the steak/nft reorder), the split would break it silently. If each

If test isolation isn't a concern here, LGTM.
@bdchatham thanks for the feedback. That is a valid concern, but it does not seem to be causing an issue here. For dApp split: For
Summary
Split 3 slow integration test jobs into parallel matrix entries, each targeting independent test files. No test logic changed — only how tests are distributed across runners.
EVM Interoperability (2 → 4 jobs):
- `EVM Interoperability (Pointer Tests)`: 7 pointer-type Hardhat tests
- `EVM Interoperability (Misc Tests)`: SeiSolo, SetCodeTx, TransientStorage
EVM Module (2 → 4 jobs):
- `EVM Module (Compat)`: EVMCompatabilityTest.js (identified as 70–77% of runtime via timing instrumentation)
- `EVM Module (Precompile & Endpoints)`: EVMPrecompileTest.js, SeiEndpointsTest.js, AssociateTest.js + FlatKV steps
dApp Tests (1 → 2 jobs):
- `dApp Tests (Uniswap)`: uniswapTest.js (219s, 62% of runtime)
- `dApp Tests (NFT & Steak)`: nftMarketplaceTests.js + SteakTests.js (133s combined)
Results (measured over multiple CI runs)
Total: ~3.3 min saved (~17%) per CI run.
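As a quick sanity check on the dApp figures quoted in the summary (219s for uniswapTest.js, 133s combined for the other two files), the sketch below recomputes the dominant file's share and the test-only time of a two-way split. The helper name `summarizeSplit` is hypothetical and not part of the PR. The calculation ignores per-job setup such as cluster startup, so it is not expected to reproduce the measured ~17% end-to-end saving.

```javascript
// Per-file durations in seconds, taken from the dApp figures in this PR.
// nftMarketplaceTests.js and SteakTests.js are only reported as a combined 133s.
const dappDurations = [
  { name: "uniswapTest.js", seconds: 219 },
  { name: "nftMarketplaceTests.js + SteakTests.js", seconds: 133 },
];

// Hypothetical helper (not from the PR): summarize a sequential job and the
// effect of moving its slowest entry into a job of its own.
function summarizeSplit(entries) {
  const total = entries.reduce((sum, e) => sum + e.seconds, 0);
  const dominant = entries.reduce((a, b) => (b.seconds > a.seconds ? b : a));
  return {
    sequentialSeconds: total,
    dominant: dominant.name,
    dominantSharePct: Math.round((dominant.seconds / total) * 100),
    // Two parallel jobs finish when the slower one does.
    // Test time only: cluster startup and other setup are not modelled.
    parallelSeconds: Math.max(dominant.seconds, total - dominant.seconds),
  };
}

const summary = summarizeSplit(dappDurations);
console.log(summary);
```

The 62% share matches the figure reported for uniswapTest.js, and the split job's test time is bounded below by that single file, which is why isolating the dominant file is the natural first cut.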
Approach
Used timing instrumentation (`[timing]` log lines) on `evm_tests.sh` and `dapp_tests.sh` to identify the dominant test file in each job before splitting. Instrumentation removed after data was collected.
Bug fix: AssociateTest flake on Autobahn
Fixed a pre-existing flake in `AssociateTest.js` where `verifyAssociation` read `afterEvm` immediately after `associateKeyStrict` returned, relying on `waitForBlocks(2)` as a proxy for the EVM balance merge landing. On Autobahn this was insufficient. Replaced with a `waitForCondition` poll that waits until the EVM balance actually reaches the expected value before asserting. Same failure reproduced on PR #3566 (pre-split, original `evm_tests.sh`), confirming it is Autobahn-specific and unrelated to the job split.
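The difference between waiting a fixed number of blocks and polling for the actual condition can be sketched as follows. This is a minimal illustration only: the `waitForCondition` signature, its options, and the fake balance reader are assumptions made for the example, and the helper used in the repository's test suite may look different.

```javascript
// Hypothetical polling helper; the real waitForCondition in the test suite may differ.
// Resolves once check() returns a truthy value, rejects after timeoutMs.
async function waitForCondition(check, { timeoutMs = 30000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) return;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in for an EVM balance query (assumption for this example): the merged
// balance only becomes visible from the Nth read onwards.
function makeFakeBalanceReader(expected, readsUntilMerged) {
  let reads = 0;
  return async () => (++reads >= readsUntilMerged ? expected : 0n);
}

async function demo() {
  const expected = 1000n;
  const getEvmBalance = makeFakeBalanceReader(expected, 3);
  let afterEvm = 0n;
  // Poll until the balance actually reaches the expected value, then assert on it.
  await waitForCondition(
    async () => {
      afterEvm = await getEvmBalance();
      return afterEvm === expected;
    },
    { timeoutMs: 2000, intervalMs: 10 }
  );
  return afterEvm;
}

demo().then((balance) => console.log("balance reached", balance.toString()));
```

Polling ties the assertion to the state the test cares about rather than to block count, so it tolerates chains where the merge lands later than two blocks, at the cost of needing an explicit timeout.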