bench: add VM-level mega-evm vs revm benchmarks #330
Open
RealiCZ wants to merge 5 commits into
…les, oracle)

Add four VM-level criterion benchmarks isolating mega-evm's overhead over vanilla revm where it was previously unmeasured:

- SALT dynamic-gas isolation (sstore/create): revm_pinned floor, rex5 over EmptyExternalEnv (SALT short-circuits to a constant), rex5_salt over a crowded TestExternalEnvs; the rex5_salt-rex5 gap is the SALT bucket-multiplier path cost.
- REX5 rows added to the default comparison workloads (SPEC_IDS) and the three block-executor benches (now run at REX4 and REX5).
- Vanilla-revm baseline (PRAGUE) added to all precompile groups in comp_cost so each precompile shows a mega-vs-revm gap.
- Oracle real-data bench measuring the oracle SLOAD hit-vs-miss path.

Supporting test-utils-gated additions: MegaContext::new_with_ext_envs, TestExternalEnvs::with_default_bucket_capacity, the MegaWithEnv subject, and the register_env_isolation helpers. Existing Mega/EmptyExternalEnv rows and all prior bench numbers are unchanged.
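The three-row SALT isolation described in this commit reduces to two subtractions over the rows' median timings. The following is a minimal sketch of that arithmetic, not code from the PR: the `IsolationRows` struct and its method names are hypothetical, only the field names mirror the row names above, and the numbers in `main` are placeholders rather than measured results.

```rust
/// Median timings (ns) for the three rows of one isolation group.
/// Hypothetical helper type; only the field names come from the commit message.
#[derive(Debug, Clone, Copy)]
struct IsolationRows {
    /// Vanilla revm floor.
    revm_pinned: f64,
    /// mega-evm over EmptyExternalEnv (SALT short-circuits to a constant).
    rex5: f64,
    /// mega-evm over a crowded TestExternalEnvs.
    rex5_salt: f64,
}

impl IsolationRows {
    /// mega-evm overhead over the revm floor with the SALT path short-circuited.
    fn mega_overhead_ns(&self) -> f64 {
        self.rex5 - self.revm_pinned
    }

    /// Cost attributed to the SALT bucket-multiplier path (rex5_salt - rex5).
    fn salt_path_ns(&self) -> f64 {
        self.rex5_salt - self.rex5
    }
}

fn main() {
    // Placeholder numbers for illustration only; not benchmark output.
    let rows = IsolationRows { revm_pinned: 1000.0, rex5: 1200.0, rex5_salt: 1500.0 };
    println!("mega overhead over revm: {} ns", rows.mega_overhead_ns());
    println!("SALT bucket path cost:   {} ns", rows.salt_path_ns());
}
```

Keeping the two gaps separate is the point of the third row: without `rex5_salt`, the SALT path cost would be folded into a single mega-vs-revm number.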
Each Subject::run now returns the total gas_used over the workload (threaded through run_workload), and run_subjects prints one 'gas\t<row>\t<gas_used>' line per row so a downstream page can derive MGas/s = gas / median-ns. The gas run lives inside the bench_function closure, so it executes only for rows Criterion's name/--test filter selects -- never in the registration path (which would run unfiltered work and could fail on a skipped row). Criterion invokes that closure many times (per warmup batch and per sample), so a Cell guard emits the gas line exactly once per row. Criterion Throughput is not used: it must be set before bench_function, and the bencher text output the CI parses carries no throughput field. Covers the shared register_* path (transact, revm_bench, mega_bench, and the SALT/oracle isolation benches); comp_cost already prints gas and the block-executor benches are out of scope.
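The emit-once behaviour described in this commit can be illustrated without Criterion. The sketch below is an assumption-laden stand-in, not the PR's code: `run_row` and `gas_line` are hypothetical names, the loop only imitates a harness calling a bench closure once per warmup batch and sample, and the row name and gas value in `main` are placeholders.

```rust
use std::cell::Cell;

/// Formats one per-row gas line in the 'gas\t<row>\t<gas_used>' shape.
fn gas_line(row: &str, gas_used: u64) -> String {
    format!("gas\t{row}\t{gas_used}")
}

/// Invokes the bench body `iterations` times and returns the gas lines it emitted.
/// The Cell guard ensures at most one line per row, however often the body runs.
fn run_row(row: &str, iterations: usize, workload: impl Fn() -> u64) -> Vec<String> {
    let emitted = Cell::new(false);
    let mut lines = Vec::new();

    // Stand-in for the closure a bench harness would call repeatedly.
    let mut body = || {
        let gas_used = workload();
        // `replace` returns the previous value, so only the first call emits.
        if !emitted.replace(true) {
            lines.push(gas_line(row, gas_used));
        }
    };

    for _ in 0..iterations {
        body();
    }
    lines
}

fn main() {
    // "rex5/sstore" and 21_000 are placeholders, not real row names or results.
    for line in run_row("rex5/sstore", 100, || 21_000) {
        println!("{line}");
    }
}
```

Because the guard lives next to the body rather than in registration code, a row that is filtered out never runs its workload and never emits a line, which matches the rationale given in the commit message.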
The test/bench-only constructor was exercised only by benches (not counted by coverage), leaving its 8 lines uncovered. Add a unit test that builds a MegaContext over a TestExternalEnvs via new_with_ext_envs and checks the spec and that the supplied SALT env is wired through the dynamic-gas cache.
Removes the gas-per-row signal added for a future comparison page. With no consumer yet, its output format is undefined (group key, units, comp_cost revm row), so it is deferred to the page work where the format will be pinned down. The benchmark coverage itself is unaffected; Subject::run returns to (), and run_subjects no longer prints gas lines.
🧬 Mutation testing: ✅ PASS. Nothing to test; no mutants were generated on the changed lines.
🧬 Mutation testing: ✅ PASS. Nothing to test; no mutants were generated (1 unviable, 0 timed out).
✅ Clean
- Reviewed VM-level bench additions: SALT dynamic-gas isolation, REX5 rows, precompile vs revm baselines, oracle real-data bench, and the `MegaWithEnv` + `new_with_ext_envs` + `with_default_bucket_capacity` enablers.
- Production code change is limited to a `#[cfg(any(test, feature = "test-utils"))]` thin wrapper around the existing `new_with_shared_ext_envs`, and the `TestExternalEnvs::default_bucket_capacity` field is additive (`None` preserves prior behavior). Both are covered by new unit tests.
- Bench wiring is internally consistent: `SPEC_IDS` gains `rex5`; `register_env_isolation` / `register_env_isolation_mega_only` correctly avoid duplicate criterion IDs; `bench_precompile_group` keeps the revm row symmetric with the mega rows via `black_box(r)` and an `is_success` assert, with bytecode-level `assert_stack_value(0, 1)` already guarding precompile execution.
- No new actionable findings. No prior automated review threads to recap.
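The `black_box(r)` plus `is_success` guard mentioned in the review can be sketched in isolation. This is an illustration under stated assumptions, not the PR's `bench_precompile_group`: `ExecOutcome` is a hypothetical stand-in for the real execution result type, and `guarded_iteration` is a made-up helper name. Only `std::hint::black_box` is a real API here.

```rust
use std::hint::black_box;

/// Hypothetical stand-in for an execution result; the real revm type differs.
#[derive(Debug)]
enum ExecOutcome {
    Success { gas_used: u64 },
    Revert,
}

impl ExecOutcome {
    fn is_success(&self) -> bool {
        matches!(self, ExecOutcome::Success { .. })
    }
}

/// One guarded bench iteration: run the call, keep the result opaque to the
/// optimizer, and panic if the call did not succeed.
fn guarded_iteration(run: impl Fn() -> ExecOutcome) -> ExecOutcome {
    let r = black_box(run());
    assert!(r.is_success(), "precompile call did not succeed: {r:?}");
    r
}

fn main() {
    // Placeholder gas value; a real bench would execute the precompile call here.
    let r = guarded_iteration(|| ExecOutcome::Success { gas_used: 3_000 });
    if let ExecOutcome::Success { gas_used } = r {
        println!("ok, gas_used = {gas_used}");
    }
}
```

The assert costs little per iteration and turns a silently skipped precompile, which would otherwise show up as an implausibly fast row, into a hard failure.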
Summary
Adds VM-level benchmarks that measure mega-evm's overhead over vanilla revm in areas the existing suite did not cover. All changes are additive: the existing subjects and prior bench numbers are unchanged.
What this adds
- SALT dynamic-gas isolation (sstore/create): `revm_pinned` (floor), mega over `EmptyExternalEnv` (the SALT bucket path short-circuits to a constant), and mega over a crowded `TestExternalEnvs` (real FNV-hash + map-lookup + multiply per touched bucket). The active − baseline gap (crowded env minus `EmptyExternalEnv`) is the SALT bucket-multiplier path cost, which no prior bench exercised. LOG is intentionally excluded: its storage gas uses a fixed 10× multiplier unrelated to SALT buckets.
- REX5 rows: the default comparison workloads gain REX5 coverage (a `rex5` vs revm row), and the three block-executor benches now run at both REX4 and REX5.
- Precompile baseline: `comp_cost` gains a vanilla-revm baseline row (revm at PRAGUE, which has BLS12-381 and predates the EIP-7825 tx-gas cap), so each precompile shows a mega-vs-revm gap. The revm row is assert-guarded, so a precompile that does not actually execute fails loudly.
- Oracle real-data bench: exercises the `get_oracle_storage` path with populated storage. The gap is the oracle SLOAD hit-vs-miss structural cost (hit early-returns; miss falls through to the journal SLOAD), not an isolated lookup: the path forks structurally, so the lookup cannot be isolated further.
- Supporting test-utils-gated additions: a `MegaContext` constructor over a configurable external env (with a unit test), a `TestExternalEnvs` uniform-bucket-capacity setter, a `MegaWithEnv` subject, and the shared env-isolation registration helpers.
Scope
VM-level only. Comparing against other clients (geth / reth / nethermind / erigon / besu / nimbus) and full-node load testing require the complete MegaETH node and are out of scope here. The comparison page (per-commit trigger, time-series storage, regression alerting) and its MGas/s data source are a follow-up — the per-row gas signal that was initially included here is deferred to that work, where the output format can be pinned to a real consumer.
Testing
- Bench targets run in test mode (`cargo bench -p mega-evm --bench <target> -- --test`): mega_bench, comp_cost, block_bench, transact.
- `new_with_ext_envs` constructor coverage test.
- `cargo fmt --all --check` and `cargo clippy --workspace --benches --all-features --locked` are clean.