Skip to content

Espresso 3a: Fallback batcher (re-hosted)#458

Open
QuentinI wants to merge 51 commits into
celo-rebase-18from
espresso/batcher-fallback
Open

Espresso 3a: Fallback batcher (re-hosted)#458
QuentinI wants to merge 51 commits into
celo-rebase-18from
espresso/batcher-fallback

Conversation

@QuentinI

@QuentinI QuentinI commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Relatively small change based on #445 - makes fallback batcher authenticate its transactions.

This is #448, re-hosted from an in-repo branch (now properly stacked).

Comment thread op-batcher/flags/flags.go
"(based on L1 tip time) and the verifier's gate (based on the containing " +
"L1 block's time). Has no effect outside the boundary window around the " +
"EspressoTime hardfork.",
Value: 5 * time.Minute,

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mirrored from #448 (unresolved review thread) to keep tracking it here after the re-host. Original location: op-batcher/flags/flags.go:172.

Original transcript:


@palango — 2026-06-02:

This is a nice idea, but we need to see how to set this value.

I think we either set this to something big and safe (1-2h) and accept that we're sending a couple of unused tx, or remove it at all and shutdown the batcher before the switch (causing more work for devops).

@QuentinI QuentinI Jun 18, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@palango technically it's not necessary at all. If the scenario it tries to prevent still happens, the batcher will just detect it here:

m.Warn("sequencer did not make expected progress",

This will cause at most several minutes of delay before it re-submits, now through the authenticated path. So there shouldn't be any need to shut the batcher down in any case.

//
// The contract's fallback path checks msg.sender against systemConfig.batcherHash(), so no
// separate signature is needed — the L1 transaction is already signed by the TxManager's key.
func (l *BatchSubmitter) sendTxWithFallbackAuth(txdata txData, isCancel bool, candidate *txmgr.TxCandidate, queue TxSender[txRef], receiptsCh chan txmgr.TxReceipt[txRef]) {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mirrored from #448 (unresolved review thread) to keep tracking it here after the re-host. Original location: op-batcher/batcher/fallback_auth.go:42.

Original transcript:


@palango — 2026-06-02:

Reverted auth tx can be reported as success. sendTxWithFallbackAuth checks only err after l.Txmgr.Send and never checks verificationReceipt.Status in op-batcher/batcher/fallback_auth.go:85. txmgr.Send returns receipt, nil on receipt arrival in op-service/txmgr/txmgr.go:743, and derivation ignores failed auth receipts in op-node/rollup/derive/ batch_authenticator.go:85.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as part of the rework in 6f42f54

Comment thread op-batcher/batcher/espresso_driver.go Outdated
}
l.authGroup.Go(
func() error {
l.sendTxWithFallbackAuth(txdata, isCancel, candidate, queue, receiptsCh)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mirrored from #448 (unresolved review thread) to keep tracking it here after the re-host. Original location: op-batcher/batcher/espresso_driver.go:67.

Original transcript:


@palango — 2026-06-02:

The normal batch submission path sends every batch tx through txQueue.Send:

op-batcher/batcher/driver.go:1061

That queue is created with MaxPendingTransactions:

op-batcher/batcher/driver.go:515

and txmgr.Queue.Send explicitly assigns nonces synchronously so transactions
confirm in the order they are sent. This is important for Holocene, where
frames for a channel must arrive in order.

The fallback-auth path bypasses that queue. Once fallback auth is required,
dispatchAuthenticatedSendTx starts a goroutine in authGroup:

op-batcher/batcher/espresso_driver.go:65

and that goroutine calls l.Txmgr.Send directly for both the auth tx and the
batch inbox tx:

op-batcher/batcher/fallback_auth.go:85
op-batcher/batcher/fallback_auth.go:95

Txmgr.Send is concurrency-safe, but it only preserves the order in which
callers actually reach nonce assignment. With up to fallbackAuthGroupLimit = 128 goroutines racing, that order is no longer the publishing loop’s frame
order. As a result, batch inbox txs from the same channel can receive nonces
in a different order than the channel manager emitted them, and L1 inclusion
order follows those nonces.

That can violate Holocene strict frame ordering and cause derivation to drop
later/non-contiguous frames.

This is also a regression from the default config, where max-pending-tx
defaults to 1; operators who configured one-at- a-time submission no longer
get that behavior for fallback-authenticated batches.

The fix should preserve the original batch order across the whole auth+inbox
pair. Simply queueing inbox txs after concurrent auth confirmation is not
sufficient, because auth confirmations can complete out of order. The
fallback-auth path should either be serialized, or use an ordered mechanism
that keeps the original frame order while still ensuring each inbox tx is
posted only after its matching auth tx succeeds.

@QuentinI QuentinI Jun 18, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a serious oversight.
Reworked in 6f42f54: queueing is serialized, the goroutine is spawned only to convert the authentication and batch submission receipt pair to a single receipt indicating transaction status. This should both respect the number of in-flight transactions allowed and preserve Holocene ordering.
Added some tests for this behaviour as well.

@QuentinI QuentinI force-pushed the espresso/derivation-pipeline branch from a587307 to 34c9d0d Compare June 17, 2026 16:21
@QuentinI QuentinI force-pushed the espresso/batcher-fallback branch from f840eee to 12b46a6 Compare June 17, 2026 16:22
@QuentinI QuentinI force-pushed the espresso/derivation-pipeline branch from 34c9d0d to 60f92ee Compare June 17, 2026 16:27
@QuentinI QuentinI force-pushed the espresso/batcher-fallback branch 4 times, most recently from 0f8d63c to 9330de1 Compare June 18, 2026 13:25
@QuentinI QuentinI force-pushed the espresso/derivation-pipeline branch from 8a45116 to 266fe6f Compare June 18, 2026 14:25
@QuentinI QuentinI force-pushed the espresso/batcher-fallback branch from 9330de1 to 1616378 Compare June 18, 2026 14:25
QuentinI and others added 17 commits June 18, 2026 18:38
Adds the Espresso-introduced contracts and the minimum supporting changes
required for them to compile, test, and pass the contract checks.

New contracts and scripts:

- src/L1/BatchAuthenticator.sol and interfaces/L1/IBatchAuthenticator.sol
  (upgradeable contract that authenticates batch transactions, with switching
  between Espresso and fallback batchers)
- scripts/deploy/DeployBatchAuthenticator.s.sol and
  scripts/deploy/DeployEspresso.s.sol
- test/L1/BatchAuthenticator.t.sol and test/mocks/MockEspressoTEEVerifiers.sol
- snapshots/{abi,storageLayout}/BatchAuthenticator.json
- snapshots/semver-lock.json entry for BatchAuthenticator

New submodules:

- lib/espresso-tee-contracts (interfaces required by BatchAuthenticator)
- lib/openzeppelin-contracts-upgradeable-v5 (OZ v5 used by BatchAuthenticator
  via OwnableUpgradeable)

Supporting changes (Espresso-driven):

- foundry.toml: remappings for OZ v5 and espresso-tee-contracts; ignored
  warning codes for vendored libs; OOM-safe jobs settings; via-ir profile.
- justfile: fix-proxy-artifact recipe to handle OZ v5 shadowing Proxy/ProxyAdmin
  artifacts; build/coverage hooks.
- src/universal/Proxy.sol, src/universal/ProxyAdmin.sol: pin pragma to exact
  0.8.15 so they stay in their own compilation group and never emit PUSH0.
- src/universal/ReinitializableBase.sol: loosen pragma to ^0.8.15 so
  BatchAuthenticator (compiled with OZ v5) can import it.
- scripts/* and test/*: disambiguate Proxy artifact lookups to
  src/universal/Proxy.sol:Proxy (avoids OZ v5 proxy/Proxy.sol shadow).
- scripts/checks: bypass interface checks for artifacts originating from lib/;
  add Espresso-related contract names to exclude lists; pragma exclusions for
  Proxy/ProxyAdmin/BatchAuthenticator.
- test/vendor/Initializable.t.sol: exclude BatchAuthenticator (deployed by a
  separate Espresso script).

Co-authored-by: OpenCode <noreply@opencode.ai>
Co-authored-by: piersy <pierspowlesland@gmail.com>
- strict-pragma: remove unneeded exclusions for src/universal/Proxy.sol
  and src/universal/ProxyAdmin.sol — both already use strict
  'pragma solidity 0.8.15;', so the entries (and their misleading
  comment claiming '^') were dead.
- interfaces: move the Espresso excludeContracts block out of the
  upstream-shared area and down next to the Celo block, with one
  entry per line to match the surrounding style. Localizes future
  rebase deltas.

Co-authored-by: OpenCode <noreply@opencode.ai>
Inline the EspressoTEEVerifier deployment in DeployEspresso.s.sol so it
no longer imports lib/espresso-tee-contracts/scripts/DeployTEEVerifier.s.sol
or DeployNitroTEEVerifier.s.sol. The upstream scripts pulled OZ v5's
TransparentUpgradeableProxy (and its auto-deployed ProxyAdmin) into the
OP artifact tree, shadowing src/universal/ProxyAdmin.sol and forcing a
~90-line fix-proxy-artifact justfile recipe.

The TEEVerifier is now deployed behind src/universal/Proxy.sol +
src/universal/ProxyAdmin.sol, matching how BatchAuthenticator is
deployed in the same script. ERC-1967 slots are unchanged, so external
callers see no difference.

The raw vm.getCode("ProxyAdmin") lookups in the deploy scripts and
BatchAuthenticator tests are switched to the explicit artifact path
vm.getCode("forge-artifacts/ProxyAdmin.sol/ProxyAdmin.json") to
deterministically resolve the default compilation profile's bytecode
(the dispute profile transitively compiles ProxyAdmin at optimizer_runs=5000,
creating a second artifact that broke unqualified lookups).

The fix-proxy-artifact recipe and its 5 callsites are removed.
Cherry-picked from piersy's commit 5d0a803 on PR #443.

Walks the dual-batcher state machine: Espresso path → switchBatcher →
fallback path → switchBatcher → Espresso path. Asserts every transition
emits the expected event, that signer registration survives the
round-trip, and that re-issuing the same call after a mode flip changes
the outcome (the previously-valid Espresso signature is no longer
consulted on the fallback path).

Co-authored-by: Piers Powlesland <pierspowlesland@gmail.com>
Co-authored-by: OpenCode <noreply@opencode.ai>
piersy and others added 20 commits June 18, 2026 18:38
Adds derivation-pipeline support for the BatchAuthenticator contract
introduced in the previous PR. Stacks on the contracts PR.

Introduces an L2-timestamp hardfork (EspressoEnforcementTime) gating all
post-fork derivation semantics. Pre-fork, derivation behaves exactly as
upstream Optimism: batches are accepted based on the L1 transaction
sender matching the SystemConfig batcher address. Post-fork, batches are
authenticated via BatchInfoAuthenticated(bytes32) events emitted by the
BatchAuthenticator contract, and sender-based authorization is rejected.

Adds CollectAuthenticatedBatches which scans L1 receipts over a
configurable lookback window (default 100 blocks) to build the set of
authenticated batch commitment hashes for each L1 block being derived.
Results are cached in two reorg-safe (block-hash-keyed) LRU caches: one
for receipt-derived event sets, one for L1BlockRef resolution. For
consecutive L1 blocks the lookback windows overlap by ~99 blocks, so
only one new block's receipts need to be fetched on each call.

Adds rollup.Config fields: EspressoEnforcementTime *uint64,
BatchAuthenticatorAddress, BatchAuthLookbackWindow.

Adds unit tests for batch authentication across calldata, blob, and
altda data sources.

Co-authored-by: OpenCode <noreply@opencode.ai>
Matches upstream Optimism hardfork naming convention (RegolithTime,
EcotoneTime, IsthmusTime, ...). All hardforks enforce a new set of
rules, so the "Enforcement" qualifier was redundant.

Renames:
  EspressoEnforcementTime    -> EspressoTime    (rollup.Config field)
  IsEspressoEnforcement      -> IsEspresso      (rollup.Config method)
  espressoEnforcementTime    -> espressoTime    (DataSourceConfig field)
  isEspressoEnforcement      -> isEspresso      (DataSourceConfig method)
  espresso_enforcement_time  -> espresso_time   (JSON tag, forEachFork log key)
  "Espresso Enforcement"     -> "Espresso"      (forEachFork display name)

Also rewords prose docstrings: "EspressoEnforcement" -> "Espresso",
"Pre/Post-EspressoEnforcement" -> "Pre/Post-Espresso".

Addresses PR feedback: #445 (comment)

Co-authored-by: OpenCode <noreply@opencode.ai>
Co-authored-by: OpenCode <noreply@opencode.ai>
EspressoTime is a conceptually L2-timestamp fork activation time, but the
derivation pipeline gates on it by comparing against the L1 origin time of
the enclosing L1 block. Update the doc comments to reflect this, consistent
with the existing blob_data_source.go/calldata_source.go comments.

Co-authored-by: OpenCode <noreply@opencode.ai>
Adapts the derivation pipeline to PR #443's updated BatchAuthenticator
event and enforces that a batch is submitted by the same address that
authenticated it.

The contract event changed from BatchInfoAuthenticated(bytes32 indexed
commitment) to BatchInfoAuthenticated(bytes32 commitment, address indexed
caller): the signature hash (Topics[0]) changes, the commitment moves into
the unindexed log data, and the caller becomes the indexed Topics[1]. The
event scanner is updated accordingly (new ABI string, commitment read from
Data[:32], with a length guard).

CollectAuthenticatedBatches and collectAuthEventsFromReceipts now return
map[commitment]caller instead of a commitment set. Post-fork,
isBatchTxAuthorized recovers the batch transaction's L1 sender and accepts
the batch only if it equals the caller that emitted the auth event. This
binds each batch to the address that authenticated it, so a batch
authenticated by one batcher cannot be submitted by another. When the same
commitment is authenticated in more than one block within the lookback
window, the newest event's caller is retained.

Removes FindBatchAuthEvent and its test: it had no production caller (the
pipeline uses CollectAuthenticatedBatches) and, lacking a caller check,
diverged from the enforced sender-equals-caller semantics.

Updates the data-source tests to set the auth event caller to the batch
tx sender, and adds cases covering acceptance for a non-batcher sender
matching its caller, rejection when the sender differs from the caller,
and newest-caller-wins on duplicate authentication.

Co-authored-by: OpenCode <noreply@opencode.ai>
…stances

Remove package-level singleton LRU caches and sync.Once from
batch_authenticator.go. Instead, BatchAuthCaches are constructed in
NewDataSourceFactory and threaded through DataSourceConfig to
CollectAuthenticatedBatches. This eliminates the global mutable state,
the cache-size-locked-by-first-caller problem, and the data race in
resetBatchAuthCaches. Tests now construct their own caches and are safe
for t.Parallel().
Two follow-ups required after cherry-picking the batch-auth cache
dependency-injection and rollup.Config-in-DataSourceConfig changes:

- batch_authenticator_test.go: thread the BatchAuthCaches argument
  through the duplicate-authentication caller test, which post-dates the
  cache-DI change and so was not updated by it.
- IsEspresso: guard against a nil Config receiver. Holding the
  rollup.Config in DataSourceConfig means IsEspresso is now reached via
  DataSourceConfig.rollupCfg, which is nil for non-Espresso data-source
  tests; the previous DataSourceConfig.isEspresso method was nil-safe.

Co-authored-by: OpenCode <noreply@opencode.ai>
Adds derivation tests that close two gaps in the Espresso batch-auth
coverage. Both data sources gate event-based authentication on
IsEspresso(ref.Time), and each implements that gate separately, but no
test exercised the gate flipping across activation or verified that
multiple batches are matched to their own commitments.

Fork-boundary tests (TestDataFromEVMTransactionsForkBoundary for the
calldata source, TestDataAndHashesFromTxsForkBoundary for the blob
source) reuse a single DataSourceConfig with EspressoTime set and vary
only ref.Time across the activation boundary:

  - pre-fork (ref.Time < EspressoTime): the batcher tx is accepted via
    upstream sender-based auth, and an empty L1 mock asserts that zero
    receipt scanning occurs;
  - non-batcher senders remain rejected pre-fork;
  - activation block (ref.Time == EspressoTime): the same batcher tx is
    rejected without a BatchInfoAuthenticated event and accepted with one.

This pins the ">=" gate in both directions: a regression to ">" makes the
activation block either accept an unauthenticated batch or skip the event
scan, failing the test. The blob-source copy drives a type-2 calldata tx,
the shape an Ecotone-active, calldata-batching chain (e.g. Celo) submits
through the blob source.

TestDataFromEVMTransactionsEventAuth gains a "multiple authenticated txs
each accepted for their own commitment" case: two distinct batches, each
authenticated by its own commitment, must both be accepted in order and
mapped to their own data, verifying each tx is matched against its own
commitment rather than to "some" authenticated entry.

Test-only change; no production code is modified.
Moves the Espresso batch-auth tests out of the upstream calldata/blob
data-source test files into new espresso_calldata_source_test.go and
espresso_blob_data_source_test.go, and renames batch_authenticator_test.go
to espresso_batch_authenticator_test.go.

Pure test relocation: no production code and no test logic, names,
comments, or assertions change. The goal is to keep all Espresso-specific
tests in espresso_-prefixed files so upstream changes to the shared
data-source test files cannot conflict with them on rebase.

Moved into espresso_calldata_source_test.go (from calldata_source_test.go):
the mockAuthEvents helper, TestDataFromEVMTransactionsEventAuth (including
the "multiple authenticated txs each accepted for their own commitment"
subtest), and TestDataFromEVMTransactionsForkBoundary.

Moved into espresso_blob_data_source_test.go (from blob_data_source_test.go):
TestDataAndHashesFromTxsEventAuth and TestDataAndHashesFromTxsForkBoundary.

The upstream tests (TestDataFromEVMTransactions, TestDataAndHashesFromTxs,
TestFillBlobPointers, TestBlobDataSourceL1FetcherErrors) stay in their
original files. The only import adjustment is dropping io, common/hexutil,
and op-service/txmgr from the new blob file (used only by the kept
TestBlobDataSourceL1FetcherErrors); both original import blocks are
unchanged.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Regenerated against PR #443's BatchAuthenticator.sol via forge build +
abigen. Includes the new history-based API (espressoBatcherAt,
espressoBatcherAtBlock, espressoBatcherHistoryLength, setEspressoBatcher)
and the EspressoBatcherUpdated(address,address,uint64) event with the
fromBlock parameter; drops the removed paused() function.

Consumed by the fallback batcher (next commit) to read activeIsEspresso
and pack authenticateBatchInfo calldata. The TEE batcher in a follow-up
PR will use the same binding.

Co-authored-by: OpenCode <noreply@opencode.ai>
Add the fallback (non-TEE) batcher's BatchAuthenticator integration:

- op-batcher/batcher/fallback_auth.go: sendTxWithFallbackAuth path that
  posts authenticateBatchInfo before the batch tx, with a deadline check
  against the batch's L1 inclusion window. Computes the batch commitment
  hash from either calldata or concatenated blob versioned hashes.
- op-batcher/batcher/espresso_active.go: hasBatchAuthenticator (does
  this rollup use BatchAuthenticator at all?) and isFallbackAuthRequired
  (gates fallback authentication on Config.IsEspresso(tip.Time + lead)).
  The Espresso hardfork predicate is consulted with the configured
  FallbackAuthLeadTime added to the L1 tip, so the batcher starts
  authenticating slightly before the verifier requires it. This absorbs
  worst-case L1 inclusion delay between the batcher's decision time
  (L1 tip) and the verifier's evaluation time (containing L1 block).
- op-batcher/batcher/espresso_driver.go: the authGroup bookkeeping
  (initAuthGroup, waitForAuthGroup, fallbackAuthGroupLimit) and the
  dispatchAuthenticatedSendTx fan-out used by driver.go sendTx.

Small wiring edits to upstream files:

- op-batcher/flags/flags.go: register --espresso.fallback-auth-lead-time
  (default 5m).
- op-batcher/batcher/config.go: thread the FallbackAuthLeadTime through
  CLIConfig.
- op-batcher/batcher/service.go: BatcherConfig.FallbackAuthLeadTime
  field, propagated from CLIConfig in initFromCLIConfig.
- op-batcher/batcher/driver.go: extend L1Client to embed
  bind.ContractBackend (required by the BatchAuthenticator binding), add
  authGroup field to BatchSubmitter, call initAuthGroup in
  NewBatchSubmitter, call dispatchAuthenticatedSendTx in sendTx, call
  waitForAuthGroup in publishingLoop's shutdown drain.
- op-batcher/batcher/driver_test.go: embed bind.ContractBackend in
  fakeL1Client so the AltDA tests still satisfy L1Client.

The fallback batcher does nothing when the rollup config has no
BatchAuthenticator address, and it falls through to the upstream
queue.Send path pre-EspressoTime. Cancel transactions always take the
upstream path. No new external dependencies are added; the only third-
party Go modules needed are already in PR #445.

The TEE batcher is a separate PR stacked on top.

Co-authored-by: OpenCode <noreply@opencode.ai>
The Espresso fallback-auth path previously dispatched each auth+batch pair to a
separate errgroup and called Txmgr.Send directly, bypassing the operator's
MaxPendingTransactions bound and assigning nonces in a nondeterministic order.
Under Holocene the frame queue drops out-of-order frames instead of buffering
them, so the batcher's L1 txs must land in submission order.

Submit the authenticateBatchInfo tx and the batch inbox tx through the same
ordered queue.Send path as the non-fallback batcher, in submission order, so
the auth tx takes the lower nonce and is mined first, and both txs stay under
MaxPendingTransactions. A watcher goroutine (tracked by authGroup so the
publishing loop drains it before closing receiptsCh) collects both receipts on
private channels, fails the pair if the auth tx reverted (a reverted
authenticateBatchInfo emits no event, so the verifier would silently drop the
batch), runs the lookback-window check, and emits a single synthetic receipt
for the batch txData.

Co-authored-by: OpenCode <noreply@opencode.ai>
@QuentinI QuentinI force-pushed the espresso/batcher-fallback branch from 1616378 to 25a6c63 Compare June 18, 2026 16:40
@QuentinI QuentinI force-pushed the espresso/derivation-pipeline branch from 266fe6f to 2782976 Compare June 18, 2026 16:40
Base automatically changed from espresso/derivation-pipeline to celo-rebase-18 June 22, 2026 09:01
@QuentinI QuentinI marked this pull request as ready for review June 22, 2026 12:14

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 49c4321cf1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// nonces are assigned in submission order. Each Send blocks here when the queue is at its
// MaxPendingTransactions limit.
queue.Send(transactionReference, verifyCandidate, authReceiptCh)
queue.Send(transactionReference, *candidate, batchReceiptCh)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Serialize blob batches behind the auth transaction

When blob DA is used with --max-pending-tx set above the default 1 (or 0 for unlimited), this immediately submits a non-blob authenticateBatchInfo tx and then the blob batch tx from the same sender while the auth tx is still pending. The txmgr interface documents that callers mixing Blob and non-Blob transactions must handle ErrAlreadyReserved (op-service/txmgr/txmgr.go:81-83), so this path can repeatedly reject/requeue fallback-auth blob batches instead of posting them. The batch send should wait until the auth tx is mined, or otherwise avoid having both tx types pending from the same account.

Useful? React with 👍 / 👎.


distance := new(big.Int).Sub(batchResult.Receipt.BlockNumber, authResult.Receipt.BlockNumber)
lookbackWindow := new(big.Int).SetUint64(derive.BatchAuthLookbackWindow)
if distance.Sign() < 0 || distance.Cmp(lookbackWindow) >= 0 {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that op-node loopback window includes differences of [0, loopbackWindow] blocks, while the batcher includes [0, loopbackWindow - 1] (so a block delta of exactly loopbackWindow would be true here). Is this expected?

@palango palango left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline findings M1–M4 from a multi-persona review of the fallback-auth path. All are MEDIUM (no externally exploitable issue found; the crypto binding is sound). Separately there is one HIGH (H1): the batcher never reads the contract's activeIsEspresso flag, so if the fallback batcher runs while it's still true (the default deploy state) every authenticateBatchInfo reverts → infinite retry → silent safe-head stall + gas bleed. Happy to expand on H1 and the test gaps in the PR thread.


distance := new(big.Int).Sub(batchResult.Receipt.BlockNumber, authResult.Receipt.BlockNumber)
lookbackWindow := new(big.Int).SetUint64(derive.BatchAuthLookbackWindow)
if distance.Sign() < 0 || distance.Cmp(lookbackWindow) >= 0 {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

M1 — Lookback-window off-by-one (batcher is one block stricter than the verifier).

The verifier accepts an auth event at distance [0, 100] inclusive (101 blocks): CollectAuthenticatedBatches starts currentBlock := ref (distance 0), processes the block, then breaks on ref.Number - currentBlock.Number >= BatchAuthLookbackWindow — so the block at distance 100 is processed before the break (op-node/rollup/derive/batch_authenticator.go:176).

This check rejects distance >= 100, i.e. it re-queues at exactly distance 100 even though the verifier would accept it. The direction is safe (stricter → unnecessary resubmission, never a silent drop), but the two checks should agree. Either use distance.Cmp(lookbackWindow) > 0 to match the verifier, or document the deliberate 1-block safety margin. The op-node/rollup/derive/params.go:24-27 comment is also imprecise about the inclusive 101-block reality.

Comment on lines +84 to +85
authReceiptCh := make(chan txmgr.TxReceipt[txRef], 1)
batchReceiptCh := make(chan txmgr.TxReceipt[txRef], 1)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

M2 — max-pending-tx defaults to 1, and this cap-1 buffering is load-bearing.

With MaxPendingTransactions = 1 (the default, op-batcher/flags/flags.go:66), the second queue.Send (line 97) blocks in errgroup.Go until the auth tx mines — so batch submission stalls for a full auth-confirmation latency on every pair, and the batch tx isn't even crafted until the auth tx mines (~doubles per-batch latency).

It avoids deadlock only because these channels are buffered cap 1: the auth handler can write its receipt with no reader present (the watcher goroutine isn't started until after both Sends return, line 99). The comment should state that the buffering is what prevents the deadlock at the queue limit — switching these to unbuffered would hard-deadlock at the default config. Recommend documenting that MaxPendingTransactions >= 2 is required (or auto-bumping it) when fallback auth is active. This serialization is also untested — the fakeTxSender in the unit tests delivers receipts inline and doesn't model the blocking second Send.

Comment on lines +96 to +97
queue.Send(transactionReference, verifyCandidate, authReceiptCh)
queue.Send(transactionReference, *candidate, batchReceiptCh)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

M3 — Nonce-gap hazard when the auth send fails asynchronously.

These two Sends craft consecutive nonces synchronously (auth = N, batch = N+1). If the auth tx is crafted but its async sendTx later fails, txmgr calls resetNonce() (op-service/txmgr/txmgr.go:372) while the batch tx at nonce N+1 is already in flight — leaving a permanent nonce gap that wedges the sender until the cancel/gap-fill path runs.

This is the same class as the upstream single-tx path, but amplified here because this path deliberately fires two ordered txs and a failure/revert on the first is now an expected case (e.g. the activeIsEspresso-still-true scenario). Worth a targeted integration test against the real txmgr.Queue; consider gating the batch Send on auth success rather than firing both unconditionally.

if err != nil {
return false, fmt.Errorf("failed to fetch L1 tip for fallback-auth gate: %w", err)
}
leadSec := uint64(l.Config.FallbackAuthLeadTime / time.Second)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

M4 — FallbackAuthLeadTime is unvalidated; negative or zero is unsafe.

CLIConfig.Check() (op-batcher/batcher/config.go) never validates this flag. A negative duration makes uint64(... / time.Second) wrap to a huge value, so tip.Time + leadSec overflows and the gate misfires. A value of 0 removes the safety margin entirely and reintroduces the boundary-window silent drop the lead time exists to prevent: a tx decided pre-fork that lands in a post-fork block makes the verifier require a BatchInfoAuthenticated event the batcher never sent → batch silently dropped.

Recommend validating in Check(): reject negative, and reject (or at least warn) on 0; document that the lead time must exceed the worst-case L1 inclusion delay.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants