Skip to content

fix(conclude): publish request log on terminal reconcile + make idempotent#198

Merged
behinddwalls merged 1 commit into
mainfrom
conclude
Jun 4, 2026
Merged

fix(conclude): publish request log on terminal reconcile + make idempotent#198
behinddwalls merged 1 commit into
mainfrom
conclude

Conversation

@sbalabanov
Copy link
Copy Markdown
Contributor

Summary

The conclude controller reconciles each request in a concluded batch to the batch's terminal state (Landed / Error / Cancelled) but did not emit a corresponding RequestLog entry, leaving the request lifecycle log without a terminal record. Peer controllers (start, score, cancel) already publish to the log topic on their transitions; this brings conclude in line.

The per-request loop is also made idempotent under at-least-once redelivery:

  • Already in target terminal state → skip the CAS but still publish the log. The log queue dedupes on (request_id, status), so a prior delivery that completed the state update but failed before publishing the log gets its log emitted on retry, recorded exactly once.
  • In a different terminal state (a racing writer reached terminal first) → skip both the CAS and the log publish; the other writer owns the terminal log entry for the state it wrote. Recorded via a terminal_state_divergence metric and a warning log.
  • Not terminal → CAS to target state, then publish log.

Test plan

  • bazel test //submitqueue/orchestrator/controller/conclude:conclude_test passes
  • Added test case: idempotent retry — request already Landed, no UpdateState mocked (gomock would fail if called), publisher still expected
  • Added test case: divergent terminal state — request already Cancelled for a Succeeded batch; neither UpdateState nor publish allowed
  • Existing cases updated to expect log publish on the happy path
  • make gazelle clean (picks up new submitqueue/core/request dep)
  • make fmt clean

🤖 Generated with Claude Code

…otent

The conclude controller reconciles request state to the batch's terminal
outcome but did not emit a corresponding RequestLog entry, leaving the
request lifecycle log without a terminal record (landed/error/cancelled).

This change publishes the terminal RequestLog after the state CAS, and
makes the per-request loop idempotent under at-least-once redelivery:

- If the request is already in the target terminal state, skip the CAS
  but still publish the log (the log queue dedupes on (request_id,
  status), so re-delivery after a prior partial success still records
  the terminal entry exactly once).
- If the request is in a different terminal state (a racing writer
  reached terminal first), skip both the CAS and the log publish; the
  other writer owns the terminal log entry for the state it wrote.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@sbalabanov sbalabanov requested review from a team and behinddwalls as code owners June 4, 2026 20:56
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@behinddwalls behinddwalls merged commit ed52df8 into main Jun 4, 2026
13 of 14 checks passed
@behinddwalls behinddwalls deleted the conclude branch June 4, 2026 21:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants