Skip to content

pyk: per-request haskell-backend RPC log capture, written per request under the proof dir#4928

Open
ehildenb wants to merge 14 commits into
developfrom
kore-rpc-improvements
Open

pyk: per-request haskell-backend RPC log capture, written per request under the proof dir#4928
ehildenb wants to merge 14 commits into
developfrom
kore-rpc-improvements

Conversation

@ehildenb
Copy link
Copy Markdown
Member

@ehildenb ehildenb commented Jun 3, 2026

Blocked on: runtimeverification/haskell-backend#4151
Blocked on: #4929

Lets pyk request the haskell-backend's per-request log bundle and write it to disk, so downstream can diagnose fallback behaviour per RPC. The request carries the list of log entries to capture and the response returns them in-band; pyk owns the default set and writes each request's bundle to its own file. This pairs with the haskell-backend per-request-rpc branch, which adds the haskell-logging request field ([String]) and the haskell-log-entries response field to the five JSON-RPC endpoints and captures the named entries independently of the server's -l/--log-file configuration.

Changes:

  • BoosterServer now emits the booster-form haskell-log CLI (--log-file FILE, repeated -l ENTRY) via a new overridable _haskell_log_cli_args hook; it previously sent the kore-rpc form (--log, --log-entries A,B), which made kore-rpc-booster exit on a usage error before opening its RPC port.
  • KoreClient.{execute,implies,simplify,get_model,add_module} take haskell_logging: Iterable[str] | None — the entries to capture for that request — and send it as the haskell-logging wire field (omitted when None); the optional haskell-log-entries bundle is parsed onto every result type, and simplify returns a SimplifyResult (state, logs, haskell_log_entries).
  • KoreClient.last_request_id exposes the JSON-RPC id of the most recent request on the calling thread, used to name each per-request log file.
  • CTermSymbolic owns the canonical default entry set (HASKELL_LOGGING_ENTRIES), overridable per instance so downstream can tailor it; the on/off flag is threaded through execute/simplify/kast_simplify/implies and KCFGExplore.extend_cterm.
  • CTermSymbolic takes a haskell_log_dir: when set, every RPC requests the bundle and the entries returned on the response are written to <haskell_log_dir>/<request_id>.jsonl (one JSON value per line, named by request id so concurrent proof workers never collide).
  • pyk prove --haskell-logging points that sink at <save-directory>/haskell-logs/ (requires --save-directory).

Validation:

  • make -C pyk check clean; unit suites for cterm/kore/kcfg green, including new coverage for the per-request entry list, result-type bundle parsing, the entry-set override, and the jsonl writer.
  • Confirmed against the per-request-rpc backend implementation: it accepts haskell-logging: [String], routes the flat list to both engines (skipping names each does not recognise), tag-matches id-carrying contexts like Rewrite, and returns the captured entries in-band — pyk's default set is fully recognised, no skips.
  • Integration and regression-new not run locally (the machine's K install is pinned for a benchmark) — relying on CI.

ehildenb and others added 5 commits June 3, 2026 20:53
…rgs hook

Pull the `--log` / `--log-format` / `--log-entries` construction out of
KoreServer._extra_args into a new _haskell_log_cli_args method so subclasses
can supply their own flag form.  No behavior change at this commit; the
default still emits the kore-rpc form.  Mirrors the existing _smt_cli_args
override pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…og-file / -l in BoosterServer

kore-rpc-booster diverged from kore-rpc on the haskell-log CLI: it accepts
`--log-file FILE` (not `--log FILE`) and repeated `-l ENTRY` (not the
comma-separated `--log-entries A,B,C`).  See booster's CLOptions.hs
(`long "log-file"` and `long "log-level" <> short 'l'` inside a `many`).
Pyk was emitting the kore-rpc form to both binaries, causing kore-rpc-booster
to exit on usage error before opening its RPC port — surfacing as a
psutil.AccessDenied in KoreServer._get_host_and_port.

Override _haskell_log_cli_args on BoosterServer to produce the booster form.
KoreServer is unchanged.  Regression test pins both flag forms by calling
_haskell_log_cli_args directly on bypass-__init__ instances.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…-local accessor on KoreClient

`JsonRpcClient.request` now records the deterministic `f'{client_label}-{NNN}'`
id it issues into a thread-local; `KoreClient.last_request_id` exposes it.
A diagnostic caller snapshots this immediately after each RPC to use the id as a
join key onto the per-request captured logs and handoff metadata.  Thread-local
so parallel proof workers never clobber each other's ids.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ion/{kore/test_kore_client,test_division_hooks}: per-request haskell-logging entry list + haskell-log-entries on result types

Add `haskell_logging: Iterable[str] | None = None` to KoreClient.{execute,implies,
simplify,get_model,add_module}: the value is the list of haskell-backend log entries
to capture for that request, sent verbatim as the `haskell-logging` wire field (None
omits it).  Putting the entry set on the request — rather than a fixed backend bundle
or a server-startup `-l` set — lets pyk own the default and downstream override it,
and keeps the backend from committing to a log set ahead of time.  Parse the optional
`haskell-log-entries` bundle onto every ExecuteResult subtype, ImpliesResult, and
GetModelResult; `simplify` now returns a `SimplifyResult` `(state, logs,
haskell_log_entries)`.  Absent entries / unset list preserve today's behaviour.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…haskell_logging through execute, simplify, kast_simplify, implies, extend_cterm

Add an optional `haskell_logging` per-call parameter to
CTermSymbolic.{execute,simplify,kast_simplify,implies} (including the recursive
failure-reason implies) and forward it through KCFGExplore.extend_cterm to
CTermSymbolic.execute.  Additive and default-preserving: omitting it reproduces
today's calls exactly (None leaves backend logging untouched).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@deosa-arch deosa-arch changed the base branch from master to develop June 3, 2026 23:39
@ehildenb ehildenb changed the title Kore rpc improvements kevm prove: add --haskell-log-file/format/entries Jun 3, 2026
@ehildenb ehildenb changed the title kevm prove: add --haskell-log-file/format/entries pyk: per-request haskell-backend RPC log capture, written per request under the proof dir Jun 3, 2026
ehildenb and others added 2 commits June 4, 2026 03:27
…kell-logging bundle to <dir>/<request_id>.jsonl

Add an instance-level `haskell_log_dir` to CTermSymbolic (threaded through the
`cterm_symbolic` factory).  When set, every execute/simplify/implies requests the
per-request `haskell-logging` bundle (unless a per-call flag overrides) and writes
the captured entries to `<haskell_log_dir>/<request_id>.jsonl`, one JSON value per
line, named by request id so concurrent proof workers never collide.  Mirrors the
`booster_only_simplify` instance-default pattern; no-op when unset.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…est bundles under the save dir

Add a `--haskell-logging` boolean to `pyk prove` (ProveOptions).  When set,
exec_prove points `cterm_symbolic` at `<save-directory>/haskell-logs`, so every
RPC's `haskell-logging` bundle is captured and written one file per request.
Requires `--save-directory`; errors clearly otherwise.

The single per-request flag is self-sufficient on the backend side: the
haskell-backend captures its bundle (kore equation attempt/apply plus term
resolution, and the booster proxy/detail/abort/simplify context events)
independently of the server's -l/--log-entries configuration.  So pyk only sets
the one flag and does not configure which entries to enable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
juliankuners
juliankuners previously approved these changes Jun 4, 2026
@ehildenb ehildenb dismissed juliankuners’s stale review June 4, 2026 14:03

Approved wrong PR.

@ehildenb ehildenb marked this pull request as ready for review June 4, 2026 14:04
Comment thread pyk/src/pyk/cterm/symbolic.py Outdated
Comment thread pyk/src/pyk/cterm/symbolic.py Outdated
Comment thread pyk/src/pyk/cterm/symbolic.py
Comment thread pyk/src/pyk/cterm/symbolic.py
Comment thread pyk/src/pyk/kore/rpc.py Outdated
Comment thread pyk/src/tests/unit/kore/test_rpc_client.py Outdated
Comment thread pyk/src/tests/unit/kore/test_server_cli_args.py Outdated
haskell_log_format: KoreExecLogFormat = KoreExecLogFormat.JSON,
haskell_log_entries: list[str] | None = None,
) -> BoosterServer:
server = object.__new__(BoosterServer)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using mocks of the actual classes instead that simply patch start into a no-op. This would let us test all other private methods as well.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, you can refactor the classes to call start only on __enter__. Then the real classes can be tested directly.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like scope creep for this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see how this can be the case for the second comment.

Regarding the first comment though: if you patch start, it'll eliminate the need to work around the type checker (with __new__ and # type: ignore comments).

Comment thread pyk/src/tests/unit/kore/test_server_cli_args.py Outdated
Comment thread pyk/src/tests/unit/kore/test_rpc_client.py Outdated
ehildenb and others added 5 commits June 4, 2026 17:37
…equest_id per-client, not thread-local

Each proof worker drives its own KoreClient (`_ProverPool`/`KoreServerPool` key
instances by thread), so no client is ever shared across threads — the thread-local
guarded a scenario that does not occur.  Record the issued id on `JsonRpcClient`,
surface it through `JsonRpcClientFacade` and `KoreClient.last_request_id`, and drop
the module-level thread-local.  The id-generation test moves onto `JsonRpcClient`
directly (no reach-through), and a `KoreClient`-level test covers the surfacing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… docstring clarity, spelling

Address review feedback: let `HASKELL_LOGGING_ENTRIES` infer its type from the
literal (`Final` without the explicit tuple annotation), double-backtick
``CTermSymbolic`` in the docstring, clarify that the populated `haskell_log_entries`
default is *what* to request rather than an on-switch (logging stays off until
`haskell_log_dir` is set), and fix the `honors` spelling in a test name.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ponse tests over endpoints

The five near-identical execute/implies/simplify tests collapse into two
endpoint-parameterized tests (requested vs. default), asserting both that the entry
list reaches the wire and that the response bundle is surfaced on the result.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ions over server/flag cases

Collapse the per-class methods and two builders into a single parameterized
top-level `test_haskell_log_cli_args` driven by a `(server_cls, log file, format,
entries, expected)` table, covering both KoreServer (kore-rpc form) and BoosterServer
(booster form) including the no-log-file and no-entries cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ntries through cterm_symbolic to the per-request set

Previously the factory only fed `haskell_log_entries` to the legacy server-side `-l`
path, so a CTermSymbolic built through `cterm_symbolic` (the init path used by
downstream `legacy_explore`/`create_kcfg_explore`) always requested pyk's canonical
default and clients could not override which entries to capture without constructing
CTermSymbolic directly.  Forward the param to the per-request set, falling back to
the canonical default when unset so behavior is unchanged out of the box.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ehildenb ehildenb requested a review from tothtamas28 June 4, 2026 18:57
"""Build a `CTermSymbolic` whose backend returns ``response`` and whose Kast<->Kore
conversions are stubbed, returning the backend `Mock` so a test can inspect call args.
"""
kore_client = Mock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That fixture builds a real KoreClient over a mocked transport, whereas these tests mock the whole KoreClient to exercise CTermSymbolic in isolation

Good point.

Happy to extract a shared mock-KoreClient helper into a conftest if you'd prefer that over the local one.

No need to extract it, but turning it into an actual local fixture would be nice.

Comment on lines +30 to +40
def test_execute_on_requests_configured_entries() -> None:
# Given
kore_client, cts = _mock_client_cts(StuckResult(state=State(term=int_dv(2)), depth=1, logs=()))
dummy: CTerm = Mock()

# When the per-call flag turns logging on
cts.execute(dummy, haskell_logging=True)

# Then the configured entry set (the canonical default here) is what reaches KoreClient.execute
_args, kwargs = kore_client.execute.call_args
assert kwargs['haskell_logging'] == HASKELL_LOGGING_ENTRIES
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And that fixture could provide an interface similar to this, makes tests much easier to read:

Suggested change
def test_execute_on_requests_configured_entries() -> None:
# Given
kore_client, cts = _mock_client_cts(StuckResult(state=State(term=int_dv(2)), depth=1, logs=()))
dummy: CTerm = Mock()
# When the per-call flag turns logging on
cts.execute(dummy, haskell_logging=True)
# Then the configured entry set (the canonical default here) is what reaches KoreClient.execute
_args, kwargs = kore_client.execute.call_args
assert kwargs['haskell_logging'] == HASKELL_LOGGING_ENTRIES
def test_execute_on_requests_configured_entries() -> None:
# Given a mock CTermSymbolic initialized with empty ``logs``
cts = _mock_client_cts(logs=())
# When the per-call flag turns logging on
cts.execute(Mock(), haskell_logging=True)
# Then the configured entry set (the canonical default here) is what reaches KoreClient.execute
assert_client_execute_called_with_haskell_logging(HASKELL_LOGGING_ENTRIES)

# Given a CTermSymbolic configured with a custom entry set
kore_client = Mock()
kore_client.execute.return_value = StuckResult(state=State(term=int_dv(2)), depth=1, logs=())
cts = CTermSymbolic(kore_client, Mock(), haskell_log_entries=['Rewrite', 'DebugApplyEquation'])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_mock_client_cts can take whatever arguments CTermSymbolic takes (except of the KoreClient of course), so that these test functions collapse into a single parameterized test.

Comment on lines +72 to +78
def _log_dir_cts(response: object, haskell_log_dir: Path) -> tuple[Mock, CTermSymbolic]:
kore_client = Mock()
kore_client.execute.return_value = response
cts = CTermSymbolic(kore_client, Mock(), haskell_log_dir=haskell_log_dir)
cts.kast_to_kore = Mock(return_value=Mock()) # type: ignore[method-assign]
cts.kore_to_kast = Mock(return_value=mlTop()) # type: ignore[method-assign]
return kore_client, cts
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then this helper is gone as well.

Comment on lines +81 to +95
def test_execute_writes_haskell_log_bundle(tmp_path: Path) -> None:
# Given a configured log dir and a response carrying a per-request bundle
entry = {'context': ['Proxy'], 'message': 'simplifying'}
response = StuckResult(state=State(term=int_dv(2)), depth=1, logs=(), haskell_log_entries=(entry,))
kore_client, cts = _log_dir_cts(response, tmp_path)
kore_client.last_request_id = 'proof-007'

# When
cts.execute(Mock())

# Then logging is requested and the bundle lands in <dir>/<request_id>.jsonl, one JSON value per line
_args, kwargs = kore_client.execute.call_args
assert kwargs['haskell_logging'] == HASKELL_LOGGING_ENTRIES
log_file = tmp_path / 'proof-007.jsonl'
assert json.loads(log_file.read_text().strip()) == entry
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the more powerful _mock_client_cts to deal with this test case:

Suggested change
def test_execute_writes_haskell_log_bundle(tmp_path: Path) -> None:
# Given a configured log dir and a response carrying a per-request bundle
entry = {'context': ['Proxy'], 'message': 'simplifying'}
response = StuckResult(state=State(term=int_dv(2)), depth=1, logs=(), haskell_log_entries=(entry,))
kore_client, cts = _log_dir_cts(response, tmp_path)
kore_client.last_request_id = 'proof-007'
# When
cts.execute(Mock())
# Then logging is requested and the bundle lands in <dir>/<request_id>.jsonl, one JSON value per line
_args, kwargs = kore_client.execute.call_args
assert kwargs['haskell_logging'] == HASKELL_LOGGING_ENTRIES
log_file = tmp_path / 'proof-007.jsonl'
assert json.loads(log_file.read_text().strip()) == entry
def test_execute_writes_haskell_log_bundle(tmp_path: Path) -> None:
# Given
entry = {'context': ['Proxy'], 'message': 'simplifying'}
log_file = tmp_path / 'proof-007.jsonl'
cts = _mock_client_cts(haskell_log_dir=tmp_path)
assume_client_execute_response_with_log_entries((entry,))
assume_client_last_request_id('proof-007')
# When
cts.execute(Mock())
# Then logging is requested and the bundle lands in <dir>/<request_id>.jsonl, one JSON value per line
assert_client_execute_called_with_haskell_logging(HASKELL_LOGGING_ENTRIES)
assert json.loads(log_file.read_text().strip()) == entry

Comment on lines +111 to +130
def test_cterm_symbolic_forwards_custom_entry_set() -> None:
# Given a caller that overrides the per-request entry set through the factory
with patch('pyk.cterm.symbolic.KoreClient'):
with cterm_symbolic(
definition=Mock(),
definition_dir=Mock(),
start_server=False,
port=1,
haskell_log_entries=['Rewrite', 'DebugApplyEquation'],
) as cts:
# Then the built CTermSymbolic requests exactly that set
assert cts._haskell_log_entries == ('Rewrite', 'DebugApplyEquation')


def test_cterm_symbolic_defaults_to_canonical_entry_set() -> None:
# Given no override
with patch('pyk.cterm.symbolic.KoreClient'):
with cterm_symbolic(definition=Mock(), definition_dir=Mock(), start_server=False, port=1) as cts:
# Then the canonical default is used
assert cts._haskell_log_entries == HASKELL_LOGGING_ENTRIES
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests can be dropped I think? What value the private attribute takes after initialization is not a concern, only what entries there are in the execute request, which the previous tests cover thoroughly.

kore_client: KoreClient, rpc_client: MockClient, mock: Mock
) -> None:
# Given the underlying client reports an id and a normal response
mock.last_request_id = 'claim-x-001'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is mock?

assert actual == expected


def test_last_request_id_surfaces_underlying_client_id(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like all these test can be integrated into the existing harnesses, i.e. turn them into entries in the corresponding test data tables.

Comment on lines +252 to +274
_HASKELL_LOG_ENTRIES: Final = [
{'context': ['proxy', 'detail'], 'message': 'kore-simplify'},
{'label': 'EVM.foo', 'rule_id': 'abc', 'pre_hash': 'deadbeef', 'post_hash': 'f00d'},
]


_HASKELL_LOG_RESPONSES: Final[dict[str, dict[str, Any]]] = {
'execute': {'state': {'term': kore(int_dv(2))}, 'depth': 1, 'reason': 'stuck'},
'implies': {'valid': True, 'implication': kore(int_top)},
'simplify': {'state': kore(int_top)},
}


def _invoke_with_logging(client: KoreClient, method: str, haskell_logging: list[str] | None) -> Any:
match method:
case 'execute':
return client.execute(int_dv(0), haskell_logging=haskell_logging)
case 'implies':
return client.implies(int_bottom, int_top, haskell_logging=haskell_logging)
case 'simplify':
return client.simplify(And(INT, (int_top, int_top)), haskell_logging=haskell_logging)
case _:
raise AssertionError(method)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of parametrization doesn't make much sense imho.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants