pyk: per-request haskell-backend RPC log capture, written per request under the proof dir#4928
pyk: per-request haskell-backend RPC log capture, written per request under the proof dir#4928ehildenb wants to merge 14 commits into
Conversation
…rgs hook Pull the `--log` / `--log-format` / `--log-entries` construction out of KoreServer._extra_args into a new _haskell_log_cli_args method so subclasses can supply their own flag form. No behavior change at this commit; the default still emits the kore-rpc form. Mirrors the existing _smt_cli_args override pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…og-file / -l in BoosterServer kore-rpc-booster diverged from kore-rpc on the haskell-log CLI: it accepts `--log-file FILE` (not `--log FILE`) and repeated `-l ENTRY` (not the comma-separated `--log-entries A,B,C`). See booster's CLOptions.hs (`long "log-file"` and `long "log-level" <> short 'l'` inside a `many`). Pyk was emitting the kore-rpc form to both binaries, causing kore-rpc-booster to exit on usage error before opening its RPC port — surfacing as a psutil.AccessDenied in KoreServer._get_host_and_port. Override _haskell_log_cli_args on BoosterServer to produce the booster form. KoreServer is unchanged. Regression test pins both flag forms by calling _haskell_log_cli_args directly on bypass-__init__ instances. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…-local accessor on KoreClient
`JsonRpcClient.request` now records the deterministic `f'{client_label}-{NNN}'`
id it issues into a thread-local; `KoreClient.last_request_id` exposes it.
A diagnostic caller snapshots this immediately after each RPC to use the id as a
join key onto the per-request captured logs and handoff metadata. Thread-local
so parallel proof workers never clobber each other's ids.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ion/{kore/test_kore_client,test_division_hooks}: per-request haskell-logging entry list + haskell-log-entries on result types
Add `haskell_logging: Iterable[str] | None = None` to KoreClient.{execute,implies,
simplify,get_model,add_module}: the value is the list of haskell-backend log entries
to capture for that request, sent verbatim as the `haskell-logging` wire field (None
omits it). Putting the entry set on the request — rather than a fixed backend bundle
or a server-startup `-l` set — lets pyk own the default and downstream override it,
and keeps the backend from committing to a log set ahead of time. Parse the optional
`haskell-log-entries` bundle onto every ExecuteResult subtype, ImpliesResult, and
GetModelResult; `simplify` now returns a `SimplifyResult` `(state, logs,
haskell_log_entries)`. Absent entries / unset list preserve today's behaviour.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…haskell_logging through execute, simplify, kast_simplify, implies, extend_cterm
Add an optional `haskell_logging` per-call parameter to
CTermSymbolic.{execute,simplify,kast_simplify,implies} (including the recursive
failure-reason implies) and forward it through KCFGExplore.extend_cterm to
CTermSymbolic.execute. Additive and default-preserving: omitting it reproduces
today's calls exactly (None leaves backend logging untouched).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…kell-logging bundle to <dir>/<request_id>.jsonl Add an instance-level `haskell_log_dir` to CTermSymbolic (threaded through the `cterm_symbolic` factory). When set, every execute/simplify/implies requests the per-request `haskell-logging` bundle (unless a per-call flag overrides) and writes the captured entries to `<haskell_log_dir>/<request_id>.jsonl`, one JSON value per line, named by request id so concurrent proof workers never collide. Mirrors the `booster_only_simplify` instance-default pattern; no-op when unset. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…est bundles under the save dir Add a `--haskell-logging` boolean to `pyk prove` (ProveOptions). When set, exec_prove points `cterm_symbolic` at `<save-directory>/haskell-logs`, so every RPC's `haskell-logging` bundle is captured and written one file per request. Requires `--save-directory`; errors clearly otherwise. The single per-request flag is self-sufficient on the backend side: the haskell-backend captures its bundle (kore equation attempt/apply plus term resolution, and the booster proxy/detail/abort/simplify context events) independently of the server's -l/--log-entries configuration. So pyk only sets the one flag and does not configure which entries to enable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
a165263 to
13d2a6c
Compare
| haskell_log_format: KoreExecLogFormat = KoreExecLogFormat.JSON, | ||
| haskell_log_entries: list[str] | None = None, | ||
| ) -> BoosterServer: | ||
| server = object.__new__(BoosterServer) |
There was a problem hiding this comment.
Consider using mocks of the actual classes instead that simply patch start into a no-op. This would let us test all other private methods as well.
There was a problem hiding this comment.
Alternatively, you can refactor the classes to call start only on __enter__. Then the real classes can be tested directly.
There was a problem hiding this comment.
Seems like scope creep for this PR.
There was a problem hiding this comment.
I can see how this can be the case for the second comment.
Regarding the first comment though: if you patch start, it'll eliminate the need to work around the type checker (with __new__ and # type: ignore comments).
…equest_id per-client, not thread-local Each proof worker drives its own KoreClient (`_ProverPool`/`KoreServerPool` key instances by thread), so no client is ever shared across threads — the thread-local guarded a scenario that does not occur. Record the issued id on `JsonRpcClient`, surface it through `JsonRpcClientFacade` and `KoreClient.last_request_id`, and drop the module-level thread-local. The id-generation test moves onto `JsonRpcClient` directly (no reach-through), and a `KoreClient`-level test covers the surfacing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… docstring clarity, spelling Address review feedback: let `HASKELL_LOGGING_ENTRIES` infer its type from the literal (`Final` without the explicit tuple annotation), double-backtick ``CTermSymbolic`` in the docstring, clarify that the populated `haskell_log_entries` default is *what* to request rather than an on-switch (logging stays off until `haskell_log_dir` is set), and fix the `honors` spelling in a test name. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ponse tests over endpoints The five near-identical execute/implies/simplify tests collapse into two endpoint-parameterized tests (requested vs. default), asserting both that the entry list reaches the wire and that the response bundle is surfaced on the result. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ions over server/flag cases Collapse the per-class methods and two builders into a single parameterized top-level `test_haskell_log_cli_args` driven by a `(server_cls, log file, format, entries, expected)` table, covering both KoreServer (kore-rpc form) and BoosterServer (booster form) including the no-log-file and no-entries cases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ntries through cterm_symbolic to the per-request set Previously the factory only fed `haskell_log_entries` to the legacy server-side `-l` path, so a CTermSymbolic built through `cterm_symbolic` (the init path used by downstream `legacy_explore`/`create_kcfg_explore`) always requested pyk's canonical default and clients could not override which entries to capture without constructing CTermSymbolic directly. Forward the param to the per-request set, falling back to the canonical default when unset so behavior is unchanged out of the box. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
| """Build a `CTermSymbolic` whose backend returns ``response`` and whose Kast<->Kore | ||
| conversions are stubbed, returning the backend `Mock` so a test can inspect call args. | ||
| """ | ||
| kore_client = Mock() |
There was a problem hiding this comment.
That fixture builds a real KoreClient over a mocked transport, whereas these tests mock the whole KoreClient to exercise CTermSymbolic in isolation
Good point.
Happy to extract a shared mock-KoreClient helper into a conftest if you'd prefer that over the local one.
No need to extract it, but turning it into an actual local fixture would be nice.
| def test_execute_on_requests_configured_entries() -> None: | ||
| # Given | ||
| kore_client, cts = _mock_client_cts(StuckResult(state=State(term=int_dv(2)), depth=1, logs=())) | ||
| dummy: CTerm = Mock() | ||
|
|
||
| # When the per-call flag turns logging on | ||
| cts.execute(dummy, haskell_logging=True) | ||
|
|
||
| # Then the configured entry set (the canonical default here) is what reaches KoreClient.execute | ||
| _args, kwargs = kore_client.execute.call_args | ||
| assert kwargs['haskell_logging'] == HASKELL_LOGGING_ENTRIES |
There was a problem hiding this comment.
And that fixture could provide an interface similar to this, makes tests much easier to read:
| def test_execute_on_requests_configured_entries() -> None: | |
| # Given | |
| kore_client, cts = _mock_client_cts(StuckResult(state=State(term=int_dv(2)), depth=1, logs=())) | |
| dummy: CTerm = Mock() | |
| # When the per-call flag turns logging on | |
| cts.execute(dummy, haskell_logging=True) | |
| # Then the configured entry set (the canonical default here) is what reaches KoreClient.execute | |
| _args, kwargs = kore_client.execute.call_args | |
| assert kwargs['haskell_logging'] == HASKELL_LOGGING_ENTRIES | |
| def test_execute_on_requests_configured_entries() -> None: | |
| # Given a mock CTermSymbolic initialized with empty ``logs`` | |
| cts = _mock_client_cts(logs=()) | |
| # When the per-call flag turns logging on | |
| cts.execute(Mock(), haskell_logging=True) | |
| # Then the configured entry set (the canonical default here) is what reaches KoreClient.execute | |
| assert_client_execute_called_with_haskell_logging(HASKELL_LOGGING_ENTRIES) |
| # Given a CTermSymbolic configured with a custom entry set | ||
| kore_client = Mock() | ||
| kore_client.execute.return_value = StuckResult(state=State(term=int_dv(2)), depth=1, logs=()) | ||
| cts = CTermSymbolic(kore_client, Mock(), haskell_log_entries=['Rewrite', 'DebugApplyEquation']) |
There was a problem hiding this comment.
_mock_client_cts can take whatever arguments CTermSymbolic takes (except of the KoreClient of course), so that these test functions collapse into a single parameterized test.
| def _log_dir_cts(response: object, haskell_log_dir: Path) -> tuple[Mock, CTermSymbolic]: | ||
| kore_client = Mock() | ||
| kore_client.execute.return_value = response | ||
| cts = CTermSymbolic(kore_client, Mock(), haskell_log_dir=haskell_log_dir) | ||
| cts.kast_to_kore = Mock(return_value=Mock()) # type: ignore[method-assign] | ||
| cts.kore_to_kast = Mock(return_value=mlTop()) # type: ignore[method-assign] | ||
| return kore_client, cts |
There was a problem hiding this comment.
Then this helper is gone as well.
| def test_execute_writes_haskell_log_bundle(tmp_path: Path) -> None: | ||
| # Given a configured log dir and a response carrying a per-request bundle | ||
| entry = {'context': ['Proxy'], 'message': 'simplifying'} | ||
| response = StuckResult(state=State(term=int_dv(2)), depth=1, logs=(), haskell_log_entries=(entry,)) | ||
| kore_client, cts = _log_dir_cts(response, tmp_path) | ||
| kore_client.last_request_id = 'proof-007' | ||
|
|
||
| # When | ||
| cts.execute(Mock()) | ||
|
|
||
| # Then logging is requested and the bundle lands in <dir>/<request_id>.jsonl, one JSON value per line | ||
| _args, kwargs = kore_client.execute.call_args | ||
| assert kwargs['haskell_logging'] == HASKELL_LOGGING_ENTRIES | ||
| log_file = tmp_path / 'proof-007.jsonl' | ||
| assert json.loads(log_file.read_text().strip()) == entry |
There was a problem hiding this comment.
Using the more powerful _mock_client_cts to deal with this test case:
| def test_execute_writes_haskell_log_bundle(tmp_path: Path) -> None: | |
| # Given a configured log dir and a response carrying a per-request bundle | |
| entry = {'context': ['Proxy'], 'message': 'simplifying'} | |
| response = StuckResult(state=State(term=int_dv(2)), depth=1, logs=(), haskell_log_entries=(entry,)) | |
| kore_client, cts = _log_dir_cts(response, tmp_path) | |
| kore_client.last_request_id = 'proof-007' | |
| # When | |
| cts.execute(Mock()) | |
| # Then logging is requested and the bundle lands in <dir>/<request_id>.jsonl, one JSON value per line | |
| _args, kwargs = kore_client.execute.call_args | |
| assert kwargs['haskell_logging'] == HASKELL_LOGGING_ENTRIES | |
| log_file = tmp_path / 'proof-007.jsonl' | |
| assert json.loads(log_file.read_text().strip()) == entry | |
| def test_execute_writes_haskell_log_bundle(tmp_path: Path) -> None: | |
| # Given | |
| entry = {'context': ['Proxy'], 'message': 'simplifying'} | |
| log_file = tmp_path / 'proof-007.jsonl' | |
| cts = _mock_client_cts(haskell_log_dir=tmp_path) | |
| assume_client_execute_response_with_log_entries((entry,)) | |
| assume_client_last_request_id('proof-007') | |
| # When | |
| cts.execute(Mock()) | |
| # Then logging is requested and the bundle lands in <dir>/<request_id>.jsonl, one JSON value per line | |
| assert_client_execute_called_with_haskell_logging(HASKELL_LOGGING_ENTRIES) | |
| assert json.loads(log_file.read_text().strip()) == entry |
| def test_cterm_symbolic_forwards_custom_entry_set() -> None: | ||
| # Given a caller that overrides the per-request entry set through the factory | ||
| with patch('pyk.cterm.symbolic.KoreClient'): | ||
| with cterm_symbolic( | ||
| definition=Mock(), | ||
| definition_dir=Mock(), | ||
| start_server=False, | ||
| port=1, | ||
| haskell_log_entries=['Rewrite', 'DebugApplyEquation'], | ||
| ) as cts: | ||
| # Then the built CTermSymbolic requests exactly that set | ||
| assert cts._haskell_log_entries == ('Rewrite', 'DebugApplyEquation') | ||
|
|
||
|
|
||
| def test_cterm_symbolic_defaults_to_canonical_entry_set() -> None: | ||
| # Given no override | ||
| with patch('pyk.cterm.symbolic.KoreClient'): | ||
| with cterm_symbolic(definition=Mock(), definition_dir=Mock(), start_server=False, port=1) as cts: | ||
| # Then the canonical default is used | ||
| assert cts._haskell_log_entries == HASKELL_LOGGING_ENTRIES |
There was a problem hiding this comment.
These tests can be dropped I think? What value the private attribute takes after initialization is not a concern, only what entries there are in the execute request, which the previous tests cover thoroughly.
| kore_client: KoreClient, rpc_client: MockClient, mock: Mock | ||
| ) -> None: | ||
| # Given the underlying client reports an id and a normal response | ||
| mock.last_request_id = 'claim-x-001' |
| assert actual == expected | ||
|
|
||
|
|
||
| def test_last_request_id_surfaces_underlying_client_id( |
There was a problem hiding this comment.
I feel like all these test can be integrated into the existing harnesses, i.e. turn them into entries in the corresponding test data tables.
| _HASKELL_LOG_ENTRIES: Final = [ | ||
| {'context': ['proxy', 'detail'], 'message': 'kore-simplify'}, | ||
| {'label': 'EVM.foo', 'rule_id': 'abc', 'pre_hash': 'deadbeef', 'post_hash': 'f00d'}, | ||
| ] | ||
|
|
||
|
|
||
| _HASKELL_LOG_RESPONSES: Final[dict[str, dict[str, Any]]] = { | ||
| 'execute': {'state': {'term': kore(int_dv(2))}, 'depth': 1, 'reason': 'stuck'}, | ||
| 'implies': {'valid': True, 'implication': kore(int_top)}, | ||
| 'simplify': {'state': kore(int_top)}, | ||
| } | ||
|
|
||
|
|
||
| def _invoke_with_logging(client: KoreClient, method: str, haskell_logging: list[str] | None) -> Any: | ||
| match method: | ||
| case 'execute': | ||
| return client.execute(int_dv(0), haskell_logging=haskell_logging) | ||
| case 'implies': | ||
| return client.implies(int_bottom, int_top, haskell_logging=haskell_logging) | ||
| case 'simplify': | ||
| return client.simplify(And(INT, (int_top, int_top)), haskell_logging=haskell_logging) | ||
| case _: | ||
| raise AssertionError(method) |
There was a problem hiding this comment.
This sort of parametrization doesn't make much sense imho.
Blocked on: runtimeverification/haskell-backend#4151Blocked on: #4929Lets pyk request the haskell-backend's per-request log bundle and write it to disk, so downstream can diagnose fallback behaviour per RPC. The request carries the list of log entries to capture and the response returns them in-band; pyk owns the default set and writes each request's bundle to its own file. This pairs with the haskell-backend
per-request-rpcbranch, which adds thehaskell-loggingrequest field ([String]) and thehaskell-log-entriesresponse field to the five JSON-RPC endpoints and captures the named entries independently of the server's-l/--log-fileconfiguration.Changes:
BoosterServernow emits the booster-form haskell-log CLI (--log-file FILE, repeated-l ENTRY) via a new overridable_haskell_log_cli_argshook; it previously sent the kore-rpc form (--log,--log-entries A,B), which made kore-rpc-booster exit on a usage error before opening its RPC port.KoreClient.{execute,implies,simplify,get_model,add_module}takehaskell_logging: Iterable[str] | None— the entries to capture for that request — and send it as thehaskell-loggingwire field (omitted whenNone); the optionalhaskell-log-entriesbundle is parsed onto every result type, andsimplifyreturns aSimplifyResult (state, logs, haskell_log_entries).KoreClient.last_request_idexposes the JSON-RPC id of the most recent request on the calling thread, used to name each per-request log file.CTermSymbolicowns the canonical default entry set (HASKELL_LOGGING_ENTRIES), overridable per instance so downstream can tailor it; the on/off flag is threaded throughexecute/simplify/kast_simplify/impliesandKCFGExplore.extend_cterm.CTermSymbolictakes ahaskell_log_dir: when set, every RPC requests the bundle and the entries returned on the response are written to<haskell_log_dir>/<request_id>.jsonl(one JSON value per line, named by request id so concurrent proof workers never collide).pyk prove --haskell-loggingpoints that sink at<save-directory>/haskell-logs/(requires--save-directory).Validation:
make -C pyk checkclean; unit suites forcterm/kore/kcfggreen, including new coverage for the per-request entry list, result-type bundle parsing, the entry-set override, and the jsonl writer.per-request-rpcbackend implementation: it acceptshaskell-logging: [String], routes the flat list to both engines (skipping names each does not recognise), tag-matches id-carrying contexts likeRewrite, and returns the captured entries in-band — pyk's default set is fully recognised, no skips.regression-newnot run locally (the machine's K install is pinned for a benchmark) — relying on CI.