Skip to content

Review fixes: kernel soundness, untrusted-input hardening, perf, CI#439

Open
johnchandlerburnham wants to merge 1 commit into
mainfrom
jcb/fixes
Open

Review fixes: kernel soundness, untrusted-input hardening, perf, CI#439
johnchandlerburnham wants to merge 1 commit into
mainfrom
jcb/fixes

Conversation

@johnchandlerburnham

Copy link
Copy Markdown
Member

Implement the findings of a full-codebase security/soundness review of 623be4b: the critical finding, all 21 highs, 30/36 mediums, and the 12 highest-value lows. Per-finding record with deferral reasons: fable-review-fixes.md. Trust-model rule throughout: the Lean IxVM kernel is the proven artifact, so Lean-side guards are soundness fixes; Rust-side-only gaps affect only the convenience checker.

Soundness (proven Lean kernel + ingress):

  • C-1: is_unit_like_type now requires num_indices == 0 (mirrors def_eq.rs / lean4lean). Indexed singletons are no longer unit-like, closing a definitional-UIP / provable-False hole.
  • H-2: try_eta_struct requires infer(t) defeq infer(s) before comparing fields. Relatedly (L): out-of-range projections in Whnf now stay stuck instead of fabricating BVar(0).
  • H-3: GeneratedRecursor carries the canonical params/motives/minors/ indices split; stored recursors with a different split are rejected (the type def-eq only constrains the sum; iota slices by the scalars).
  • Contested-C: the Canonical && motives>1 escape hatch to the weak coherence check is removed. Wrong-candidate selection (the case it papered over) is handled by trying every signature-matched generated candidate; acceptance always requires full type + rules def-eq.
  • H-1: blob bytes are blake3-verified against their address at load in all three env deserializers (Lean in-circuit ingress already did).
  • Anon ingress now runs the same Indc-block canonicity validation as the Meta path (M); infer_proj checks struct application arity (L).

Claims & verification:

  • H-6: ix verify prints the decoded claim; conditional claims are rejected unless --allow-conditional (shard mode passes it internally, where digests are bound).
  • H-17: README/docs state the non-ZK status of the pinned multi-stark/Plonky3 backend; perf/size figures relabeled as design targets; ix prove prints a runtime not-zero-knowledge warning.
  • Store.readVerified (blake3 re-check) used for proof reads; raw Store.read documented as writeAt-keyed only (L). rs_aiur_proof_of_bytes returns Except instead of aborting the process on malformed proof bytes (M).

Untrusted-input DoS:

  • H-7: every untrusted length-prefixed allocation capped to the actual buffer remainder: serialize.rs name tables + get_aux_layout, metadata.rs (15 sites), proof.rs, profile.rs .ixesp parser.
  • H-8/H-9: Share-expansion and Mdata arena walks fuel-bounded at all ingress + decompile loops (len+1 is exact: deterministic single- successor walk, so a revisit implies a cycle); cyclic vectors error.
  • H-10/H-11: iroh Request/Response decoding is fallible (trailing bytes rejected); handshake moved inside the per-connection task so one bad peer can't kill or head-of-line-block the server. Plus (M): 64-connection cap, 1 GiB fail-closed store cap, client-side hash verification of GET responses, ix-specific ALPN.
  • H-12: native Nat reduction output cap (2^26 bits): shiftLeft checks bits(a)+shift, pow checks bits(a)*exp; terms stay stuck, mirroring the existing pow-exponent guard.
  • M/L: shard.rs infeasible-partition clamp (was panic), ffi/ixon/expr.rs build() made iterative, deser_syntax depth cap (512).

Performance (quadratics):

  • H-13: decompile Pass 2 indexes Muts plan candidates once (build_muts_plan_index) instead of rescanning the env per aux block.
  • H-14: in-circuit shard skip-check uses a ptr_val-keyed RBTreeMap membership map (same fail-closed invariant as build_addr_pos_map) plus a lockstep cursor, replacing two per-constant linear list walks.
  • H-15: whnf transient-nat probe pre-filters with an allocation-free spine_head_and_len; spine collection + Recr clone now happen only for actual Nat-recursor heads; redundant rules.clone() in iota removed.

Compile correctness (M):

  • univ_cache invalidated when the level-param list changes; under-applied and bare references to permuted auxiliaries are hard CompileErrors; aux-regen surgery guard requires the root to classify to a compiled inductive; recursor is_large override uses is_never_zero (Lean's is_not_zero); Prop-level nested brecOn fails closed instead of indexing OOB.
  • Aiur: Check.lean unbound-local fallback to globals removed; Lower's silent #[0]/#[] binding defaults replaced by lookupBinding errors (6 sites); nullary-ctor ref padding pads with 0, not the tag (two flat encodings of one value, L).

Aiur semantics (contested-H, L):

  • BytecodeEval matchContinue: a branch return now truncates the continuation (BlockExit threading, mirrors Rust execute.rs).
  • SourceEval's remaining .ret divergence documented as KNOWN DIVERGENCE (differential tests can now catch it on the BytecodeEval side).

Tests (H-4/H-5 coverage gaps):

  • Every Aiur prove/verify test also asserts verification rejects a mismatched claim and tampered proof bytes.
  • New kernelProveCheck in the ixvm suite: real STARK prove->verify of a kernel Check HEq claim (previously no test ever proved the kernel program or verified one of its certificates).

CI / supply chain:

  • H-16: bench-pr.yml split so untrusted PR code runs in a secretless job; a separate comment job (fresh runner, no checkout) mints the App token and posts from an uploaded artifact; comment-body shell injection closed via env-var indirection.
  • Third-party actions SHA-pinned across all 5 workflows; bencher pinned to v0.6.6 (held an API token on @main); deny.toml [sources] now enforcing with an explicit git allowlist; hickory advisory ignores documented with reasons.

Misc: compute_sccs tolerates dangling refs (was unwrap panic); insert_addr_aliases probes known-name KIds instead of scanning the env; ClaimHarness computes shard closures in one multi-root walk (closureFromAll); iroh GET output is binary-safe; HashMap FFI collision leak fixed (uset decrements the overwritten slot); ZKVoting verifier stub exits 1 instead of reporting success; catch_unwind-under- panic=abort documented; DebugIxVM.lean single-case debug driver added.

Deferred with reasons: subst.rs depth guard, lean_env.rs wrapper-layout redesign, KEnv::get Arc conversion (~109 call sites; H-15 removed the hot cost), compile.rs comparator recursion, partition objective modeling, Tests/Main suite gating, remaining off-path lows, and the G-1 kernel-parity audit (process item).
fable-review.md
fable-review-fixes.md

Validation: cargo test 1045/1045; clippy --all-targets --all-features 0 warnings; fmt clean; cargo deny check sources advisories pass; lake build (lib, ix, IxTests); lake test cli, rbtree-map, --ignored ixvm, aiur, aiur-hashes all pass (incl. the new prove+verify and negative cases); ix check-rs compileinitstd.ixe: 105,487/105,487 meta and 89,010/89,010 anon -- no false rejections on a real 321 MB env; CLI e2e: prove HEq -> verify ok (decoded claim printed), tampered stored proof rejected with the new store-corruption error.

Implement the findings of a full-codebase security/soundness review of
623be4b (fable-review.md): the critical finding, all 21 highs, 30/36
mediums, and the 12 highest-value lows. Per-finding record with deferral
reasons: fable-review-fixes.md. Trust-model rule throughout: the Lean
IxVM kernel is the proven artifact, so Lean-side guards are soundness
fixes; Rust-side-only gaps affect only the convenience checker.

Soundness (proven Lean kernel + ingress):
- C-1: is_unit_like_type now requires num_indices == 0 (mirrors
  def_eq.rs / lean4lean). Indexed singletons are no longer unit-like,
  closing a definitional-UIP / provable-False hole.
- H-2: try_eta_struct requires infer(t) defeq infer(s) before comparing
  fields. Relatedly (L): out-of-range projections in Whnf now stay
  stuck instead of fabricating BVar(0).
- H-3: GeneratedRecursor carries the canonical params/motives/minors/
  indices split; stored recursors with a different split are rejected
  (the type def-eq only constrains the sum; iota slices by the scalars).
- Contested-C: the `Canonical && motives>1` escape hatch to the weak
  coherence check is removed. Wrong-candidate selection (the case it
  papered over) is handled by trying every signature-matched generated
  candidate; acceptance always requires full type + rules def-eq.
- H-1: blob bytes are blake3-verified against their address at load in
  all three env deserializers (Lean in-circuit ingress already did).
- Anon ingress now runs the same Indc-block canonicity validation as
  the Meta path (M); infer_proj checks struct application arity (L).

Claims & verification:
- H-6: `ix verify` prints the decoded claim; conditional claims are
  rejected unless --allow-conditional (shard mode passes it internally,
  where digests are bound).
- H-17: README/docs state the non-ZK status of the pinned
  multi-stark/Plonky3 backend; perf/size figures relabeled as design
  targets; `ix prove` prints a runtime not-zero-knowledge warning.
- Store.readVerified (blake3 re-check) used for proof reads; raw
  Store.read documented as writeAt-keyed only (L).
  rs_aiur_proof_of_bytes returns Except instead of aborting the process
  on malformed proof bytes (M).

Untrusted-input DoS:
- H-7: every untrusted length-prefixed allocation capped to the actual
  buffer remainder: serialize.rs name tables + get_aux_layout,
  metadata.rs (15 sites), proof.rs, profile.rs .ixesp parser.
- H-8/H-9: Share-expansion and Mdata arena walks fuel-bounded at all
  ingress + decompile loops (len+1 is exact: deterministic single-
  successor walk, so a revisit implies a cycle); cyclic vectors error.
- H-10/H-11: iroh Request/Response decoding is fallible (trailing bytes
  rejected); handshake moved inside the per-connection task so one bad
  peer can't kill or head-of-line-block the server. Plus (M):
  64-connection cap, 1 GiB fail-closed store cap, client-side hash
  verification of GET responses, ix-specific ALPN.
- H-12: native Nat reduction output cap (2^26 bits): shiftLeft checks
  bits(a)+shift, pow checks bits(a)*exp; terms stay stuck, mirroring
  the existing pow-exponent guard.
- M/L: shard.rs infeasible-partition clamp (was panic),
  ffi/ixon/expr.rs build() made iterative, deser_syntax depth cap (512).

Performance (quadratics):
- H-13: decompile Pass 2 indexes Muts plan candidates once
  (build_muts_plan_index) instead of rescanning the env per aux block.
- H-14: in-circuit shard skip-check uses a ptr_val-keyed RBTreeMap
  membership map (same fail-closed invariant as build_addr_pos_map)
  plus a lockstep cursor, replacing two per-constant linear list walks.
- H-15: whnf transient-nat probe pre-filters with an allocation-free
  spine_head_and_len; spine collection + Recr clone now happen only for
  actual Nat-recursor heads; redundant rules.clone() in iota removed.

Compile correctness (M):
- univ_cache invalidated when the level-param list changes;
  under-applied and bare references to permuted auxiliaries are hard
  CompileErrors; aux-regen surgery guard requires the root to classify
  to a compiled inductive; recursor is_large override uses
  is_never_zero (Lean's is_not_zero); Prop-level nested brecOn fails
  closed instead of indexing OOB.
- Aiur: Check.lean unbound-local fallback to globals removed; Lower's
  silent #[0]/#[] binding defaults replaced by lookupBinding errors
  (6 sites); nullary-ctor ref padding pads with 0, not the tag (two
  flat encodings of one value, L).

Aiur semantics (contested-H, L):
- BytecodeEval matchContinue: a branch `return` now truncates the
  continuation (BlockExit threading, mirrors Rust execute.rs).
- SourceEval's remaining .ret divergence documented as KNOWN DIVERGENCE
  (differential tests can now catch it on the BytecodeEval side).

Tests (H-4/H-5 coverage gaps):
- Every Aiur prove/verify test also asserts verification rejects a
  mismatched claim and tampered proof bytes.
- New kernelProveCheck in the ixvm suite: real STARK prove->verify of a
  kernel `Check HEq` claim (previously no test ever proved the kernel
  program or verified one of its certificates).

CI / supply chain:
- H-16: bench-pr.yml split so untrusted PR code runs in a secretless
  job; a separate comment job (fresh runner, no checkout) mints the App
  token and posts from an uploaded artifact; comment-body shell
  injection closed via env-var indirection.
- Third-party actions SHA-pinned across all 5 workflows; bencher pinned
  to v0.6.6 (held an API token on @main); deny.toml [sources] now
  enforcing with an explicit git allowlist; hickory advisory ignores
  documented with reasons.

Misc: compute_sccs tolerates dangling refs (was unwrap panic);
insert_addr_aliases probes known-name KIds instead of scanning the env;
ClaimHarness computes shard closures in one multi-root walk
(closureFromAll); iroh GET output is binary-safe; HashMap FFI collision
leak fixed (uset decrements the overwritten slot); ZKVoting verifier
stub exits 1 instead of reporting success; catch_unwind-under-
panic=abort documented; DebugIxVM.lean single-case debug driver added.

Deferred with reasons (see fable-review-fixes.md): subst.rs depth guard,
lean_env.rs wrapper-layout redesign, KEnv::get Arc conversion (~109 call
sites; H-15 removed the hot cost), compile.rs comparator recursion,
partition objective modeling, Tests/Main suite gating, remaining
off-path lows, and the G-1 kernel-parity audit (process item).

Validation: cargo test 1045/1045; clippy --all-targets --all-features
0 warnings; fmt clean; cargo deny check sources advisories pass;
lake build (lib, ix, IxTests); lake test cli, rbtree-map, --ignored
ixvm, aiur, aiur-hashes all pass (incl. the new prove+verify and
negative cases); ix check-rs compileinitstd.ixe: 105,487/105,487 meta
and 89,010/89,010 anon -- no false rejections on a real 321 MB env;
CLI e2e: prove HEq -> verify ok (decoded claim printed), tampered
stored proof rejected with the new store-corruption error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant