diff --git a/.claude/board/LATEST_STATE.md b/.claude/board/LATEST_STATE.md
index 17099dad..c9f05378 100644
--- a/.claude/board/LATEST_STATE.md
+++ b/.claude/board/LATEST_STATE.md
@@ -140,6 +140,8 @@ Membrane consumers can now pull BOTH halves of a render `classid` BBB-safely fro
 
 ## Current Contract Inventory (lance-graph-contract)
 
+> **2026-06-25 — MODULARIZED (follow-up to #613) — `lance_graph_contract::facet`**: extracted `FacetTier` / `FacetCascade` from `canonical_node` into a dedicated, reusable `facet` module (a *reading*, NOT part of the locked node layout — the cleaner factoring; `canonical_node` re-exports both for the historical path). **Reusable lane API rounded out:** `as_u128`/`from_u128` (single-register view), `rows()` (the 4 dword rows `{domain}{schema}` / `HEEL:HIP` / `TWIG:LEAF` / `family:identity`), `prefix_distance`/`shared_prefix_tiles` (the **granularity-free LCP redout** — `vpxor`+`tzcnt`; 8:8 vs nibble is a free `>>` on the count, measured), `row_match_mask` (`vpcmpeqd`-lane), plus `as_bytes`/`ref_from_bytes` — a **zero-cost reinterpret** (`#[repr(C, align(16))]`; `as_bytes` measured to lower to `mov rax,rdi`, a literal no-op; fields read straight through as single loads). One register → row(`u32`)/tile(`u16`)/prefix(bit)/nibble(Morton) lenses, each one SIMD op (module docs). Lab-test write-up deferred. Additive, zero-dep; 741 lib green (default + `guid-v3-tail`), clippy `-D warnings` + fmt clean. EPIPHANIES `E-FACET-8-8-ALWAYS`. Branch `claude/facet-module`.
+>
 > **2026-06-25 — ADDED (#613, the 6-tier 8:8 homogeneous facet + V3 routing fold)**: `lance_graph_contract::canonical_node::{FacetTier, FacetCascade}` — the **ALWAYS-8:8** content-blind facet substrate. `FacetTier{lo, hi}` (2 B, `const`; `as_u16` concatenated + `morton` 2bit×2bit Morton-tile projections); `FacetCascade{facet_classid: u32, tiers: [FacetTier; 6]}` (16 B = `facet_classid(4) | 6×(8:8)=12`, harvest §5.1) — a *reading* over a borrowed `[u8;16]` with `from_bytes`/`to_bytes`/`hi_chain`/`lo_chain`/`hi_distance`/`lo_distance`. **Carries NO value-slab offset** → does NOT touch the operator-LOCKED 480 B layout (the `classid→ClassView` byte-pick is the separate, panel-gated step); content-blind — only the consumer projects meaning (`part_of:is_a` / 256:256 palette centroid / `group:member` / `column:row` / concatenated u16 …), every reading amortizing to one 2bit×2bit Morton tile cascade. **Key-side V3 routing:** `hhtl::NiblePath::from_guid_prefix_v3` (feature `guid-v3-tail`) folds the 4 HHTL tiers `HEEL·HIP·TWIG·LEAF` in FULL (both bytes, depth 16) — the facet's routing prefix; `family`/`identity` stay the basin tail. `classid` NOT folded, so `soa_graph::hhtl_path` (schema-driven by `tail_variant`) routes OSINT-V3 `0x1000_0700` non-empty — fixes the Codex-P2 latent EMPTY-fold. `from_guid_prefix`'s "reserved-zero" doc/guard scoped to **v1-fold** (NOT a global classid law). Additive, zero-dep; 739 lib green (default + `guid-v3-tail`), clippy `-D warnings` + fmt clean. EPIPHANIES `E-FACET-8-8-ALWAYS`. Branch `claude/p-a-readmode-tail-variant`.
 
 > **2026-06-21 — ADDED (content-store for the AriGraph/OSINT episodic arc)**: `lance_graph_contract::content_store::{ContentId, SourceSpan, ContentError, ContentStore, ContentSink}` — the content-addressed **cold text/blob store** contract. `ContentId(u64)` = `hash::fnv1a` of the bytes (stable across versions — the correct content address; `DefaultHasher` must never key one; `0` = sentinel). `SourceSpan{ContentId,u32,u32}` = the fixed-size, `Copy` typed form of `template-equivalence`'s `(source_id,start,end)` provenance; `is_cited()` = "no source span → no claim" (non-sentinel content + non-empty span). `ContentStore` (cold read: `resolve(id) -> Option<&[u8]>` zero-copy slice into the mmap/backing store; `resolve_span`/`contains` defaulted) + `ContentSink` (idempotent `put -> ContentId`, dedup by content-address: many episodes → one source row). **Hot/cold firewall (ADR-022)**: the hot path (SIMD sweep, AriGraph edge traversal) touches only the fixed-size `ContentId`/`SourceSpan`; bytes hydrate cold at the membrane (the fingerprint is the hot-path stand-in for text). Nothing variable-length enters the 512 B node. Additive, zero-dep; +6 tests (stable/dedup, idempotent put, resolve_span slice, OOB/missing errors, uncited-rejected); clippy clean. Consumers: `rs-graph-llm/episodic-arc-task` (replaces its local fnv1a), `template-equivalence` (typed provenance). Plan: `.claude/plans/arigraph-osint-episodic-v1.md` (D-CC-ARI-3). Branch `claude/content-store-contract-draft`.
diff --git a/crates/lance-graph-contract/src/canonical_node.rs b/crates/lance-graph-contract/src/canonical_node.rs
index 0c410c79..6c2be389 100644
--- a/crates/lance-graph-contract/src/canonical_node.rs
+++ b/crates/lance-graph-contract/src/canonical_node.rs
@@ -1213,183 +1213,10 @@ impl KanbanTenant {
     }
 }
 
-// ── FacetCascade value tenant (the 6-tier 8:8 homogeneous facet) ─────────────────
-
-/// One **8:8 tile** of a [`FacetCascade`] — ALWAYS exactly two bytes, `hi` and `lo`.
-/// The substrate is **content-blind**: only the CONSUMER (the
-/// [`FacetCascade::facet_classid`]'s ClassView) decides what the 8:8 *means*. The
-/// same two bytes project as any of:
-///
-/// - `(part_of : is_a)` — mereology : taxonomy (the anatomy / `converge.rs` default)
-/// - a **256:256 palette centroid** pair (CAM-PQ — `hi`/`lo` index a 256-codebook)
-/// - a concatenated `u16` ([`as_u16`](Self::as_u16))
-/// - `(group : member)`, `(mixin : identity)`, `(column : row)`, `(memberof : name)`,
-///   a `(Y : Z)` coordinate, …
-///
-/// The producer never bakes a meaning in; the reader projects one (AGI-as-glove: the
-/// SoA is content-blind). `hi` is the coarse-side byte, `lo` the fine-side byte.
-#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Default)]
-#[repr(C)]
-pub struct FacetTier {
-    /// Low byte of the LE 8:8 tile (is_a / member / row / centroid-lo / …).
-    pub lo: u8,
-    /// High byte of the LE 8:8 tile (part_of / group / column / centroid-hi / …).
-    pub hi: u8,
-}
-
-impl FacetTier {
-    /// The two bytes as the LE `u16 = (hi << 8) | lo` — the "consumer reads the 8:8
-    /// as one concatenated 16-bit value" projection.
-    #[inline]
-    #[must_use]
-    pub const fn as_u16(self) -> u16 {
-        ((self.hi as u16) << 8) | self.lo as u16
-    }
-
-    /// The `hi:lo` pair **Morton-interleaved** into a `u16` Z-order code (`lo` on
-    /// even bits, `hi` on odd) — the amortization benefit of the always-8:8
-    /// substrate: every nibble of the result is a **2 bit × 2 bit Morton tile** (2
-    /// bits of `hi` interleaved with 2 bits of `lo`), so a nibble prefix is a
-    /// quad-tree quadrant in BOTH bytes at once (`256 = 4⁴` hierarchical ancestry).
-    /// Whatever the consumer decides the 8:8 *means* (part_of:is_a, centroid:centroid,
-    /// group:member …), it ALWAYS amortizes to this one Morton tile cascade — so
-    /// hierarchical-prefix routing is uniform across every interpretation.
-    #[inline]
-    #[must_use]
-    pub const fn morton(self) -> u16 {
-        Self::spread8(self.lo) | (Self::spread8(self.hi) << 1)
-    }
-
-    /// Spread a byte's 8 bits to the even positions `0,2,…,14` of a `u16` (the
-    /// Morton building block).
-    const fn spread8(x: u8) -> u16 {
-        let mut v = x as u16; // ........ abcdefgh
-        v = (v | (v << 4)) & 0x0F0F; // ....abcd ....efgh
-        v = (v | (v << 2)) & 0x3333; // ..ab..cd ..ef..gh
-        v = (v | (v << 1)) & 0x5555; // .a.b.c.d .e.f.g.h
-        v
-    }
-}
-
-/// The **FacetCascade** — the 6-tier **8:8** homogeneous facet, read at an
-/// **alternative location to the key**: a 16-byte ClassView reading over the value
-/// slab (`soa-value-tenant-migration-v1-harvest.md` §5.1,
-/// `facet_classid(4) | 6×(8:8)=12 = 16B`).
-///
-/// **The substrate is ALWAYS 8:8.** Six tiers, each two opaque bytes (`hi:lo`); the
-/// `facet_classid`'s ClassView decides the interpretation — `(part_of:is_a)`,
-/// 256:256 palette centroid, `(group:member)`, `(column:row)`, concatenated `u16`,
-/// … (see [`FacetTier`]). Both bytes of every tier are carried (lossless): the `hi`
-/// chain prefix-routes one hierarchy, the `lo` chain the orthogonal one.
-///
-/// The full 6-tier facet does NOT fit the 64-bit key `NiblePath` — that carries only
-/// the 4-tier HHTL routing **prefix** ([`crate::hhtl::NiblePath::from_guid_prefix_v3`],
-/// `HEEL·HIP·TWIG·LEAF`); the complete 6-tier address (HEEL·HIP·TWIG·LEAF·family·
-/// identity) lives here, at the alternative value-slab location.
-///
-/// This type is a *reading* over a borrowed `[u8; 16]` — it carries NO value-slab
-/// offset, so it does not touch the operator-LOCKED 480-byte layout. The
-/// `classid → ClassView` wiring that picks which 16 value bytes it reads is a
-/// separate, panel-gated step (harvest §5 + §6).
-#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Default)]
-#[repr(C)]
-pub struct FacetCascade {
-    /// The facet's own class id — which ClassView interprets the 6 tiers' 8:8.
-    pub facet_classid: u32,
-    /// 6 tiers coarse→fine: HEEL·HIP·TWIG·LEAF·family·identity, each an 8:8 tile.
-    pub tiers: [FacetTier; 6],
-}
-
-const _: () = assert!(core::mem::size_of::<FacetTier>() == 2, "one 8:8 tile");
-const _: () = assert!(
-    core::mem::size_of::<FacetCascade>() == 16,
-    "facet_classid(4) | 6×(8:8)=12 = 16B (harvest §5.1)"
-);
-
-impl FacetCascade {
-    /// Decode from the 16 facet bytes (LE): `facet_classid` in `[0..4)`, then 6
-    /// tiers, each an LE `u16 = (hi << 8) | lo` — on the wire `[lo, hi]` (matching
-    /// the key tiers' `converge.rs` byte order).
-    #[inline]
-    #[must_use]
-    pub const fn from_bytes(b: &[u8; 16]) -> Self {
-        FacetCascade {
-            facet_classid: u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
-            tiers: [
-                FacetTier { lo: b[4], hi: b[5] },
-                FacetTier { lo: b[6], hi: b[7] },
-                FacetTier { lo: b[8], hi: b[9] },
-                FacetTier {
-                    lo: b[10],
-                    hi: b[11],
-                },
-                FacetTier {
-                    lo: b[12],
-                    hi: b[13],
-                },
-                FacetTier {
-                    lo: b[14],
-                    hi: b[15],
-                },
-            ],
-        }
-    }
-
-    /// Encode to the 16 facet bytes (LE), the inverse of [`from_bytes`](Self::from_bytes).
-    #[inline]
-    #[must_use]
-    pub const fn to_bytes(self) -> [u8; 16] {
-        let c = self.facet_classid.to_le_bytes();
-        let t = &self.tiers;
-        [
-            c[0], c[1], c[2], c[3], t[0].lo, t[0].hi, t[1].lo, t[1].hi, t[2].lo, t[2].hi, t[3].lo,
-            t[3].hi, t[4].lo, t[4].hi, t[5].lo, t[5].hi,
-        ]
-    }
-
-    /// The `hi`-byte chain, coarse→fine — one hierarchy (part_of / group / column /
-    /// centroid-hi, per the consumer).
-    #[inline]
-    #[must_use]
-    pub const fn hi_chain(self) -> [u8; 6] {
-        let t = &self.tiers;
-        [t[0].hi, t[1].hi, t[2].hi, t[3].hi, t[4].hi, t[5].hi]
-    }
-
-    /// The `lo`-byte chain, coarse→fine — the orthogonal hierarchy (is_a / member /
-    /// row / centroid-lo, per the consumer).
-    #[inline]
-    #[must_use]
-    pub const fn lo_chain(self) -> [u8; 6] {
-        let t = &self.tiers;
-        [t[0].lo, t[1].lo, t[2].lo, t[3].lo, t[4].lo, t[5].lo]
-    }
-
-    /// Shared coarse→fine prefix length (0..=6) of two 6-byte chains.
-    const fn shared(a: [u8; 6], b: [u8; 6]) -> u8 {
-        let mut n = 0u8;
-        while (n as usize) < 6 && a[n as usize] == b[n as usize] {
-            n += 1;
-        }
-        n
-    }
-
-    /// `hi`-chain distance: `6 − shared hi-prefix` — locality along the `hi`
-    /// hierarchy (e.g. `part_of` place), orthogonal to [`lo_distance`](Self::lo_distance).
-    #[inline]
-    #[must_use]
-    pub const fn hi_distance(self, other: Self) -> u8 {
-        6 - Self::shared(self.hi_chain(), other.hi_chain())
-    }
-
-    /// `lo`-chain distance: `6 − shared lo-prefix` — locality along the orthogonal
-    /// `lo` hierarchy (e.g. `is_a` type), on the SAME facet.
-    #[inline]
-    #[must_use]
-    pub const fn lo_distance(self, other: Self) -> u8 {
-        6 - Self::shared(self.lo_chain(), other.lo_chain())
-    }
-}
+// `FacetTier` / `FacetCascade` live in the dedicated [`crate::facet`] module — a
+// reusable, content-blind 8:8 substrate (a *reading* over borrowed bytes, NOT part of
+// the locked node layout). Re-exported here for the historical `canonical_node` path.
+pub use crate::facet::{FacetCascade, FacetTier};
 
 impl NodeRow {
     /// Read the [`KanbanTenant`] phase cursor from the [`ValueTenant::Kanban`]
@@ -1451,62 +1278,6 @@ impl NodeRow {
 mod tests {
     use super::*;
 
-    #[test]
-    fn facet_cascade_is_16_bytes_always_8_8_consumer_neutral() {
-        // 16-byte facet: facet_classid(4) | 6×(8:8)=12 (harvest §5.1). The bytes are
-        // content-blind — this test reads them as raw hi:lo, no part_of/is_a baked in.
-        assert_eq!(core::mem::size_of::<FacetCascade>(), 16);
-        assert_eq!(core::mem::size_of::<FacetTier>(), 2);
-
-        let bytes: [u8; 16] = [
-            0xEF, 0xBE, 0xAD, 0xDE, // facet_classid = 0xDEAD_BEEF (LE)
-            0x01, 0xAB, // tier0: lo=0x01, hi=0xAB
-            0x02, 0xCD, // tier1
-            0x03, 0xEF, // tier2
-            0x04, 0x12, // tier3
-            0x05, 0x34, // tier4
-            0x06, 0x56, // tier5
-        ];
-        let f = FacetCascade::from_bytes(&bytes);
-        assert_eq!(f.facet_classid, 0xDEAD_BEEF);
-        // round-trip is exact (the substrate stores the 8:8 verbatim).
-        assert_eq!(f.to_bytes(), bytes);
-
-        // The two orthogonal chains: hi (one hierarchy) and lo (the other).
-        assert_eq!(f.hi_chain(), [0xAB, 0xCD, 0xEF, 0x12, 0x34, 0x56]);
-        assert_eq!(f.lo_chain(), [0x01, 0x02, 0x03, 0x04, 0x05, 0x06]);
-
-        // Consumer projections of ONE tier's 8:8 — concatenated u16 and Morton tile.
-        let t0 = f.tiers[0]; // hi=0xAB, lo=0x01
-        assert_eq!(t0.as_u16(), 0xAB01, "concatenated 16-bit reading");
-        // Morton: lo on even bits, hi on odd. Every nibble = 2 bits hi × 2 bits lo.
-        assert_eq!(t0.morton(), {
-            let spread = |x: u8| {
-                let mut v = x as u16;
-                v = (v | (v << 4)) & 0x0F0F;
-                v = (v | (v << 2)) & 0x3333;
-                v = (v | (v << 1)) & 0x5555;
-                v
-            };
-            spread(0x01) | (spread(0xAB) << 1)
-        });
-        // Morton de-interleaves back to the two bytes (lossless amortization).
-        assert_eq!(t0.morton() & 0x5555, FacetTier { lo: 0x01, hi: 0 }.morton());
-
-        // Distances are prefix metrics, orthogonal: same hi-chain, different lo-chain
-        // ⇒ hi_distance 0, lo_distance > 0 (and vice-versa).
-        let same_hi_diff_lo = FacetCascade::from_bytes(&{
-            let mut b = bytes;
-            b[4] = 0x99; // tier0 lo (the fine end of the lo chain... actually coarsest)
-            b
-        });
-        assert_eq!(f.hi_distance(same_hi_diff_lo), 0, "hi chain unchanged");
-        assert!(
-            f.lo_distance(same_hi_diff_lo) > 0,
-            "lo chain diverges at tier0"
-        );
-    }
-
     #[test]
     fn kanban_tenant_round_trip_and_field_isolation() {
         let mut row = NodeRow {
diff --git a/crates/lance-graph-contract/src/facet.rs b/crates/lance-graph-contract/src/facet.rs
new file mode 100644
index 00000000..8d6eb441
--- /dev/null
+++ b/crates/lance-graph-contract/src/facet.rs
@@ -0,0 +1,390 @@
+//! `facet` — the content-blind **8:8 facet** substrate (a reusable 16-byte primitive).
+//!
+//! A [`FacetCascade`] is `facet_classid(4) | 6×(8:8) = 16 B` — one 128-bit register.
+//! The substrate is **ALWAYS 8:8** (each tier is two opaque bytes `hi:lo`); only the
+//! CONSUMER projects meaning onto the bytes — `(part_of:is_a)`, a `256:256` palette
+//! (CAM-PQ) centroid pair, `(group:member)`, `(mixin:identity)`, `(column:row)`, a
+//! `(Y:Z)` coordinate, or a concatenated `u16`. The producer bakes in nothing
+//! (AGI-as-glove: the SoA is content-blind, the reader interprets).
+//!
+//! It carries **no value-slab offset** — it is a *reading* over a borrowed `[u8; 16]`,
+//! so it never touches the operator-LOCKED 480-byte node layout. The
+//! `classid → ClassView` wiring that picks which 16 value bytes it reads is a separate
+//! step (`soa-value-tenant-migration-v1-harvest.md` §5.1, §5–§6).
+//!
+//! ## One register, four lanes
+//!
+//! The same 16 bytes are addressable at four granularities, each a single SIMD op —
+//! pick the lens by the operation (measured; the redout is granularity-free):
+//!
+//! | lens | unit | accessor | hardware op |
+//! |---|---|---|---|
+//! | **row** | 4× `u32` | [`FacetCascade::rows`] / [`row_match_mask`](FacetCascade::row_match_mask) | `vpcmpeqd` + `vmovmskps` |
+//! | **tile** | 8× `u16` (the 8:8) | [`tiers`](FacetCascade::tiers) / [`hi_chain`](FacetCascade::hi_chain) | `vpcmpeqw` / `pshufb` |
+//! | **prefix** | bit (LCP) | [`prefix_distance`](FacetCascade::prefix_distance) | `vpxor` + `tzcnt` (granularity-free) |
+//! | **nibble** | 32× `[4]` (Morton) | [`FacetTier::morton`] | GFNI `vgf2p8affineqb` (AVX-512) |
+//!
+//! Row 0 is the `facet_classid` (`{domain}{schema}`); rows 1–3 are the 6 cascade
+//! tiers paired coarse→fine (`HEEL:HIP` / `TWIG:LEAF` / `family:identity`). The layout
+//! is transpose-native: 4 facets → `_MM_TRANSPOSE4` → SoA columns for a batch sweep.
+
+/// One **8:8 tile** of a [`FacetCascade`] — ALWAYS exactly two bytes, `hi` and `lo`.
+/// The substrate is **content-blind**: only the CONSUMER (the
+/// [`FacetCascade::facet_classid`]'s ClassView) decides what the 8:8 *means*
+/// (`(part_of:is_a)`, a `256:256` palette centroid, `(group:member)`, `(column:row)`,
+/// a concatenated `u16`, …). `hi` is the coarse-side byte, `lo` the fine-side byte.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Default)]
+#[repr(C)]
+pub struct FacetTier {
+    /// Low byte of the LE 8:8 tile (is_a / member / row / centroid-lo / …).
+    pub lo: u8,
+    /// High byte of the LE 8:8 tile (part_of / group / column / centroid-hi / …).
+    pub hi: u8,
+}
+
+impl FacetTier {
+    /// The two bytes as the LE `u16 = (hi << 8) | lo` — the "consumer reads the 8:8
+    /// as one concatenated 16-bit value" projection.
+    #[inline]
+    #[must_use]
+    pub const fn as_u16(self) -> u16 {
+        ((self.hi as u16) << 8) | self.lo as u16
+    }
+
+    /// The `hi:lo` pair **Morton-interleaved** into a `u16` Z-order code (`lo` on
+    /// even bits, `hi` on odd) — the amortization benefit of the always-8:8
+    /// substrate: every nibble of the result is a **2 bit × 2 bit Morton tile**, so a
+    /// nibble prefix is a quad-tree quadrant in BOTH bytes at once (`256 = 4⁴`
+    /// hierarchical ancestry). Whatever the consumer decides the 8:8 means, it ALWAYS
+    /// amortizes to this one Morton tile cascade — uniform prefix routing.
+    #[inline]
+    #[must_use]
+    pub const fn morton(self) -> u16 {
+        Self::spread8(self.lo) | (Self::spread8(self.hi) << 1)
+    }
+
+    /// Spread a byte's 8 bits to the even positions `0,2,…,14` of a `u16` (the Morton
+    /// building block).
+    const fn spread8(x: u8) -> u16 {
+        let mut v = x as u16; // ........ abcdefgh
+        v = (v | (v << 4)) & 0x0F0F; // ....abcd ....efgh
+        v = (v | (v << 2)) & 0x3333; // ..ab..cd ..ef..gh
+        v = (v | (v << 1)) & 0x5555; // .a.b.c.d .e.f.g.h
+        v
+    }
+}
+
+/// The **FacetCascade** — a content-blind 16-byte facet: `facet_classid(4) | 6×(8:8)`.
+///
+/// **ALWAYS 8:8.** Six tiers, each two opaque bytes (`hi:lo`); the `facet_classid`'s
+/// ClassView decides the interpretation (see [`FacetTier`]). Both bytes of every tier
+/// are carried (lossless): the `hi` chain prefix-routes one hierarchy, the `lo` chain
+/// the orthogonal one. The full 6-tier facet does NOT fit the 64-bit key `NiblePath`
+/// (which carries only the 4-tier HHTL routing prefix,
+/// [`crate::hhtl::NiblePath::from_guid_prefix_v3`]) — the complete address lives here.
+///
+/// A *reading* over a borrowed `[u8; 16]`: NO value-slab offset, does not touch the
+/// LOCKED 480-byte layout. `#[repr(C, align(16))]` makes it a 128-bit register value
+/// byte-identical to `[u8; 16]`, so decode is a **reinterpret no-op** — see
+/// [`ref_from_bytes`](Self::ref_from_bytes) / [`as_bytes`](Self::as_bytes). The
+/// compiler reads fields/lanes straight from the backing store; nothing materializes.
+/// See the module docs for the one-register / four-lane design.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Default)]
+#[repr(C, align(16))]
+pub struct FacetCascade {
+    /// The facet's own class id — `{domain}{schema}`, row 0; which ClassView
+    /// interprets the 6 tiers' 8:8.
+    pub facet_classid: u32,
+    /// 6 tiers coarse→fine: `HEEL·HIP·TWIG·LEAF·family·identity`, each an 8:8 tile.
+    pub tiers: [FacetTier; 6],
+}
+
+const _: () = assert!(core::mem::size_of::<FacetTier>() == 2, "one 8:8 tile");
+const _: () = assert!(
+    core::mem::size_of::<FacetCascade>() == 16,
+    "facet_classid(4) | 6×(8:8)=12 = 16B (harvest §5.1)"
+);
+
+impl FacetCascade {
+    /// Decode from the 16 facet bytes (LE): `facet_classid` in `[0..4)`, then 6 tiers,
+    /// each an LE `u16 = (hi << 8) | lo` — on the wire `[lo, hi]` (the `converge.rs`
+    /// `tier(hi, lo)` byte order, matching the key tiers).
+    #[inline]
+    #[must_use]
+    pub const fn from_bytes(b: &[u8; 16]) -> Self {
+        FacetCascade {
+            facet_classid: u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
+            tiers: [
+                FacetTier { lo: b[4], hi: b[5] },
+                FacetTier { lo: b[6], hi: b[7] },
+                FacetTier { lo: b[8], hi: b[9] },
+                FacetTier {
+                    lo: b[10],
+                    hi: b[11],
+                },
+                FacetTier {
+                    lo: b[12],
+                    hi: b[13],
+                },
+                FacetTier {
+                    lo: b[14],
+                    hi: b[15],
+                },
+            ],
+        }
+    }
+
+    /// Encode to the 16 facet bytes (LE), the inverse of [`from_bytes`](Self::from_bytes).
+    #[inline]
+    #[must_use]
+    pub const fn to_bytes(self) -> [u8; 16] {
+        let c = self.facet_classid.to_le_bytes();
+        let t = &self.tiers;
+        [
+            c[0], c[1], c[2], c[3], t[0].lo, t[0].hi, t[1].lo, t[1].hi, t[2].lo, t[2].hi, t[3].lo,
+            t[3].hi, t[4].lo, t[4].hi, t[5].lo, t[5].hi,
+        ]
+    }
+
+    /// The whole facet as one LE `u128` — the single-register view (the `vmovdqu`
+    /// load). Use for the bit-level redout ([`prefix_distance`](Self::prefix_distance))
+    /// and for SIMD batch.
+    #[inline]
+    #[must_use]
+    pub const fn as_u128(self) -> u128 {
+        u128::from_le_bytes(self.to_bytes())
+    }
+
+    /// Build from the single-register LE `u128` — inverse of [`as_u128`](Self::as_u128).
+    #[inline]
+    #[must_use]
+    pub const fn from_u128(v: u128) -> Self {
+        Self::from_bytes(&v.to_le_bytes())
+    }
+
+    /// Zero-cost view of the facet AS its 16 LE bytes — a **reinterpret no-op**
+    /// (`repr(C, align(16))`, byte-identical to `[u8; 16]`); the compiler emits no
+    /// conversion. Companion to [`ref_from_bytes`](Self::ref_from_bytes).
+    #[inline]
+    #[must_use]
+    pub fn as_bytes(&self) -> &[u8; 16] {
+        // SAFETY: FacetCascade is #[repr(C, align(16))], size_of == 16, byte-identical
+        // to [u8; 16] and strictly more-aligned (16 ≥ 1). The bytes ARE the facet's own
+        // backing store — a pure pointer reinterpret, lifetime tied to `&self`.
+        unsafe { &*(self as *const Self).cast::<[u8; 16]>() }
+    }
+
+    /// **Zero-copy borrow** of 16 slab bytes AS a facet — the literal no-op decode: the
+    /// compiler reads fields/lanes straight from the slab, nothing materializes. Returns
+    /// `None` if `b` is not 16-byte aligned (then copy via [`from_bytes`](Self::from_bytes)).
+    /// Mirrors `node_rows_from_le_bytes`'s checked reinterpret.
+    #[inline]
+    #[must_use]
+    pub fn ref_from_bytes(b: &[u8; 16]) -> Option<&Self> {
+        if !(b.as_ptr() as usize).is_multiple_of(core::mem::align_of::<Self>()) {
+            return None;
+        }
+        // SAFETY: 16-byte alignment checked above; FacetCascade is #[repr(C,
+        // align(16))], size_of == 16 == the array, byte-identical layout — a pure
+        // reinterpret of the borrow, lifetime tied to `b`.
+        Some(unsafe { &*(b.as_ptr().cast::<Self>()) })
+    }
+
+    /// The 4 **dword rows** (the 4×4 lane): `[facet_classid, HEEL:HIP, TWIG:LEAF,
+    /// family:identity]`. `rows()[0] == facet_classid`. Compares as `vpcmpeqd`.
+    #[inline]
+    #[must_use]
+    pub const fn rows(self) -> [u32; 4] {
+        let b = self.to_bytes();
+        [
+            u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
+            u32::from_le_bytes([b[4], b[5], b[6], b[7]]),
+            u32::from_le_bytes([b[8], b[9], b[10], b[11]]),
+            u32::from_le_bytes([b[12], b[13], b[14], b[15]]),
+        ]
+    }
+
+    /// The `hi`-byte chain, coarse→fine — one hierarchy (part_of / group / column /
+    /// centroid-hi, per the consumer).
+    #[inline]
+    #[must_use]
+    pub const fn hi_chain(self) -> [u8; 6] {
+        let t = &self.tiers;
+        [t[0].hi, t[1].hi, t[2].hi, t[3].hi, t[4].hi, t[5].hi]
+    }
+
+    /// The `lo`-byte chain, coarse→fine — the orthogonal hierarchy (is_a / member /
+    /// row / centroid-lo, per the consumer).
+    #[inline]
+    #[must_use]
+    pub const fn lo_chain(self) -> [u8; 6] {
+        let t = &self.tiers;
+        [t[0].lo, t[1].lo, t[2].lo, t[3].lo, t[4].lo, t[5].lo]
+    }
+
+    /// Shared coarse→fine prefix length (0..=6) of two 6-byte chains.
+    const fn shared6(a: [u8; 6], b: [u8; 6]) -> u8 {
+        let mut n = 0u8;
+        while (n as usize) < 6 && a[n as usize] == b[n as usize] {
+            n += 1;
+        }
+        n
+    }
+
+    /// `hi`-chain distance: `6 − shared hi-prefix` — locality along the `hi` hierarchy,
+    /// orthogonal to [`lo_distance`](Self::lo_distance).
+    #[inline]
+    #[must_use]
+    pub const fn hi_distance(self, other: Self) -> u8 {
+        6 - Self::shared6(self.hi_chain(), other.hi_chain())
+    }
+
+    /// `lo`-chain distance: `6 − shared lo-prefix` — locality along the orthogonal `lo`
+    /// hierarchy, on the SAME facet.
+    #[inline]
+    #[must_use]
+    pub const fn lo_distance(self, other: Self) -> u8 {
+        6 - Self::shared6(self.lo_chain(), other.lo_chain())
+    }
+
+    /// Number of fully-matching low **tiles** (0..=8, classid tiles 0–1 first, then the
+    /// 6 cascade tiers) — the granularity-free LCP redout: `(xor).trailing_zeros() / 16`.
+    /// `8` ⇒ identical. The whole-facet prefix over class + cascade in one `vpxor`+`tzcnt`.
+    #[inline]
+    #[must_use]
+    pub const fn shared_prefix_tiles(self, other: Self) -> u8 {
+        let x = self.as_u128() ^ other.as_u128();
+        if x == 0 {
+            8
+        } else {
+            (x.trailing_zeros() / 16) as u8
+        }
+    }
+
+    /// `8 − shared_prefix_tiles` — the coarse→fine tile distance over the whole facet
+    /// (class first, then the cascade). `0` ⇒ identical.
+    #[inline]
+    #[must_use]
+    pub const fn prefix_distance(self, other: Self) -> u8 {
+        8 - self.shared_prefix_tiles(other)
+    }
+
+    /// 4-bit mask: bit `i` set iff [`row`](Self::rows) `i` matches `other` — the
+    /// dword-lane "which of `{class, HEEL:HIP, TWIG:LEAF, family:identity}` agree"
+    /// (`vpcmpeqd` + `vmovmskps`).
+    #[inline]
+    #[must_use]
+    pub const fn row_match_mask(self, other: Self) -> u8 {
+        let (a, b) = (self.rows(), other.rows());
+        let mut m = 0u8;
+        let mut i = 0;
+        while i < 4 {
+            if a[i] == b[i] {
+                m |= 1 << i;
+            }
+            i += 1;
+        }
+        m
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn sample() -> [u8; 16] {
+        [
+            0xEF, 0xBE, 0xAD, 0xDE, // facet_classid = 0xDEAD_BEEF (LE)
+            0x01, 0xAB, // tier0 lo=01 hi=AB
+            0x02, 0xCD, // tier1
+            0x03, 0xEF, // tier2
+            0x04, 0x12, // tier3
+            0x05, 0x34, // tier4
+            0x06, 0x56, // tier5
+        ]
+    }
+
+    #[test]
+    fn always_8_8_consumer_neutral_roundtrip_and_lanes() {
+        assert_eq!(core::mem::size_of::<FacetCascade>(), 16);
+        assert_eq!(core::mem::size_of::<FacetTier>(), 2);
+
+        let b = sample();
+        let f = FacetCascade::from_bytes(&b);
+        assert_eq!(f.facet_classid, 0xDEAD_BEEF);
+        assert_eq!(f.to_bytes(), b, "round-trip is exact (8:8 stored verbatim)");
+
+        // u128 single-register view round-trips.
+        assert_eq!(FacetCascade::from_u128(f.as_u128()), f);
+        assert_eq!(f.as_u128(), u128::from_le_bytes(b));
+
+        // The two orthogonal chains (content-neutral hi/lo).
+        assert_eq!(f.hi_chain(), [0xAB, 0xCD, 0xEF, 0x12, 0x34, 0x56]);
+        assert_eq!(f.lo_chain(), [0x01, 0x02, 0x03, 0x04, 0x05, 0x06]);
+
+        // The 4 dword rows; row 0 IS the classid.
+        let r = f.rows();
+        assert_eq!(r[0], 0xDEAD_BEEF);
+        assert_eq!(r[0], f.facet_classid);
+        assert_eq!(r[1], u32::from_le_bytes([0x01, 0xAB, 0x02, 0xCD]));
+
+        // Tier projections: concatenated u16 + Morton tile (2bit×2bit).
+        assert_eq!(f.tiers[0].as_u16(), 0xAB01);
+        assert_eq!(
+            f.tiers[0].morton() & 0x5555,
+            FacetTier { lo: 0x01, hi: 0 }.morton()
+        );
+    }
+
+    #[test]
+    fn redout_is_granularity_free_and_orthogonal() {
+        let f = FacetCascade::from_bytes(&sample());
+
+        // identical ⇒ all 8 tiles shared, distance 0.
+        assert_eq!(f.shared_prefix_tiles(f), 8);
+        assert_eq!(f.prefix_distance(f), 0);
+        assert_eq!(f.row_match_mask(f), 0b1111);
+
+        // Differ only in tier0's is_a (lo) byte ⇒ hi chain intact, lo chain diverges
+        // at tier0; the whole-facet prefix breaks after the 2 classid tiles (tile 2).
+        let mut b = sample();
+        b[4] = 0x99; // tier0 lo
+        let g = FacetCascade::from_bytes(&b);
+        assert_eq!(f.hi_distance(g), 0, "hi chain unchanged");
+        assert!(f.lo_distance(g) > 0, "lo chain diverges at tier0");
+        assert_eq!(
+            f.shared_prefix_tiles(g),
+            2,
+            "class (tiles 0-1) shared, tile 2 differs"
+        );
+        // row 1 (HEEL:HIP, holds tier0) differs; rows 0/2/3 match.
+        assert_eq!(f.row_match_mask(g), 0b1101);
+
+        // Differ in the classid (row 0) ⇒ diverge at the very first tile.
+        let h = FacetCascade::from_u128(f.as_u128() ^ 1);
+        assert_eq!(h.shared_prefix_tiles(f), 0);
+        assert_eq!(h.row_match_mask(f), 0b1110);
+    }
+
+    #[test]
+    fn reinterpret_is_a_no_op() {
+        // align(16) ⇒ the facet's own bytes are 16-aligned, so the zero-copy borrow
+        // round-trips: bytes → &FacetCascade reads straight from the same store.
+        let f = FacetCascade::from_bytes(&sample());
+        let bytes: &[u8; 16] = f.as_bytes();
+        assert_eq!(bytes, &f.to_bytes());
+        assert_eq!(
+            bytes.as_ptr() as usize,
+            &f as *const _ as usize,
+            "as_bytes is a pointer reinterpret, no copy"
+        );
+        let g = FacetCascade::ref_from_bytes(bytes).expect("a facet's own bytes are 16-aligned");
+        assert_eq!(*g, f);
+        assert_eq!(
+            g as *const FacetCascade as usize,
+            bytes.as_ptr() as usize,
+            "ref_from_bytes is a borrow reinterpret, no decode"
+        );
+        assert_eq!(core::mem::align_of::<FacetCascade>(), 16);
+    }
+}
diff --git a/crates/lance-graph-contract/src/lib.rs b/crates/lance-graph-contract/src/lib.rs
index bad59293..af3a4c88 100644
--- a/crates/lance-graph-contract/src/lib.rs
+++ b/crates/lance-graph-contract/src/lib.rs
@@ -71,6 +71,7 @@ pub mod episodic_edges;
 pub mod escalation;
 pub mod exploration;
 pub mod external_membrane;
+pub mod facet;
 pub mod faculty;
 pub mod grammar;
 pub mod graph_render;