feat(knowledge): novelty-gated write-back (agents' KB #1) #222
Merged
Turn the KB from a blind write sink into a curated one: when an agent proposes an
article, gate it against the published corpus instead of publishing whatever it
sends. The KB does only the MECHANICAL part (embed → cosine → route); the merge
decision stays with the consuming agent, which is a step smarter than the KB.
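The mechanical step (embed → cosine → route) comes down to comparing the proposal's embedding with the published ones and keeping the best match. A minimal Python sketch of the cosine part follows. It is illustrative only: the real service goes through VectorSearch.nearest on pgvector, and the function names and in-memory corpus here are assumptions, not the project's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector counts as "no similarity"
    return dot / (norm_a * norm_b)


def nearest(query, corpus):
    """Return (article_id, similarity) for the closest vector, or None if corpus is empty.

    `corpus` is a hypothetical dict of article_id -> embedding vector.
    """
    best = None
    for article_id, vector in corpus.items():
        score = cosine_similarity(query, vector)
        if best is None or score > best[1]:
            best = (article_id, score)
    return best
```

The top similarity returned by a lookup like this is the only number the gate needs to choose a route.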
Pipeline (POST /api/v1/articles, default ON; force:true bypasses):
  ProposalGate.assess → embed '{title}\n\n{body}', VectorSearch.nearest, classify:
    >= 0.97  duplicate → create nothing, 200 + point at the canonical article
    >= 0.88  overlap   → create as a DRAFT, stamp metadata.proposal_novelty
                         (score + nearest ids) for a reviewer/consumer to merge
    <  0.88  novel     → create on the requested path
  Response carries gate: {verdict, similarity, nearest} so the agent can act in-session.
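The banding above can be sketched as a pure function. This is a Python illustration rather than the Elixir implementation: the two thresholds and the embedding input format come from this PR, while the function names and the verdict strings are assumptions chosen to mirror the labels used here.

```python
DUPLICATE_THRESHOLD = 0.97  # from the PR; config-tunable in the real service
OVERLAP_THRESHOLD = 0.88    # from the PR; config-tunable in the real service


def embedding_input(title, body):
    """Text that gets embedded: title, a blank line, then the body."""
    return f"{title}\n\n{body}"


def classify(similarity, duplicate=DUPLICATE_THRESHOLD, overlap=OVERLAP_THRESHOLD):
    """Map the top cosine similarity to a band. Both lower bounds are inclusive."""
    if similarity >= duplicate:
        return "duplicate"  # create nothing, point at the canonical article
    if similarity >= overlap:
        return "overlap"    # create as a draft for a reviewer/consumer to merge
    return "novel"          # create on the requested path
```

Keeping the classification free of I/O is what lets the band edges (exactly 0.97, exactly 0.88) be tested without a database or an embedding call.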
Design choices:
- Behaviour + config DI (ProposalAssessorBehaviour / MockProposalAssessor), so
propose_article/3 is unit-testable and the assessor is swappable.
- Resilient: ANY embedding failure (API down, power/internet outage) or a
system-scoped (nil-tenant) proposal falls OPEN (:unknown) → never blocks a write.
- Non-destructive: flags duplicates, never edits/deletes; a vanished canonical
neighbor falls through to create.
- Reuses existing machinery — status :draft + list_drafts review queue, the
embedding client, VectorSearch (raw-vector path, no row needed). No new status.
- Thresholds config-tunable (:knowledge_proposal_{duplicate,overlap}_threshold).
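A rough Python sketch of the fall-open and swappable-assessor ideas from the list above. Everything named here (assess, the embed and top_similarity callables, the proposal dict shape) is hypothetical and stands in for the behaviour-based Elixir code; treating "no neighbor found" as novel is also an assumption.

```python
DUPLICATE, OVERLAP = 0.97, 0.88  # thresholds from the PR


def assess(proposal, embed, top_similarity):
    """Return a verdict string; never raises on an embedding failure.

    `embed` and `top_similarity` are injected so tests can swap in stubs,
    loosely mirroring the behaviour + config DI described above.
    """
    if proposal.get("tenant_id") is None:
        return "unknown"  # system-scoped proposal: the gate is skipped
    try:
        vector = embed(f"{proposal['title']}\n\n{proposal['body']}")
    except Exception:
        return "unknown"  # embedding failure: fall open, do not block the write
    similarity = top_similarity(proposal["tenant_id"], vector)
    if similarity is None:
        return "novel"  # assumption: an empty corpus has nothing to collide with
    if similarity >= DUPLICATE:
        return "duplicate"
    if similarity >= OVERLAP:
        return "overlap"
    return "novel"


def embedding_api_down(_text):
    """Stub embedder that simulates an outage."""
    raise ConnectionError("embedding API unreachable")


proposal = {"tenant_id": "t1", "title": "Title", "body": "Body"}
print(assess(proposal, embedding_api_down, lambda tenant, vec: 0.99))  # unknown
```

A caller would treat "unknown" the same as "novel" when deciding whether to create, which is what keeps an outage from blocking writes.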
Tests (+19): pure classify bands; real-pgvector assess (duplicate/low/novel/
fall-open/system-scope); propose_article routing incl. canonical-vanished + tenant
isolation; controller verdict→HTTP rendering + force bypass. Default stub is :novel
so all existing create tests stay green. Full gate green (3037 tests, dialyzer, credo).
mkreyman added a commit that referenced this pull request Jun 30, 2026
…s (v2.25.0) (#223)
The novelty-gated write-back (#222) is enforced server-side, so MCP clients already get gated behavior. This makes the agent AWARE of it: knowledge_create's description now explains the gate.verdict outcomes (duplicate → nothing created, read data.id; gated_to_draft → created as draft with metadata.proposal_novelty; created → novel), and adds a force:true param to bypass the gate. +2 tests for the force passthrough. Bumps to 2.25.0 (mcp-autopublish publishes on package.json version change).
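On the consuming side, an agent could branch on gate.verdict roughly as follows. This Python sketch assumes the response is a JSON object with top-level "data" and "gate" keys; the verdict strings come from the commit message above, but the exact response shape, the action names, and the handling of a missing verdict are assumptions.

```python
def next_step(response):
    """Decide what the agent does after a create call.

    Returns (action, article_id). The action names are hypothetical labels
    for the agent's own control flow, not part of the API.
    """
    gate = response.get("gate") or {}
    article_id = (response.get("data") or {}).get("id")
    verdict = gate.get("verdict")
    if verdict == "duplicate":
        # Nothing was created; data.id points at the canonical article.
        return ("read_canonical", article_id)
    if verdict == "gated_to_draft":
        # Created as a draft with metadata.proposal_novelty for a later merge.
        return ("review_draft", article_id)
    # "created" (novel), or no gate info at all (assumed: treat as a plain create).
    return ("done", article_id)


reply = {"data": {"id": "art_123"}, "gate": {"verdict": "duplicate", "similarity": 0.98}}
print(next_step(reply))  # ('read_canonical', 'art_123')
```

Branching on the verdict in-session is what lets the agent read or update the canonical article instead of retrying the write.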
What — agents' KB goodie #1: novelty-gated write-back
Turns the KB from a blind write sink into a curated one. When an agent proposes an article (POST /api/v1/articles), it's gated against the published corpus instead of publishing whatever it sends. The KB does only the mechanical part (embed → cosine → route), and the merge decision stays with the consuming agent, which is a step smarter than the KB.
Pipeline (default ON; force: true bypasses)
ProposalGate.assess embeds "{title}\n\n{body}", runs VectorSearch.nearest, and classifies the top similarity:
- >= 0.97 duplicate → create nothing, 200 pointing at the canonical article
- >= 0.88 low_novelty → create as a draft, stamp metadata.proposal_novelty (score + nearest ids) for a reviewer/consumer to merge
- < 0.88 novel → create on the requested path
The response carries gate: {verdict, similarity, nearest} so the agent can act in-session (read/update the canonical instead of duplicating).
Design
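- Behaviour + config DI (ProposalAssessorBehaviour / MockProposalAssessor) → propose_article/3 is unit-testable, the assessor is swappable.
- Resilient: any embedding failure (API down, power/internet outage) or a system-scoped (nil-tenant) proposal falls open (:unknown) → the gate never blocks a write.
- Non-destructive: flags duplicates, never edits/deletes; a vanished canonical neighbor falls through to create.
- Reuses existing machinery: status: :draft + list_drafts review queue, the embedding client, VectorSearch (raw-vector path, no row needed). No new status.
- Thresholds config-tunable (:knowledge_proposal_{duplicate,overlap}_threshold).
Tests (+19)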
Pure classify bands; real-pgvector assess (duplicate / low / novel / fall-open / system-scope); propose_article routing incl. canonical-vanished + tenant isolation; controller verdict→HTTP rendering + force bypass. Default stub is :novel so all 44 existing create tests stay green.
Full gate green locally: format, credo --strict, dialyzer, 3037 tests, 0 failures.
Follow-up (next PR)
MCP knowledge_create tool description/params (surface gate.verdict + force) and reconcile debrief.md (its "always creates drafts" note predates this; the server now enforces the gate). The gate is already live for MCP clients since it's enforced server-side at the API.