Epic 28: loopctl as the agent knowledge platform — move stable knowledge primitives + ingestion into the API (after Epic 27) #179

## Thesis
Turn loopctl from "a store the Python skills talk to" into **the platform that does the knowledge work and serves LLM agents directly**. Today the harvesting/retrieval logic lives in ~6 Python skills, each with its own venv, plus `loopctl_client.py` copied 6×, symlinked dirs, env-key juggling, and cross-machine drift (a whole `claude-config-sync` skill exists only to fight that drift). Consolidating the **stable knowledge primitives** into loopctl (Elixir, tested, deployed once, reachable by any agent/machine/user via API+MCP) removes that mess and turns the value into a sellable product (cf. the engine's own `ContextForge` + `RAGBench` proposals).

**Sequence: AFTER Epic 27 (knowledge scale hardening, #175/#176/#177).** Productizing primitives that aren't yet scale-correct just bakes the bugs in. Harden first, then expose the hardened primitives as the platform API.

## The cut (what moves where) — the load-bearing decision
| Bucket | What | Goes to |
|---|---|---|
| **Primitives** | ingest→chunk→synthesize→dedup→embed→link→store; retrieval; quality/prune; dedup; graph; pairs/novelty/walk | **loopctl (Elixir API)** |
| **Local bridge** | local file-tree discovery (Dropbox/Synology/gdrive/`~/workspace`), OCR (tesseract), Calibre convert, portfolio/`gh` gathering | **thin claude-config skill** (can't disappear — loopctl can't reach the user's disk) |
| **Orchestration** | the idea/creativity engine; (future) content-production | **agent skill over the primitives** — keep prompts/model/tilt out of the deploy cycle |

## Keystone deliverable: a server-side ingestion endpoint
`POST /api/v1/knowledge/ingest` (Oban job + status) accepts `{text | file | url, source_type, format, project_id, extra_tags}` and runs the full pipeline server-side: extract (for server-reachable formats) → chunk → synthesize (configurable LLM; port the careful prompts) → dedup via idempotency tag → embed → link → store. This single endpoint absorbs the *synthesis + publish* half of the `book/document/code/web/youtube-knowledge-extract` skills. Format extraction that needs local resources (OCR of local scans, Calibre) stays in the bridge, which uploads already-extracted text.
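To make the request shape and the "dedup via idempotency tag" step concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `IngestRequest` class, the rule that exactly one of `text`/`file`/`url` is set, and the tag scheme (SHA-256 over project id plus whitespace-normalized chunk text) are hypothetical, not the endpoint's actual contract.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical: the three mutually exclusive content sources from the payload.
PAYLOAD_SOURCES = ("text", "file", "url")


@dataclass
class IngestRequest:
    source_type: str
    format: str
    project_id: str
    text: Optional[str] = None
    file: Optional[str] = None
    url: Optional[str] = None
    extra_tags: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Reject requests that set none, or more than one, of text/file/url."""
        given = [name for name in PAYLOAD_SOURCES if getattr(self, name) is not None]
        if len(given) != 1:
            raise ValueError(f"expected exactly one of {PAYLOAD_SOURCES}, got {given}")


def idempotency_tag(project_id: str, chunk_text: str) -> str:
    """Assumed tag scheme: hash of project id and whitespace-normalized text."""
    normalized = " ".join(chunk_text.split())
    digest = hashlib.sha256(f"{project_id}\n{normalized}".encode("utf-8")).hexdigest()
    return f"idem:{digest[:16]}"


def dedup(project_id: str, chunks: List[str], seen_tags: set) -> list:
    """Return (tag, chunk) pairs not seen before, recording each new tag."""
    fresh = []
    for chunk in chunks:
        tag = idempotency_tag(project_id, chunk)
        if tag in seen_tags:
            continue  # same content already ingested for this project
        seen_tags.add(tag)
        fresh.append((tag, chunk))
    return fresh
```

Scoping the tag by project id is a design choice in this sketch: the same passage ingested into two projects would be stored twice, while a re-run of the same harvest within one project would be a no-op.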

## Inventory (current scripts → target)
- `extract_{book,docs,repo,web,youtube}.py` synthesis+publish → **ingest endpoint**. Their *local* extraction (OCR/Calibre/file-walk) → **bridge skill**.
- `prune_kb.py` (LLM-judged junk removal) → **curation endpoint / Oban job** (server-side quality).
- `cleanup_frontmatter.py`, dedup, idempotency → **fold into ingest + curation**.
- `backup.py` → already covered once Epic 27 #176 (streamed export) lands.
- `idea-synthesizer` (creativity engine) + `portfolio.py` → **stays an agent skill** over loopctl primitives.
- `second-brain-orchestrator` (monitor/harvest_all) → thins to **a watcher that feeds the ingest API**.
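The thinned-out watcher in the last item could look roughly like the sketch below: scan a tree, hash each file, and hand only new or changed files to a submit function. The function names, the suffix filter, and the in-memory `seen` map are assumptions for illustration; `submit` is a hypothetical caller-supplied callable that would post one file to the ingest API and return `True` on success.

```python
import hashlib
from pathlib import Path


def content_hash(path: Path) -> str:
    """SHA-256 of the file's bytes, used to detect new or changed files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def pending_files(root: Path, seen: dict, suffixes=(".txt", ".md")) -> list:
    """List (path, digest) for files whose hash differs from the recorded one."""
    pending = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        digest = content_hash(path)
        if seen.get(str(path)) != digest:
            pending.append((path, digest))
    return pending


def harvest_once(root: Path, seen: dict, submit) -> int:
    """Submit every pending file; record its hash only after submit succeeds."""
    sent = 0
    for path, digest in pending_files(root, seen):
        if submit(path):  # hypothetical: posts the file to the ingest API
            seen[str(path)] = digest
            sent += 1
    return sent
```

Recording the hash only after a successful submit means a failed upload is retried on the next pass instead of being silently dropped.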

## Companion: retrieval-quality measurement (the "dogfood" win)
Build the `RAGBench` dogfood loop as part of the platform: continuously score loopctl's *own* retrieval quality (chunking/ranking/recall) on its own corpus, surfaced as a health metric. This is the measurement layer Epic 27 Theme 1 (observability) implies, and it's the proof-of-concept that sells the platform.
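As one possible shape for that health metric, the sketch below scores a set of labelled queries with recall@k and mean reciprocal rank (MRR). The choice of these two metrics and the `(retrieved_ids, relevant_ids)` case format are assumptions for illustration; the `RAGBench` proposal may define its own scoring.

```python
def recall_at_k(retrieved: list, relevant: list, k: int) -> float:
    """Fraction of the relevant ids that appear in the top k retrieved ids."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(set(retrieved[:k]) & relevant_set) / len(relevant_set)


def reciprocal_rank(retrieved: list, relevant: list) -> float:
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    relevant_set = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant_set:
            return 1.0 / rank
    return 0.0


def score_run(cases: list, k: int = 5) -> dict:
    """Average both metrics over cases of (retrieved_ids, relevant_ids)."""
    if not cases:
        return {"recall_at_k": 0.0, "mrr": 0.0}
    n = len(cases)
    return {
        "recall_at_k": sum(recall_at_k(r, g, k) for r, g in cases) / n,
        "mrr": sum(reciprocal_rank(r, g) for r, g in cases) / n,
    }
```

Running `score_run` on a fixed query set after each chunking or ranking change would give a trend line that can be surfaced next to the other observability metrics.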

## Honest caveats
- **Move the *stable* logic; keep experimenting in scripts.** Idempotency/chunking are stable → port. OCR is stable too, but per the cut it stays in the bridge because it needs the user's disk. The creativity tuning and any new extractor still change weekly → keep as fast-iterating skills until they settle. Don't trade Python's iteration speed for Elixir's deploy ceremony prematurely.
- **The local bridge never fully disappears** — but it shrinks from ~600-line harvesters to a ~50-line uploader.
- **LLM cost/model/keys move server-side** for the ingest synthesis — needs per-tenant config, rate limits, cost attribution.
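For a sense of how small the shrunken bridge could be, here is a sketch of the uploader half using only the standard library. The base URL, the bearer-token header, the `"text"` value for `format`, and the JSON response shape are all assumptions; only the path `/api/v1/knowledge/ingest` and the payload field names come from the endpoint description above.

```python
import json
import urllib.request
from pathlib import Path

INGEST_PATH = "/api/v1/knowledge/ingest"


def build_ingest_request(base_url, api_key, text_path, project_id,
                         source_type, extra_tags=()):
    """Build (but do not send) a POST carrying already-extracted text."""
    body = {
        "text": Path(text_path).read_text(encoding="utf-8"),
        "source_type": source_type,
        "format": "text",  # assumed value for pre-extracted plain text
        "project_id": project_id,
        "extra_tags": list(extra_tags),
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + INGEST_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def upload(request, timeout=30):
    """Send the request and decode the response, assumed here to be JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the bridge testable offline: local discovery, OCR, and Calibre conversion produce a text file, and everything after that is a single HTTP call.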

## Out of scope (separate, later)
- Loop-*execution* architecture refactor (gen_statem/supervision/telemetry) — its own design-first epic, lower priority (loop enforcement matters less as agents improve).
- Content-production pipeline (corpus → drafted posts/newsletters) — an agent skill, not a loopctl endpoint, per the cut above.

