refactor: knowledge system rename, agent decoupling, and feature alignment #199

Description

@dean0x

Knowledge System Refactor

Comprehensive refactor of the three knowledge-related systems to fix naming confusion, decouple the learning/decisions agents, align naming conventions, and improve implementation quality.

Context & Motivation

Today, the word "knowledge" is overloaded across three distinct systems:

  • Feature Knowledge (.features/) — per-feature area documentation (patterns, architecture, gotchas)
  • Decisions & Pitfalls (.memory/knowledge/) — architectural decisions (ADR-NNN) and pitfalls (PF-NNN)
  • Self-Learning (.memory/learning-log.jsonl) — pattern detection from session transcripts

The naming creates confusion: knowledge-persistence is about decisions/pitfalls (not feature knowledge), the Knowledge agent creates feature KBs (not decisions), and KNOWLEDGE_CONTEXT carries the ADR/PF index while FEATURE_KNOWLEDGE carries feature KB content. Additionally, the learning system and decisions system are entangled: one background Sonnet agent detects all 4 observation types (workflow, procedural, decision, pitfall) in a single call, so they cannot be toggled independently.

Decisions Made

Naming

| Current | New Name | System |
|---|---|---|
| KNOWLEDGE_CONTEXT (variable) | DECISIONS_CONTEXT | Decisions |
| FEATURE_KNOWLEDGE (variable) | Keep as-is | Feature Knowledge |
| knowledge-persistence (skill) | decisions-format | Decisions |
| apply-knowledge (skill) | apply-decisions | Decisions |
| knowledge-context.cjs (script) | decisions-index.cjs | Decisions |
| feature-kb (skill) | feature-knowledge | Feature Knowledge |
| apply-feature-kb (skill) | apply-feature-knowledge | Feature Knowledge |
| feature-kb.cjs (script) | feature-knowledge.cjs | Feature Knowledge |
| devflow kb (CLI command) | devflow knowledge | Feature Knowledge |
| --kb / --no-kb (init flags) | --knowledge / --no-knowledge | Feature Knowledge |
| features.kb (manifest field) | features.knowledge | Feature Knowledge |
| session-end-kb-refresh (hook) | session-end-knowledge-refresh | Feature Knowledge |
| background-kb-refresh (hook) | background-knowledge-refresh | Feature Knowledge |
| .features/.kb.lock | .features/.knowledge.lock | Feature Knowledge |
| .features/.kb-last-refresh | .features/.knowledge-last-refresh | Feature Knowledge |
| Knowledge agent | Keep as-is (name is fine) | Feature Knowledge |
| .memory/knowledge/ (directory) | .memory/decisions/ | Decisions |
| .memory/.knowledge.lock | .memory/.decisions.lock | Decisions |
| .memory/.knowledge-usage.json | .memory/.decisions-usage.json | Decisions |

Agent Decoupling

Split the single background Sonnet agent into 2 independent agents:

| Agent | Detects | Input Channel | Output | Toggle |
|---|---|---|---|---|
| Learning agent | workflow + procedural | USER_SIGNALS (clean user text) | .claude/commands/self-learning/, .claude/skills/self-learning:*/ | devflow learn --enable/--disable |
| Decisions agent | decision + pitfall | DIALOG_PAIRS (assistant→user pairs) | .memory/decisions/decisions.md, .memory/decisions/pitfalls.md | devflow decisions --enable/--disable |

Each agent gets its own:

  • Custom-tailored system prompt (focused on its 2 types only)
  • CLI-mediated background runner (see Background Agent Architecture below)
  • Session-end trigger hook (thin ~5-line shell trigger)
  • Batch counter and lock file
  • Toggle mechanism (hook presence + sentinel file + manifest field)

Background Agent Architecture — Align to Pattern A (CLI-Mediated)

Today there are two patterns for background agents in the codebase:

| Aspect | Pattern A: Feature Knowledge CLI | Pattern B: Learning/Memory Hooks |
|---|---|---|
| Trigger | CLI command (devflow kb create) | Hook shell script directly |
| Prompt construction | TypeScript, structured | Shell script, 400+ lines inline |
| Model config | Agent YAML definition | Hardcoded or loaded from JSON files |
| Tool allowlist | Declared in code | Hardcoded in shell strings |
| Output handling | Expects agent to write files | Shell parses JSON / checks mtimes |
| Error recovery | CLI reports to user | .processing file, watchdog timers |

Decision: All background agents must follow Pattern A (CLI-mediated).

The new background-decisions and updated background-learning scripts will NOT be 400+ line shell scripts with inline prompts. Instead:

  1. Hooks become thin triggers (~5-10 lines): guard clauses, batch check, then nohup devflow learn --run-background & disown or nohup devflow decisions --run-background & disown
  2. Prompt construction moves to TypeScript: src/cli/commands/learn.ts and src/cli/commands/decisions.ts gain a --run-background subcommand that handles prompt building, config loading, model resolution, claude -p invocation, and output processing
  3. Node.js helpers stay: json-helper.cjs, transcript-filter.cjs, staleness.cjs etc. continue doing the real work — the TypeScript CLI calls them just like the shell did, but with proper error handling
  4. What stays in the hooks: feedback loop guards (DEVFLOW_BG_LEARNER=1), raw turn capture (prompt-capture-memory), throttling (120s checks), and the nohup ... & disown detach — but detaching a devflow CLI command instead of a raw bash script
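As a rough illustration of step 2, the argument list for the claude -p invocation could be assembled in one typed function instead of being spliced together in shell strings. This is a minimal sketch, not the actual implementation: RunnerConfig, buildClaudeArgs, and the argument order are assumptions made for the example. Only the -p, --model, and --json-schema flags come from this issue.

```typescript
// Hypothetical config shape; the issue does not specify field names.
interface RunnerConfig {
  model: string;        // e.g. "sonnet"
  jsonSchema?: object;  // forwarded via --json-schema when present
}

// Build the argv for `claude -p` in one place so flags are typed and testable.
// Assumption: the prompt is passed as the final positional argument.
function buildClaudeArgs(prompt: string, cfg: RunnerConfig): string[] {
  const args: string[] = ["-p", "--model", cfg.model];
  if (cfg.jsonSchema !== undefined) {
    args.push("--json-schema", JSON.stringify(cfg.jsonSchema));
  }
  args.push(prompt);
  return args;
}

// Example: a learning run needs no schema, a decisions run passes one.
const learnArgs = buildClaudeArgs("Detect workflow patterns", { model: "sonnet" });
const decisionArgs = buildClaudeArgs("Detect decisions", {
  model: "sonnet",
  jsonSchema: { type: "object" },
});
console.log(learnArgs.length, decisionArgs.length);
```

Because the function returns a plain array, it can be unit-tested without spawning a process, and the same array can later be handed to execFile.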

Benefits:

  • Testable, type-safe prompt construction (no more shell-embedded prompts)
  • Centralized config loading (CLI already has this infrastructure)
  • Tool allowlists and flags in one place
  • Proper JSON handling instead of sed + shell conditionals
  • Same crash recovery and throttling patterns as the feature knowledge CLI

Structured Output

The decisions agent will use claude -p --json-schema instead of the current approach where Sonnet outputs a semicolon-delimited details string that gets regex-parsed. With --json-schema, observation fields (area, issue, impact, resolution, context, decision, rationale) become proper JSON keys validated by the Claude API. This eliminates the fragile regex parsing in render-ready.
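To make the contrast with regex parsing concrete, here is a minimal sketch of what consuming the structured output might look like. The seven field names are taken from the paragraph above; everything else (the schema layout, treating every field as an optional string, and the parseObservation helper) is an assumption for illustration, since the issue does not say which fields are required for which observation type.

```typescript
// Field names from the issue; all are modeled as optional strings (assumption).
const FIELDS = [
  "area", "issue", "impact", "resolution", "context", "decision", "rationale",
] as const;
type Field = (typeof FIELDS)[number];
type Observation = Partial<Record<Field, string>>;

// Hypothetical schema object that could be serialized and passed to --json-schema.
const properties: Record<string, { type: "string" }> = {};
for (const f of FIELDS) properties[f] = { type: "string" };
const observationSchema = { type: "object", properties, additionalProperties: false };

// Parse one observation; returns null instead of throwing on malformed input.
function parseObservation(raw: string): Observation | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) return null;
  const record = value as Record<string, unknown>;
  const out: Observation = {};
  for (const field of FIELDS) {
    const v = record[field];
    if (v === undefined) continue;
    if (typeof v !== "string") return null; // wrong type: reject the whole object
    out[field] = v;
  }
  return out;
}
```

With this shape, render-ready can read obs.rationale directly rather than splitting a semicolon-delimited string.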

Threshold Changes

| Type | Current | New |
|---|---|---|
| Decision | 2 obs required, 0.65 confidence | 1 obs, 0.65 confidence, 0 spread |
| Pitfall | 2 obs required, 0.65 confidence | 1 obs, 0.65 confidence, 0 spread |
| Workflow | 3 obs required, 0.60 confidence | No change |
| Procedural | 4 obs required, 0.70 confidence | No change |

Rationale: Decisions are typically made once with explicit rationale. Pitfalls are corrections of assistant behavior — strong signal on first occurrence. Both types already have strict quality gates (quality_ok requires rationale anchors for decisions, both assistant action + user rejection for pitfalls).
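The new thresholds can be expressed as a small lookup table. The sketch below is illustrative only: the names THRESHOLDS and meetsThreshold are hypothetical, the spread requirement is left out because the table does not give its current values for workflow and procedural, and the quality_ok gate is not modeled.

```typescript
type ObservationType = "decision" | "pitfall" | "workflow" | "procedural";

interface Threshold {
  minObservations: number;
  minConfidence: number;
}

// Values from the "New" column of the threshold table.
const THRESHOLDS: Record<ObservationType, Threshold> = {
  decision: { minObservations: 1, minConfidence: 0.65 },
  pitfall: { minObservations: 1, minConfidence: 0.65 },
  workflow: { minObservations: 3, minConfidence: 0.6 },
  procedural: { minObservations: 4, minConfidence: 0.7 },
};

// True when an observation has been seen often enough, with enough confidence.
function meetsThreshold(
  type: ObservationType,
  observations: number,
  confidence: number,
): boolean {
  const t = THRESHOLDS[type];
  return observations >= t.minObservations && confidence >= t.minConfidence;
}
```

Under these values a decision seen once at 0.65 confidence qualifies, while a workflow seen twice does not, whatever its confidence.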

Toggleability

Each system gets an independent toggle, following the feature knowledge pattern:

  • Hook presence in settings.json
  • Sentinel file (.disabled) for runtime gating
  • Manifest field for tracking user choice
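One way to picture how the three signals could combine is sketched below. The issue lists the signals but does not state the exact rule for combining them, so the conjunction in isEnabled is an assumption, as are the ToggleState and readSentinel names.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical snapshot of the three toggle signals for one system.
interface ToggleState {
  hookInstalled: boolean;   // hook entry present in settings.json
  manifestEnabled: boolean; // user's recorded choice in the manifest
  sentinelPresent: boolean; // `.disabled` file exists in the system's state dir
}

// Runtime gate: does a `.disabled` sentinel exist in this state directory?
function readSentinel(stateDir: string): boolean {
  return fs.existsSync(path.join(stateDir, ".disabled"));
}

// Assumed rule: the system runs only when the hook is installed, the manifest
// says enabled, and no sentinel blocks it.
function isEnabled(state: ToggleState): boolean {
  return state.hookInstalled && state.manifestEnabled && !state.sentinelPresent;
}
```

Keeping the sentinel as a separate runtime check means a system can be paused by creating one file, without editing settings.json.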

Init flow becomes:

Working memory:        [Y/n]
Self-learning:         [Y/n]   # workflows + procedural
Decision tracking:     [Y/n]   # decisions + pitfalls
Feature knowledge:     [Y/n]   # per-area KBs

Implementation Phases

Phase 1: Rename — Decisions System (PR 1)

Scope: Rename all ADR/PF-related identifiers from "knowledge" to "decisions".

Files (~35):

  • shared/skills/knowledge-persistence/ → shared/skills/decisions-format/ (rename directory + update SKILL.md content)
  • shared/skills/apply-knowledge/ → shared/skills/apply-decisions/ (rename directory + update SKILL.md content)
  • scripts/hooks/lib/knowledge-context.cjs → scripts/hooks/lib/decisions-index.cjs (rename + update internal references)
  • scripts/hooks/json-helper.cjs — update all references to .knowledge.lock, knowledge/decisions.md, knowledge/pitfalls.md, knowledge-append operation, lock dir names
  • 6 orchestration skills (plan:orch, implement:orch, review:orch, resolve:orch, explore:orch, debug:orch) — KNOWLEDGE_CONTEXT → DECISIONS_CONTEXT, skill references
  • 5 agent definitions (reviewer, resolver, designer, evaluator, scrutinizer) — skill references devflow:apply-knowledge → devflow:apply-decisions
  • All plugin plugin.json files that reference old skill names
  • CLAUDE.md — update all references
  • Tests: tests/resolve/knowledge-citation.test.ts, tests/learning/render-pitfall.test.ts, tests/learning/render-decision.test.ts, tests/learning/knowledge-usage-scan.test.ts, tests/learning/reconcile.test.ts, etc.
  • Migration: Add rename-knowledge-to-decisions per-project migration that moves .memory/knowledge/ → .memory/decisions/, .memory/.knowledge.lock → .memory/.decisions.lock, and .memory/.knowledge-usage.json → .memory/.decisions-usage.json
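A minimal sketch of what the rename-knowledge-to-decisions migration might do is shown below. The three path pairs come from the migration bullet; the function name, the idempotency rule (skip when the source is missing or the target already exists), and the return value are assumptions for illustration, and the real migration framework may look quite different.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Old path -> new path, relative to the project root (from the issue).
const RENAMES: ReadonlyArray<readonly [string, string]> = [
  [".memory/knowledge", ".memory/decisions"],
  [".memory/.knowledge.lock", ".memory/.decisions.lock"],
  [".memory/.knowledge-usage.json", ".memory/.decisions-usage.json"],
];

// Moves each legacy path that exists; returns the new paths that were created.
// Safe to run twice: already-migrated entries are skipped.
function renameKnowledgeToDecisions(projectRoot: string): string[] {
  const moved: string[] = [];
  for (const [from, to] of RENAMES) {
    const src = path.join(projectRoot, from);
    const dst = path.join(projectRoot, to);
    if (!fs.existsSync(src) || fs.existsSync(dst)) continue;
    fs.renameSync(src, dst);
    moved.push(to);
  }
  return moved;
}
```

Running it a second time returns an empty list, which is the property a per-project migration invoked from devflow init would likely want.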

Verification: All 1022+ tests pass. npm run build succeeds. devflow init works.

Phase 2: Rename — Feature Knowledge System (PR 2)

Scope: Rename all "KB" / "feature-kb" references to "feature-knowledge" / "knowledge".

Files (~30):

  • shared/skills/feature-kb/ → shared/skills/feature-knowledge/ (rename directory + update SKILL.md)
  • shared/skills/apply-feature-kb/ → shared/skills/apply-feature-knowledge/ (rename directory + update SKILL.md)
  • scripts/hooks/lib/feature-kb.cjs → scripts/hooks/lib/feature-knowledge.cjs
  • scripts/hooks/session-end-kb-refresh → scripts/hooks/session-end-knowledge-refresh
  • scripts/hooks/background-kb-refresh → scripts/hooks/background-knowledge-refresh
  • src/cli/commands/kb/ → src/cli/commands/knowledge/ (rename directory + update all files)
  • src/cli/commands/init.ts: --kb → --knowledge, kbEnabled → knowledgeEnabled, display text
  • src/cli/utils/manifest.ts: features.kb → features.knowledge
  • src/cli/utils/post-install.ts: .features/.kb.lock → .features/.knowledge.lock, .features/.kb-last-refresh → .features/.knowledge-last-refresh
  • 6 orchestration skills — devflow:feature-kb → devflow:feature-knowledge, devflow:apply-feature-kb → devflow:apply-feature-knowledge
  • 5 agent definitions — skill references update
  • Plugin plugin.json files
  • Knowledge agent (shared/agents/knowledge.md) — skill reference update (agent name stays)
  • CLAUDE.md — update all references
  • Tests: tests/feature-kb/ directory rename + content updates
  • Migration: Add rename-kb-to-knowledge per-project migration that renames .features/.kb.lock → .features/.knowledge.lock and .features/.kb-last-refresh → .features/.knowledge-last-refresh

Verification: All tests pass. npm run build succeeds. devflow init and devflow knowledge work.

Phase 3: Decouple Decisions Agent + Migrate to Pattern A (PR 3)

Scope: Split the single background learner into 2 independent agents, both following the CLI-mediated pattern (Pattern A).

This is the meaty architecture change. The current background-learning shell script (400+ lines, inline Sonnet prompt, raw JSON parsing) gets replaced by two CLI-mediated background runners.

Changes:

3a. New CLI command: devflow decisions

  • devflow decisions --enable/--disable/--status — toggle (following the feature knowledge pattern: hook + sentinel + manifest)
  • devflow decisions --run-background — the actual background runner, invoked by the thin session-end hook
  • TypeScript handles: config loading, transcript extraction (calls transcript-filter.cjs), prompt construction, claude -p --model sonnet --json-schema invocation, response validation, calls json-helper.cjs process-observations and render-ready
  • Prompt: focused on decision+pitfall detection only, using DIALOG_PAIRS channel
  • --json-schema enforces structured output: area, issue, impact, resolution, context, decision, rationale as proper JSON keys

3b. Migrate background-learning to CLI-mediated

  • Add devflow learn --run-background subcommand
  • Move prompt construction, config loading, Sonnet invocation, and post-processing pipeline from scripts/hooks/background-learning (shell) to src/cli/commands/learn.ts (TypeScript)
  • Prompt: focused on workflow+procedural detection only, using USER_SIGNALS channel
  • Keep existing Node.js helpers (json-helper.cjs, transcript-filter.cjs, staleness.cjs) — CLI calls them via execFileSync/execFile

3c. Thin hooks

  • session-end-learning becomes ~10 lines: guard clauses, batch counting, then nohup devflow learn --run-background & disown
  • New session-end-decisions: same structure, nohup devflow decisions --run-background & disown
  • All prompt construction, config loading, JSON validation, timeout management moves to the CLI commands

3d. Update json-helper.cjs

  • render-ready for decisions uses structured JSON fields (from --json-schema) instead of regex-parsing details string
  • process-observations updated for new threshold (1 obs for decision/pitfall)
  • Separate manifest files or namespaced entries per agent

3e. Update devflow init

  • Add --decisions/--no-decisions flag
  • Add interactive prompt for decision tracking
  • Install/remove session-end-decisions hook based on user choice

3f. What stays in hooks vs what moves to CLI

Stays in hooks (thin triggers):

  • Feedback loop guards (DEVFLOW_BG_LEARNER=1 / DEVFLOW_BG_UPDATER=1 env vars)
  • Session depth check (min 3 user turns)
  • Reinforcement of loaded self-learning artifacts (local grep, no LLM)
  • Batch counting and batch-full check
  • Daily cap enforcement
  • nohup devflow <cmd> --run-background & disown detach

Moves to CLI (devflow learn --run-background / devflow decisions --run-background):

  • Config loading (model, throttle, caps, debug)
  • Lock acquisition and stale lock recovery
  • Transcript extraction (calls transcript-filter.cjs)
  • Prompt construction (TypeScript string templates, not shell-embedded)
  • claude -p invocation with timeout watchdog
  • Response validation and JSON parsing
  • process-observations call
  • render-ready call
  • Staleness check call
  • Log rotation
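Of the responsibilities moving into the CLI, lock acquisition with stale-lock recovery is the easiest to get subtly wrong, so a sketch may help. This assumes the lock is a directory created with mkdir (the Phase 1 notes mention "lock dir names") and that staleness is judged by modification time. The function names and the staleMs parameter are hypothetical, and the recovery path is not fully race-free.

```typescript
import * as fs from "node:fs";

// Try to take a directory-based lock. mkdir fails with EEXIST if the lock is held.
// A lock older than staleMs is treated as abandoned and reclaimed once.
function acquireLock(lockDir: string, staleMs: number): boolean {
  try {
    fs.mkdirSync(lockDir);
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
  }
  let ageMs: number;
  try {
    ageMs = Date.now() - fs.statSync(lockDir).mtimeMs;
  } catch {
    return false; // lock vanished between mkdir and stat; let the caller retry
  }
  if (ageMs <= staleMs) return false; // held by a live runner
  fs.rmSync(lockDir, { recursive: true, force: true });
  try {
    fs.mkdirSync(lockDir);
    return true;
  } catch {
    return false; // another process reclaimed it first
  }
}

function releaseLock(lockDir: string): void {
  fs.rmSync(lockDir, { recursive: true, force: true });
}
```

A background runner would wrap its work in acquireLock / releaseLock with try/finally so a crash in prompt construction does not leave the lock behind.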

Verification: All tests pass. Both agents can be toggled independently. devflow learn --status and devflow decisions --status work. Decisions are captured with threshold 1. Learning still requires 3+ observations for workflows. Hooks are <15 lines each.

Phase 4: Threshold Update + Manual Entry (PR 4)

Scope: Update promotion thresholds and add manual decision recording.

Changes:

  1. Threshold update: Decision and pitfall promotion thresholds → 1 observation required (already partially done in Phase 3 via process-observations update, but this phase verifies end-to-end)
  2. Manual entry command: devflow decisions add — interactive prompt that asks for type (decision/pitfall), pattern name, and structured fields. Writes directly to decisions.md/pitfalls.md using the same decisions-append operation in json-helper.cjs. No LLM needed — user provides the content.
  3. Update devflow decisions --review: Port relevant review functionality from devflow learn --review for decisions-specific management (deprecate entries, inspect observations)
  4. Tests: End-to-end test for manual entry, threshold verification

Verification: Manual devflow decisions add creates proper ADR/PF entries. Background agent captures decisions on first quality observation. devflow decisions --review works.

Phase 5: Directory Consolidation Under .devflow/ (PR 5)

Scope: Move all devflow project-level state under a single .devflow/ root directory. Clean break — no backwards compatibility shims.

Current structure (scattered across project root):

project-root/
├── .memory/                          # working memory + learning + decisions
│   ├── WORKING-MEMORY.md
│   ├── backup.json
│   ├── .pending-turns.jsonl
│   ├── .working-memory-last-trigger
│   ├── .gitignore-configured
│   ├── learning-log.jsonl            # learning system state
│   ├── learning.json
│   ├── learning-log.v1.jsonl.bak
│   ├── .learning-manifest.json
│   ├── .learning-notified-at
│   ├── .learning-runs-today
│   ├── .learning-session-count
│   ├── .decisions-usage.json         # (renamed in Phase 1)
│   ├── decisions/                    # (renamed in Phase 1)
│   │   ├── decisions.md
│   │   └── pitfalls.md
│   └── working/                      # catch-up cache (HUD feature)
│       ├── catch-up-cache.json
│       ├── catch-up-state.json
│       ├── catch-up-summary.json
│       ├── recent-summaries.json
│       └── session-*.json
├── .features/                        # feature knowledge bases
│   ├── index.json
│   ├── .knowledge.lock               # (renamed from .kb.lock in Phase 2)
│   ├── .knowledge-last-refresh       # (renamed from .kb-last-refresh in Phase 2)
│   ├── .disabled
│   └── {slug}/KNOWLEDGE.md
└── .docs/                            # review reports, design docs
    ├── reviews/{branch-slug}/{timestamp}/
    ├── design/
    ├── audits/
    ├── competitors/
    ├── features/
    ├── investigations/
    ├── references/
    ├── releases/
    └── status/

Target structure (consolidated):

project-root/
└── .devflow/
    ├── memory/                       # working memory
    │   ├── WORKING-MEMORY.md
    │   ├── backup.json
    │   ├── .pending-turns.jsonl
    │   ├── .working-memory-last-trigger
    │   └── working/                  # catch-up cache
    ├── decisions/                    # architectural decisions & pitfalls
    │   ├── decisions.md
    │   ├── pitfalls.md
    │   ├── .decisions.lock
    │   ├── .decisions-usage.json
    │   └── .disabled
    ├── learning/                     # self-learning state
    │   ├── learning-log.jsonl
    │   ├── learning.json
    │   ├── .learning-manifest.json
    │   ├── .learning.lock
    │   ├── .learning-notified-at
    │   ├── .learning-runs-today
    │   ├── .learning-session-count
    │   ├── .decisions-session-count
    │   └── .disabled
    ├── features/                     # feature knowledge bases
    │   ├── index.json
    │   ├── .knowledge.lock
    │   ├── .knowledge-last-refresh
    │   ├── .disabled
    │   └── {slug}/KNOWLEDGE.md
    └── docs/                         # review reports, design docs
        ├── reviews/
        ├── design/
        ├── audits/
        └── ...

Note: subdirectories inside .devflow/ drop the leading dot (.memory/ → memory/, .features/ → features/) since they're already inside a hidden directory. The .devflow/ directory itself is the hidden marker.

Files to update (~80):

| Category | Files | What changes |
|---|---|---|
| Hooks (background scripts) | session-end-learning, session-end-decisions, session-end-knowledge-refresh, stop-update-memory, prompt-capture-memory, session-start-memory, pre-compact-memory | All .memory/ → .devflow/memory/, .features/ → .devflow/features/ path references |
| Hook libraries | json-helper.cjs, decisions-index.cjs, feature-knowledge.cjs, staleness.cjs, sidecar-ops.cjs | Path constants for .memory/, .features/ |
| CLI commands | init.ts, learn.ts, decisions.ts, knowledge/ (all subcommands), memory.ts, ambient.ts | All hardcoded .memory/, .features/, .docs/ references |
| CLI utilities | post-install.ts, manifest.ts, paths.ts | .gitignore entries, path helpers |
| Orchestration skills | All 7 orch skills | .features/index.json → .devflow/features/index.json, .features/{slug}/KNOWLEDGE.md → .devflow/features/{slug}/KNOWLEDGE.md, .docs/ → .devflow/docs/ |
| Agent definitions | coder.md, reviewer.md, resolver.md, synthesizer.md, git.md | .docs/ path references in output instructions |
| Skills | docs-framework/SKILL.md, feature-knowledge/SKILL.md | .docs/ and .features/ path references |
| CLAUDE.md | 1 file | All path references in project structure docs |
| .gitignore | 1 file | Replace .docs/, .memory/, .features/ entries with single .devflow/ |
| Tests | ~20 files | All tests that create/reference .memory/, .features/, .docs/ paths |

Migration:

  • Add consolidate-to-devflow-dir per-project migration
  • Migration moves existing directories: .memory/ → .devflow/memory/, .features/ → .devflow/features/, .docs/ → .devflow/docs/
  • Strips leading dots from subdirectory names during move
  • Creates .devflow/.gitignore or updates project .gitignore
  • Clean break: no backwards compatibility — after migration, old paths are gone
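The three top-level moves can be captured in a single path-mapping helper, sketched below. It covers only the moves named in the migration bullets. The target structure also splits decisions and learning state into their own directories, which this sketch does not attempt. The names ROOT_MOVES and toDevflowPath are hypothetical, and paths are assumed to be project-relative with forward slashes.

```typescript
// Legacy top-level directory -> name under .devflow/ (leading dot stripped).
const ROOT_MOVES: Record<string, string> = {
  ".memory": "memory",
  ".features": "features",
  ".docs": "docs",
};

// Map a project-relative legacy path to its consolidated location.
// Paths outside the three legacy roots are returned unchanged.
function toDevflowPath(relPath: string): string {
  const [head, ...rest] = relPath.split("/");
  const target = Object.prototype.hasOwnProperty.call(ROOT_MOVES, head)
    ? ROOT_MOVES[head]
    : undefined;
  if (target === undefined) return relPath;
  return [".devflow", target, ...rest].join("/");
}
```

A helper like this could back both the migration itself and the path constants in paths.ts, so the mapping is defined once.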

.gitignore changes:

- .docs/
- .docs/.working-memory.lock
- .docs/.working-memory-update.log
- .memory/
+ .devflow/

Clean break rationale: By Phase 5, all code references have already been updated in Phases 1-4 to use the new names. The migration just moves files. No backwards compat shims are needed because devflow init runs migrations automatically, and there are no external consumers of these paths.

Verification: All tests pass. npm run build succeeds. devflow init migrates existing projects. All hooks, CLI commands, and orchestration skills resolve paths correctly under .devflow/.


Files Affected Summary (All Phases)

| Category | Count | Key Files |
|---|---|---|
| Skills | ~12 | 4 renamed skill directories, 6 orchestration skills updated |
| Agents | ~6 | 5 agent definitions + Knowledge agent skill refs |
| Hooks/Scripts | ~12 | json-helper.cjs, thin hook triggers, CLI background runners, decisions-index.cjs |
| CLI | ~15 | kb/ → knowledge/, init.ts, learn.ts, new decisions command, path updates |
| Tests | ~27 | learning/, feature-kb/ → feature-knowledge/, resolve/, integration tests |
| Config | ~5 | CLAUDE.md, plugin.json files, manifest types, .gitignore |
| Total | ~80-90 | |

Testing Strategy

Each phase must:

  1. Pass all existing tests (1022+)
  2. Build successfully (npm run build)
  3. Install correctly via devflow init
  4. Pass a smoke test: run devflow learn --status, devflow decisions --status, devflow knowledge list (Phase 2+)

Future Considerations (Not In Scope)

  • 4-way agent split (workflow/procedural/decision/pitfall as separate agents) — can be done later if needed
  • Changes to feature knowledge content/format (KNOWLEDGE.md structure stays the same)
  • Migrate working memory hooks to Pattern A (CLI-mediated) — natural follow-up after learning/decisions are migrated
