feat(cdc): native SQLite CDC source (db.cdc.sqlite) #351

Open: skhaz wants to merge 1 commit into main from feat/sqlite-cdc

@skhaz skhaz commented Jun 16, 2026

Why

The runtime streams Postgres row changes via db.cdc.postgres (#323) but has no SQLite equivalent. SQLite has no logical-replication slot, so the native mechanism is the preupdate hook, which observes row changes on the connection the runtime writes through.

What

Adds a db.cdc.sqlite registry kind backed by a supervised Source that installs SQLite preupdate + commit/rollback hooks on the target db.sql.sqlite pool's writer connection and emits insert/update/delete (plus a gap-free snapshot bootstrap) through the existing, engine-agnostic cdc Lua module (cdc.stream / list_sources / source).

  • service/sql: build-tagged hook seam (sqlite_preupdate_hook) registering a ConnectHook-enabled driver; the factory selects it transparently (1-line change).
  • service/cdc/sqlite: preupdate rows buffered per-transaction, flushed atomically on commit via a bounded handoff to a drain goroutine. Column names/affinity resolve over a dedicated read-only connection and checkpoints write through a separate plain-driver connection, so the writer-blocking commit hook can never deadlock against schema resolution or checkpoints. A laggard subscriber is closed loudly rather than allowed to stall the writer. Durable snapshot/offset state lives in wippy_cdc_offsets in the source DB.
  • api/service/cdc: db.cdc.sqlite kind, SQLiteConfig, and a composite inspector/streamer so the Lua module observes both engines.
  • boot: kind-specific listeners (db.cdc.postgres + db.cdc.sqlite) feed the composite.
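
The per-transaction buffering described above can be sketched in plain Go. This is a minimal illustration, not the code in this PR: `Change`, `TxBuffer`, and `ErrHandoffFull` are hypothetical names, and returning an error when the bounded handoff is full is an assumption made for the sketch (the PR does not say how the real handoff reacts to a full queue).

```go
package main

import "errors"

// Change is a hypothetical row-change record; the field names are
// illustrative and not taken from the PR's code.
type Change struct {
	Op    string // "insert", "update", or "delete"
	Table string
	RowID int64
}

// ErrHandoffFull reports that the drain side has fallen behind.
var ErrHandoffFull = errors.New("handoff queue full")

// TxBuffer collects changes seen by a preupdate-style callback and
// releases them only when the surrounding transaction commits.
type TxBuffer struct {
	pending []Change
	handoff chan []Change // bounded queue read by a drain goroutine
}

// NewTxBuffer creates a buffer whose handoff holds up to capacity batches.
func NewTxBuffer(capacity int) *TxBuffer {
	return &TxBuffer{handoff: make(chan []Change, capacity)}
}

// OnPreUpdate records one row change for the open transaction.
func (b *TxBuffer) OnPreUpdate(c Change) {
	b.pending = append(b.pending, c)
}

// OnCommit hands the whole transaction over as one batch, so a consumer
// never observes a partial transaction. The send is non-blocking in this
// sketch (an assumption): if the queue is full the batch is dropped and
// an error is returned instead of stalling the writer.
func (b *TxBuffer) OnCommit() error {
	if len(b.pending) == 0 {
		return nil
	}
	batch := b.pending
	b.pending = nil
	select {
	case b.handoff <- batch:
		return nil
	default:
		return ErrHandoffFull
	}
}

// OnRollback discards everything buffered for the aborted transaction.
func (b *TxBuffer) OnRollback() {
	b.pending = nil
}

// Batches exposes the receive side for the drain goroutine.
func (b *TxBuffer) Batches() <-chan []Change {
	return b.handoff
}
```

A drain goroutine would range over `Batches()` and fan each batch out to subscribers. Keeping the commit callback down to a slice swap and one channel send is what keeps the writer's critical section short.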

Limitation (by design)

Capture is in-process and live-only: changes made while the runtime is down, or by an external process writing the file, are not captured (unlike a Postgres slot that replays). The checkpoint exists for snapshot-gating and idempotent dedupe, not replay. This is the trade for zero schema intrusion and lowest overhead.
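
To illustrate what "idempotent dedupe, not replay" means in practice, here is a hedged sketch of a checkpoint used only to skip already-seen events. `Event` and `Deduper` are hypothetical names, and the sketch assumes the sequence number (`lsn` in the stream record) increases monotonically per source, which this PR does not state explicitly.

```go
package main

// Event is a hypothetical change event carrying a sequence number.
type Event struct {
	LSN   uint64
	Op    string
	Table string
}

// Deduper skips events at or below the last checkpointed sequence number.
// It cannot recover events that were never captured; it only prevents
// the same event from being applied twice.
type Deduper struct {
	checkpoint uint64
}

// NewDeduper starts from a checkpoint loaded from durable storage.
func NewDeduper(checkpoint uint64) *Deduper {
	return &Deduper{checkpoint: checkpoint}
}

// Apply runs handle for an event that has not been seen yet and advances
// the checkpoint only after handle succeeds. It reports whether the event
// was applied.
func (d *Deduper) Apply(e Event, handle func(Event) error) (bool, error) {
	if e.LSN <= d.checkpoint {
		return false, nil // duplicate: already covered by the checkpoint
	}
	if err := handle(e); err != nil {
		return false, err // checkpoint stays put so the event can be retried
	}
	d.checkpoint = e.LSN
	return true, nil
}

// Checkpoint returns the value a caller would persist.
func (d *Deduper) Checkpoint() uint64 {
	return d.checkpoint
}
```

Changes written while the runtime is down never reach `Apply`, so a consumer that needs completeness has to rely on the snapshot bootstrap rather than the checkpoint.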

Setup

Build with the new tag (already wired into the Makefile): make build-wippy-local. Without sqlite_preupdate_hook the source fails loudly instead of silently capturing nothing.
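
The "fails loudly" behavior is typically achieved with a pair of build-constrained files. The sketch below shows only what the untagged fallback file might look like. `InstallHooks` and `ErrHooksUnavailable` are hypothetical names, not identifiers from this PR. A sibling file guarded by `//go:build sqlite_preupdate_hook` would hold the real hook installation.

```go
//go:build !sqlite_preupdate_hook

package main

import "errors"

// ErrHooksUnavailable is returned when the binary was compiled without
// the sqlite_preupdate_hook build tag.
var ErrHooksUnavailable = errors.New("sqlite cdc: binary built without the sqlite_preupdate_hook tag")

// InstallHooks is the fallback compiled into untagged builds. It refuses
// to start rather than pretending to capture changes.
func InstallHooks() error {
	return ErrHooksUnavailable
}
```

With this layout, `go build -tags sqlite_preupdate_hook` selects the real implementation, and an untagged build surfaces the error at source start.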

Registry usage:

- { name: db,  kind: db.sql.sqlite, file: /path/app.db, lifecycle: { auto_start: true } }
- { name: cdc, kind: db.cdc.sqlite, db_resource: app:db, lifecycle: { auto_start: true } }

Lua: local s = cdc.stream("app:cdc"); local c = s:channel():receive(), where c is a change record { op, table, before, after, source, lsn }.

Testing

  • golangci-lint (tags race,sqlite_preupdate_hook): 0 issues.
  • Unit + integration (-race): decode/affinity, subscribers, manager, composite, config; and against a real WAL SQLite file: insert/update/delete, value fidelity (blob/text/NULL), rollback-discard, snapshot bootstrap, restart-keeps-checkpoint, table allowlist, and laggard-subscriber-does-not-stall-writes.
  • Adversarial agent code review run on the branch: 4 real bugs found and fixed (error-path hook leak -> writer deadlock; laggard subscriber stalling the writer; Stop drain dropping changes via cancelled ctx; mode=ro on WAL).
  • Mutation (gremlins): untagged core 90% efficacy (2 equivalent survivors); tagged ~79%.
  • Live binary run (full production path: registry -> boot -> manager -> supervisor -> Source.Start -> preupdate hook -> drain -> cdc.stream -> Lua channel):
INFO cdc.sqlite  sqlite cdc source started  {"id":"app:cdc","file":".../data.db","snapshot":false}
INFO op=insert table=users source=app:cdc lsn=1
INFO after.email=live@wippy.ai after.balance=42.5

Why:
The runtime could stream row changes from Postgres (db.cdc.postgres) but
had no equivalent for SQLite. SQLite has no logical-replication slot, so
the native mechanism is the preupdate hook, which observes row changes on
the connection the runtime writes through.

What:
Adds a db.cdc.sqlite registry kind backed by a supervised Source that
installs SQLite preupdate + commit/rollback hooks on the target
db.sql.sqlite pool's writer connection and emits insert/update/delete
(plus a gap-free snapshot bootstrap) through the existing,
engine-agnostic cdc Lua module (cdc.stream/list_sources/source).

How:
- service/sql: a build-tagged hook seam (sqlite_preupdate_hook) registers
  a ConnectHook-enabled driver and installs/clears hooks on a raw
  *sqlite3.SQLiteConn; the factory selects the driver transparently.
- service/cdc/sqlite: the Source buffers preupdate rows per transaction
  and flushes atomically on commit via a bounded handoff to a drain
  goroutine. Column names/affinity resolve over a dedicated read-only
  connection, and checkpoints write through a separate plain-driver
  connection, so the writer-blocking commit hook can never deadlock
  against schema resolution or checkpoint writes. A laggard subscriber is
  closed rather than allowed to stall the writer. Durable snapshot/offset
  state lives in wippy_cdc_offsets in the source DB.
- api/service/cdc: db.cdc.sqlite kind, SQLiteConfig, and a composite
  inspector/streamer so the Lua module observes both engines.
- boot: kind-specific listeners (db.cdc.postgres + db.cdc.sqlite) feed the
  composite.
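
The "laggard subscriber is closed rather than allowed to stall the writer" policy can be sketched with non-blocking sends. This is an illustration under assumptions: `Hub`, `Subscribe`, and `Publish` are hypothetical names, and messages are plain strings for brevity.

```go
package main

import "sync"

// Hub fans messages out to subscribers, each with its own bounded buffer.
type Hub struct {
	mu   sync.Mutex
	subs map[int]chan string
	next int
}

// NewHub returns an empty hub.
func NewHub() *Hub {
	return &Hub{subs: make(map[int]chan string)}
}

// Subscribe registers a subscriber and returns its id and receive channel.
func (h *Hub) Subscribe(buffer int) (int, <-chan string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	id := h.next
	h.next++
	ch := make(chan string, buffer)
	h.subs[id] = ch
	return id, ch
}

// Publish delivers msg to every subscriber without blocking. A subscriber
// whose buffer is full is removed and its channel closed, so one slow
// reader cannot hold up the publisher. The ids of dropped subscribers are
// returned so the caller can log them.
func (h *Hub) Publish(msg string) []int {
	h.mu.Lock()
	defer h.mu.Unlock()
	var dropped []int
	for id, ch := range h.subs {
		select {
		case ch <- msg:
		default:
			close(ch)
			delete(h.subs, id) // deleting during range is safe in Go
			dropped = append(dropped, id)
		}
	}
	return dropped
}
```

A dropped subscriber still drains whatever was already buffered and then observes the closed channel, which gives it an unambiguous signal to resubscribe.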

Capture is in-process and live-only: changes made while the runtime is
down, or by an external writer, are not captured. The checkpoint exists
for snapshot-gating and idempotent dedupe, not replay. Building requires
the sqlite_preupdate_hook tag (added to the Makefile); without it the
source fails loudly instead of silently capturing nothing.
@skhaz skhaz requested a review from wolfy-j June 16, 2026 18:33