
Narrated walkthrough: in-page play mode + animated MP4 export #2

Open
russ wants to merge 4 commits into main from feat/narrated-walkthrough-video


russ commented Jun 22, 2026


What & why

Turns a PatchStory walkthrough into a narrated screencast that breaks down the code being reviewed. The chapters already are the script (intent, risk, referenced diff hunks), so this adds a player and an exporter over data that already exists.

1. In-page "play" mode (zero-dependency default)

A ▶ Play button turns the static .html into a self-playing screencast: each chapter becomes a scene that pans the actual diff and spotlights the lines it references, narrated via the browser's built-in speech synthesis (captions included). No ffmpeg, no API key, no network.

  • Optional Chapter.narration (falls back to intent, then summary).
  • Player overlay: rAF scene clock, speech + auto-advance, transport controls, diff spotlight/dim, keyboard shortcuts, header Play button + `p` shortcut.
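
The narration fallback (narration, then intent, then summary) can be sketched as below. This is a minimal illustration, not the PR's code: the `Chapter` shape is reduced to the three fields named here, and `narrationFor` is a hypothetical helper name.

```typescript
// Hypothetical, reduced shape: only the three fields the description mentions.
interface Chapter {
  narration?: string;
  intent?: string;
  summary?: string;
}

// Return the first non-blank text in the order narration -> intent -> summary.
function narrationFor(chapter: Chapter): string {
  const candidates = [chapter.narration, chapter.intent, chapter.summary];
  for (const text of candidates) {
    if (text !== undefined && text.trim() !== "") {
      return text.trim();
    }
  }
  // Nothing to read aloud; a player could still show the scene silently.
  return "";
}
```

Treating a blank `narration` as missing is an assumption of this sketch; it keeps existing walkthroughs without the field playable.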

2. patchstory video — opt-in MP4, animated

Renders the same scenes into a real shareable .mp4. Two engines (--engine):

  • hyperframes (default): generates a HyperFrames composition (HTML + one GSAP timeline) and renders it frame-by-frame in headless Chrome. Title card → one animated scene per chapter (diff reveals line-by-line, referenced lines light up as narrated, sentence-beat captions) → outro. Invoked via npx, so no npm runtime dep.
  • pan: fully-local fallback using a Chromium screenshot + ffmpeg pan.

TTS (--tts): elevenlabs (ELEVENLABS_API_KEY), kokoro (local neural, keyless default), espeak-ng/flite/say, or none. ffmpeg/ffprobe are resolved by running candidates (PATH → /usr/bin → PATCHSTORY_FFMPEG), and the working one is handed to HyperFrames so a broken/shadowing PATH entry can't break the render.
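
"Resolved by running candidates" could look roughly like the sketch below: probe each candidate with `-version`, keep the first that exits cleanly, then prepend its directory to the child's PATH. The helper names (`firstWorking`, `resolveFfmpeg`, `childEnvWith`) and the 5-second timeout are assumptions for illustration; only the candidate order and the PATCHSTORY_FFMPEG variable come from the description.

```typescript
import { spawnSync } from "node:child_process";
import { delimiter, dirname, isAbsolute } from "node:path";

type Probe = (binary: string) => boolean;

// Default probe: a candidate "works" if `<binary> -version` exits with status 0.
const runsCleanly: Probe = (binary) => {
  const result = spawnSync(binary, ["-version"], { stdio: "ignore", timeout: 5000 });
  return result.error === undefined && result.status === 0;
};

// Return the first candidate that passes the probe, or undefined if none do.
function firstWorking(
  candidates: Array<string | undefined>,
  probe: Probe = runsCleanly,
): string | undefined {
  for (const candidate of candidates) {
    if (candidate && probe(candidate)) return candidate;
  }
  return undefined;
}

// Candidate order from the description: PATH, then /usr/bin, then the env override.
function resolveFfmpeg(env: Record<string, string | undefined> = process.env): string | undefined {
  return firstWorking(["ffmpeg", "/usr/bin/ffmpeg", env.PATCHSTORY_FFMPEG]);
}

// Build a child environment whose PATH starts with the working binary's directory.
// A bare name such as "ffmpeg" already resolves through PATH, so it is left alone.
function childEnvWith(
  binaryPath: string,
  env: Record<string, string | undefined> = process.env,
): Record<string, string | undefined> {
  if (!isAbsolute(binaryPath)) return { ...env };
  const parts = [dirname(binaryPath), env.PATH].filter((p): p is string => Boolean(p));
  return { ...env, PATH: parts.join(delimiter) };
}
```

Making the probe injectable is a design choice of the sketch; it lets the selection order be tested without ffmpeg installed.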

Verification

  • typecheck clean · build clean · 23/23 tests pass.
  • Play mode driven headless (overlay mounts, scenes advance, end card); caught+fixed a TTS onerror instant-skip bug.
  • patchstory video (hyperframes + Kokoro) renders a valid 1920×1080 H.264 + AAC, ~146s MP4 from the 5-chapter demo — title card, 5 animated chapter scenes with live spotlight, outro; sampled frames confirm each.

Notes

  • HyperFrames engine needs network for npx on first use; --engine pan is the offline path.
  • Web Speech API / local-TTS voice quality varies; ElevenLabs is best with a key.

🤖 Generated with Claude Code

russ and others added 3 commits June 22, 2026 15:34
Turn the static walkthrough into a self-playing, narrated screencast: each
chapter becomes a scene that pans the actual diff and spotlights the lines
it references, while the browser's built-in speech synthesis reads a short
narration (captions included, so it works muted). No new dependencies, no
API key, no network -- the same single .html, just playing itself.

- core: add optional Chapter.narration and allow it in the JSON Schema.
  Falls back to intent then summary, so existing walkthroughs still play.
- web: player overlay -- requestAnimationFrame scene clock, speech +
  auto-advance, transport controls, diff spotlight/dim, keyboard
  (space/arrows/m/Esc), header Play button and `p` shortcut.
- styles: full-screen player themed via the existing light/dark CSS vars,
  with a prefers-reduced-motion fallback.
- skill + README: author a per-chapter `narration`; document play mode.
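
A scene clock of the kind described above can be reduced to a pure function that maps elapsed time to a scene index and progress. This is a sketch under the assumption that each scene has a known duration in milliseconds; the real player also advances on speech events, which is not modelled here. `locateScene` and `ScenePosition` are hypothetical names.

```typescript
interface ScenePosition {
  index: number;    // which scene is showing
  progress: number; // 0..1 within that scene
  done: boolean;    // true once elapsed time passes the last scene
}

// Map elapsed milliseconds onto a list of per-scene durations.
function locateScene(durationsMs: number[], elapsedMs: number): ScenePosition {
  let start = 0;
  for (let index = 0; index < durationsMs.length; index++) {
    const duration = durationsMs[index] ?? 0;
    const end = start + duration;
    if (elapsedMs < end) {
      const progress = duration > 0 ? (elapsedMs - start) / duration : 1;
      return { index, progress: Math.max(0, progress), done: false };
    }
    start = end;
  }
  // Past the final scene: hold on the last one so an end card can be shown.
  return { index: Math.max(0, durationsMs.length - 1), progress: 1, done: true };
}
```

In a browser, a `requestAnimationFrame` callback would call this on every frame with the time elapsed since playback started and update the overlay from the result.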

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW

An opt-in counterpart to the in-page play mode that produces a real,
shareable .mp4 -- a title card plus one scene per chapter, each panning the
actual diff (spotlighting referenced lines) over a narration track. Reuses
the same scene model, but uses *system* tools, so it adds no npm runtime
dependencies and only touches them when a video is requested.

Pipeline per scene: build a deterministic scene HTML (fixed header/caption
bands) -> headless-Chromium screenshot -> TTS narration -> ffmpeg slices the
PNG into a fixed header, a vertically-panning code region, and a fixed
caption, then muxes the audio. Per-scene clips are concatenated to the MP4.
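
The vertical pan over the code region comes down to choosing a top offset for each point in time. The sketch below assumes a linear pan from the top of the screenshot to the bottom over the scene's duration; the commit does not state the easing, and `panOffset` is a hypothetical helper.

```typescript
// Top offset (in pixels) of the visible code window at time tSec,
// panning linearly so the window reaches the bottom when the scene ends.
function panOffset(
  imageHeight: number,
  windowHeight: number,
  durationSec: number,
  tSec: number,
): number {
  // If the code fits inside the window there is nothing to pan.
  const travel = Math.max(0, imageHeight - windowHeight);
  if (durationSec <= 0) return 0;
  const fraction = Math.min(1, Math.max(0, tSec / durationSec));
  return Math.round(travel * fraction);
}
```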

- renderer: packages/renderer/src/video/{scene-html,index}.ts; export
  renderVideo + types.
- cli: `patchstory video <walkthrough.json>` with
  --tts/--voice/--chrome/--fps/--keep (plus --diff/--redact via render path).
- tools: resolve ffmpeg/ffprobe by actually running candidates (PATH, then
  /usr/bin, then PATCHSTORY_FFMPEG/FFPROBE), so a broken or shadowing PATH
  entry is skipped; Chrome resolution also picks up a flatpak Chromium.
- tts: elevenlabs (ELEVENLABS_API_KEY) | espeak-ng | flite | say | none.
- docs: README "Narrated video" section; skill note.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW

The static screenshot+ffmpeg-pan video looked flat. Replace it with a real
motion-graphics engine: generate a HyperFrames composition (HTML + one paused
GSAP timeline) from the walkthrough and render it frame-by-frame in headless
Chrome. Scenes animate -- the diff reveals line-by-line and the referenced
("spotlight") lines light up as they're narrated -- with a title card, an outro,
and sentence-beat captions timed to the measured voiceover. Still no npm runtime
deps: HyperFrames and TTS are invoked through `npx`.

- composition.ts: generate the full index.html (CSS + per-scene HTML + GSAP
  timeline) from scenes; obeys the determinism contract (no wall-clock logic,
  static captions, autoAlpha/transforms, hard-clear after fade-out). Reuses the
  existing per-line highlighter so any language is colored.
- hyperframes.ts: orchestrator -- window each chapter's diff around the
  spotlight, synth Kokoro VO per scene, lay the timeline out from measured
  durations, render via `npx hyperframes render`. Prepends the resolved (working)
  ffmpeg dir to the child PATH so HyperFrames doesn't pick up a broken ffmpeg.
- index.ts: `renderVideo` now dispatches engine `hyperframes` (default) | `pan`
  (the previous static renderer, kept as an offline fallback). Add a `kokoro`
  TTS provider (local neural voice via `npx hyperframes tts`, no API key) and
  thread a child env through `synth`.
- cli: `--engine hyperframes|pan`; `--tts` gains `kokoro`.
- docs: README "Narrated video" rewrite; skill note.
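
Laying the timeline out from measured durations amounts to accumulating scene windows after the title card. A minimal sketch, assuming fixed title/outro lengths and a small pad after each voiceover; those defaults and the names `layoutTimeline` and `SceneWindow` are hypothetical, not taken from hyperframes.ts.

```typescript
interface SceneWindow {
  start: number; // seconds from the start of the video
  end: number;
}

interface TimelineLayout {
  scenes: SceneWindow[];
  outroStart: number;
  total: number;
}

// Place each scene back-to-back after the title card, sized by its measured voiceover.
function layoutTimeline(
  voiceoverSec: number[],
  titleSec = 3,   // assumed title-card length
  outroSec = 3,   // assumed outro length
  padSec = 0.5,   // assumed breathing room after each voiceover
): TimelineLayout {
  let cursor = titleSec;
  const scenes = voiceoverSec.map((vo) => {
    const start = cursor;
    const end = start + vo + padSec;
    cursor = end;
    return { start, end };
  });
  return { scenes, outroStart: cursor, total: cursor + outroSec };
}
```

Because every offset derives from measured audio lengths rather than wall-clock time, the same inputs always produce the same layout, which fits the determinism contract mentioned for composition.ts.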

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW

russ changed the title from "Narrated walkthrough: in-page play mode + opt-in MP4 export" to "Narrated walkthrough: in-page play mode + animated MP4 export" Jun 26, 2026

Burned-in captions crowded the code (and a centering bug overflowed them off
the right edge). Drop the on-screen caption elements from the composition and
instead emit the narration as a soft, toggleable subtitle track muxed into the
MP4 (mov_text, forced=0) plus a sidecar .srt. Cues are the narration split into
beats, timed across each scene's measured voiceover window.

Now the video frame is just the title/code/spotlight, and viewers turn captions
on/off and let their player style and position them.

- composition.ts: remove the .cap elements, their timeline tweens, and CSS.
- hyperframes.ts: collect subtitle cues during scene layout; build SRT; render
  to a temp file then mux mov_text + write the sidecar .srt next to the output.
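
Splitting the narration into beats and timing them across a scene's voiceover window could be sketched as follows. The proportional-by-character-count timing and the naive sentence splitter are assumptions of this sketch; the commit only says cues are beats timed across the measured window. All function names are hypothetical.

```typescript
interface Cue {
  start: number; // seconds
  end: number;
  text: string;
}

// Naive sentence split: runs of text ending in ., ! or ? (abbreviations will over-split).
function splitBeats(narration: string): string[] {
  const parts: string[] = narration.match(/[^.!?]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Spread beats across [windowStart, windowEnd] in proportion to their length.
function cuesFor(narration: string, windowStart: number, windowEnd: number): Cue[] {
  const beats = splitBeats(narration);
  const totalChars = beats.reduce((sum, b) => sum + b.length, 0);
  if (totalChars === 0 || windowEnd <= windowStart) return [];
  const span = windowEnd - windowStart;
  let consumed = 0;
  return beats.map((text) => {
    const start = windowStart + (span * consumed) / totalChars;
    consumed += text.length;
    const end = windowStart + (span * consumed) / totalChars;
    return { start, end, text };
  });
}

// Left-pad a non-negative integer with zeros.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function srtTime(seconds: number): string {
  const ms = Math.max(0, Math.round(seconds * 1000));
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms % 1000, 3)}`;
}

// Serialize cues as numbered SRT blocks separated by blank lines.
function toSrt(cues: Cue[]): string {
  return cues
    .map((cue, i) => `${i + 1}\n${srtTime(cue.start)} --> ${srtTime(cue.end)}\n${cue.text}\n`)
    .join("\n");
}
```

The resulting string can be written as the sidecar .srt; muxing it as a soft track is then an ffmpeg step along the lines of `-c copy -c:s mov_text`, with the exact invocation left to the renderer.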

Note: MP4/mov_text always flags a lone subtitle default=1 (an ffmpeg movenc
limitation); forced=0 so it's never burned, and players gate display on the CC
toggle, rendering it via their own subtitle engine.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW