Narrated walkthrough: in-page play mode + animated MP4 export #2
Open
russ wants to merge 4 commits into
Turn the static walkthrough into a self-playing, narrated screencast: each
chapter becomes a scene that pans the actual diff and spotlights the lines it
references, while the browser's built-in speech synthesis reads a short
narration (captions included, so it works muted). No new dependencies, no API
key, no network -- the same single .html, just playing itself.
- core: add optional Chapter.narration and allow it in the JSON Schema. Falls
back to intent then summary, so existing walkthroughs still play.
- web: player overlay -- requestAnimationFrame scene clock, speech +
auto-advance, transport controls, diff spotlight/dim, keyboard
(space/arrows/m/Esc), header Play button and `p` shortcut.
- styles: full-screen player themed via the existing light/dark CSS vars, with
a prefers-reduced-motion fallback.
- skill + README: author a per-chapter `narration`; document play mode.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW
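The narration fallback described above (narration, then intent, then summary) can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Chapter` shape is trimmed to three fields, `narrationFor` is a hypothetical name, and treating whitespace-only strings as missing is an assumption.

```typescript
// Hypothetical minimal shape; the real Chapter type likely has more fields.
interface Chapter {
  narration?: string;
  intent?: string;
  summary?: string;
}

// Pick the text to speak: narration first, then intent, then summary.
// Blank or whitespace-only strings are treated as missing (an assumption).
function narrationFor(chapter: Chapter): string {
  const candidates = [chapter.narration, chapter.intent, chapter.summary];
  for (const text of candidates) {
    if (text !== undefined && text.trim() !== "") {
      return text.trim();
    }
  }
  // Nothing to say: the caller can show the scene silently.
  return "";
}
```

Under these assumptions a walkthrough written before `narration` existed still yields text for every chapter that has an `intent` or a `summary`.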
An opt-in counterpart to the in-page play mode that produces a real,
shareable .mp4 -- a title card plus one scene per chapter, each panning the
actual diff (spotlighting referenced lines) over a narration track. Reuses
the same scene model, but uses *system* tools, so it adds no npm runtime
dependencies and only touches them when a video is requested.
Pipeline per scene: build a deterministic scene HTML (fixed header/caption
bands) -> headless-Chromium screenshot -> TTS narration -> ffmpeg slices the
PNG into a fixed header, a vertically-panning code region, and a fixed
caption, then muxes the audio. Per-scene clips are concatenated into the final MP4.
- renderer: packages/renderer/src/video/{scene-html,index}.ts; export
renderVideo + types.
- cli: `patchstory video <walkthrough.json>` with
--tts/--voice/--chrome/--fps/--keep (plus --diff/--redact via render path).
- tools: resolve ffmpeg/ffprobe by actually running candidates (PATH, then
/usr/bin, then PATCHSTORY_FFMPEG/FFPROBE), so a broken or shadowing PATH
entry is skipped; Chrome resolution also picks up a flatpak Chromium.
- tts: elevenlabs (ELEVENLABS_API_KEY) | espeak-ng | flite | say | none.
- docs: README "Narrated video" section; skill note.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW
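The "resolve by actually running candidates" idea from the tools bullet can be sketched like this. It is an illustration under stated assumptions, not the PR's implementation: `resolveTool`, `ffmpegCandidates`, and the injected `Probe` callback are hypothetical names, and the probe is passed in so the logic stays independent of any particular process API.

```typescript
// Probe returns true when the candidate binary actually runs
// (for example, `<candidate> -version` exits with status 0).
type Probe = (candidate: string) => boolean;

// Return the first candidate that the probe confirms, or undefined if none work.
function resolveTool(candidates: readonly string[], works: Probe): string | undefined {
  for (const candidate of candidates) {
    if (candidate === "") continue; // unset env override
    try {
      if (works(candidate)) return candidate;
    } catch {
      // A candidate that throws (missing binary, bad permissions) is skipped.
    }
  }
  return undefined;
}

// Candidate order as stated in the commit message:
// bare name (PATH lookup), then /usr/bin, then the env override.
function ffmpegCandidates(envOverride: string | undefined): string[] {
  return ["ffmpeg", "/usr/bin/ffmpeg", envOverride ?? ""];
}
```

In Node, one plausible probe wraps `spawnSync(candidate, ["-version"])` from `node:child_process` and checks that `status === 0`; a PATH entry that exists but fails to run is then skipped rather than trusted.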
The static screenshot+ffmpeg-pan video looked flat. Replace it with a real
motion-graphics engine: generate a HyperFrames composition (HTML + one paused
GSAP timeline) from the walkthrough and render it frame-by-frame in headless
Chrome. Scenes animate -- the diff reveals line-by-line and the referenced
("spotlight") lines light up as they're narrated -- with a title card, an outro,
and sentence-beat captions timed to the measured voiceover. Still no npm runtime
deps: HyperFrames and TTS are invoked through `npx`.
- composition.ts: generate the full index.html (CSS + per-scene HTML + GSAP
timeline) from scenes; obeys the determinism contract (no wall-clock logic,
static captions, autoAlpha/transforms, hard-clear after fade-out). Reuses the
existing per-line highlighter so any language is colored.
- hyperframes.ts: orchestrator -- window each chapter's diff around the
spotlight, synth Kokoro VO per scene, lay the timeline out from measured
durations, render via `npx hyperframes render`. Prepends the resolved (working)
ffmpeg dir to the child PATH so HyperFrames doesn't pick up a broken ffmpeg.
- index.ts: `renderVideo` now dispatches engine `hyperframes` (default) | `pan`
(the previous static renderer, kept as an offline fallback). Add a `kokoro`
TTS provider (local neural voice via `npx hyperframes tts`, no API key) and
thread a child env through `synth`.
- cli: `--engine hyperframes|pan`; `--tts` gains `kokoro`.
- docs: README "Narrated video" rewrite; skill note.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW
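Laying the timeline out from measured voiceover durations, as the hyperframes.ts bullet describes, might look roughly like the sketch below. The names (`SceneWindow`, `layoutTimeline`) and the `padding` and `minScene` values are assumptions for illustration; the PR does not state the actual numbers.

```typescript
interface SceneWindow {
  index: number;
  start: number; // seconds from the start of the video
  end: number;
}

// Lay scenes end to end. Each scene lasts as long as its measured voiceover
// plus a fixed padding, and never less than a minimum (both values are assumptions).
function layoutTimeline(
  voiceoverSeconds: readonly number[],
  padding = 0.5,
  minScene = 2,
): SceneWindow[] {
  const windows: SceneWindow[] = [];
  let cursor = 0;
  voiceoverSeconds.forEach((vo, index) => {
    const length = Math.max(minScene, vo + padding);
    windows.push({ index, start: cursor, end: cursor + length });
    cursor += length;
  });
  return windows;
}
```

Because every offset derives from measured audio length rather than a clock, the same inputs always produce the same layout, which fits the determinism contract mentioned for composition.ts.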
Burned-in captions crowded the code (and a centering bug overflowed them off
the right edge). Drop the on-screen caption elements from the composition and
instead emit the narration as a soft, toggleable subtitle track muxed into the
MP4 (mov_text, forced=0) plus a sidecar .srt. Cues are the narration split
into beats, timed across each scene's measured voiceover window. Now the video
frame is just the title/code/spotlight, and viewers turn captions on/off and
let their player style and position them.
- composition.ts: remove the .cap elements, their timeline tweens, and CSS.
- hyperframes.ts: collect subtitle cues during scene layout; build SRT; render
to a temp file then mux mov_text + write the sidecar .srt next to the output.
Note: MP4/mov_text always flags a lone subtitle default=1 (an ffmpeg movenc
limitation); forced=0 so it's never burned, and players gate display on the CC
toggle, rendering it via their own subtitle engine.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8q2ot94CPocoNHL7XVStW
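Splitting narration into beats and timing them across a scene's voiceover window could be sketched as follows. This is an assumption-heavy illustration: the function names are hypothetical, beats are split naively on sentence punctuation (so "v1.2" would be split mid-token), and time is shared in proportion to character count, which the commit message does not specify.

```typescript
interface Cue {
  start: number; // seconds
  end: number;
  text: string;
}

// Split narration into sentence beats: a run of text plus its trailing . ! or ?
function splitBeats(narration: string): string[] {
  const parts: string[] = narration.match(/[^.!?]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Spread the beats across [start, end], proportional to character count (an assumption).
function cuesForScene(narration: string, start: number, end: number): Cue[] {
  const beats = splitBeats(narration);
  const totalChars = beats.reduce((n, b) => n + b.length, 0);
  const cues: Cue[] = [];
  let cursor = start;
  let used = 0;
  for (const text of beats) {
    used += text.length;
    const cueEnd = start + ((end - start) * used) / totalChars;
    cues.push({ start: cursor, end: cueEnd, text });
    cursor = cueEnd;
  }
  return cues;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function srtTime(seconds: number): string {
  const ms = Math.round(seconds * 1000);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms % 1000, 3)}`;
}

// Number the cues from 1 and join them with blank lines, as SRT expects.
function buildSrt(cues: readonly Cue[]): string {
  return cues
    .map((c, i) => `${i + 1}\n${srtTime(c.start)} --> ${srtTime(c.end)}\n${c.text}\n`)
    .join("\n");
}
```

The resulting string can be written as the sidecar .srt and also used as the subtitle input when muxing the mov_text track.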
What & why
Turns a PatchStory walkthrough into a narrated screencast that breaks down the code being reviewed. The chapters already are the script (intent, risk, referenced diff hunks), so this adds a player and an exporter over data that already exists.
1. In-page "play" mode (zero-dependency default)
A ▶ Play button turns the static `.html` into a self-playing screencast: each chapter becomes a scene that pans the actual diff and spotlights the lines it references, narrated via the browser's built-in speech synthesis (captions included). No ffmpeg, no API key, no network.
- `Chapter.narration` (falls back to `intent` → `summary`).
- `p` shortcut.
2. `patchstory video`: opt-in MP4, animated
Renders the same scenes into a real shareable `.mp4`. Two engines (`--engine`):
- `hyperframes` (default): generates a HyperFrames composition (HTML + one GSAP timeline) and renders it frame-by-frame in headless Chrome. Title card → one animated scene per chapter (diff reveals line-by-line, referenced lines light up as narrated, sentence-beat captions) → outro. Invoked via `npx`; no npm runtime dep.
- `pan`: fully-local fallback, Chromium screenshot + ffmpeg pan.
TTS (`--tts`): `elevenlabs` (`ELEVENLABS_API_KEY`), `kokoro` (local neural, keyless default), `espeak-ng`/`flite`/`say`, or `none`. ffmpeg/ffprobe are resolved by running candidates (PATH → `/usr/bin` → `PATCHSTORY_FFMPEG`), and the working one is handed to HyperFrames so a broken/shadowing PATH entry can't break the render.
Verification
- `typecheck` clean · `build` clean · 23/23 tests pass.
- `onerror` instant-skip bug.
- `patchstory video` (hyperframes + Kokoro) renders a valid 1920×1080 H.264 + AAC, ~146s MP4 from the 5-chapter demo: title card, 5 animated chapter scenes with live spotlight, outro; sampled frames confirm each.
Notes
- `npx` on first use; `--engine pan` is the offline path.
🤖 Generated with Claude Code