Skip to content

Record replay audio#252

Open
rgarcia wants to merge 5 commits into
mainfrom
hypeship/replay-audio
Open

Record replay audio#252
rgarcia wants to merge 5 commits into
mainfrom
hypeship/replay-audio

Conversation

@rgarcia
Copy link
Copy Markdown
Contributor

@rgarcia rgarcia commented May 26, 2026

Summary

  • add optional audio capture to ffmpeg replay recordings using the PulseAudio output monitor
  • start a shared PulseAudio null sink in the browser images and keep the monitor active with a silent sink input
  • keep audio aligned through the full recording by avoiding forced audio PTS reset and using faster low-latency x264 encoding so ffmpeg does not fall behind real time
  • add e2e coverage that records browser audio, downloads the MP4, verifies an audio track, checks the audio stream duration reaches the end of the recording, and includes a Zombocom archive sample recording

Tests

  • go test ./lib/recorder -count=1
  • DOCKER_BUILDKIT=1 docker build -f images/chromium-headful/Dockerfile -t onkernel/chromium-headful-test:latest .
  • TESTCONTAINERS_RYUK_DISABLED=true TESTCONTAINERS_HOST_OVERRIDE=127.0.0.1 RECORDING_ZOMBO_OUTPUT_PATH=/home/agent/repos/kernel-images/server/e2e/testdata/replay-audio-zombocom.mp4 go test ./e2e -run 'TestReplayRecording(IncludesAudioTrack|ZombocomArchiveAudio)' -count=1 -v

@rgarcia rgarcia marked this pull request as ready for review May 26, 2026 21:26
@firetiger-agent
Copy link
Copy Markdown

Firetiger deploy monitoring skipped

This PR didn't match the auto-monitor filter configured on your GitHub connection:

Any PR that changes the kernel API. Monitor changes to API endpoints (packages/api/cmd/api/) and Temporal workflows (packages/api/lib/temporal) in the kernel repo

Reason: PR modifies chromium launcher, wrapper, recorder, and config packages rather than the kernel API endpoints (packages/api/cmd/api/) or Temporal workflows (packages/api/lib/temporal) specified in the filter.

To monitor this PR anyway, reply with @firetiger monitor this.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Recording cap shorter than script
    • I made the recording duration configurable per test and increased the Zombocom archive scenario cap to 70 seconds so recording no longer ends before the Playwright flow completes.

Create PR

Or push these changes by commenting:

@cursor push 2f6a9b26ef
Preview (2f6a9b26ef)
diff --git a/server/e2e/e2e_recording_audio_test.go b/server/e2e/e2e_recording_audio_test.go
--- a/server/e2e/e2e_recording_audio_test.go
+++ b/server/e2e/e2e_recording_audio_test.go
@@ -52,7 +52,7 @@
 		return await page.title();
 	`, audioSite.ContainerURL())
 
-	recordReplayAudio(t, ctx, c, playwrightCode, os.Getenv("RECORDING_AUDIO_OUTPUT_PATH"), 0.1)
+	recordReplayAudio(t, ctx, c, playwrightCode, os.Getenv("RECORDING_AUDIO_OUTPUT_PATH"), 35, 0.1)
 }
 
 func TestReplayRecordingZombocomArchiveAudio(t *testing.T) {
@@ -132,16 +132,15 @@
 		return playback;
 	`
 
-	recordReplayAudio(t, ctx, c, playwrightCode, outputPath, 0.01)
+	recordReplayAudio(t, ctx, c, playwrightCode, outputPath, 70, 0.01)
 }
 
-func recordReplayAudio(t *testing.T, ctx context.Context, c *TestContainer, playwrightCode string, outputPath string, minPeakLevel float64) {
+func recordReplayAudio(t *testing.T, ctx context.Context, c *TestContainer, playwrightCode string, outputPath string, maxDuration int, minPeakLevel float64) {
 	t.Helper()
 
 	client, err := c.APIClient()
 	require.NoError(t, err, "failed to create API client")
 
-	maxDuration := 35
 	maxFileSize := 100
 	startResp, err := client.StartRecordingWithResponse(ctx, instanceoapi.StartRecordingJSONRequestBody{
 		MaxDurationInSeconds: &maxDuration,

You can send follow-ups to the cloud agent here.

client, err := c.APIClient()
require.NoError(t, err, "failed to create API client")

maxDuration := 35
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recording cap shorter than script

Medium Severity

recordReplayAudio caps ffmpeg at 35 seconds while the Zombocom Playwright script can run ~50+ seconds (30s selector timeout plus 20s of fixed waits). Recording stops mid-script, so the MP4 may omit most playback and audio checks can fail or pass on noise despite a successful browser run.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit ee7fb17. Configure here.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

Fix All in Cursor

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Pulse socket before sink ready
    • I load module-null-sink before module-native-protocol-unix so the Pulse socket appears only after KernelOutput is created.
  • ✅ Fixed: Keepalive exit kills PulseAudio
    • I replaced wait -n with a keepalive retry loop and wait only on PulseAudio so a pacat exit no longer terminates the daemon.

Create PR

Or push these changes by commenting:

@cursor push 8197b93a61
Preview (8197b93a61)
diff --git a/shared/start-pulseaudio.sh b/shared/start-pulseaudio.sh
--- a/shared/start-pulseaudio.sh
+++ b/shared/start-pulseaudio.sh
@@ -22,16 +22,12 @@
     --daemonize=no \
     --log-target=stderr \
     --exit-idle-time=-1 \
-    --load="module-native-protocol-unix socket=/tmp/pulse/native auth-anonymous=1" \
-    --load="module-null-sink sink_name=KernelOutput rate=48000 channels=2 sink_properties=device.description=KernelOutput" &
+    --load="module-null-sink sink_name=KernelOutput rate=48000 channels=2 sink_properties=device.description=KernelOutput" \
+    --load="module-native-protocol-unix socket=/tmp/pulse/native auth-anonymous=1" &
 
   pulse_pid=$!
-  keepalive_pid=""
 
   cleanup() {
-    if [ -n "$keepalive_pid" ]; then
-      kill "$keepalive_pid" 2>/dev/null || true
-    fi
     kill "$pulse_pid" 2>/dev/null || true
     wait 2>/dev/null || true
   }
@@ -44,10 +40,12 @@
     sleep 0.1
   done
 
-  (
-    pacat --raw --rate=48000 --channels=2 --format=s16le --device=KernelOutput /dev/zero
-  ) &
-  keepalive_pid=$!
+  while kill -0 "$pulse_pid" 2>/dev/null; do
+    pacat --raw --rate=48000 --channels=2 --format=s16le --device=KernelOutput /dev/zero || true
+    if kill -0 "$pulse_pid" 2>/dev/null; then
+      sleep 0.1
+    fi
+  done
 
-  wait -n "$pulse_pid" "$keepalive_pid"
+  wait "$pulse_pid"
   '

You can send follow-ups to the cloud agent here.

Reviewed by Cursor Bugbot for commit f416d9d. Configure here.

) &
keepalive_pid=$!

wait -n "$pulse_pid" "$keepalive_pid"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulse socket before sink ready

Medium Severity

Backgrounding pulseaudio exposes the Unix socket as soon as the protocol module loads, while KernelOutput may still be registering. The container wrapper only waits for the socket file, then starts Chromium with PULSE_SINK=KernelOutput, so the browser can launch before that sink exists and miss early replay audio.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit f416d9d. Configure here.

) &
keepalive_pid=$!

wait -n "$pulse_pid" "$keepalive_pid"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keepalive exit kills PulseAudio

Medium Severity

The script uses wait -n on both pulseaudio and the pacat keepalive, and the EXIT trap always kills pulse_pid. If pacat exits while pulseaudio is still healthy, cleanup terminates the daemon and the whole supervisord program exits, causing an unnecessary audio stack restart.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit f416d9d. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant