[ENHANCEMENT] Add Docker-backed Playwright visual smoke testing for webview-ui #515

@edelauna

Problem (one or two sentences)

The webview UI has substantial Vitest and React Testing Library coverage, but no browser-rendered visual regression coverage. Its unit tests mock @vscode/webview-ui-toolkit/react, so they cannot detect layout, styling, theme-variable, or real web-component rendering regressions.

Context (who is affected and when)

Maintainers and contributors reviewing changes under webview-ui/** currently have to catch visual regressions manually. PR #453 exposed the related risk that the mocked toolkit behavior can diverge from the real browser implementation, but this infrastructure should land independently from that feature PR.

A concrete example of the kind of failure this harness can catch: during long-running chat sessions, the webview panel went completely gray (reported by multiple users in Discord, May–June 2026). The root cause was the Virtuoso list setting increaseViewportBy: { top: 3000, bottom: 1000 }, which pre-rendered too many off-screen DOM nodes and exhausted Chromium's renderer memory. The fix (PR #153) tightens the buffer to { top: 600, bottom: 800 } and adds defaultItemHeight and computeItemKey. A visual regression test that mounts ChatView with a large synthetic transcript and asserts that the list renders non-empty would have caught this class of regression in CI before it reached users.
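
To make the buffer change concrete, here is a rough back-of-the-envelope sketch of how many rows each setting keeps mounted. The 800 px viewport and 100 px average row height are assumptions chosen for illustration, not measurements from ChatView, and the helper name is hypothetical.

```typescript
// Rough estimate only: real chat rows vary in height and DOM weight.
interface ViewportBuffer {
  top: number // extra pixels rendered above the visible area
  bottom: number // extra pixels rendered below the visible area
}

// Hypothetical helper: upper-bound estimate of rows kept in the DOM.
function estimateRenderedRows(
  viewportHeightPx: number,
  buffer: ViewportBuffer,
  avgRowHeightPx: number,
): number {
  const renderedPx = viewportHeightPx + buffer.top + buffer.bottom
  return Math.ceil(renderedPx / avgRowHeightPx)
}

// Assumed viewport of 800 px and assumed average row height of 100 px.
const rowsWithOldBuffer = estimateRenderedRows(800, { top: 3000, bottom: 1000 }, 100)
const rowsWithNewBuffer = estimateRenderedRows(800, { top: 600, bottom: 800 }, 100)

console.log({ rowsWithOldBuffer, rowsWithNewBuffer })
```

Under these assumptions the old buffer keeps roughly twice as many rows mounted as the new one. Because each chat row can hold a large subtree, the real difference in DOM nodes may be considerably larger.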

Desired behavior

Add a deliberately small Playwright Component Testing proof of concept, branched from main. It should render one existing, context-free webview component in real Chromium, compare it with a committed screenshot baseline, and run in a pinned Playwright Docker image in CI.

Vitest should remain the primary layer for component behavior. Contributors can add visual cases selectively when a UI change has meaningful visual risk.

Constraints / preferences

  • Start with one screenshot for the existing ProgressIndicator component.
  • Use the real VS Code toolkit web component rather than the Vitest mock.
  • Provide the representative VS Code dark theme custom properties needed by the harness.
  • Pin the Playwright package and Docker image versions together.
  • Use Docker locally and in CI to make rendering consistent.
  • Do not introduce Storybook or a hosted visual-testing vendor for the proof of concept.
  • Keep the workflow scoped to relevant webview-ui/** changes.
  • Defer extension-state, i18n, and React Query providers until a selected screenshot needs them.
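
One way to enforce the "pin the package and image together" constraint is a small guard that CI could run before the visual job. The sketch below is hypothetical: the function names are invented for illustration, and it assumes the image reference follows the mcr.microsoft.com/playwright:v<version>-<distro> tag format.

```typescript
// Hypothetical guard; names and the accepted tag format are assumptions.
const IMAGE_TAG_PATTERN =
  /^mcr\.microsoft\.com\/playwright:v(\d+\.\d+\.\d+)(?:-[a-z0-9]+)?$/

// Extracts "1.49.0" from "mcr.microsoft.com/playwright:v1.49.0-noble".
function imageVersion(image: string): string | null {
  const match = IMAGE_TAG_PATTERN.exec(image)
  return match ? match[1] : null
}

// True only when the package version is exact (no ^ or ~ range)
// and the Docker image tag carries the same version.
function isPinnedTogether(packageVersion: string, image: string): boolean {
  const isExact = /^\d+\.\d+\.\d+$/.test(packageVersion)
  return isExact && imageVersion(image) === packageVersion
}

// Example values only; the real versions are whatever the repository pins.
console.log(isPinnedTogether("1.49.0", "mcr.microsoft.com/playwright:v1.49.0-noble"))
```

A script like this could read the version from webview-ui/package.json and the image reference from the Compose file or workflow. That wiring is left out here because those file layouts are not specified in this issue.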

Request checklist

  • I searched existing issues and did not find an issue covering Playwright visual regression testing for the webview UI.
  • This request describes a specific testing gap and a bounded proof of concept.

Acceptance criteria

  • Playwright Component Testing is configured for webview-ui.
  • The component test harness loads the real toolkit components and representative VS Code dark theme variables.
  • One stable screenshot test covers the existing ProgressIndicator component.
  • The expected screenshot baseline is committed to the repository.
  • Package scripts support running and intentionally updating the baseline through Docker.
  • A GitHub Actions workflow runs the visual test in a pinned official Playwright image.
  • Failed comparisons upload expected, actual, and diff artifacts.
  • Existing webview unit tests, lint, and type checking continue to pass.
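
As a starting point for the first criterion, a Playwright Component Testing config could look roughly like the sketch below. The file name (playwright-ct.config.ts), directories, port, and snapshot path template are assumptions rather than decisions made in this issue.

```typescript
// webview-ui/playwright-ct.config.ts (assumed file name and location)
import { defineConfig, devices } from "@playwright/experimental-ct-react"

export default defineConfig({
  // Assumption: visual tests are colocated with components under src/.
  testDir: "./src",
  testMatch: "**/*.visual.tsx",

  // Assumption: baselines live next to the test, with no platform suffix,
  // because Docker is the only supported execution path.
  snapshotPathTemplate: "{testDir}/{testFileDir}/__screenshots__/{arg}{ext}",

  // Failed comparisons write expected, actual, and diff images here,
  // so the workflow can upload this directory as an artifact.
  outputDir: "./test-results",

  forbidOnly: !!process.env.CI,
  retries: 0,
  reporter: process.env.CI ? "github" : "list",

  use: {
    // Port for the component-test dev server; the value is arbitrary.
    ctPort: 3100,
  },

  // Chromium only, matching the scope of the proof of concept.
  projects: [{ name: "chromium", use: { ...devices["Desktop Chrome"] } }],
})
```

By default the harness entry point is expected under a playwright/ directory next to the config (index.html plus an index script), which is where the real toolkit registration and theme variables would be loaded.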

Proposed approach

  • Add @playwright/test and @playwright/experimental-ct-react to webview-ui.
  • Add a Playwright CT config and minimal browser harness.
  • Add a focused VS Code dark-theme CSS variable shim.
  • Add a Docker Compose service for local comparison and baseline updates.
  • Add one colocated *.visual.tsx test for ProgressIndicator.
  • Add a path-filtered visual-regression workflow for pull requests and merge queue checks.
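
The colocated test from the list above could be as small as the following sketch. It assumes ProgressIndicator is a named export from a sibling file and needs no props or providers; both assumptions should be checked against the actual component.

```tsx
// ProgressIndicator.visual.tsx (assumed file name, colocated with the component)
import { test, expect } from "@playwright/experimental-ct-react"

// Assumption: named export from a sibling module with no required props.
import { ProgressIndicator } from "./ProgressIndicator"

test("ProgressIndicator matches the committed baseline", async ({ mount }) => {
  // mount() renders the component in real Chromium and returns a locator.
  const component = await mount(<ProgressIndicator />)
  await expect(component).toBeVisible()

  // Compares against the committed baseline image. Animations are frozen
  // so a spinning indicator produces a stable screenshot.
  await expect(component).toHaveScreenshot("progress-indicator.png", {
    animations: "disabled",
  })
})
```

Running Playwright with --update-snapshots inside the pinned container is the usual way to regenerate the baseline intentionally.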

Trade-offs / risks

  • Screenshot baselines require intentional review and updates when rendering changes.
  • Playwright Component Testing is still published under the experimental package name.
  • Baselines are browser and platform sensitive, which is why Docker should be the documented and CI execution path.
  • The theme shim can drift from VS Code; the proof of concept should define only representative variables and expand them as needed.
  • Components coupled to extension state or i18n will need browser-side wrappers or targeted aliases. Those should be introduced with the first screenshot that requires them rather than speculatively.
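
To keep the theme shim small and easy to audit for drift, it could be generated from a short map of variables. This is a minimal sketch: the variable list and color values are illustrative approximations of a dark theme, not values copied from VS Code, and the helper name is hypothetical.

```typescript
// Illustrative values only; verify against a real VS Code dark theme.
const darkThemeVars: Record<string, string> = {
  "--vscode-foreground": "#cccccc",
  "--vscode-editor-background": "#1e1e1e",
  "--vscode-font-family": "system-ui, sans-serif",
  "--vscode-progressBar-background": "#0e70c0",
}

// Hypothetical helper: turns the map into a CSS rule for the harness page.
function buildThemeCss(vars: Record<string, string>, selector = ":root"): string {
  const declarations = Object.entries(vars).map(([name, value]) => {
    // Guard against typos so the shim only ever defines --vscode-* variables.
    if (!name.startsWith("--vscode-")) {
      throw new Error(`unexpected variable name: ${name}`)
    }
    return `  ${name}: ${value};`
  })
  return `${selector} {\n${declarations.join("\n")}\n}`
}

console.log(buildThemeCss(darkThemeVars))
```

In the browser harness, the resulting string could be placed in a style element appended to document.head before any component mounts. A plain CSS file would work equally well; the map form simply makes it easier to see which variables the proof of concept defines.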

Follow-up: ChatView memory regression test

Once the harness from PR #526 is merged, a ChatView.visual.tsx test should be added that:

  • Mounts ChatView with 500+ synthetic messages
  • Uses Playwright's CDP session to take a JS heap snapshot before and after scrolling through the transcript
  • Asserts the heap does not grow beyond a threshold (catches the increaseViewportBy class of regression)
  • Takes a screenshot to confirm the list renders non-empty (catches the gray-screen failure mode)

This test belongs in webview-ui/src/components/chat/__tests__/ alongside the existing ChatView.spec.tsx.
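
The heap-threshold assertion can be kept separate from the Playwright plumbing. The sketch below covers only the comparison logic; the type and function names, and the example budget, are assumptions. It also works with plain used-heap byte counts rather than full heap snapshots, which is a lighter substitute for the snapshot approach described above, not the same technique.

```typescript
// Hypothetical types and names; the budget value is an example, not a proposal.
interface HeapSample {
  usedBytes: number // JS heap in use when the sample was taken
}

// Relative growth between two samples, e.g. 0.2 means 20% growth.
function heapGrowthRatio(before: HeapSample, after: HeapSample): number {
  if (before.usedBytes <= 0) {
    throw new Error("baseline heap sample must be positive")
  }
  return (after.usedBytes - before.usedBytes) / before.usedBytes
}

// True when growth stays at or below the allowed ratio.
function isWithinHeapBudget(
  before: HeapSample,
  after: HeapSample,
  maxGrowthRatio: number,
): boolean {
  return heapGrowthRatio(before, after) <= maxGrowthRatio
}

// Example: 50 MB before scrolling, 60 MB after, with a 25% budget.
console.log(isWithinHeapBudget({ usedBytes: 50_000_000 }, { usedBytes: 60_000_000 }, 0.25))
```

In a Chromium-only Playwright test, the samples could come from a CDP session created with page.context().newCDPSession(page), for example by reading usedSize from the Runtime.getHeapUsage command before and after scrolling. Forcing a garbage collection before each sample (HeapProfiler.collectGarbage) would likely make the numbers less noisy, and the threshold itself would need tuning against real runs.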

Related: #453
Related: #153
Related: #526

Labels

enhancement: New feature or request