[ENHANCEMENT] Add Docker-backed Playwright visual smoke testing for webview-ui #515

## Problem (one or two sentences)

The webview UI has substantial Vitest and React Testing Library coverage, but no browser-rendered visual regression coverage. Its unit tests mock `@vscode/webview-ui-toolkit/react`, so they cannot detect layout, styling, theme-variable, or real web-component rendering regressions.

## Context (who is affected and when)

Maintainers and contributors reviewing changes under `webview-ui/**` currently have to catch visual regressions manually. PR #453 exposed a related risk: the mocked toolkit behavior can diverge from the real browser implementation. This infrastructure should nonetheless land independently of that feature PR.

A concrete example of the kind of failure this harness can catch: during a long-running chat session, the webview panel goes completely gray (reported by multiple users in Discord, May–June 2026). The root cause was `increaseViewportBy: { top: 3000, bottom: 1000 }` on the Virtuoso list, which pre-rendered too many off-screen DOM nodes and exhausted Chromium's renderer memory. The fix (PR #153) tightens the buffer to `{ top: 600, bottom: 800 }` and adds `defaultItemHeight` and `computeItemKey`. A visual regression test that mounts `ChatView` with a large synthetic transcript and asserts that the list renders non-empty would have caught this class of regression in CI before it reached users.

## Desired behavior

Add a deliberately small Playwright Component Testing proof of concept based on `main`. It should render one existing, context-free webview component in real Chromium, compare it with a committed screenshot baseline, and run in a pinned Playwright Docker image in CI.

Vitest should remain the primary layer for component behavior. Contributors can add visual cases selectively when a UI change has meaningful visual risk.

## Constraints / preferences

- Start with one screenshot for the existing `ProgressIndicator` component.
- Use the real VS Code toolkit web component rather than the Vitest mock.
- Provide the representative VS Code dark theme custom properties needed by the harness.
- Pin the Playwright package and Docker image versions together.
- Use Docker locally and in CI to make rendering consistent.
- Do not introduce Storybook or a hosted visual-testing vendor for the proof of concept.
- Keep the workflow scoped to relevant `webview-ui/**` changes.
- Defer extension-state, i18n, and React Query providers until a selected screenshot needs them.
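
The version-pinning constraint above could be enforced by a small check that runs before the visual suite. The sketch below assumes the image reference follows the `mcr.microsoft.com/playwright:v<version>-<distro>` tag pattern; the function names and the version numbers in the usage note are illustrative and not taken from the repository.

```typescript
// Hypothetical helpers: verify that a Playwright npm version and a Docker
// image reference point at the same release. Names are examples only.

/** Extracts "1.48.0" from "mcr.microsoft.com/playwright:v1.48.0-jammy". */
function playwrightVersionFromImage(image: string): string | null {
  const match = /:v(\d+\.\d+\.\d+)(?:-[a-z0-9]+)?$/.exec(image);
  return match?.[1] ?? null;
}

/** Returns true only when the package version is an exact pin matching the image tag. */
function isPinnedTogether(packageVersion: string, image: string): boolean {
  // A range such as "^1.48.0" is rejected because it is not a pin.
  if (!/^\d+\.\d+\.\d+$/.test(packageVersion)) return false;
  return playwrightVersionFromImage(image) === packageVersion;
}
```

For example, `isPinnedTogether("1.48.0", "mcr.microsoft.com/playwright:v1.48.0-jammy")` returns `true`, while a caret range or a `latest` tag returns `false`.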

## Request checklist

- [x] I searched existing issues and did not find an issue covering Playwright visual regression testing for the webview UI.
- [x] This request describes a specific testing gap and a bounded proof of concept.

## Acceptance criteria

- [ ] Playwright Component Testing is configured for `webview-ui`.
- [ ] The component test harness loads the real toolkit components and representative VS Code dark theme variables.
- [ ] One stable screenshot test covers the existing `ProgressIndicator` component.
- [ ] The expected screenshot baseline is committed to the repository.
- [ ] Package scripts support running and intentionally updating the baseline through Docker.
- [ ] A GitHub Actions workflow runs the visual test in a pinned official Playwright image.
- [ ] Failed comparisons upload expected, actual, and diff artifacts.
- [ ] Existing webview unit tests, lint, and type checking continue to pass.

## Proposed approach

- Add `@playwright/test` and `@playwright/experimental-ct-react` to `webview-ui`.
- Add a Playwright CT config and minimal browser harness.
- Add a focused VS Code dark-theme CSS variable shim.
- Add a Docker Compose service for local comparison and baseline updates.
- Add one colocated `*.visual.tsx` test for `ProgressIndicator`.
- Add a path-filtered visual-regression workflow for pull requests and merge queue checks.
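
As a rough illustration of the config step above, a Playwright CT config might look like the following. The file name, directories, port, and diff tolerance are all assumptions chosen for the sketch; only the option names come from Playwright itself.

```typescript
// playwright-ct.config.ts (sketch; file name and every value are assumptions)
import { defineConfig, devices } from "@playwright/experimental-ct-react";

export default defineConfig({
  testDir: "./src",
  // Only pick up the colocated visual tests, not the Vitest specs.
  testMatch: "**/*.visual.tsx",
  // Store baselines next to the test file so they are reviewed with the component.
  snapshotPathTemplate: "{testDir}/{testFileDir}/__screenshots__/{testFileName}/{arg}{ext}",
  // Never write baselines implicitly; updates go through an explicit
  // `--update-snapshots` run inside the Docker image.
  updateSnapshots: "none",
  expect: {
    // Freeze animations (for example a progress ring) so the capture is stable.
    toHaveScreenshot: { animations: "disabled", maxDiffPixelRatio: 0.01 },
  },
  use: {
    ctPort: 3100,
    viewport: { width: 400, height: 300 },
  },
  // Chromium only, matching the single-browser scope of the proof of concept.
  projects: [{ name: "chromium", use: { ...devices["Desktop Chrome"] } }],
});
```

Restricting the run to one Chromium project keeps the baseline count at one image per screenshot, which fits the bounded scope described here.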

## Trade-offs / risks

- Screenshot baselines require intentional review and updates when rendering changes.
- Playwright Component Testing is still published under the experimental package name (`@playwright/experimental-ct-react`), so its API may change between releases.
- Baselines are browser and platform sensitive, which is why Docker should be the documented and CI execution path.
- The theme shim can drift from VS Code; the proof of concept should define only representative variables and expand them as needed.
- Components coupled to extension state or i18n will need browser-side wrappers or targeted aliases. Those should be introduced with the first screenshot that requires them rather than speculatively.
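
One way to keep the theme-shim drift risk visible is a check that lists `--vscode-*` custom properties a stylesheet references but the shim does not define. This is a minimal sketch: the shim contents, the colour values, and the function names are placeholders rather than the project's actual shim.

```typescript
// Hypothetical drift check for the theme shim. Variable names follow VS Code's
// `--vscode-*` custom property convention; the values are example placeholders.
const themeShim: Record<string, string> = {
  "--vscode-foreground": "#cccccc",
  "--vscode-editor-background": "#1e1e1e",
  "--vscode-progressBar-background": "#0e70c0",
};

/** Collects every `--vscode-*` custom property referenced through `var(...)`. */
function referencedThemeVars(css: string): string[] {
  const found = new Set<string>();
  const pattern = /var\(\s*(--vscode-[A-Za-z0-9-]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(css)) !== null) {
    const name = match[1];
    if (name) found.add(name);
  }
  return Array.from(found).sort();
}

/** Lists referenced variables that the shim does not define. */
function missingFromShim(css: string, shim: Record<string, string>): string[] {
  return referencedThemeVars(css).filter((name) => !(name in shim));
}
```

Running `missingFromShim` against the CSS of a component under test would show which variables need to be added before its first screenshot is recorded, which matches the "expand as needed" approach.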

## Follow-up: ChatView memory regression test

Once the harness from PR #526 is merged, a `ChatView.visual.tsx` test should be added that:
- Mounts `ChatView` with 500+ synthetic messages
- Uses Playwright's CDP session to take a JS heap snapshot before and after scrolling through the transcript
- Asserts the heap does not grow beyond a threshold (catches the `increaseViewportBy` class of regression)
- Takes a screenshot to confirm the list renders non-empty (catches the gray-screen failure mode)

This test belongs in `webview-ui/src/components/chat/__tests__/` alongside the existing `ChatView.spec.tsx`.
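
The supporting pieces of such a test could be sketched as below. Everything here is an assumption for illustration: the message shape is a stand-in rather than the repository's real message type, the 50% growth budget is arbitrary, and the heap reading uses the CDP `Runtime.getHeapUsage` command after a forced garbage collection as a cheaper substitute for a full heap snapshot.

```typescript
// Hypothetical helpers for the follow-up test; names and shapes are examples.
interface SyntheticMessage {
  ts: number;
  text: string;
}

/** Builds `count` deterministic messages so screenshots and heap numbers are repeatable. */
function makeSyntheticTranscript(count: number, startTs = 1700000000000): SyntheticMessage[] {
  return Array.from({ length: count }, (_, i) => ({
    ts: startTs + i * 1000,
    text: `Synthetic message ${i + 1}: ` + "lorem ipsum ".repeat(20),
  }));
}

/** Minimal view of a CDP session: only the `send` call used below. */
interface CdpSessionLike {
  send(method: string): Promise<unknown>;
}

/** Forces a garbage collection, then reads the used JS heap size in bytes. */
async function usedHeapBytes(client: CdpSessionLike): Promise<number> {
  await client.send("HeapProfiler.collectGarbage");
  const usage = (await client.send("Runtime.getHeapUsage")) as { usedSize: number };
  return usage.usedSize;
}

/** True when heap growth stays within `maxGrowthRatio` of the starting size. */
function heapWithinBudget(beforeBytes: number, afterBytes: number, maxGrowthRatio = 0.5): boolean {
  return afterBytes - beforeBytes <= beforeBytes * maxGrowthRatio;
}
```

In a Playwright test the session would presumably come from `page.context().newCDPSession(page)`, which is available only in Chromium. The test would read the heap once after mounting, scroll through the transcript, read it again, and assert `heapWithinBudget` before taking the screenshot. Mounting `ChatView` itself is likely to need the provider wrappers deferred under the constraints above.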

Related: https://github.com/Zoo-Code-Org/Zoo-Code/pull/453
Related: https://github.com/Zoo-Code-Org/Zoo-Code/pull/153
Related: https://github.com/Zoo-Code-Org/Zoo-Code/pull/526
