
feat(design-tools): make applying-bitwarden-branding self-contained (build + review)#126

Draft
SaintPatrck wants to merge 23 commits into main from feat/bitwarden-design-tools

@SaintPatrck SaintPatrck commented May 22, 2026

Contributor

🎟️ Tracking

Exploratory. No Jira ticket.

📔 Objective

Upgrade applying-bitwarden-branding from reference-only to self-contained, and
bump bitwarden-design-tools to 0.2.0. The skill used to point the model at
bitwarden.com/brand and hope; now it bundles the canon and both applies and
reviews brand.

  • Bundled canon: nine official lockup/shield/wordmark SVGs (verbatim path
    data) plus bitwarden-tokens.css (palette, 36px radius, Inter). No more
    redrawn shields or guessed hexes.
  • Build + review: applies brand to a deliverable, and reviews one for
    compliance, with severity calibration (canon violation vs brand-silent
    judgment call).
  • Dark mode by default: deliverables follow the device prefers-color-scheme
    with a light/dark toggle; the dark surface derives from Deep Blue.
  • Drift guard: verify-brand-canon.sh checks the bundled palette against
    bitwarden/brand and reports the correct value. It never mutates the bundle.
  • Eval harness: evals/ with mock fixtures, a pre-registered rubric, and a
    deterministic grader for the context-free checks; calibration stays with the agent.
  • Fixes: the 36px radius no longer rounds full-bleed headers; --bw-yellow
    corrected to #FDC700; brand-assets.md folded into logo-usage.md; new
    typography.md reference.

Structure, marketplace, and lint checks pass.

Add plugin manifest, README, and CHANGELOG for a new plugin that applies
Bitwarden's canonical brand identity (palette, Inter, official logo
lockup, 36px radius foundation) to standalone HTML deliverables.

The plugin is explicit about the boundary between canonical brand
guidance (from bitwarden.com/brand) and pragmatic deliverable choices
the brand site is silent on (surface mode, heading scale, code font,
component shapes).

…content, and example

Add the assets, skill content, and example that ship with the plugin.

Assets:

- bitwarden-tokens.css: full palette as CSS custom properties, the 36px
  radius foundation, and Inter on :root. Intentionally lean — no
  component CSS, no dark-surface vars, no gradients. Component shapes
  are not part of brand canon.
- bitwarden-lockup-official.svg: the full official lockup file,
  verbatim from images.ctfassets.net/.../BitwardenLogo.svg. Serves as
  the canonical reference and the source for the derived variants
  below.
- bitwarden-lockup-{horizontal,vertical}-{blue,white}.svg: primary
  horizontal and secondary vertical lockups in both surface-color
  variants.
- bitwarden-{shield,wordmark}-{blue,white}.svg: shield-only and
  wordmark-only assets for chip-scale and composite use.

Every derived asset preserves the path data and coordinate system
from the official lockup verbatim; only the <svg> wrapper and viewBox
crop are local. viewBox crops were computed by walking the path
commands programmatically rather than estimated.
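As an illustration of deriving a viewBox crop by walking path commands, here is a minimal sketch. It is not the script used in this PR: it assumes the path uses only absolute `M`, `L`, `C`, and `Z` commands, and it bounds cubic curves by their control points, which is conservative (the box can be slightly larger than the drawn curve). The function names `path_bbox` and `viewbox` are hypothetical.

```python
import re

# A number (optionally signed, with exponent) or a single command letter.
_TOKEN = re.compile(r"-?\d*\.?\d+(?:[eE][-+]?\d+)?|[A-Za-z]")

# Coordinates consumed per segment by each supported absolute command.
_ARITY = {"M": 2, "L": 2, "C": 6}


def path_bbox(d):
    """Return (min_x, min_y, max_x, max_y) over all points named in the path."""
    tokens = _TOKEN.findall(d)
    xs, ys = [], []
    cmd = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.isalpha():
            if tok not in ("Z", "z") and tok not in _ARITY:
                raise ValueError(f"unsupported path command: {tok}")
            cmd = tok
            i += 1
            continue
        if cmd not in _ARITY:
            raise ValueError("coordinates without a drawing command")
        n = _ARITY[cmd]
        segment = [float(t) for t in tokens[i:i + n]]
        if len(segment) != n:
            raise ValueError("truncated coordinate list")
        xs.extend(segment[0::2])  # even positions are x values
        ys.extend(segment[1::2])  # odd positions are y values
        i += n
        if cmd == "M":
            cmd = "L"  # extra pairs after a moveto are implicit linetos
    if not xs:
        raise ValueError("path contains no coordinates")
    return min(xs), min(ys), max(xs), max(ys)


def viewbox(d):
    """Format the bounding box as an SVG viewBox string: x y width height."""
    x0, y0, x1, y1 = path_bbox(d)
    return f"{x0:g} {y0:g} {x1 - x0:g} {y1 - y0:g}"
```

Because the path data itself is untouched, a crop computed this way only changes the `<svg>` wrapper, which matches the "verbatim path data" constraint described above.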

Skill content:

- SKILL.md with strong, pushy triggering on direct asks ("make this
  look like Bitwarden", "on-brand", "apply our branding") and on any
  standalone HTML deliverable request inside the marketplace where the
  user hasn't specified a different brand. Explicit non-triggers cover
  product UI (bitwarden/clients, web vault, mobile apps), third-party
  brand work, and partner co-branding.
- references/color-palette.md, references/typography.md, and
  references/logo-usage.md. Each is self-contained and explicit about
  the boundary between canonical brand guidance (from
  bitwarden.com/brand) and pragmatic deliverable choices the brand
  site is silent on (surface mode, heading scale, code font, component
  shapes).

Example:

- examples/on-brand-one-pager.html demonstrates the canonical bits
  applied correctly — palette, Inter, shield, 36px radius — with a
  light/dark surface toggle. The dark surface derives its background
  from --bw-deep-blue rather than introducing a new neutral, per
  references/color-palette.md. The shield's fill uses currentColor so
  the same path renders in either canonical color variant (blue or
  white) — both ship in the official lockup, so this is not a recolor.

CHANGELOG and plugin README enumerate the asset catalog.

…lace

Add the new plugin to .claude-plugin/marketplace.json and to the root
README.md plugin catalog table so it's discoverable.

Also add to .cspell.json the domain terms surfaced by the plugin's
content: lockup, lockups, CMYK, wordmark, Segoe, Cantarell, Neue, Menlo,
Consolas, ctfassets, evenodd.
@github-actions

github-actions Bot commented May 22, 2026

Plugin Validation Report — PR #126

Plugin: bitwarden-design-tools (v0.1.0 → v0.2.0)
Scope: Rework of the applying-bitwarden-branding skill plus manifest/changelog/README updates.

✅ Overall: PASS

All three validation passes (plugin structure, skill review, security) completed with no critical or major issues and no must-fix errors. The plugin is structurally sound, the version bump is consistent across all required files, the reworked skill is well-formed, and no credentials are present. A few minor, optional housekeeping suggestions are listed below.


1. Plugin Structure Validation (plugin-validator)

| Area | Result |
| --- | --- |
| plugin.json manifest (name, semver 0.2.0, author, fields) | ✅ Valid |
| Version consistency: plugin.json, marketplace.json, CHANGELOG.md | ✅ All 0.2.0 |
| CHANGELOG.md Keep a Changelog format | ✅ Conforms |
| Directory structure / skill auto-discovery | ✅ Correct |
| Referenced-file integrity (all paths in SKILL.md exist) | ✅ Verified |
| README.md present, no junk files | ✅ Clean |
| Hardcoded credentials scan | ✅ None |
| JSON validity (plugin.json, evals.json, rubric.json, marketplace.json) | ✅ Parse cleanly |

Components: 6 skills (all valid frontmatter), 0 agents (skills-only by design), 0 commands, 0 hooks, 0 MCP servers declared.

No errors. No warnings that block.
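The version-consistency row can be pictured with a small sketch like the one below. It is not the validator's code. The `plugins` list with `name` and `version` keys in marketplace.json is an assumed schema, and the function names are hypothetical. The changelog version is read from the first Keep a Changelog style heading that carries a semver, so an `[Unreleased]` heading is skipped.

```python
import json
import re


def release_versions(plugin_json, marketplace_json, changelog_md, plugin_name):
    """Collect the version each file declares for one plugin."""
    plugin_version = json.loads(plugin_json)["version"]
    # Assumed marketplace schema: {"plugins": [{"name": ..., "version": ...}]}
    entries = json.loads(marketplace_json)["plugins"]
    market_version = next(
        (e["version"] for e in entries if e["name"] == plugin_name), None
    )
    # First heading of the form "## [1.2.3]"; "## [Unreleased]" does not match.
    heading = re.search(r"^## \[(\d+\.\d+\.\d+)\]", changelog_md, flags=re.MULTILINE)
    return {
        "plugin.json": plugin_version,
        "marketplace.json": market_version,
        "CHANGELOG.md": heading.group(1) if heading else None,
    }


def versions_agree(versions):
    """True when every file declares a version and they are all the same."""
    values = list(versions.values())
    return None not in values and len(set(values)) == 1
```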


2. Skill Review — applying-bitwarden-branding (skill-reviewer)

| Criterion | Result |
| --- | --- |
| YAML frontmatter (name, description) | ✅ Valid; name matches directory |
| Description quality (third-person, quoted triggers, proactive clause, explicit "Not for…" exclusion) | ✅ Exemplary (~480 chars, under 500 limit) |
| Body word count (~1,170 words) | ✅ Within 1,000–3,000 target |
| Writing style (imperative/infinitive) | ✅ Consistent |
| Progressive disclosure (lean core; detail in references/, template in examples/, drift-check in scripts/, canon in assets/) | ✅ Model implementation |
| allowed-tools scoping | ✅ Tightly scoped to single script invocation |

Notable strengths: surface-gate guardrail as the first instruction (prevents misuse on product UI); the canon-violation vs. brand-silent-choice calibration (lines 74–77) directly counters review over-flagging; offline-first with optional network drift-check.

Note — corrected false positive: The skill-reviewer initially flagged an "orphaned references/brand-assets.md". This is incorrect and has been excluded. That file was deliberately removed in this PR (folded into references/logo-usage.md, commit 07a136b, recorded under CHANGELOG "Removed"). It does not exist on disk, and SKILL.md does not reference it. No action needed.


3. Security Validation (claude-config-validator)

| Check | Result |
| --- | --- |
| Committed secrets (API keys, tokens, passwords) | ✅ None found |
| Hardcoded credentials in scripts/verify-brand-canon.sh | ✅ None — calls public GitHub contents API over HTTPS, no auth header |
| Hardcoded credentials in evals/grade.py | ✅ None — reads local bundled assets only, no network |
| settings.local.json committed | ✅ N/A (not present) |
| allowed-tools permission scoping (SKILL.md:4) | ✅ Scoped to `Bash(${CLAUDE_SKILL_DIR}/scripts/verify-brand-canon.sh:*)` — not broad `Bash(*)` |
| Dangerous command auto-approvals | ✅ None |
| Overly broad file access | ✅ None |

verify-brand-canon.sh uses set -euo pipefail, never mutates the bundle, fails safe (exit codes 0/1/2 with graceful offline fallback), and performs no destructive operations. The only credential-keyword matches in the plugin are the legitimate product names "Password Manager" / "Secrets Manager." Hex color values are brand tokens, not secrets.
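The detect-and-report behavior described here can be sketched in Python as follows. This is an illustration only, not a port of verify-brand-canon.sh. The report states exit codes 0/1/2 and that offline is handled gracefully; the mapping assumed below (0 in sync, 1 drift found, 2 upstream unavailable or moved) and the function name `check_drift` are assumptions. The inputs are plain dicts of token name to hex value, and the function only reads them.

```python
from typing import Dict, List, Optional, Tuple


def check_drift(
    bundled: Dict[str, str], upstream: Optional[Dict[str, str]]
) -> Tuple[int, List[str]]:
    """Compare bundled tokens to upstream and return (exit_code, report_lines).

    Assumed codes: 0 = in sync, 1 = drift detected, 2 = could not verify.
    Neither input dict is modified; the caller decides what to do with the report.
    """
    if upstream is None:
        return 2, ["upstream unavailable; drift check skipped"]
    report = []
    for name in sorted(bundled):
        if name not in upstream:
            # Assumption: a token missing upstream means the source moved.
            return 2, [f"{name}: not found upstream; source may have moved"]
        if bundled[name].lower() != upstream[name].lower():
            report.append(f"{name}: bundled {bundled[name]}, upstream {upstream[name]}")
    return (1 if report else 0), report
```

Returning the report instead of rewriting the bundle keeps the fix visible: the correct value still has to be applied by the agent or a human in the deliverable, and landed in the bundle through a PR.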


Minor / Optional Suggestions (non-blocking)

These are warnings/housekeeping only — none block merge.

  1. Frontmatter style consistency (SKILL.md:3) — This skill folds triggers into a single combined description, while the five sibling skills split into description + a dedicated when_to_use. Both are valid; optionally split for in-plugin consistency.
  2. Lingering "disputed" note (evals/rubric.json drift_watch) — The --bw-yellow value is now resolved to #fdc700 (per CHANGELOG), but the rubric's drift_watch still describes it as "disputed (#fdc700 vs #ffd700)". Optionally prune now that it's settled. (Intentional Phase-2 drift signal, not a defect.)

Verified facts during this review

  • references/brand-assets.md confirmed deleted (not orphaned); SKILL.md contains no reference to it.
  • scripts/verify-brand-canon.sh is executable (-rwxr-xr-x).
  • ${CLAUDE_SKILL_DIR} is an established repo convention (also used by bitwarden-code-review), correct for a skill-relative script path.

Recommendation: APPROVE. No must-fix issues across structure, skill quality, or security.

@SaintPatrck changed the title from "feat(design-tools): apply Bitwarden brand to HTML deliverables (PoC)" to "feat: add bitwarden-design-tools plugin (PoC)" on May 22, 2026
@SaintPatrck added the ai-review-vnext label (Request a Claude code review using the vNext workflow) on May 22, 2026
@github-actions

github-actions Bot commented May 22, 2026

🤖 Bitwarden Claude Code Review

Overall Assessment: APPROVE

Reviewed the bitwarden-design-tools 0.2.0 change that makes applying-bitwarden-branding self-contained — bundled canon (SVG assets + bitwarden-tokens.css), build-and-review skill guidance, a drift-detection script, and an eval harness. This is documentation, skill content, and developer-facing tooling; no production runtime code or trust-boundary surface is touched. Verified cross-file consistency of the palette/version data and the eval ground truth against the fixtures.

Code Review Details

No blocking or actionable findings.

Verification performed:

  • Version 0.2.0 is consistent across plugin.json, .claude-plugin/marketplace.json, and CHANGELOG.md; changelog entry follows Keep a Changelog format.
  • The --bw-yellow correction to #FDC700 is consistent across bitwarden-tokens.css, references/color-palette.md, examples/on-brand-one-pager.html, and evals/rubric.json.
  • The inline shield path in examples/on-brand-one-pager.html matches assets/bitwarden-shield-blue.svg verbatim — the eval grader's official_lockup_not_redrawn check depends on this.
  • Eval fixtures (inputs/project-atlas-deck.html, inputs/engineering-recap.html) carry no eval-meta leakage and their content matches the documented ground truth (off-brand fonts/radii in the deck; the single #ff7a18 orange fault in the recap).
  • grade.py name-substring dispatch routes correctly (review-tp-deck → true-positive branch, review-fp-recap → deferred); verify-brand-canon.sh grep patterns do not false-match light-teal-highlight against teal-highlight.
  • ${CLAUDE_SKILL_DIR} in allowed-tools matches the established convention used in bitwarden-code-review.
  • The only new dependency is curl for an on-demand, network-failure-tolerant drift script (exit 2 on offline/upstream-moved); no new package-manifest dependencies.
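The false-match concern in the pattern check above (`light-teal-highlight` vs `teal-highlight`) is the usual substring problem. A minimal sketch, assuming the upstream palette is SCSS with `$name: value;` declarations; the hex values in the usage note are placeholders, not Bitwarden colors, and `scss_value` is a hypothetical helper, not a function from the script:

```python
import re
from typing import Optional


def scss_value(scss: str, name: str) -> Optional[str]:
    """Return the value of `$name`, matching the variable name exactly.

    The leading `$` stops a longer name that merely ends with `name` from
    matching, and requiring `:` right after the name stops a longer name that
    merely starts with it.
    """
    pattern = re.compile(r"\$" + re.escape(name) + r"\s*:\s*([^;]+);")
    match = pattern.search(scss)
    return match.group(1).strip() if match else None


def naive_value(scss: str, name: str) -> Optional[str]:
    """Unanchored lookup, shown only to demonstrate the false match."""
    match = re.search(re.escape(name) + r"\s*:\s*([^;]+);", scss)
    return match.group(1).strip() if match else None
```

With `$light-teal-highlight: #aaaaaa;` declared before `$teal-highlight: #bbbbbb;`, the naive lookup for `teal-highlight` returns the first value, while the anchored one returns the second.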

@withinfocus

Contributor

I merged my #125 change -- do you want to lift your skills enhancements from this?

- Merge review/audit guidance and design-lifecycle composition into the skill
- Add review calibration: canon violations vs brand-silent judgment calls
- Fix --bw-yellow to #FDC700 to match brand repo palette.scss (was #FFD700)
- Add scripts/refresh-brand-canon.sh to verify/refresh bundled palette vs source
- Tighten the trigger description per skill-authoring guidance

Integrate main's design suite (#125: content-style-guide, using-figma,
preparing-design-handoff, evolving-design-system-components,
navigating-design-jira-process, plus the bitwarden-designer plugin) and the
pnpm migration with #126's expanded applying-bitwarden-branding skill.

Conflict resolutions:
- applying-bitwarden-branding/SKILL.md: keep #126 (build + review, bundled canon, drift guard)
- color-palette.md: union of main's full reference with WCAG pairings + surface guidance; yellow #FDC700
- plugin.json / marketplace.json / root README: main's suite, design-tools bumped to 0.2.0
- design-tools CHANGELOG: main's [0.1.0] plus new [0.2.0]
- .cspell.json: union of both word lists
- main's other 5 skills, bitwarden-designer, and pnpm migration kept intact

The merged applying-bitwarden-branding skill references logo-usage.md, not
brand-assets.md. Fold the unique repo-path inventory (product lockups, PNG icon
sizes, bare shield) and the trademark note into logo-usage.md, then remove the
now-orphaned brand-assets.md.

- Add Bitwarden capitalization as a fifth brand-canon item (it was enforced in
  the review section but missing from the build checklist)
- Reconcile 36px radius: canonical for primary surfaces; applying it to dense
  internal-tool components is a judgment call, not a hard fail
- Map SCSS palette vars to the bundled CSS custom properties in color-palette.md
- Bind the proactive trigger to a Bitwarden deliverable to reduce over-firing

Format files touched by the merge (README table, color-palette token-name
table) and #126's example one-pager so the repo's lint:prettier check passes.
Verified: pnpm run lint (prettier --check + cspell) is clean.

…DC700)

The example inlines a copy of bitwarden-tokens.css; its --bw-yellow was missed
in the yellow correction. Align it with the canonical token and the changelog.

The description frontmatter already establishes what the skill does and when
not to use it (the trigger decision happens before the body loads). The body's
Quick start and Reviewing sections already structure the build/review modes.

…randing

Replace the removed purpose section with a single runtime stop-gate: bail
early if the target is product UI, third-party, or partner co-branding, rather
than mis-applying Bitwarden branding (notably on a proactive trigger).

Rename refresh-brand-canon.sh -> verify-brand-canon.sh and drop the --refresh
mode that rewrote the bundled tokens in place. Silently modifying a committed
marketplace asset mid-session is the wrong default and hides that landing the
fix needs a marketplace PR. The script now only detects drift and prints the
correct live value, so the agent/human applies it in the deliverable being
branded. Updated SKILL.md, color-palette.md, and CHANGELOG references.

…ools

Add allowed-tools: Bash(${CLAUDE_SKILL_DIR}/scripts/verify-brand-canon.sh:*)
so the skill runs the drift check without a permission prompt. allowed-tools is
additive (pre-auth only), so all other tools remain available under normal
rules. Switch the body invocation to ${CLAUDE_SKILL_DIR} so the command and the
permission pattern resolve to the same path.

Extend allowed-tools with Read(${CLAUDE_SKILL_DIR}/references/*) and
Read(${CLAUDE_SKILL_DIR}/assets/*) so the skill reads its own bundled brand
docs, tokens, and lockup SVGs without prompts. Scoped to the skill's own
directories (not blanket Read); additive, so all other tools follow normal rules.

The example is specific to applying-bitwarden-branding, so it belongs under the
skill (beside assets/, references/, scripts/) rather than at the plugin root.
Update the SKILL.md link from ../../examples/ to examples/.

…g Read

Revert the Read(${CLAUDE_SKILL_DIR}/...) allowed-tools entries. The permission
prompts came from the agent running Bash(ls) to confirm files exist as
due-diligence, not from Read. State that every bundled file is present exactly
as named so the agent references it directly without listing the directory.
Keep only the verify-brand-canon.sh Bash pre-auth.

- Remove CI allusion from the drift-check section (CI wiring is a separate, not-yet-existing task)
- Surface brand-silent choices (surface mode, voice/tone) to the requester when interactive instead of silently defaulting
- Reframe the example from 'See it applied' (an agent cannot view a render) to a readable/adaptable template under References
- Remove the 'Within the design lifecycle' section: a leaf skill enumerating how its sibling skills coordinate around it is a backwards dependency; the orchestrating flow should reference this skill, not the reverse
- Update the unreleased 0.2.0 changelog entry to match (drops the design-suite composition claim)

The drift script's CI references (in its header comment and the changelog
entry) describe wiring that does not exist yet. CI integration is a separate,
future task; the script runs on demand. Aligns with removing the same allusion
from SKILL.md.

(A deterministic brand-compliance linter was prototyped and evaluated here; the
eval showed it over-flags sanctioned off-palette derivations such as deep-blue
dark-surface ramps, so calibration stays with the reviewing agent and the
linter was dropped.)
…anding

Extract the reusable core of the branding eval into a tracked harness under the
skill, following the creating-pull-request/evals precedent. Replaces the
untracked .eval-workspaces scratch (222 files) so future skill changes can be
re-evaluated.

- evals/inputs/: MOCK-ONLY fixtures (no real data) — an off-brand deck and an
  on-brand control seeded with defensible choices (Deep-Blue-derived dark
  surface, data-viz series) plus one genuine fault, to exercise both recall and
  false-positive/over-flagging
- evals/evals.json: four eval definitions (apply deck, apply from scratch,
  review true-positive, review false-positive)
- evals/rubric.json: pre-registered objective assertions and ground truth
- evals/grade.py: deterministic grader for the context-free checks; reads the
  canonical palette and official-logo signatures from the live bundled assets
  (../assets) so it tracks canon changes instead of drifting. Calibration
  dimensions are left to a blind LLM grader by design
- evals/README.md: workflow and the deterministic-vs-judgment split
- gitignore eval run outputs and python bytecode
- fold into the unreleased 0.2.0 changelog (Added)
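To make the deterministic half of the split concrete, here is a minimal sketch of a context-free grader. It is not the code in evals/grade.py. The check name `official_lockup_not_redrawn` comes from this PR's review, but its implementation here (verbatim match of SVG path data) and the `uses_inter` check are assumptions, as are the function names.

```python
import re


def path_signatures(markup):
    """Extract every SVG path `d` attribute value as a verbatim signature."""
    return re.findall(r'\bd="([^"]+)"', markup)


def grade(deliverable_html, official_svg):
    """Run context-free checks and return each result plus the pass fraction."""
    found = set(path_signatures(deliverable_html))
    checks = {
        # Passes only when an official path appears character for character.
        "official_lockup_not_redrawn": any(
            sig in found for sig in path_signatures(official_svg)
        ),
        # Passes when some font-family declaration names Inter.
        "uses_inter": re.search(
            r"font-family\s*:[^;]*\bInter\b", deliverable_html
        ) is not None,
    }
    return {"checks": checks, "score": sum(checks.values()) / len(checks)}
```

Reading the signatures from the bundled SVG at grading time, rather than hardcoding them, is what lets a grader like this follow canon changes. Anything that needs judgment, such as whether an off-palette color is a sanctioned derivation, stays out of these checks.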

Validated: grader scores the off-brand fixture 0.14 (all canon checks fail) and
the on-brand control 1.0; live-asset lockup-signature matching confirmed.

Two guidance corrections to applying-bitwarden-branding from review feedback:

1. Radius on full-bleed surfaces. The 36px foundation was being applied to
   edge-to-edge headers, rounding only the bottom corners against straight page
   edges (looks broken). Clarify that the radius is for surfaces that float with
   space around them; an element that bleeds flush to the page/viewport edge
   should be inset (so all corners float) or left square on the flush edges.

2. Dark mode by default. The skill now ships BOTH light and dark surfaces:
   deliverables follow the device 'prefers-color-scheme' for the initial surface
   and expose a light/dark toggle that overrides it, with the dark surface
   derived from Deep Blue. Surface mode is no longer a per-deliverable question.
   Updated the bundled example to demonstrate the pattern (CSS @media default +
   data-theme override, JS-synced icon/label), and the token-file comment.

Folds into the unreleased 0.2.0 changelog (Changed + Fixed). Example validated:
parses clean, theme wiring present, no inline handlers.

@SaintPatrck changed the title from "feat: add bitwarden-design-tools plugin (PoC)" to "feat(design-tools): make applying-bitwarden-branding self-contained (build + review)" on Jun 22, 2026

The mock input fixtures carried comments and visible text that revealed the
answer key (on/off-brand verdicts, which choices were faults, even a pointer to
rubric.json). The skill under test reads the input verbatim, so those hints would
measure reading the answer key rather than detecting brand issues.

Strip all eval-meta from the fixtures so they are pristine artifacts; ground
truth stays in rubric.json and the eval README, which the skill never sees. Add a
blindness rule to the README. Grader unaffected (it strips comments): off-brand
deck still 0.14, on-brand control still 1.0.
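The grader is unaffected because comments never reach its checks. A minimal sketch of that preprocessing step, assuming regex-based stripping of HTML and CSS comments (the real grade.py may do this differently, and `strip_comments` is a hypothetical name):

```python
import re


def strip_comments(markup):
    """Remove HTML comments and CSS block comments before any check runs."""
    without_html = re.sub(r"<!--.*?-->", "", markup, flags=re.DOTALL)
    return re.sub(r"/\*.*?\*/", "", without_html, flags=re.DOTALL)
```

Stripping protects the grader only. The skill under test reads the raw file, which is why the hints had to be removed from the fixtures themselves.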
The fixture filenames (offbrand-deck.html, onbrand-control.html) leaked the
verdict: the review prompts name the file, so the skill under test saw the answer
in the path. Rename to neutral, content-derived names (project-atlas-deck.html,
engineering-recap.html) and neutralize the experimenter-facing eval names and
subject keys too. Verdict descriptions stay in evals.json/rubric.json/README
(ground truth the skill never reads). Grader routing preserved (apply/review-tp/
review-fp/deck tokens intact); deck still 0.14, recap still 1.0.

Comment on lines -101 to -114
## Composing with other skills

- **`content-style-guide`.** Brand sits alongside content style. When reviewing user-visible
surfaces, walk both: this skill catches color, logo, and capitalization issues; the content
style guide catches voice, tone, sentence case, and accessibility.
- **`using-figma`.** Use `get_variable_defs` to check whether a design's colors are
library-bound and aligned to the brand palette; use `get_libraries` to confirm the right
design library is loaded before claiming a design is on-brand.
- **`preparing-design-handoff`.** Surface brand findings as part of the handoff gate — flag
them as Figma annotations or as open questions in the Epic when something is off-brand at
handoff time. Don't quietly fix.
- **`evolving-design-system-components`.** New patterns must respect the brand palette and the
36px radius system (with the button exception). The Component Library governance review
catches obvious violations, but raise them explicitly when sponsoring a pattern.

Contributor Author

📓 Removed these cross-skill references. This stood out as a prompt smell. A leaf skill like this shouldn't know which workflows compose it. The dependency should run one way. The orchestrator knows about this skill and hands it context. This skill shouldn't reach back up to enumerate its callers or drive its siblings. Whatever process chains this skill should own "what runs next". This skill does its one job and reports done.

@withinfocus withinfocus left a comment

Contributor

Why make this self-contained? Don't you want the skill to get the latest updates online?

Comment thread .cspell.json
"zeroization",
"zeroized"
"zeroized",
"offbrand",

Contributor

⛏️ We alphabetize this.

Comment on lines +22 to +24
"deliverable",
"html",
"dashboard"

Contributor

❓ Are these useful? They come off as really generic to me.

Contributor Author

Probably not. More likely to trigger false-positives than actually help. Will remove.

…iant

Running the eval surfaced that the on-brand control's 'defensible' elements were
not actually canon-compliant: the dark navies were invented neutrals (not derived
from Deep Blue) and a chart series used an off-palette purple, so the skill
correctly flagged them while the rubric had labeled them defensible.

Make them genuinely compliant so the orange CTA hover is the lone planted fault:
derive the dark surface from --bw-deep-blue via color-mix (the skill's own
pattern) and move the chart series to on-palette blue/teal tokens. Update the
rubric and README ground truth to match.
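For readers unfamiliar with what deriving a surface via color-mix computes, the arithmetic for two opaque colors in sRGB is a per-channel weighted average. The sketch below approximates `color-mix(in srgb, A p%, B)`. Browser rounding may differ by one step per channel, the hex inputs in the usage note are placeholders rather than the actual Deep Blue value, and `mix_srgb` is a hypothetical helper.

```python
def mix_srgb(first, second, first_share):
    """Blend two opaque #rrggbb colors; first_share is the weight of `first` (0 to 1)."""

    def channels(hex_color):
        digits = hex_color.lstrip("#")
        return [int(digits[i:i + 2], 16) for i in (0, 2, 4)]

    mixed = [
        # Weighted average per channel, rounded half up to the nearest integer.
        int(a * first_share + b * (1 - first_share) + 0.5)
        for a, b in zip(channels(first), channels(second))
    ]
    return "#{:02x}{:02x}{:02x}".format(*mixed)
```

For example, `mix_srgb("#102040", "#000000", 0.75)` gives `#0c1830`: every channel stays proportional to the base color, so the hue is preserved and no new neutral is introduced.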

Blind re-run confirms the review now flags only the orange (one hard fault),
affirms palette/Inter/shield/radius/derived-surface, and treats the dark-only
mode as a brand-silent judgment call, not a violation.