Skip to content

perf(validation): remove YAML parse hot path in permission scope validation#34797

Closed
Copilot wants to merge 4 commits into
mainfrom
copilot/fix-performance-regression-validation
Closed

perf(validation): remove YAML parse hot path in permission scope validation#34797
Copilot wants to merge 4 commits into
mainfrom
copilot/fix-performance-regression-validation

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 25, 2026

Validation benchmark regressed to ~29.7µs/op (+17.9% vs historical). Profiling showed ValidatePermissionScopeNames on the hot path, dominated by repeated scope-list construction and YAML unmarshal work.

  • Hot-path optimization: permission scope-name validation

    • Added a fast path to extract top-level permission keys from common block-style YAML without full yaml.Unmarshal.
    • Kept a strict fallback to YAML parsing for non-standard forms (e.g., flow mappings / complex structures), preserving behavior.
    • Hoisted reusable validation data to package-level cached state:
      • full known scope list
      • meta-key set (all, read-all, write-all, none)
      • preformatted scope preview string for error messages
    • Centralized single-key validation into a helper to keep error semantics unchanged.
  • Behavior-preserving coverage

    • Added focused tests for:
      • quoted permission keys
      • flow-mapping permission syntax (ensures fallback path compatibility)
scopeKeys, parsedFast := extractTopLevelPermissionScopeKeys(content)
if parsedFast {
    for _, scopeKey := range scopeKeys {
        if err := validateSinglePermissionScopeKey(scopeKey); err != nil {
            return err
        }
    }
    return nil
}
// fallback for non-standard YAML
var permsMap map[string]any
if err := yaml.Unmarshal([]byte(content), &permsMap); err != nil {
    return nil
}

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix performance regression in validation by optimizing workflow perf(validation): remove YAML parse hot path in permission scope validation May 25, 2026
Copilot AI requested a review from gh-aw-bot May 25, 2026 23:49
@pelikhan pelikhan marked this pull request as ready for review May 26, 2026 00:27
Copilot AI review requested due to automatic review settings May 26, 2026 00:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Optimizes ValidatePermissionScopeNames by avoiding full YAML unmarshalling on the hot path, while preserving strict behavior via a fallback parse for non-standard YAML forms. This targets a measured validation regression by reducing repeated allocation and YAML decode overhead.

Changes:

  • Added a fast-path extractor to scan block-style YAML and validate only top-level permission scope keys without yaml.Unmarshal.
  • Hoisted reusable scope validation data (scope list, meta-key set, error-preview string) into package-level cached state.
  • Added tests covering quoted permission keys and flow-mapping permission syntax (fallback path).
Show a summary per file
File Description
pkg/workflow/permissions_validation.go Adds cached validation state, a fast-path key extractor, and a helper to validate a single scope key; retains YAML unmarshal fallback for non-standard formats.
pkg/workflow/permissions_scope_validation_test.go Adds regression tests for quoted keys and flow mapping shorthand to ensure compatibility across fast/fallback paths.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 0

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 26, 2026

PR Code Quality Reviewer completed the code quality review.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 26, 2026

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 26, 2026

Design Decision Gate 🏗️ completed the design decision gate check.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 26, 2026

🧪 Test Quality Sentinel completed test quality analysis.

@github-actions
Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 70/100 — Acceptable

Analyzed 2 test case(s) (new rows in an existing table-driven test): 2 design tests, 0 implementation tests, 0 guideline violations.

📊 Metrics & Test Classification (2 tests analyzed)
Metric Value
New/modified tests analyzed 2 (table-driven rows)
✅ Design tests (behavioral contracts) 2 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 0 (0%) — both are valid-input paths
Duplicate test clusters 0
Test inflation detected No (test: +11 lines, production: +127 lines, ratio ≈ 0.09)
🚨 Coding-guideline violations 0

Test Classification Details

Test File Classification Issues Detected
"quoted scope keys are valid" pkg/workflow/permissions_scope_validation_test.go ✅ Design No error-path counterpart
"flow mapping shorthand remains valid" pkg/workflow/permissions_scope_validation_test.go ✅ Design No error-path counterpart

Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 2 test cases — unit (//go:build !integration)
  • 🟨 JavaScript (*.test.cjs, *.test.js): 0 tests
⚠️ Suggestions — Not Blocking (2 note(s))

💡 "quoted scope keys are valid" — no error-path counterpart

Classification: Design test (valid-input path)
Observation: Confirms that "contents": read (quoted key syntax) is accepted after the fast-path refactor. The fast path presumably strips quotes before lookup — but there is no test case verifying that a malformed quoted key (e.g., "contnts": read) is still rejected with the expected typo suggestion.
Suggested addition: Add a wantErr: true row with a quoted-but-misspelled scope name to confirm the error path also works with the new fast-path strip-and-validate logic.


💡 "flow mapping shorthand remains valid" — no error-path counterpart

Classification: Design test (valid-input path)
Observation: Confirms that {contents: read, issues: write} falls back to YAML parsing and is accepted. There is no paired case confirming that {contnts: read} (typo inside a flow mapping) is still rejected.
Suggested addition: Add a wantErr: true row for a flow mapping with an unknown scope key to verify the YAML-fallback path also validates scope names.

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). Both new test cases are design tests that verify the behavioral contract is preserved after the performance refactor. The score is 70/100 due to no error-path coverage in the new cases; this is a minor gap, not a blocker.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

🧪 Test quality analysis by Test Quality Sentinel · sonnet46 1.6M ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 70/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). Both new test cases are design tests verifying behavioral contracts are preserved after the performance refactor.

@github-actions
Copy link
Copy Markdown
Contributor

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (138 new lines in pkg/workflow/, exceeding the default 100-line threshold) but does not have a linked Architecture Decision Record (ADR).

📄 Draft ADR committed: docs/adr/34797-fast-path-yaml-bypass-for-permission-scope-validation.md — review and complete it before merging.

🔒 This PR cannot merge until an ADR is linked in the PR body.

📋 What to do next
  1. Review the draft ADR committed to your branch — it was generated from the PR diff (fast-path YAML bypass + package-level cached validation state + centralized single-key validator)
  2. Complete the missing sections — confirm the alternatives reflect what you actually considered, refine the consequences, and add any context the agent could not infer (e.g., the exact benchmark name and before/after numbers)
  3. Commit the finalized ADR to docs/adr/ on your branch
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-34797: Fast-Path YAML Bypass for Permission Scope-Name Validation

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

❓ Why ADRs Matter

"AI made me procrastinate on key design decisions. Because refactoring was cheap, I could always say 'I'll deal with this later.' Deferring decisions corroded my ability to think clearly."

ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you.

📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs in this repo are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 34797-...md for PR #34797).

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · opus47 5.6M ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /diagnose, /tdd, and /zoom-out. The performance motivation is clear and the refactor is well-structured — commenting with observations rather than requesting changes.

📋 Key Themes & Highlights

Key Themes

  • Missing benchmark (/diagnose): The perf regression was measured, but no BenchmarkValidatePermissionScopeNames was added to prevent it from silently regressing again.
  • Thin fast-path test coverage (/tdd): extractTopLevelPermissionScopeKeys is only exercised indirectly; tab indentation, multi-line scalars (|/>), and mixed-indent inputs are not covered.
  • Silent skip on unexpected indentation (/zoom-out): The indent != baseIndent → continue path correctly handles value continuations, but also silently drops any scope key at an unexpected indent level, suppressing typo detection for that key.

Positive Highlights

  • ✅ Excellent factoring: validateSinglePermissionScopeKey is a clean, reusable helper with clear error semantics
  • ✅ Hoisting allScopes, validMeta, and the preview string to package-level vars is a textbook hot-path fix
  • ✅ The bracket-detection bail-out ({, }, [, ]) is a pragmatic and safe guard for the YAML fallback
  • ✅ The two new test cases target the exact failure modes the fast path could introduce

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.9M

Comments that could not be inline-anchored

pkg/workflow/permissions_scope_validation_test.go:104

[/diagnose] No benchmark added — the PR motivates the change with a specific regression measurement (+17.9%), but there is no BenchmarkValidatePermissionScopeNames to lock in the gain and detect future regressions.

<details>
<summary>💡 Suggested benchmark skeleton</summary>

func BenchmarkValidatePermissionScopeNames(b *testing.B) {
    yaml := `contents: read
issues: write
pull-requests: read
actions: write`
    b.ResetTimer()
    for i := 0; i &lt; b.N; i++ {
        _ = ValidatePe</details>

<details><summary>pkg/workflow/permissions_scope_validation_test.go:52</summary>

**[/tdd]** The new tests exercise `ValidatePermissionScopeNames` end-to-end but leave the fast-path extractor (`extractTopLevelPermissionScopeKeys`) untested for several structural YAML patterns it must handle or correctly reject.

&lt;details&gt;
&lt;summary&gt;💡 Suggested additional cases&lt;/summary&gt;

Missing test scenarios:
- **Tab-indented keys**: YAML allows tabs in some contextsthe fast path counts tabs as whitespace. Does `\t\tcontents: read` produce a correct `baseIndent`?
- **Multi-line scalar v</details>

<details><summary>pkg/workflow/permissions_validation.go:523</summary>

**[/zoom-out]** When `indent != baseIndent`, the fast path silently skips the line rather than bailing out to the YAML fallback. This is the right behaviour for multi-line scalar continuations, but it also silently drops any scope key whose indentation differs from the first keymeaning a typo in such a key would never surface a &quot;Did you mean&quot; error.

&lt;details&gt;
&lt;summary&gt;💡 Why this matters and a safer alternative&lt;/summary&gt;

Consider:
```yaml
contents: read
  pull-rquests: write   # indent=2, …

</details>

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUEST_CHANGES — The fast-path optimisation has two correctness bugs and one misleading error-message defect that need to be fixed before merge.

### Blocking issues
  1. Fast path silently drops nested-indented key-value lines (line 525) — when indent != baseIndent and the line still has a colon, the fast path skips it and returns parsedFast=true. The YAML fallback that would have caught the anomaly never runs, so malformed YAML or genuinely nested map content is silently accepted instead of validated or rejected.

  2. strings.Trim strips unbalanced quotes (line 532) — a key like "write-all (missing closing quote) has its leading " stripped and validates as the valid meta-key write-all, masking the authoring error. Only a single balanced pair should be removed.

  3. "..." always appended to scopes preview (line 27) — minor but produces misleading error messages when the scope list has ≤ 10 entries.

🔎 Code quality review by PR Code Quality Reviewer · sonnet46 1.9M

}
if indent != baseIndent {
// Nested block content (value continuation), not a top-level scope key.
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fast path silently skips deeper-indented key-value lines instead of falling back, diverging from YAML semantics.

💡 Details and fix

When indent != baseIndent, the code does continue. This is fine for plain scalar continuations, but if a deeper-indented line is itself a key: value pair (e.g. from an accidentally-indented permission or a nested map), the fast path silently drops it and returns parsedFast=true. The YAML fallback — which would surface the nested key or return a parse error — never runs.

Example: contents: read\n badscope: write is invalid YAML (the YAML parser would error → ValidatePermissionScopeNames returns nil). The fast path returns ["contents"] and parsedFast=true, silently accepting the document.

Fix: when a deeper-indented line contains a colon, bail out to the YAML parser rather than skipping:

if indent != baseIndent {
    if colonIdx > 0 {
        // Looks like a nested map key — hand off to YAML parser.
        return nil, false
    }
    continue // plain scalar value continuation, safe to skip
}

if key == "" || strings.HasPrefix(key, "-") {
return nil, false
}
key = strings.Trim(key, `"'`)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strings.Trim strips all leading/trailing quote characters rather than one balanced pair, silently normalising malformed keys.

💡 Details and fix

strings.Trim(key, "\"'") removes every leading and trailing " or ' character. A malformed key like ""contents"" becomes contents and validates successfully. A key like "write-all (unbalanced quote, meaning a workflow authoring error) loses its leading quote and is validated as write-all — a valid meta key — masking the typo.

The intent is to unquote YAML keys that were written with explicit quotes, but only a single balanced pair should be removed:

if len(key) >= 2 {
    q := key[0]
    if (q == '"' || q == '\'' ) && key[len(key)-1] == q {
        key = key[1 : len(key)-1]
    }
}

If the quotes are unbalanced, leaving the key unchanged (or returning nil, false) lets the YAML fallback handle it correctly.

}
permissionScopeValidationValidScopesPreview = strings.Join(
permissionScopeValidationAllScopes[:min(10, len(permissionScopeValidationAllScopes))],
", ",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"..." is unconditionally appended to the scopes preview even when all scopes fit within the 10-item cap.

💡 Details and fix

If buildPermissionScopeValidationAllScopes() ever returns ≤ 10 entries — possible in tests or if the scope list is trimmed — the error message claims there are more valid scopes than shown, which is misleading.

// Fix:
if len(permissionScopeValidationAllScopes) > 10 {
    permissionScopeValidationValidScopesPreview = strings.Join(
        permissionScopeValidationAllScopes[:10], ", ",
    ) + "..."
} else {
    permissionScopeValidationValidScopesPreview = strings.Join(
        permissionScopeValidationAllScopes, ", ",
    )
}

@pelikhan pelikhan closed this May 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[performance] Regression in Validation: +17.9% slower

4 participants