Skip to content

feat(analyzer): add phase-1 structured skill summaries#211

Open
rodboev wants to merge 4 commits into
NVIDIA:mainfrom
rodboev:pr/structured-skill-role-summary
Open

feat(analyzer): add phase-1 structured skill summaries#211
rodboev wants to merge 4 commits into
NVIDIA:mainfrom
rodboev:pr/structured-skill-role-summary

Conversation

@rodboev

@rodboev rodboev commented Jun 25, 2026

Copy link
Copy Markdown

Summary

SkillSpector currently scans AISOP/AISP bundles as generic directories and never says that a structured workflow format was found or which workflow roles were present. This phase-1 slice teaches recursive discovery and build-context to recognize valid *.aisop.json bundles, then emits one LOW informational summary finding without changing scoring or existing rule behavior.

Refs #130

This follows the phased implementation sketch and AISOP/AISP bundle examples in #130.

Root cause

src/skillspector/multi_skill.py only recognizes root SKILL.md layouts or immediate subdirectories that contain SKILL.md or skill.md, so --recursive misses structured subdirectories that are expressed only through valid AISOP/AISP bundle files. src/skillspector/nodes/build_context.py builds generic scan context and never classifies those bundles, while src/skillspector/nodes/analyzers/__init__.py has no structured-skill analyzer in the registry-driven graph. The bundle parser also needed to follow the issue schema itself, with workflow nodes at user.content.functions and constraint strings in functions.*.constraints, rather than an invented nested aisop.functions shape. The result is simple: valid structured bundles can be scanned, but the report has no structured-skill summary surface at all.

Diff Notes

  • Add src/skillspector/structured_skill.py as the shared detector for valid *.aisop.json AISOP/AISP bundles and their summarized metadata, reading workflow nodes from user.content.functions and AISP resources from user.content.aisp_contract.resources.
  • Bound structured bundle metadata walking in src/skillspector/structured_skill.py and fail closed on over-nested bundles, so adversarial functions or resources trees are ignored instead of crashing build_context() or recursive skill detection.
  • Reuse that detector in src/skillspector/multi_skill.py so --recursive can treat immediate structured subdirectories as skills while preserving root SKILL.md precedence.
  • Populate structured_skill_context in src/skillspector/nodes/build_context.py and src/skillspector/state.py without changing the public report schema.
  • Add src/skillspector/nodes/analyzers/structured_skill_roles.py and register it in src/skillspector/nodes/analyzers/__init__.py so the graph emits one LOW SSR-1 structured summary finding.
  • Add focused coverage in tests/test_multi_skill.py, tests/nodes/test_build_context.py, tests/nodes/analyzers/test_registry.py, tests/nodes/analyzers/test_structured_skill_roles.py, and tests/unit/test_cli.py.

Scope

  • Phase 1 only. This PR detects valid AISOP/AISP bundles and summarizes them with one additive LOW finding.
  • No severity adjustments, no meta_analyzer changes, and no changes to existing static, behavioral, MCP, or semantic findings.
  • No report-schema additions beyond the existing Finding fields.
  • No Pi extension changes in extensions/skillspector.ts.
  • No full role-aware score downgrades, no malformed-bundle warning surface, and no claim of complete AISOP/AISP spec coverage.

Verification

  • python -m pytest tests/test_multi_skill.py tests/nodes/test_build_context.py tests/nodes/analyzers/test_structured_skill_roles.py -q - pass, 37 passed, 1 warning
  • uv run ruff check src/ tests/ - pass
  • uv run ruff format --check src/ tests/ - pass
  • CI Lint & Test (Python 3.12), Lint & Test (Python 3.13), and DCO Check - pending rerun after push

Notes

  • This slice stops at detection and summary. Role-aware severity adjustments remain follow-up work on top of the new structured context.

rodboev added 2 commits June 25, 2026 07:39
Signed-off-by: Rod Boev <rod.boev@gmail.com>
Signed-off-by: Rod Boev <rod.boev@gmail.com>

@rng1995 rng1995 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes — the phase-1 structured-skill summary is well-built and thoroughly tested, but it introduces an unhandled-crash (DoS) path on adversarial input, which matters because the scanner's whole job is to ingest untrusted skills.

Blocking — unbounded recursion → uncaught RecursionError in the core build_context path.
structured_skill.py's walkers (_extract_function_names, _extract_constraint_anchors._walk, _extract_resource_anchors._walk) recurse per nesting level with no depth bound. A crafted *.aisop.json with deeply nested functions/constraints/resources (~2k levels) raises RecursionError. _parse_bundle_path only wraps json.loads in try/except (OSError, json.JSONDecodeError) and then calls _parse_bundle_payload(...) outside that guard, so the error propagates through extract_structured_skill_context() into build_context() — a core node, not an isolated analyzer — and crashes the whole scan. (json.loads itself tolerates the nesting; the pure-Python walk is what blows the stack. Reproduced at depth≈2000.) _is_structured_skill_bundle() in multi_skill.detect_skills has the same exposure during detection.

Suggested fix: bound the nesting depth in the walkers (return/skip past a sane cap) and/or have _parse_bundle_path treat a bundle that raises during parsing as "not a valid bundle" (e.g. also catch RecursionError/ValueError around _parse_bundle_payload) so a malformed bundle fails closed to None instead of crashing the scan. A regression test with a deeply nested bundle asserting the scan completes would lock this in.

Everything else looks good — the two-message contract validation is appropriately defensive, the SSR-1 finding is correctly LOW/informational, and the build_context/detection/registry wiring and tests are solid. Happy to re-review once the recursion is bounded.

Comment thread src/skillspector/structured_skill.py Outdated
rodboev added 2 commits June 27, 2026 14:55
Signed-off-by: Rod Boev <rod.boev@gmail.com>
Signed-off-by: Rod Boev <rod.boev@gmail.com>
@rodboev rodboev force-pushed the pr/structured-skill-role-summary branch from 6efdb55 to 655d925 Compare June 27, 2026 18:56
@rodboev

rodboev commented Jun 27, 2026

Copy link
Copy Markdown
Author

Pushed 655d925. structured_skill.py now bounds nested workflow and resource walking, and _parse_bundle_path() treats an over-nested bundle as invalid instead of letting it crash the scan. The branch also adds focused regressions for both the build_context() path and recursive child-skill detection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants