Skip to content

fix(shared-infra): record skipped files in speckit.manifest.json#2483

Open
eldar702 wants to merge 7 commits into
github:mainfrom
eldar702:fix/2107-speckit-manifest-record-skipped
Open

fix(shared-infra): record skipped files in speckit.manifest.json#2483
eldar702 wants to merge 7 commits into
github:mainfrom
eldar702:fix/2107-speckit-manifest-record-skipped

Conversation

@eldar702
Copy link
Copy Markdown
Contributor

@eldar702 eldar702 commented May 7, 2026

Summary

In install_shared_infra (src/specify_cli/shared_infra.py), the skip branches in both the scripts loop and the templates loop now record each skipped file in speckit.manifest.json, so a fresh-manifest run against an already-populated .specify/ tree no longer writes an empty files field.

Fixes #2107

Problem

When install_shared_infra ran with force=False against a project that already had files under .specify/scripts/ and .specify/templates/ — but a fresh, empty manifest — every iteration hit the if dst.exists() and not force: skipped_files.append(...); continue branch. planned_copies and planned_templates stayed empty, the post-loop record loop had nothing to record, and manifest.save() serialised files: {}. The integration then believed nothing had been installed.

This bites users who delete or lose speckit.manifest.json, who extract .specify/ out-of-band, or who hit a code path where the manifest can't be loaded but the directory tree is intact.

Solution

Call manifest.record_existing(rel_skip) from inside both skip branches, but only when the path is not already tracked:

if dst_path.exists() and not force:
    rel_skip = dst_path.relative_to(project_path).as_posix()
    skipped_files.append(rel_skip)
    if rel_skip not in manifest.files:
        manifest.record_existing(rel_skip)
    continue

The guard matters: record_existing always re-hashes the on-disk content, so without it a customized template would have its manifest hash overwritten with the customized hash, defeating the user-modification detection that integration use relies on (test_use_preserves_modified_templates_unless_forced is the canonical regression test for that flow).

Symmetric change in both the scripts loop and the templates loop.

Test plan

  • New regression test TestSpeckitManifestRecordsSkippedFiles::test_install_shared_infra_records_skipped_files — populates .specify/ via a normal first run, deletes the manifest, runs install_shared_infra again with force=False (every file goes down the skip branch), asserts the saved manifest is non-empty and is a superset of the first run's files. Fails on main, passes after the fix.
  • Existing test_use_preserves_modified_templates_unless_forced continues to pass thanks to the if rel_skip not in manifest.files: guard — customized templates keep their original hash, so integration use still skips overwriting them.
  • pytest full suite: 2787 passed, 34 skipped — zero regressions.

Notes

This change was AI-assisted. The fix was selected after a 3-agent solution-design debate where two designers independently identified the same root cause (skip branch in install_shared_infra failing to record), giving high convergent confidence. A third designer initially proposed a fix in claude/__init__.py, but that targets the integration manifest (claude.manifest.json), not the shared-infrastructure manifest (speckit.manifest.json) the issue describes — convergent analysis correctly localized the right file.

A first attempt at the fix omitted the if rel_skip not in manifest.files: guard and broke test_use_preserves_modified_templates_unless_forced (customized templates were silently re-hashed). Adding the guard fixed the regression while still recovering from a lost-manifest scenario.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a shared-infrastructure tracking bug where install_shared_infra(..., force=False) could skip all existing .specify/scripts/ and .specify/templates/ files without recording them, resulting in speckit.manifest.json being saved with an empty files mapping (issue #2107).

Changes:

  • Record skipped (already-existing) shared scripts/templates into speckit.manifest.json during install_shared_infra when they are not already tracked.
  • Add a regression test ensuring a “lost manifest” reinstall reconstructs a non-empty manifest that includes all previously tracked files.
Show a summary per file
File Description
src/specify_cli/shared_infra.py Records skipped existing shared infra files into the shared manifest during install.
tests/integrations/test_integration_claude.py Adds a regression test covering the “lost manifest + skip branch” scenario.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (1)

src/specify_cli/shared_infra.py:303

  • Same issue as the scripts loop: calling manifest.record_existing(rel_skip) in the skip branch will raise if dst exists but is not a regular file (e.g., a directory). Add an dst.is_file() guard or raise an explicit error so a non-file collision doesn’t fail with an opaque hashing/open error.
                # ``.specify/`` tree does not silently drop it (#2107).
                # Skip if already tracked — preserving the original hash
                # keeps user-modification detection working downstream.
                if rel_skip not in manifest.files:
                    manifest.record_existing(rel_skip)
  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment thread src/specify_cli/shared_infra.py Outdated
Comment thread tests/integrations/test_integration_claude.py Outdated
@eldar702 eldar702 marked this pull request as ready for review May 7, 2026 19:23
@eldar702 eldar702 requested a review from mnriem as a code owner May 7, 2026 19:23
@mnriem

This comment was marked as outdated.

Copy link
Copy Markdown
Collaborator

@mnriem mnriem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address Copilot feedback

eldar702 added a commit to eldar702/spec-kit that referenced this pull request May 10, 2026
Address Copilot review feedback on PR github#2483. The previous fix called
``manifest.record_existing(rel_skip)`` from the skip branch of both
loops in ``install_shared_infra``, which would crash with
``IsADirectoryError`` (or another ``OSError``) if a directory or other
non-regular-file happened to exist at the expected destination path —
since ``record_existing`` opens the file to compute its SHA-256.

Three coordinated fixes:

1. ``IntegrationManifest.record_existing`` now validates its
   precondition: it raises ``ValueError`` if the path is a symlink or
   is not a regular file. The docstring already promised "an
   already-existing file"; this enforces it. The symlink check runs on
   the un-resolved path because ``_validate_rel_path`` calls
   ``resolve()``, which would silently follow the symlink. Mirrors the
   existing ``_ensure_safe_manifest_destination`` precedent in the
   same module.

2. In ``install_shared_infra``'s scripts and templates skip branches,
   guard the ``record_existing`` call with ``dst.is_file()`` and wrap
   it in ``try/except (OSError, ValueError)``. A directory collision,
   permission error, or TOCTOU race no longer aborts the whole
   install — the user gets a per-path warning, the path still
   surfaces in ``skipped_files``, and the rest of the install
   continues.

3. ``_read_manifest_files`` in the regression test no longer falls
   back to ``data.get("_files")`` (Copilot's low-confidence finding):
   the silent fallback could mask a schema regression where the
   public ``files`` key is renamed. It now asserts ``"files" in data``
   and that the value is a dict.

Add two regression tests in ``TestSpeckitManifestRecordsSkippedFiles``
covering the directory-at-destination edge case for both the scripts
loop and the templates loop. Both verify (a) install does not crash,
(b) the non-file path is not recorded in the manifest, and (c) the
path still surfaces in the user-visible warning.

The "shared infrastructure file(s)" warning text is changed to
"path(s)" so it remains accurate when non-file entries appear in the
list.

Refs github#2107
@eldar702
Copy link
Copy Markdown
Contributor Author

Thanks for the review! Pushed 6790654 to address all three Copilot findings.

What changed

1. Hardened IntegrationManifest.record_existing (the choke point)

The function's docstring already promised "an already-existing file", but didn't enforce it. It now raises ValueError if the path is a symlink or is not a regular file — mirroring the existing _ensure_safe_manifest_destination precedent in the same module. The is_symlink() check runs on the un-resolved path because _validate_rel_path calls resolve(), which would otherwise silently follow the symlink and record the target.

2. Guarded both call sites in install_shared_infra (the bug Copilot flagged at lines 277 + 303)

if dst_path.is_file() and rel_skip not in manifest.files:
    try:
        manifest.record_existing(rel_skip)
    except (OSError, ValueError) as exc:
        console.print(
            f"[yellow]⚠[/yellow]  could not record {rel_skip} in manifest: {exc}"
        )

Belt-and-suspenders: the is_file() guard short-circuits the common case (directory collision), and the try/except absorbs the rare TOCTOU race or permission error so a single weird path can't abort the whole install. The path still surfaces via the existing skipped_files warning at the bottom of the function.

3. Tightened the test helper (Copilot's low-confidence finding)

_read_manifest_files no longer falls back to data.get("_files"). It now asserts "files" in data and that the value is a dict, with descriptive failure messages. A future schema regression that renames the public key now fails red instead of being silently masked.

4. New regression tests

Two new methods in TestSpeckitManifestRecordsSkippedFiles plant a directory at the expected destination of a script and a template, then assert (a) install completes, (b) the path is NOT recorded in the manifest, (c) the path still appears in the user-visible warning.

5. Cosmetic

The user-facing warning string "shared infrastructure file(s) already exist""shared infrastructure path(s) already exist", since non-file entries can now legitimately land in skipped_files.

Verification

  • All 1950 integrations tests + the new tests pass: pytest tests/integrations/ → passed
  • Full repo: pytest tests/ → 2789 passed, 34 skipped
  • Adversarial unit-checks confirm record_existing raises ValueError for directory, symlink, and out-of-root path; succeeds for a regular file.

Out of scope (intentionally)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 3

Comment thread src/specify_cli/integrations/manifest.py
Comment thread src/specify_cli/integrations/manifest.py Outdated
Comment thread src/specify_cli/shared_infra.py Outdated
@mnriem
Copy link
Copy Markdown
Collaborator

mnriem commented May 12, 2026

Please address Copilot feedback and resolve conflicts

eldar702 added 3 commits May 15, 2026 19:44
`install_shared_infra` skipped files that already existed on disk
when `force=False`, but the skip branches in both the scripts loop
and the templates loop only appended to `skipped_files` without
calling `manifest.record_existing`. So when the function ran with a
fresh manifest against an already-populated `.specify/` tree (e.g.
after the manifest was deleted, corrupted, or extracted out of band),
every file went down the skip path, `planned_copies` /
`planned_templates` stayed empty, and `manifest.save()` wrote an
empty `files` field — leaving the integration believing nothing was
installed.

Record every skipped file in the manifest, but only when it is not
already tracked. This preserves the original hash for files that
were previously recorded so `check_modified()` (used by
`integration use` to decide whether a user has customized a
template) keeps working correctly.

Add `TestSpeckitManifestRecordsSkippedFiles` in
`tests/integrations/test_integration_claude.py` covering both the
fresh-skip path and the recover-after-lost-manifest path.

Fixes github#2107
Address Copilot review feedback on PR github#2483. The previous fix called
``manifest.record_existing(rel_skip)`` from the skip branch of both
loops in ``install_shared_infra``, which would crash with
``IsADirectoryError`` (or another ``OSError``) if a directory or other
non-regular-file happened to exist at the expected destination path —
since ``record_existing`` opens the file to compute its SHA-256.

Three coordinated fixes:

1. ``IntegrationManifest.record_existing`` now validates its
   precondition: it raises ``ValueError`` if the path is a symlink or
   is not a regular file. The docstring already promised "an
   already-existing file"; this enforces it. The symlink check runs on
   the un-resolved path because ``_validate_rel_path`` calls
   ``resolve()``, which would silently follow the symlink. Mirrors the
   existing ``_ensure_safe_manifest_destination`` precedent in the
   same module.

2. In ``install_shared_infra``'s scripts and templates skip branches,
   guard the ``record_existing`` call with ``dst.is_file()`` and wrap
   it in ``try/except (OSError, ValueError)``. A directory collision,
   permission error, or TOCTOU race no longer aborts the whole
   install — the user gets a per-path warning, the path still
   surfaces in ``skipped_files``, and the rest of the install
   continues.

3. ``_read_manifest_files`` in the regression test no longer falls
   back to ``data.get("_files")`` (Copilot's low-confidence finding):
   the silent fallback could mask a schema regression where the
   public ``files`` key is renamed. It now asserts ``"files" in data``
   and that the value is a dict.

Add two regression tests in ``TestSpeckitManifestRecordsSkippedFiles``
covering the directory-at-destination edge case for both the scripts
loop and the templates loop. Both verify (a) install does not crash,
(b) the non-file path is not recorded in the manifest, and (c) the
path still surfaces in the user-visible warning.

The "shared infrastructure file(s)" warning text is changed to
"path(s)" so it remains accurate when non-file entries appear in the
list.

Refs github#2107
… tests

Address Copilot review (2026-05-11, review id 4266902103):

1. `record_existing` was calling `(self.project_root / rel).is_symlink()`
   BEFORE validating containment. For absolute paths or paths containing
   `..`, this performed a filesystem stat outside the project root before
   `_validate_rel_path()` raised. Add a cheap lexical pre-check that
   delegates to `_validate_rel_path()` for the canonical error messages,
   so the symlink stat only ever runs on paths that are already lexically
   inside the project root.

2. Add focused unit tests in `tests/integrations/test_manifest.py` for
   the symlink and non-regular-file error paths, including:
     - symlink target rejection
     - dangling symlink rejection (caught by the symlink guard before
       the is_file check)
     - directory path rejection (is_file == False)
     - missing-path rejection (is_file == False)
     - absolute-path lexical pre-check
   The Copilot reviewer noted these guards had no focused coverage in
   `test_manifest.py`, only via the `test_integration_claude.py`
   regression test.

3. The third Copilot finding (repeated `dict(self._files)` copies via
   `manifest.files` in the skip branches) is already resolved on this
   branch by using `prior_hashes` — the function-scope snapshot taken at
   the top of `install_shared_infra` — for the membership check, instead
   of `manifest.files`.

AI disclosure: drafted with assistance from Claude (Opus 4.7).
@eldar702 eldar702 force-pushed the fix/2107-speckit-manifest-record-skipped branch from 6790654 to 5a6e25b Compare May 15, 2026 16:48
@eldar702
Copy link
Copy Markdown
Contributor Author

Thanks for the bump @mnriem! Pushed 5a6e25b (rebased on main) addressing both your asks.

1. Rebased onto current main

The conflict was the new _decide_overwrite / _safe_dest_or_bucket / _ensure_or_bucket_dir helpers from #2375 (the refresh-managed refactor). I integrated my skip-branch manifest-recording logic into the new if not write: ... else: skipped_files.append(rel) path of both loops, so the fix still applies even when refresh_managed=True decides not to overwrite a file. mergeStateStatus is now CLEAN.

2. Copilot re-review (review id 4266902103, 2026-05-11)

All three findings addressed:

(a) Lexical pre-check in record_existingmanifest.py. The old order did (self.project_root / rel).is_symlink() before _validate_rel_path(), so an absolute path or one with .. segments triggered a filesystem stat outside the project root before the path was rejected. Added a cheap lexical pre-check up front that delegates to _validate_rel_path for the canonical error messages; the symlink stat now only ever runs on paths that are already lexically inside the project root.

(b) Focused unit tests for the new error casestests/integrations/test_manifest.py. Added TestManifestRecordExistingErrors with five cases:

  • symlink-target rejection
  • dangling-symlink rejection (caught by the symlink guard before the is_file check)
  • directory-path rejection (is_file == False)
  • missing-path rejection (is_file == False)
  • absolute-path lexical pre-check

(c) Avoid repeated dict(self._files) copies in the skip branchesshared_infra.py. Reused prior_hashes — the function-scope snapshot already taken at the top of install_shared_infra for the refresh logic — for the membership check, so manifest.files (which copies on every access) is no longer called in the hot loops. This shares the existing snapshot rather than introducing a parallel cache.

Verification

  • Full test suite: 2925 passed, 35 skipped, 0 failed (43.8s).
  • Identity: eldar702 <eldarshlomi7@gmail.com> on the new commit.

🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 4

Comment thread src/specify_cli/shared_infra.py Outdated
Comment thread src/specify_cli/shared_infra.py Outdated
Comment on lines +169 to +175
# Check ``is_symlink()`` on the un-resolved path because
# ``_validate_rel_path`` resolves the path (which would follow
# the symlink and silently record the target instead).
if (self.project_root / rel).is_symlink():
raise ValueError(
f"Refusing to record symlinked manifest path: {rel}"
)
Comment thread src/specify_cli/integrations/manifest.py
…canonical-path guards

Address Copilot review id 4309888722 (2026-05-18) on PR github#2483:

1. Recovery semantics (shared_infra.py:371, 412) — install_shared_infra
   now passes ``recovered=True`` when re-recording a skipped existing
   file. This flag funnels into a new ``recovered_files`` array in the
   manifest JSON, so a future ``refresh_managed`` run can distinguish
   "hash I produced" from "hash I observed on a file that may be a user
   customization" and avoid silent overwrite without ``--refresh-shared-infra``.
   Schema is purely additive: ``files: dict[str, str]`` is unchanged; the
   new ``recovered_files: list[str]`` is omitted when empty.

2. Symlinked ancestor (manifest.py:172) — ``record_existing`` now walks
   every component of the rel path and rejects any symlinked ancestor,
   not just a symlinked leaf. Catches ``linked_dir/file.txt`` where
   ``linked_dir`` is a symlink, which previously slipped past the leaf-only
   ``is_symlink()`` check and was resolved through by ``_validate_rel_path``.
   Mirrors the component-walk pattern in ``_ensure_safe_manifest_directory``.

3. Misleading "escapes project root" message (manifest.py:168) — paths
   like ``dir/../file.txt`` normalize inside the project, so the old
   message lied about what was wrong. New message: "Manifest paths must
   be canonical; '..' segments are not allowed". Still rejects (canonical
   keys are required so ``check_modified``/``uninstall`` cannot key the
   same file under two paths).

Tests: 7 new test methods across TestManifestRecoveredFiles and
TestRecordExistingNewGuards covering all 4 Copilot findings. Full suite
passes locally.

🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).
@eldar702
Copy link
Copy Markdown
Contributor Author

Thanks @copilot-pull-request-reviewer for the additional pass. Pushed 00710c8 addressing all four findings.

1 & 2. Recovery semantics (shared_infra.py:371, shared_infra.py:412)

Both call sites now pass recovered=True. The flag flows through to a new top-level recovered_files: list[str] in the manifest JSON — additively, so the existing files: dict[str, str] schema is unchanged and the load-validator at line 336 still asserts isinstance(v, str) without ambiguity. Future refresh_managed can call manifest.is_recovered(rel) and decline to treat the recorded hash as a managed baseline until the user opts in with --refresh-shared-infra, defending against silent overwrite of customizations after manifest loss.

I deliberately chose this over hash-validation against the bundled bytes: processed templates undergo resolve_command_refs before being written, so on-disk bytes can never byte-equal the raw bundled bytes, and re-computing the processed bytes at recovery time requires user context that isn't recoverable. The recovered_files channel is honest: "we observed this file, but we cannot vouch it is the managed version."

3. Symlinked-ancestor walk (manifest.py:172)

Replaced the leaf-only is_symlink() check with a component-walk:

_walk = self.project_root
for part in rel.parts:
    _walk = _walk / part
    if _walk.is_symlink():
        raise ValueError(
            f"Refusing to record symlinked manifest path: {rel} "
            f"(symlinked at {_walk.relative_to(self.project_root).as_posix()})"
        )

This mirrors the pattern already in _ensure_safe_manifest_directory (manifest.py:67-72). New test test_rejects_symlinked_ancestor plants linked_dir -> real_dir and asserts record_existing("linked_dir/file.txt") raises before any resolution can follow the symlink.

4. .. rejection message (manifest.py:168)

You offered two options; I took the second — kept the strict rejection but replaced the misleading "escapes project root" message with: "Manifest paths must be canonical; '..' segments are not allowed (got {rel})". Manifest keys must be canonical so check_modified/uninstall don't key the same file under two paths; the old message was correct in intent but wrong in wording for dir/../file.txt-style paths.

Verification

  • Full suite: 2932 passed, 35 skipped, 0 failed.
  • New tests: 7 across TestManifestRecoveredFiles and TestRecordExistingNewGuards.
  • Diff: 3 files, +145/-11 LOC.
  • Identity: eldar702 <eldarshlomi7@gmail.com>.

🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 1

Comment thread src/specify_cli/integrations/manifest.py Outdated
@mnriem
Copy link
Copy Markdown
Collaborator

mnriem commented May 21, 2026

Please address Copilot feedback

Address Copilot review comment id 4309888722 round-5 (2026-05-21) on PR github#2483:

``is_recovered()`` previously checked ``self._recovered_files`` membership
with bare ``Path(rel).as_posix()``, while ``record_existing()`` stores keys
via ``_validate_rel_path(rel, root).relative_to(root).as_posix()``. The two
normalizations disagreed on absolute paths and paths that escape the
project root — ``is_recovered`` would silently return False for inputs that
``record_existing`` would have refused entirely.

The fix routes ``is_recovered`` through the same ``_validate_rel_path``
pipeline; ``ValueError`` from the validator is caught and converted to
False so query semantics stay exception-free (Python ``__contains__``
convention).

Tests: 2 new methods in ``TestManifestRecoveredFiles``:
- ``test_is_recovered_absolute_path_returns_false``
- ``test_is_recovered_escaping_path_returns_false``

🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).
@eldar702
Copy link
Copy Markdown
Contributor Author

Thanks @copilot-pull-request-reviewer / @mnriem. Pushed 25051ff addressing the round-5 finding.

The fix

is_recovered now normalizes its input through the same _validate_rel_path pipeline that record_existing uses for its stored keys — taking Copilot's verbatim option A. ValueError from the validator is caught and converted to False so query semantics stay exception-free (Python __contains__ convention: a not-in-set query never raises).

def is_recovered(self, rel_path: str | Path) -> bool:
    try:
        abs_path = _validate_rel_path(Path(rel_path), self.project_root)
        normalized = abs_path.relative_to(self.project_root).as_posix()
    except ValueError:
        return False
    return normalized in self._recovered_files

2 new tests in TestManifestRecoveredFiles: test_is_recovered_absolute_path_returns_false and test_is_recovered_escaping_path_returns_false.

Verification

  • Manifest tests: 43 passed (7 existing for the recovery feature + 2 new + 34 unrelated)
  • Full suite: 2934 passed, 35 skipped, 0 failed
  • Identity: eldar702 <eldarshlomi7@gmail.com>
  • Diff: 2 files, +35/-3 LOC

Question for @mnriem

This PR has now had 5 review rounds since it was opened on 2026-05-07 — your initial review, then 3 successive Copilot rounds each finding adjacent-but-distinct issues, then today's. Each fix has been small and tests stay green, but Copilot has no "approve" state, only "more findings" or silence. Could you confirm whether your "Please address Copilot feedback" comment is the merge gate, or whether Copilot signoff is itself required? Happy to iterate further if needed — just want to make sure we're converging.

🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).

@mnriem mnriem requested a review from Copilot May 26, 2026 12:14
@mnriem
Copy link
Copy Markdown
Collaborator

mnriem commented May 26, 2026

As a general rule of thumb Copilot feedback must be addressed until it no longer has anything to report. This keeps the corner cases covered. Note that everyone including maintainers have to go through that same cycle. I realize this takes a while and often looks like it just keeps on giving, but overall I think this is the best for the entire codebase

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 2

normalized = abs_path.relative_to(self.project_root).as_posix()
self._files[normalized] = _sha256(abs_path)
if recovered:
self._recovered_files.add(normalized)
Comment on lines +227 to +235
Input is normalized through the same ``_validate_rel_path`` pipeline that
``record_existing`` uses for its stored keys, so the two methods agree
on key format. Absolute paths and paths that escape the project root
return ``False`` (they cannot match the relative POSIX keys we store) —
consistent with Python's membership-predicate convention of not raising
on a not-in-set query.
"""
try:
abs_path = _validate_rel_path(Path(rel_path), self.project_root)
@mnriem
Copy link
Copy Markdown
Collaborator

mnriem commented May 27, 2026

Please address Copilot feedback. If not applicable, please explain why

…..' in is_recovered

Address Copilot Round-7 review comments on PR github#2483:

1. record_existing(recovered=False) and record_file now BOTH discard the
   path from _recovered_files. The marker is meant to flag "we observed
   this file but cannot vouch it's a managed baseline" — once the same
   path is re-recorded as managed (either explicitly or by writing fresh
   bytes), the marker is stale and must clear so refresh_managed and
   future is_recovered queries return the truthful answer.

2. is_recovered now applies the same canonical-key guard as record_existing
   (rejects absolute paths and '..' segments lexically before delegating
   to _validate_rel_path). Such paths can never be stored keys, so the
   query correctly returns False without depending on _validate_rel_path
   semantics that diverged from record_existing's stricter contract.

record_file docstring updated to mention the side-effect on recovered
markers.

Tests: 3 new methods in TestManifestRecoveredFiles covering
record_existing(false) clearing, record_file clearing, and is_recovered
dotdot rejection.
@eldar702
Copy link
Copy Markdown
Contributor Author

Thanks @copilot-pull-request-reviewer / @mnriem. Pushed 1dbf0c2 addressing both Round-7 findings.

Finding A — recovered marker now clears on managed re-record

record_existing(recovered=False) now discards the normalized key from _recovered_files, and record_file does the same. The intent of _recovered_files is "we observed this file but cannot vouch it's a managed baseline" — once the same path is re-recorded as managed (either explicitly or by writing fresh bytes), the marker is stale and must clear. record_file's docstring also gained a one-line note about the side effect.

Finding B — is_recovered now matches record_existing's canonicalization

Added a lexical guard to is_recovered that rejects absolute paths AND .. segments up-front (returns False, no raise — preserving the Python __contains__ convention). This matches record_existing's canonical-key guard exactly, so the docstring's equivalence claim is now true. _validate_rel_path still runs after the lexical guard for defense-in-depth on paths that escape the root after resolution.

Tests

3 new in TestManifestRecoveredFiles:

  • test_record_existing_clears_recovered_when_false
  • test_record_file_clears_recovered
  • test_is_recovered_rejects_dotdot_segment

Verification

  • Manifest tests: pytest tests/integrations/test_manifest.py -v — 46/46 green (including the 3 new)
  • Full suite: 2937 passed, 35 skipped, 0 failed
  • Diff: 2 files, +52/-7 LOC
  • Identity: eldar702 <eldarshlomi7@gmail.com>

🤖 AI disclosure: drafted with Claude (Opus 4.7).

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 1

Comment thread tests/integrations/test_manifest.py Outdated
Comment on lines +358 to +362
# Copilot round-5 finding: passing an absolute path silently returned
# False because the stored keys are relative POSIX strings. Now the
# call normalizes through ``_validate_rel_path`` which raises on
# absolute inputs; we catch and return False so query semantics stay
# exception-free.
@mnriem
Copy link
Copy Markdown
Collaborator

mnriem commented May 27, 2026

Please address Copilot feedback

…al guard

Round 8 — addresses Copilot review comment on tests/integrations/test_manifest.py:362.

After Round-7 (1dbf0c2), is_recovered() rejects absolute paths and '..' segments
up front via a lexical guard, returning False without calling _validate_rel_path
at all. The test comments still described the prior "_validate_rel_path raises;
we catch" code path, which is misleading for readers.

Updated comments in both:
  - test_is_recovered_absolute_path_returns_false (Copilot's exact target)
  - test_is_recovered_escaping_path_returns_false (same comment-class issue;
    fixed preemptively to avoid a Round-9 finding on the same drift)

Pure documentation change. Test assertions and behavior unchanged; all manifest
tests still green.
@eldar702
Copy link
Copy Markdown
Contributor Author

Thanks @copilot-pull-request-reviewer / @mnriem. Pushed 7666030 — pure comment fix on the two is_recovered tests so they describe the actual Round-7 code path (early lexical is_absolute() / '..' in rel.parts guards) rather than the prior "_validate_rel_path raises; we catch" wording. No assertion or behavior change; same suite still green.

Pre-empted the parallel comment in test_is_recovered_escaping_path_returns_false while in the area — same comment-class drift would have been Round-9 material otherwise.

🤖 AI disclosure: drafted with Claude (Opus 4.7).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: speckit install creates manifest with empty files — no skills available

3 participants