Skip to content

Pyright-driven cleanup#380

Open
krokicki wants to merge 6 commits into
mainfrom
pyright-cross-platform-compat
Open

Pyright-driven cleanup#380
krokicki wants to merge 6 commits into
mainfrom
pyright-cross-platform-compat

Conversation

@krokicki

@krokicki krokicki commented Jun 4, 2026

Copy link
Copy Markdown
Member

In order to review the changes on the apps-improvements branch in my IDE, I wanted things to pass error-check in Pyright. This turned into a large refactor, including a new cross-platform locking layer that cleans up some of the code while expanding its capabilities.

With the new file locking layer, we don't need to do stuff like this:

try:
    import fcntl
except ImportError:
    fcntl = None  # type: ignore[assignment]

On the other hand, there is a lot of churn.

Base branch: this PR targets apps-improvements, not main. The diff is just the cross-platform / Pyright work.

Summary

  • Add a cross-platform compatibility layer (fileglancer/platform_compat.py) for optional POSIX-only behavior and portalocker-backed file locking.
  • Add Pyright support via pyrightconfig.json and a pixi run pyright-check task, configured with pythonPlatform: "All" so Linux/macOS/Windows are all validated in one run.
  • Make database models, app adapters, worker IPC/S3 paths, Jira handling, and server helpers Pyright-clean across platforms.

The cross-platform compatibility layer

platform_compat.py is the single place that touches platform-only modules (pwd, grp, fcntl/portalocker, Windows ctypes). It exposes:

  • FileLock / FileLockUnavailable — a non-blocking exclusive file lock backed by portalocker, replacing the raw fcntl.flock calls in the job poll loop. Same semantics (one worker wins the poll election per interval), no more try: import fcntl guards.
  • User / group helperseffective_user_home, named_user_home, user_home, username_for_uid, groupname_for_gid, group_names_for_user, current_uid/effective_uid/current_gid, group_ids_for_user. These wrap the pwd/grp/os.get*id calls and degrade gracefully on non-POSIX.
  • process_is_alive(pid) — a portable process-liveness probe (see the bug fix below).

filestore.py, sshkeys.py, user_worker.py, worker_pool.py, and apps/core.py were updated to route through these helpers instead of importing pwd/grp/fcntl directly.

Behavioral fix: local-executor job liveness on Windows

This is the one change that is not just typing. _poll_local_jobs used os.kill(pid, 0) as a liveness probe. On Windows os.kill routes through TerminateProcess, so it would kill the job it was polling, and it doesn't raise ProcessLookupError for a dead PID. process_is_alive() now keeps the os.kill(pid, 0) probe on POSIX and uses a non-destructive OpenProcess(SYNCHRONIZE) + WaitForSingleObject check on Windows.

This bug was dormant because tests/test_poll.py was skipped wholesale on Windows via a module-level pytest.importorskip("fcntl"). Removing the fcntl dependency (now FileLock) un-skipped the file and surfaced it.

SQLAlchemy 2.0 typed ORM

database.py migrates the models from the legacy declarative_base() + Column(...) style to the 2.0 typed style: a DeclarativeBase subclass with Mapped[...] annotations and mapped_column(...). This is what makes the models type-check cleanly; column nullability is now reflected in the types (Mapped[Optional[str]] etc.).

Smaller semantic changes bundled in

These came out of making the code provably correct for the type checker, and are slight behavior changes worth a look during review:

  • auth.py — revoke the session and return None when session_secret_key is unset, instead of hashing None.
  • issues.py — skip JIRA comments that lack created/updated rather than passing None into the required datetime fields (the old path could raise at model validation).
  • server.py — add 400 validation for the JIRA ticket endpoint body; normalize proxied-path URLs with rstrip("/") + HttpUrl to avoid double slashes; rename the second get_external_buckets handler to get_external_buckets_by_fsp (resolves a shadowed function name); tighten a few response-model return annotations.
  • settings.py — only derive jira_browse_url when atlassian_url is set.

Tests

  • tests/test_poll.py — dropped the module-level importorskip("fcntl"), reworked the lock tests onto the shared FileLock abstraction, and added a spawn fallback where fork is unavailable, so the poll suite runs on Windows.
  • tests/test_filestore.pyTestGetUserGroups now patches platform_compat.optional_module instead of the removed filestore.pwd/filestore.grp globals (same assertions).
  • tests/test_worker.py — uses platform_compat.current_uid.

@StephanPreibisch @JaneliaSciComp/fileglancer

krokicki and others added 6 commits June 3, 2026 17:23
Add a shared platform compatibility layer for portalocker-backed file locks and POSIX-only user helpers, wire Pyright into Pixi, and update tests to use the shared lock abstraction.

Co-authored-by: Codex <codex@openai.com>
Co-authored-by: Codex <codex@openai.com>
Behaviorally identical (p is param for that key), just aligns with the
surrounding p.flag/p.type usage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
filestore.py still called pwd/grp directly (via optional_module). Move
that logic behind new platform_compat helpers groupname_for_gid and
group_names_for_user (mirroring the existing username_for_uid), so the
POSIX user/group database is accessed in one place.

Also revert the earlier p.flag change in build_command: param.flag is
intentional there — the `if param.flag is None: continue` guard narrows
it to str for list.append, which p.flag (rebound from the dict) lacks.

Update TestGetUserGroups to patch platform_compat.optional_module
instead of the removed filestore.pwd/grp module globals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
_poll_local_jobs used os.kill(pid, 0) as a liveness probe. On Windows
os.kill routes through TerminateProcess, so it would kill the target
process instead of inspecting it, and a recycled/lingering PID handle
made it succeed for already-exited processes — breaking the two
TestPollLocalJobs exited-process tests in CI.

Add platform_compat.process_is_alive(): POSIX keeps the os.kill(pid, 0)
probe (ProcessLookupError -> dead, PermissionError -> alive); Windows
uses a non-destructive OpenProcess(SYNCHRONIZE) + WaitForSingleObject
check. Collapse the success/PermissionError branches in _poll_local_jobs
into a single is-alive path (behavior unchanged on POSIX).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On Windows the os.name != "posix" early return ran before optional_module
was consulted, so TestGetUserGroups (which patches optional_module with
mock grp/pwd) got an empty set and failed in CI. The os.name guard was
redundant: on real Windows optional_module("grp") already returns None
since grp can't be imported. Drop the guard and rely on that — identical
behavior on real platforms, and the mocked tests now run cross-platform.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@krokicki

krokicki commented Jun 5, 2026

Copy link
Copy Markdown
Member Author

@mkitti I'd appreciate your thoughts on this one.

@krokicki krokicki requested a review from mkitti June 5, 2026 15:09
Base automatically changed from apps-improvements to main June 5, 2026 15:10

@mkitti mkitti left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to use typing, dict[str, Any] is probably something we want to move away from when possible. It would be be better to create a dataclass with an asdict() method. You could also consider using collections.abc.Mapping.

Comment thread fileglancer/apps/pixi.py


def _normalize_task(defn) -> dict:
def _normalize_task(defn: Any) -> dict[str, Any]:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be some point where the dict definition becomes more strict. Could we constrain the value type here?

@mkitti

mkitti commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

@d-v-b suggested looking into TypedDict: https://typing.python.org/en/latest/spec/typeddict.html

@d-v-b

d-v-b commented Jun 5, 2026

Copy link
Copy Markdown

please use typeddicts when you know something about the keys / values of your dicts 🙏

@mkitti mkitti left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread fileglancer/apps/pixi.py


def _convert_task_arg(arg) -> AppParameter:
def _convert_task_arg(arg: Any) -> AppParameter:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _convert_task_arg(arg: Any) -> AppParameter:
def _convert_task_arg(arg: str | Dict[str, Any]) -> AppParameter:

Comment thread fileglancer/database.py


def get_proxied_paths(session: Session, username: str, fsp_name: str = None, path: str = None) -> List[ProxiedPathDB]:
def get_proxied_paths(session: Session, username: str, fsp_name: Optional[str] = None, path: Optional[str] = None) -> List[ProxiedPathDB]:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_proxied_paths(session: Session, username: str, fsp_name: Optional[str] = None, path: Optional[str] = None) -> List[ProxiedPathDB]:
def get_proxied_paths(session: Session, username: str, fsp_name: str | None = None, path: str | None = None) -> List[ProxiedPathDB]:

https://typing.python.org/en/latest/guides/modernizing.html

While Union and Optional are not considered obsolete, using the | (pipe) operator is often more readable.

Comment thread fileglancer/database.py


def get_tickets(session: Session, username: str, fsp_name: str = None, path: str = None) -> List[TicketDB]:
def get_tickets(session: Session, username: str, fsp_name: Optional[str] = None, path: Optional[str] = None) -> List[TicketDB]:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_tickets(session: Session, username: str, fsp_name: Optional[str] = None, path: Optional[str] = None) -> List[TicketDB]:
def get_tickets(session: Session, username: str, fsp_name: str | None = None, path: str | None = None) -> List[TicketDB]:

Comment thread fileglancer/filestore.py
def from_stat(cls, path: str, absolute_path: str,
lstat_result: os.stat_result, stat_result: os.stat_result,
current_user: str = None, fsps: Optional[list] = None,
current_user: Optional[str] = None, fsps: Optional[list] = None,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
current_user: Optional[str] = None, fsps: Optional[list] = None,
current_user: str | None] = None, fsps: list | None = None,

Comment thread fileglancer/filestore.py
Comment on lines 137 to 138
root_path: Optional[str] = None,
user_groups: Optional[set[str]] = None):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
root_path: Optional[str] = None,
user_groups: Optional[set[str]] = None):
root_path: str | None] = None,
user_groups: set[str] | None = None):

Comment thread fileglancer/filestore.py

def get_file_info(self, path: Optional[str] = None, current_user: str = None,
def get_file_info(self, path: Optional[str] = None, current_user: Optional[str] = None,
fsps: Optional[list] = None) -> FileInfo:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fsps: Optional[list] = None) -> FileInfo:
fsps: list | None = None) -> FileInfo:

Comment thread fileglancer/filestore.py


def yield_file_infos_paginated(self, path: Optional[str] = None, current_user: str = None,
def yield_file_infos_paginated(self, path: Optional[str] = None, current_user: Optional[str] = None,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def yield_file_infos_paginated(self, path: Optional[str] = None, current_user: Optional[str] = None,
def yield_file_infos_paginated(self, path: sre | None = None, current_user: str | None = None,

Comment thread fileglancer/filestore.py
return file_infos, has_more, next_cursor, total_count, is_truncated

def yield_file_infos(self, path: Optional[str] = None, current_user: str = None,
def yield_file_infos(self, path: Optional[str] = None, current_user: Optional[str] = None,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def yield_file_infos(self, path: Optional[str] = None, current_user: Optional[str] = None,
def yield_file_infos(self, path: str | None = None, current_user: str | None = None,

Comment thread fileglancer/filestore.py


def stream_file_contents(self, path: str = None, buffer_size: int = DEFAULT_BUFFER_SIZE, file_handle = None) -> Generator[bytes, None, None]:
def stream_file_contents(self, path: Optional[str] = None, buffer_size: int = DEFAULT_BUFFER_SIZE, file_handle = None) -> Generator[bytes, None, None]:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def stream_file_contents(self, path: Optional[str] = None, buffer_size: int = DEFAULT_BUFFER_SIZE, file_handle = None) -> Generator[bytes, None, None]:
def stream_file_contents(self, path: str | None = None, buffer_size: int = DEFAULT_BUFFER_SIZE, file_handle = None) -> Generator[bytes, None, None]:

Comment thread fileglancer/filestore.py
yield chunk

def stream_file_range(self, path: str = None, start: int = 0, end: int = 0, buffer_size: int = DEFAULT_BUFFER_SIZE, file_handle = None) -> Generator[bytes, None, None]:
def stream_file_range(self, path: Optional[str] = None, start: int = 0, end: int = 0, buffer_size: int = DEFAULT_BUFFER_SIZE, file_handle = None) -> Generator[bytes, None, None]:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def stream_file_range(self, path: Optional[str] = None, start: int = 0, end: int = 0, buffer_size: int = DEFAULT_BUFFER_SIZE, file_handle = None) -> Generator[bytes, None, None]:
def stream_file_range(self, path: str | None = None, start: int = 0, end: int = 0, buffer_size: int = DEFAULT_BUFFER_SIZE, file_handle = None) -> Generator[bytes, None, None]:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants