Improve mypy type checking coverage #1

Description

@craigtracey

Background

The current mypy configuration disables many error codes to avoid false positives from polymorphic patterns common in this codebase:

disable_error_code = ["attr-defined", "union-attr", "call-arg", "misc", "override", "arg-type", "assignment", "valid-type", "var-annotated", "return-value", "list-item"]

While this lets CI pass without overwhelming noise, it also masks legitimate type errors that would otherwise surface real bugs.

Root Causes

1. Polymorphic Provider Configurations

Different molecule providers have different configuration schemas (e.g., GithubProviderConfig, GitlabProviderConfig, ArgoProviderConfig), but base classes type all of them as MoleculeProviderConfig. Accessing provider-specific attributes like api_url, registry_type, or url on that base type triggers attr-defined errors.

2. Entity Type Variance

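Methods are declared to return list[Entity], but some implementations return list[GithubRepository] or other specific entity types. list is invariant in its element type, so list[GithubRepository] is not a subtype of list[Entity] even though GithubRepository is a subtype of Entity. This causes list-item and return-value errors.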
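The invariance problem can be reproduced in a few lines. The sketch below uses hypothetical stand-ins for Entity and GithubRepository (their fields are assumptions, not the project's real definitions) and shows that a read-only, covariant type such as Sequence[Entity] accepts a list of the subclass, while list[Entity] does not.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class Entity:  # hypothetical stand-in for the project's Entity
    name: str


@dataclass
class GithubRepository(Entity):  # hypothetical subclass with an assumed field
    stars: int = 0


def fetch_repos() -> list[GithubRepository]:
    return [GithubRepository(name="molecule", stars=3)]


def names(entities: Sequence[Entity]) -> list[str]:
    # Sequence is read-only and covariant, so list[GithubRepository] is accepted.
    return [e.name for e in entities]


def append_generic(entities: list[Entity]) -> None:
    # The reason list is invariant: a plain Entity could be appended to
    # what the caller believes is a list of GithubRepository.
    entities.append(Entity(name="generic"))


repo_names = names(fetch_repos())   # accepted by mypy
# append_generic(fetch_repos())     # rejected by mypy (arg-type)
```

Where callers only read the returned collection, widening the declared return type from list[Entity] to Sequence[Entity] may remove these errors without any suppression.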

3. API Response Unions

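API responses come from external sources and are often typed loosely, as dict[str, Any] or as unions such as dict[str, Any] | None. Indexing a plain dict[str, Any] does not by itself raise union-attr; the error appears when a value is a union and an attribute or method (for example .get()) is accessed without narrowing first.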

4. Legacy Deprecated Code

EntityDefinitionSpec is deprecated but still used, causing valid-type errors.

Proposed Solutions

Phase 1: Add Type Annotations

  • Add # type: ignore[attr-defined] comments to known-safe polymorphic accesses
  • Document why each ignore is safe
  • This makes intentional suppressions explicit at the point of use instead of hiding them behind blanket disabling
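A documented suppression might look like the sketch below. The class fields, the helper name github_api_url, and the stated invariant are assumptions made for illustration; only the # type: ignore[attr-defined] syntax itself is standard mypy.

```python
from dataclasses import dataclass


@dataclass
class MoleculeProviderConfig:  # simplified stand-in for the real base class
    name: str


@dataclass
class GithubProviderConfig(MoleculeProviderConfig):
    api_url: str = "https://api.github.com"  # assumed default, for illustration


def github_api_url(config: MoleculeProviderConfig) -> str:
    # Safe (hypothetical invariant): this helper is only called from the
    # GitHub provider, which always constructs a GithubProviderConfig.
    url: str = config.api_url  # type: ignore[attr-defined]
    return url


result = github_api_url(GithubProviderConfig(name="gh"))
```

mypy's warn_unused_ignores option can flag suppressions that are no longer needed, and the optional ignore-without-code error code reports bare # type: ignore comments, which may help keep these annotations honest over time.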

Phase 2: Improve Type Narrowing

  • Use isinstance() checks before accessing provider-specific attributes
  • Use TypedDict for common API response structures
  • Use cast() where runtime types are known but mypy can't infer
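The three narrowing techniques above can be sketched together. The config classes are simplified stand-ins, the mapping of api_url and url to specific providers is an assumption, and RepoPayload, base_url, and parse_repo are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, TypedDict, cast


@dataclass
class MoleculeProviderConfig:  # simplified stand-in
    name: str


@dataclass
class GithubProviderConfig(MoleculeProviderConfig):
    api_url: str = "https://api.github.com"  # assumed attribute placement


@dataclass
class GitlabProviderConfig(MoleculeProviderConfig):
    url: str = "https://gitlab.com"  # assumed attribute placement


def base_url(config: MoleculeProviderConfig) -> str:
    # isinstance() narrows the type, so no attr-defined error is raised.
    if isinstance(config, GithubProviderConfig):
        return config.api_url
    if isinstance(config, GitlabProviderConfig):
        return config.url
    raise TypeError(f"unsupported provider config: {type(config).__name__}")


class RepoPayload(TypedDict):  # hypothetical shape of an API response
    full_name: str
    private: bool


def parse_repo(raw: dict[str, Any]) -> RepoPayload:
    # cast() only informs the type checker; it performs no runtime validation.
    return cast(RepoPayload, raw)


payload = parse_repo({"full_name": "org/repo", "private": False})
visibility = "private" if payload["private"] else "public"
```

Because cast() is unchecked at runtime, it is probably best reserved for boundaries where the response shape is already validated or well known; isinstance() checks are the safer default.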

Phase 3: Fix Structural Issues

  • Remove deprecated EntityDefinitionSpec usage
  • Consider using protocols or generic types for polymorphic configs
  • Use covariant return types where appropriate
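One way the generic-type option could look is a provider base class parameterized by its config type. This is a sketch under assumptions: MoleculeProvider, ArgoProvider, the endpoint method, and the placeholder URL are hypothetical, and the real class hierarchy may differ.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar


@dataclass
class MoleculeProviderConfig:  # simplified stand-in
    name: str


@dataclass
class ArgoProviderConfig(MoleculeProviderConfig):
    url: str = "https://argo.example.invalid"  # placeholder value


ConfigT = TypeVar("ConfigT", bound=MoleculeProviderConfig)


class MoleculeProvider(Generic[ConfigT]):  # hypothetical base class
    def __init__(self, config: ConfigT) -> None:
        self.config = config


class ArgoProvider(MoleculeProvider[ArgoProviderConfig]):
    def endpoint(self) -> str:
        # self.config is typed as ArgoProviderConfig here, so .url
        # type-checks without isinstance(), cast(), or an ignore comment.
        return f"{self.config.url}/api/v1"


argo_endpoint = ArgoProvider(ArgoProviderConfig(name="argo")).endpoint()
```

Compared with per-access suppressions, this moves the provider-to-config relationship into the class signature, so each subclass states once which config type it expects.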

Migration Strategy

  1. Enable one error code at a time (start with valid-type)
  2. Fix errors for that code or add explicit # type: ignore with justification
  3. Remove that code from the disable_error_code list
  4. Repeat until only necessary suppressions remain
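To decide which code to enable next, it can help to tally errors per code from a mypy run with the suppressions lifted. The helper below is a small sketch that parses mypy's standard text output; count_error_codes is a hypothetical name, and the file paths in the sample are invented for illustration.

```python
import re
from collections import Counter

# Matches lines such as:  path.py:12: error: message  [attr-defined]
_CODE_RE = re.compile(r": error: .*\[([a-z0-9-]+)\]\s*$")


def count_error_codes(mypy_output: str) -> Counter[str]:
    """Count mypy errors per error code in captured mypy output."""
    counts: Counter[str] = Counter()
    for line in mypy_output.splitlines():
        match = _CODE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample output; real paths and messages will differ.
sample = """\
app/providers/github.py:12: error: "MoleculeProviderConfig" has no attribute "api_url"  [attr-defined]
app/providers/gitlab.py:30: error: "MoleculeProviderConfig" has no attribute "url"  [attr-defined]
app/entities.py:8: error: Variable "EntityDefinitionSpec" is not valid as a type  [valid-type]
Found 3 errors in 3 files (checked 10 source files)
"""
counts = count_error_codes(sample)
```

Codes with the fewest errors are natural candidates to enable first. mypy's --enable-error-code and --disable-error-code command-line flags can be used to try a single code locally before editing the shared configuration.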

Success Criteria

  • Most error codes removed from disable_error_code
  • Any remaining disabled codes have documented justification
  • Explicit # type: ignore comments explain why each is safe
  • CI catches legitimate type errors while avoiding false positives

Labels

kind/refactor: code change that has no impact on functionality (e.g. tech debt or refactor)
