fix(variants): RaggedVariants.to_packed()/rc_() on sliced/reordered views#210

Merged: d-laub merged 10 commits into main from fix/ragged-variants-pack-lazy-views on Jun 8, 2026
Conversation

d-laub (Collaborator) commented Jun 8, 2026

Summary

RaggedVariants.to_packed() and rc_() previously crashed on sliced / reversed / fancy-indexed views (non-canonical awkward layouts — IndexedArray/ListArray). This fixes both:

  • Numeric fields (e.g. start) work for free via the upstream seqpro IndexedArray unbox fix (seqpro 0.15.1).
  • Doubly-nested alt/ref: new _decompose_alleles + numba _pack_alleles kernel, gated by _is_canonical_alleles so the canonical hot path is byte-for-byte unchanged; non-canonical views go through the kernel.
  • rc_() on a non-canonical view materializes a contiguous copy then recurses (returns a new object; the sole caller uses the return value).
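The gating and gather described in the bullets above can be illustrated with a small pure-Python sketch. It is not the PR's code: `is_canonical`, `pack_alleles`, and `to_packed` are hypothetical stand-ins for `_is_canonical_alleles`, the numba `_pack_alleles` kernel, and `RaggedVariants.to_packed()`, and plain lists of starts/stops are assumed in place of awkward layouts.

```python
def is_canonical(starts, stops, content_len):
    """True when the lists tile the content buffer in order, with no gaps."""
    if len(starts) == 0:
        return content_len == 0
    contiguous = all(stops[i] == starts[i + 1] for i in range(len(starts) - 1))
    return starts[0] == 0 and stops[-1] == content_len and contiguous


def pack_alleles(outer_starts, outer_stops, inner_starts, inner_stops, data):
    """Gather a doubly-nested ragged byte array into contiguous offsets form.

    Row r holds variants outer_starts[r]:outer_stops[r]; variant v holds the
    allele bytes data[inner_starts[v]:inner_stops[v]].
    """
    outer_offsets = [0]
    inner_offsets = [0]
    out = bytearray()
    for o_start, o_stop in zip(outer_starts, outer_stops):
        for v in range(o_start, o_stop):
            out += data[inner_starts[v]:inner_stops[v]]
            inner_offsets.append(len(out))
        outer_offsets.append(len(inner_offsets) - 1)
    return outer_offsets, inner_offsets, bytes(out)


def _to_offsets(starts, stops):
    # Valid only for canonical lists: offsets are the starts plus the last stop.
    return [0] if len(starts) == 0 else list(starts) + [stops[-1]]


def to_packed(outer_starts, outer_stops, inner_starts, inner_stops, data):
    canonical = is_canonical(
        outer_starts, outer_stops, len(inner_starts)
    ) and is_canonical(inner_starts, inner_stops, len(data))
    if canonical:
        # Fast path: nothing is copied, the data buffer is returned as-is.
        return (
            _to_offsets(outer_starts, outer_stops),
            _to_offsets(inner_starts, inner_stops),
            data,
        )
    # Non-canonical view (reordered or sliced): gather into fresh buffers.
    return pack_alleles(outer_starts, outer_stops, inner_starts, inner_stops, data)


# Alleles A, CG, T, TG; row 0 = [A, CG], row 1 = [T, TG].
data = b"ACGTTG"
inner_starts, inner_stops = [0, 1, 3, 4], [1, 3, 4, 6]

# A reversed view only permutes the outer starts/stops; data is untouched.
packed = to_packed([2, 0], [4, 2], inner_starts, inner_stops, data)
print(packed)  # ([0, 2, 4], [0, 1, 3, 4, 6], b'TTGACG')
```

In this toy model the canonical input returns the original `data` object unchanged, which mirrors the stated goal of leaving the hot path alone. Only reordered or sliced inputs pay for the gather.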

Depends on seqpro >= 0.15.1 (IndexedArray unbox fix) and genoray >= 2.9.2 (relaxes its seqpro<0.15 pin) — both now on PyPI.
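As a rough illustration of the version window these pins imply, the sketch below compares plain dotted version strings. `parse_version` and `satisfies` are hypothetical helpers, not part of seqpro, genoray, or this PR, and they assume purely numeric versions (no pre-release or local tags). A real resolver handles far more cases.

```python
def parse_version(text):
    """Turn '0.15.1' into (0, 15, 1); assumes dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, minimum, below=None):
    """True when minimum <= installed, and installed < below if given."""
    version = parse_version(installed)
    if version < parse_version(minimum):
        return False
    return below is None or version < parse_version(below)


# seqpro: at least 0.15.1, and under the <0.16 cap that genoray 2.9.2 sets.
print(satisfies("0.15.1", "0.15.1", below="0.16"))  # True
print(satisfies("0.15.0", "0.15.1", below="0.16"))  # False
# genoray: at least 2.9.2.
print(satisfies("2.9.2", "2.9.2"))  # True
```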

Test plan

  • tests/dataset/test_flat_variants.py — 19 tests incl. to_packed reverse/fancy/explicit-ListArray, rc_ reverse/fancy/mixed-mask, ploidy=2 reordered, canonical fast-path regression. All green against the published seqpro 0.15.1 / genoray 2.9.2.
  • Local -m "not slow" regression: 618 passed, 0 failed.
  • ruff + pyrefly clean.
  • Canonical hot path (_is_canonical_alleles True for freshly-built arrays) is untouched — guarded by the existing test_rc_matches_awkward* / test_to_packed_matches_awkward* tests.
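The behaviour these tests cover (in-place on canonical arrays, a new object for reordered views) can be sketched with a toy single-level ragged array. `Alleles` is a hypothetical class for illustration only. The real `RaggedVariants` operates on awkward layouts, and its internals may differ.

```python
_COMPLEMENT = bytes.maketrans(b"ACGT", b"TGCA")


class Alleles:
    """Toy ragged byte array: allele i is data[starts[i]:stops[i]]."""

    def __init__(self, starts, stops, data):
        self.starts = list(starts)
        self.stops = list(stops)
        self.data = bytearray(data)

    def is_canonical(self):
        # Canonical: alleles tile the buffer in order, with no gaps.
        expected = 0
        for start, stop in zip(self.starts, self.stops):
            if start != expected:
                return False
            expected = stop
        return expected == len(self.data)

    def to_packed(self):
        # Gather the alleles, in view order, into a fresh contiguous buffer.
        out = bytearray()
        starts, stops = [], []
        for start, stop in zip(self.starts, self.stops):
            starts.append(len(out))
            out += self.data[start:stop]
            stops.append(len(out))
        return Alleles(starts, stops, out)

    def rc_(self):
        if not self.is_canonical():
            # Materialize a contiguous copy, then recurse on it.
            return self.to_packed().rc_()
        for start, stop in zip(self.starts, self.stops):
            segment = self.data[start:stop].translate(_COMPLEMENT)
            self.data[start:stop] = segment[::-1]
        return self

    def tolist(self):
        return [bytes(self.data[s:e]) for s, e in zip(self.starts, self.stops)]


canonical = Alleles([0, 1, 3], [1, 3, 6], b"ACGTTG")  # A, CG, TTG
same = canonical.rc_()
print(same is canonical, same.tolist())  # True [b'T', b'CG', b'CAA']

view = Alleles([3, 0], [6, 1], b"ACGTTG")  # reordered: TTG, A
result = view.rc_()
print(result is view, result.tolist())  # False [b'CAA', b'T']
```

Because the non-canonical branch returns a different object, callers in this sketch must use the return value of `rc_()` rather than rely on in-place mutation, which matches the note in the summary about the sole caller.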

🤖 Generated with Claude Code

d-laub and others added 10 commits June 8, 2026 01:30
to_packed() and rc_() crash on any sliced/reversed/fancy-indexed
RaggedVariants (IndexedArray/ListArray layouts) for both alt/ref and
numeric fields. Design resolves lazy views via numba packing (seqpro
fancy-index for numeric, new _pack_alleles kernel for doubly-nested
alt/ref); no ak.to_packed, canonical hot path unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Numeric-field half is an upstream seqpro unbox()/_extract_list_offsets()
gap on IndexedArray (record-index + field-extract). Fix seqpro first
(project Indexed* in the layout walkers), release, bump gvl pin; gvl
numeric fields then need no special handling. alt/ref doubly-nested
half stays gvl-owned (numba _pack_alleles kernel + generalized
_alt_layout_parts). rc_ returns a new object for reordered input.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two-repo, TDD plan: Part A fixes seqpro unbox()/_extract_list_offsets()
IndexedArray gap (land+release first); Part B bumps the pin and adds a
numba _pack_alleles kernel + layout decomposition for non-canonical
alt/ref, with rc_ delegating to to_packed() for reordered views.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…on-canonical views

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… kernel

Gate the alt/ref branch in RaggedVariants.to_packed() on _is_canonical_alleles:
canonical layouts take the existing fast path unchanged; IndexedArray/ListArray
layouts (produced by fancy-indexing or reversals) fall through to _decompose_alleles
+ _pack_alleles for a correct numba-based gather.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…otation polish

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
seqpro 0.15.1 carries the IndexedArray unbox fix that powers
RaggedVariants.to_packed()/rc_() on sliced/reordered views; genoray 2.9.2
relaxes its seqpro pin to <0.16 so the two co-resolve.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@d-laub d-laub merged commit 5b8dab0 into main Jun 8, 2026
7 checks passed
@d-laub d-laub deleted the fix/ragged-variants-pack-lazy-views branch June 8, 2026 09:10