preallocate encoder and ripple hot path scratch #12

Open
jl33-ai wants to merge 1 commit into LorenFrankLab:main from jl33-ai:preallocate-encoder-and-ripple-scratch

@jl33-ai jl33-ai commented May 29, 2026

extends the same latency hygiene pass from #9 (decoder) to the encoder and ripple processes.

encoder

  • `send_joint_prob`: `msg.tobytes()` -> `[msg, MPI.BYTE]` so MPI reads the numpy buffer directly instead of a fresh `bytes` copy. on a 9 ntrode rig at ~20 spikes/sec/tetrode that is ~180 fewer `bytes` allocations per second, more during bursts.
  • `Encoder.get_joint_prob`: the per-spike `in_range` bool array (length = current mark count, up to `bufsize=18001`) was allocated fresh on every spike. now a single `_in_range_buf` is allocated once in `__init__` and the loop operates on a `[:mark_idx]` view with `.fill(True)` + `np.logical_and(..., out=)`.
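
a minimal sketch of the `_in_range_buf` pattern described above, for reviewers who want to see the shape of the change without opening the diff. this is not the code from the PR: `MaskScratch`, `in_range_fresh`, `half_width`, and the second scratch array `_tmp_buf` are hypothetical names introduced for illustration, and the range test is an assumed per-dimension window around the incoming mark.

```python
import numpy as np

BUFSIZE = 18001  # upper bound on stored marks, taken from the PR description


def in_range_fresh(marks, mark, half_width):
    """Baseline: builds a new boolean mask (plus temporaries) on every call."""
    in_range = np.ones(marks.shape[0], dtype=bool)
    for dim in range(marks.shape[1]):
        lo = mark[dim] - half_width
        hi = mark[dim] + half_width
        in_range = np.logical_and(
            in_range, (marks[:, dim] > lo) & (marks[:, dim] < hi)
        )
    return in_range


class MaskScratch:
    """Hypothetical holder for scratch arrays allocated once up front."""

    def __init__(self, bufsize=BUFSIZE):
        self._in_range_buf = np.empty(bufsize, dtype=bool)
        # Assumption: a second scratch array for the comparison results.
        self._tmp_buf = np.empty(bufsize, dtype=bool)

    def in_range(self, marks, mark, half_width):
        n = marks.shape[0]
        in_range = self._in_range_buf[:n]  # view, no allocation
        tmp = self._tmp_buf[:n]
        in_range.fill(True)
        for dim in range(marks.shape[1]):
            col = marks[:, dim]
            np.greater(col, mark[dim] - half_width, out=tmp)
            np.logical_and(in_range, tmp, out=in_range)
            np.less(col, mark[dim] + half_width, out=tmp)
            np.logical_and(in_range, tmp, out=in_range)
        return in_range  # only valid until the next call


# Randomized comparison of the two versions.
rng = np.random.default_rng(0)
marks = rng.normal(size=(500, 4))
mark = rng.normal(size=4)

expected = in_range_fresh(marks, mark, 1.0)
scratch = MaskScratch()
got = scratch.in_range(marks, mark, 1.0)
same_mask = np.array_equal(expected, got)
same_count = int(expected.sum()) == int(got.sum())
```

one consequence of this pattern: the returned mask is a view into the shared buffer, so a caller that needs it past the next call has to copy it.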

I sanity checked the rewritten loop against the original on random inputs; same final mask, same count.

ripple

  • `send_ripple`: same `.tobytes()` -> `[msg, MPI.BYTE]` change.
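
to illustrate why the `.tobytes()` -> `[msg, MPI.BYTE]` swap removes an allocation, here is a numpy-only sketch. the message layout is hypothetical (a plain float64 array stands in for whatever dtype the real messages use), and the mpi4py call appears only in a comment, with `dest` and `tag` as placeholders, so the snippet runs without an MPI installation.

```python
import numpy as np

# Hypothetical stand-in for an outgoing message; the real dtype may differ.
msg = np.zeros(8, dtype=np.float64)

# Old path: tobytes() builds a new bytes object holding a full copy.
copied = msg.tobytes()

# New path: a buffer spec such as [msg, MPI.BYTE] lets mpi4py read the
# array's own memory. A uint8 view shows the same bytes without copying.
raw = msg.view(np.uint8)
shares = np.shares_memory(raw, msg)

# Mutating the array is visible through the view but not through the copy.
msg[0] = 1.0
copy_is_stale = copied == bytes(msg.nbytes)   # copy still holds all zeros
view_is_live = raw.tobytes() == msg.tobytes()

# Assumed call shape with mpi4py's buffer-based API (not executed here):
#   comm.Send([msg, MPI.BYTE], dest=dest, tag=tag)
```

because the send reads the live buffer, the array should not be modified until the send has completed.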

deliberately not in this PR

`EnvelopeEstimator.add_new_data` runs at 1500 Hz and has its own per-sample allocations (`b * x` products, `np.sum`, `np.sqrt`, `** 2`). cleaning that up requires switching from "return a fresh array" to "return a persistent buffer that the caller must consume before the next call," which is a real API change. happy to do it as a follow-up PR with a more careful review of the downstream consumers (the per-trode ripple detection loop and the stats updater both touch `env`).
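
a toy sketch of why that switch is an API change rather than a drop-in optimization. the two classes below are hypothetical and do not reproduce the real filter in `EnvelopeEstimator`; they only compute `sqrt(x ** 2)` to contrast the two return contracts.

```python
import numpy as np


class FreshEnvelope:
    """Toy estimator: returns a newly allocated array on every call."""

    def add_new_data(self, x):
        return np.sqrt(x ** 2)


class PersistentEnvelope:
    """Toy estimator: writes into one preallocated buffer and returns it."""

    def __init__(self, n):
        self._env = np.empty(n, dtype=np.float64)

    def add_new_data(self, x):
        np.multiply(x, x, out=self._env)
        np.sqrt(self._env, out=self._env)
        return self._env  # same object every call


a = np.array([1.0, -2.0, 3.0])
b = np.array([-4.0, 5.0, -6.0])

fresh = FreshEnvelope()
fresh_first = fresh.add_new_data(a)
fresh_second = fresh.add_new_data(b)   # fresh_first is untouched

persistent = PersistentEnvelope(3)
pers_first = persistent.add_new_data(a)
pers_second = persistent.add_new_data(b)  # overwrites what pers_first sees

aliased = pers_first is pers_second
```

under the persistent contract, any consumer that keeps `env` across calls would need to copy it first, which is why the downstream consumers need reviewing before making the change.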

no algorithmic change in this PR. all consumers receive identically-shaped data.
