preallocate encoder and ripple hot path scratch #12
Open
jl33-ai wants to merge 1 commit into
extends the same latency hygiene pass from the decoder PR to the encoder and ripple processes.

encoder:
- `send_joint_prob`: `msg.tobytes()` -> `[msg, MPI.BYTE]`. on a 9 ntrode rig running ~20 spikes/sec/tetrode that's ~180 `bytes` allocs/sec, more during bursts.
- `Encoder.get_joint_prob`: the per-spike `in_range` bool array (length = current mark count, up to `bufsize=18001`) was allocated fresh on every call. now a single `_in_range_buf` is allocated once in `__init__` and we operate on a `[:mark_idx]` view with `.fill(True)` + `np.logical_and(..., out=)`. semantics preserved, verified against the original loop on random inputs.

ripple:
- `send_ripple`: same `.tobytes()` -> `[msg, MPI.BYTE]` change.

didn't touch `EnvelopeEstimator.add_new_data` even though it has its own per-LFP-sample allocations (`b*x` products, sum, sqrt). that one needs to switch from returning fresh arrays to returning persistent buffers, which is a meaningful API change worth its own PR with its own review.

no algorithmic change. all consumers of the rewritten paths still receive identically-shaped data.
extends the same latency hygiene pass from #9 (decoder) to the encoder and ripple processes.
encoder
I sanity checked the rewritten loop against the original on random inputs; same final mask, same count.
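for reviewers who want to see the shape of the `get_joint_prob` change without opening the diff, here is a minimal sketch of the pattern, not the actual encoder code. `in_range_fresh`, `InRangeScratch`, `half_width` and the extra `_cmp_buf` / `_diff_buf` scratch arrays are hypothetical names for illustration; only `_in_range_buf`, the `[:mark_idx]` view, `.fill(True)` and `np.logical_and(..., out=)` come from this PR. it assumes the mask is a per-dimension box test on a float `(bufsize, n_features)` mark array.

```python
import numpy as np


def in_range_fresh(marks, mark_idx, new_mark, half_width):
    """Baseline (assumed shape of the old loop): new arrays on every call."""
    in_range = np.ones(mark_idx, dtype=bool)
    for dim in range(marks.shape[1]):
        in_range = np.logical_and(
            in_range, np.abs(marks[:mark_idx, dim] - new_mark[dim]) < half_width
        )
    return in_range


class InRangeScratch:
    """Hypothetical holder for scratch buffers allocated once up front."""

    def __init__(self, bufsize):
        self._in_range_buf = np.empty(bufsize, dtype=bool)
        self._cmp_buf = np.empty(bufsize, dtype=bool)         # illustration only
        self._diff_buf = np.empty(bufsize, dtype=np.float64)  # illustration only

    def compute(self, marks, mark_idx, new_mark, half_width):
        # views into the persistent buffers; slicing does not copy
        in_range = self._in_range_buf[:mark_idx]
        cmp = self._cmp_buf[:mark_idx]
        diff = self._diff_buf[:mark_idx]

        in_range.fill(True)
        for dim in range(marks.shape[1]):
            np.subtract(marks[:mark_idx, dim], new_mark[dim], out=diff)
            np.abs(diff, out=diff)
            np.less(diff, half_width, out=cmp)
            np.logical_and(in_range, cmp, out=in_range)
        # valid only until the next compute() call overwrites the buffer
        return in_range
```

the returned mask is a view into `_in_range_buf`, so a caller that needs it past the next spike would have to copy it. the equivalence check mentioned above amounts to comparing the two functions with `np.array_equal` on random `marks` and `new_mark` values.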
ripple
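the `.tobytes()` -> `[msg, MPI.BYTE]` change in `send_ripple` (same as `send_joint_prob`) removes one copy per message. the sketch below shows only the numpy side of that, so it runs without MPI: `.tobytes()` builds a new `bytes` object each call, while a buffer view of the same array shares its memory. `payload_by_copy`, `payload_by_view` and the 4-element float message are hypothetical stand-ins, not the real message layout.

```python
import numpy as np


def payload_by_copy(msg):
    """Old path: allocates a new bytes object holding a copy of msg."""
    return msg.tobytes()


def payload_by_view(msg):
    """Zero-copy byte view of a contiguous 1-D array (no new data buffer)."""
    return msg.view(np.uint8)


msg = np.zeros(4, dtype=np.float64)  # hypothetical message contents

copied = payload_by_copy(msg)
viewed = payload_by_view(msg)

msg[0] = 1.0  # mutate after building both payloads

# the copy is a snapshot; the view tracks the array
snapshot = np.frombuffer(copied, dtype=np.float64)
print(snapshot[0], viewed.view(np.float64)[0])

# with mpi4py, this PR hands the array itself to the send call as
# [msg, MPI.BYTE] rather than msg.tobytes(), so no intermediate bytes
# object is created per message.
```

one consequence of sending from the array directly: the array has to stay unmodified until the send has completed, which a snapshot made by `.tobytes()` did not require.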
deliberately not in this PR
`EnvelopeEstimator.add_new_data` runs at 1500 Hz and has its own per-sample allocations (`b * x` products, `np.sum`, `np.sqrt`, `** 2`). cleaning that up requires switching from "return a fresh array" to "return a persistent buffer that the caller must consume before the next call," which is a real API change. happy to do it as a follow-up PR with a more careful review of the downstream consumers (the per-trode ripple detection loop and the stats updater both touch `env`).
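to make the API concern concrete, here is a rough sketch of the two return contracts. this is not `EnvelopeEstimator`: `FreshEnvelope`, `PersistentEnvelope` and the one-pole smoother of `x ** 2` with weight `alpha` are hypothetical and deliberately simpler than the real filter. the only point is what happens to a caller that holds on to the returned `env`.

```python
import numpy as np


class FreshEnvelope:
    """Hypothetical: current contract, a new array is returned per call."""

    def __init__(self, n_channels, alpha=0.1):
        self._alpha = alpha
        self._power = np.zeros(n_channels)

    def add_new_data(self, x):
        # each operator here allocates a temporary
        self._power = (1.0 - self._alpha) * self._power + self._alpha * x ** 2
        return np.sqrt(self._power)


class PersistentEnvelope:
    """Hypothetical: proposed contract, the same buffer is returned each call."""

    def __init__(self, n_channels, alpha=0.1):
        self._alpha = alpha
        self._power = np.zeros(n_channels)
        self._sq = np.empty(n_channels)
        self._env = np.empty(n_channels)

    def add_new_data(self, x):
        np.square(x, out=self._sq)
        self._sq *= self._alpha
        self._power *= 1.0 - self._alpha
        self._power += self._sq
        np.sqrt(self._power, out=self._env)
        # caller must consume (or copy) this before the next call
        return self._env


x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([4.0, 5.0, 6.0])

est = PersistentEnvelope(n_channels=3)
first = est.add_new_data(x1)
second = est.add_new_data(x2)
print(first is second)  # the earlier result has been overwritten in place
```

a consumer that stores `env` across samples would need an explicit `.copy()` under the persistent contract, which is why the ripple detection loop and the stats updater need reviewing before that change.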
no algorithmic change in this PR. all consumers receive identically-shaped data.