Restore KVzip forward after context errors 🤖🤖🤖 #244

Open
fallintoplace wants to merge 3 commits into NVIDIA:main from fallintoplace:fix/kvzip-forward-restore

fallintoplace commented Jun 16, 2026

Summary

  • Restore model.model.forward from an inner finally around the KVzip context yield.
  • Preserve the normal path where the original forward is restored before KVzip's post-yield compression/scoring phase.
  • Merge the branch with the latest NVIDIA:main and drop the extra test file per maintainer feedback.

Root cause

KVzipPress.__call__ monkey-patches model.model.forward before yielding to the caller, but the original method was restored only after a normal return from the with block. If the caller raised inside the context, the outer cleanup removed hooks and reset internal state without restoring forward, leaving the model patched after exit.
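
The control flow described above can be illustrated with a minimal sketch. This is not the actual KVzipPress code: `TinyModel`, `TinyInner`, `patched_forward`, and the `events` log are hypothetical stand-ins, and the "post-yield phase" entry only marks where the compression/scoring step would run. It shows an inner `finally` restoring `forward` whether or not the caller raises, while the normal path still restores it before the post-yield phase.

```python
from contextlib import contextmanager


class TinyInner:
    """Hypothetical stand-in for `model.model`."""

    def forward(self, x):
        return x


class TinyModel:
    """Hypothetical stand-in for the wrapped model."""

    def __init__(self):
        self.model = TinyInner()


# Records the order in which the cleanup steps run (illustration only).
events = []


@contextmanager
def patched_forward(model):
    """Patch `model.model.forward` for the duration of the context."""
    original_forward = model.model.forward

    def wrapped_forward(x):
        # Placeholder for whatever the patched forward does.
        return original_forward(x) + 1

    model.model.forward = wrapped_forward
    try:
        try:
            yield
        finally:
            # Inner finally: runs on both normal exit and caller errors.
            model.model.forward = original_forward
            events.append("forward restored")
        # Reached only when the caller did not raise.
        events.append("post-yield phase")
    finally:
        # Outer cleanup (hooks, internal state) runs in every case.
        events.append("outer cleanup")


model = TinyModel()
try:
    with patched_forward(model):
        assert model.model.forward(1) == 2  # patched inside the context
        raise RuntimeError("caller failed inside the context")
except RuntimeError:
    pass

assert model.model.forward(1) == 1  # original forward is back after the error
```

Without the inner `finally`, the restore line would sit after the `yield` on the normal path only, so the error case above would leave the sketch's `forward` patched after exit.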

Validation

  • make style
  • UV_PYTHON=3.12 uv run pytest tests/presses/test_head_compression.py -k "KVzipPress"

Signed-off-by: Minh Vu <vuhoangminh97@gmail.com>

copy-pr-bot Bot commented Jun 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

SimJeg commented Jun 22, 2026

Collaborator

/ok to test 6f2012f

SimJeg commented Jul 1, 2026

Collaborator

Hi @fallintoplace, thanks for your contribution. Could you merge your branch with main so that we can run the tests? Also, the additional test is not needed for this PR.

2 participants