Parametrize the Cohere logit_scale fold test with a scale other than 1.0#1348
Open
EphraiemSarabamoun wants to merge 1 commit into
Open
Conversation
The test_embed_and_unembed_weights_differ test guarded the fold assertion behind an if logit_scale == 1.0: pytest.skip(...) branch. The fixture model trl-internal-testing/tiny-CohereForCausalLM ships logit_scale=0.0625, so the guard was dead code and the no-op case was never covered. Parametrize logit_scale over a non-trivial value (0.0625) and 1.0. The non-trivial case sets cfg.logit_scale before process_weights so the fold runs and the assertion that embed.W_E and unembed.weight differ fires. The 1.0 case is kept as a regression guard that asserts the weights stay tied when the fold is a no-op. Closes TransformerLensOrg#1325.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This fixes issue #1325.
The test test_embed_and_unembed_weights_differ in tests/integration/model_bridge/test_cohere_adapter.py guarded its fold assertion behind an inline if logit_scale == 1.0 then pytest.skip branch. The fixture model trl-internal-testing/tiny-CohereForCausalLM ships logit_scale=0.0625, so that guard was dead code for the current model and the no-op case it was meant to skip was never actually reached. The result was that the test claimed to cover the fold while the skip branch only added noise.
This change parametrizes logit_scale over a non-trivial value of 0.0625 and over 1.0. For each value the test boots a fresh bridge, sets cfg.logit_scale before calling process_weights, and lets the fold run with the parametrized value. The fold reads cfg.logit_scale inside preprocess_weights, so setting it before process_weights is what makes the parametrization actually flow into the fold path.
The non-trivial 0.0625 case exercises the fold and asserts that embed.W_E and unembed.weight are no longer identical, which is the behavior the test was always meant to verify. The 1.0 case stays as a regression guard and asserts the two weights remain tied, since the fold is a no-op at unit scale. The inline pytest.skip is removed.
This satisfies the acceptance criteria in the issue. The test now parametrizes logit_scale with a non-1.0 value plus 1.0 as the optional no-op regression guard, the non-1.0 case fires the assertion with no skip, and the inline skip is gone.
Verification. In an isolated Python 3.12 venv with the package installed editable alongside pytest, torch, and transformers, the targeted run pytest tests/integration/model_bridge/test_cohere_adapter.py -k test_embed_and_unembed_weights_differ -x -q passed both parametrized cases (2 passed, 21 deselected). The tiny model downloaded and the fold path ran. black and isort were run against the changed file and report no changes needed.