Skip to content

Parametrize the Cohere logit_scale fold test with a scale other than 1.0#1348

Open
EphraiemSarabamoun wants to merge 1 commit into
TransformerLensOrg:mainfrom
EphraiemSarabamoun:fix/1325-cohere-logit-scale-fold-test
Open

Parametrize the Cohere logit_scale fold test with a scale other than 1.0#1348
EphraiemSarabamoun wants to merge 1 commit into
TransformerLensOrg:mainfrom
EphraiemSarabamoun:fix/1325-cohere-logit-scale-fold-test

Conversation

@EphraiemSarabamoun
Copy link
Copy Markdown

This fixes issue #1325.

The test test_embed_and_unembed_weights_differ in tests/integration/model_bridge/test_cohere_adapter.py guarded its fold assertion behind an inline if logit_scale == 1.0 then pytest.skip branch. The fixture model trl-internal-testing/tiny-CohereForCausalLM ships logit_scale=0.0625, so that guard was dead code for the current model and the no-op case it was meant to skip was never actually reached. The result was that the test claimed to cover the fold while the skip branch only added noise.

This change parametrizes logit_scale over a non-trivial value of 0.0625 and over 1.0. For each value the test boots a fresh bridge, sets cfg.logit_scale before calling process_weights, and lets the fold run with the parametrized value. The fold reads cfg.logit_scale inside preprocess_weights, so setting it before process_weights is what makes the parametrization actually flow into the fold path.

The non-trivial 0.0625 case exercises the fold and asserts that embed.W_E and unembed.weight are no longer identical, which is the behavior the test was always meant to verify. The 1.0 case stays as a regression guard and asserts the two weights remain tied, since the fold is a no-op at unit scale. The inline pytest.skip is removed.

This satisfies the acceptance criteria in the issue. The test now parametrizes logit_scale with a non-1.0 value plus 1.0 as the optional no-op regression guard, the non-1.0 case fires the assertion with no skip, and the inline skip is gone.

Verification. In an isolated Python 3.12 venv with the package installed editable alongside pytest, torch, and transformers, the targeted run pytest tests/integration/model_bridge/test_cohere_adapter.py -k test_embed_and_unembed_weights_differ -x -q passed both parametrized cases (2 passed, 21 deselected). The tiny model downloaded and the fold path ran. black and isort were run against the changed file and report no changes needed.

The test_embed_and_unembed_weights_differ test guarded the fold assertion
behind an if logit_scale == 1.0: pytest.skip(...) branch. The fixture model
trl-internal-testing/tiny-CohereForCausalLM ships logit_scale=0.0625, so the
guard was dead code and the no-op case was never covered.

Parametrize logit_scale over a non-trivial value (0.0625) and 1.0. The
non-trivial case sets cfg.logit_scale before process_weights so the fold runs
and the assertion that embed.W_E and unembed.weight differ fires. The 1.0 case
is kept as a regression guard that asserts the weights stay tied when the fold
is a no-op. Closes TransformerLensOrg#1325.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant