Parametrize the Cohere logit_scale fold test with a scale other than 1.0 by EphraiemSarabamoun · Pull Request #1348 · TransformerLensOrg/TransformerLens

EphraiemSarabamoun · 2026-05-30T13:55:29Z

This fixes issue #1325.

The test test_embed_and_unembed_weights_differ in tests/integration/model_bridge/test_cohere_adapter.py guarded its fold assertion behind an inline if logit_scale == 1.0 then pytest.skip branch. The fixture model trl-internal-testing/tiny-CohereForCausalLM ships logit_scale=0.0625, so that guard was dead code for the current model and the no-op case it was meant to skip was never actually reached. The result was that the test claimed to cover the fold while the skip branch only added noise.

This change parametrizes logit_scale over a non-trivial value of 0.0625 and over 1.0. For each value the test boots a fresh bridge, sets cfg.logit_scale before calling process_weights, and lets the fold run with the parametrized value. The fold reads cfg.logit_scale inside preprocess_weights, so setting it before process_weights is what makes the parametrization actually flow into the fold path.

The non-trivial 0.0625 case exercises the fold and asserts that embed.W_E and unembed.weight are no longer identical, which is the behavior the test was always meant to verify. The 1.0 case stays as a regression guard and asserts the two weights remain tied, since the fold is a no-op at unit scale. The inline pytest.skip is removed.

This satisfies the acceptance criteria in the issue. The test now parametrizes logit_scale with a non-1.0 value plus 1.0 as the optional no-op regression guard, the non-1.0 case fires the assertion with no skip, and the inline skip is gone.

Verification. In an isolated Python 3.12 venv with the package installed editable alongside pytest, torch, and transformers, the targeted run pytest tests/integration/model_bridge/test_cohere_adapter.py -k test_embed_and_unembed_weights_differ -x -q passed both parametrized cases (2 passed, 21 deselected). The tiny model downloaded and the fold path ran. black and isort were run against the changed file and report no changes needed.

The test_embed_and_unembed_weights_differ test guarded the fold assertion behind an if logit_scale == 1.0: pytest.skip(...) branch. The fixture model trl-internal-testing/tiny-CohereForCausalLM ships logit_scale=0.0625, so the guard was dead code and the no-op case was never covered. Parametrize logit_scale over a non-trivial value (0.0625) and 1.0. The non-trivial case sets cfg.logit_scale before process_weights so the fold runs and the assertion that embed.W_E and unembed.weight differ fires. The 1.0 case is kept as a regression guard that asserts the weights stay tied when the fold is a no-op. Closes TransformerLensOrg#1325.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parametrize the Cohere logit_scale fold test with a scale other than 1.0#1348

Parametrize the Cohere logit_scale fold test with a scale other than 1.0#1348
EphraiemSarabamoun wants to merge 1 commit into
TransformerLensOrg:mainfrom
EphraiemSarabamoun:fix/1325-cohere-logit-scale-fold-test

EphraiemSarabamoun commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

EphraiemSarabamoun commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant