Add GPT-OSS architecture adapter tests #1341

Merged
jlarson4 merged 1 commit into TransformerLensOrg:dev from RecreationalMath:gpt_oss-adapter-test
May 28, 2026

Conversation

@RecreationalMath
Contributor

Description

Adds a unit test suite for GPTOSSArchitectureAdapter under tests/unit/model_bridge/supported_architectures/, following the existing adapter-test pattern (modeled on the merged mixtral and olmoe suites). It needs no model downloads or real checkpoints: it uses tiny programmatic TransformerBridgeConfig objects, plus small synthetic tensors and a fake attention module for the behavioral tests, so it runs on CPU in seconds.
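
To illustrate the stand-in-object approach described above, here is a minimal sketch of a wiring check that needs no real model. It is not code from this PR: `wire_rotary` is a hypothetical function that mimics the rotary-wiring behavior the suite checks (no-op without a model or `rotary_emb`, skip blocks without attention), and the attribute names `blocks` and `attn` are assumptions made for illustration.

```python
from types import SimpleNamespace


def wire_rotary(bridge_model):
    """Hypothetical stand-in for the behavior under test (not the adapter's code).

    Copies the model-level rotary embedding onto each block's attention
    object and returns how many blocks were wired.
    """
    rotary = getattr(bridge_model, "rotary_emb", None)
    if bridge_model is None or rotary is None:
        return 0  # no-op: nothing to wire
    wired = 0
    for block in getattr(bridge_model, "blocks", []):
        attn = getattr(block, "attn", None)
        if attn is None:
            continue  # skip blocks that have no attention component
        attn.rotary_emb = rotary
        wired += 1
    return wired


# Fake bridge model built from plain namespaces: one block with attention,
# one block without.
rotary = object()
with_attn = SimpleNamespace(attn=SimpleNamespace())
without_attn = SimpleNamespace()
model = SimpleNamespace(rotary_emb=rotary, blocks=[with_attn, without_attn])

n_wired = wire_rotary(model)
```

Because the fakes are plain attribute containers, each assertion (for example, that `with_attn.attn.rotary_emb is rotary`) targets exactly one branch of the wiring logic and runs instantly on CPU.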

The suite (44 tests) covers:

  • Adapter config defaults (RMSNorm, rotary, gated MoE MLP, eps_attr = "variance_epsilon").
  • Weight conversions: the four QKVO weights (GPT-OSS has no projection biases), with GQA-aware head counts and the no-n_key_value_heads fallback.
  • Numerical round-trips: the rearrange conversions are actually run on synthetic HF-shaped weight tensors, asserting the split-head output shapes and lossless reversion.
  • Component-mapping structure, bridge types, and HF module paths, including the MoE bridge with no exposed submodules (the entire MoE block is wrapped opaquely).
  • Factory registration and dispatch via select_architecture_adapter.
  • GQA forward hook shapes: a fake attention module wired into the bridge confirms that the Q hook exposes n_heads heads while the K/V hooks expose n_key_value_heads.
  • setup_hook_compatibility rotary-embedding wiring on a bridge model: it is a no-op when bridge_model is None or lacks rotary_emb, sets rotary on each block's attention bridge when present, and skips blocks without attention. setup_no_processing_hooks (the backward-compat alias) produces the same wiring.
  • Architecture guards against drift.
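
The numerical round-trip and GQA head-count checks listed above can be sketched as follows. This is an illustration under stated assumptions, not the suite's code: it uses NumPy `reshape`/`transpose` in place of the rearrange conversions, assumes an HF-style `(heads * d_head, d_model)` weight layout that splits to `(heads, d_model, d_head)`, and uses made-up tiny dimensions. The helper names `split_heads` and `merge_heads` are hypothetical.

```python
import numpy as np

# Hypothetical tiny dimensions (not taken from any real GPT-OSS checkpoint).
d_model, n_heads, n_kv_heads, d_head = 8, 4, 2, 3


def split_heads(weight, n):
    """(n * d_head, d_model) -> (n, d_model, d_head); assumed layout."""
    return weight.reshape(n, -1, weight.shape[-1]).transpose(0, 2, 1)


def merge_heads(weight):
    """(n, d_model, d_head) -> (n * d_head, d_model); inverse of split_heads."""
    n, d_m, d_h = weight.shape
    return weight.transpose(0, 2, 1).reshape(n * d_h, d_m)


rng = np.random.default_rng(0)
q = rng.standard_normal((n_heads * d_head, d_model))     # synthetic Q weight
k = rng.standard_normal((n_kv_heads * d_head, d_model))  # synthetic K weight (GQA: fewer heads)

q_split = split_heads(q, n_heads)
k_split = split_heads(k, n_kv_heads)

# Q uses n_heads, K uses n_key_value_heads; the conversion must be lossless.
assert q_split.shape == (n_heads, d_model, d_head)
assert k_split.shape == (n_kv_heads, d_model, d_head)
assert np.array_equal(merge_heads(q_split), q)
assert np.array_equal(merge_heads(k_split), k)
```

Since the conversion only reorders elements, exact equality (rather than a tolerance-based comparison) is an appropriate check for the reverted tensor.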

Contributes to #1302 (GPT-OSS checkbox).

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@jlarson4
Collaborator

Excellent test suite, thank you @RecreationalMath

@jlarson4 merged commit e124a06 into TransformerLensOrg:dev on May 28, 2026
24 checks passed

2 participants