
Add T5 architecture adapter test #1349

Open
RecreationalMath wants to merge 2 commits into TransformerLensOrg:dev from RecreationalMath:t5-adapter-test

@RecreationalMath
Contributor

Description

Adds a unit test suite for T5ArchitectureAdapter under tests/unit/model_bridge/supported_architectures/. The suite needs no model downloads or real checkpoints: it builds tiny TransformerBridgeConfig objects programmatically, so it runs on CPU in seconds.
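As a rough illustration of the "tiny programmatic config" idea, the sketch below builds a small config object in memory. It is not the PR's code: `SimpleNamespace` stands in for TransformerBridgeConfig, `make_tiny_cfg` is a hypothetical helper, and every field name except `is_gated_act` is an assumption chosen for illustration.

```python
from types import SimpleNamespace


def make_tiny_cfg(is_gated_act: bool = False) -> SimpleNamespace:
    """Build a tiny in-memory config; no checkpoint or download involved.

    SimpleNamespace is a stand-in for TransformerBridgeConfig. All field
    names other than is_gated_act are assumed, not taken from the PR.
    """
    return SimpleNamespace(
        d_model=16,   # assumed field name: hidden size
        n_heads=2,    # assumed field name: attention heads
        n_layers=2,   # assumed field name: blocks per stack
        d_vocab=32,   # assumed field name: vocabulary size
        is_gated_act=is_gated_act,  # named in the PR description
    )


cfg = make_tiny_cfg()                        # plain T5-style config
flan_cfg = make_tiny_cfg(is_gated_act=True)  # Flan-T5-style gated variant
```

Because the objects are this small and hold no weights, a test that only inspects adapter structure can construct a fresh one per test case.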

The suite (50 tests) covers:

  • Adapter config defaults: RMSNorm, positional_embedding_type = "relative_positional_bias", final_rms = False, supports_fold_ln = False (fold-LN would corrupt T5 RMSNorm weights), and the gated-MLP switch driven by cfg.is_gated_act.
  • Component-mapping structure: eight top-level keys covering the encoder stack (pos_embed, encoder_blocks, encoder_ln_final), the decoder stack (decoder_pos_embed, decoder_blocks, decoder_ln_final), plus shared embed and unembed.
  • Top-level bridge types and HF module paths, including the two distinct relative-attention-bias paths (one per stack, on block.0.layer.0).
  • Encoder block: pre-norm, single self-attention with q/k/v/o LinearBridges and requires_relative_position_bias=True, plain MLPBridge (or GatedMLPBridge in Flan-T5).
  • Decoder block: self-attention, cross-attention, and three layer norms; cross-attention is flagged is_cross_attention=True, and the encoder self-attention must not be.
  • Flan-T5 gated MLP variant: when cfg.is_gated_act=True, both encoder and decoder MLPs become GatedMLPBridge with wi_0 (gate), wi_1 (in), wo (out).
  • Factory registration: the adapter is dual-registered, so T5ForConditionalGeneration and MT5ForConditionalGeneration both dispatch to T5ArchitectureAdapter.
  • Architecture guards: encoder must not grow a cross_attn or ln3; weight conversions stay empty.
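
The structural checks listed above can be sketched against a stand-in mapping made of plain dicts. This is an illustration under stated assumptions, not the suite itself: `build_mapping` and `check_structure` are hypothetical, the submodule names `ln1`, `ln2`, `attn`, and `mlp` are assumed, and the non-gated weight names `wi`/`wo` follow Hugging Face T5 naming rather than anything stated in this PR. The top-level keys, `cross_attn`, `ln3`, the bridge names, and the flags come from the description.

```python
from types import SimpleNamespace

# The eight top-level keys named in the PR description.
EXPECTED_TOP_LEVEL = {
    "embed", "unembed",
    "pos_embed", "encoder_blocks", "encoder_ln_final",
    "decoder_pos_embed", "decoder_blocks", "decoder_ln_final",
}


def build_mapping(cfg):
    """Hypothetical stand-in for the adapter's component mapping."""
    if cfg.is_gated_act:
        # Flan-T5 variant: wi_0 (gate), wi_1 (in), wo (out).
        mlp = {"bridge": "GatedMLPBridge", "weights": ("wi_0", "wi_1", "wo")}
    else:
        # Plain T5 feed-forward; wi/wo is Hugging Face naming (assumed here).
        mlp = {"bridge": "MLPBridge", "weights": ("wi", "wo")}
    encoder_block = {
        "ln1": {},
        "attn": {"is_cross_attention": False,
                 "requires_relative_position_bias": True},
        "ln2": {},
        "mlp": dict(mlp),
    }
    decoder_block = {
        "ln1": {}, "attn": {"is_cross_attention": False},
        "ln2": {}, "cross_attn": {"is_cross_attention": True},
        "ln3": {}, "mlp": dict(mlp),
    }
    mapping = {key: {} for key in EXPECTED_TOP_LEVEL}
    mapping["encoder_blocks"] = encoder_block
    mapping["decoder_blocks"] = decoder_block
    return mapping


def check_structure(mapping, gated):
    """Assert the structural properties the description lists."""
    assert set(mapping) == EXPECTED_TOP_LEVEL
    enc, dec = mapping["encoder_blocks"], mapping["decoder_blocks"]
    # Architecture guard: the encoder must not grow cross_attn or ln3.
    assert "cross_attn" not in enc and "ln3" not in enc
    assert enc["attn"]["requires_relative_position_bias"]
    assert not enc["attn"]["is_cross_attention"]
    assert dec["cross_attn"]["is_cross_attention"]
    # Decoder block carries three layer norms.
    assert sum(name.startswith("ln") for name in dec) == 3
    expected = "GatedMLPBridge" if gated else "MLPBridge"
    assert enc["mlp"]["bridge"] == expected
    assert dec["mlp"]["bridge"] == expected


# Exercise both the plain and the gated (Flan-T5) configuration.
for gated in (False, True):
    check_structure(build_mapping(SimpleNamespace(is_gated_act=gated)), gated)
```

Running one check function over both settings of `is_gated_act` mirrors how a parametrized test would cover the plain and gated variants without duplicating assertions.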

Contributes to #1302 (T5 checkbox).

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility
