use MTP if model_dir and draft_model_dir are equal by suspicious-pineapple · Pull Request #424 · theroyallab/tabbyAPI

suspicious-pineapple · 2026-06-09T12:13:18Z

Why should this feature be added?
this seems to be the minimal set of changes needed to make MTP work, on latest exl3 dev branch.

Examples
MTP is enabled if the main model is the same as the draft model. otherwise it behaves normally
..maybe this would more sanely be exposed as a config option?

Additional context
tested with https://huggingface.co/turboderp/Qwen3.6-27B-MTP-exl3 (gotta download the safetensors file and put it in the model dir, i assume it will be included by default in future quants, where supported)

this seems to be the minimal set of changes needed to make MTP work tested with <https://huggingface.co/turboderp/Qwen3.6-27B-MTP-exl3>

randoentity · 2026-06-10T10:01:03Z

I can't get this to work yet.

I'm on exllamav3 9c5009efaa2cda8ed341369123bb4acfe18ae300
tabbyAPI 2e50555 + patch-2

Using https://huggingface.co/turboderp/Qwen3.6-27B-MTP-exl3 and UnstableLlama_Qwen3.6-27B-exl3-8.00bpw
3 draft module layers get loaded but it raises an error on generation.

AI generated report below:

Bug Report: AttributeError during MTP Draft Model Generation

Description

When initiating a chat completion with Multi-Token Prediction (MTP) enabled via the ExLlamaV3 backend, the generation process crashes. The error indicates that a linear module's inner component (self.inner) is None when atte
mpting to perform a forward pass during draft model iteration.

Steps to Reproduce

Configure TabbyAPI with the ExLlamaV3 backend.
Load a model/architecture that utilizes MTP or a draft model.
Send a chat completion request to trigger streaming generation.
Monitor the server logs.

Expected Behavior

The model should successfully iterate through draft tokens using MTP and stream the completion without crashing.

Actual Behavior

The server raises an AttributeError: 'NoneType' object has no attribute 'forward' and aborts the generation request.

Error Log & Traceback Analysis

Critical Error:

AttributeError: 'NoneType' object has no attribute 'forward'
File "exllamav3/exllamav3/modules/linear.py", line 426, in forward
    x = self.inner.forward(x, params, out_dtype)

Call Stack Highlights:

tabbyAPI/backends/exllamav3/model.py initiates generate_gen.
exllamav3/exllamav3/generator/generator.py calls iterate_draftmodel_mtp_gen.
At generator.py:525, it attempts: batch_logits = self.model.modules[self.model.logit_layer_idx].forward(batch_state, params)
The forward pass enters linear.py:426 where self.inner is unexpectedly None.

Potential Causes

The draft model's linear layers were not correctly initialized or loaded from the state dictionary.
Architecture mismatch between the loaded model weights and the ExLlamaV3 module definition for MTP layers.
Missing or corrupted weight tensors for the specific logit layer index used in MTP drafting.

Environment

Python Version: 3.13
Backend: ExLlamaV3
Application: TabbyAPI
Date: 2026-06-10

Note: This bug report was drafted with the assistance of AI based on the provided traceback log.

use MTP if model_dir and draft_model_dir are equal

0b2a3cc

this seems to be the minimal set of changes needed to make MTP work tested with <https://huggingface.co/turboderp/Qwen3.6-27B-MTP-exl3>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

use MTP if model_dir and draft_model_dir are equal#424

use MTP if model_dir and draft_model_dir are equal#424
suspicious-pineapple wants to merge 1 commit into
theroyallab:mainfrom
suspicious-pineapple:patch-2

suspicious-pineapple commented Jun 9, 2026

Uh oh!

randoentity commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

suspicious-pineapple commented Jun 9, 2026

Uh oh!

randoentity commented Jun 10, 2026

Bug Report: AttributeError during MTP Draft Model Generation

Description

Steps to Reproduce

Expected Behavior

Actual Behavior

Error Log & Traceback Analysis

Potential Causes

Environment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants