via:
Multi-token-prediction in Gemma 4
An overview of how Multi-Token Prediction (MTP) drafters are making Gemma 4 models up to 3x faster at inference.
An overview of how Multi-Token Prediction (MTP) drafters are making Gemma 4 models up to 3x faster at inference.