Ring-mini-2.0

Ant Group / InclusionAI

MIT, MoE, sigmoid-routing, aux-loss-free, MTP, reasoning

Type: MoE
Total params: 16.0B
Active params: 1.4B
Sparsity: 1/32
Context: 32,768 tokens
Train tokens: 20.0T
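The listed sparsity of 1/32 is not the same quantity as the ratio of active to total parameters, so it helps to work the arithmetic out. The short sketch below uses only the figures listed above. Reading 1/32 as an expert-level activation ratio (routed experts selected per token), rather than a parameter ratio, is an assumption and is not stated on this page.

```python
# Figures copied from the spec list above (B = billions, T = trillions).
total_params_b = 16.0
active_params_b = 1.4
train_tokens_t = 20.0

# Fraction of parameters used per token.
active_fraction = active_params_b / total_params_b

# Listed sparsity; assumed here to describe expert activation, not parameters.
listed_sparsity = 1 / 32

# Training tokens seen per total parameter.
tokens_per_param = (train_tokens_t * 1e12) / (total_params_b * 1e9)

print(f"active parameter fraction: {active_fraction:.4f}")  # 0.0875
print(f"listed sparsity:           {listed_sparsity:.5f}")  # 0.03125
print(f"train tokens per param:    {tokens_per_param:.0f}")  # 1250
```

The parameter fraction (about 8.75%) is larger than 1/32 (about 3.1%), which would be consistent with some parameters, such as attention and embeddings, being active for every token regardless of routing.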

Benchmarks

Benchmark	Category	Measured	Claimed	Setup
GPQA Diamond	reasoning	37.88	—	0-shot
GSM8K	math	79.76	—	5-shot
MMLU-Pro	knowledge	54.52	—	5-shot
HumanEval+	code	65.24	—	0-shot
AIME 2024	math	10.00	—	0-shot

First independent benchmark of this model. Ring is the reasoning-tuned sibling of Ling-mini-2.0 and shares the same bailing_moe architecture. Running it requires transformers==4.57.0 and vllm==0.10.0 with the bailing_moe_v2 patch.
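Because the note pins exact versions, a quick environment check before loading the model can save a failed run. The following is a minimal sketch using only the standard library. The helper name `check_pins` is hypothetical, and the check cannot tell whether the bailing_moe_v2 patch has been applied, which would need to be verified separately.

```python
from importlib.metadata import PackageNotFoundError, version

# Version pins quoted in the note above.
REQUIRED = {"transformers": "4.57.0", "vllm": "0.10.0"}


def check_pins(required=REQUIRED):
    """Return {package: (installed_or_None, expected, ok)} for each pin."""
    report = {}
    for name, expected in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Ignore a local build suffix such as "+cu128" when comparing.
        base = installed.split("+")[0] if installed else None
        report[name] = (installed, expected, base == expected)
    return report


if __name__ == "__main__":
    for name, (installed, expected, ok) in check_pins().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={installed} expected={expected} [{status}]")
```

A package that is not installed is reported as `installed=None` with a mismatch rather than raising an exception, so the check can run in any environment.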