Tag: moe
7 verified claims carry this tag. Each has one or more primary sources and an HMAC-SHA256 signature.
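The page does not document how these signatures are produced; below is a minimal sketch of HMAC-SHA256 signing and verification using Python's standard library. The key, the signed payload, and the truncation to 16 hex characters (matching the IDs shown below) are assumptions made for illustration, not this registry's actual scheme.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-the-registry-key"  # hypothetical key, not the real one


def sign_claim(claim_text: str) -> str:
    """Return an HMAC-SHA256 tag for a claim, truncated to 16 hex chars like the IDs below."""
    mac = hmac.new(SECRET_KEY, claim_text.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


def verify_claim(claim_text: str, tag: str) -> bool:
    """Compare a stored tag against a freshly computed one in constant time."""
    return hmac.compare_digest(sign_claim(claim_text), tag)


print(sign_claim("Mixtral 8x7B released on: 2023-12-11."))
```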
Mixtral 8x7B released on: 2023-12-11.
410aec4f418f2b11 · 2 sources · 95% confidence
Mixtral 8x7B architecture: Sparse Mixture-of-Experts (8 expert feed-forward blocks per layer, 2 experts routed per token; roughly 12.9B of 46.7B total parameters are active per token).
ad79b14fafb362cd · 2 sources · 100% confidence
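The routing described in the claim above can be made concrete. The following is a minimal sketch of top-2 sparse MoE routing in plain NumPy, with toy dimensions, random stand-in weights, and per-expert linear maps in place of real feed-forward blocks; it illustrates the mechanism rather than Mixtral's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 16, 8, 2        # 8 experts per layer, 2 routed per token
tokens = rng.standard_normal((4, d_model))  # a batch of 4 token vectors

# Router: one linear layer producing a logit per expert.
W_router = rng.standard_normal((d_model, n_experts))
# Experts: per-expert linear maps standing in for the real FFN blocks.
W_experts = rng.standard_normal((n_experts, d_model, d_model))


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


logits = tokens @ W_router                         # (4, 8) routing logits
top_idx = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of the 2 best experts per token
top_gates = softmax(np.take_along_axis(logits, top_idx, axis=-1))  # renormalized gate weights

out = np.zeros_like(tokens)
for t in range(tokens.shape[0]):                   # each token visits only its 2 experts
    for slot in range(top_k):
        e = top_idx[t, slot]
        out[t] += top_gates[t, slot] * (tokens[t] @ W_experts[e])

print(out.shape)  # (4, 16): same shape as the input, with only 2 of 8 experts run per token
```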
Sparsely-Gated Mixture-of-Experts (MoE) introduced in paper: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (Shazeer et al., 2017).
2d6d7f61f1db6493 · 1 source · 100% confidence
Switch Transformer introduced in paper: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (Fedus et al., 2021).
3d9c14b9379038c9 · 2 sources · 100% confidence
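The Switch Transformer differs from the top-2 scheme sketched above mainly in routing each token to a single expert, gated by that expert's router probability. The self-contained sketch below mirrors the previous setup (toy dimensions and stand-in weights, not the paper's code) to show the k = 1 variant.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts = 16, 8
tokens = rng.standard_normal((4, d_model))
W_router = rng.standard_normal((d_model, n_experts))
W_experts = rng.standard_normal((n_experts, d_model, d_model))

# Switch-style routing: k = 1, each token gated by its chosen expert's router probability.
logits = tokens @ W_router
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)          # softmax over experts
best = probs.argmax(axis=-1)                         # one expert per token
out = np.stack([probs[t, best[t]] * (tokens[t] @ W_experts[best[t]])
                for t in range(tokens.shape[0])])
print(out.shape)  # (4, 16)
```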
MoE Mixtral 8x22B released on: 2024-04-10 by Mistral AI.
4335bf51bf0fc14f · 2 sources · 100% confidence
Llama 4 released on: 2025-04-05 by Meta (Scout and Maverick released; Behemoth previewed as still in training).
d5ce871dc69e7b04 · 2 sources · 100% confidence
Mixture of Experts (MoE) revival popularized in: Shazeer et al. 2017, which scaled outrageously large neural networks via sparse gating.
f068236101568ad7 · 2 sources · 100% confidence