Mixture of Experts (MoE) revival popularized by Shazeer et al. 2017: outrageously large neural networks via sparse gating.
Subject
Mixture of Experts (MoE) revival
Object
Shazeer et al. 2017 — outrageously large neural networks via sparse gating
Primary source · preprint · 2017-01-23
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer — arXiv (Shazeer, Mirhoseini, Maziarz, Davis, Le, Hinton, Dean / Google Brain)
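For orientation, a minimal sketch of the sparsely-gated top-k routing the paper introduces, written in plain NumPy with toy linear experts. The shapes, weight names, and the use of linear experts are illustrative assumptions, not the authors' implementation; the paper's actual experts are small feed-forward networks inside an LSTM language model.

```python
# Minimal sketch of sparsely-gated top-k MoE routing (after Shazeer et al. 2017).
# Names, shapes, and linear "experts" are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, k = 16, 8, 2

# Toy experts: each is a single linear map; the paper uses small feed-forward nets.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
w_gate = rng.standard_normal((d_model, n_experts)) * 0.02   # gating weights
w_noise = rng.standard_normal((d_model, n_experts)) * 0.02  # noise weights for noisy gating

def softplus(z):
    return np.log1p(np.exp(z))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_layer(x, train=True):
    """Route one token vector x to its top-k experts and mix their outputs."""
    logits = x @ w_gate
    if train:
        # Noisy gating: perturb logits so routing explores and load-balances across experts.
        logits = logits + rng.standard_normal(n_experts) * softplus(x @ w_noise)
    top = np.argsort(logits)[-k:]      # indices of the k largest gate logits
    gates = softmax(logits[top])       # softmax over the kept logits only (others act as -inf)
    # Only the selected experts are evaluated, which is what keeps the layer sparse and cheap.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

y = moe_layer(rng.standard_normal(d_model))
print(y.shape)  # (16,)
```

The key point of the sketch is that the gate produces a sparse mixture: only k of the n_experts experts run per input, so total parameter count can grow far faster than per-example compute.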