Sparse Mixture of Experts

Sparse Mixture of Experts (MoE) is a neural network architecture in which a router activates only a small subset of specialized expert sub-networks for each input token, so model capacity can grow without a proportional increase in per-token compute.
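
The routing idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes top-k routing with a softmax over the selected router scores, which is one common choice among several. The names `top_k_route`, `moe_forward`, and `make_expert` are hypothetical, and the toy experts simply scale their input where a real model would typically use feed-forward blocks.

```python
import math
import random


def top_k_route(logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(token, router_weights, experts, k=2):
    """Sparse MoE layer for a single token: evaluate only the k routed experts."""
    # Router: one logit per expert (dot product of the token with that expert's row).
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    routes = top_k_route(logits, k)
    out = [0.0] * len(token)
    for idx, weight in routes:
        expert_out = experts[idx](token)  # unselected experts are never called
        out = [o + weight * y for o, y in zip(out, expert_out)]
    return out, routes


def make_expert(scale):
    # Toy expert (assumption for illustration): multiplies the input by a constant.
    return lambda x: [scale * v for v in x]


rng = random.Random(0)
dim, num_experts = 4, 8  # illustrative sizes, not taken from any specific model
experts = [make_expert(s) for s in range(1, num_experts + 1)]
router_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
token = [0.5, -1.0, 0.25, 2.0]

output, routes = moe_forward(token, router_weights, experts, k=2)
print(routes)  # two (expert_index, weight) pairs whose weights sum to 1
```

With 8 experts and k=2, each token runs through only two experts, so adding more experts increases the parameter count while the per-token expert compute stays roughly constant.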
