AI Model Retains 99% of Its Performance Using Just 12.5% of Its Experts
Researchers at the Allen Institute for AI and UC Berkeley have developed EMO, a domain-specialized mixture-of-experts model that retains 99% of its performance while using only 12.5% of its expert components, according to a May 2026 report from The Decoder. The approach could make such models practical to deploy in memory-constrained environments, which may lower infrastructure costs and ease enterprise adoption in the U.S.
EMO, described in a study from the two U.S.-based institutions, takes a new approach to mixture-of-experts architectures. Conventional systems spread work across many specialized components, known as experts, whereas EMO uses its routing mechanisms to identify a small subset of experts that can do the job. Restricting the model to that subset preserves most of its accuracy while significantly reducing computational requirements.
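As a rough illustration of the idea of routing within a reduced set of experts, the Python sketch below keeps 12.5% of a layer's experts and sends each input only to experts in that subset. It is a minimal toy under stated assumptions, not EMO's actual method: the report does not say how EMO chooses its subset, so the usage-count criterion, the 64-expert layer, the top-2 routing, and the function names `select_expert_subset` and `route` are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_expert_subset(usage_counts, keep_fraction):
    """Keep the most-used experts (an assumed criterion, not EMO's)."""
    n_keep = max(1, round(len(usage_counts) * keep_fraction))
    ranked = sorted(range(len(usage_counts)),
                    key=lambda i: usage_counts[i], reverse=True)
    return sorted(ranked[:n_keep])


def route(router_logits, kept_experts, top_k):
    """Choose top_k experts among the kept subset and renormalize weights."""
    probs = softmax([router_logits[i] for i in kept_experts])
    best = sorted(range(len(kept_experts)),
                  key=lambda j: probs[j], reverse=True)[:top_k]
    norm = sum(probs[j] for j in best)
    return [(kept_experts[j], probs[j] / norm) for j in best]


random.seed(0)
n_experts = 64  # hypothetical layer size

# Hypothetical per-expert usage statistics gathered on one domain.
usage_counts = [random.randint(0, 1000) for _ in range(n_experts)]
kept = select_expert_subset(usage_counts, keep_fraction=0.125)

# Router scores for a single input; experts outside `kept` are never considered.
router_logits = [random.gauss(0.0, 1.0) for _ in range(n_experts)]
routed = route(router_logits, kept, top_k=2)

print("experts kept:", len(kept), "of", n_experts)
print("routed to:", routed)
```

Because experts outside the kept subset are never consulted, their weights would not need to be loaded at all, which is where the memory saving in a scheme like this would come from.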
Mixture-of-experts models typically draw on numerous specialized sub-models to handle complex tasks. EMO’s contribution is retaining nearly all of its performance with only a small fraction of those experts in use, as demonstrated through experiments on vision and language benchmarks. That efficiency could lower costs for businesses deploying AI on edge hardware, on mobile devices, and in other resource-limited settings.
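To give a feel for why dropping experts matters in memory-limited settings, the following back-of-the-envelope Python calculation compares the weight footprint of a full mixture-of-experts model with one that keeps 12.5% of its experts. Every number here is a hypothetical placeholder (parameter counts, 64 experts, 2 bytes per parameter), and `moe_memory_bytes` is an illustrative helper; none of it comes from the EMO study.

```python
def moe_memory_bytes(shared_params, params_per_expert, n_experts,
                     bytes_per_param=2):
    """Weight memory for shared layers plus n_experts expert blocks."""
    return (shared_params + params_per_expert * n_experts) * bytes_per_param


# Hypothetical sizes: 1B shared parameters, 100M parameters per expert.
SHARED = 1_000_000_000
PER_EXPERT = 100_000_000

full = moe_memory_bytes(SHARED, PER_EXPERT, n_experts=64)
pruned = moe_memory_bytes(SHARED, PER_EXPERT, n_experts=8)  # 12.5% of 64
ratio = pruned / full

print(f"full model:   {full / 1e9:.1f} GB")
print(f"pruned model: {pruned / 1e9:.1f} GB")
print(f"pruned / full: {ratio:.2f}")
```

In this toy setup the total footprint shrinks by less than a factor of eight, because the shared (non-expert) parameters still have to be stored; the real saving depends on how much of a model's weight sits in its experts.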
The research team from the Allen Institute for AI and UC Berkeley emphasized potential applications for U.S. industries that want to deploy AI without high computational overhead. With demand for on-device AI processing growing, the model addresses a central challenge in scaling AI technologies: reducing resource use while maintaining performance standards.