
What is mixture of experts (MoE)?

I'll dig into the nitty gritty in a moment, but it's worth noting that the majority of today's most powerful open models, including DeepSeek V3 and DeepSeek R1, Meta's Llama 4 Maverick and Scout, and Qwen3 235B, now use a mixture-of-experts architecture, and there have been persistent rumors that OpenAI's GPT models have used it since GPT-4.
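To give you a rough feel for the idea before we get into the details, here's a toy sketch of the core mechanism: a router scores a token, only the top-k highest-scoring "experts" run, and their outputs are mixed. This is a minimal illustration in plain NumPy, not how any of the models above actually implement it; the sizes, the router, and the experts here are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 8, 4, 2

# Toy setup: each "expert" is just a small weight matrix,
# and the router is a single linear layer that scores experts per token.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route one token vector x to its top-k experts and mix their outputs."""
    logits = x @ router                      # one score per expert
    top = np.argsort(logits)[-top_k:]        # keep only the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Only the selected experts run for this token, which is why MoE models
    # can have huge total parameter counts with modest per-token compute.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_forward(token))
```

The key point the sketch illustrates is the split between total and active parameters: all four experts exist in memory, but each token only pays for two of them.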