Recent releases
Open LLMs are on fire right now! 🔥
Mistral AI just released Pixtral-12B, a vision model that seems to perform extremely well! On Mistral’s own benchmarks, it beats the great Qwen2-VL-7B and Llava-OV. But Mistral’s benchmarks evaluate in Chain-of-Thought, and even with CoT they report lower scores for other models than the scores those models already published without CoT, which is very strange… Evaluation is not a settled science!
But it’s only the latest in a flurry of great models. Here is my top 5 for this week:
❶ Llama-3.1-8B Omni, a model built upon Llama-3.1-8B-Instruct that simultaneously generates text and speech responses with an extremely low latency of 250ms (Moshi, Kyutai’s 8B, did 140ms)
❷ Fish Speech V1.4, a text-to-speech model that supports 8 languages 🇬🇧🇨🇳🇩🇪🇯🇵🇫🇷🇪🇸🇰🇷🇸🇦 with extremely good quality for its small size (~1GB of weights) and low latency
❸ DeepSeek-V2.5, a 236B model with 128k context length that combines the best of DeepSeek-V2-Chat and the more recent DeepSeek-Coder-V2-Instruct. Depending on the benchmark, it ranks just below Llama-3.1-405B. Released under a custom ‘deepseek’ license that is quite commercially permissive.
❹ Solar Pro, published by Upstage: a 22B model (so inference fits on a single GPU) that comes in just under Llama-3.1-70B performance, with MMLU: 79, GPQA: 36, IFEval: 84
❺ MiniCPM3-4B, a small model that claims very impressive scores, on par with larger models like Llama-3.1-8B
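To put the "fits on a single GPU" remark about Solar Pro in perspective, here is a rough back-of-the-envelope sketch of weight memory for the parameter counts quoted above. It makes several assumptions that are mine, not the model authors': weights only (no KV cache, activations or framework overhead), 2 bytes per parameter in bf16, and a hypothetical 80 GB GPU as the reference card. The function names are purely illustrative.

```python
# Rough weight-only memory estimate.
# Assumption: ignores KV cache, activations and framework overhead,
# so real-world requirements will be higher than these numbers.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts (in billions) as quoted in the list above.
MODELS = {
    "MiniCPM3-4B": 4,
    "Llama-3.1-8B Omni": 8,
    "Pixtral-12B": 12,
    "Solar Pro": 22,
    "DeepSeek-V2.5": 236,
}

GPU_MEMORY_GB = 80  # assumption: a single 80 GB accelerator


def weight_memory_gb(params_billions: float, dtype: str = "bf16") -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[dtype]


def fits_on_gpu(params_billions: float, dtype: str = "bf16",
                gpu_gb: float = GPU_MEMORY_GB) -> bool:
    """True if the weights alone are smaller than the GPU memory."""
    return weight_memory_gb(params_billions, dtype) < gpu_gb


if __name__ == "__main__":
    for name, size in MODELS.items():
        gb = weight_memory_gb(size)
        verdict = "fits" if fits_on_gpu(size) else "does not fit"
        print(f"{name}: ~{gb:.0f} GB in bf16, {verdict} on one {GPU_MEMORY_GB} GB GPU")
```

Under these assumptions, a 22B model needs about 44 GB for its weights in bf16, which leaves headroom on an 80 GB card, while a 236B model needs several GPUs even with aggressive quantization.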
Let’s keep looking, more good stuff is coming our way 🔭