Recent releases


Open LLMs are on fire right now! 🔥

Mistral AI just released Pixtral-12B, a vision model that seems to perform extremely well! On Mistral’s own benchmarks, it beats the great Qwen2-7B and Llava-OV. But Mistral’s benchmarks evaluate with Chain-of-Thought prompting, and even with CoT the scores they report for competing models are lower than those models’ already published non-CoT scores, which is very strange… Evaluation is not a settled science!

But it’s only the latest in a flurry of great models: here is my top 5 for this week:

❶ Llama-3.1-8B Omni, a model built on Llama-3.1-8B-Instruct that simultaneously generates text and speech responses with an extremely low latency of 250 ms (Moshi, Kyutai’s 8B model, achieved 140 ms)

❷ Fish Speech V1.4, a text-to-speech model that supports 8 languages 🇬🇧🇨🇳🇩🇪🇯🇵🇫🇷🇪🇸🇰🇷🇸🇦 with extremely good quality for a light size (~1 GB of weights) and low latency

❸ DeepSeek-V2.5, a 236B model with 128k context length that combines the best of DeepSeek-V2-Chat and the more recent DeepSeek-Coder-V2-Instruct. Depending on the benchmark, it ranks just below Llama-3.1-405B. Released under a custom ‘deepseek’ license that is quite permissive commercially.

❹ Solar Pro, published by Upstage: a 22B model (so inference fits on a single GPU) that comes in just under Llama-3.1-70B’s performance: MMLU 79, GPQA 36, IFEval 84

❺ MiniCPM3-4B, a small model that claims very impressive scores, on par with larger models like Llama-3.1-8B
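
Most of these are already downloadable from the Hugging Face Hub, so trying one is a few lines of code. Here is a minimal sketch using 🤗 Transformers; the repo id, dtype, and chat-template support are assumptions on my part, so check the model card of whichever release you pick:

```python
# Minimal sketch for trying one of this week's releases locally with Transformers.
# The repo id below is an assumption (swap in the model you want); many of these
# releases ship custom modeling code, hence trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/MiniCPM3-4B"  # hypothetical choice for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Summarize this week's open-LLM releases."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```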

Let’s keep looking, more good stuff is coming our way 🔭
