39 posts analyzed over the last 12 weeks
Best week: Apr 20 (293 avg. likes)
Niels Rogge
Machine Learning Engineer at ML6 & Hugging Face
Hugging Face just released "ML-Intern"! 🔥 It's an open-source implementation of the real research loop that ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements id…
Niels Rogge
Machine Learning Engineer at ML6 & Hugging Face
This week, Mistral AI released a new model, Medium 3.5, but it wasn't well-received. 🇫🇷🥐😥 Various people noticed that it uses an outdated architecture based on Llama 2 and is priced higher than models such as DeepSe…
Maor Shlomo
Founder at Base44 | Prev: CEO and Co-Founder at Explorium | Forbes 30 under 30
We’re introducing a new model benchmark. And it’s a different kind of benchmark. (Basemark? Vibench?) A different kind because it’s breathing, constantly updated from millions of builders. Not a closed set of tasks. F…
Tom Aarsen
🤗 Sentence Transformers & NLTK maintainer, MLE @ Hugging Face
BidirLM-Omni-2.5B-Embedding is live! A single bidirectional encoder that embeds text, images, and audio into the same space. Here's the details: Benchmark sweep: 🥇 #1 open-data model on MTEB Multilingual V2 (text, #15 …
Ethan Mollick
Associate Professor at The Wharton School. Author of Co-Intelligence
I find that open weights models over-perform on benchmarks compared to actual real-world usage, and the new Kimi 2.6 Thinking feels like no exception. For example, a small amount of use will show that Kimi is not as good…
Nandan Mullakara
Follow for Agentic AI, Gen AI & RPA trends | Co-author: Agentic AI & RPA Projects | Favikon TOP 200 in AI | Oanalytica Who’s Who in Automation | Founder, Bot Nirvana | Ex-Fujitsu Head of Digital Automation
I keep seeing the same failure in automation after 10+ years. Different company. Different tools. Different team. Same root cause every single time. Nobody …
Niels Rogge
Machine Learning Engineer at ML6 & Hugging Face
One of my favorite benchmarks lately is BrowseComp-Plus. It was introduced in August 2025 by the University of Waterloo as a means to better benchmark agentic search/RAG. It improved upon the original BrowseComp introduce…
Nick Saraev
Founder at Maker School: the straightest-line path to building an AI agency (2K+ members, ~$250K MRR) | Co-founder at LeftClick, an AI growth agency serving multibillion dollar portfolio companies.
Opus 4.7 dropped a few days ago and half the internet is treating it like some massive breakthrough. It isn't. It's a marginal step up over 4.6, where some benchmarks move 3-4 points, one or two actually regress, and t…
Nick Saraev
Founder at Maker School: the straightest-line path to building an AI agency (2K+ members, ~$250K MRR) | Co-founder at LeftClick, an AI growth agency serving multibillion dollar portfolio companies.
The system card for Claude Mythos Preview is 244 pages of "holy crap." This is the most capable model ever released by any lab. It's exceptional at automation, software engineering, general reasoning, and—a little conce…
Paolo Perrone
No BS AI/ML Content | ML Engineer with a Plot Twist 🥷100M+ Views 📝
Google just released the first truly open source American LLM. Apache 2.0 licensed. No restrictions. No "contact us if you profit" clauses. Gemma 4 is actually free. The comparison is absurd: Kimi K2.5: 600GB+ downlo…