The AI model leaderboard
PLUS: Rabbit r1 suffers major security breach
Welcome, AI enthusiasts.
The AI world’s favorite open LLM scoreboard just got a major upgrade, and Alibaba’s Qwen 2 is on top of the podium (for now).
Hugging Face’s new benchmarks are set to change how we evaluate top models — a task becoming more difficult every day as AI continues to accelerate. Let’s explore…
In today’s AI rundown:
Hugging Face updates Open LLM Leaderboard
NBC rolls out AI vocals for Olympic recaps
Enhance videos with Krea AI upscaling
Rabbit R1 hit with major security flaw
5 new AI tools & 4 new AI jobs
More AI & tech news
Read time: 4 minutes
LATEST DEVELOPMENTS
HUGGING FACE
Image source: Hugging Face
The Rundown: Hugging Face just upgraded its Open LLM Leaderboard, adding new benchmarks and evaluation methods to help address the recent plateau in LLM performance gains.
The details:
The leaderboard now features six new benchmarks designed to be more challenging and less prone to contamination.
Initial rankings show Qwen2-72B-Instruct leading the pack, followed by Meta's Llama-3-70B-Instruct and Mixtral 8×22b.
A new normalized scoring system adjusts for baseline performance, providing a fairer comparison across different evaluation types.
The upgrade also introduces a ‘maintainer's highlight’ category and community voting system to prioritize the most relevant models.
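For the curious, here is a minimal sketch of what "adjusting for baseline performance" can look like. It assumes the common approach of rescaling each raw score so that the random-guess baseline maps to 0 and a perfect score maps to 100. The function name and the example numbers are illustrative assumptions, not Hugging Face's actual code.

```python
def normalize_score(raw: float, baseline: float, max_score: float = 1.0) -> float:
    """Rescale a raw score so that `baseline` maps to 0 and `max_score` to 100.

    Assumption: scores at or below the random-guess baseline are clipped to 0.
    """
    if baseline >= max_score:
        raise ValueError("baseline must be strictly below max_score")
    if raw <= baseline:
        return 0.0
    return (raw - baseline) / (max_score - baseline) * 100.0


# Hypothetical example: the same raw accuracy of 0.5 on two kinds of tasks.
# On a 4-option multiple-choice task, random guessing already scores 0.25.
four_choice = normalize_score(0.5, baseline=0.25)
# On a free-form task, random guessing is assumed to score 0.
free_form = normalize_score(0.5, baseline=0.0)

print(round(four_choice, 1), round(free_form, 1))
```

With these made-up numbers, the same raw 0.5 is worth about 33.3 on the multiple-choice task and 50.0 on the free-form one, which is why averaging raw scores across different benchmark types can mislead.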
Why it matters: As LLMs approach human-level performance on most tasks, finding new ways to evaluate them is becoming more difficult — and more crucial. This revamp helps guide researchers and developers towards more targeted improvements, providing a more nuanced assessment of model capabilities.
TOGETHER WITH HEDONOVA
The Rundown: Hedonova is simplifying the complex process of investing in alternative assets, enabling investors to access a diverse portfolio of media royalties, pre-IPO startups, wine, fine art, and more.
Hedonova’s benefits include:
A simple, single access point to a wide range of alternative assets
SEC regulation and award-winning returns, outperforming the S&P 500 by 200% since 2019
A low minimum investment of just $10k
Get started today and start discovering the power of alternative investments.
THE OLYMPICS & AI
Image source: NBC
The Rundown: NBC is launching an AI-generated version of legendary sportscaster Al Michaels to narrate personalized Olympic highlight reels on its Peacock streaming service for the 2024 Paris Games.
The details:
Subscribers can customize the 10-minute recap packages based on preferred sports, athletes, and content types, narrated by an AI clone of Michaels' voice.
The AI system was trained on Michaels' past NBC broadcasts to recreate his signature style, with the broadcaster giving his approval for the process.
NBC estimates that nearly 7M unique recap variations will be generated throughout the Olympics.
Human editors will reportedly review all AI-generated content for accuracy before it is released to viewers.
Why it matters: The launch of A.I. Michaels marks a major leap into AI for a media giant. Media companies have been reluctant to adopt the tech, or have dismissed it outright, for fear of backlash. The tide is changing, and AI-recreated voices are gradually moving from controversial to the norm.
AI TRAINING
The Rundown: Krea AI recently released a new video upscaling feature that allows users to improve the quality of their blurry videos for free.
Step-by-step:
Sign up for free or log in to Krea AI.
Click on the "Upscale & Enhance" button in the main dashboard.
Upload your video and customize the enhancement settings: upscaling factor, framerate, prompt, mode, AI strength, and resemblance.
Click "Enhance", wait a few minutes, and check your new enhanced video 🎉
PRESENTED BY BRILLIANT
The Rundown: Brilliant's interactive courses demystify AI — helping you stay competitive in today's tech-driven world with just minutes of daily learning.
Brilliant’s platform allows you to:
Unravel the mysteries of AI through expert-designed, interactive lessons
Apply your knowledge to solve real-world tech challenges
Transform learning into a daily habit with bite-sized, gamified content
Join 10 million learners worldwide and start your 30-day free trial today. P.S. — enjoy 20% off a premium annual subscription, exclusive to The Rundown readers.
RABBIT
Image source: Rabbit Inc.
The Rundown: A group of developers just discovered a major vulnerability in Rabbit’s R1 AI assistant device, potentially exposing users’ private data and chat responses.
The details:
A community-led group called Rabbitude uncovered hardcoded API keys in Rabbit’s codebase, which allowed access to all R1 responses.
The group gained access to the codebase in mid-May, saying the Rabbit team was aware of the issue but failed to take action.
Rabbitude said the vulnerability could allow bad actors to disable all r1 devices, alter voices and responses, and access private messages.
Rabbit acknowledged an ‘alleged data breach’ in a Discord post but claimed no customer data was leaked.
Why it matters: Despite massive hype around the first wave of standalone consumer AI devices, the Rabbit r1 has been nothing short of a disaster so far. The device was already facing major criticism over its limited capabilities, and this security breach only deepens the skepticism surrounding the early AI hardware market entrants.
NEW TOOLS & JOBS
📈 June - AI-powered customer analytics for product-focused teams
📲 Pygma - Personal AI social media manager
🚀 AppFlowy - Open-source alternative to Notion, manage wiki & projects with AI
🖥️ VisualSitemaps - Autogenerate visual sitemaps for UX and SEO
⚖️ Created by Human - AI rights licensing platform for creators
QUICK HITS
Free workshop: AI for Strategic Decision-Making (July 9). If you're still mostly using AI for copy generation, join Section's free workshop on putting it to work as a thought partner. Enroll here.*
Figma unveiled Figma AI, a series of new AI-powered features for its design platform, including Visual and Asset Search, AI text tools, image generation, and quick prototyping.
YouTube is reportedly in negotiations with record labels to license songs for the company’s AI-powered music generation tools, with plans to launch new features later this year.
Israel announced plans to build its first supercomputer, investing $250M in a national AI program to maintain its global leadership in the tech space.
Formation Bio secured $372M in funding to advance AI-driven drug development, with the Sam Altman-backed startup boosting its valuation above $1B.
Opera launched a new R2 update to its Opera One browser, featuring new AI-powered image generation and recognition, an AI Voice Output, and Page Context Mode for summaries and translations during web browsing.
*Sponsored listing
THAT’S A WRAP