Claude learns to use the computer

PLUS: Open-source video generation levels up

Welcome, AI enthusiasts.

Anthropic's Claude isn't just chatting anymore — it's clicking, typing, and scrolling its way through computers like a human.

The AI agent dam seems to be breaking open this week, and AI capabilities are getting more hands-on by the day (literally). Let’s get into it…

In today’s AI rundown:

  • Anthropic's AI now navigates computers like a human

  • Genmo drops open-source AI video model

  • Master public speaking with ChatGPT

  • Ideogram debuts AI Canvas workspace

  • 5 new AI tools & 5 new AI jobs

  • More AI & tech news

Read time: 4 minutes

LATEST DEVELOPMENTS

ANTHROPIC

Image source: Anthropic

The Rundown: Anthropic just introduced a new capability called ‘computer use’ alongside upgraded versions of its AI models. The capability lets Claude interact with computers by viewing screens, typing, moving cursors, and executing commands.

The details:

  • Claude can now autonomously navigate computer interfaces, performing complex tasks across multiple applications and websites.

  • Anthropic said it taught the model ‘general computer skills’ instead of creating a standalone tool, helping it operate more like a human.

  • The upgraded Claude 3.5 Sonnet significantly improves coding and tool use, outperforming other models (including o1-preview) on key benchmarks.

  • A new Claude 3.5 Haiku model matches the performance of Claude 3 Opus, Anthropic’s previous largest model, on many evaluations at lower cost and higher speed.

  • Anthropic highlighted that computer use is still imperfect (including some hilarious examples), encouraging testing on low-risk tasks until skills improve.
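For readers curious what "viewing screens, typing, and clicking" looks like as a loop, here is a minimal sketch of the observe-then-act cycle that agents of this kind typically follow. It is not Anthropic's implementation: `FakeScreen`, `choose_action`, and `run_agent` are hypothetical names, the "screenshot" is a plain dictionary rather than pixels, and a hard-coded rule stands in for the model's decision.

```python
from dataclasses import dataclass, field


@dataclass
class FakeScreen:
    """Hypothetical stand-in for a desktop: one text field and a submit button."""
    text: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive an image; this sketch returns a dict instead.
        return {"text": self.text, "submitted": self.submitted}

    def apply(self, action: dict) -> None:
        # Record the action, then update the simulated screen state.
        self.log.append(action["type"])
        if action["type"] == "type":
            self.text += action["text"]
        elif action["type"] == "click" and action["target"] == "submit":
            self.submitted = True


def choose_action(observation: dict, goal: str) -> dict:
    """Hypothetical rule-based policy standing in for the model's decision."""
    if observation["submitted"]:
        return {"type": "done"}
    if observation["text"] != goal:
        # Type whatever part of the goal text is still missing
        # (assumes the field only ever holds a prefix of the goal).
        return {"type": "type", "text": goal[len(observation["text"]):]}
    return {"type": "click", "target": "submit"}


def run_agent(screen: FakeScreen, goal: str, max_steps: int = 10) -> bool:
    """Observe, decide, act; stop when the policy reports it is done."""
    for _ in range(max_steps):
        action = choose_action(screen.screenshot(), goal)
        if action["type"] == "done":
            return True
        screen.apply(action)
    return False


screen = FakeScreen()
print(run_agent(screen, "hello"), screen.log)
```

The step cap (`max_steps`) mirrors the advice above to keep early experiments low-risk: the loop gives up rather than acting indefinitely.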

Why it matters: While many hoped for Opus 3.5, Anthropic’s Sonnet and Haiku upgrades pack a serious punch. Plus, with computer use built right into its foundation models, Anthropic just fired a warning shot at tons of automation startups, even if the capabilities aren’t earth-shattering... yet.

TOGETHER WITH ASSEMBLYAI

The Rundown: AssemblyAI is revolutionizing Speech AI with best-in-class accuracy and speed, empowering you to build the next generation of voice-enabled products.

AssemblyAI delivers:

  • Top-tier accuracy rates reaching 95%

  • Significantly reduced hallucinations — up to 30% fewer than industry leaders

  • Blazing fast conversion with 63 minutes of audio processed in 35 seconds

  • Hassle-free, code-free updates for continuous improvement

Put our API to the test. Start building for free today.

GENMO

Image source: Genmo

The Rundown: AI startup Genmo just launched Mochi 1, a new open-source video generation model that claims to rival closed competitors like Runway, Pika, and Kling — while being freely available to developers and researchers.

The details:

  • Mochi 1 is built on a new 10B-parameter architecture called AsymmDiT, which Genmo says makes it the largest open-source video generation model ever released.

  • The model focuses heavily on motion quality and prompt adherence, generating 480p videos at 30fps for up to 5.4 seconds.

  • In Genmo’s own evaluations, Mochi 1 surpassed top models like Kling, Runway Gen-3, Luma’s Dream Machine, and Pika in motion quality and prompt adherence.

  • A higher-definition version, Mochi 1 HD, with 720p support and image-to-video capabilities, is planned for release later this year.

  • Genmo also announced that it secured $28.4M in Series A funding, with Mochi 1 being the company’s first step toward building ‘world simulators.’
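To put the clip specs above in concrete terms, a quick back-of-the-envelope calculation gives the number of frames in a maximum-length clip. The function name `frame_count` is just an illustrative helper, not part of any Genmo tooling.

```python
def frame_count(fps: float, seconds: float) -> int:
    """Number of frames in a clip of the given length, rounded to a whole frame."""
    return round(fps * seconds)


# 30 frames per second for 5.4 seconds, the figures quoted above.
print(frame_count(30, 5.4))  # 162
```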

Why it matters: Open-source AI video is officially competing with the top of the market. Genmo’s Mochi 1 is an extremely impressive release that showcases how competitive the video generation landscape is about to become, especially with the major dominoes (Sora, Midjourney?) still to fall.

AI TRAINING

The Rundown: ChatGPT's Voice mode can be customized to simulate a live audience, providing real-time feedback and follow-up questions to improve your public speaking skills.

Step-by-step:

  1. Download the ChatGPT app and access Custom Instructions in settings.

  2. Set ChatGPT to respond with "mhm" during your speech until you say "Done".

  3. Start a new chat, activate voice mode, and provide the practice prompt.

  4. Deliver your speech section by section, saying "Done" after each part.

Pro tip: Use ChatGPT's custom instructions to ensure it doesn’t interrupt you or provide unnecessarily long responses.

PRESENTED BY SONAR

The Rundown: Sonar’s AI capabilities improve the quality of every AI-generated line of code — helping you navigate the new risks emerging from the LLM developer landscape.

Check out Sonar’s guide to find:

  • A deep dive into the OWASP LLM Top 10 and its implications

  • Strategies to detect and prevent security flaws in AI-generated code

  • The vital link between code quality and robust security measures

Download your free guide and stay ahead of emerging AI security threats.

IDEOGRAM

Image source: Ideogram

The Rundown: Ideogram just unveiled a new AI-powered workspace called Canvas, introducing advanced tools like Magic Fill and Extend to combine image editing and generation for new creative workflows.

The details:

  • Canvas provides an endless digital board on which users can generate, organize, and seamlessly blend AI-generated and uploaded images.

  • Magic Fill allows precise editing of selected image areas, enabling tasks like object replacement, text addition, and background alteration.

  • The Extend feature expands images beyond their original dimensions while maintaining style consistency, even with text.

  • Ideogram also offers an API, allowing developers to incorporate the new features into their own applications.

Why it matters: The design industry is no stranger to AI tools (Photoshop, Canva) — but Ideogram’s latest release feels like the exact type of fastball that AI and design novices can really make magic with. The examples shown also illuminate how drastically creative workflows are changing in the AI era.

NEW TOOLS & JOBS

  • ⚙️ Softr for Notion - Turn Notion databases into portals and apps

  • 📊 CapGo AI - AI-powered spreadsheet for market research and lead enrichment

  • 📸 Pixyer - AI background generator for professional product photos

  • 💸 Hero - Use AI to scan, price, and list your stuff in seconds

  • 💻 AIxBlock - Comprehensive platform to productize AI models with decentralized computing resources

  • 💻 Walmart - Senior, Software Engineer

  • 🏗️ Palantir Technologies - Production Infrastructure, Product Manager

  • 📊 Mistral AI - Data Quality Specialist, AI Tutor (Fixed term)

  • 📞 Glean - Senior Manager, Customer References

  • 🛡️ CoreWeave - Senior Governance, Risk & Compliance Analyst

QUICK HITS

Runway debuted Act-One, a new feature that generates expressive character performances from a single video and image without motion capture or rigging.

Stability AI released Stable Diffusion 3.5, featuring Large and Large-Turbo models that improve customization, efficiency, and diversity of outputs.

Cohere enhanced its Embed 3 model with multimodal capabilities, enabling enterprises to perform RAG-style searches across text and image content.

Chipotle launched a new conversational AI hiring platform called ‘Ava Cado,’ which the restaurant says can accelerate the hiring process by up to 75%.

Asana introduced AI Studio, a no-code platform for teams to design and deploy AI agents to automate business workflows.

Canva unveiled Dream Lab, a new image generator powered by Leonardo AI — alongside a series of new AI features added to the platform’s Visual Suite.

Inflection AI launched Agentic Workflows, enabling its enterprise systems to take trusted actions for various business use cases.

THAT’S A WRAP

That's it for today!

Before you go, we’d love to know what you thought of today's newsletter to help us improve The Rundown experience for you.

See you soon,

Rowan, Joey, Zach, and Alvaro—aka The Rundown Team
