Amazon's new AI browser agent

PLUS: AI turns brain signals into instant speech

Good morning, AI enthusiasts. Amazon is the latest tech giant to get into the AI agent game — releasing Nova Act, a browser-controlling agent that Amazon says outperforms similar offerings from OpenAI and Anthropic.

With Nova Act set to reach millions of users via the upcoming Alexa+ revamp, could this be one of the first mainstream moments for real-world agentic adoption?

In today’s AI rundown:

  • Amazon’s Nova Act AI browser agent

  • Runway’s new Gen-4 video model

  • Place your products into any scene

  • AI turns brain signals into instant speech

  • 4 new AI tools & 4 job opportunities

LATEST DEVELOPMENTS

AMAZON

Image source: Amazon AGI Labs

The Rundown: Amazon AGI Labs just unveiled Nova Act, an AI agent system that can control web browsers to perform tasks independently, alongside a developer SDK that enables the creation of agents capable of completing multi-step tasks across the web.

The details:

  • Amazon reports that Nova Act outperforms competitors like Claude 3.7 Sonnet and OpenAI’s Computer-Using Agent on reliability benchmarks across browser tasks.

  • The SDK allows devs to build agents for browser actions like filling forms, navigating websites, and managing calendars without constant supervision.

  • The tech will power key features in Amazon's upcoming Alexa+ upgrade, potentially bringing AI agents to millions of existing Alexa users.

  • Nova Act was developed by Amazon's SF-based AGI Lab, led by former OpenAI researchers David Luan and Pieter Abbeel, who joined the company last year.

Why it matters: Amazon hasn’t been the first name that comes to mind in AI, but its massive Alexa user base could make it one of the first to bring agents to mainstream consumer applications. With current agents still error-prone, Nova Act's real-world performance could make or break initial public trust in autonomous AI assistants.

TOGETHER WITH UNSTRUCTURED

The Rundown: Connecting LLMs to real-world apps just got easier — and now it’s time to make your data just as accessible. In this live session, Unstructured shows how to supercharge MCP with data from anywhere.

Join to learn:

  • The key components of an MCP server and how to build and deploy one

  • How MCP standardizes data access and enhances interoperability between AI tools

  • How to integrate MCP with the Unstructured API for automated document processing

RUNWAY

Image source: Runway

The Rundown: Runway just introduced Gen-4, a new AI model that brings increased consistency and control to video generations — with enhancements designed to be incorporated into professional cinematic workflows.

The details:

  • Gen-4 shows strong consistency in characters, objects, and locations throughout video sequences, with improved physics and scene dynamics.

  • The model can generate detailed 5-10 second videos at 1080p resolution, with features like ‘coverage’ for scene creation and consistent object placement.

  • Runway describes the tech as "GVFX" (Generative Visual Effects), positioning it as a new production workflow for filmmakers and content creators.

  • Early adopters include major entertainment companies, with the tech being used in projects like Amazon productions and Madonna's concert visuals.

Why it matters: AI video is making the same leap in quality and control that AI images went through — and this next generation of models will go a long way toward taking these tools from unreliable novelties to capabilities that can be readily incorporated into workflows for professional films, ads, and more.

AI TRAINING

The Rundown: In this tutorial, you will learn how to use Google Gemini's image editing capabilities to quickly insert your products into any scene with just a product image and simple text prompts.

Step-by-step:

  1. Head over to Google AI Studio, select the Image Generation model, upload your base scene, and type "Output this exact image" to establish the scene.

  2. Upload your product image that you want to place in the scene.

  3. Write a specific placement instruction like "Add this product to the table in the previous image."

  4. Save the creations and use Google Veo 2 video generator to transform your images into smooth product videos.

Pro tip: You can create a series of product placements showing different angles and uses before converting to video for more engaging content.

PRESENTED BY SANA

The Rundown: Sana’s AI agent platform transforms your organization's collective knowledge into powerful AI assistants that understand your business context and automate complex workflows across departments.

With Sana, you can:

  • Create specialized AI agents for any team or role without writing code

  • Bring all your AI capabilities together under a single, intuitive interface

  • Seamlessly integrate into existing tools and data sources with enterprise-ready security

AI RESEARCH

Image source: UC Berkeley

The Rundown: Researchers at UC Berkeley and UCSF developed an AI that can transform brain signals into speech with only a one-second delay — a breakthrough in brain-computer interfaces and a major improvement over previous systems.

The details:

  • Signals are decoded from the brain's motor cortex, converting intended speech into words almost instantly compared to the 8-second delay of earlier systems.

  • The AI model can then generate speech using the patient's pre-injury voice recordings, creating more personalized and natural-sounding output.

  • The system also successfully handled words outside its training data, showing it learned fundamental speech patterns rather than just memorizing responses.

  • The approach is compatible with various brain-sensing methods, showing versatility beyond one specific hardware approach.

Why it matters: A whole new world is coming for patients who've lost the ability to speak due to conditions like ALS, stroke, or severe paralysis. By solving the latency problem, this tech could dramatically improve patients' quality of life, restoring near-natural communication in a way previously thought impossible.

QUICK HITS

  • 🤖 Gemini 2.5 Pro Exp - Google’s No. 1 ranked LLM, now available to free users

  • 🗣️ ElevenLabs RAG - Equip your voice agent with large knowledge bases

  • 🎥 Higgsfield DoP - AI video model with camera effects, motion, and control

  • 👨‍💻 HeroUI Chat - Turn prompts or screenshots into production-ready UIs

  • 🅿️ Metropolis - Senior Operations Manager

  • 🧑‍💻 Waymo - Software Engineer, Analysis Infrastructure

  • 🛡️ C3 AI - Site Reliability Engineer

  • 🤝 Faculty - Senior People Partner

OpenAI raised $40B from SoftBank and others at a $300B post-money valuation — marking the biggest private funding round in history.

Sam Altman announced that OpenAI will release its first open-weights model since GPT-2 in the coming months, with pre-release developer events to help make the model truly useful.

Sam Altman also shared that the company added 1M users in an hour due to 4o’s viral image capabilities, surpassing the growth during ChatGPT’s initial launch.

Manus introduced a new beta membership program and mobile app for its viral AI agent platform, with subscription plans at $39 or $199 per month with varying usage limits.

Luma Labs released Camera Motion Concepts for its Ray2 video model, enabling users to control camera movements through basic natural language commands.

Apple pushed its iOS 18.4 update, bringing Apple Intelligence features to European iPhone users — alongside visionOS 2.4 with AI smarts for the Vision Pro.

Alphabet’s AI drug discovery spinoff Isomorphic Labs raised $600M in a funding round led by OpenAI investor Thrive Capital.

Zhipu AI launched "AutoGLM Rumination," a free AI agent capable of deep research and autonomous task execution — increasing China's AI agent competition.

COMMUNITY

Join our next special workshop today at 3 PM EST to learn how to automate complex AI workflows with Max Brodeur-Urbas, Founder and CEO of Gumloop.

RSVP here. Not a member? Join The Rundown University on a 14-day free trial.

We’ll always keep this newsletter 100% free. To support our work, consider sharing The Rundown with your friends, and we’ll send you more free goodies.

That's it for today!

Before you go, we’d love to know what you thought of today's newsletter to help us improve The Rundown experience for you.


See you soon,

Rowan, Joey, Zach, Alvaro, and Jason—The Rundown’s editorial team
