OpenAI's Voice Mode is finally here!

PLUS: Google's new Gemini models

Welcome, AI enthusiasts.

After a multi-month wait, OpenAI finally announced that Advanced Voice Mode is rolling out to all ChatGPT Plus and Team subscribers this week (outside of the EU).

But with no current o1 integration, image upload option, or live video, will it live up to the hype? Let’s get into it…

In today’s AI rundown:

  • OpenAI rolls out Advanced Voice Mode

  • Google releases production-ready models

  • Customize images fast with PuLID-FLUX

  • James Cameron joins Stability AI’s board

  • 6 new AI tools & 4 new AI jobs

  • More AI & tech news

Read time: 4 minutes

LATEST DEVELOPMENTS

OPENAI

Image source: OpenAI

The Rundown: OpenAI is finally rolling out an enhanced Advanced Voice Mode (AVM) to all ChatGPT Plus and Team subscribers this week, featuring new voices and improved functionality to make AI interactions feel more natural and personalized.

The details:

  • The initial rollout for OpenAI’s new Advanced Voice Mode started in July, but it only ever reached a select few ChatGPT users.

  • During the delay, OpenAI updated its AVM to integrate Custom Instructions and Memory, allowing for more personalized interactions and conversation recall.

  • OpenAI also improved AVM’s ability to understand accents and claims that conversations are now smoother and faster. It also added five new nature-inspired voices (and removed the “Sky” voice that sounded like Scarlett Johansson).

  • AVM will not yet be available in several regions, including the EU, the UK, Switzerland, Iceland, Norway, and Liechtenstein.

Why it matters: With OpenAI CEO Sam Altman writing about AI agents and superintelligence, ChatGPT Advanced Voice Mode feels more relevant than ever. If we’re going to interact with AI every day, it has to sound and feel human, which is exactly what AVM is attempting to accomplish.

Editor’s note: If you still don’t have access to Advanced Voice Mode on your ChatGPT app, try uninstalling and reinstalling the app.

TOGETHER WITH HUBSPOT

The Rundown: Maximize your workplace potential with HubSpot’s ChatGPT guide – your roadmap to AI-driven success for curious minds.

With this guide, you’ll learn to:

  • Streamline daily tasks with AI, from inbox management to product planning

  • Apply ChatGPT effectively and ethically in your work environment

  • Stay ahead of the curve by integrating AI tools into your skill set

Download your free guide now and start transforming your workday with ChatGPT.

GOOGLE

Image source: Google

The Rundown: Google just announced significant updates to its Gemini AI models, including performance improvements, cost reductions, and increased accessibility for developers.

The details:

  • Two new production-ready models came out today: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, offering improved quality across various tasks, including a 20% boost in math-related benchmarks.

  • Pricing for Gemini 1.5 Pro has been reduced by over 50% for both input and output on prompts under 128K tokens, while rate limits have been increased significantly.

  • The models boast 2x faster output and 3x lower latency compared to previous versions, with improvements in long context understanding and vision capabilities.

  • Google also updated its default filter settings, giving developers more control over model configuration for their specific use cases.
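
For a rough sense of what a price cut of this size could mean for a monthly bill, here is a minimal Python sketch. The per-million-token prices and token volumes are made-up placeholders, not Google's actual rates, and the function names are hypothetical. Only the "over 50%" figure comes from the announcement, and the sketch assumes every prompt stays under the 128K-token threshold where the reduction applies.

```python
# Placeholder prices for illustration only; they are NOT Google's real rates.
OLD_INPUT_PRICE = 4.00    # USD per 1M input tokens (assumed)
OLD_OUTPUT_PRICE = 12.00  # USD per 1M output tokens (assumed)
PRICE_CUT = 0.50          # "over 50%" per the announcement; 0.50 is the conservative floor


def monthly_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Return the cost in USD, given prices quoted per 1M tokens."""
    return ((input_tokens / 1_000_000) * input_price
            + (output_tokens / 1_000_000) * output_price)


def discounted(price: float, cut: float = PRICE_CUT) -> float:
    """Apply a fractional price cut (0.50 means 50% cheaper)."""
    return price * (1 - cut)


if __name__ == "__main__":
    # Assumed monthly volume: 200M input tokens and 50M output tokens.
    inp, out = 200_000_000, 50_000_000
    before = monthly_cost(inp, out, OLD_INPUT_PRICE, OLD_OUTPUT_PRICE)
    after = monthly_cost(inp, out,
                         discounted(OLD_INPUT_PRICE),
                         discounted(OLD_OUTPUT_PRICE))
    print(f"before: ${before:,.2f}  after: ${after:,.2f}")
```

Swap in the current published prices and your own token counts to get a realistic estimate.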

Why it matters: Google is iterating quickly and pushing the boundaries of affordability for developers building with AI. While this isn’t Gemini 2, it is a significant upgrade over the experimental models and will help builders create faster, smarter, cheaper applications.

AI TRAINING

The Rundown: Hugging Face’s new PuLID-Flux space offers a tuning-free solution for quick image customization with your own likeness using just one reference photo.

Step-by-step:

  1. Visit the PuLID-Flux Hugging Face space (also available on Replicate)

  2. Upload your reference image in the "ID Image" section

  3. Play with the different parameters it offers and write a descriptive prompt for your desired output

  4. Click "Generate" and refine as needed

Pro tip: Adjust the "timestep to start inserting ID" parameter to balance fidelity and creativity.

PRESENTED BY SECTION

The Rundown: Section is hosting a free, virtual, full-day conference focused on getting and proving AI ROI.

Listen in to:

  • Candid sessions led by AI leaders, experts, and ethicists

  • Real AI success stories and case studies

  • AI ROI from every perspective, from CFOs to VCs

RSVP now for this free event on November 14, 2024.

STABILITY AI

Image source: Midjourney

The Rundown: James Cameron, the acclaimed director of Titanic, Avatar, and The Terminator, recently joined the board of directors at Stability AI, the company behind the popular Stable Diffusion text-to-image AI model.

The details:

  • Cameron, known for pushing technological boundaries in filmmaking, sees the convergence of generative AI and CGI as “the next wave” in visual media creation.

  • Stability AI’s CEO, Prem Akkaraju, formerly led visual effects company WETA Digital, highlighting the firm’s focus on creative applications of AI.

  • The move comes as Hollywood grapples with AI’s potential, with some studios embracing the technology while others express concerns over content rights.

Why it matters: Just days after Lionsgate teamed up with AI startup Runway to create a custom video generation model, this move by one of Hollywood’s biggest directors could signal a significant shift in how influential filmmakers are thinking about navigating AI.

NEW TOOLS & JOBS

  • 🎨 Adobe GenStudio - Helps marketing teams measure on-brand content

  • 🔎 FactBot by Snopes - Fact-checking for urban legends and misinformation

  • 💸 JustPaid - Automate invoice follow-ups and payment tracking

  • 💻 ell - A lightweight prompt engineering framework for language models

  • 🧪 Pathway - Helps product teams test UX solutions and gather insights

  • 🎥 Tubit AI - AI that summarizes YouTube videos for a deeper understanding

  • 👥 People.ai - HR Coordinator

  • 🧠 UiPath - Chief of Staff / Technical Advisor

  • Tempus - Quality Assurance Associate

  • 🛠️ DeepL - Engineering Manager

QUICK HITS

Warner Bros. Discovery adopted Google Cloud’s AI for caption generation, aiming to cut production time and costs for unscripted programming.

Intel launched Xeon 6 processors and Gaudi 3 AI accelerators, doubling performance for AI workloads and offering improved price and performance compared to Nvidia’s H100.

OpenAI increased API access for o1 models, adding tier 4 to the list of authorized users at 100 requests per minute and upping tier 5 users to 1000 requests per minute.

Suno AI announced a new cropping feature for AI-generated songs, allowing Pro and Premier users to adjust the start and end of their creations.

Duolingo introduced AI-powered Adventures mini-games and a Video Call feature to enhance language learning through immersive, practical experiences for its users.

Apple unveiled its plan to roll out Siri’s major AI-powered updates gradually, with the most significant enhancements expected in iOS 18.3, likely launching in January 2025.
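
For developers affected by the o1 rate limits mentioned above, here is a minimal Python sketch of client-side pacing. It simply spaces requests evenly so a script stays under a requests-per-minute cap. The `Pacer` class and the tier labels are hypothetical names made up for illustration, and OpenAI's actual enforcement may work differently; only the 100 and 1000 requests-per-minute figures come from the item above.

```python
import time

# Requests-per-minute limits as reported above for o1 models.
# The dictionary keys are illustrative labels, not official identifiers.
O1_RPM = {"tier_4": 100, "tier_5": 1000}


def min_interval_seconds(rpm: int) -> float:
    """Smallest even gap between requests that stays within `rpm` per minute."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


class Pacer:
    """Hypothetical helper that sleeps just enough to keep calls evenly spaced."""

    def __init__(self, rpm: int, clock=time.monotonic, sleep=time.sleep):
        self.interval = min_interval_seconds(rpm)
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # time at which the next request may start

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

Call `pacer.wait()` right before each API request, for example with `pacer = Pacer(O1_RPM["tier_4"])`, which allows roughly one request every 0.6 seconds.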

THAT’S A WRAP

SPONSOR US

Get your product in front of 650k+ AI enthusiasts

Our newsletter is read by thousands of tech executives, investors, engineers, managers, and business owners around the world. Get in touch today.

FEEDBACK

How would you rate today's newsletter?

Vote below to help us improve the newsletter for you.


If you have specific feedback or anything interesting you’d like to share, please let us know by replying to this email.
