OpenAI's DevDay updates revealed

PLUS: Microsoft Copilot gets new Vision, Voice boosts

Welcome, AI enthusiasts.

OpenAI's DevDay may have skipped the spectacle this time with no live stream — but we caught the event live and secured exclusive details on new releases.

With four new major developer-focused announcements, and a private Rundown Q&A with OpenAI’s Head of Product, we’ve got a big one today. Let’s get into it…

In today’s AI rundown:

  • OpenAI makes 4 major announcements at DevDay

  • Microsoft Copilot gets voice, vision upgrade

  • Exclusive DevDay Q&A with OpenAI’s Olivier Godement

  • Extend images for free with HuggingFace

  • 5 new AI tools & 4 new AI jobs

  • More AI & tech news

Read time: 4 minutes

LATEST DEVELOPMENTS

OPENAI

Image source: Rowan Cheung @ Dev Day

The Rundown: OpenAI just held its DevDay 2024 event, unveiling a suite of new API features and improvements designed to make its AI systems more accessible, efficient, and cost-effective for developers to build with.

The details:

  • Realtime API enables speech-to-speech application building using the same model that powers Advanced Voice, with the ability to choose from six voices.

  • Model Distillation simplifies fine-tuning smaller models using outputs from larger ones, making training more accessible to developers.

  • Prompt Caching reduces costs by nearly 50% across models and speeds up responses by up to 80% when reusing recent input tokens in API calls.

  • New Vision Fine-Tuning lets developers fine-tune models with both images and text, improving performance on tasks like image recognition and analysis.
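
For a rough sense of the Prompt Caching math, here is a minimal Python sketch. It assumes the discount applies only to the reused (cached) part of the input and uses a flat 50% figure. The function name, the token counts, and the $2.50 per million tokens price are hypothetical placeholders for illustration, not published rates.

```python
def input_cost_usd(total_tokens, cached_tokens, price_per_million_usd, cached_discount=0.5):
    """Estimate input cost when part of the prompt is reused from cache.

    Assumption: only the cached tokens get the discount; the rest is billed in full.
    """
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_tokens = total_tokens - cached_tokens
    # Cached tokens count at a reduced weight, uncached tokens at full weight.
    billable_tokens = uncached_tokens + cached_tokens * (1 - cached_discount)
    return billable_tokens * price_per_million_usd / 1_000_000


# Hypothetical example: a 10,000-token prompt where 8,000 tokens were seen recently.
full = input_cost_usd(10_000, 0, 2.50)
cached = input_cost_usd(10_000, 8_000, 2.50)
print(f"no cache: ${full:.4f}, with cache: ${cached:.4f}, saved: {1 - cached / full:.0%}")
```

Under these assumptions, the saving only approaches 50% when almost the entire prompt is reused, which is why long, stable prompt prefixes benefit the most.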

Why it matters: While this year’s DevDay may have lacked the traditional hype of a typical OpenAI event, the releases are still set to have a tremendous impact. These API updates not only enable the creation of entirely new, exciting experiences but also lower the barrier to entry for builders across OpenAI’s platform.

TOGETHER WITH SYNTHFLOW

The Rundown: Synthflow’s AI-powered phone calls enable interactions that are indistinguishable from human conversations — revolutionizing the way businesses handle customer service.

With Synthflow, you can:

  • Create lifelike AI voices that speak naturally in multiple languages

  • Design custom conversation flows to handle various scenarios

  • Integrate seamlessly with your existing systems for efficient call handling

  • Scale your customer service without compromising on quality

Try Synthflow today and experience the future of customer communication.

MICROSOFT

Image source: Microsoft

The Rundown: Microsoft just announced a slew of AI upgrades coming to its Copilot assistant for Windows PCs, including new vision and voice capabilities, personalization enhancements, a re-release of the controversial Recall feature, and more.

The details:

  • Copilot Voice allows users to interact with natural speech, adding conversational and intuitive communication similar to OpenAI’s Voice Mode.

  • Copilot Vision enables the AI to understand and interact with web content a user is viewing, offering context-aware help within the Microsoft Edge browser.

  • ‘Think Deeper’ gives Copilot enhanced chain-of-thought reasoning powered by OpenAI’s o1 model.

  • Microsoft’s ‘Recall’ feature is set to return, requiring an opt-in with upgraded privacy and security measures.

  • Microsoft AI CEO Mustafa Suleyman highlighted Copilot’s ability to ultimately ‘act on your behalf’ and adapt to users’ personal preferences and needs.

Why it matters: Microsoft is bringing the heat with these major Copilot upgrades, levelling up the assistant to align with the latest cutting-edge AI features across the industry — while bringing users one step closer to a truly agentic experience.

OPENAI DEVDAY

Image source: Rowan Cheung / The Rundown

The Rundown: We caught up with OpenAI Head of Product Olivier Godement after he led the main keynote at Tuesday’s DevDay event for some exclusive insights on the new Realtime API (Godement’s responses are summarized for brevity).

On the Realtime API: Godement says that “Until right now, voice has been a second activity,” and that the Realtime API is going to make AI significantly more accessible because many people in the real world prefer speaking to reading or texting.

On real-world use cases: Godement believes the Realtime API will have a “no-brainer” impact on customer support, education, and coaching. He also believes there will be many ‘non-obvious’ use cases that are hard to predict now.

On pricing: Converted to per-minute rates, audio input costs ~6 cents per minute and audio output ~24 cents per minute. While prices are currently high, Godement confirmed that there are “huge pricing decreases on the roadmap.”

On the Twitter misinterpretation: Godement also addressed a misreading of the pricing that spread after the announcement: users estimating the cost per hour multiplied the per-minute rates as if input and output were constant. However, whenever humans talk, there is silence, so it’s not a constant flow. The model won’t charge you for silence.
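
To see how much the silence point changes the math, here is a minimal Python sketch using the per-minute rates quoted above. The function name and the talk-time shares are hypothetical, and the sketch assumes billing scales only with the minutes each side actually spends speaking, which simplifies how the API really meters usage.

```python
# Per-minute rates from the Q&A (USD). Everything else below is illustrative.
INPUT_RATE_PER_MIN = 0.06   # audio input (the user speaking)
OUTPUT_RATE_PER_MIN = 0.24  # audio output (the model speaking)


def estimate_call_cost(call_minutes, user_talk_share, model_talk_share):
    """Estimate the cost of a voice session where only spoken audio is billed.

    user_talk_share / model_talk_share: hypothetical fractions (0 to 1) of the
    session during which the user or the model is actually speaking.
    """
    for share in (user_talk_share, model_talk_share):
        if not 0 <= share <= 1:
            raise ValueError("talk shares must be between 0 and 1")
    input_minutes = call_minutes * user_talk_share
    output_minutes = call_minutes * model_talk_share
    return input_minutes * INPUT_RATE_PER_MIN + output_minutes * OUTPUT_RATE_PER_MIN


# The naive reading: both sides produce audio for the full hour.
naive = estimate_call_cost(60, 1.0, 1.0)
# A made-up, more conversational split: each side talks 30% of the time.
conversational = estimate_call_cost(60, 0.3, 0.3)
print(f"naive: ${naive:.2f}/hour, conversational: ${conversational:.2f}/hour")
```

With that made-up 30/30 split, the hourly estimate drops from about $18 to about $5.40. The real figure depends entirely on how much each side actually talks.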

On future modalities: For now, the Realtime API only supports text and audio. However, Godement believes that image and video are the next milestones on the road to agents that can perceive the world just like a human. He also mentioned that image and video understanding specifically will “turbocharge customer support” when the model has the ability to understand pixels on a screen in real time.

PRESENTED BY INNOVATING WITH AI

The Rundown: Innovating with AI’s new program, AI Consultancy Project, equips AI enthusiasts with all the resources to capitalize on the rapidly growing AI consulting market – which is set to 8x to $54.7B by 2032.

The program offers:

  • Tools and frameworks to find clients and deliver top-notch services

  • A 6-month roadmap to build a 6-figure AI consulting business

  • Students landing their first AI client in as little as 3 days

AI TRAINING

The Rundown: Hugging Face's free AI image outpainting tool allows users to extend their images with custom aspect ratios for various use cases, such as optimizing images for any social media platform.

Step-by-step:

  1. Visit the "diffusers-image-outpaint" Hugging Face space.

  2. Upload your image to expand.

  3. Set your desired aspect ratio and alignment (e.g., 1:1, middle).

  4. Adjust advanced settings like output size and input image resize.

  5. Click "Generate" and watch AI expand your image!

NEW TOOLS & JOBS

  • 🎥 Video SDK 3.0 - Build and integrate real-time multimodal AI characters

  • 📭 Inbox Zero - An open-source, AI personal assistant for email

  • 👩🏻‍💻 Graphite - Your AI code review companion

  • 📚 Ello - An AI reading companion for children offering personalized support

  • 🗣️ VivaChat - FaceTime video chat with realistic AI personas

QUICK HITS

OpenAI founding member Durk Kingma announced that he is joining Anthropic, reuniting with several former OpenAI employees and highlighting the company’s mission of responsible AI development in his X post.

Pika Labs unveiled Pika 1.5, a new video generation model upgrade featuring enhanced effects, realistic movement, longer clip creation, and cinematic capabilities.

Anyscale unveiled major upgrades to its AI platform at Ray Summit 2024, including a GPU-native Ray architecture, RayTurbo for enhanced performance, Ray Data for unstructured data processing, and more.

U.S. AI chipmaker Cerebras officially filed for an IPO, with the Sam Altman-backed Nvidia competitor expected to be valued at $7-8B.

Meta released the open-source code and developer suite for its Segment Anything Model (SAM) 2.1, an upgraded version of its image and video segmentation tool.

Nvidia introduced NVLM 1.0, an open-source family of multimodal models that achieve SOTA performance on vision-language and text tasks.

Pinterest launched Performance+, a suite of new AI tools for advertisers that includes the ability to create background images for products and automation features for ad campaigns.

THAT’S A WRAP

That's it for today!

Before you go, we’d love to know what you thought of today's newsletter to help us improve The Rundown experience for you.

See you soon,

Rowan, Joey, Zach, and Alvaro—aka The Rundown Team
