
Claude (finally) gets a voice

PLUS: Patients control AI and robotics with thought

Zach Mink

May 28, 2025


Good morning, AI enthusiasts. The last major AI holdout just officially joined the voice movement, with Anthropic finally giving its assistant the ability to speak.

As usual with Anthropic, it’s better late than never. With shiny new models rolling out and now a brand-new voice mode, the AI giant is shipping once again.


In today’s AI rundown:

  • Anthropic’s new Voice Mode for Claude

  • Synthesia co-founder’s 3D world AI startup

  • Automate project meeting documentation

  • Study: AI learns reasoning through self-confidence

  • 4 new AI tools & 4 job opportunities

LATEST DEVELOPMENTS

ANTHROPIC

🗣️ Anthropic’s new Voice Mode for Claude

Image source: Anthropic

The Rundown: Anthropic just announced a new voice mode for its Claude mobile apps, becoming one of the last major AI labs to let users hold natural spoken conversations with its AI assistant.

The details:

  • The beta feature is set to arrive for English-speaking users in the coming weeks and will run on Claude's latest Sonnet 4 model.

  • Users can flow naturally between speaking and typing, with five voice personalities available and real-time transcription displayed during chats.

  • Voice mode also integrates with Google Workspace for paid subscribers, allowing Claude to access calendars, docs, and Gmail with voice commands.

  • Free users receive 20-30 voice messages a month, with paid tiers getting “significantly higher” usage limits.

Why it matters: With all the major labs now offering voice modes, the competition shifts to execution, with latency, integrations, and underlying model quality all shaping the user experience. The capabilities are also a jarring contrast with older-generation voice assistants like Siri, showing how far behind Siri truly is.

TOGETHER WITH POSTMAN

🚀 Skip the setup, ship the agent

The Rundown: Postman’s Agent Generator delivers complete turnkey infrastructure with zero server setup, enabling developers to build and deploy AI agents instantly without friction.

With Agent Generator, you can:

  • Instantly spin up agent workflows

  • Work with OpenAI, LangChain & more

  • Test, debug, and deploy—all in Postman

Skip the setup and start building today.

SPAITIAL

🌐 Synthesia co-founder’s 3D world AI startup

Image source: SpAItial

The Rundown: Synthesia co-founder Matthias Niessner just unveiled SpAItial, a new startup aimed at creating AI systems capable of generating interactive 3D environments from text and images.

The details:

  • The company is building Spatial Foundation Models (SFMs) that understand 3D space natively and can grasp geometry, physics, and material properties.

  • SpAItial's founding team includes former leaders from Synthesia, Google, and Meta, bringing expertise in 3D AI and neural rendering technologies.

  • Early demos generated photorealistic 3D rooms from simple text prompts, with applications spanning gaming, construction, VR, and robotics.

Why it matters: While AI has mastered generating 2D images and videos, creating coherent, spatially aware 3D worlds remains a challenge. This new breed of models could enable anyone to create complex virtual environments with just a few words — tackling what many consider to be the next frontier in AI.

AI TRAINING

📊 Automate project meeting documentation

The Rundown: In this tutorial, you will learn how to create an automated system with Zapier Agents that can turn meeting recordings into transcripts, summaries, and actionable task lists in Google Docs.

Step-by-step:

  1. Visit Zapier Agents and create a “New Agent”

  2. Configure your agent to trigger when new audio files are uploaded to a specified folder in Google Drive

  3. Add three essential tools: ChatGPT to transcribe the audio, ChatGPT again to summarize and extract action points, and Google Docs to compile everything into a single document

  4. Test your setup with a sample recording and activate your agent
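
For readers who think in code, the flow above can be sketched roughly as follows. This is a minimal illustration, not how Zapier Agents works internally: `process_recording`, `render_document`, and the three stub functions are hypothetical names standing in for the transcription, summarization, and Google Docs steps.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MeetingNotes:
    transcript: str
    summary: str
    action_items: List[str]


def process_recording(
    audio_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
    extract_actions: Callable[[str], List[str]],
) -> MeetingNotes:
    """Run the three stages in order: transcript, summary, action items."""
    transcript = transcribe(audio_path)
    return MeetingNotes(
        transcript=transcript,
        summary=summarize(transcript),
        action_items=extract_actions(transcript),
    )


def render_document(notes: MeetingNotes) -> str:
    """Compile everything into one plain-text document."""
    lines = ["Summary", notes.summary, "", "Action items"]
    lines += [f"- {item}" for item in notes.action_items]
    lines += ["", "Transcript", notes.transcript]
    return "\n".join(lines)


# Hypothetical stand-ins for the AI steps, so the sketch runs offline.
def stub_transcribe(audio_path: str) -> str:
    return (
        "Ana: Budget is approved.\n"
        "Ben: Action item for Ana: send the revised timeline."
    )


def stub_summarize(transcript: str) -> str:
    return "Budget approved; timeline to be revised."


def stub_extract_actions(transcript: str) -> List[str]:
    # Keep what was said on any line that explicitly mentions an action item.
    return [
        line.split(":", 1)[1].strip()
        for line in transcript.splitlines()
        if "action item" in line.lower()
    ]


notes = process_recording(
    "meeting.mp3", stub_transcribe, stub_summarize, stub_extract_actions
)
document = render_document(notes)
```

The stub extractor only catches lines where someone literally says "action item", which is why stating assignments out loud makes attribution easier.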

Pro tip: At the start of each meeting, ask participants to clearly state their names before speaking and explicitly mention action item assignments to help the AI more accurately attribute tasks to team members.

PRESENTED BY ENCORD

📊 One platform for all your AI data needs

The Rundown: Encord is a consolidated platform for multimodal AI data management, curation, and annotation, enabling teams to accelerate model iteration cycles with balanced, accurately labeled datasets.

Leading AI teams use Encord’s fully customizable multimodal interface to:

  • Evaluate GenAI outputs across video, audio, and text in record time

  • Create VLA datasets with synchronized video, instruction, and trajectory data

  • Unite PDF, image, video, audio, and DICOM labeling in a single interface

Try Encord today.

AI RESEARCH

☺️ Study: AI learns reasoning through self-confidence

Image source: UC Berkeley and Yale

The Rundown: Researchers from UC Berkeley and Yale introduced INTUITOR, an AI training method that enables language models to improve their reasoning using internal confidence signals — eliminating the need for correct answers or external feedback.

The details:

  • INTUITOR measures how confident an AI feels about each word it generates, using this "gut feeling" as a guide for learning.

  • Instead of needing correct answers to learn (like traditional AI training), the system rewards the AI when it produces responses it feels confident about.

  • On math problems, the method performed just as well as conventional training, and it outperformed conventional training on programming tasks.

  • The AIs also began showing human-like reasoning behaviors — breaking down complex problems, planning, and explaining their thinking step-by-step.
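
To make the "gut feeling" idea concrete, here is a minimal sketch of a confidence-based reward. It rests on assumptions not stated above: that confidence is scored as the average KL divergence between a uniform distribution and the model's next-token distribution, and that scores are normalized within a group of sampled answers. The function names and toy probabilities are illustrative only.

```python
import math


def self_certainty(token_distributions):
    """Mean KL(uniform || p) across generated tokens.

    Assumption: each entry is the model's full probability distribution
    over the vocabulary at one generated position. Peaked (confident)
    distributions score higher; a uniform one scores zero.
    """
    eps = 1e-12  # guards against log of a zero probability
    total = 0.0
    for probs in token_distributions:
        uniform = 1.0 / len(probs)
        total += sum(uniform * math.log(uniform / max(p, eps)) for p in probs)
    return total / len(token_distributions)


def group_advantages(scores):
    """Normalize scores within a group of sampled answers to one prompt."""
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return [(s - mean) / (std or 1.0) for s in scores]


# Toy example: two sampled answers over a four-token vocabulary.
confident = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]
hesitant = [[0.25, 0.25, 0.25, 0.25], [0.30, 0.30, 0.20, 0.20]]

scores = [self_certainty(confident), self_certainty(hesitant)]
advantages = group_advantages(scores)
```

In this toy setup the confident answer gets a positive advantage and the hesitant one a negative advantage, with no correct answer consulted at any point.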

Why it matters: Just as intuition and confidence play a large role in human learning, this study suggests AI can learn from a similar internal signal. This self-directed approach could be especially valuable for tasks where there's no clear "right answer" or where human expertise is limited, allowing AI to venture into unexplored knowledge areas.

QUICK HITS

📰 Everything else in AI today

Mistral launched Agents API for enterprise apps, introducing connectors for coding, web search, and image generation alongside memory and multi-agent orchestration.

Meta is reportedly restructuring its AI organization into two distinct teams focused on AI products and AGI foundations, aiming to accelerate the company’s development.

Anthropic’s Claude 4 Sonnet model achieved a new SOTA on the ARC-AGI-2 benchmark, surpassing o3 for the top spot on the leaderboard.

Google DeepMind teased SignGemma, an upcoming model capable of translating sign language into text.

Salesforce acquired cloud data management firm Informatica for $8B, strengthening the infrastructure powering its agent-based products and platforms.

The Browser Company revealed that it will no longer be working on its Arc browser, instead fully pivoting to developing its AI-first Dia browser as a separate product.

COMMUNITY

🎥 Join our next live workshop

Join our next workshop this Friday, May 30th, at 4 PM EST with Dr. Alvaro Cintas, The Rundown’s AI professor. By the end of the workshop, you’ll confidently be able to use AI coding agents to improve your development workflow.

RSVP here. Not a member? Join The Rundown University on a 14-day free trial.

🤝 Share The Rundown, get rewards

We’ll always keep this newsletter 100% free. To support our work, consider sharing The Rundown with your friends, and we’ll send you more free goodies.

See you soon,

Rowan, Joey, Zach, Alvaro, and Jason—The Rundown’s editorial team
