Claude (finally) gets a voice
PLUS: Patients control AI and robotics with thought
Good morning, AI enthusiasts. The last major AI holdout just officially joined the voice movement, with Anthropic finally giving its assistant the ability to speak.
As usual with Anthropic, it’s better late than never, and with the rollout of shiny new models and now a brand-new voice mode, the AI giant is shipping once again.
In today’s AI rundown:
Anthropic’s new Voice Mode for Claude
Synthesia co-founder’s 3D world AI startup
Automate project meeting documentation
Study: AI learns reasoning through self-confidence
4 new AI tools & 4 job opportunities
LATEST DEVELOPMENTS
ANTHROPIC
🗣️ Anthropic’s new Voice Mode for Claude

Image source: Anthropic
The Rundown: Anthropic just announced a new voice mode for its Claude mobile apps, becoming one of the last major AI labs to let users have natural spoken conversations with its AI assistant.
The details:
The beta feature is set to arrive for English-speaking users in the coming weeks and will run on Claude's latest Sonnet 4 model.
Users can flow naturally between speaking and typing, with five voice personalities available and real-time transcription displayed during chats.
Voice mode also integrates with Google Workspace for paid subscribers, allowing Claude to access calendars, docs, and Gmail with voice commands.
Free users receive 20-30 voice messages a month, with paid tiers getting “significantly higher” usage limits.
Why it matters: With all the major labs now offering voice modes, the competition shifts to execution, where latency, integrations, and underlying model quality all shape the user experience. The capabilities are also a jarring contrast with older-generation voice assistants like Siri, showing how far behind they truly are.
TOGETHER WITH POSTMAN
🚀 Skip the setup, ship the agent
The Rundown: Postman’s Agent Generator delivers complete turnkey infrastructure with zero server setup, enabling developers to build and deploy AI agents instantly without friction.
With Agent Generator, you can:
Instantly spin up agent workflows
Work with OpenAI, LangChain & more
Test, debug, and deploy—all in Postman
SPAITIAL
🌐 Synthesia co-founder’s 3D world AI startup

Image source: SpAItial
The Rundown: Synthesia co-founder Matthias Niessner just unveiled SpAItial, a new startup aimed at creating AI systems capable of generating interactive 3D environments from text and images.
The details:
The company is building Spatial Foundation Models (SFMs) that understand 3D space natively and can grasp geometry, physics, and material properties.
SpAItial's founding team includes former leaders from Synthesia, Google, and Meta, bringing expertise in 3D AI and neural rendering technologies.
Early demos generated photorealistic 3D rooms from simple text prompts, with applications spanning gaming, construction, VR, and robotics.
Why it matters: While AI has mastered generating 2D images and videos, creating coherent, spatially aware 3D worlds remains a challenge. This new breed of models could enable anyone to create complex virtual environments with just a few words — tackling what many consider to be the next frontier in AI.
AI TRAINING
📊 Automate project meeting documentation

The Rundown: In this tutorial, you will learn how to create an automated system with Zapier Agents that can turn meeting recordings into transcripts, summaries, and actionable task lists in Google Docs.
Step-by-step:
Visit Zapier Agents and create a “New Agent”
Configure your agent to trigger when new audio files are uploaded to a specified folder in Google Drive
Add three essential tools: ChatGPT to transcribe the audio, ChatGPT again to summarize and extract action points, and Google Docs to compile everything into a single document
Test your setup with a sample recording and activate your agent
Pro tip: At the start of each meeting, ask participants to clearly state their names before speaking and explicitly mention action item assignments to help the AI more accurately attribute tasks to team members.
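For readers who want to see the shape of this workflow outside a no-code tool, here is a minimal sketch of the same three stages (transcribe, summarize and extract action points, compile one document). Everything named here is a hypothetical stand-in: the functions, the sample transcript, and the naive "I will" rule are assumptions for illustration, not Zapier, ChatGPT, or Google Docs APIs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MeetingNotes:
    transcript: str
    summary: str
    action_items: List[str]


def process_recording(
    audio_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
    extract_actions: Callable[[str], List[str]],
) -> MeetingNotes:
    """Run the three stages in order; each stage is a pluggable callable."""
    transcript = transcribe(audio_path)
    return MeetingNotes(
        transcript=transcript,
        summary=summarize(transcript),
        action_items=extract_actions(transcript),
    )


def render_document(title: str, notes: MeetingNotes) -> str:
    """Compile everything into a single plain-text document."""
    lines = [title, "", "Summary", notes.summary, "", "Action items"]
    lines += [f"- {item}" for item in notes.action_items]
    lines += ["", "Transcript", notes.transcript]
    return "\n".join(lines)


# Hypothetical stand-ins for the real tools, so the sketch runs offline.
def fake_transcribe(audio_path: str) -> str:
    return "Ana: I will send the budget by Friday.\nBen: Sounds good."


def fake_summarize(transcript: str) -> str:
    return "Budget review scheduled."


def fake_extract_actions(transcript: str) -> List[str]:
    # Naive rule: a line where the speaker says "I will ..." becomes a task
    # attributed to that speaker (this is why stated names help).
    items = []
    for line in transcript.splitlines():
        speaker, _, said = line.partition(": ")
        if "I will" in said:
            task = said.replace("I will ", "").rstrip(".")
            items.append(f"{speaker}: {task}")
    return items


notes = process_recording(
    "meeting.mp3", fake_transcribe, fake_summarize, fake_extract_actions
)
print(render_document("Weekly sync", notes))
```

Keeping each stage as a separate callable mirrors the agent's three tools: any one of them can be swapped without touching the others.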
PRESENTED BY ENCORD
📊 One platform for all your AI data needs
The Rundown: Encord is a consolidated platform for multimodal AI data management, curation, and annotation, enabling teams to accelerate model iteration cycles with balanced, accurately labeled datasets.
Leading AI teams use Encord’s fully customizable multimodal interface to:
Evaluate GenAI outputs across video, audio, and text in record time
Create VLA datasets with synchronized video, instruction, and trajectory data
Unite PDF, image, video, audio, and DICOM labeling in a single interface
AI RESEARCH
☺️ Study: AI learns reasoning through self-confidence

Image source: UC Berkeley and Yale
The Rundown: Researchers from UC Berkeley and Yale introduced INTUITOR, an AI training method that enables language models to improve their reasoning using internal confidence signals — eliminating the need for correct answers or external feedback.
The details:
INTUITOR measures how confident an AI feels about each word it generates, using this "gut feeling" as a guide for learning.
Instead of needing correct answers to learn (like traditional AI training), the system rewards the AI when it produces responses it feels confident about.
On math problems, the method performed as well as conventional training, and it showed even better results on programming tasks.
The AIs also began showing human-like reasoning behaviors — breaking down complex problems, planning, and explaining their thinking step-by-step.
Why it matters: Just as intuition and confidence play a large role in human learning, this study suggests AI can improve using a similar internal signal. This self-directed approach could be especially valuable for tasks where there's no clear "right answer" or where human expertise is limited, allowing AI to venture into unexplored knowledge areas.
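To make the "gut feeling" idea concrete, here is a small sketch of scoring a response by per-token confidence and using that score as the reward. It assumes confidence is the average KL divergence from a uniform distribution to the model's next-token distribution. The summary above does not specify the measure, so that choice, the function names, and the toy probabilities are all illustrative assumptions.

```python
import math
from typing import List

EPS = 1e-12  # guards against log(0) for zero-probability tokens


def token_confidence(probs: List[float]) -> float:
    """KL(U || p) for one next-token distribution.

    Zero when the model is maximally unsure (uniform), larger when the
    probability mass is concentrated on a few tokens.
    """
    u = 1.0 / len(probs)
    return sum(u * math.log(u / max(p, EPS)) for p in probs)


def sequence_confidence(per_token_probs: List[List[float]]) -> float:
    """Average token confidence over a whole generated response."""
    total = sum(token_confidence(p) for p in per_token_probs)
    return total / len(per_token_probs)


def confidence_rewards(candidates: List[List[List[float]]]) -> List[float]:
    """Reward each candidate response by its own confidence score.

    No reference answer is consulted anywhere in this function.
    """
    return [sequence_confidence(c) for c in candidates]


# Toy distributions over a four-token vocabulary (made-up numbers).
confident = [[0.97, 0.01, 0.01, 0.01], [0.90, 0.05, 0.03, 0.02]]
unsure = [[0.25, 0.25, 0.25, 0.25], [0.30, 0.30, 0.20, 0.20]]

rewards = confidence_rewards([confident, unsure])
best = max(range(len(rewards)), key=rewards.__getitem__)
print(rewards, "-> reinforce candidate", best)
```

In a real training loop these rewards would feed a reinforcement learning update; the sketch stops at the scoring step.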
QUICK HITS
🛠️ Trending AI Tools
⚙️ Claude Code - Anthropic’s agentic coding tool, now generally available
🧠 Nemotron AceReason - Nvidia’s math and code reasoning model
🦙 Llama-Factory - Fine-tune and train open-source LLMs with no code
▶️ OpusClip Thumbnail - One-click AI thumbnail generator
💼 AI Job Opportunities
🎧 Meta - Software Engineering Manager, Audio
🛠️ Palantir Technologies - Systems Engineer
🕴️ OpenAI - Executive Recruiter
🤝 Horizon3 - Partner Success Manager
📰 Everything else in AI today
Mistral launched Agents API for enterprise apps, introducing connectors for coding, web search, and image generation alongside memory and multi-agent orchestration.
Meta is reportedly restructuring its AI organization into two distinct teams focused on AI products and AGI foundations, aiming to accelerate the company’s development.
Anthropic’s Claude Sonnet 4 model achieved a new SOTA on the ARC-AGI-2 benchmark, surpassing o3 for the top spot on the leaderboard.
Google DeepMind teased SignGemma, an upcoming model capable of translating sign language into text.
Salesforce acquired cloud data management firm Informatica for $8B, strengthening the infrastructure powering its agent-based products and platforms.
The Browser Company revealed that it will no longer be working on its Arc browser, instead fully pivoting to developing its AI-first Dia browser as a separate product.
COMMUNITY
🎥 Join our next live workshop
Join our next workshop this Friday, May 30th, at 4 PM ET with Dr. Alvaro Cintas, The Rundown’s AI professor. By the end of the workshop, you’ll be able to confidently use AI coding agents to improve your development workflow.
RSVP here. Not a member? Join The Rundown University on a 14-day free trial.
🤝 Share The Rundown, get rewards
We’ll always keep this newsletter 100% free. To support our work, consider sharing The Rundown with your friends, and we’ll send you more free goodies.
See you soon,
Rowan, Joey, Zach, Alvaro, and Jason—The Rundown’s editorial team