Google tops OpenAI's math breakthrough — 9 to 1

PLUS: Build an AI secretary that plans your day

Zach Mink

May 25, 2026

Good morning, AI enthusiasts. Last week, OpenAI made headlines after announcing its AI cracked an 80-year-old mathematics problem. Turns out, it wasn’t the only one. Google DeepMind quietly did them eight (not one) better.

The company’s AlphaProof Nexus autonomously solved nine open Erdős problems — considered some of the hardest unsolved questions in math — at a cost of a few hundred dollars per problem.

In today’s AI rundown:

Google’s AI cracks nine unsolved math problems
The Rundown Roundtable: Our AI use cases
Build an AI secretary that plans your day
Claude Mythos finds 10,000+ critical vulnerabilities
4 new AI tools, community workflows, and more

LATEST DEVELOPMENTS

GOOGLE

🧮 Google’s AI cracks nine unsolved math problems

Image source: Images 2.0 / The Rundown

The Rundown: Google DeepMind’s AlphaProof Nexus, an AI system that generates machine-verified mathematical proofs, solved nine open Erdős problems, including two unsolved for 56 years, just a day after OpenAI claimed its own Erdős breakthrough.

The details:

The system paired an LLM with Lean, a proof assistant, to generate machine-verified proofs for the nine problems spanning combinatorics and graph theory.
Each problem cost a few hundred dollars to solve, and the AI also proved 44 open conjectures from the On-Line Encyclopedia of Integer Sequences (OEIS).
A simpler version of the agent matched the results but cost more, and problems requiring new mathematical constructions remained out of reach.
OpenAI’s win last week saw its AI disprove an 80-year-old Erdős conjecture — months after walking back a claim of solving 10 novel problems.

Why it matters: Google’s progress on math problems unsolved for decades shows how fast AI is moving toward original solutions, and how formal verification changes the game. The system generates proofs, verifies them in Lean, and repeats until one passes. Over time, this will help researchers make novel discoveries at machine speed.
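For readers who want to see the shape of that loop, here is a minimal Python sketch of "generate, verify, repeat." It is not AlphaProof Nexus's actual code: the function name `generate_until_verified`, the toy candidate generator, and the toy checker are all hypothetical stand-ins. In the real system, an LLM would propose proofs and Lean would do the checking.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def generate_until_verified(
    candidates: Iterable[T],
    verify: Callable[[T], bool],
    max_attempts: int = 100,
) -> Optional[T]:
    """Return the first candidate the verifier accepts, or None if the budget runs out."""
    for attempt, candidate in enumerate(candidates):
        if attempt >= max_attempts:
            break  # stop once the attempt budget is spent
        if verify(candidate):
            return candidate  # only a verified candidate is ever returned
    return None


# Toy stand-ins (assumptions for illustration only): "proofs" are integers,
# and the "checker" accepts any nontrivial factor of 91.
def toy_candidates() -> Iterator[int]:
    n = 2
    while True:
        yield n
        n += 1


def toy_verify(x: int) -> bool:
    return 1 < x < 91 and 91 % x == 0


result = generate_until_verified(toy_candidates(), toy_verify)
print(result)
```

The point of the design is that the generator can be wrong as often as it likes. Nothing counts as a result until the checker accepts it, which is why a strict verifier like Lean matters so much here.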

TOGETHER WITH GOOGLE FOR STARTUPS

💡 Master agentic AI with Google for Startups

The Rundown Startup School: Agentic AI is an immersive global training program that helps founders and developers move beyond basic chatbots to build robust, production-ready autonomous workflows using Google Cloud.

In the program, you’ll learn to:

Prototype realtime voice AI with Gemini Live.
Leverage multimodal RAG for advanced data grounding.
Build bidirectional vision agents for data extraction.

THE RUNDOWN ROUNDTABLE

💡 The Rundown Roundtable: Our AI use cases

Image source: Ideogram / The Rundown

The Rundown: The Rundown Roundtable is a weekly feature where we poll members of The Rundown staff about how we use AI in our work and daily lives.

Mayur, Content Manager: My Downloads folder had been filled with random files for the last six months, and I’d been procrastinating on cleaning it up. So I gave Claude Cowork access to the folder, and asked it to “help me organize this folder.”

It went through almost 100GB of files, sorted them into folders by file type, and removed duplicates in just a few minutes, something that could have cost me hours of work.

Shubham, Editor: I used Claude to help my brother-in-law build a Shopify website for his e-commerce store. By sharing screenshots, design references, and business requirements, I had Claude generate Liquid code, troubleshoot theme issues, modify page layouts, and explain exactly where changes needed to be made within Shopify.

Instead of digging through forums and documentation for every issue, Claude acted like an on-demand Shopify developer, helping with everything. It sped up development, allowing the site to move from concept to launch much faster.

AI TRAINING

📆 Build an AI secretary that plans your day

The Rundown: In this guide, you will learn how to use Codex or Claude Code to build an AI taskmaster that checks Slack, Gmail, and your calendar every morning. It turns the mess into a prioritized to-do list that gets better every day you use it.

Step-by-step:

Create a folder, open Claude Code inside it, and ask it to create a skill that checks your Slack, Gmail, and calendar daily, prioritizes tasks from high to low, and puts them at the top of MonoNote.md with the date, a feedback field, and status checkboxes.
Tell the agent to create MonoNote.md and task-rules.md, and have it add any high- or low-priority rules you have for specific tasks to task-rules.md.
Run the skill and review the list. MonoNote.md should have today’s date at the top, grouped tasks, source links, and checkboxes you can use all day.
After the first run, ask the agent to create the automation. Each morning, it will review yesterday’s feedback, roll over bumped tasks, update the rules, then generate the new list.

Pro tip: Create a weekly audit skill that scans your task list, finds repeated tasks, and suggests which ones you could automate with AI.
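If you are curious what the agent's skill might do under the hood, here is a rough Python sketch of the "prioritize and put it at the top of MonoNote.md" step. Everything in it is an assumption for illustration: the `Task` fields, the three priority levels, the checkbox layout, and the helper names `render_daily_section` and `prepend_to_note` are hypothetical, and the skill your agent writes will likely look different. Pulling tasks out of Slack, Gmail, and your calendar is left to the agent.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import Path

# Assumed priority levels; lower number sorts first.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Task:
    title: str
    source: str      # e.g. "Slack", "Gmail", "Calendar"
    priority: str    # "high", "medium", or "low"
    link: str = ""   # optional link back to the source item


def render_daily_section(tasks: list[Task], day: date) -> str:
    """Build one dated section: tasks sorted high to low, each with a checkbox."""
    ordered = sorted(tasks, key=lambda t: PRIORITY_ORDER[t.priority])
    lines = [f"## {day.isoformat()}", ""]
    for t in ordered:
        link = f" ({t.link})" if t.link else ""
        lines.append(f"- [ ] [{t.priority}] {t.title} - {t.source}{link}")
    lines += ["", "Feedback:", ""]
    return "\n".join(lines)


def prepend_to_note(path: Path, section: str) -> None:
    """Put the new section above whatever the note already contains."""
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    path.write_text(section + "\n" + existing, encoding="utf-8")


example_tasks = [
    Task("Reply to vendor", "Gmail", "low"),
    Task("Prep board deck", "Calendar", "high", "https://example.com/event"),
    Task("Answer launch thread", "Slack", "medium"),
]
print(render_daily_section(example_tasks, date(2026, 5, 25)))
```

To write the section to disk, you would call `prepend_to_note(Path("MonoNote.md"), section)`. Keeping older days below the new section is what lets the agent read yesterday's feedback before it builds the next list.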

PRESENTED BY UNWRAP

🎧 How Oura listens to its customers with AI

The Rundown: What does it actually look like to build a product around your customers, not just say you do? Oura has done it. Join the Oura team on May 27 to learn about how they unify member feedback across product, engineering, and leadership, and how real member voices shape every decision they make.

What you’ll learn in the session:

The role AI plays in surfacing what members are actually saying
The workflows turning customer input into roadmap decisions
Lessons for any team where customer feedback is part of the job
Live Q&A with leaders from both Oura and Unwrap

Save your spot here. If you can’t make it on the 27th, no worries! Register, and you’ll automatically get the recording after the session.

ANTHROPIC

🛡️ Claude Mythos finds 10,000+ critical vulnerabilities

Image source: Anthropic

The Rundown: Anthropic shared the first results from Project Glasswing, revealing that Claude Mythos Preview and its ~50 partners have found 10,000+ high- or critical-severity vulnerabilities in just one month.

The details:

Cloudflare alone found 2K bugs with a false-positive rate better than that of human testers. Mozilla found and fixed 271 vulnerabilities in Firefox 150.
Anthropic also scanned 1,000+ open-source projects, with Mythos flagging 6,202 issues as high or critical severity. After independent triage, 62% (or nearly 3,900) held up.
Mythos detection went beyond vulnerability flagging, with one partner bank using Mythos to detect and block a $1.5M fraudulent wire transfer.
Now, Glasswing will expand to additional partners, including U.S. and allied governments, with a general release of Mythos-class models to follow.

Why it matters: Anthropic says Mythos remains gated because no company — including itself — has safeguards strong enough to prevent misuse. But with OpenAI ramping up its cyber models and Chinese players catching up, equally capable (or better) AI will emerge. When it does, how fast the world can patch will be the real test.

QUICK HITS

🛠️ Trending AI Tools

⚡ CData Connect AI - Give ChatGPT, Claude, Copilot, or any AI tool live, governed, read & write access to your business data in one unified layer*
🤖 DeepSeek V4-Pro - DeepSeek’s flagship model, priced 9x lower than rivals
🚀 Gemini 3.5 Flash - Google’s new flash model, 4x faster at half price
🧠 Polsia - AI co-founder that plans, builds, and operates businesses 24/7

*Sponsored Listing

📰 Everything else in AI today

DeepSeek permanently cut V4-Pro pricing by 75%, bringing it down to $0.435 per million input tokens and $0.87 per million output tokens, far below closed-source rivals.

Perplexity open-sourced Bumblebee, a scanner for macOS and Linux that checks for risky packages, extensions, and AI tool configs during supply-chain incidents.

NVIDIA released NV-Generate-MR-Brain, a foundation model that generates synthetic 3D brain MRI scans and annotations, to accelerate medical imaging AI development.

McKinsey is rethinking its billing model as AI reduces the value of billable hours and clients demand fees tied to business outcomes, FT reported.

The White House approved $9B to help U.S. spy agencies acquire advanced AI chips amid concerns they’re falling behind in deploying frontier models, the NYT reported.

Starbucks scrapped its AI inventory system after nine months, citing persistent miscounts and mislabeled products across North American stores.

COMMUNITY

🤝 Community AI workflows

Every newsletter, we showcase how a reader is using AI to work smarter, save time, or make life easier.

Today’s workflow comes from reader Alicia in Fresno, CA:

“After searching on my own for universities across multiple websites, comparing options, and trying to find a program that may work with my full-time job and income level, I asked ChatGPT to help me search and compare the universities’ programs, costs, and whether they are in-person or online.

I then asked ChatGPT to build questions to ask an admissions counselor. ChatGPT has helped me come up with all the important questions I had not thought about, helping me to narrow down my decision of which school program may better fit my current living and working situation.”

How do you use AI? Tell us here.