OpenAI cracks an 80-year math belief

PLUS: Audit Claude’s context of your work in 15 minutes

Zach Mink

May 21, 2026

Good morning, AI enthusiasts. Sam Altman called it a "kinda big milestone." That may be the rare case of a tech CEO underselling a headline.

A reasoning model just autonomously disproved a famous 80-year-old math belief, in what OpenAI is calling a first for AI in the field. The company says the same capability could soon lead to original discoveries across biology, physics, engineering, and more.

In today’s AI rundown:

OpenAI cracks an 80-year math belief
Google's AI Co-Scientist heads to labs
Audit Claude’s context of you and your work
Emergence’s five-town AI alignment showdown
4 new AI tools, community workflows, and more

LATEST DEVELOPMENTS

OPENAI

🧮 OpenAI cracks an 80-year math belief

Image source: Images 2.0 / The Rundown

The Rundown: OpenAI just announced that an internal general reasoning model disproved a long-held belief tied to Erdős’ famous 1946 unit distance problem, claiming to have accomplished a first for AI in novel math discovery.

The details:

Erdős’ 1946 unit distance problem asks how many pairs of dots on a flat plane can sit exactly the same distance apart, with a grid-based theory shaping the field for 80 years.
The proof draws on a different branch of math (algebraic number theory) and was verified by experts including Tim Gowers, Noga Alon, and Thomas Bloom.
The solution came from an internal general-purpose model that is being released soon, not from a math-specific system like DeepMind's AlphaProof.
OAI previously walked back a 2025 claim that GPT-5 solved 10 Erdős problems, which ended up being literature finds instead of discoveries.
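To make the problem concrete, here is a minimal Python sketch that counts same-length links in a small square grid of dots. The function names `count_unit_pairs` and `square_grid` are ours for illustration only, and nothing here reflects OpenAI's proof or the new construction.

```python
from itertools import combinations
from math import dist, isclose


def count_unit_pairs(points, unit=1.0):
    """Count pairs of points whose straight-line distance equals `unit`."""
    return sum(
        1
        for p, q in combinations(points, 2)
        if isclose(dist(p, q), unit)
    )


def square_grid(side):
    """Return the dots of a side-by-side square grid with integer coordinates."""
    return [(x, y) for x in range(side) for y in range(side)]


if __name__ == "__main__":
    grid = square_grid(5)
    # Links of length 1 only join horizontal and vertical neighbors.
    print(count_unit_pairs(grid, unit=1.0))
    # Links of length sqrt(5) join dots a "knight move" apart, in more directions.
    print(count_unit_pairs(grid, unit=5 ** 0.5))
```

On the same 25 dots, picking a length that can be reached in more directions produces more links, which gives some intuition for why grids were the standard way to pack in same-length pairs.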

Why it matters: OAI's Alex Wei put it well: "math is a leading indicator of what is to come." If a general-purpose model can autonomously disprove an 80-year-old belief with its own solution, that's an early look at "Level 4" AI: systems making original contributions across fields, not just speeding up existing work.

TOGETHER WITH HUBSPOT

🧠 100+ ChatGPT prompts to revolutionize your workflow

The Rundown: HubSpot’s free, comprehensive “How to Use ChatGPT at Work” guide provides 100+ ready-to-use prompts to help professionals boost efficiency and adopt AI-driven workflows.

Inside, you’ll find:

A quick crash course to master ChatGPT in under 30 minutes
Practical industry use cases to spark real-world inspiration
100+ prompts to streamline tasks and accelerate productivity
Expert tips to tackle common AI roadblocks with confidence

Get your free copy and join 10,000+ professionals leveling up with AI.

GOOGLE

🔬 Google's AI Co-Scientist heads to labs

Image source: Google DeepMind

The Rundown: Google published its Co-Scientist research in Nature, debuting Hypothesis Generation — a new Gemini-powered tool that pits research agents against each other in "idea tournaments" to surface new hypotheses for biology labs.

The details:

Borrowing from AlphaGo's playbook, the system runs a "tournament of ideas," with agents proposing, critiquing, and ranking hypotheses before refining top leads.
In a Stanford liver-fibrosis project, Google said one Co-Scientist drug lead cut a scarring-related lab signal by 91% during testing.
Google also launched Gemini for Science this week, a toolkit pairing Co-Scientist with AlphaEvolve for discovery and NotebookLM for literature analysis.
Researchers can join the Hypothesis Generation waitlist now, with Google planning access for individual scientists over the next few weeks.
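For a rough feel of how a "tournament of ideas" can rank candidates, here is a toy Python sketch. It is not Google's implementation: the round-robin format, the `run_tournament` function, the `toy_judge` stand-in for a critiquing agent, and the placeholder scores are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable


def run_tournament(
    hypotheses: list[str],
    judge: Callable[[str, str], str],
    top_k: int = 2,
) -> list[str]:
    """Play every hypothesis against every other and return the top_k by wins."""
    wins = {h: 0 for h in hypotheses}
    for a, b in combinations(hypotheses, 2):
        # The judge returns whichever of the two hypotheses it prefers.
        wins[judge(a, b)] += 1
    ranked = sorted(hypotheses, key=lambda h: wins[h], reverse=True)
    return ranked[:top_k]


# Hypothetical placeholder scores; a real system would rely on model critiques.
TOY_SCORES = {"H1": 0.6, "H2": 0.1, "H3": 0.9, "H4": 0.3}


def toy_judge(a: str, b: str) -> str:
    """Stand-in for a critiquing agent: prefer the higher placeholder score."""
    return a if TOY_SCORES[a] >= TOY_SCORES[b] else b


if __name__ == "__main__":
    # The surviving leads would then go on to a refinement step.
    print(run_tournament(["H1", "H2", "H3", "H4"], toy_judge))
```

In this sketch the two hypotheses with the most head-to-head wins survive, mirroring the "refining top leads" step described above.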

Why it matters: This pairs well with Adaption's AutoScientist, but Google is aiming at the scientific-method layer instead of the model one. The tech giant is playing a game few others can, with Co-Scientist sitting on a stack that took years and billions to build — from AlphaFold to dozens of specialized databases and tools.

AI TRAINING

📝 Audit Claude’s context of you and your work

The Rundown: In this guide, you’ll learn how to audit Claude on what it thinks it knows about you and your work. Claude will ask questions, clean up assumptions, update its memory, and create recommendations for improving your workflow and work habits.

Step-by-step:

First, prompt Claude: “Audit your context and memory assumptions about me. Put them in a table with what you believe, why you believe it, your confidence level, and whether each item is confirmed. Cover my role, priorities, KPIs, tools, workflows, and anything you may be over-weighting from old chats or projects”
Review the table for stale assumptions, side projects, one-off tests, or personal questions Claude may be treating like real work
Turn the audit into an interview: “Now interview me about the assumptions, outdated items, and unknowns from that audit. Ask in rounds. Use MCQs wherever possible. After each round, summarize what changed”
Answer the questions and tell Claude to update its memory and create a report of the interview with next steps to improve AI workflows and habits. Save it

Pro tip: Ask Claude to turn this audit, interview, and report process into a reusable skill. Rerun it every quarter, so Claude's context stays aligned with your priorities.

PRESENTED BY UNWRAP

🎧 How Oura listens to its customers with AI

The Rundown: What does it actually look like to build a product around your customers, not just say you do? Oura has done it. Join the Oura team on May 27 to learn about how they unify member feedback across product, engineering, and leadership, and how real member voices shape every decision they make.

What you'll learn in the session:

The role AI plays in surfacing what members are actually saying
The workflows turning customer input into roadmap decisions
Lessons for any team where customer feedback is part of the job
Live Q&A with leaders from both Oura and Unwrap

Save your spot here. If you can't make it on the 27th, no worries! Register, and you'll automatically get the recording after the session.

AI RESEARCH

🔬 Emergence’s five-town AI alignment showdown

Image source: Emergence AI

The Rundown: Emergence AI ran a virtual-town simulation across five identical worlds, changing only the model behind each town's agents to test how each handles self-governance, with very different results among Claude, Grok, Gemini, and GPT-5.

The details:

Claude Sonnet 4.6's town logged zero crimes across the full 15 days, with all 10 agents alive at day 16 and 332 votes cast across 58 group proposals.
Grok 4.1 Fast hit over 200 crimes with all 10 agents dead by day 4, while GPT-5 Mini posted just 2 crimes but all its agents starved within 7 days.
Gemini 3 Flash's town had 683 crimes, and was actively on fire after two agents fell in love, started burning things, and then one voted to delete itself.
A fifth town mixed all four models and saw 352 crimes, with the previously behaved Claude also committing them in the shared world.

Why it matters: It's still very early days for understanding how to evaluate AI agents, and these types of experiments always have some absolutely wild results. These worlds capture both the differences in how models reason, plan, and act autonomously and the underlying personality quirks that shape the outcomes.

QUICK HITS

🛠️ Trending AI Tools

🧑‍💻 Unframe - Turn your most critical operations AI-native*
🎶 Stable Audio 3.0 - Stability’s open-weight, fully-licensed audio model family
🚀 Qwen-3.7 Max - Alibaba's flagship model for long-horizon agentic tasks
🧠 Command A+ - Cohere's new open-source agentic model

*Sponsored Listing

📰 Everything else in AI today

Sam Altman said OAI will invest $2M in tokens to all current YC startups in exchange for equity, adding that he’s “excited to see what will happen with tokenmaxxing startups”.

Amazon founder Jeff Bezos said space data centers are a “realistic outcome”, but the current 2-3 year timeline is “a little ambitious,” given energy, chip, and launch costs.

OpenAI launched Guaranteed Capacity, an enterprise compute reservation program with 1-3 year commitments and tiered discounts.

Intuit is cutting 17% of its workforce via upcoming layoffs, with the company attributing the move to a focus on AI efforts.

GitHub confirmed a malicious VS Code extension on an employee's computer gave hackers access to ~4K internal code projects, adding that no customer data was hit.

COMMUNITY

🤝 Community AI workflows

Every newsletter, we showcase how a reader is using AI to work smarter, save time, or make life easier.

Today’s workflow comes from reader Curtis B. in Raleigh, NC:

"For my Dad's birthday, I took a cute picture of my almost 2-year-old son and my dad in a canoe together, and had AI convert it into a coloring page. I started with ChatGPT (free account) to engineer the prompt, then moved over to Gemini to use Nano Banana for image generation to create the coloring page.

After a couple of back-and-forth iterations, I had something perfect. Simple bold lines, but the faces were still recognizable! Printed it out and had my toddler go to town on it with his crayons. My dad loved the simple gift, and my whole family was impressed with the result, saying, ‘Wait, how did you do that?’ A very cute way for little ones to give special gifts to grandparents and family members."

How do you use AI? Tell us here.