Exclusive: Microsoft's hybrid AI vision

PLUS: Patients control AI and robotics with thought

Rowan Cheung

June 1, 2025

Good morning, AI enthusiasts. Microsoft has been all over the news lately with its bold vision to completely rebuild Windows PCs around AI.

But while everyone's been debating whether AI should live in the cloud or on your device, the tech giant quietly chose both through a unique hybrid AI approach.

So we partnered with Microsoft and Pavan Davuluri, Corporate Vice President of Windows and Devices, to find out more in an exclusive Q&A on Copilot+ PCs, NPU architecture, and the future roadmap of Windows.


In today’s AI rundown:

  • Microsoft’s bold hybrid AI vision

  • Enabling on-device AI experiences

  • How AI workloads are distributed

  • Windows evolves toward autonomous AI agents

EXCLUSIVE Q&A: PAVAN DAVULURI

HYBRID AI

👀 Microsoft’s bold hybrid AI vision

Image source: Kiki Wu / The Rundown

The Rundown: Microsoft is fundamentally restructuring Windows around a hybrid AI architecture that dynamically routes workloads between local neural processing units (NPUs) and cloud compute—positioning itself to control both ends of the spectrum.

Cheung: “Why is Windows betting on a hybrid AI approach that blends both local and cloud together?”

Davuluri: “Our thesis, when we started the Copilot+ PC journey last year, was to bring highly accelerated AI compute to the edge in an energy-efficient form factor.”

Davuluri added: “The long-term vision and true differentiation will stem from our ability to compute and provide context appropriately for the underlying experience, whether it be client-based, cloud-based, or a combination of both.”

Cheung: “When Microsoft introduced Copilot+ PCs last year, it established a 40+ TOPS NPU as the new performance benchmark for AI PCs. What was the rationale behind this requirement?”

Davuluri: “We believe technology should adapt to you, not the other way around, and to make the vision a reality, we needed to raise the bar for what was possible to run sustained AI workloads on a device.”

Davuluri added: "We had some intuition on the trajectory of how AI and AI-compute silicon were evolving and given memory boundedness at scale—where we would have a requirement that was scalable and still pushed what was possible on client silicon."

Why it matters: Microsoft is building infrastructure to capture value from AI workloads whether AI's future is local, cloud, or both. By designing Copilot+ PCs that scale with advancing models and forcing the industry to meet its 40+ TOPS standard, the company is betting that its hardware becomes more valuable over time.
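
For the technically curious, here is a minimal Python sketch of what "hybrid" routing between device and cloud could look like. It is an illustration only, not Microsoft's implementation: the Task fields, the route function, the "defer" outcome, and the local-model limit are all hypothetical names and thresholds chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    needs_private_data: bool    # True if the data should stay on the device
    model_params_billions: float  # rough size of the model the task needs


# Assumption for this sketch: the largest model (in billions of parameters)
# we pretend the local NPU can serve. Not an official figure.
LOCAL_MODEL_LIMIT_B = 14.0


def route(task: Task, online: bool) -> str:
    """Return "local", "cloud", or "defer" for a task (hypothetical policy)."""
    # Prefer the device whenever the model fits: no network, data stays local.
    if task.model_params_billions <= LOCAL_MODEL_LIMIT_B:
        return "local"
    # Too large for the device: use the cloud only if we are online
    # and the task does not require data to stay on the device.
    if online and not task.needs_private_data:
        return "cloud"
    # Otherwise wait until conditions change.
    return "defer"


if __name__ == "__main__":
    print(route(Task("settings agent", True, 3.8), online=False))
    print(route(Task("long report draft", False, 70.0), online=True))
```

The design choice here is "local first": the cloud is a fallback for work that does not fit on the device, which is one plausible reading of the hybrid approach described above.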

PRACTICAL BENEFITS

🖥️ Enabling on-device AI experiences

The Rundown: Microsoft is now delivering AI experiences that run entirely on-device through Copilot+ PCs, breaking the traditional model where advanced AI features require cloud subscriptions, usage tokens, or constant internet connectivity.

Cheung: "Day-to-day, what tangible improvements can users actually feel from NPU-powered Windows?"

Davuluri: "Copilot+ PCs are the only PCs where you can find professional-grade AI editing tools like Relight and super resolution in Photos, or Cocreator, with no subscription or tokens required."

Davuluri added: "You can run AI efficiently without draining your battery, without an internet connection, and when you want the security promise of your data staying on device, you get that too."

Cheung: "Why is running local models a benefit to users?"

Davuluri: "Local models on a device have advantages that can complement cloud-powered AI experiences, particularly in areas of privacy, latency, and offline usage—a good example being our first agent in Windows, the agent in Settings."

Davuluri added: "As local SLMs improve with reasoning capabilities, the potential applications increase, especially if you look at the benchmarks of something like Phi-4 Reasoning and the fact that we can now run 14B parameter models on-device."

Why it matters: With on-device NPUs, Microsoft can run AI locally and eliminate some subscription fees while delivering enhanced privacy and performance. As small language models (SLMs) continue to advance, Copilot+ PCs shift users toward highly capable AI they truly own rather than ‘rent’ through cloud services.
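
To put the 14B-parameter figure in perspective, here is a quick back-of-the-envelope calculation in Python. It is a sketch under stated assumptions: it counts model weights only (no activations or cache), treats 1 GB as 1e9 bytes, and the bit widths are common quantization levels chosen for illustration, not precisions confirmed by Microsoft. The function name is hypothetical.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Assumed precisions: 16-bit, 8-bit, and 4-bit weights.
    for bits in (16, 8, 4):
        size = weight_footprint_gb(14, bits)
        print(f"14B parameters at {bits}-bit: ~{size:.0f} GB")
```

Under these assumptions, lower-precision weights are what bring a model of this size within reach of laptop-class memory.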

NPU

🧠 How AI workloads are distributed

The Rundown: To handle new AI experiences, Microsoft added a third processor to PCs, the neural processing unit (NPU), which changes how AI computation works by offloading AI tasks from the CPU and GPU so that each can focus on what it does best.

Cheung: “What are the practical advantages of using NPUs versus GPUs or CPUs for AI workloads?”

Davuluri: "CPUs were built to process scalars—multiplying two numbers. When GPUs came along, they were optimized for vectors—multiplying many numbers together in parallel. NPUs are domain specific silicon purpose-built to run computation for models such as neural networks."

Davuluri added: "Because the NPU adds a third processor to your Windows PC, it also adds the unique benefit of freeing the GPU and CPU to do what they're best at, while enabling the AI workloads to run efficiently and consistently in the background."

Davuluri added: “This is the path to pervasive AI.”

Cheung: “What are some concrete examples of how NPU efficiency enables unique experiences for users?”

Davuluri: “A good example is Recall. This is something that runs in the background, doesn't eat up battery life, which is the type of AI feature that enables a host of new experiences that weren't possible before at this price point.”

Davuluri added: "A notable feature of the NPU on the Copilot+ PC is that it is an open platform, which allows for the execution of any model using Windows ML, a high-performance local inference runtime built directly into Windows."

Why it matters: Microsoft's Copilot+ PCs integrate next-gen NPUs from AMD, Intel, and Qualcomm—engineered to offload and accelerate complex AI tasks locally. This enables advanced features like Recall, Live Translations, and Super Resolution, while allowing developers to explore and innovate in ways we’ve only begun to imagine.
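
Davuluri's scalar, vector, and neural-network distinction can be made concrete with a small Python sketch. This is only an illustration of the shape of each workload: all three functions below run on the CPU in plain Python, and the function names are hypothetical. The point is that a neural-network layer boils down to many multiply-and-add steps, which is the pattern NPUs are described as being built for.

```python
def scalar_multiply(a: float, b: float) -> float:
    """One multiplication: the classic CPU-style operation."""
    return a * b


def vector_multiply(xs: list[float], ys: list[float]) -> list[float]:
    """Many independent multiplications: the pattern GPUs run in parallel."""
    return [x * y for x, y in zip(xs, ys)]


def dense_layer(weights: list[list[float]], inputs: list[float]) -> list[float]:
    """A matrix-vector product: one multiply-and-add per weight.

    This is the core arithmetic of a neural-network layer (bias and
    activation function omitted to keep the sketch short).
    """
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


if __name__ == "__main__":
    print(scalar_multiply(3, 4))
    print(vector_multiply([1, 2, 3], [4, 5, 6]))
    print(dense_layer([[1, 2], [3, 4]], [10, 1]))
```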

AGENTIC FUTURE

🔮 Windows evolves toward autonomous AI agents

The Rundown: Microsoft is building toward a future where Windows becomes an agentic platform, with AI that runs long-running reasoning loops locally, understands context across applications, and can autonomously complete complex tasks.

Cheung: "Over the next few years, where will Copilot+ PCs benefit the most as small language models accelerate? How do you see AI workloads evolving on-device?"

Davuluri: "I think over time what you're going to find is that we're going to evolve to a place where agentic experiences are going to become more front and center for people on a daily basis."

Davuluri added: "Where we see significant potential is AI being able to perform tasks on your PC asynchronously through long-running reasoning loops. This occurs entirely on the PC, allowing efficient computation and reasoning with NPUs."

Cheung: "What do you see as the biggest greenfield opportunity areas for local AI—what use cases will be the most transformational for users?"

Davuluri: "Local AI will be transformational in enabling always-on AI experiences. Things like deep personalization and context setting, making it easier to use the computer to get what you need. Commanding rather than pointing."

Davuluri added: "Just like we do not use the full capacity of our brain in every moment, having a stack that scales AI compute from client to cloud is a core capability we're building into Windows to bring the best of both worlds together."

Why it matters: Microsoft is reimagining Windows as the platform where AI becomes proactive rather than reactive. With local processing handling context and always-on tasks while the cloud supplies extra compute when needed, next-gen experiences could shift from "point and click" to "command and delegate," changing how we interact with future computers.
