Microsoft Unveils Homegrown AI Stack to Reduce OpenAI Reliance

Microsoft is assembling the pieces to make its own intelligence the default inside its own platform.
The company's strategy extends beyond models to control over pricing and product roadmap.

Microsoft unveiled the MAI-Thinking-1 reasoning model and six others trained from scratch, claiming performance that matches or exceeds Claude and GPT models in its own internal evaluations, which have not been independently verified. New Cobalt 200 processors and Maia 200 inference accelerators co-designed with the models promise cost efficiency, and one Frontier Tuning example showed task completion rising from 13% to 87%.

  • Seven in-house MAI models trained from scratch, including the MAI-Thinking-1 reasoning model with roughly 35 billion active parameters
  • Cobalt 200 processors and Maia 200 inference accelerators co-designed with models; Frontier Tuning example showed task completion jump from 13% to 87%
  • Most products remain in private or expanded private preview; independent benchmarking not yet available
  • Microsoft claims a 20-second average qubit lifetime for its Majorana 2 quantum chip and a path to 1 million qubits by 2029
  • Microsoft still depends on Nvidia for frontier-scale training compute

Microsoft unveiled seven in-house AI models, custom silicon, and a next-generation quantum chip at Build 2026, signaling a strategic shift toward self-sufficiency and reduced dependence on its OpenAI and Anthropic partnerships.

Microsoft walked onto the stage at Build 2026 in San Francisco with a message that had been building for months: the company no longer needs to rent its artificial intelligence from anyone else. Over two days, the software giant unveiled seven homegrown AI models, announced custom processors designed specifically for its own systems, demonstrated a next-generation quantum chip, and wrapped everything into a platform that runs across Windows, Azure and GitHub. The throughline was unmistakable—ownership. After years of building Copilot on top of OpenAI's technology, and more recently leaning on Anthropic, Microsoft was declaring itself capable of supplying its own intelligence, its own silicon, and its own runtime.

Mustafa Suleyman, who leads Microsoft's AI division, presented the centerpiece: seven models spanning reasoning, coding, image generation, voice and transcription, all trained from scratch on licensed data without borrowing from competitors' work. The flagship reasoning model, called MAI-Thinking-1, uses a sparse mixture of experts design with roughly 35 billion active parameters and can handle a context window of 256,000 tokens. It remains in private preview for now, available through Microsoft Foundry rather than to the general public. According to Microsoft's own testing, human raters preferred it to Anthropic's Claude Sonnet 4.6, and it matched Claude Opus 4.6 on the SWE-bench Pro coding benchmark—though these results come from Microsoft's internal evaluations and have not yet been tested independently. A smaller, more efficient coding model called MAI-Code-1-Flash is already reaching GitHub Copilot users inside the editor.

The company frames its model development as a hill-climbing machine, a training pipeline designed to improve with each cycle as computing power scales globally. This framing connects directly to silicon. Microsoft co-designed the MAI models with its Maia 200 inference accelerator and reported efficiency gains from pairing the two. The company also introduced Frontier Tuning, a technique that applies reinforcement learning within a customer's own compliance boundary, allowing a model to adapt to how a specific business actually operates. In one internal example, task completion jumped from 13 percent to 87 percent after tuning, and a version adapted for Excel work matched a frontier OpenAI model at up to ten times lower cost. Again, these figures come from Microsoft and await outside verification.

The model announcements sat atop a broader infrastructure push. Azure Cobalt 200 virtual machines, now in preview, can deliver up to a 50 percent improvement in processor performance depending on the workload, and Microsoft is targeting them at Linux-based agentic AI. The company also added Azure HorizonDB, a Postgres-compatible service for AI applications with vector search and connections into Foundry and Fabric. A version of Fabric with GPU acceleration ran up to seven times faster than three rival cloud warehouses in Microsoft's internal testing during May.

On the agent side, Microsoft moved its Agent 365 software development kit to general availability and reorganized its knowledge layer around Foundry IQ, which unifies Work IQ, Fabric IQ, Azure SQL, file search and external sources, with Web IQ added for live web grounding. A new GitHub Copilot desktop app pushes Copilot beyond chat into managing tasks and pull requests. Microsoft also showed MDASH, a multi-model scanning system in expanded private preview that pairs Defender with GitHub to find and fix vulnerabilities, alongside Windows containers that isolate agents under policy.

The Surface RTX Spark Dev Box, built with Nvidia, delivers roughly one petaflop of local AI compute, and a concept device called Project Solara imagines machines that run agents in place of applications. The most distant bet was Majorana 2, Microsoft's next quantum chip, which the company claims has an average qubit lifetime of 20 seconds and reliability 1,000 times higher than its previous generation, with a path toward one million qubits on a chip that fits in a palm.

Microsoft's push toward independence makes strategic sense. The company's reliance on OpenAI has defined its AI strategy since 2023, when it built Copilot on GPT models and committed billions to the partnership. Owning models and co-designing them with its own Maia and Cobalt chips gives Microsoft room to negotiate on cost and to set its own roadmap rather than wait on a partner. That positioning moves it closer to Google, which pairs Gemini with custom tensor processing units, and Amazon, which pairs Nova models with Trainium silicon. Both rivals have argued for years that owning the full stack lowers cost and tightens integration, and Microsoft is now making that case with its own parts.

Yet the partnerships on stage were a reminder that independence has limits. Satya Nadella appeared with Nvidia's Jensen Huang and Qualcomm's Cristiano Amon, because Microsoft still depends on Nvidia for training compute and on chip partners for the devices that run its agents. Self-sufficiency in models does not extend to the silicon that trains them at frontier scale. The benchmarks represent the most obvious gap. Every performance figure Microsoft shared came from its own evaluations, and much of what the company showed is not yet generally available. MAI-Thinking-1 is in private preview, the Cobalt 200 virtual machines are in preview, MDASH is in expanded private preview and Project Solara is a concept device, so buyers cannot test most of these claims against their own workloads today. The Frontier Tuning result rests on a single internal example, which is encouraging but far from a pattern across industries. The quantum timeline is years out, and quantum roadmaps across the industry have a long history of slipping. MAI also does not replace OpenAI or Anthropic inside Copilot today, where those models still handle most production traffic, so the practical dependency remains even as the strategic one loosens.

For buyers, none of this calls for ripping anything out. The right read is optionality. Microsoft is giving customers a path to run first-party models where cost or data residency matters, while keeping OpenAI and Anthropic available for the work those models do best. Watch Frontier Tuning closely if your organization runs high volumes of repetitive, well-defined tasks, because a model tuned inside your compliance boundary at lower cost is a real budget lever once the benchmarks hold up outside Microsoft's labs. The harder work is governance. Agents that act across Windows, Azure and GitHub need identity, policy and audit controls in place before they scale, not after. Decision makers who treated Build as a model launch missed the larger move. Microsoft is assembling the pieces to make its own intelligence the default inside its own platform, and the leverage that creates over pricing and roadmap will outlast any single benchmark.

The Hearth Conversation: Another perspective on the story
Inventor

Why does Microsoft need its own models if OpenAI's are already working inside Copilot?

Model

Cost and control. If you're running billions of Copilot queries, you're paying OpenAI for each one. A tuned model running on your own infrastructure, or on Azure at a lower rate, changes the economics. But it's also about roadmap. Right now Microsoft has to wait for OpenAI to ship features. With its own models, it sets the pace.

Inventor

The benchmarks all come from Microsoft. How much should we trust them?

Model

Not much, yet. That's the honest answer. MAI-Thinking-1 looks competitive on paper, but it's in private preview. No one outside Microsoft has run it on their own workloads. The Frontier Tuning example—jumping from 13 percent to 87 percent—is a single data point. It's encouraging, but it's not a pattern.

Inventor

So this is all vaporware?

Model

No. The Cobalt 200 virtual machines are real and in preview, and MAI-Code-1-Flash is already reaching GitHub Copilot users. But most of what Microsoft showed is not yet in customers' hands. That's the gap between strategy and execution.

Inventor

What about the quantum chip?

Model

Majorana 2 is years away. Quantum roadmaps slip constantly. Microsoft is claiming 20-second qubit lifetimes and a path to a million qubits on a palm-sized chip by 2029. That's ambitious. We'll know more when it actually ships.

Inventor

If Microsoft owns the models and the chips, does it still need Nvidia?

Model

For now, yes. Frontier models are still trained on Nvidia hardware. Microsoft's Maia accelerators are for inference: running models, not training them. To build the next generation of MAI models, Microsoft still needs Nvidia's GPUs. That dependency doesn't go away.

Inventor

What should a CIO actually do with this announcement?

Model

Watch Frontier Tuning. If you have repetitive, well-defined tasks—think customer service, data entry, routine coding—a model tuned inside your own compliance boundary at lower cost is real money. But don't rip out OpenAI yet. The practical dependency is still there. This is about optionality, not replacement.

Want the full story? Read the original at Forbes.