AMD Finally Built a Chip That Beats Nvidia. It Still Won't Win — Here's Why.

Why “better” almost never beats “default”

Jun 21, 2026

TL;DR

AMD’s MI355X beats Nvidia’s B200 on inference - AMD’s own benchmarks show ~30% higher throughput on Llama 3.1 405B, and independent testing (SemiAnalysis) found ~40% cheaper per-token economics on one specific workload.
The MI400 series goes further on paper: 432GB of HBM4, 19.6 TB/s bandwidth, a next-gen 2nm-class process. On a spec sheet, AMD wins.
And it still won’t dethrone Nvidia, because Nvidia isn’t selling chips. It’s selling 20 years of software nobody wants to rewrite.
Nvidia still holds ~80% of the AI accelerator market. AMD sits around 5–7%.
“Better” is a product story. “Default” is a switching-cost story. AMD keeps winning the first and losing the second.

For about a decade, the AMD-versus-Nvidia conversation had a tired rhythm to it. AMD announces a chip. The chip looks competitive on paper. Everyone writes “the Nvidia killer is here.” Six months pass. Nothing changes. Nvidia’s market share doesn’t move a single percentage point. Repeat next year.

So I understand if you’ve learned to roll your eyes.

But this time the spec sheet is genuinely uncomfortable for Nvidia. And I want to walk you through why “uncomfortable for Nvidia” and “actually losing to AMD” are two completely different sentences.

First, the part where AMD actually wins

Let me give credit where it’s due, because it’s earned.

On inference (the part of AI that runs a model rather than trains it), AMD’s MI355X beats Nvidia’s B200. Not “is competitive with.” Beats. AMD’s own benchmarks put it roughly 30% ahead in throughput on Llama 3.1 405B.

And it’s not just the vendor talking its book: SemiAnalysis’s independent testing found the MI355X running about 40% cheaper per token than the B200 on one specific workload — driven by a lower per-GPU rental cost, not magic. (Worth saying plainly: that 40% is one model on one setup, not a universal law. But for a company renting GPUs by the hour, even a workload-specific edge of that size is the whole ballgame.)
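To see why a lower rental price flows straight into per-token economics, here is a minimal sketch. The hourly rates and throughput below are made-up placeholders, not SemiAnalysis's measurements, and the function names are mine:

```python
def cost_per_million_tokens(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Dollars spent to generate one million tokens on a rented GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000


def relative_saving(challenger_cost: float, incumbent_cost: float) -> float:
    """Fraction by which the challenger is cheaper than the incumbent."""
    return 1 - challenger_cost / incumbent_cost


# Hypothetical inputs for illustration only: same throughput, cheaper rental.
incumbent = cost_per_million_tokens(hourly_rate_usd=4.00, tokens_per_second=1000)
challenger = cost_per_million_tokens(hourly_rate_usd=2.40, tokens_per_second=1000)

print(f"incumbent:  ${incumbent:.2f} per million tokens")
print(f"challenger: ${challenger:.2f} per million tokens")
print(f"saving:     {relative_saving(challenger, incumbent):.0%}")
```

With identical throughput, a 40% cheaper rental is a 40% cheaper token. Any throughput edge on top of that widens the gap, and any throughput deficit narrows it, which is why the figure is workload-specific.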

Then there’s the MI400 line, which is frankly a monster: 432GB of HBM4 and 19.6 TB/s of memory bandwidth, built on a next-gen 2nm-class process.

Now, before a chip-literate reader fires up the comment section: yes, I’m holding AMD’s next-gen part up against Nvidia’s current one. That’s deliberate, because the B200 is what’s actually shipping in racks today, and “what can I buy now” is the comparison most buyers actually make. But let’s be honest about the real fight.

AMD’s MI400 won’t race the B200 - it’ll race Nvidia’s Vera Rubin, due in the second half of 2026. And against Rubin (288GB HBM4, ~13 TB/s), AMD’s memory lead shrinks from roughly 2.25x to about 1.5x. Still a lead. Just a narrower one. So when AMD’s marketing slides the B200 numbers next to its MI400, remember they’re comparing to the chip Nvidia is about to replace.
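The ratios in that paragraph are easy to check. A quick sketch follows, with one assumption flagged loudly: the 192GB figure for the B200 does not appear in this post. It is the commonly cited B200 capacity, and it is the number that makes "roughly 2.25x" work out.

```python
# Figures from this post
MI400_HBM_GB = 432
MI400_BANDWIDTH_TBS = 19.6
RUBIN_HBM_GB = 288
RUBIN_BANDWIDTH_TBS = 13.0   # the post says "~13 TB/s"

# Assumption: not stated in this post; commonly cited B200 memory capacity
B200_HBM_GB = 192


def lead(amd_value: float, nvidia_value: float) -> float:
    """How many times larger the AMD figure is than the Nvidia one."""
    return amd_value / nvidia_value


print(f"Memory vs B200:     {lead(MI400_HBM_GB, B200_HBM_GB):.2f}x")
print(f"Memory vs Rubin:    {lead(MI400_HBM_GB, RUBIN_HBM_GB):.2f}x")
print(f"Bandwidth vs Rubin: {lead(MI400_BANDWIDTH_TBS, RUBIN_BANDWIDTH_TBS):.2f}x")
```

Capacity and bandwidth both land at about 1.5x against Rubin, so the narrower lead holds on either measure.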

So if the question is “did AMD build a chip that beats Nvidia,” the honest answer in 2026 is: on several metrics that matter, yes - even against the same-generation part.

Now here’s the uncomfortable second half.

Nobody buys a chip. They buy everything attached to it.

Here is the thing the spec-sheet crowd keeps missing.

When a company commits to Nvidia, they are not buying a chip. They are buying CUDA — Nvidia’s software platform — and CUDA launched in 2006. That’s twenty years of libraries, tools, tutorials, Stack Overflow answers, optimized frameworks, and roughly four million developers who already know how it works.

Every major AI framework was tuned for CUDA first. Your engineers learned on CUDA. Your existing models run on CUDA. Your production pipeline is held together with CUDA-shaped duct tape.

So when AMD shows up with a faster, cheaper chip, the real question a CTO asks isn’t “is this chip better?” It’s “how many engineer-months do I burn rewriting and re-validating everything to save 30%?”

And for most companies, the answer is: too many.

That’s the moat. It was never the silicon. It’s the cost of leaving.
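The CTO's question is really a break-even calculation. Here is a rough sketch of it. Every number is a hypothetical placeholder (engineer cost, migration effort, GPU spend), and the function name is mine, so treat it as a way to frame the decision rather than a forecast:

```python
def breakeven_months(
    engineer_months: float,
    cost_per_engineer_month: float,
    monthly_gpu_spend: float,
    saving_fraction: float,
) -> float:
    """Months of GPU savings needed to pay back a one-off migration."""
    migration_cost = engineer_months * cost_per_engineer_month
    monthly_saving = monthly_gpu_spend * saving_fraction
    if monthly_saving <= 0:
        return float("inf")  # no saving means the migration never pays back
    return migration_cost / monthly_saving


# Hypothetical mid-size company: modest GPU bill, same rewrite effort.
mid_size = breakeven_months(
    engineer_months=24,
    cost_per_engineer_month=20_000,
    monthly_gpu_spend=50_000,
    saving_fraction=0.30,
)

# Hypothetical hyperscaler: same rewrite effort, enormous GPU bill.
hyperscaler = breakeven_months(
    engineer_months=24,
    cost_per_engineer_month=20_000,
    monthly_gpu_spend=5_000_000,
    saving_fraction=0.30,
)

print(f"mid-size payback:    {mid_size:.1f} months")
print(f"hyperscaler payback: {hyperscaler:.2f} months")
```

Under these made-up inputs the mid-size company waits about 32 months to break even, while the hyperscaler recovers the cost in under two weeks. The sketch also ignores re-validation risk and opportunity cost, both of which push the real payback further out.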

“Better” is a product problem. “Default” is a human one.

This is the lesson I keep coming back to with clients, and it goes way beyond chips.

Being the better product is an engineering achievement. Being the default is a behavioral one. The default wins not because it’s superior but because changing it is annoying, risky, and expensive — and humans are extremely good at avoiding annoying, risky, and expensive.

You don’t switch banks because a competitor offers slightly better terms. You switch when staying becomes genuinely painful.

Nvidia has spent twenty years making sure staying is never painful. That’s the actual product.

Where I’d push back on my own argument

I’d be a lazy analyst if I left it there, so let me argue against myself.

The moat is eroding at the edges. Tools like OpenAI’s Triton compiler now generate optimized code for both Nvidia and AMD, which chips away at CUDA’s exclusivity. Inference, the price-sensitive half of the market that’s growing fastest, depends less on hand-tuned CUDA tricks, which is exactly where AMD’s cost advantage bites hardest. Microsoft already runs serious inference workloads on AMD hardware. The wall has cracks.

But cracks are not collapse. If Nvidia’s share erodes from 80% to, say, 70% over several years, that’s a real win for AMD shareholders, and it’s still not “winning.” It’s AMD becoming a respectable second supplier in a market that desperately wants a second supplier. That’s a great business. It is not a throne.

The verdict

AMD finally built the chip. They earned the headline. On memory, on bandwidth, on inference economics, the hardware argument is theirs to make.

But the AI buildout doesn’t run on the best chip. It runs on the chip everyone already knows how to use. Until AMD makes leaving Nvidia as painless as buying its silicon is appealing, the spec sheet will keep saying “AMD wins” while the market share keeps saying “Nvidia, still.”

Better doesn’t beat default. It never has. Just ask anyone who ever built a superior product and watched the incumbent keep the customers anyway.

Al Anany
