
Inside Amazon's struggle to crack Nvidia's AI-chip dominance

Amazon CEO Andy Jassy. Andrej Sokolow/Getty Images; F. Carter Smith/Bloomberg via Getty Images; Alyssa Powell/BI
  • Amazon's AI chips lag far behind Nvidia GPUs, with low adoption rates among large cloud customers.
  • Nvidia's dominance with its CUDA platform poses significant challenges for Amazon's AI chips.
  • Amazon is working with the open-source community to improve AI-chip adoption and market share.

Amazon is struggling to compete with Nvidia's dominant AI chips, with low usage, "compatibility gaps," and project-migration issues putting millions of dollars in cloud revenue at risk, according to confidential internal documents and people familiar with the situation.

It's another sign of how far behind the cloud giant is in the generative-AI race. This also shows how hard it will be for tech companies including Microsoft, OpenAI, Google, Meta, and Amazon to break Nvidia's grip on this huge and important market.

Stacy Rasgon, Bernstein Research's veteran chip analyst, said every major tech company wanted a piece of Nvidia's business but that no one had been able to make a mark.

"I'm not aware of anyone using AWS chips in any sort of large volumes," Rasgon told Business Insider, referring to Amazon's AI chips.


AI platforms and AWS profitability

Amazon Web Services' outgoing CEO, Adam Selipsky, with Nvidia CEO Jensen Huang. Amazon

Amazon Web Services is the leading cloud company, renting access to computing power and storage over the internet. Part of its success, especially its profitability, comes from designing its own data-center chips rather than paying Intel for these pricey components.

In the new generative-AI era, Amazon is trying to pull this off again by making its own AI chips, known as Trainium and Inferentia. This time, the idea is to avoid paying for expensive Nvidia GPUs, while still providing cloud customers with powerful AI services.

"We see Trainium following a similar path where Trainium1 is allowing us to get meaningful internal and external usage and feedback from our customers which is allowing us to innovate on their behalf," an Amazon spokesperson said.

Still, Amazon is at least four years into this AI chip effort, and so far, Nvidia is proving a harder nut to crack.


Nvidia spent more than a decade building a platform called CUDA to make it easier for developers to use its graphics processing units. The industry has gotten used to this. Amazon now has to somehow unwind all those habits and complex technical interrelationships.

Back when Amazon took on Intel's chips in the first cloud boom, AWS was the pioneer, creating the platform that established standards and processes. This time, it's Nvidia that holds the beachhead. If Amazon can't breach it in a significant way, it may be stuck paying Nvidia, and AWS profitability could suffer.

'Friction points and stifled adoption'

The scale of this challenge is revealed in the internal Amazon documents BI obtained and by the people familiar with the company's AI-chip efforts. These people asked not to be identified discussing sensitive private matters.

Last year, the adoption rate of Trainium chips among AWS's largest customers was just 0.5% that of Nvidia's GPUs, according to one of the internal documents. This assessment, which measures usage levels of different AI chips through AWS's cloud service, was prepared in April.


Inferentia, an AWS chip designed for a type of AI task known as inference, was only slightly better, at 2.7% of the Nvidia usage rate.

The internal document said large cloud customers had faced "challenges adopting" AWS's custom AI chips, in part due to the strong appeal of Nvidia's CUDA platform.

"Early attempts from customers have exposed friction points and stifled adoption," the document, marked "Amazon Confidential," explained.

This contradicts the more upbeat outlook Amazon executives have shared about AWS's AI-chip efforts. In an April earnings call, CEO Andy Jassy said demand for AWS's custom silicon was "quite high," and his annual shareholder letter this year name-checked a number of early customers, such as Snap, Airbnb, and Anthropic.


Even the Anthropic example has a big asterisk next to it. Amazon invested about $4 billion in this AI startup, and part of that deal requires Anthropic to use AWS's in-house AI chips.

Amazon's spokesperson said some parts of the internal documents were "not accurate," without providing details. "The truth is we are encouraged by the progress we're making with our custom AI chips and the feedback we're getting from customers on every aspect of our work," the spokesperson added.

The roots of Amazon's chip business

Selipsky talks about the company's Graviton chip at a re:Invent conference. Amazon

Amazon's chip business started in earnest when it acquired the startup Annapurna Labs in 2015 for roughly $350 million. This helped the tech giant design its own chips. It now offers Arm-based Graviton central processing units for most non-AI computing tasks. Inferentia debuted in 2018, and Trainium first came out in 2020.

"We shipped our first version of Graviton in 2018 and got very valuable customer feedback that we applied to Graviton2, which is when we saw adoption soar," the Amazon spokesperson said.


Other tech giants, including Google and Microsoft, are also designing custom AI chips. At the same time, the three major cloud firms (AWS, Microsoft Azure, and Google Cloud) are among the largest Nvidia customers, since they sell access to those GPUs through their own cloud services.

The in-house AI-chip efforts have yet to make a major dent in Nvidia's grip on the market. On Wednesday, Nvidia reported another blowout quarter, more than tripling revenue from a year ago. The chipmaker is valued at about $2.5 trillion. That's at least $500 billion more than Amazon. Nvidia now accounts for roughly 80% of the AI-chip market, the research firm Omdia found.

AWS's AI chips are still relatively new, so Amazon measures their success in terms of their overall usage and positive customer feedback, both of which "are growing well," rather than their share of workloads, Amazon's spokesperson said.

The sheer amount of GPU capacity AWS has built up over the past decade contributes to "very large usage" of Nvidia chips, the spokesperson said.


"We're encouraged by the progress we're making with our custom AI chips," they said in a statement. "Building great hardware is a long term investment, and one where we have experience being persistent and successful."

'Parity with CUDA'

Internally at Amazon, Nvidia's CUDA platform is repeatedly cited as the biggest roadblock for the AI-chip initiative. Launched in 2006, CUDA is an ecosystem of developer tools, AI libraries, and programming languages that makes it easier to use Nvidia's GPUs for AI projects. CUDA's head start has given Nvidia an incredibly sticky platform, which many consider the secret sauce behind the company's success.
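To see why that stickiness is so hard to unwind, consider what everyday GPU code looks like. The snippet below is a minimal, illustrative PyTorch sketch, not Amazon's code: the "cuda" device string, and the Nvidia-specific kernels it dispatches to, are baked into countless AI scripts just like this one.

    # Minimal, illustrative PyTorch (not Amazon's code). The "cuda" device
    # string, and the Nvidia kernels it dispatches to, tie typical AI code
    # to Nvidia hardware.
    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = torch.nn.Linear(1024, 1024).to(device)  # weights move to the GPU
    batch = torch.randn(32, 1024, device=device)    # tensor allocated via CUDA

    with torch.no_grad():
        out = model(batch)  # executes as CUDA kernels on Nvidia hardware
    print(out.shape)        # torch.Size([32, 1024])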

Amazon expects only modest adoption of its AI chips unless its own software platform, AWS Neuron, can achieve "improved parity with CUDA capabilities," one of the internal documents said.

Neuron is designed to help developers more easily build on top of AWS's AI chips, but the current setup "prevents migration from NVIDIA CUDA," the document added.
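For comparison, here is a hedged sketch of the Neuron path for a similar model, patterned on AWS's published torch-neuronx examples and untested here. The ahead-of-time compile step, rather than a one-line device change, is the sort of friction the document describes.

    # A hedged sketch of the Neuron path; the torch_neuronx import only
    # works on AWS Trainium/Inferentia (Trn1/Inf2) hosts.
    import torch
    import torch_neuronx  # AWS Neuron SDK's PyTorch interface

    model = torch.nn.Linear(1024, 1024).eval()
    example = torch.randn(32, 1024)

    # Unlike a one-line .to("cuda"), Neuron requires an ahead-of-time
    # compile step with fixed input shapes, one source of the migration
    # friction the internal document describes.
    neuron_model = torch_neuronx.trace(model, example)
    out = neuron_model(example)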


Meta, Netflix, and other companies have asked for AWS Neuron to start supporting Fully Sharded Data Parallel, or FSDP, a distributed-training technique that shards a model's parameters, gradients, and optimizer state across many GPUs. Without it, these companies won't "even consider" using Trainium chips for their AI-training needs, according to this internal Amazon document.
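FSDP is a documented PyTorch API on Nvidia GPUs today. Below is a minimal sketch of what those customers are asking Neuron to match; the model size, learning rate, and batch are illustrative.

    # Minimal FSDP sketch on Nvidia GPUs, where the API is supported today.
    # Launch with: torchrun --nproc_per_node=<num_gpus> fsdp_sketch.py
    import os
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def main():
        dist.init_process_group("nccl")  # one process per GPU
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        model = torch.nn.Sequential(
            torch.nn.Linear(4096, 4096),
            torch.nn.ReLU(),
            torch.nn.Linear(4096, 4096),
        ).cuda()

        # FSDP shards parameters, gradients, and optimizer state across
        # ranks instead of replicating the whole model on every GPU.
        sharded = FSDP(model)
        optim = torch.optim.AdamW(sharded.parameters(), lr=1e-4)

        loss = sharded(torch.randn(8, 4096, device="cuda")).square().mean()
        loss.backward()
        optim.step()
        dist.destroy_process_group()

    if __name__ == "__main__":
        main()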

Amazon's spokesperson told BI that AWS provided cloud customers "the choice and flexibility to use what works best for them," rather than forcing them to switch.

Nvidia's early investment in CUDA makes it an "essential tool" to develop with GPUs, but it's "uniquely focused on Nvidia's hardware," the Amazon spokesperson added. AWS's goal with Neuron is not necessarily to build parity with CUDA, the spokesperson said.

Snowflake's CUDA decision

Sridhar Ramaswamy speaks at a Collision conference in Toronto. Eóin Noonan/Getty Images

Snowflake, a leading data-cloud provider, is one of the companies that chose Nvidia GPUs over Amazon's AI chips when training its own large language model, Arctic. That's even though Snowflake is a major AWS cloud customer.


Snowflake CEO Sridhar Ramaswamy told BI that familiarity with CUDA made it hard to transition to a different GPU, especially when doing so could risk dealing with unexpected outcomes.

Training AI models is expensive and time-consuming, so any hiccups can create costly delays. Picking what you already know — CUDA, in this case — is a no-brainer at the moment for many AI developers.

The cost efficiency and advanced performance of Nvidia GPUs may be "years" ahead of the competition, Ramaswamy said.

"With AWS, we have the broadest range of compute choices," he added. "Most of our compute on AWS takes place on AWS Graviton for the price-performance benefits, and we trained Arctic on Amazon EC2 P5 instances to use NVIDIA silicon because we had architected it on CUDA." (EC2 P5 gives AWS customers cloud access to Nvidia H100 GPUs).


AWS can still generate revenue when customers use its cloud services for AI tasks — even if they choose the Nvidia GPU options, rather than Trainium and Inferentia. It's just that this might be less profitable for AWS.

Failure to replace GPUs

Explosive demand for Nvidia chips is also causing a GPU shortage at Amazon, according to one of the internal documents and the people familiar with the matter.

An obvious response to this would be to have cloud customers use Amazon's AI chips instead. However, some of the largest AWS customers have not been willing to use these homegrown alternatives, the documents said.

Some of these customers have told Amazon that Inferentia chips fall behind Nvidia GPUs on performance and cost efficiency, and those performance issues have been escalated internally, according to this document.


Amazon's failure to shift customer demand from Nvidia GPUs to AWS's own offerings has resulted in delayed services and put millions of dollars in potential revenue at risk, the document said.

AWS discussed splitting workloads across different regions and setting up more flexible and higher-priced blocks of Nvidia GPU capacity to alleviate the shortage issue, according to this document.

Amazon even uses Nvidia GPUs for its own projects

Amazon VP and distinguished engineer James Hamilton. Amazon

Even some internal Amazon AI projects rely on Nvidia GPUs, rather than AWS's homegrown chips.

Earlier this year, Amazon's retail team used a cluster of Nvidia GPUs, including V100s, A10s, and A100s, to build a model for a new AI image-creation tool, another internal document showed. There was no mention of AWS chips.


In January, James Hamilton, an Amazon senior vice president and distinguished engineer, gave a presentation on machine learning and said one of his projects used 13,760 Nvidia A100 GPUs.

It's unclear how well AWS AI chips are doing financially, since Amazon doesn't break out specific cloud-segment sales. But in April, Jassy disclosed that Amazon's array of AI products was on pace to generate multibillion-dollar revenue this year. That matters for AWS, whose growth rate stagnated in recent years before bouncing back in recent quarters, especially in absolute-dollar terms.

Amazon is following a sales strategy of making a variety of AI chips available through its cloud service. According to an internal sales guideline seen by BI, Amazon sales reps are encouraged to mention access to both high-end Nvidia GPUs and more affordable AWS chips when selling AI compute services.

One person familiar with the situation told BI that Amazon saw a big enough opportunity in the lower-end market, even if that meant ceding top-of-the-line customers to Nvidia for now.


Prioritizing open source

The AI-chip market is forecast to more than double to $140 billion by 2027, the research firm Gartner found.

To get a larger share of the market, Amazon wants to work more closely with the open-source community, according to one of the internal documents obtained by BI.

For example, AWS's AI chips still have "compatibility gaps" with certain open-source frameworks, making Nvidia GPUs a more popular option. By embracing open-source technologies, AWS can build a "differentiated experience," the document said.

As part of this effort, Amazon is prioritizing support for open-source models like Llama and Stable Diffusion, one of the people said.


Amazon's spokesperson said AWS had partnered with Hugging Face, a hub for AI models, and was a founding member of an open-source machine-learning initiative called OpenXLA to make it easier for the open-source community to take advantage of AWS AI chips.
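As a rough illustration of that Hugging Face route, the sketch below follows the documented pattern of the optimum-neuron integration the partnership produced; treat the model name and export keywords as assumptions rather than a verified recipe.

    # A rough sketch of the Hugging Face route onto AWS silicon via
    # optimum-neuron. Model name is a stand-in; export kwargs follow
    # Optimum Neuron's documented pattern but are assumptions here.
    from optimum.neuron import NeuronModelForSequenceClassification
    from transformers import AutoTokenizer

    model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # stand-in
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # export=True compiles the model for Inferentia/Trainium ahead of time;
    # the Neuron compiler needs fixed input shapes.
    model = NeuronModelForSequenceClassification.from_pretrained(
        model_id, export=True, batch_size=1, sequence_length=128
    )

    inputs = tokenizer(
        "Nvidia is hard to beat.",
        return_tensors="pt",
        padding="max_length",
        max_length=128,
    )
    print(model(**inputs).logits)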

Don't count Amazon out

Despite Amazon's AI-chip struggles, the effort seems to have caught the attention of Nvidia CEO Jensen Huang.

During an interview earlier this year, Huang was asked about Nvidia's competitive landscape.

"Not only do we have competition from competitors, we have competition from our customers," Huang said, referring to cloud giants such as AWS that sell access to both Nvidia GPUs and their own chips. "And I'm the only competitor to a customer fully knowing they are about to design a chip to replace ours."


Do you work at Amazon? Got a tip?

Contact the reporter, Eugene Kim, via the encrypted-messaging apps Signal or Telegram (+1-650-942-3061) or email (ekim@businessinsider.com). Reach out using a nonwork device. Check out Business Insider's source guide for other tips on sharing information securely.
