If you’re looking for innovative use of AI technology, look to the cloud. Gartner reports that “73% of respondents to the 2024 Gartner CIO and Tech Executive Survey have increased funding for AI.” And IDC says that AI “will have a cumulative global economic impact of $19.9 trillion through 2030.” But end users aren’t running most of those AI workloads on their own hardware. Instead, they are largely relying on cloud service providers and large technology companies to supply the infrastructure for their AI efforts. This approach makes sense, since most organizations are already heavily reliant on the cloud. According to O’Reilly, more than 90% of companies are using public cloud services. And they aren’t moving just a few workloads to the cloud: the same report shows 175% growth in cloud-native interest, indicating that companies are committing heavily to the cloud.
As a result of this demand for infrastructure to power AI initiatives, cloud service providers are finding it necessary to rapidly scale up their data centers. IDC predicts that “the surging demand for AI workloads will lead to a significant increase in datacenter capacity, energy consumption, and carbon emissions, with AI datacenter capacity projected to have a compound annual growth rate (CAGR) of 40.5% through 2027.” While this surge creates massive opportunities for service providers, it also introduces some challenges. Reliably and cost-effectively providing the computing power necessary to support AI initiatives at scale is difficult. Many providers have found that deploying AMD EPYC CPUs and Instinct GPUs can help them overcome those challenges. Here’s a quick look at three service providers that are using AMD chips to accelerate AI advancements.
Flexible, Cost-Effective Performance at Microsoft Azure
Microsoft offers an extensive line of AMD-powered Azure Virtual Machines (VMs) as part of its lineup of cloud computing services. It has tailored the specs of these VMs to a wide variety of use cases: general purpose, memory-intensive, storage-optimized, high-performance computing, confidential computing, and AI.
Azure customers appreciate the cost-effectiveness of these VMs. For example, Henrik Klemola, Director, Cloud COE at Epicor, says: “We have been using Azure Virtual Machines featuring the AMD EPYC processor for the past two years to run a number of business-critical applications as part of the Epicor solution portfolio. We have been very pleased with the consistent performance and compelling price-performance that these VMs have been able to deliver. We are looking forward to continuing to benefit from the innovation that Microsoft Azure will make available to us, including the ability to access cost-effective Azure services that are based on the latest AMD EPYC processors.”
Those innovations include accelerators that support AI and other intensive workloads, such as financial analysis, design, and engineering applications. As Microsoft explains, “traditional CPU-only VMs often struggle to keep up with these apps, frustrating users and reducing productivity, but until now, deploying GPU-accelerated, on-premises virtual environments has been too costly.” By making the latest AMD advances available in the cloud, Microsoft is providing customers with the performance they need at a price point they can afford.
Scalable Growth at Oracle Cloud Infrastructure
Another major cloud provider using AMD to push the limits of what’s possible in AI is Oracle. In September, Oracle Cloud Infrastructure (OCI) chose AMD Instinct accelerators to power its newest OCI Compute Supercluster instance. Oracle explains that these instances allow customers to “run the most demanding AI workloads faster, including generative AI, computer vision, and predictive analytics.” They enable the deployment of massive scale-out clusters for training large-scale AI models.
That scalability has been very attractive to Uber, which began migrating to OCI Compute with AMD and OCI AI infrastructure in 2023. “As we continue to grow and enter new markets, we need the flexibility to leverage a wide range of cloud services to help ensure we’re providing the best possible customer experience,” says Kamran Zargahi, Senior Director of Tech Strategy and Cloud Engineering at Uber. “Collaborating with Oracle has allowed us to innovate faster while managing our infrastructure costs. With OCI, our products can run on best-of-breed infrastructure that is designed to support multi-cloud environments and can scale to support profitable growth.”
Next-Generation Deep Learning at Meta
Meta is also heavily investing in cloud data centers to support the AI services it offers users. More than three billion people interact with Meta services like Facebook, Instagram, and WhatsApp every day. And Meta is working to quickly integrate generative AI into all those applications. To accomplish that goal, Meta has invested heavily in AMD technology. In fact, it has deployed more than 1.5 million AMD EPYC CPUs in its servers around the world. AMD Instinct GPUs have also been a part of its success. Kevin Salvadori, VP, Infrastructure and Engineering at Meta, explains: “All Meta live traffic has been served using MI300X exclusively due to its large memory capacity and TCO advantage.”
In addition, AMD chips play a central role in Meta’s Open Hardware vision. Meta notes in a blog post: “Scaling AI at this speed requires open hardware solutions. Developing new architectures, network fabrics, and system designs is the most efficient and impactful when we can build it on principles of openness. By investing in open hardware, we unlock AI’s full potential and propel ongoing innovation in the field.”
Grand Teton, Meta’s next-gen open AI platform, supports the AMD Instinct MI300X Platform. That enables Grand Teton to run Meta’s deep learning recommendation models, content understanding, and other memory-intensive workloads.
Expand the limits of what’s possible
If your team is interested in pushing the limits of what’s possible with today’s AI innovations, try an AMD Instinct-based cloud instance from one of the public cloud vendors. Or build AMD Instinct GPUs into your own infrastructure. They provide the performance, low total cost of ownership, and ease of adoption to supercharge AI.