NVIDIA Triton Inference Server Running A100 Tensor Core GPUs Boosts Bing Advert Delivery

by Winston | 7 June 2023 | 8 June 2023

Inference software enables shift to NVIDIA A100 Tensor Core GPUs, delivering 7x throughput for the search giant.

Jiusheng Chen’s team just got accelerated. They’re delivering personalized ads to users of Microsoft Bing with 7x throughput at reduced cost, thanks to NVIDIA Triton Inference Server running on NVIDIA A100 Tensor Core GPUs. It’s an amazing achievement for the principal software engineering manager and his crew.

Tuning a Complex System
Bing’s ad service uses hundreds of models that are constantly evolving. Each must respond to a request within as little as 10 milliseconds, about 10x faster than the blink of an eye. The latest speedup got its start with two innovations the team delivered to make AI models run faster: BANG and EL-Attention. Together, they apply sophisticated techniques to do more work in less time with less computer memory. Model training was based on Azure Machine Learning for efficiency.

Flying With NVIDIA A100 MIG
Next, the team upgraded the ad service from NVIDIA T4 to A100 GPUs. The latter’s Multi-Instance GPU (MIG) feature lets users split one GPU into several instances. Chen’s team maxed out the MIG feature, partitioning one physical A100 into seven independent instances. That let the team reap 7x throughput per GPU with inference responses in 10 ms.

Flexible, Easy, Open Software
Triton enabled the shift, in part, because it lets users simultaneously run different runtime software, frameworks and AI models on isolated instances of a single GPU. The inference software comes in a software container, so it’s easy to deploy. And open-source Triton – also available with enterprise-grade security and support through NVIDIA AI Enterprise – is backed by a community that makes the software better over time.
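To make the idea of isolated instances on a single GPU more concrete, here is a minimal Python sketch of one common way to pair Triton with MIG: start one `tritonserver` process per MIG instance and pin each process to its instance through `CUDA_VISIBLE_DEVICES`. The article does not describe how Bing's deployment is actually laid out, so treat this purely as an illustration. The function name `plan_triton_per_mig`, the `/models` path, the port numbering and the placeholder UUIDs are all hypothetical. On a real machine, the MIG device UUIDs would come from `nvidia-smi -L`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TritonLaunch:
    """Environment and command line for one Triton server process."""
    env: Dict[str, str]
    argv: List[str]


def plan_triton_per_mig(mig_uuids: List[str], model_repo: str,
                        base_port: int = 8000) -> List[TritonLaunch]:
    """Build one launch plan per MIG instance (hypothetical helper).

    Each process sees exactly one MIG device via CUDA_VISIBLE_DEVICES and
    gets its own HTTP, gRPC and metrics ports so the servers do not collide.
    Nothing is executed here; the caller decides how to start the processes.
    """
    launches = []
    for i, uuid in enumerate(mig_uuids):
        http_port = base_port + 3 * i  # three consecutive ports per server
        launches.append(TritonLaunch(
            env={"CUDA_VISIBLE_DEVICES": uuid},
            argv=[
                "tritonserver",
                f"--model-repository={model_repo}",
                f"--http-port={http_port}",
                f"--grpc-port={http_port + 1}",
                f"--metrics-port={http_port + 2}",
            ],
        ))
    return launches


if __name__ == "__main__":
    # Placeholder UUIDs for illustration only; real ones look like "MIG-<uuid>".
    fake_uuids = [f"MIG-placeholder-{i}" for i in range(7)]
    for launch in plan_triton_per_mig(fake_uuids, "/models"):
        print(launch.env["CUDA_VISIBLE_DEVICES"], " ".join(launch.argv))
```

With seven MIG instances, this yields seven independent server processes on one physical A100, each of which can serve a different set of models from its own repository if desired.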

Accelerating Bing’s ad system with Triton on A100 GPUs is one example of what Chen likes about his job. He gets to witness breakthroughs with AI.

While the scenarios often change, the team’s goal remains the same – creating a win for its users and advertisers.


Winston

Winston has over 25 years of experience in the I.T. Industry. He launched Funky Kit with the aim to capture a wider audience worldwide. His knowledge in PC hardware is very distinguished, not only publishing enjoyable reviews but also writing great articles.
