Cloudflare Powers Hyper-Local AI Inference with NVIDIA Accelerated Computing
SAN FRANCISCO, Sept. 27, 2023 — Cloudflare, Inc. today announced its global network will deploy NVIDIA GPUs at the edge combined with NVIDIA Ethernet switches, putting AI inference compute power close to users around the globe. It will also feature NVIDIA’s full-stack inference software — including NVIDIA TensorRT-LLM and NVIDIA Triton Inference Server — to further accelerate performance of AI applications, including large language models.
Starting today, all Cloudflare customers can access local compute power to deliver AI applications and services on fast, more compliant infrastructure. With this announcement, organizations will, for the first time, be able to run AI workloads at scale through Cloudflare and pay for compute power only as needed.
AI inference is how the end user experiences AI and is set to dominate AI workloads. Today, organizations face great demand for GPUs. Cloudflare, with data centers in over 300 cities across the world, can deliver fast experiences to users while meeting global compliance regulations.
Cloudflare will make it possible for any organization globally to start deploying AI models — powered by NVIDIA GPUs, networking and inference software — without having to worry about managing, scaling, optimizing, or securing deployments.
“AI inference on a network is going to be the sweet spot for many businesses: private data stays close to wherever users physically are, while still being extremely cost-effective to run because it’s nearby,” said Matthew Prince, CEO and co-founder, Cloudflare. “With NVIDIA’s state-of-the-art GPU technology on our global network, we’re making AI inference — previously out of reach for many customers — accessible and affordable globally.”
“NVIDIA’s inference platform is critical to powering the next wave of generative AI applications,” said Ian Buck, Vice President of Hyperscale and HPC at NVIDIA. “With NVIDIA GPUs and NVIDIA AI software available on Cloudflare, businesses will be able to create responsive new customer experiences and drive innovation across every industry.”
Today, Cloudflare is making generative AI inference accessible globally, with no up-front costs. By deploying NVIDIA GPUs to its global edge network, Cloudflare now provides:
- Low-latency generative AI experiences for every end user, with NVIDIA GPUs available for inference tasks in over 100 cities by the end of 2023, and nearly everywhere Cloudflare’s network extends by the end of 2024.
- Access to compute power near wherever customer data resides, to help customers meet emerging compliance and regulatory requirements.
- Affordable, pay-as-you-go compute power at scale, to ensure every business can access the latest AI innovation — without the need to invest massive funds upfront to reserve GPUs that may go unused.
Register here to reserve your access to Workers AI.
Cloudflare, Inc. (NYSE: NET) is the leading connectivity cloud company. It empowers organizations to make their employees, applications and networks faster and more secure everywhere, while reducing complexity and cost. Cloudflare’s connectivity cloud delivers the most full-featured, unified platform of cloud-native products and developer tools, so any organization can gain the control they need to work, develop, and accelerate their business.