March 20, 2024

AI Software Vendors Move to Support Nvidia NIMs

Nvidia Inference Microservices (NIMs) are just a few days old, but AI software vendors are already moving to support the new deployment scheme to help their customers get generative AI applications off the ground.

Nvidia CEO Jensen Huang introduced NIM on Monday as a way to simplify the development and deployment of GenAI applications built atop large language and computer vision models. By combining many of the components one needs into a pre-built Kubernetes container that can run across Nvidia’s family of GPU hardware, the company hopes to take much of the pain out of deploying GenAI apps.

In addition to Nvidia software like CUDA and NeMo Retriever, NIMs will include software from third-party software companies, Huang said during his keynote.

“How can we build software in the future? It’s unlikely you’ll write it from scratch or write a whole bunch of Python code or anything like that,” Huang said. “It’s very likely that you assemble a team of AIs. There’s probably going to be a super AI that you use that takes the mission that you give it and breaks it down into an execution plan.

“Some of the execution plan can be handed off to another NIM. Maybe it understands SAP,” he continued. “It might hand it off to another NIM that goes off and does some calculation on it. Maybe it’s optimization software, or a combinatorial optimization algorithm. Maybe it’s just a basic calculator. Maybe it’s pandas to do some numerical analysis on it, and it comes back with its answer and it gets combined with everybody else’s and because it’s been presented with ‘This is what the right answer should look like.’ It knows what the right answer to produce, and it presents it to you.”

The AI software industry wasted no time in getting behind Nvidia’s NIM plan.

DataStax announced that it has integrated the retrieval-augmented generation (RAG) capabilities of its managed database, Astra DB, with Nvidia NIM. The company claims the integration will enable users to create vector embeddings 20x faster and at 80% lower cost than other cloud-based vector embedding services.

Nvidia NeMo Retriever can generate more than 800 embeddings per second per GPU, a rate that DataStax says pairs well with Astra DB, which is designed to ingest embeddings at 4,000 transactions per second.
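
As a rough, back-of-the-envelope illustration of how those two figures relate, the sketch below estimates how many GPUs would be needed for embedding generation to keep pace with Astra DB's stated ingest rate. It treats 800 embeddings per second as a per-GPU floor and assumes one embedding per write transaction; that mapping and the helper name `gpus_to_match_ingest` are assumptions made for illustration, not figures or tooling from DataStax or Nvidia.

```python
import math

# Figures quoted in the article.
EMBEDDINGS_PER_SEC_PER_GPU = 800   # NeMo Retriever, per GPU (stated as "more than")
ASTRA_INGEST_TPS = 4_000           # Astra DB design ingest rate, transactions/sec

# Assumption (not from the article): each write transaction stores one embedding.
EMBEDDINGS_PER_TRANSACTION = 1


def gpus_to_match_ingest(ingest_tps, per_gpu_rate, per_txn=1):
    """Smallest whole number of GPUs whose combined embedding output
    meets or exceeds the database ingest rate."""
    return math.ceil(ingest_tps * per_txn / per_gpu_rate)


gpus = gpus_to_match_ingest(
    ASTRA_INGEST_TPS, EMBEDDINGS_PER_SEC_PER_GPU, EMBEDDINGS_PER_TRANSACTION
)
print(gpus)
```

Under those assumptions, about five GPUs running at the quoted rate would saturate the quoted ingest capacity; real workloads with batching or multiple embeddings per transaction would shift that estimate.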

“In today’s dynamic landscape of AI innovation, RAG has emerged as the pivotal differentiator for enterprises building GenAI applications with popular large language frameworks,” DataStax CEO and Chairman Chet Kapoor said in a press release. “With a wealth of unstructured data at their disposal, ranging from software logs to customer chat history, enterprises hold a cache of valuable domain knowledge and real-time insights essential for generative AI applications, but still face challenges. Integrating NVIDIA NIM into RAGStack cuts down the barriers enterprises are facing to bring them the high-performing RAG solutions they need to make significant strides in their GenAI application development.”

Weights & Biases is also supporting Nvidia’s NIM with its AI developer platform, which automates many of the steps that data scientists and developers must go through to create AI models and applications. The company says that customers tracking model artifacts in its platform can use W&B Launch to deploy to NIMs, thereby streamlining the model deployment process.

The San Francisco company also announced that it is providing W&B Launch to customers via GPU-in-the-cloud provider CoreWeave in a bid to simplify hardware provisioning. By making W&B Launch available as a NIM within the CoreWeave Application Catalog, the company aims to accelerate customers’ deployment of GenAI apps.

“Our mission is to build the best tools for machine learning practitioners around the world, and that also means collaborating with the best partners,” said Lukas Biewald, CEO at Weights & Biases, in a press release. “The new integrations with NVIDIA and CoreWeave technologies will enhance our customers’ ability to easily train, tune, analyze, and deploy AI models to drive massive value for their organizations.”

Anyscale, the company behind the open source Ray project, announced its support for NIM and Nvidia’s AI Enterprise software (of which NIM is a part). The San Francisco software company is working with Nvidia to integrate the Anyscale managed runtime environment with NIM, an integration it says will bring customers better container orchestration, observability, autoscaling, security, and performance for their AI applications.

“This enhanced integration with Nvidia AI Enterprise makes it simpler than ever for customers to get access to cutting-edge infrastructure software and top-of-the-line compute resources to accelerate the production of generative AI models,” Anyscale CEO Robert Nishihara said in a press release. “As AI becomes a strategic capability, it’s essential to balance performance, scale, and cost while minimizing infrastructure complexity. The ability to tap into the best-of-breed infrastructure, accelerated computing, pre-trained models and tools will be crucial for organizations to stand out and compete. This collaboration is another important step forward in bringing generative AI to more people.”

EY (formerly Ernst & Young) also announced an expansion of its partnership with Nvidia to help joint customers use the company’s GPUs and software in the areas of scientific computing, artificial intelligence, data science, autonomous vehicles, robotics, metaverse and 3D internet applications.

EY pledged to train 10,000 additional employees around the world to use Nvidia offerings, including its GPUs and software offerings like AI Enterprise, NIM, and NeMo Retriever. “We’re working with EY US to integrate NVIDIA’s leading-edge accelerated computing solutions with EY US’ extensive industry knowledge to help empower clients to streamline their AI transformation efforts,” said Alvin Da Costa, Vice President of the Global Consulting Partner Organization at Nvidia, in a press release.

Datanami