Kinetica got its start building a GPU-powered database to serve fast SQL queries and visualizations for US government and military clients. But with a pair of announcements at Nvidia’s GTC show last week, the company is showing it’s prepared for the coming wave of generative AI applications, particularly those utilizing retrieval augmented generation (RAG) techniques to tap unique data sources.
Companies today are hunting for ways to leverage the power of large language models (LLMs) with their own proprietary data. Some companies are sending their data to OpenAI’s cloud or other cloud-based AI providers, while others are building their own LLMs.
However, many more companies are adopting the RAG approach, which has surfaced as perhaps the best middle ground: it doesn’t require building your own model (time-consuming and expensive) or sending your data to the cloud (a risk for privacy and security).
With RAG, relevant data is injected directly into the context window before the prompt is sent off to the LLM for execution, thereby providing more personalization and context in the LLM’s response. Along with prompt engineering, RAG has emerged as a low-risk and fruitful method for juicing GenAI returns.
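To make the context-injection step concrete, here is a minimal sketch in Python. It is an illustration only, not Kinetica's implementation: the function name `build_rag_prompt`, the `max_chars` budget, and the prompt wording are all assumptions made for this example.

```python
def build_rag_prompt(question, documents, max_chars=500):
    """Place retrieved text ahead of the user's question.

    `documents` is assumed to be ordered from most to least relevant.
    `max_chars` is a hypothetical stand-in for the model's context limit.
    """
    selected = []
    used = 0
    for doc in documents:
        # Stop once adding the next document would exceed the budget.
        if used + len(doc) > max_chars:
            break
        selected.append(doc)
        used += len(doc)

    context = "\n".join(f"- {doc}" for doc in selected)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    docs = ["Order 1042 shipped on Tuesday.", "Order 1042 contains two items."]
    print(build_rag_prompt("When did order 1042 ship?", docs))
```

The resulting string is what would be sent to the LLM; the model itself is unchanged, which is why the approach avoids the cost of training.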
Kinetica is also now getting into the RAG game with its database by essentially turning it into a vector database that can store and serve vector embeddings to LLMs, as well as by performing vector similarity search to optimize the data it sends to the LLM.
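Vector similarity search, in its simplest form, ranks stored embeddings by how closely they point in the same direction as a query embedding. The sketch below uses brute-force cosine similarity in plain Python with made-up two-dimensional vectors. It says nothing about how Kinetica or RAPIDS RAFT implement search; the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the texts of the k stored items most similar to `query`.

    `store` is a list of (text, embedding) pairs.
    """
    scored = [(cosine_similarity(query, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


if __name__ == "__main__":
    # Toy embeddings; real ones come from an embedding model.
    store = [
        ("gpu", [1.0, 0.0]),
        ("sql", [0.0, 1.0]),
        ("both", [1.0, 1.0]),
    ]
    print(top_k([1.0, 0.1], store, k=2))
```

A production system would replace the linear scan with an approximate nearest-neighbor index, which is where GPU acceleration typically pays off.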
According to its announcement last week, Kinetica is able to serve vector embeddings 5x faster than other databases, a number it claims came from the VectorDBBench benchmark. The company claims it’s able to achieve that speed by leveraging Nvidia’s RAPIDS RAFT technology.
That GPU-based speed advantage will help Kinetica customers by enabling them to scan more of their data, including real-time data that has just been added to the database, without doing a lot of extra work, said Nima Negahban, co-founder and CEO of Kinetica.
“It’s hard for an LLM or a traditional RAG stack to be able to answer a question about something that’s happening right now, unless they’ve done a lot of pre-planning for specific data types,” Negahban told Datanami at the GTC conference last week, “whereas with Kinetica, we’ll be able to help you by looking at all the relational data, generate the SQL on the fly, and ultimately what we put just back in the context for the LLM is a simple text payload that the LLM will be able to understand to use to give the answer to the question.”
This essentially gives users the capability to talk to their complete corpus of relational enterprise data, without doing any preplanning.
“That’s the big advantage,” he continued, “because the traditional RAG pipelines right now, that part of it still requires a good amount of work as far as you have to have the right embedding model, you have to test it, you have to make sure it’s working for your use case.”
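The SQL-based flow Negahban describes (generate a query, run it against relational data, hand the LLM a plain-text result) can be sketched as follows. This is an assumption-laden illustration: it uses Python's built-in `sqlite3` rather than Kinetica, the `telemetry` table is invented, and the `generated_sql` string is hard-coded where a real system would have the LLM produce it.

```python
import sqlite3


def rows_to_text(cursor):
    """Flatten a query result into a simple text payload for the prompt."""
    columns = [desc[0] for desc in cursor.description]
    lines = [", ".join(columns)]
    for row in cursor.fetchall():
        lines.append(", ".join(str(value) for value in row))
    return "\n".join(lines)


def answer_context(conn, generated_sql):
    """Run the (assumed LLM-generated) SQL and return the text payload."""
    return rows_to_text(conn.execute(generated_sql))


# Hypothetical in-memory table standing in for operational data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE telemetry (device TEXT, reading REAL)")
conn.executemany(
    "INSERT INTO telemetry VALUES (?, ?)",
    [("a", 1.5), ("b", 2.5), ("a", 3.5)],
)

# In a real pipeline this string would come from the LLM, not a constant.
generated_sql = (
    "SELECT device, AVG(reading) AS avg_reading "
    "FROM telemetry GROUP BY device ORDER BY device"
)
payload = answer_context(conn, generated_sql)
print(payload)
```

Because the query runs at question time, rows inserted moments earlier are included without any re-embedding step, which is the advantage the quote points to.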
Kinetica can also talk to other databases and function as a generative federated query engine, as well as do the traditional vectorization of data that customers put inside of Kinetica, Negahban said. The database is designed to be used for operational data, such as time-series, telemetry, or telco data. Thanks to its support for NVIDIA NeMo Retriever microservices, the company is able to position that data in a RAG workflow.
But for Kinetica, it all comes back to the GPU. Without the extreme computational power of the GPU, the company would have just another RAG offering.
“Basically you need that GPU-accelerated engine to make it all work at the end of the day, because it’s got to have the speed,” said Negahban, a 2018 Datanami Person to Watch. “And we then put all that orchestration on top of it as far as being able to have the metadata necessary, being able to connect to other databases, having all that to make it easy for the end user, so basically they can start taking advantage of all that relational enterprise data in their LLM interaction.”
Related Items:
Bank Replaces Hundreds of Spark Streaming Nodes with Kinetica
Kinetica Aims to Broaden Appeal of GPU Computing
Preventing the Next 9/11 Goal of NORAD’s New Streaming Data Warehouse