

Rockset today unveiled new vector database capabilities, including approximate nearest neighbor (ANN) search and native support for LlamaIndex and LangChain, that it says will help companies efficiently scale their GenAI applications once they’re in production.
As companies experiment with the new generative AI capabilities delivered via large language models (LLMs) and vector search, they’re getting good early results, says Rockset co-founder and CEO Venkat Venkataramani.
“We’re not educating people on what can vector search do for you,” he says. “They’ve already tinkered it at very small scale, built prototypes, and they already see the magic.”
While vector search and GenAI prototypes tease a tantalizing future, companies often run into trouble when they try to make the leap from development to production.
“Not a week goes by where somebody calls me and says, ‘Venkat, I started with this toy open source vector database and we did a shadow launch and a scale test, and it just bombed,’” Venkataramani says. “Other vector databases may have good vector support, but the database part is very shaky. Is it scalable? Is it reliable? It gets very expensive and very hard to operate very quickly.”
Rockset rolled out its initial support for vector search and storing vectorized embeddings earlier this year. Like many other SQL and NoSQL databases, the Silicon Valley firm experienced a surge in demand for these data types, which are instrumental for enabling vector search as well as other types of GenAI applications built atop LLMs and computer vision models.
Today’s addition of ANN and native support for LlamaIndex and LangChain, which are open source tools for automating prompt engineering and other critical behind-the-scenes GenAI data workflows, bolsters Rockset’s existing capabilities for serving scalable GenAI apps.
The ANN algorithm is critical for quickly matching GenAI app user input to pre-generated vector embeddings stored in a vector database. It’s used both in vector search, where it powers the similarity search, as well as other GenAI use cases for text and computer vision.
Rockset’s implementation of ANN is unique, Venkataramani says, because it updates the ANN index in real time as new data arrives, rather than rebuilding it in a batch job that requires downtime.
“Other vector databases require you to rebuild the entire ANN index and all of that in batch mode, and so you don’t really get a real time application,” he says. “Rebuilding these indexes also is actually way more computationally expensive, but if you can incrementally maintain it, it is a lot cheaper and also more real-time.”
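The difference between a batch rebuild and incremental maintenance can be illustrated with a toy example. The following is a minimal sketch in Python, not Rockset's implementation: it assumes a simple inverted-file style index with fixed, pre-chosen centroids, and the class name `ToyIVFIndex`, its methods, and the sample vectors are all hypothetical. The search is approximate because it scans only the buckets nearest the query instead of every stored vector.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyIVFIndex:
    """Hypothetical ANN index: vectors are grouped into buckets by nearest centroid."""

    def __init__(self, centroids):
        # Assumption: centroids are fixed up front; a real system would learn them.
        self.centroids = centroids
        self.buckets = {i: [] for i in range(len(centroids))}

    def _nearest_centroids(self, vector, n):
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: squared_distance(vector, self.centroids[i]),
        )
        return order[:n]

    def add(self, item_id, vector):
        # Incremental maintenance: only one bucket is touched, nothing is rebuilt.
        bucket = self._nearest_centroids(vector, 1)[0]
        self.buckets[bucket].append((item_id, vector))

    def search(self, query, k, n_probe=1):
        # Approximate: only the n_probe closest buckets are scanned.
        candidates = []
        for bucket in self._nearest_centroids(query, n_probe):
            candidates.extend(self.buckets[bucket])
        candidates.sort(key=lambda pair: squared_distance(query, pair[1]))
        return [item_id for item_id, _ in candidates[:k]]


index = ToyIVFIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("a", [0.0, 1.0])
index.add("b", [1.0, 0.0])
index.add("c", [9.0, 9.0])

before = index.search([0.2, 0.9], k=1)  # only "a", "b", "c" exist so far
index.add("d", [0.2, 1.0])              # new data arrives
after = index.search([0.2, 0.9], k=1)   # "d" is searchable with no rebuild step
```

In this sketch the newly added vector shows up in the very next query because `add` appends to a single bucket. A batch-oriented design would instead regenerate every bucket before new data became visible.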
Rockset’s support for compute-compute separation enables it to run workloads such as index rebuilding, compaction, and ongoing maintenance without impacting the application’s main vector query workload, Venkataramani says. Compute-compute separation gives the database a big advantage when it comes to scaling GenAI applications, he says.
“You can have one or more compute instances for searches and similarity searches and vector searches and other real-time analytics and reporting–whatever applications you have,” the Datanami 2022 Person to Watch says. “They’re completely decoupled. They’re fully independently scalable and isolated from each other. But they work on the same copy of the data, and new data coming in–new updates, inserts, and deletes–will be available for your searches within single-digit milliseconds.”
The fact that Rockset, as a distributed relational database, can store all of a customer’s data as opposed to just storing vectors, as a dedicated vector database does, is another big advantage, Venkataramani says.
“You can have one column that’s basically vector embeddings, and all the other columns and other structured data available right there,” he says. “Building these kinds of hybrid searches across vectors and other metadata that you have is as simple as a SQL where clause. It’s not like you have a vector database and then you put all the other metadata and other structured data in a second separate database and you have to somehow in the application wire them together.”
Having all of the data in one place turns out to be very important in some GenAI use cases, such as powering a song recommendation engine, Venkataramani says. Running an ANN search, or a K nearest neighbor (KNN) search, which takes a brute-force approach that delivers exact answers, is just one of many steps that happen behind the scenes in a recommendation engine. Developers may also apply pre- and post-filtering on other metadata to get the best song recommendations in front of the user.
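For readers unfamiliar with the brute-force approach, here is a minimal sketch of exact KNN search in Python. It assumes cosine similarity as the scoring function and uses a made-up dictionary of two-dimensional song embeddings; the function names and data are hypothetical and are not taken from Rockset.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_search(query, embeddings, k):
    """Exact KNN: score every stored vector against the query, keep the top k."""
    scored = [
        (cosine_similarity(query, vector), item_id)
        for item_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:k]]


# Hypothetical embeddings; real ones typically have hundreds of dimensions.
embeddings = {
    "song_a": [1.0, 0.0],
    "song_b": [0.7, 0.7],
    "song_c": [0.0, 1.0],
    "song_d": [-1.0, 0.0],
}

nearest = knn_search([1.0, 0.0], embeddings, k=2)
```

Because every stored vector is scored, the answer is exact, but the cost grows linearly with the number of vectors. ANN indexes trade a small amount of accuracy to avoid that full scan.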
“You want to push the computation close to where the data lives, but the optimizer needs to be able to know which filters to apply first and which filters to apply second,” he says. “Imagine I have all the vectors in the vector database and all the metadata in the second database. Which one do I do first? If I go and get the 10 songs that are closest in the vector database, all of them might be in my recent playlist. If I go and look at all the songs from all these artists, none of them might be nearest neighbors. So I have to be able to combine them in the same SQL WHERE clause to be able to do this efficiently on the same data set.”
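The ordering problem Venkataramani describes can be reproduced in a few lines. The sketch below is an illustration only, with hypothetical function names and made-up song data: it compares running the vector search first and filtering afterward against filtering first and then searching the remaining rows.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query, rows, k):
    """Return the k rows whose embeddings are closest to the query."""
    return sorted(rows, key=lambda row: squared_distance(query, row["embedding"]))[:k]


def post_filter_search(query, rows, k, predicate):
    """Vector search first, metadata filter second."""
    return [row for row in top_k(query, rows, k) if predicate(row)]


def pre_filter_search(query, rows, k, predicate):
    """Metadata filter first, vector search over the rows that remain."""
    return top_k(query, [row for row in rows if predicate(row)], k)


def not_recently_played(row):
    return not row["recently_played"]


# Hypothetical catalog: the two closest songs are already in the recent playlist.
songs = [
    {"title": "s1", "embedding": [0.0, 0.1], "recently_played": True},
    {"title": "s2", "embedding": [0.1, 0.0], "recently_played": True},
    {"title": "s3", "embedding": [0.5, 0.5], "recently_played": False},
    {"title": "s4", "embedding": [0.9, 0.8], "recently_played": False},
]

query = [0.0, 0.0]
post = post_filter_search(query, songs, 2, not_recently_played)
pre = pre_filter_search(query, songs, 2, not_recently_played)
```

With this data, `post` comes back empty because both nearest neighbors are filtered out after the fact, while `pre` returns the two closest songs the user has not recently played. When vectors and metadata live in one engine, a query optimizer can choose between these orders instead of the application having to wire them together.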
Since OpenAI ignited the GenAI storm a year ago with the launch of ChatGPT, the need for vector capabilities has exploded in the database market. Rockset’s vector capabilities are attracting attention among existing customers as well as prospects that are building GenAI applications, ranging from chatbots to recommendation engines to vector search, Venkataramani says.
“It’s really hot. It’s very, very significant,” he says. “AI applications are not like…a separate category of apps. Every application will have parts of their application powered by AI models and AI kind of capabilities, and it’ll be invisible…You’re not going to have a separate one-off side database to build your AI apps. Every single app in the world right now is going to get enhanced and have some components of it.”
One of the companies adopting Rockset’s vector capabilities is JetBlue. The airline, which recently participated in the vendor’s one-day conference, did a bake-off between Rockset and several other vector databases, and picked Rockset to power GenAI and other applications.
“We saw the immense power of real-time analytics and AI to transform JetBlue’s real-time decision augmentation and automation, since stitching together three to four database solutions would have slowed down application development,” Sai Ravuru, JetBlue’s senior manager of data science and analytics, says in a recent case study. “With Rockset, we found a database that could keep up with the fast pace of innovation at JetBlue.”