

(Andy Chipus/Shutterstock)
The large language model (LLM) revolution has transformed vector databases from obscure search tech into must-have products for AI success. But which vector database features should you look for, and which vendors are innovating? The analysts at Forrester recently dug into the field to provide answers in a new report.
Vector databases are designed to manage and process one particular data type called a vector embedding, which is a numerical representation of words, documents, images, or even sound. A vector database indexes and stores these embeddings in a multi-dimensional space that allows users or applications to retrieve these embeddings and others nearby that they resemble. This similarity search function is what enabled users to get much better search results than straightforward keyword-matching, and led to the creation of so-called “AI search engines.”
When ChatGPT dropped the LLM bomb on the world in late 2022, a new use for vector databases was quickly discovered. By storing a set of source documents as embeddings in a vector database and then calling on the database to serve information from those documents via similarity search conducted at runtime as part of the prompt engineering or retrieval-augmented generation (RAG) process, GenAI users discovered they could greatly improve the quality of the responses generated by chatbots, co-pilots, and other forms of AI interactions enabled by LLMs like ChatGPT.
Just a few “native” vector databases existed prior to ChatGPT, such as Pinecone, Milvus, and
Zilliz. But almost overnight, many existing database vendors adapted their wares to be able to store, index, and process vector data, too, including Elastic, DataStax, Couchbase, MongoDB, and even Teradata. For NoSQL and relational databases that were already multi-modal in nature, the addition of the vector data type was a no-brainer.
However, as the market for vector databases exploded, it also created some confusion among users about what is the best approach to adopting vector databases. “Is the pgvector plug-in for Postgres sufficient for my GenAI needs? What benefits does a native vector database bring that multi-modal databases can’t match? Do these vector databases only run in the cloud or can I run them on-prem too?”
Enter Forrester, the longtime IT analyst group based in Cambridge, Massachusetts. In “Vector Databases Landscape, Q2 2024” report, Forrester analyst Noel Yuhanna and several of his colleagues dug into the burgeoning market for vector databases while slicing and dicing the vector database capabilities from 24 vendors.
Forrester started out by defining its terms. “A database management system that provides storage, indexing, processing, and access for data represented by vectors to support similarity searches, RAG apps, modern generative AI/LLM apps, and vector-based analytics,” the company states.
“Customers leverage vector databases to support customer experiences, RAG applications, image similarity search, real-time anomaly data detection, optimized recommendation engines, and fraud detection,” it continues. “Despite being in the nascent stages of this market, we anticipate a surge in diverse use cases in the near term.”
Forrester sees the market for vector database divided into two main segments: native vector DBs and multi-modal vector DBs.
The key difference between the camps, Forrester says, is the greater scalability of native vector DBs, “particularly when handling large volumes of vectors.” The main advantage of a multimodal vector DB, meanwhile, is that it can store other types of data, potentially eliminating the need for two or more separate databases.
The challenges of scale in vector databases have not been entirely solved, and the high-end “is still a work in progress,” Forrester says. “High-end scale and performance still require considerable effort, especially when supporting tens of billions of data points (vectors).”
Forrester didn’t rank the vector databases by their capabilities to tackle standard vector database duties (perhaps that will be the subject of an upcoming Forrester Wave). But it did look into which databases are being positioned for some of the emerging use cases for vector databases, which is handy to know (see image to the right).
The market for vector databases has seen a large number of entrants over the past 12 months, which makes for interesting dynamics that observers and customers should closely watch, Forrester says.
For instance, the capabilities expected of vector databases is changing. Core functions, like vector storage, indexing, and processing, are being augmented with more advanced features, “including enhanced security measures, optimized processing capabilities, and seamless integration with diverse vector embedding transformers and data streaming engines,” the analyst group says.
Another thing to look out for is market bleed-over. Cloud data platforms, including data fabrics and data lakehouses, are also adopting vector capabilities, Forrester says, which could further disrupt the market for vector databases.
“This trend underscores a shift toward comprehensive data management solutions that seamlessly integrate vector functionality, potentially reshaping the landscape of specialized vector databases,” Yuhanna and the Forrester analysts write.
Related Items:
Vector Databases Emerge to Fill Critical Role in AI
Home Depot Finds DIY Success with Vector Search
Can Thought Vectors Deliver Human-Level Reasoning?
June 13, 2025
- PuppyGraph Announces New Native Integration to Support Databricks’ Managed Iceberg Tables
- Striim Announces Neon Serverless Postgres Support
- AMD Advances Open AI Vision with New GPUs, Developer Cloud and Ecosystem Growth
- Databricks Launches Agent Bricks: A New Approach to Building AI Agents
- Basecamp Research Identifies Over 1M New Species to Power Generative Biology
- Informatica Expands Partnership with Databricks as Launch Partner for Managed Iceberg Tables and OLTP Database
- Thales Launches File Activity Monitoring to Strengthen Real-Time Visibility and Control Over Unstructured Data
- Sumo Logic’s New Report Reveals Security Leaders Are Prioritizing AI in New Solutions
June 12, 2025
- Databricks Expands Google Cloud Partnership to Offer Native Access to Gemini AI Models
- Zilliz Releases Milvus 2.6 with Tiered Storage and Int8 Compression to Cut Vector Search Costs
- Databricks and Microsoft Extend Strategic Partnership for Azure Databricks
- ThoughtSpot Unveils DataSpot to Accelerate Agentic Analytics for Every Databricks Customer
- Databricks Eliminates Table Format Lock-in and Adds Capabilities for Business Users with Unity Catalog Advancements
- OpsGuru Signs Strategic Collaboration Agreement with AWS and Expands Services to US
- Databricks Unveils Databricks One: A New Way to Bring AI to Every Corner of the Business
- MinIO Expands Partner Program to Meet AIStor Demand
- Databricks Donates Declarative Pipelines to Apache Spark Open Source Project
June 11, 2025
- What Are Reasoning Models and Why You Should Care
- The GDPR: An Artificial Intelligence Killer?
- Fine-Tuning LLM Performance: How Knowledge Graphs Can Help Avoid Missteps
- It’s Snowflake Vs. Databricks in Dueling Big Data Conferences
- Snowflake Widens Analytics and AI Reach at Summit 25
- Top-Down or Bottom-Up Data Model Design: Which is Best?
- Why Snowflake Bought Crunchy Data
- Change to Apache Iceberg Could Streamline Queries, Open Data
- Inside the Chargeback System That Made Harvard’s Storage Sustainable
- dbt Labs Cranks the Performance Dial with New Fusion Engine
- More Features…
- Mathematica Helps Crack Zodiac Killer’s Code
- It’s Official: Informatica Agrees to Be Bought by Salesforce for $8 Billion
- AI Agents To Drive Scientific Discovery Within a Year, Altman Predicts
- Solidigm Celebrates World’s Largest SSD with ‘122 Day’
- DuckLake Makes a Splash in the Lakehouse Stack – But Can It Break Through?
- The Top Five Data Labeling Firms According to Everest Group
- Who Is AI Inference Pipeline Builder Chalk?
- ‘The Relational Model Always Wins,’ RelationalAI CEO Says
- IBM to Buy DataStax for Database, GenAI Capabilities
- VAST Says It’s Built an Operating System for AI
- More News In Brief…
- Astronomer Unveils New Capabilities in Astro to Streamline Enterprise Data Orchestration
- Yandex Releases World’s Largest Event Dataset for Advancing Recommender Systems
- Astronomer Introduces Astro Observe to Provide Unified Full-Stack Data Orchestration and Observability
- BigID Reports Majority of Enterprises Lack AI Risk Visibility in 2025
- Databricks Announces Data Intelligence Platform for Communications
- MariaDB Expands Enterprise Platform with Galera Cluster Acquisition
- Snowflake Openflow Unlocks Full Data Interoperability, Accelerating Data Movement for AI Innovation
- Databricks Unveils Databricks One: A New Way to Bring AI to Every Corner of the Business
- Gartner Predicts 40% of Generative AI Solutions Will Be Multimodal By 2027
- Databricks Announces 2025 Data + AI Summit Keynote Lineup and Data Intelligence Programming
- More This Just In…