July 18, 2023

DataStax Rolls Out Vector Search for Astra DB to Support Gen AI

Jaime Hampton

DataStax just announced the general availability of its vector search capability in Astra DB, its database-as-a-service (DBaaS) built on Apache Cassandra.

Vector search is a must-have capability for building generative AI applications. In machine learning, vector embeddings are distilled numerical representations of raw training data, and they act as a filter through which new data is run during inference. Training a large language model can produce billions of vector embeddings.

Vector databases store these embeddings and perform a similarity search to find the best match between a user’s prompt and the vectorized training data. Instead of searching with keywords, embeddings allow users to conduct a search based on context and meaning to extract the most relevant data.
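To make the idea of similarity search concrete, here is a minimal sketch in plain Python. It is not Astra DB's implementation: the three-dimensional vectors, the document names, and the `top_k` helper are made up for illustration, and a production vector database would use an approximate index rather than scoring every stored vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical toy embeddings; real ones have hundreds or thousands of dimensions.
store = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "privacy-notice": [0.0, 0.2, 0.9],
}

# Stand-in for the embedding of a user's prompt.
query_embedding = [0.8, 0.2, 0.1]

print(top_k(query_embedding, store, k=2))
```

The query never has to share a keyword with the stored documents; it only has to land near them in the embedding space.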

There are native databases specifically built to manage vector embeddings, but many relational and NoSQL databases (like Astra DB) have been modified to include vector capabilities due to the demand surrounding generative AI.

This demand is palpable: McKinsey estimates that generative AI could add between $2.6 trillion and $4.4 trillion in value to the global economy annually. DataStax CPO Ed Anuff noted in a release that databases capable of supporting vectors are crucial to tapping into the potential of generative AI as a sustainable business initiative.

“An enterprise will need trillions of vectors for generative AI so vector databases must deliver limitless horizontal scale. Astra DB is the only vector database on the market today that can support massive-scale AI projects, with enterprise-grade security, and on any cloud platform. And, it’s built on the open source technology that’s already been proven by AI leaders like Netflix and Uber,” he said.

DataStax says one advantage of vector search within Astra DB is that it can help reduce AI hallucinations. LLMs are prone to fabricating information, known as hallucinating, which can be damaging to a business. This vector search release supports retrieval-augmented generation (RAG), a technique that grounds a model's responses in specific enterprise data so that the source of information can be easily pinpointed.
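The RAG pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DataStax's implementation: the passages, their two-dimensional embeddings, and the `retrieve` and `build_grounded_prompt` helpers are hypothetical, and the final call to a language model is left out.

```python
def retrieve(question_embedding, passages, k=1):
    """Rank passages by dot product with the question embedding; keep the top k."""
    def score(passage):
        return sum(q * p for q, p in zip(question_embedding, passage["embedding"]))
    return sorted(passages, key=score, reverse=True)[:k]

def build_grounded_prompt(question, retrieved):
    """Put retrieved passages, tagged with their source, ahead of the question."""
    context = "\n".join(f"[{p['source']}] {p['text']}" for p in retrieved)
    return (
        "Answer using only the context below and cite the source in brackets. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

# Hypothetical enterprise passages with toy embeddings.
passages = [
    {"source": "hr-handbook", "embedding": [0.9, 0.1],
     "text": "Employees accrue 1.5 vacation days per month."},
    {"source": "it-policy", "embedding": [0.1, 0.9],
     "text": "Passwords must be rotated every 90 days."},
]

question = "How many vacation days do employees earn each month?"
question_embedding = [1.0, 0.0]  # stand-in for a real embedding of the question

prompt = build_grounded_prompt(question, retrieve(question_embedding, passages))
print(prompt)
```

Because each passage carries its source label into the prompt, an answer can be traced back to the document it came from.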

Data security is another factor to consider with generative AI deployment, as many AI use cases involve sensitive data. DataStax says Astra DB is PCI, SOC2, and HIPAA enabled so that companies like Skypoint Cloud Inc., which offers a data management platform for the senior living healthcare industry, can use Astra DB as a vector database for resident health data.

“Envision it as a ChatGPT equivalent for senior living enterprise data, maintaining full HIPAA compliance, and significantly improving healthcare for the elderly,” said Skypoint CEO Tisson Mathew in a statement.

To support this release, DataStax also created a Python library called CassIO aimed at accelerating vector search integration. The company says this software framework easily integrates with popular LLM software like LangChain and can maintain chat history, create prompt templates, and cache LLM responses.

The new vector search capability is available on Astra DB for Microsoft Azure, AWS, and Google Cloud. The company also says vector search will be available for customers running DataStax Enterprise, the on-premises, self-managed offering, within the month.

Matt Aslett of Ventana Research expects generative AI adoption to grow rapidly and says that through 2025, one-quarter of organizations will deploy generative AI embedded in one or more software applications.

“The ability to trust the output of generative AI models will be critical to adoption by enterprises. The addition of vector embeddings and vector search to existing data platforms enables organizations to augment generic models with enterprise information and data, reducing concerns about accuracy and trust,” he said.

Technologies: Cloud, Frameworks

Vendors: DataStax

Tags: Astra DB, datastax, generative AI, LangChain, python, vector embedding, vector search
