TimescaleDB Is a Vector Database Now, Too
Organizations that are using TimescaleDB to store and query their time-series data may be interested to know that they can use the database to store and query vectors for GenAI applications, too.
Timescale is best known for developing an open source time-series database. The New York City company added extensions to Postgres to make time-series data a first-class data type for IoT-type applications, including gaming.
With today’s launch of Timescale Vector, the company is now entering the market for vector databases, which is flourishing as a result of the massive interest in generative AI applications built atop large language models.
Vector databases serve as a sort of long-term memory for LLMs, such as OpenAI’s GPT-4 and Llama from Meta. By storing and indexing vector embeddings, the numerical representations of pieces of text, the vector database can more quickly match the GenAI application’s user input at run time to the most pertinent piece of stored text.
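To make the matching step concrete, here is a minimal sketch of an exact nearest-neighbor lookup over embeddings using cosine similarity. It is an illustration only, not Timescale's or pgvector's implementation. The document ids and three-dimensional vectors are made up for the example; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, store, k=1):
    """Return the k document ids whose embeddings are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda doc_id: cosine_similarity(query, store[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings, invented for illustration.
store = {
    "doc_billing": [0.9, 0.1, 0.0],
    "doc_sensors": [0.1, 0.9, 0.2],
    "doc_gaming": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of the user's question.
query = [0.2, 0.8, 0.1]

print(nearest(query, store, k=1))
```

This brute-force version compares the query against every stored vector, which is why production systems add an index rather than scanning the whole table.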
In TimescaleDB’s case, the company adopted pgvector, the open source vector library for Postgres. In addition to incorporating pgvector, the company bolstered its vector capability by using an Approximate Nearest Neighbor (ANN) algorithm, which it claims gives it much better performance than both plain vanilla pgvector and dedicated vector databases.
“We’ve built the additional support for these type of vector lookups that could enable people to build LLM models on top of it to answer … questions in a way that is much more performant, faster, and has better accuracy than other stuff that’s in the market,” says Michael Freedman, the CTO and co-founder of Timescale.
In a lengthy blog post today, the company shared some internal benchmark figures that it says prove its ANN index gives it better, faster performance on a dataset of 1 million OpenAI embeddings than competing vector databases.
The company claims it delivered 243% faster search speed at 99% recall than the vector database from Weaviate. It also claimed that it achieved about 39% faster search speed than pgvector’s hierarchical navigable small world (HNSW) algorithm and 363% faster search speed than pg_embedding.
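For readers unfamiliar with the benchmark terms, the sketch below shows how recall is typically computed for an approximate index, and how a "percent faster" figure converts to a throughput multiple. The conversion assumes "N% faster" means N% more queries per second, which the article does not specify. The id lists are invented for illustration and are not Timescale's benchmark data.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate search returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)


def speedup_ratio(percent_faster):
    """Convert 'N% faster' to a multiple, assuming N% more queries per second."""
    return 1 + percent_faster / 100


# Hypothetical results: the exact top 10 versus what an ANN index returned.
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]

print(recall_at_k(approx, exact))
print(round(speedup_ratio(243), 2))
```

Under that assumption, the claimed 243% figure corresponds to roughly 3.4 times the query throughput, and 99% recall means the index returns, on average, 99 of every 100 true nearest neighbors.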
“Timescale Vector optimizes hybrid time-based vector search, leveraging the automatic time-based partitioning and indexing of Timescale’s hypertables to efficiently find recent embeddings, constrain vector search by a time range or document age, and store and retrieve LLM response and chat history with ease,” the company writes in the blog.
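The idea of constraining a vector search by time can be sketched in a few lines: filter the rows to a time window first, then rank only the survivors by similarity. This is a brute-force illustration, not Timescale's implementation, which according to the quote relies on hypertable partitioning to prune by time. The row layout, ids, and dates are hypothetical.

```python
import math
from datetime import datetime, timedelta


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_recent(query, rows, since, k=1):
    """Rank only the rows created at or after `since`, most similar first."""
    candidates = [r for r in rows if r["created_at"] >= since]
    candidates.sort(
        key=lambda r: cosine_similarity(query, r["embedding"]),
        reverse=True,
    )
    return [r["id"] for r in candidates[:k]]


# Arbitrary reference time and made-up rows for the example.
now = datetime(2024, 1, 1)
rows = [
    {"id": "old_match", "created_at": now - timedelta(days=90), "embedding": [1.0, 0.0]},
    {"id": "recent_close", "created_at": now - timedelta(days=2), "embedding": [0.9, 0.1]},
    {"id": "recent_far", "created_at": now - timedelta(days=1), "embedding": [0.0, 1.0]},
]
query = [1.0, 0.0]

# Restricting to the last week excludes the older, closer match.
print(search_recent(query, rows, since=now - timedelta(days=7)))
```

Widening the window to a year would return the older row instead, since it is the closest match overall. The time filter changes which answer wins, not just how fast it is found.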
In an interview with Datanami, Freedman also singled out Pinecone, which develops a dedicated vector database, as a new competitor. The problem with dedicated vector databases, Freedman says, is that they only store vector embeddings.
“But often you might have other relational data that you want to use in your question,” he says. “So if you’re building applications on Pinecone, you might need to deploy Pinecone and Postgres and something else, and then bring all that data together at query time and answer questions. If you’re using Timescale, it all sits together in one database, and you could actually build a lot of applications with a much simpler, operationally simpler stack.”
While TimescaleDB is best known as a time-series database, the company has since moved away from that niche and now considers itself to be a general database provider. It can not only store time-series and event data for IoT and gaming applications, but thanks to its Postgres core, it can store any relational data.
“We call ourselves Postgres ++,” Freedman says. “We’re Postgres ‘and.’ We’re not Postgres ‘or.’”
Having that underlying Postgres compatibility gives Timescale the capability to store the data for any organization that is already using Postgres. That’s a considerable market, considering that Postgres is the world’s most popular database. And that has translated into a considerable amount of success for the open source offering, which counts tens of millions of users, Freedman says. The managed database service that Timescale offers in the cloud has about 1,000 paying customers, he says.
“They’re like, ‘Oh, I already use Postgres. I should just be using you for all of [my workloads],’” Freedman says. “As long as they want a relational database like Postgres, we can become a great go-to for Postgres.”
Timescale introduced its vector support to cloud customers a few months ago and today it’s officially announcing the start of the preview program. The company has attracted several early adopters for its vector capability, including PolyPerception, a European provider of recycling solutions.
“The simplicity and scalability of Timescale Vector’s integrated approach to use Postgres as a time-series and vector database allows a startup like us to bring an AI product to market much faster,” PolyPerception CEO Nicolas Bream says in the Timescale blog. “Choosing TimescaleDB was one of the best technical decisions we made, and we are excited to use Timescale Vector.”
Another early adopter, Blueway Software, is also finding the database a good fit for its GenAI development. “Using Timescale Vector allows us to easily combine PostgreSQL’s classic database features with storage of vector embeddings for Retrieval Augmented Generation (RAG),” says Alexis de Saint Jean, the company’s Innovation Director. “Timescale’s easy-to-use cloud platform and good support keep our team focused on imaging solutions to solve customer pains not on building infrastructure.”
You can learn more at www.timescale.com.
Editor’s note: This article has been corrected. The vector feature is in preview, not general availability. Datanami regrets the error.