Cloudera and Pinecone Unite to Tackle AI Hallucinations with Advanced Vector Database Integration
NEW YORK, Nov. 3, 2023 — Cloudera, Inc., a data company for trusted enterprise artificial intelligence (AI), and Pinecone, a vector database company providing long-term memory for AI, are thrilled to announce a strategic partnership that integrates Pinecone’s AI vector database expertise into Cloudera’s open data platform, aimed at transforming the way organizations harness the power of AI to streamline operations and improve customer experiences.
Pinecone is a market leader, and its vector database is critical infrastructure for Generative AI. Pinecone is optimized to store AI representations of data (vector embeddings) and search through them by semantic similarity, a task that traditional databases handle very inefficiently. This capability is necessary for adding context to queries in applications that use Large Language Models (LLMs). That added context significantly cuts down on erroneous outputs, often referred to as “hallucinations,” helping search and Generative AI applications deliver responses that are accurate and relevant.
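As a rough illustration of what searching by semantic similarity means, the sketch below ranks a handful of stored embeddings by cosine similarity to a query embedding. It uses only the Python standard library and does not call Pinecone or Cloudera software; the document IDs, the three-dimensional vectors, and the function names are hypothetical, and a production vector database would use approximate nearest-neighbor indexes rather than this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Brute-force search: score every stored vector, return the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real embedding models typically produce
# hundreds or thousands of dimensions.
index = {
    "reset-password": [0.9, 0.1, 0.0],
    "billing-refund": [0.1, 0.9, 0.1],
    "login-trouble": [0.8, 0.2, 0.1],
}

# Assumed embedding of a user question about account access.
query = [0.85, 0.15, 0.05]

results = top_k(query, index, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The two account-related entries score far higher than the billing entry even though no keywords are compared, which is the property that keyword-oriented databases lack.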
The partnership will see Cloudera integrate Pinecone’s best-in-class vector database into Cloudera Data Platform (CDP), enabling organizations to more easily build and deploy highly scalable, real-time, AI-powered applications on Cloudera. This includes the release of a new Applied ML Prototype (AMP) that will allow developers to more quickly create and augment new knowledge bases from data on their own website, as well as pre-built connectors that will enable customers to more quickly set up ingest pipelines in AI applications. In the AMP, Pinecone’s vector database uses these knowledge bases to add context to chatbot responses, helping to ensure useful outputs.
Customers can use this same architecture to set up or improve support chatbots or internal support search systems. This enables them to reduce operational costs by decreasing expensive human case-handling efforts and to improve the customer experience with faster resolution times. More information on this AMP and how vector databases add context to AI applications can be found in our blog post.
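To make the idea of adding knowledge-base context to a chatbot response more concrete, here is a minimal sketch of the prompt-assembly step in such an architecture. It is not the AMP's code: the function name, the prompt wording, and the sample passages are hypothetical, and the passages are assumed to have already been retrieved from a vector database.

```python
def build_prompt(question, passages, max_passages=3):
    """Combine retrieved knowledge-base passages with the user's question.

    The passages are assumed to arrive sorted from most to least relevant.
    """
    context = "\n".join(f"- {p}" for p in passages[:max_passages])
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Hypothetical passages standing in for vector-search results.
passages = [
    "Passwords can be reset from the account settings page.",
    "Reset links expire after 24 hours.",
    "Refunds are processed within five business days.",
    "Support is available on weekdays.",
]

prompt = build_prompt("How do I reset my password?", passages)
print(prompt)
```

Grounding the model in retrieved passages, and telling it to admit when they are insufficient, is one common way to reduce the unsupported answers described above.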
“Cloudera’s extensive expertise in data management combined with Pinecone’s cutting-edge vector database creates a formidable partnership. A lot of our customers already manage their data with Cloudera. Now it will be easier than ever for them to build AI applications using their embeddings stored with us and data stored with Cloudera. Together we will enable organizations to deliver unparalleled personalized experiences, drive user engagement, and achieve business success,” said Elan Dekel, Vice President of Product, Pinecone.
“We are excited to bring the power of Pinecone vector database and semantic search capabilities to our public cloud customers to accelerate generative AI use cases, and significantly improve the developer experience at scale,” said Abhas Ricky, Chief Strategy Officer, Cloudera.
“Integration of Pinecone with CDP adds a very critical new functionality that will help clients build generative AI applications,” said Sanjeev Mohan, founder of SanjMo and former Gartner analyst. “In addition, the planned integration between the open source Apache NiFi-based Cloudera Data Flow (CDF) and Pinecone further bolsters CDP’s emphasis on universal data distribution for AI. CDP customers can bring AI to where their data resides – on-premises, in the cloud or on the edge.”
Cloudera believes data can make what is impossible today, possible tomorrow. We empower people to transform their data into trusted enterprise AI so they can reduce costs and risks, increase productivity, and accelerate business performance. Our open data lakehouse enables secure data management and portable cloud-native data analytics helping organizations manage and analyze data of all types, on any cloud, public or private. With as much data under management as the hyperscalers, we’re a data partner for the top companies in almost every industry. Cloudera has guided the world on the value and future of data, and continues to lead a vibrant ecosystem powered by the relentless innovation of the open source community.
Pinecone created the vector database, which acts as the long-term memory for AI models and is a core infrastructure component for AI-powered applications. The managed service lets engineers build fast and scalable applications that use embeddings from AI models, and get them into production sooner. Pinecone recently raised $100M in Series B funding at a $750M valuation. The funding round was led by Andreessen Horowitz, with participation from ICONIQ Growth and previous investors Menlo Ventures and Wing Venture Capital. Pinecone operates in San Francisco, New York, and Tel Aviv.