Striim Announces Fully Managed Real-Time Streaming and Integration Service for Analytics on the Databricks Lakehouse
PALO ALTO, Calif., June 22, 2023 — Striim today announced Striim for Databricks, the first streaming SaaS solution to integrate database change streams using change data capture (CDC) technologies from enterprise-grade databases such as Oracle, SQL Server, PostgreSQL, MySQL, and other sources to the Databricks Lakehouse. Customers can quickly build a new data pipeline to stream transactional data from hundreds and thousands of tables to Databricks with sub-second end-to-end latencies to enable streaming analytics, refresh their AI/ML models in real time, and address time-sensitive operational issues.
“Enterprises are increasingly seeking solutions that help bring critical data stored in databases into the Databricks Lakehouse Platform with speed and reliability,” said Roger Murff, VP of Technology Partners at Databricks. “With this integration, customers can quickly and easily integrate their data into Databricks and begin analyzing and driving business value with data throughout their organizations.”
Organizations replicate data from multiple databases to cloud data warehouses, data lakes, and data lakehouses to enable their data science and analytics teams to optimize their decision-making and business workflows. Legacy data warehouses are not scalable or performant enough to deliver real-time analysis capabilities, while cloud-based data ingestion platforms can require significant effort to set up.
Striim for Databricks builds on Striim’s award-winning data integration and streaming capabilities to simplify building and operating data pipelines and enable real-time streaming workloads on the lakehouse. Using the newly designed user interface, customers can configure their data pipelines, observe their ongoing and historical health and performance, add or remove tables on the fly, and easily repair pipelines in case of failures.
“A unified approach that combines the best of data warehouses and data lakes for modern workloads to make real-time decisions relies on fresh data being delivered in an open format,” said Alok Pareek, co-founder and Executive Vice President of Engineering and Products at Striim. “Our customers increasingly need operational data in Delta tables for their data analytics needs. We have designed Striim for Databricks to support Delta tables and Databricks Unity Catalog for operational ease, data sharing, flexibility, and resiliency so that our customers can use Spark DataFrames or SQL to easily extract business value from their data. We have automated schema management, snapshot, CDC coordination, and failure handling in the data pipelines to deliver a delightful user experience.”
Striim for Databricks provides a high level of automation. Customers can set up their data pipelines with a few clicks, and Striim takes care of the rest. Striim uses patented technologies and Databricks best practices to parallelize writing to Databricks, maximizing pipeline throughput and reducing end-to-end latencies. Striim continuously monitors and reports pipeline health and performance. Striim for Databricks natively stores and reports health and performance data so customers can quickly analyze and optimize pipeline performance based on real-time, near-term, and historical data.
Customers can learn more about Striim for Databricks at www.striim.com/databricks. Striim builds and hosts the data pipelines in the customer’s chosen Databricks region, thus enabling them to meet their business and regulatory requirements. Striim for Databricks is designed with standard enterprise-grade security and reliability features, including end-to-end encryption, schema evolution, efficient state management, and automated alerting, monitoring, and notifications.
Striim, Inc. is the only supplier of unified, real-time data streaming and integration for analytics and operations in the Digital Economy. Striim Platform and Striim Cloud make it easy to continuously ingest, process, and deliver high volumes of real-time data from diverse sources (both on-premises and in the cloud) to support multi- and hybrid cloud infrastructure. Striim collects data in real time from enterprise databases (using non-intrusive change data capture), log files, messaging systems, and sensors, and delivers it to virtually any target on-premises or in the cloud with sub-second latency, enabling real-time operations and analytics.