October 23, 2023

Databricks Makes a CDC Play with Arcion Acquisition

The quicker you can access transactional data directly from the change log of a database, the quicker you can do analytics or AI on it. That’s the basic math behind Databricks’ announcement today that it intends to acquire change data capture (CDC) developer Arcion Labs for $100 million.

Arcion Labs was founded in 2016 by Miryana Joksovic and Rajkumar Sen to address the difficulty in moving data among databases, applications, and the cloud. Originally called BlitzzIO, the company developed what it termed a Web-native, distributed CDC product that simplifies the creation and maintenance of data pipelines that can support high volumes of transactional data with guaranteed delivery.

Arcion connects to more than 20 transactional databases and data warehouse platforms. That includes sources like Oracle, SQL Server, Db2, MySQL, PostgreSQL, MariaDB, SAP Hana, MongoDB, Cassandra, YugabyteDB, SingleStore, and Kafka, as well as targets like Teradata, Snowflake, Google BigQuery, Amazon Aurora, and Databricks.

The Arcion CDC offering is log-based, which is the gold standard for speed and accuracy. Arcion claims that its native CDC connectors can replicate data five times faster than traditional batch-based ETL workloads. The product is able to detect and replicate DDL and DML changes, as well as changes to database schemas, helping to keep data accurate.
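For readers unfamiliar with the log-based approach, the general idea can be sketched in a few lines of Python. This is a minimal illustration, not Arcion's implementation: the `ChangeEvent` record, the `apply_changes` function, and the use of a plain dictionary as the target table are all hypothetical names chosen for the example. It assumes each entry in the source database's change log carries a monotonically increasing sequence number, so a replica can apply changes in order and safely skip anything it has already seen.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class ChangeEvent:
    """One hypothetical change-log entry read from a source database."""
    lsn: int                              # log sequence number (assumed monotonic)
    op: str                               # "insert", "update", or "delete"
    key: Any                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # full row image; None for deletes


def apply_changes(target: Dict[Any, Dict[str, Any]],
                  events: Iterable[ChangeEvent],
                  last_applied_lsn: int = 0) -> int:
    """Apply change events to the target in log order.

    Returns the highest sequence number applied, which the caller can
    persist as a checkpoint and pass back in on the next run.
    """
    for event in sorted(events, key=lambda e: e.lsn):
        if event.lsn <= last_applied_lsn:
            continue  # already applied; skipping makes a replay harmless
        if event.op in ("insert", "update"):
            target[event.key] = dict(event.row or {})
        elif event.op == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
        last_applied_lsn = event.lsn
    return last_applied_lsn


# A toy change log: two inserts, one update, one delete.
log = [
    ChangeEvent(1, "insert", 101, {"id": 101, "status": "new"}),
    ChangeEvent(2, "insert", 102, {"id": 102, "status": "new"}),
    ChangeEvent(3, "update", 101, {"id": 101, "status": "shipped"}),
    ChangeEvent(4, "delete", 102),
]

replica: Dict[Any, Dict[str, Any]] = {}
position = apply_changes(replica, log)
print(replica, position)
```

Because only the changes are shipped, rather than a fresh copy of every table, the target can stay close to the source without the repeated full scans that batch ETL relies on. The checkpoint returned by `apply_changes` is what lets a pipeline resume after a failure without applying the same change twice.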

Databricks wants to use Arcion to speed the replication of transactional data into its lakehouse platform, says Ali Ghodsi, co-founder and CEO at Databricks.

“To build analytical dashboards, data applications, and AI models, data needs to be replicated from the systems of record like CRM, ERP, and enterprise apps to the lakehouse,” Ghodsi says in a press release. “Arcion’s highly reliable and easy-to-use solution will enable our customers to make that data available almost instantly for faster and more informed decision-making.”

Databricks had previously partnered with Arcion and was also an investor in the San Mateo, California-based company, including in a $13 million Series A round in February 2022. It announced a partnership with Arcion for replicating data into the Databricks lakehouse two months later.
