June 10, 2024

Infoworks Streamlines Hadoop to Databricks Migrations with Unity Catalog Integration

PALO ALTO, Calif., June 10, 2024 – Infoworks, a leader in data engineering software automation, recently announced that it has added Databricks Unity Catalog integration to Infoworks Replicator, its solution for automating the migration of Hadoop data and metadata to the cloud.

Migration of Hadoop to the cloud using legacy methodologies has been plagued with technical challenges, budget overruns, and delays, as these methods require large teams of specialized labor and lack significant automation. Infoworks Replicator has been proven at scale, in some of the largest enterprise cloud migrations in the world, to alleviate these issues, substantially lowering the cost, risk, and time-to-value of Hadoop migration.

With the addition of Databricks Unity Catalog to Infoworks’ list of data catalog integrations, Infoworks has further streamlined the Hadoop migration path for Databricks customers. Replicator with Unity Catalog integration enables automated data and metadata migration and cataloging in Databricks Unity Catalog in a single step, which establishes and maintains a robust unified data governance framework at every stage of migration. The result is immediately usable, business-ready data and accelerated time-to-value.

“We have seen a surge in demand for Hadoop migration solutions as businesses respond to the urgency of leveraging their data to generate business value through AI and analytics,” said Amar Arsikere, Infoworks Founder and CTO. “Infoworks has enabled some of the largest companies in the world to accelerate migration and creation of new business value from their data, and we continue to rapidly innovate to meet the needs of our customers.”

About Infoworks Replicator

Infoworks Replicator is the most automated solution on the market for large-scale Hadoop migrations. Amongst the capabilities provided by Infoworks Replicator are:

  • No-code automated data replication
  • Automated validation to self-heal data migration pipelines
  • Continuous incremental replication to synchronize changing data and metadata on source and target
  • Scalable performance leveraging existing Hadoop clusters
  • API endpoints for efficiently migrating large numbers of entities, up to hundreds of thousands of tables
  • Automated fault tolerance and recovery from the point of failure
  • Dynamic network throttling to optimize use of network resources

Applications of Infoworks Replicator include Hadoop migration and cloud-to-cloud data lake migration.

About Infoworks

Infoworks is a leading provider of comprehensive software solutions that deliver unprecedented levels of data engineering productivity through innovative automation and AI. Infoworks products have automated cloud migration and multi-cloud data operations for some of the world’s largest data-intensive enterprises in industries including healthcare, telecommunications, and financial services, operating on all major public clouds and data platforms. Infoworks enables enterprises to eliminate the bottleneck of data engineering and focus on creating business value from their data through analytics and AI.

Source: Infoworks