July 7, 2020

Qubole Expands Integration with Informatica’s AI-driven Data Engineering Integration

SANTA CLARA, Calif., July 7, 2020 — Qubole, the open data lake company, announced an enhanced integration with Informatica’s CLAIRE-powered Data Engineering Integration to help organizations overcome the challenges of managing and governing big data on cloud data lakes. With Qubole’s open data lake platform, customers can build intelligent data pipelines up to three times faster using Informatica’s zero-code visual design interface and process those data pipelines using Spark on Qubole. This reduces data ingestion times significantly while eliminating the need to manually administer Spark clusters. Qubole also unveiled Qubole Release 59 (R59), a major update to its flagship platform, with new features for increased productivity and reduced TCO.

Enterprises are bursting at the seams with data. As businesses deal with such exponential data growth, compounded by varying data types and architectures, they require a simpler and more efficient solution. Qubole’s integration with Informatica provides end-to-end, metadata-driven data integration: building, orchestrating, and processing data pipelines. With Qubole and Informatica, customers can easily migrate their on-premises legacy data lakes to the cloud while lowering the costs of these deployments.

“There is a direct correlation between larger data workloads and the move toward cloud-native architectures,” said Ashish Thusoo, CEO and co-founder, Qubole. “As enterprises look to do more with their data lakes and reduce data processing and infrastructure costs, they’re turning to cloud architectures to improve uptime, avoid vendor lock-in and gain price leverage. With this Informatica integration, Qubole is answering that call to provide a robust and future-proof data management paradigm to support fast data lake adoption with a wide range of data processing needs.”

“High quality trusted data is critical for cloud analytics and overall cloud data lake modernization,” said Jitesh Ghai, SVP & GM, Data Management at Informatica. “The integration between Informatica and Qubole enables customers to build end-to-end intelligent and automated data pipelines that process data from a variety of sources, allowing them to accelerate productivity and gain competitive advantage.”

Qubole R59

Today Qubole also released R59, its second major product update of the year, with new features and enhancements that empower data teams to be more productive and reduce TCO. Updates include:

  • Workbench – Qubole now offers Workbench across AWS, Azure, and GCP. Key highlights include the continuing gradual rollout of the collections and live cluster health metrics features; the ability to list the contents of AWS S3 buckets from any region in the Workbench storage tab; and stability and usability fixes.
  • New out-of-the-box visualizations – The notebooks provided by Qubole now offer Qviz, a data visualization framework that enables users to render DataFrames with improved charting options, as well as autocomplete suggestions for Spark and PySpark notebooks.
  • Notebook workflows – This facility allows chaining of Notebooks to stitch a sequential extract-transform-load (ETL) workflow in another wrapper Notebook that can be scheduled. It also allows inclusion/concatenation of Notebooks.
  • New Package Management UI – Allows users to install packages from custom channels, and also restore environments to a previous state by accessing the Activity History.
  • Git integration with Airflow clusters through DAG explorer – This brings Airflow DAGs under version control and allows them to be managed through continuous integration / continuous deployment (CI/CD).

Qubole is available on AWS, Google Cloud, and Microsoft Azure. For more information about how Qubole simplifies machine learning, streaming analytics and data exploration, visit Qubole.com.

To learn more about Qubole’s integration with Informatica, visit here.

About Qubole

Qubole is the open data lake platform for analytics and machine learning that large enterprises depend on to quickly harness the power of data and gain valuable business insights. Only Qubole provides a truly open platform that works with all major cloud providers and data processing engines. The company’s unified environment includes optimized versions of Spark, Presto, Hive and Airflow, with intelligent automation that scales usage up or down to meet service-level needs and minimize cloud costs. Based in Santa Clara, Calif., Qubole has offices in New York City, San Francisco, London, Singapore and Bangalore. For more information, visit us online.


Source: Qubole 

Datanami