November 14, 2018

Trifacta Extends Data Preparation to DataOps with New Functionality for Data Engineers

SAN FRANCISCO, Nov. 14, 2018 — Trifacta, a global leader in data preparation, today announced a new set of capabilities designed to enable data engineers within data operations (DataOps) practices to more efficiently develop, test, schedule and monitor data preparation pipelines in production. With RapidTarget, Trifacta customers can utilize an existing data model to intelligently guide transformations and accelerate the process of generating a new  output dataset that matches the predefined schema. Automator provides end-to-end management of scheduling, monitoring and refining preparation workflows. These new features, along with the new Deployment Manager framework, facilitate the work of data engineers in administering data pipelines that feed analytics, machine learning and data science initiatives.

“Today’s organizations are increasingly adopting DataOps to more efficiently scale their analytics and data management practices,” said Wei Zheng, VP of Products at Trifacta. “Data engineers are at the center of this shift, playing a critical role in the data preparation practices that form the backbone of their data efforts. Our team has extensively studied the emerging role of data engineers within our customers’ organizations and focused these latest enhancements on improving their ability to transition data prep workflows into scalable, repeatable data pipelines. Our introduction of these new capabilities demonstrates the expanding role Trifacta is playing within the data management practices of our customers.”

DataOps is clearly on the rise, with 73 percent of organizations indicating they planned to invest in the methodology in 2018. In light of this growing adoption of DataOps practices, the role of the data engineer in configuring, managing and scaling data pipelines has risen as a critical position within data management. Trifacta, through its work with large-scale enterprises including New York Life, GlaxoSmithKline and Deutsche Boerse, has seen first-hand how data engineers are a driving force behind the maturation of data preparation from an ad-hoc activity to a standard enterprise data management process. These new features in the Trifacta platform further support the role that data engineers perform in promoting data preparation flows developed by analyst end users into scalable data pipelines that deliver value to the broader organization.

“Data engineering is a critical function at Malwarebytes—ultimately, we’re the ones responsible for the secure adoption of new analytics use cases across the organization,” said Manjunath Vasishta, Director of Data Science and Engineering, Malwarebytes. “Trifacta’s data preparation platform has been instrumental to that objective, allowing us to provide our business users with increased accessibility to data, while also maintaining governance. As data preparation usage continues to grow, the platform’s functionality provides data engineers with even further assurance to scale and operationalize data prep workflows in order to keep pace with the demands of the business.”

The new features from Trifacta to further support data engineers and DataOps-focused organizations include:

  • RapidTarget: Allows data engineers to set a predefined schema target that provides automated guidance for how diverse data sources must be prepared and joined together in order to map to that target. Aligning the transformation process to an existing data model is critical when integrating unfamiliar or external data into an analysis. RapidTarget offers intelligent suggestions to accelerate how users prep and blend data sources in order to align their output to fit a desired data model.
  • Automator: Trifacta’s system to intelligently manage the scaling, scheduling and monitoring of data prep workflows in production. Users can schedule flows to run automatically in production at a given time, when data updates, or programmatically using APIs. Upon scheduling, users can define parameters or variables to customize what data is input into a workflow and how outputs are published. Finally, they can monitor the status and performance of jobs, and set alerts when input data changes or anomalies occur.
  • Deployment Manager: Provides a framework for testing, versioning and managing data preparation workflows as they transition into enterprise-wide data pipelines. Before a new workflow can bring value to the broader organization, the flow must be tested at scale. With Trifacta’s Deployment Manager, data engineers can seamlessly test new workflows across development, testing and staging environments before the workflows are transitioned into production pipelines. The framework also allows data engineers to manage the versioning of flows and even roll back to prior versions of flows if needed.
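
As a rough illustration of the idea behind matching output to a predefined schema target, the following Python sketch compares a source dataset's columns against a target schema and reports what still needs to be prepared. It is not Trifacta's API or implementation: the names SchemaReport and compare_to_target, the column names, and the simple string type labels are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaReport:
    """Differences between a target schema and a source dataset (hypothetical)."""
    missing: list = field(default_factory=list)          # in target, not in source
    unexpected: list = field(default_factory=list)       # in source, not in target
    type_mismatches: dict = field(default_factory=dict)  # column -> (expected, actual)

    @property
    def matches(self) -> bool:
        # The source fits the target only when no differences remain.
        return not (self.missing or self.unexpected or self.type_mismatches)


def compare_to_target(target: dict, actual: dict) -> SchemaReport:
    """Compare {column: type} mappings and collect the differences."""
    report = SchemaReport()
    for column, expected_type in target.items():
        if column not in actual:
            report.missing.append(column)
        elif actual[column] != expected_type:
            report.type_mismatches[column] = (expected_type, actual[column])
    report.unexpected = [column for column in actual if column not in target]
    return report


# Hypothetical target schema and source dataset, for illustration only.
target_schema = {"customer_id": "int", "signup_date": "date", "region": "str"}
source_columns = {"customer_id": "str", "region": "str", "notes": "str"}

report = compare_to_target(target_schema, source_columns)
print("missing:", report.missing)
print("unexpected:", report.unexpected)
print("type mismatches:", report.type_mismatches)
print("matches target:", report.matches)
```

A report like this is the kind of input a guidance system could use to suggest transformations, such as casting a column's type or deriving a missing field, before the output is published.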

According to a recent Gartner report, “Providing the right data to the right people at the right time is a constant challenge. Data engineers are emerging as a crucial role in addressing this challenge. Data and analytics leaders must therefore develop a data engineering discipline as part of their data management strategy.”*

Learn more about how data engineers are using Trifacta as part of broader adoption of DataOps practices on Trifacta’s blog.

*Gartner, Data Engineering Is Critical to Driving Data and Analytics Success, Roxane Edjlali, Nick Heudecker, Ehtisham Zaidi, 4 October 2018

About Trifacta

Trifacta is a global leader in data preparation. Trifacta leverages decades of innovative research in human-computer interaction, scalable data management and machine learning to make the process of preparing data faster and more intuitive. Around the globe, tens of thousands of users at more than 8,000 companies, including leading brands like Deutsche Boerse, Google, Kaiser Permanente, New York Life and PepsiCo, are unlocking the potential of their data with Trifacta’s market-leading data preparation solutions. Learn more at trifacta.com.


Source: Trifacta
