January 13, 2020

How Data Engineers Have Helped Data Prep Grow Up

In recent years, a new term has cropped up more and more frequently in the data world: DataOps. An adaptation of the software development methodology DevOps, DataOps refers to the tools, methodologies, and organizational structures that businesses must adopt to improve the speed, quality, and reliability of analytics.

Seems pretty straightforward, right? Unfortunately, it’s not.

There are three main pieces to the DataOps puzzle that any organization must account for: technology, process, and people. The organizations we’ve rubbed shoulders with through our work at Trifacta understand the balance involved; for successful DataOps, each of these pieces must inform and depend upon the others. Investing in one does little good without considering the others.

Data prep: technology, process, and people

When it comes to technology, the emergence of cloud and self-service has led to broader “analytics modernization” initiatives, wherein companies are augmenting or replacing existing analytics investments with modern solutions designed for today’s users, computing platforms, and governance requirements.

However, successful adoption of the “modern analytics stack” hinges on one fundamental process challenge: how do you balance self-service, governance, and scale? It’s an impossible challenge if clearly defined roles and responsibilities are not set.

Then there are the people involved in an organization’s analytics processes. IT or data architects manage the technology infrastructure. Data analysts and data scientists work hands-on, preparing and analyzing the data. But once an analyst develops something of value, how do the right people across the organization get access to it? How do you make sure the data is accurate and well governed? How does the entire process get configured to be scalable and automated?

This is where data engineers come in.

How the data engineer has helped data prep grow up

Data engineers focus on taking the work of end users and operationalizing it for the broader organization’s use. As end users build new data prep workflows and analyses of value, it’s the role of data engineers to manage data prep operationalization: the process of scaling, scheduling, and governing this work. In a sense, it’s a hand-off between the individual with the greatest context for the data and the individual with the greatest context for the organization’s systems, processes, and data governance.
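To make the scheduling side of that hand-off concrete, here is a minimal sketch of what operationalizing an analyst’s prep flow might look like, using Apache Airflow as a stand-in orchestrator. Airflow, the pipeline name, and the run_prep_flow command are all illustrative assumptions for this example, not anything the article or Trifacta specifies.

```python
# A hypothetical sketch: a data engineer wraps an analyst's published data
# prep flow in a scheduled, retryable Airflow pipeline so the broader
# organization gets fresh, governed output every day.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-engineering",
    "retries": 2,                          # re-run on transient failures
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_customer_data_prep",     # hypothetical pipeline name
    default_args=default_args,
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Kick off the prep flow the analyst built. This CLI call is a
    # placeholder for whatever job-execution API the prep platform exposes.
    run_prep = BashOperator(
        task_id="run_prep_flow",
        bash_command="run_prep_flow --flow customer_cleanup "
                     "--output s3://warehouse/clean/",
    )
```

The point of the sketch is the division of labor: the analyst owns the logic inside the flow, while the data engineer owns the schedule, retries, and output location that make it dependable at organizational scale.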

This hand-off between data engineer and knowledge worker has become increasingly critical as the data prep process has transitioned from siloed desktop applications to modern cloud and data lake environments. It’s why we’ve architected Trifacta to utilize scalable, modern computing platforms in the cloud: end users have the freedom to work with any type of data regardless of size or shape, and data engineers have the appropriate computing environment to govern and operationalize the valuable work their end users develop through data prep.

In this sense, data engineers have helped data prep grow up.

What was once a siloed activity in Excel or desktop apps is now a repeatable, scalable process that can fuel the broader organization’s DataOps practices in service of constantly improving the velocity, quality, and reliability of analytics.

Not only have data engineers helped data prep grow up, but they’ve also helped our platform mature.

Interested in trying Trifacta for yourself? Start free today!
