December 9, 2019

Data Wrangling – Balancing Self-Service with Governance

Most organizations understand the importance of fully leveraging the large quantities of data available to them. Yet most of them run into a bottleneck that is a relic of old, IT-driven data transformation processes. Back when data consisted of little more than transactional information, IT teams kept it in tightly governed, restricted silos. It was IT’s job to provide the clean data the business needed to run reports, since opening up access to data would have been a security nightmare. In addition, most legacy transformation tools required a level of technical expertise that most business professionals at the time did not have. When requirements changed, IT had to adjust the ETL processes that delivered the outputs to the business.

As data volumes explode, and the complexity of data and the opportunity for value increase, that strategy is no longer viable. IT teams cannot respond fast enough to business requests for new datasets as requirements change, and the business cannot wait for the data it needs. The effort spent transforming data for the business also takes away from the time IT needs to monitor security and ensure proper governance. The data sits unused and a huge opportunity for value is missed. Organizations need to take a new approach.

Embracing DataOps

DataOps is all about creating efficient operations with data, and the best way to accomplish this is to embrace a shared platform where data workers and IT teams collaborate towards a shared set of goals. Organizations need to adapt their processes to empower data workers with self-service agility. To analyze, model, and extract the full value of the data available to them, data workers need to be able to directly explore and refine raw data for its downstream purposes. At the same time, IT teams need to be able to maximize security, manage access and monitor data pipelines. Trifacta provides a shared platform focused on collaborative governance: data workers get direct access to the data they need to drive insights and value, while IT teams automate and monitor data pipelines to create scalable, efficient processes. Both groups can then spend their time on what they do best.

Self-Service vs. Governance is No Longer a Trade-Off

It is no longer the case that self-service is at odds with governance. Trifacta provides a highly secured and tightly governed platform where users collaborate in exploring data and creating data preparation workflows. By processing data with cloud-native execution frameworks, allowing administrators to assign groups, roles and restrictions, and providing monitoring and alerting on data pipelines, Trifacta ensures proper governance for IT teams. By opening up access to the data that business teams need to explore, clean and prepare for analytics, machine learning and AI, Trifacta lets them turn that data into value.

Interested in trying Trifacta for yourself? Sign up for a free trial today!