May 29, 2018

Citizen Data Scientists Take On Bigger Roles

Harry Glaser


There’s a common theme I’ve detected in talking to customers about how data is used now and what the future holds. Every company has a clear vision of where it wants its data to go, but no clear, unified vision of how to get there or who will lead the way. The future is clear, but the present is not.

Every business wants to use data to streamline decision-making and get more profitable results. Beyond that, they want to apply data science to build predictive machine learning models that accurately illustrate how those data-driven decisions will affect the future of the business. Given the advances being made in artificial intelligence and machine learning technology, this is clearly how data will be analyzed in the future.

When it comes to who will play a role in analyzing that data, there’s also a consensus: a much larger group of employees expects to be involved. The hunger for data insights is increasing at such an impressive rate that the analysis workload cannot be restricted to formally trained data professionals; it’s already spilling over into new teams. What has traditionally been the realm of a formal data team, full of people with years of academic and on-the-job data science training, is expanding to include a larger group of data analysts and even employees from outside the data team who are data-literate and data-savvy enough to perform their own analysis.

The popular term used for this emerging group is “citizen data scientists,” and they are the driving force behind the latest stage of the data analysis evolution. A citizen data scientist believes that getting more data into the hands of more people on more teams is the simplest way to let that data inform more decisions.

Of course, it’s not that simple. Access to data alone does not transform a citizen into a data scientist. This is the sticking point in the citizen data scientist experiment. The primary obstacle preventing the spread of data science is not simply access; it’s the precision and experience required to ensure that data is properly understood.

When data scientists focus on curation, it frees citizen data scientists to pursue analysis (metamorworks/Shutterstock)

In the rush to empower an organization full of citizen data scientists, there’s a real risk of making bad decisions based on an incomplete or inaccurate understanding of data. The democratization of data carries the risk of bad analysis resulting from misguided exploration or misunderstood findings. For a lot of companies, that risk is too great to give traditional business users access to data. The responsibility to use the data wisely is too great a barrier for most people to enter the analysis world seriously.

The conversation about citizen data scientists cannot end there. Instead of concluding that citizen data scientists are too big a risk, consider a new question: how can you provide an environment that will allow them to be productive? Can new technology and new business practices solve this problem? This is where existing data scientists come into play. Rather than having them train business users or double-check their work, use them to curate the data into a single source of truth and provide a structured environment within which others can confidently explore.

With formal data scientists providing guidance and structure, citizen data scientists are free to maximize their value, exploring data that relates to their knowledge of a particular line of business. This approach lets data scientists do the heavy lifting of data preparation and allows citizen data scientists to lead the final steps of translating it into business value; it’s the best use of both skill sets. It also increases the data literacy of individuals outside of the formal data team and lays the groundwork for a fleet of citizen data scientists to begin exploring data, discovering insights and translating them to business value.

If the goal is to spread access to analytics as widely as possible, this structured analytical environment is the best step for organizations to take right now. It places a lot of responsibility on data scientists to create a suitable environment for other members of a business, but that’s their speciality.

With data scientists curating data for citizen data scientists, the personnel aspect of the analytical process is optimized. Everyone has access to exactly what they need to make the most of their skill set and arrive at decisions quickly. This is the realization of the ideal outcome that customers have described to me.

The path from today’s descriptive data environment to the consensus predictive one is still murky, but I’d suggest that the best answer to how we use data starts by addressing who uses it. With the right people analyzing information at the right depths, the groundwork is laid for more advanced data initiatives and new, more-informed lines of analysis. The predictive data landscape of the future is achievable, but only if we figure out the more pressing question of how to smoothly accommodate citizen data scientists today.

About the author: Harry Glaser is a co-founder and CEO of Periscope Data, an end-to-end analytics platform for data teams. Prior to founding Periscope Data in 2012, Glaser was a product manager at Google.
