How to Turn Your Company Into a Data-Driven Enterprise
Business executives, whether at startups or global enterprises, are feeling enormous pressure to ensure their companies are data-driven. Yet C-suite surveys show there’s confusion about how to implement an effective data culture. According to a recent report by KPMG Capital, the vast majority of executives at large enterprises are struggling with “how to turn data into insights, and insights into real business advantage.”
In this race to embrace big data, the reflexive move would be to hire more data scientists and charge them with making sense of the mayhem. But there’s a major distinction between data science and data-driven decision making. And you might be surprised to learn that creating or expanding a data science team might not provide you with the business value you are striving to achieve.
Data science is an excellent resource for solving specific problems and offering improved services. For instance, Netflix and Amazon.com built powerful recommendation engines to predict a customer’s interests based on previous purchases. Or a bank might rely on an algorithm to determine who is a good candidate for a loan.
This type of specific goal is appropriate for a data science team to tackle. However, when it comes to meeting broader business goals, data science cannot replace decisions made by people with domain expertise. And an algorithm is not effective when the problem is unpredictable or associative. These are situations that call for empowering non-technical domain experts with tools they can use to analyze and visualize data.
A truly data-driven organization is one where a large portion of the staff uses data analytics on a regular basis and big data is no longer solely the domain of IT. For instance, at Wix, the popular cloud-based web development platform, departments across the company have their hands on the company’s big data. In the highly competitive travel industry, Magellan Luxury Hotels’ COO made it his mission to share big data across the board so agents can analyze sales closings, destination performance, and other metrics that help them better serve customers in near real time.
Instead of asking what type of data stack your company needs and how many more PhDs it should bring on board, start asking questions like “What types of information consumers does our organization consist of?” and “What type of information do they work with?” Only then can you build your data strategy from the bottom up by providing them the tools they need to help themselves.
Hiring more data scientists may be necessary as well. But what you don’t want is a scenario where a handful of IT specialists are empowered and the rest of the organization is sitting in the dark for days or weeks, waiting for replies to their queries.
Know Your User, Know Your Data
So how does one determine what types of information consumers are prevalent in their organization and which tools best suit their needs? Start by gathering answers to the following questions:
1. Do the majority of users rely on Excel?
2. Do most rely on canned reports?
3. Is there a critical mass of staff with data science degrees?
4. Are the majority of decision-makers non-technical business users?
5. What is the volume of data? (i.e. How much data do you have?)
6. Is the data structured, unstructured or both?
7. What is the rate of data increase? (i.e. How fast is it growing?)
8. How many different data sources do you have?
9. Where is the data? (Cloud? Private Cloud? On Premises?)
10. Do you already have an infrastructure? (For example, Hadoop or a stack on top of Hadoop)
If you answered yes to questions 1, 2, and 4, you are probably looking for a data exploration and/or visualization tool for business users. Without a critical mass of data scientists on hand, you should avoid tools that pose integration challenges or require scripting knowledge. Business users need results quickly, in real time. Often they need answers to queries on the fly, during meetings or via mobile device if they’re on the road. Visualization tools allow business users to create reports but are often limited in scale. For instance, if you have more than three data sources or a growing number of users, a visualization tool might choke. Exploration tools allow for analysis of more complex data sources but generally require IT or a consultant to implement and do not always scale. Therefore, an increasing number of companies are turning to new blended technologies that offer both visualization and data exploration.
Data scientists want to get their hands dirty in the world of algorithms, so if you answered yes to question 3, there are a number of different possibilities depending on the make-up of your IT staff. R, MATLAB, and SAS are generally relevant for statisticians and others in the applied sciences. NumPy and SciPy are a good fit for Python developers. And Dato (formerly GraphLab), Apache Spark, and Apache Mahout are a good fit when it comes to distributed development.
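The rules of thumb for the user questions can be written down as a small decision function. This is only a rough sketch: the class, field, and function names are hypothetical, and the hard cutoff at three data sources simplifies the "might choke" warning above into a fixed threshold.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UserProfile:
    """Yes/no answers to the user questions (hypothetical field names)."""
    relies_on_excel: bool           # majority of users rely on Excel
    relies_on_canned_reports: bool  # most rely on canned reports
    has_data_scientists: bool       # critical mass of data science staff
    mostly_business_users: bool     # decision-makers are non-technical


def recommend_tool_categories(profile: UserProfile, data_sources: int = 1) -> list[str]:
    """Map the answers to broad tool categories, following the article's rules of thumb."""
    recommendations: list[str] = []

    business_led = (
        profile.relies_on_excel
        and profile.relies_on_canned_reports
        and profile.mostly_business_users
    )
    if business_led:
        # Assumption: treat "more than three data sources" as a hard cutoff,
        # beyond which a visualization-only tool is likely to struggle.
        if data_sources > 3:
            recommendations.append("blended visualization and exploration tool")
        else:
            recommendations.append("exploration and/or visualization tool")

    if profile.has_data_scientists:
        recommendations.append("data science toolkit")

    return recommendations


if __name__ == "__main__":
    profile = UserProfile(True, True, False, True)
    print(recommend_tool_categories(profile, data_sources=5))
```

A real assessment would also weigh factors the function ignores, such as a growing number of users or whether staff can handle integration and scripting.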
Once you have a clearer picture of your users, you must understand your data (questions 5-10). For projects with limited scope that utilize a single data source, a data mart solution might suffice. For example, you might choose a solution built on OLAP (online analytical processing) or IMDB (in-memory database) technology.
When data volumes are large, growing rapidly, or prone to unpredictable spikes, a data warehouse is needed to create a single centralized data store to serve multiple users and multiple business scenarios (in other words, to create a single version of the truth). Recently, in-chip analytics technology was introduced, which enables non-technical users to analyze and visualize terabytes of data in real time, thus empowering business users to take on challenges that previously necessitated help from IT.
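The data mart versus data warehouse guidance can be sketched the same way. The function and parameter names below are hypothetical, and what counts as "large" or "rapidly growing" is left to the caller, since the article gives no thresholds.

```python
def recommend_data_store(
    source_count: int,
    volume_is_large: bool,
    growing_rapidly: bool,
    may_spike: bool,
) -> str:
    """Pick a storage approach from answers to the data questions."""
    demanding = volume_is_large or growing_rapidly or may_spike

    if demanding:
        # Large, fast-growing, or spiky data calls for one centralized store.
        return "data warehouse"
    if source_count == 1:
        # Limited scope with a single source: a data mart might suffice.
        return "data mart (for example, OLAP or in-memory database)"
    # The article does not cover modest volumes spread over several sources.
    return "not covered by these rules; review case by case"


if __name__ == "__main__":
    print(recommend_data_store(1, False, False, False))
    print(recommend_data_store(6, True, True, False))
```

The fall-through branch is deliberate: it flags cases the two rules do not address rather than guessing at an answer.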
Ideally, data scientists and decision makers can complement each other’s efforts. When non-technical business users are given the tools to be productive in a world where they can crunch a billion rows of data on their own, that frees up the data scientists to work on complex algorithms.
Of course, for many businesses this is a moot point. They simply cannot afford a data science team. The good news is that it’s possible for them to create a data-driven culture and leverage it to grow and thrive.
About the author: Eldad Farkash is a co-founder and CTO of Sisense and a serial entrepreneur with over 15 years of experience in building startups that focus on product and technology innovation. Eldad founded Sisense to disrupt the status quo in business analytics by creating a technology that makes big data analytics accessible and affordable for companies of all sizes. A 2013 recipient of the prestigious World Technology Award recognizing his development of In-Chip technology, Eldad continues to shake up the BI market as Sisense’s CTO, overseeing the company’s technology, product stack, and user experience.