August 25, 2014

Five Steps to Demystify Big Data Analytics

Brett Sheppard

Sponsored Content by Splunk

Too many big data initiatives are science projects that take months of effort, risk failure and require highly trained data scientists with scarce skills. According to a CSC survey, 55 percent of big data projects aren’t completed and many others fall short of their objectives.

In “Why Most Big Data Projects Fail,” Dell General Manager Darin Bartik notes that business and IT groups are not aligned on the business problem they need to solve. Employees don’t have access to the data they need, making it impossible to find answers that will make the project successful. Further complicating issues, many tools, approaches and disciplines around big data are new, so people lack the knowledge and skills necessary to work with the data and achieve a successful business result.

There is a better way. The following steps can significantly shorten the time-to-value and reduce the risk of big data initiatives.

(1) Enable All Knowledge Workers to Benefit from Big Data

Ashish Thusoo ran the data analytics team at Facebook and realized from his experience that there is high value in democratizing data. As quoted by Forbes.com author Dave Feinleib, “[Thusoo’s] goal was to make all capabilities related to data easy, from instrumenting applications and collecting data, to understanding and analyzing it, to creating data-driven applications.”

Organizations struggle to hire and retain data scientists who understand statistics, computer science and open-source technologies such as Hadoop or NoSQL data stores. According to McKinsey Global Institute, the United States will experience a shortage of between 140,000 and 190,000 skilled data scientists, and of 1.5 million managers and analysts capable of reaping actionable insights from big data (McKinsey, Big data: The next frontier for innovation, competition, and productivity, May 2011).

One way to address this lack of skills is to adopt technologies that bridge the gap between data scientists and knowledge workers. According to CITO Research in Big Data for Everyone (sponsored by Splunk), you need a “point-and-shoot camera for data,” where product managers, web analysts, risk managers, security analysts and other knowledge workers can simply point at Hadoop or another data store, and start exploring, analyzing and visualizing. Knowledge workers don’t want to be constrained by writing complex fixed schemas or migrating data into a separate data mart for analytics.
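To illustrate what exploring data without a fixed schema can look like, here is a minimal Python sketch. It is not how Splunk or any particular product works; the log format, the `RAW_EVENTS` sample and the helper names `extract_fields` and `count_by` are assumptions made up for this example. Fields are pulled out of raw text only when a question is asked, so no table definition or data mart load comes first.

```python
import re
from collections import Counter

# Hypothetical raw events; the key=value layout is an assumption for illustration.
RAW_EVENTS = [
    "2014-08-25T10:00:01 status=200 path=/home user=alice",
    "2014-08-25T10:00:02 status=500 path=/cart user=bob",
    "2014-08-25T10:00:03 status=200 path=/cart",
    "2014-08-25T10:00:04 status=500 path=/checkout user=alice",
]

# Matches key=value pairs anywhere in a line.
KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def extract_fields(line):
    """Pull key=value pairs out of a raw line at read time."""
    return dict(KEY_VALUE.findall(line))


def count_by(lines, field):
    """Count events grouped by a field; lines that lack the field are skipped."""
    parsed = (extract_fields(line) for line in lines)
    return Counter(fields[field] for fields in parsed if field in fields)


status_counts = count_by(RAW_EVENTS, "status")
user_counts = count_by(RAW_EVENTS, "user")
print(status_counts)
print(user_counts)
```

The third event has no `user` field, and nothing breaks: it is simply left out of the per-user count. With a fixed schema, that record would typically have to be rejected or padded before loading.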

(2) Encourage a Data-driven Business Culture

While insights gleaned from big data can improve decision making, they do not rule out the vagaries of human behavior. All too often, David Sandler’s observation remains true: “People make decisions emotionally and then justify them with data.”

How can you encourage your organization to make data-driven decisions, instead of relying on the “HiPPO” (highest-paid person’s opinion)? By combining data-driven metrics with storytelling and visualizations.

In Made to Stick, Chip and Dan Heath document why some ideas survive and others die. While data provides credibility, stories empower people to use an idea through a memorable narrative with unexpected, concrete details.

“When data and stories are used together, they resonate with audiences on both an intellectual and emotional level,” according to Stanford University Professor of Marketing Jennifer L. Aaker in her Persuasion and the Power of Story video. As your big data projects succeed, share within your organization how data played a major role in making each project successful. Sharing these stories is a great grass-roots way to encourage a data-driven business culture.

(3) Stop Sampling and Embrace Raw Data

A little-known secret of many big data projects is that assessments are based on models of a subset of data that’s meant to be a representative sample. While this works fine if you’re trying to determine whether an episode of Glee is as popular as other TV shows, what if you’re a retailer that wants to understand customer interactions across offline and online channels? Or an investment bank that wants to measure risk in a portfolio? You need to be able to search, analyze and visualize raw granular data, not just a sample.

By embracing raw data, you can analyze granular transactional, web and mobile data at massive scale and deliver a score by account, household or segment.
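A small Python sketch shows why a sample falls short when you need a score for every account. The data set, the one-in-ten sampling scheme and the function names are hypothetical, and a simple sum of transaction amounts stands in for whatever score you would actually compute.

```python
from collections import defaultdict


def score_by_account(transactions):
    """Sum transaction amounts per account (a stand-in for any per-account score)."""
    totals = defaultdict(float)
    for account_id, amount in transactions:
        totals[account_id] += amount
    return dict(totals)


def systematic_sample(records, step):
    """Keep every `step`-th record, a simple deterministic sampling scheme."""
    return records[::step]


# Hypothetical data: 100 accounts with 3 transactions each (10, 20 and 30).
transactions = [
    (f"acct-{i:03d}", 10.0 * (t + 1))
    for i in range(100)
    for t in range(3)
]

full_scores = score_by_account(transactions)
sampled_scores = score_by_account(systematic_sample(transactions, 10))
missing_accounts = set(full_scores) - set(sampled_scores)

print(len(full_scores), "accounts scored from raw data")
print(len(sampled_scores), "accounts scored from the sample")
print(len(missing_accounts), "accounts with no score at all in the sample")
```

In this made-up data set, the one-in-ten sample might still estimate an overall average reasonably well, but most accounts never appear in it, and the accounts that do appear are scored from a single transaction rather than all three.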

(4) Adopt Complementary Technologies in a Big Data Enterprise Architecture

Use the strengths of data warehouses, business intelligence software, machine data platforms, Hadoop and NoSQL stores, and enable them to coexist in your organization’s data architecture. For example, Cloudera and Teradata jointly published a useful guide outlining requirements that are best suited for either a data warehouse or Hadoop, “Hadoop and the Data Warehouse: When to Use Which.”

The adage, “If all you have is a hammer, every problem looks like a nail,” is true for big data projects, and it’s important to understand the role of every technology. For example, there are worthwhile use cases for do-it-yourself Hadoop and Apache Pig, Apache Hive or SQL on Hadoop, but understanding where to use each and, more importantly, how they complement each other can make or break a big data project. The key is to use the strengths of complementary technologies to support your projects.

(5) Apply Role-based Security for Data Lakes

To move past data silos and take full advantage of low-cost batch storage technologies like Hadoop, many organizations are looking favorably at a “data lake” or “data reservoir” model. In this model, data is stored once and shared by multiple business and IT stakeholders. This architecture requires role-based access controls to protect sensitive data and customer privacy. Someone in finance may be authorized to see customer non-public information such as home address and credit card number, while a marketing analyst sees masked data.
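The finance-versus-marketing example can be sketched in a few lines of Python. This is only an illustration of the idea, not a real access-control system: the role names, the field names, the masking rules and the `view_record` helper are all assumptions, and a production data lake would enforce such rules in the storage or query layer rather than in application code.

```python
def mask_card(value):
    """Hide all but the last four characters of a card number."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


def redact(_value):
    """Replace a value entirely."""
    return "[REDACTED]"


# Hypothetical policy: which fields are sensitive and how each one is masked.
MASKERS = {
    "home_address": redact,
    "credit_card": mask_card,
}

# Hypothetical roles allowed to see sensitive fields unmasked.
UNMASKED_ROLES = {"finance"}


def view_record(record, role):
    """Return the record as the given role may see it; unknown roles get masked data."""
    if role in UNMASKED_ROLES:
        return dict(record)
    return {
        field: MASKERS[field](value) if field in MASKERS else value
        for field, value in record.items()
    }


customer = {
    "name": "Pat Example",
    "home_address": "1 Main St",
    "credit_card": "4111111111111111",
}

finance_view = view_record(customer, "finance")
marketing_view = view_record(customer, "marketing")
print(finance_view)
print(marketing_view)
```

The data is stored once, and each role gets its own view of it. Roles that are not explicitly listed receive masked data by default, which is the safer failure mode.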

About Splunk

Splunk Inc. provides the leading software platform for real-time Operational Intelligence. Splunk software and cloud services enable organizations to search, monitor, analyze and visualize machine-generated big data coming from websites, applications, servers, networks, sensors and mobile devices. To learn more, visit splunk.com/company.

