January 7, 2014

Datanami Dishes on ‘Big Data’ Predictions for 2014

Alex Woodie

This space was going to feature a “Top 10 Big Data Predictions for 2014” story. But considering the large number of such stories currently in circulation, a different tack was in order. Instead, you’ll find a selection of pertinent predictions from players in the “big data” software industry, each followed by Datanami’s opinion as to whether it will be spot on or whether the soothsaying will miss the mark.

Prediction No. 1: The End of Data Scientists

Tableau Software is predicting that 2014 will mark the beginning of the end of the data scientist (the “sexiest job of the year” for 2013), and the movement of data science away from IT specialists. Even the common data analyst appears on Tableau’s big data hit list; the company insists that data science moves away from people who have “analyst” in their title, and begins to be used by “the everyman.”

Data scientists–that’s so 2013.


Datanami’s Take: Mostly True

There simply aren’t enough data scientists to go around. It may seem odd to find joy in forecasting the end of data scientists, who are the best qualified people to make data analytics a reality at any given organization. But for the industry to take the next step and to evolve toward something more mainstream and less esoteric, the level of difficulty absolutely has to be driven out of the technology. Saying “end the data scientists” is just another way of saying “end the complexity.”

Prediction No. 2: The Burst of the BYOD Bubble

The bring-your-own-device (BYOD) movement has gone as far as it can, predicts Adaptiva, a provider of enterprise systems software. While mobile devices are great for enabling people to read data on the go, the thinking goes, they’re lousy at allowing people to write data. Also, smartphones and tablets pose horrible security risks, which further marginalizes their use–at least among enterprises that respect data and their customers’ privacy. “Securing these devices is nearly impossible because even in a tightly controlled environment, the devices themselves cannot be locked down,” Adaptiva says. “Users can install any application, visit any website, and transfer any data outside the company’s network.”

Datanami’s Take: No Freaking Way

As with the Borg, resistance to the BYOD movement is futile. BYOD may not survive in its current incarnation, but it will adapt to the changing enterprise realities. Mobile devices already play an instrumental part in people’s lives today, and they will continue to evolve to enable better and more automated consumption and generation of data at the point of person. If a mobile Charlton Heston were alive in the corporate world today, he’d put it this way: “You can have my iPhone when you pry it from my cold, dead hands.”

Prediction No. 3: Analytics Shift to Embedded BI

Tableau says we’ll start to see embedded data analytics and “business intelligence” (anybody remember that term?) begin to emerge in 2014. “Analytics start to live inside of transactional systems,” the company says, “and scenarios like customer relationship management will lead the way with analytics providing support for the many small decisions salespeople make in a day.”

Datanami’s Take: Absolutely Spot On

Just as powerful software will begin to abstract away some of the complexity of analytics, thereby lessening the need for data scientists, so, too, will much of the analytics begin to move away from “analytic products” per se and begin living in the day-to-day operational systems, where it can have the biggest impact. At a very general level, you can expect to see better harmony between Hadoop systems, where discoveries are made, and NoSQL-based systems, where the discoveries are monetized and put into action.

Prediction No. 4: Relational SQL Promises and Disappoints

In 2013, we saw the re-emergence of good old SQL as a key technology for companies’ big data and analytics strategies, as technologies like Hive, Drill, and Impala flourished. Unfortunately, SQL’s need for data structure is a blessing and a curse, according to MapR Technologies’ CEO and co-founder John Schroeder. “Centrally structuring data causes delays and requires manual administration,” he writes. “SQL also limits the type of analysis. An over emphasis on SQL will delay organizations fully leveraging the value of their data and delay reactions.”

Datanami’s Take: Pick ’Em

We’re gonna go all wishy-washy here. On the one hand, trying to force structure upon data that isn’t naturally structured (or that doesn’t naturally have much structure to it) may eliminate some of the benefits of that data. Not knowing what data you’ll need to store next is what drove such huge interest in loosely structured NoSQL databases over the past few years. On the other hand, most of the world’s applications already speak SQL, and many of the world’s programmers do, too. In the final analysis, building stronger SQL ties will accelerate development and use of data analytics in the short term, but it has the possibility to slow long-term innovation in utilizing emerging unstructured data types.

Prediction No. 5: Hadoop Ventures Closer to Real Time

Hadoop version 1, aka “the MapReduce paradigm,” is dead and buried. People today want Hadoop to give them the answer, like, now, already. The launch of Hadoop version 2 in late 2013 brought us Yet Another Resource Negotiator (YARN), which will be instrumental in bringing new engines into Hadoop, including those that provide more real-time processing. MapReduce also gets more real-timey, thanks to Pig and snap-on components, such as Syncsort’s DMX-h tooling.

Datanami’s Take: Inevitable

Americans are not patient people by birth, and this urge to speed things up is reflected in all aspects of life. In the world of big data processing, Hadoop is in the midst of a massive makeover that will see it emerge as a speedy, real-time hub of information for the corporate datacenter–an “enterprise data hub,” if you will–provided its handlers adhere to the corporate standards of normalization, auditability, and security.

Prediction No. 6: The Wearable Internet of Things Grows

Sorry Sergey, but Google Glass is not yet cool.

In 2014, we’ll continue the work of wiring up the physical world with sensors that will feed us a continual stream of data about things we’re interested in. Increasingly, we’ll also be instrumenting ourselves with wearable sensors, like Google Glass and Samsung’s Galaxy Gear watch. “We’ll see the Internet of Things explode in 2014,” Pivotal’s global head of data science services Annika Jimenez says in a blog post. “Not just the industrial internet, which will be a big driver, but as we saw at Databeat, a lot of startups will arise from devices with sensors.” Data coming off sensors that generate data every 10 minutes “will clearly be the next big wave of Big Data. This will inspire all sorts of entrepreneurial activity, impact industrial companies like GE, as well as our daily lives through wearables.”

Datanami’s Take: A little bit yes and a little bit no.

The amount of data generated by the so-called Internet of Things will increase at a rapid rate in 2014. Likewise, the vendors who help organizations get actual value out of that data, such as Splunk and GE-backed Pivotal, will have strong years. But Datanami must draw the line at wearables. Until somebody really famous and cool starts wearing something like the Google goggles on a daily basis (Ashton Kutcher, are you listening?), wearables will be constrained to the geek chic niche.

Prediction No. 7: Big Data Vendor Consolidation Begins

Cloudera CTO Amr Awadallah said he expects to see consolidation among big data startups. “Some companies will start to close their doors, while others will probably get acquired (e.g., MapR and Hortonworks),” Awadallah wrote in a recent Sandhill.com article. That was a clear swipe at Cloudera’s direct competitors in the Hadoop space. Cloudera has gone on record as saying that it doesn’t consider the other Hadoop distributors its primary competitors, but instead sees itself competing against tier-one megavendors like IBM and Oracle.

Datanami’s Take: More than likely true.

The level of hype in the big data market is unsustainable, and so is the rate at which new startups are opening their doors and VCs are funding them. There is still lots of room for growth in the big data market, which IDC says will grow at a CAGR of 27 percent to hit $32 billion by 2017. But there also needs to be a reckoning for some of the weaker business models and technologies. Cloudera may be claiming “game over” in the Hadoop space, but it’s actually the NoSQL database market that is riper for a shake-up.

Prediction No. 8: Rise of Analytics 3.0 and Chief Analytics Officer

In 2014, analytics will become increasingly embedded in daily workflows, processes, products, and services, the International Institute for Analytics (IIA) recently predicted. We’ll also be leaning more heavily on machine learning algorithms to help us cope with the raw influx of data, including data from facial recognition systems and “wearables.” As the hype surrounding big data analytics melts away, users will be forced to find a new balance between automated decision-making and human intervention.

Datanami’s Take: Mostly Spot On

The IIA hits the nail on the head when it comes to the bigger role that analytics is about to play in our lives, both at work and at play. The speed at which the field of analytics is evolving is truly amazing, and it will require some getting used to. Except for wearables. Datanami is not on board with that stuff.

Prediction No. 9: Excel Sent Off to ETL Pasture

In many instances, the incredible growth of data sets is exceeding the capabilities of Excel to manage and manipulate that data for extract, transform, and load (ETL) purposes, says Nenshad Bardoliwalla, co-founder and vice president of products at Paxata, a startup that aims to simplify data quality processes. “It’s simply too light a tool for today’s enormous data sets that mash structured data with unstructured data,” Bardoliwalla says, adding that sales of Tableau and QlikView licenses now outpace Excel sales.

Datanami’s Take: Uncertain

Everybody loves to bash Excel, which is a shame because it’s such a great tool. Yes, it does have its limitations and is probably not going to suffice when data sets start getting really, really big. As a rule of thumb, you can easily handle the first 20,000 rows of data in Excel. But the global familiarity with Excel and its availability make it a perfect tool to get started with when working with smaller data. Don’t let the term “big data” get in the way of using the right tool for the job.

Prediction No. 10: The Term “Big Data” Fades Away Into the Ether

The awareness of the term “big data” got a shot in the arm in 2013 when it was included for the first time in the Oxford English Dictionary, which is the authority on words that are English, drink ale at pubs, and have bad teeth. But getting an entry in that hallowed text may not be the honor it once was–Lattice Engines CEO Shashi Upadhyay noted that the Brits also added “twerking” to their dictionary. Upadhyay laments the selection of the two words “big data” to represent a new way of thinking about IT. “The term ‘big data’ casts such a wide net, from infrastructure and analytics to applications, that it created confusion for companies and the term became meaningless,” he writes.

Datanami’s Take: Please Let It Be True

We agree that the term “big data” is too vague and needs to be replaced by something. The problem is, we’re not sure what that “something” might be. The term “data analytics” probably comes the closest, at least as far as this publication is concerned. Unfortunately, not everything that falls into what we consider the “big data” camp involves analytics. In the meantime, we’re probably stuck with “big data.” “Without our beloved buzzwords,” Upadhyay writes, “we wouldn’t have ideas to rally around and create momentum.”