Analytics 3.0: A Blend of Big Data and Old Techniques
Big data may appear to be a new trend, one defined by volumes, velocities, and variability that didn’t exist a few years ago. To some extent that’s true. But in the biggest companies, big data is really just a continuation of a decades-old trend toward “more data.” As these companies bring new technologies such as Hadoop and in-memory processing into their data centers, a new approach to building analytic systems is emerging: Analytics 3.0.
In a new SAS-sponsored white paper, called “Big Data in Big Companies,” authors Thomas H. Davenport and Jill Dyché discuss the results of interviews about big data they performed with 20 large companies, such as AIG, Schneider National, UPS, Macy’s, GE, Caesars, Bank of America, United Healthcare, and Sears.
These companies have all grappled with growth in data volumes over decades. But until recently, most of the data lived in relational databases, often on IBM mainframes, until it was cleaned up with an ETL tool, moved to a data warehouse, organized using tools such as OLAP, and presented to users through query
That model will largely remain under the new Analytics 3.0 banner (except for ETL and OLAP, which are basically unnecessary in the new world of Hadoop and HANA). The systems put in place to analyze customer or product data residing in relational databases have been refined to suit specific needs, and ripping them out in favor of something “new” is anathema to big-company momentum.
However, these companies recognize that times are changing, and that technology is emerging that allows them to make sense of semi-structured and unstructured data, such as PDF documents, server logs, photos, and video streams. The companies wouldn’t be doing this if it didn’t present an opportunity for them to improve their operations.
The challenge, then, is how to effectively mix the old, stand-alone analytic infrastructure (Analytics 1.0) with the infrastructure that has emerged over the last seven years thanks to the likes of Google, Yahoo, and eBay (Analytics 2.0). The answer, say Dyché and Davenport, is Analytics 3.0, which combines the best of big data and traditional analytics to “yield insights and offerings with speed and impact.”
“The most important trait” of Analytics 3.0, the authors say, “is that not only online firms, but virtually any type of firm in any industry, can participate in the data-driven economy. Banks, industrial manufacturers, health care providers, retailers–any company in any industry that is willing to exploit the possibilities–can all develop data-based offerings for customers, as well as supporting internal decisions with big data.”