Five Answers for Finding Bigger Insight from Your [Big] Data
When General Electric CEO Jeff Immelt spoke at a GE users summit late last year on the topic of the Internet of Things, he had something interesting to say: “All companies need to become Internet and software companies.” Huh?
He quickly clarified. “The industrial world is changing dramatically, and those companies that make the best use of data will be the most successful.”
GE has a goal of “no downtime” and it has invested heavily in technology to understand the data streaming in from IoT-enabled devices. With 10 billion objects currently connected to the IoT, and 20 billion predicted by 2020, Immelt’s observation that everyone needs to become an Internet and software company makes sense.
But first it helps to understand what kind of software is needed to make sense of all this data. Consider the auto insurance company that made a name for itself by devising premium payments based on sensor data from cars – speed, braking and acceleration patterns. It doesn’t actually use that data after the rates are set, because continuing to analyze the streaming data would be nearly overwhelming.
So how does an enterprise make the most of big data when sources such as the IoT continue to pump out potentially useful streams? Here are answers to five simple questions related to big data.
1. With all the data out there, how can I store it efficiently?
If you aren’t already using it, get comfortable with Hadoop. This free programming framework supports the processing of large data sets in a distributed computing environment. The open source architecture frees organizations from the constant need to maintain and expand relational databases. It offers massive data storage and fast processing at roughly 5 percent of the cost of traditional, less flexible databases. Hadoop can also handle structured and unstructured data, including audio, video and text. And there’s no need to have a Hadoop expert on staff: there are products that serve as bridges to existing databases and data warehouses. This is the foundational technology for creating predictive analytics models.
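To make the distributed idea concrete, here is a minimal, local sketch of the map-and-reduce programming model that Hadoop popularized: map each record into key/value pairs, then reduce all values that share a key. The sample log lines are invented for illustration; on a real cluster, Hadoop would shard both phases across many machines.

```python
from collections import defaultdict

def map_phase(records):
    """Emit (word, 1) for every word in every record."""
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """Sum the counts for each key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

# Hypothetical sensor log lines standing in for a large data set.
logs = [
    "sensor A reported OK",
    "sensor B reported FAULT",
    "sensor A reported OK",
]
word_counts = reduce_phase(map_phase(logs))  # e.g. how often each term appears
```

The point of the model is that neither phase cares where the data lives, which is what lets the framework spread the work over commodity hardware.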
2. I need the data right away. How can I get it quicker?
Instead of storing data and running queries against it, use event stream processing to stream the data through models and queries. This is how credit card companies quickly figure out whether someone else is using your card (real-time fraud scoring), and it is critical to using sensor data. Think of what needs to happen to make driverless cars a reality – the position of the vehicle and of surrounding vehicles arrives as a massive stream of data. There is no time to extract, transform and load it, so event stream processing is employed to automate the car’s moves.
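The fraud-scoring example above can be sketched in a few lines: each transaction is scored the moment it arrives, against a sliding window of recent activity, instead of being stored first and queried later. The scoring rule, threshold and transaction records here are all invented for illustration, not any real scoring model.

```python
from collections import deque

def fraud_score(event, recent_amounts):
    """Score a transaction by how far it sits above the recent average."""
    if not recent_amounts:
        return 0.0
    avg = sum(recent_amounts) / len(recent_amounts)
    return event["amount"] / avg if avg else 0.0

def process_stream(events, threshold=5.0, window=10):
    recent = deque(maxlen=window)   # sliding window of recent amounts
    alerts = []
    for event in events:            # decisions happen in-stream, per event
        if fraud_score(event, recent) > threshold:
            alerts.append(event["id"])
        recent.append(event["amount"])
    return alerts

stream = [
    {"id": 1, "amount": 40.0},
    {"id": 2, "amount": 55.0},
    {"id": 3, "amount": 900.0},   # suspicious spike
    {"id": 4, "amount": 35.0},
]
alerts = process_stream(stream)   # → [3]
```

Notice that nothing is ever written to a database before the decision is made; the model travels to the data stream rather than the other way around.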
3. Now that I have access to all this data, where do I start?
Most big data uses don’t involve driverless cars – or some equally highly automated function. Chances are, humans need to look at this data, and to do that you need visual analytics. Visual analytics is a key component in keeping less urgent (but still critical) data from sitting, unused, while a harried IT person finds time to query the information for a report. Look for interactive options that allow business users to drill into the data, run predictive analysis and do it all from a mobile device. One freight company found that visual analytics allowed it to pack its trucks more efficiently and to spot subtle drop-offs in customer business (so it could research why before losing the business). It also helped the company develop real-time pricing.
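The drill-down idea behind visual analytics can be sketched without any charting library: the same records are summarized at coarser or finer levels, so a business user can start from a region total and drill into individual customers. The shipment records and field names below are invented for illustration; a visual analytics tool would render these roll-ups as interactive charts.

```python
from collections import defaultdict

# Hypothetical freight shipments, standing in for a large fact table.
shipments = [
    {"region": "East", "customer": "Acme", "pallets": 12},
    {"region": "East", "customer": "Bolt", "pallets": 7},
    {"region": "West", "customer": "Core", "pallets": 9},
    {"region": "East", "customer": "Acme", "pallets": 4},
]

def roll_up(records, *keys):
    """Aggregate pallet counts by any combination of fields."""
    totals = defaultdict(int)
    for r in records:
        totals[tuple(r[k] for k in keys)] += r["pallets"]
    return dict(totals)

by_region = roll_up(shipments, "region")                # top-level view
by_customer = roll_up(shipments, "region", "customer")  # drill-down view
```

A drop-off like a customer’s pallet count shrinking quarter over quarter is exactly the kind of pattern that surfaces at the drill-down level but hides in the regional total.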
4. How can I get my analysis done quicker to get a jump on the competition?
By using in-memory, in-database processing, data remains suspended in the fast memory of a powerful set of computers instead of on a slower disk drive. Multiple users can share this data across multiple applications in a rapid, secure and concurrent manner. In-memory analytics also takes advantage of multi-threading and distributed computing, where you can distribute the data (and the complex workloads that process it) across multiple machines in a cluster or within a single server environment. The speed has a huge impact, as one manufacturer learned: it was able to detect a product issue in one-third the time of traditional warranty analysis, reduced warranty costs 10 to 15 percent for those hard-to-detect issues and figured out how to revamp its documentation to reduce call center requests.
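A minimal sketch of that pattern: the working data set stays in RAM and is split into chunks that several workers process concurrently, rather than being paged through from disk one query at a time. The claim amounts, cost limit, worker count and chunk size below are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(chunk):
    """A stand-in analytic: count warranty claims above a cost limit."""
    return sum(1 for cost in chunk if cost > 500)

claims = [120, 860, 430, 990, 75, 640, 510, 300]  # entire data set held in memory
chunks = [claims[i:i + 4] for i in range(0, len(claims), 4)]

with ThreadPoolExecutor(max_workers=2) as pool:
    partials = list(pool.map(summarize, chunks))  # fan the work out to workers
high_cost_claims = sum(partials)                  # combine the partial results
```

The same fan-out-then-combine shape scales from threads in one server to machines in a cluster, which is why in-memory analytics and distributed computing tend to arrive together.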
5. How do I do this without investing in complex solutions?
There are multiple applications available in the cloud. This dramatically brings down the cost of purchasing, installing and maintaining the hardware needed to host on site – especially for small businesses and those that are rapidly growing. But until recently the cloud only housed simple analytics. Increasingly, the cloud is becoming home to more sophisticated types of analytic solutions and even a kind of “results as a service” approach that provides complex analytics support to organizations without the resources, physical and human, to perform analytic functions in-house.
As big data continues to grow exponentially, with the IoT as a prime source, technologies such as Hadoop, streaming analytics, data visualization, in-memory analytics and cloud computing offer a path to the insight waiting in that data.
About the author: Malene Haxholdt (@Malene_Haxholdt) is principal analytics consultant at SAS, where she’s responsible for managing all aspects of SAS’s key analytics strategy and focused on how customers can gain value from business analytics and mature their analytical culture. She has been involved in more than 100 customer projects with a focus on implementing analytics in companies across many industries. Malene joined SAS Denmark in 2000 as an analytical consultant working with implementing analytical solutions based on techniques such as data mining, forecasting and optimization. Malene holds a Master of Science degree from Copenhagen Business School in applied statistics.