It’s Sink or Swim in the IoT’s Ocean of Bigger Data
The network of connected devices commonly called the Internet of Things is poised to drive data growth to stratospheric levels. Our ability to wield analytics on these big, fast, and diverse data will determine whether we can successfully harness the IoT to improve livelihoods and boost bottom lines, or whether we’ll crumble under the weight of the new data.
If you thought today’s data levels were big or fast or diverse, you haven’t seen anything yet. Analysts say the amount of data we’re dealing with now is just a trickle compared to the torrent that will be unleashed over the next few years as the IoT ramps up and sensors start popping up in, well, just about everything.
How big is the IoT? Gartner predicts there will be 26 billion units installed on the IoT by 2020. That’s a conservative estimate compared to a recent Morgan Stanley report that predicts we’ll have 75 billion connected devices by 2020. Cisco CEO John Chambers brought some even bigger numbers to a speech at the Consumer Electronics Show last winter.
In the speech, Chambers predicted that in 10 years, the IoT industry would generate $19 trillion in revenue and savings. “2014 will be the transformative, pivotal point for the Internet of Everything,” Chambers said. “I predict to you that the [IoE] will be five to 10 times more impactful in one decade than the whole Internet to date has been.”
To date, many of the IoT use cases have been focused on improving life in cities. In the Spanish city of Barcelona, for example, sensors are going into garbage cans that will automatically detect if they are full and need to be emptied, and could potentially be used to detect hazardous substances to improve the health of sanitation workers. That city is also experimenting with smart street lights that automatically dim when people aren’t around, thereby cutting energy use. Smart lighting also has a positive impact on street crime, Chambers said.
“I think we’re beginning to see that this will impact every aspect of our lives,” Chambers said. “It isn’t just connecting a car or connecting a refrigerator or connecting the video capability in healthcare. It’s the combination of these together that changes process and allows for different entertainment outcomes and different business decision outcomes.”
We’re in the midst of an IoT gold rush as manufacturers begin putting sensors into everything from cars and clothing to parking spaces and rivers. Sensors of various types will collect data about the presence of heat, light, chemicals, seismic waves, magnetic waves, and acoustic waves, and transmit it via WiFi, Bluetooth, NFC, cellular, and TCP/IP protocols. These sensors will not only detect conditions, but also extend human control over our environment.
The notion of a totally “dumb” endpoint will soon become antiquated, predicts IBM big data evangelist James Kobielus. “Before long, it will be difficult to find any consumer, business, industrial or other device that totally lacks embedded, data-driven analytic intelligence,” Kobielus writes at IBM’s Big Data & Analytics Hub. “What’s driving this trend are the plummeting cost of solid-state storage, the inexorable miniaturization of electronic components, and the embedding of deeper analytic libraries in every device.”
Before we can even analyze the data, we’ll need to come up with new approaches for simply surviving the data onslaught without obliterating our systems. “IoT threatens to generate massive amounts of input data from sources that are globally distributed,” Joe Skorupa, vice president and distinguished analyst at Gartner, says in a recent report. “Transferring the entirety of that data to a single location for processing will not be technically and economically viable.”
Organizations will inevitably try to store and process this data within a single framework, such as Hadoop, which excels at handling semi-structured data, as much of the machine data generated by the IoT will be. But Hadoop, as it exists today, may not be very well suited for analyzing IoT-generated data. Andrew Rogers, the CTO of SpaceCurve, took this position at a recent conference. The company, which raised $10 million a year ago, is developing a real-time streaming analytics product for the IoT that can work on petabyte-scale data volumes.
Another approach to analyzing data in the IoT is to put more data processing power on the end points themselves, which can do the first phase of the analytics process. NoSQL database vendor Couchbase is taking this approach with its new mobile database offering, Couchbase Lite, which will install on anything with a processor, including laptops, phones, wearables, machines, and cars. “It’s really any computing device on the edge of the network,” Couchbase architect Wayne Carter says. NewSQL database vendor VoltDB is also eyeing potential new workloads for its software in the IoT.
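To make the idea of first-phase analytics on the end point concrete, here is a minimal Python sketch, not tied to Couchbase Lite or any other product named here. It assumes a device that buffers a short window of numeric sensor readings and sends upstream only a compact summary plus any out-of-range values. The function name `summarize_window`, the `threshold` parameter, and the sample temperatures are all hypothetical, chosen purely for illustration.

```python
from statistics import mean


def summarize_window(readings, threshold):
    """Reduce a window of raw readings to a small summary record.

    Hypothetical helper: the device would transmit this summary
    instead of every raw reading in the window.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
        # Readings above the threshold are kept verbatim so that
        # unusual events are not lost in the aggregate.
        "alerts": [r for r in readings if r > threshold],
    }


# Illustrative data: five temperature readings, one of them anomalous.
window = [21.0, 21.5, 22.0, 35.0, 21.5]
summary = summarize_window(window, threshold=30.0)
print(summary)
```

In this sketch the five raw readings collapse into one record, which is the kind of reduction that might ease the network and storage pressure described above. A real deployment would also need to decide on window size and on what to do when the device is offline.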
Hadoop may yet evolve into a real-time streaming platform for IoT analytics. The hopes and dreams of Hadoop are heavily invested in Apache Spark, the in-memory engine that was just released at version 1.0 last week. Another vendor angling to get a piece of the real-time Hadoop pie is DataTorrent, which is very close to reaching general availability with its streaming analytics software. SQLStream, which can work with Hadoop, is also actively targeting IoT workloads with its software.
As the IoT evolves, big data technology will inevitably evolve to match it. The pressure that the IoT is beginning to exert on our current networks and systems will undoubtedly yield some unforeseen adaptations in the technologies of tomorrow.