April 15, 2014

How Fast Data is Driving Analytics on the IoT Superhighway

Alex Woodie

The promise of big data is morphing into the fast data opportunity. Unless you have the capability to respond to the Internet of Things and the trillions of data points generated by smartphones, sensors, and social media, the business opportunities of fast data can pass you by.

For many commercial analytic applications, fast data is the inevitable endpoint of any big data project. Once your data scientists reach that “aha” moment of insight by carefully sifting through their big (but static) data sets, your business pros will say “Great, so how do we make money off this?” That’s where fast dynamic data comes into play.

TIBCO made its name in the IT business with its information bus, which provides high-speed and low-latency connectivity among disparate enterprise systems, such as stock markets and trading applications. Now the company that popularized the phrase “two second advantage” is bringing that concept to the Internet of Things (IoT) and fast data.

Last week, the company announced that BusinessWorks, its flagship data integration platform, has been bolstered with improved REST support that will enable customers to pull data from the APIs of smartphones, sensors, and other data-generating devices that make up the IoT.

“The first requirement of fast data is getting access to this data,” says TIBCO senior director of marketing Thomas Been. “Now we allow them to capture everything outside of their firewall. It can be social networks or anything that has an API.”

For example, a retailer could use BusinessWorks to capture geographic data from consumers’ smartphones and use that as the basis for a real-time offer generation system. “By looking at your profile, looking at the patterns of the insight you pull from big data, I will send you an offer on your preferred brand of jeans to get you into my shop,” Been says. “And then, I know, based on the information I have, that you will be spending money.”
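The offer logic Been describes can be sketched roughly as follows. This is only an illustration, not TIBCO's implementation or the BusinessWorks API: the `Shopper` record, the `make_offer` function, the 0.5 km radius, and the brand names are all hypothetical, and the sketch assumes the shopper's location and preferred brand have already been captured upstream.

```python
import math
from dataclasses import dataclass
from typing import Optional, Set

EARTH_RADIUS_KM = 6371.0


@dataclass
class Shopper:
    """Hypothetical profile: a preferred brand mined from historical (big) data,
    plus the latest location reported by the shopper's phone (fast data)."""
    shopper_id: str
    preferred_brand: str
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two latitude/longitude points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def make_offer(shopper: Shopper, store_lat: float, store_lon: float,
               brands_in_stock: Set[str], radius_km: float = 0.5) -> Optional[str]:
    """Return an offer string if the shopper is near the store and the store
    carries the shopper's preferred brand; otherwise return None."""
    if haversine_km(shopper.lat, shopper.lon, store_lat, store_lon) > radius_km:
        return None  # too far away for a walk-in offer
    if shopper.preferred_brand not in brands_in_stock:
        return None  # nothing relevant to offer
    return f"{shopper.shopper_id}: offer on {shopper.preferred_brand} jeans"
```

The split mirrors the article's point: the brand preference comes from slow, batch analysis of historical data, while the distance check runs on each incoming location update.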

When it comes to mining social media for analytical insights, speed is definitely of the essence. Yesterday Datanami covered a company called Blab and how it’s pulling signals from social media to help ad buyers and PR companies predict what topics will go viral, and which ones will go dead.

Another company that’s plying the IoT waters is Ugam, a developer of analytic applications. The Frisco, Texas-based company is seeing particular traction in using free consumer data from social media networks to help retailers decide what to sell and where to place it on the shelves. But be careful which social media networks you choose to monitor.

“Basically, Twitter is a bit ‘noisy’ when it comes to getting customer feedback for pricing and assortment decisions,” says Ugam chief innovation officer Mihir Kittur. “It’s too cluttered with complaints and generally non-relevant information. Instead, Ugam has found that the combination of product reviews, Google+s, Facebook likes, and Pinterest pins provide much better social signals for pricing and assortment intelligence.”

Retail’s rapid pace makes it a good place to test fast data theories to see if they’re profitable. But when it comes to actually helping people, nothing beats the nation’s biggest industry: healthcare. The folks at TIBCO are aiming to build fast data applications in hospital settings that find patterns in vast amounts of data pulled in from digital medical devices.

“We have customers who are looking at integrating medical devices in real time so we can identify diseases earlier and can propose the right cure to the patient earlier,” TIBCO’s Been says. “They do the big data thing to understand the patterns and how the diseases are spreading, and then using real time data to look for the symptoms.”

While Hadoop has become synonymous with big data, it’s not perceived favorably when it comes to fast data. TIBCO, for one, isn’t a huge fan of Hadoop. You will recall how the company’s CTO Matt Quinn pleaded with people to stop chasing yellow elephants at the company’s annual user conference last year.

Hadoop has come under fire for its perceived lack of interactivity and real-time capabilities. But there are several initiatives to add real-time capabilities to Hadoop, if not remake it into a fast data platform. Two of the most prominent are Apache Spark and Apache Storm.

Spark is gaining a tremendous amount of momentum as a real-time replacement for MapReduce, which up to this point has been the analytical brains behind the Hadoop data platform. Spark is easier to code against (supporting Java, Python, and Scala); it’s also faster, and comes with pre-built hooks for SQL (Shark), real-time streaming (Spark Streaming), machine learning (MLlib), and graph processing (GraphX).

One Hadoop software vendor that’s adapting to the realities of big fast data is MapR Technologies, which recently announced that it’s partnered with Databricks to bring the in-memory Apache Spark technology to its Hadoop product fold. MapR’s competitor Cloudera is also distributing Spark; Hortonworks supports it as a technology preview, with full support expected later this year.

Storm is also gaining followers as the real-time needs of big data morph into fast data. Like Spark, Storm gives the user the option of programming in a variety of languages, including Ruby, Python, JavaScript, Perl, and PHP.

One company that’s using Storm in production is LivePerson, the provider of Web-based communications software. In a recent video, Ido Shilon, team lead of the platform engineering group at LivePerson, explains how the company rebuilt its back-end infrastructure to make its offerings more resilient.

The core elements of LivePerson’s real-time system are made up of Storm, Apache Kafka, and the Couchbase NoSQL database. As part of its information process initiative, the company collects information about every session, such as what websites users come from, what browser they’re using, and what pages they’ve accessed. This information is streamed via Kafka to Storm for analysis, and then stored in documents in the Couchbase database. Eventually, these three products will form the hub of its “wisdom repository,” where it will be able to analyze this information, Shilon says.
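The flow Shilon describes (session events streamed through a message queue, analyzed, then stored as documents) can be sketched in miniature. This is not LivePerson's code and it uses none of the real Kafka, Storm, or Couchbase client APIs: a `queue.Queue` stands in for the Kafka topic, a plain function stands in for a Storm processing step, and a dictionary stands in for the Couchbase document store. The event field names are assumptions made for illustration.

```python
import json
import queue
from collections import Counter

# Stand-in for a Kafka topic carrying session events (hypothetical).
session_topic: "queue.Queue[str]" = queue.Queue()

# Stand-in for the Couchbase document store, keyed by session ID (hypothetical).
document_store: dict = {}


def publish(event: dict) -> None:
    """Producer side: serialize a session event and put it on the topic."""
    session_topic.put(json.dumps(event))


def analyze(raw: str) -> dict:
    """Processing step: parse one event and derive a small summary document."""
    event = json.loads(raw)
    return {
        "session_id": event["session_id"],
        "referrer": event["referrer"],
        "browser": event["browser"],
        "page_count": len(event["pages"]),
    }


def run_pipeline() -> Counter:
    """Drain the topic, store one document per session, and return a running
    count of sessions per browser."""
    browsers: Counter = Counter()
    while not session_topic.empty():
        doc = analyze(session_topic.get())
        document_store[doc["session_id"]] = doc
        browsers[doc["browser"]] += 1
    return browsers
```

In a production system each stand-in would be a distributed service, which is where the resilience Shilon mentions would come from. The sketch only shows the order in which data moves.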

The pieces of the fast data puzzle are still coming into view. The Internet of Things promises to flood us with more machine-generated data than we could ever dream of. Making something useful out of all that information will be neither easy nor intuitive. But its very existence will demand action and fuel data-driven competition among companies for years to come.


