Splunking Up a Machine Data Storm
Splunk is a plucky company that’s hoping to ride the rise of machine data and “the Internet of things” to riches as vast as the types of data it claims to be able to analyze. The company, which had a very successful IPO in 2012, has an interesting story, as it is in the process of evolving from being just another collector of enterprise IT logs (albeit one with a pretty interface) into the Consolidator General for All Machine Data.
Just a few years ago, Splunk was where you dropped all those log files from your middleware server, your database server, and your network switch that didn’t fit anywhere else. When a problem with the IT setup would arise, you’d hit up that nice-looking Splunk interface to troubleshoot the situation, and possibly discover that messages were falling into a crevice that had popped up between the database and the Web app server.
Splunk is still selling traditional IT monitoring tools. But today, the San Francisco-based company realizes that the really big money is in big data. The company adeptly broadened its focus from the well-defined world of Syslog and SNMP traps toward the unstructured world of machine data.
It doesn’t take a data scientist to see that Splunk’s stock has been on a roll.
The neat thing that Splunk accomplished was positioning itself as a de facto data standard for data types that had no standards. Of course, Splunk isn’t any type of official standard for machine data, but it appears to have a lead with its technology, and is looking to capitalize on its (for lack of a better word) mindshare.
By simply capturing, indexing, and making unstructured data ready for analysis with its Search Processing Language (SPL), the company has given users the capability to analyze data coming from smartphones, cars, refrigerators, medical devices, elevators, and buildings. It just so happened that Splunk didn’t need to change its technology very much to do this. “The current data collection technique that Splunk already supports is very well suited for data coming from sensors and devices,” Stephen Sorkin, vice president of engineering at Splunk, says in a recent YouTube video.
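To make the capture, index, and search idea a little more concrete, here is a minimal Python sketch. It is not Splunk's implementation and it is not SPL; the log format, the device names, and the helper functions (`extract_fields`, `build_index`, `search`, `count_by`) are all assumptions invented for illustration. It shows raw device lines being stored untouched, indexed by token, and having fields pulled out only when a question is asked.

```python
import re
from collections import Counter, defaultdict

# Hypothetical device log lines; this format is an assumption for illustration.
RAW_EVENTS = [
    "2013-10-01T12:00:01Z device=elevator-7 status=ok floor=3",
    "2013-10-01T12:00:05Z device=fridge-2 status=warn temp_c=9.5",
    "2013-10-01T12:00:09Z device=elevator-7 status=error code=E42",
    "2013-10-01T12:00:12Z device=fridge-2 status=ok temp_c=4.0",
]

KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(line):
    """Pull key=value pairs out of a raw line when it is queried."""
    return dict(KV_PATTERN.findall(line))


def build_index(lines):
    """Map each lowercased token to the positions of the lines containing it."""
    index = defaultdict(set)
    for position, line in enumerate(lines):
        for token in re.findall(r"[\w.\-]+", line.lower()):
            index[token].add(position)
    return index


def search(lines, index, term):
    """Return the raw lines that contain the given token, in original order."""
    return [lines[i] for i in sorted(index.get(term.lower(), ()))]


def count_by(lines, field):
    """Count events grouped by the value of one extracted field."""
    return Counter(extract_fields(line).get(field, "unknown") for line in lines)


if __name__ == "__main__":
    index = build_index(RAW_EVENTS)
    for hit in search(RAW_EVENTS, index, "elevator-7"):
        print(hit)
    print(count_by(RAW_EVENTS, "status"))
```

The point of the sketch is that nothing about the data's shape has to be declared up front: a line from an elevator and a line from a refrigerator sit in the same store, and structure is imposed only at query time.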
Two Splunk customers that have found creative uses for the company’s technology are Domino’s Pizza and iRhythm, a medical device manufacturer. In addition to using Splunk to manage its IT infrastructure, Domino’s uses Splunk to track the effectiveness of its marketing activities and to present data to store managers. If an online coupon is not generating the type of demand that the folks at Domino’s world headquarters hoped it would, they can check the data from Splunk and tweak the campaign within a day, says Seth Porta, a site reliability engineer at Domino’s.
“Going forward we’d really like to pull more data from stores so we can leverage Splunk to give more information to our store guys [and to] create more customized dashboards, higher level metrics,” Porta said. This includes dashboards showing how many pizza orders were made with a credit card, and how many came in via Android versus iPhone, Porta said in a video posted to Splunk’s website.
At iRhythm, Splunk is being used to ingest and manage data being generated from its heartbeat tracking device, which is used to help patients suspected of having irregular heartbeats and arrhythmia.
An iRhythm customer typically wears one of the devices for 14 days, during which time the average patient’s heart will beat 1.5 million times, says Mark Day, executive vice president of R&D at the company. Keeping a handle on the security of that healthcare data is critical to keep iRhythm from breaking patient confidentiality laws. “Splunk is kind of with us every step of the way through that process,” Day says in a video posted to the Splunk website.
Splunk’s big data fortunes got even bigger last week, when it unveiled Splunk Enterprise 6 at its annual user conference. The big news with this release is the new Pivot interface, which Splunk claims is so easy to use that non-technical users can explore, manipulate, and visualize data without having to know a query language, such as its own SPL.
Splunk Enterprise 6 runs up to 1,000 times faster, the company claims.
Another new feature in version 6 that you can drop into the “ease of use” column is the introduction of new data models that “provide for a more meaningful representation of underlying machine data and a deeper understanding of relationships in the data.”
This is a smart move on Splunk’s part. There is a huge gap between the demand for qualified data scientists and the current supply. Vendors in all aspects of the big data racket–including newfangled Hadoop and NoSQL data stores, as well as the traditional data warehousing and business intelligence arenas–are trying to use the power of software to overcome the shortage of trained data scientists.
Splunk also has its eyes on Hadoop. Earlier this year, it unveiled Hunk, an analytic tool designed to work with data stored in Hadoop. The advent of the patent-pending Splunk Virtual Indexing technology will enable Splunk users to explore, analyze, and visualize data in Hadoop as if it were stored in a Splunk index, the company says. Currently in beta, Hunk is expected to become available by the end of 2013.