Using Visualization to Unlock Secrets in Big Data
In this age of big data, brilliant programmers who can write the best algorithms are worth their weight in gold. But there are other ways to exploit big data that don’t require highly paid programmers at all. For the folks at the University of Maryland’s Human-Computer Interaction Laboratory, users who explore their own data using visualization tools and techniques can sometimes beat the best algorithms.
The ability to explore and discover new connections in large data sets is one of the most powerful and important capabilities that visualization techniques can give humans. Megan Monroe, a research associate at the Human-Computer Interaction Laboratory, uses an analogy to compare the use of data mining algorithms to data visualization.
“The way I like to think about it is, if someone invented teleportation tomorrow, would people still go on trips?” she says in a recent interview with IEEE Software. “I think the answer is yes. Sometimes, it’s more about the journey than the destination.”
Instead of using algorithms to automatically find the patterns in a data set, users can sometimes get more out of the process of finding patterns themselves. “Using data mining algorithms, you’re basically being teleported from point A to point B. There are no stops for sightseeing,” she says.
“Don’t get me wrong: I think that this approach [data mining algorithms] has real tangible advantages in a lot of situations,” Monroe continues. “But sometimes data analysis is more about the journey, more about the process of exploration, and discovering questions you didn’t even know you had going into the data, and allowing the data sets to surprise you in ways that clearly you weren’t expecting.”
The Human-Computer Interaction Laboratory has been at the forefront of data visualization techniques with two technologies it developed, primarily for the medical field. These are LifeLine, a technology first developed in the late 1990s that summarizes a patient’s entire medical history on a single screen, and EventFlow, a newer technology that provides visual analysis of temporal events.
As the Human-Computer Interaction Laboratory worked with early users of the technologies, its members detected some patterns in how people responded to seeing their data for the first time.
Their first reaction tends to be shock, and disbelief that what they’re seeing is a graphical representation of their data. “We’ve seen it very often where people say, ‘That is not my data!’” says Catherine Plaisant, the associate director of research of the Human-Computer Interaction Lab. “We also see very often that they look at the data, and start forming hypotheses. That’s really what visualizations do: [allow you to] come up with hypotheses, and answer questions you didn’t know you had. Very often what it means is, you have to go out and get new data.”
“The reactions have been wild,” Monroe says. “When we start working with a research group, they’ve been working with their data for months or years. It’s their livelihood. And then all of a sudden, to see their data for the first time, it literally takes their breath away. It’s like if you had a longtime pen pal that you set up a meeting with, and it turns out they’re a Victoria’s Secret model. They’re blown away.”