May 3, 2012

Snapshots from the Edge of Big Visualization

Datanami Staff

Visualization is becoming even more critical as datasets continue to outgrow their containers.

Even the most robust analytics applications can sometimes fail to present the big picture of a particular dataset. This realization is the driving force behind the ongoing integration of specialized visualization suites into nearly every analytics offering.

Data scientists of all stripes are continually asking new questions of their data and increasingly want robust tools to help them see what new questions are on the horizon.

Effective visualization allows previously undiscovered questions to emerge and provides a human element to bridge the divide between the plethora of new tools and frameworks for processing and managing big data.

This week we wanted to highlight the changing role and scope of both scientific and enterprise data visualization by showcasing select projects that point to what lies ahead.

Let’s take a short tour of a small sampling of what data visualization is providing for research and industry. While a comprehensive list would include thousands of examples, the following were selected because they embody larger trends in data visualization.


Visualizing the World of Words

Here is a riddle: What kind of dataset would tally in the 200 terabyte range—packed with 3 years of audio and video “memories” in a human life? A hint lies in the image below, but it’s probably not what you are guessing…

MIT cognitive scientist Deb Roy has the answer, which lies in his harvesting of “the largest home video collection ever made” to understand how a child learns language.

In this example, Deb Roy’s team captured every time his son heard the word “water,” along with the context in which he heard it. They then used this data to penetrate through the video, find every activity trace that co-occurred with an instance of “water,” and map it onto a blueprint of the apartment. That’s how they came up with wordscapes: the landscape that data leaves in its wake.
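The co-occurrence step described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the MIT team's actual pipeline: the `wordscape` function, the record layouts, the five-second window, and the toy data are all assumptions made for the example.

```python
from collections import Counter

def wordscape(utterances, traces, word, window=5.0, cell=1.0):
    """Count floor-plan positions that co-occur with a spoken word.

    utterances: (time_seconds, word) pairs from a transcript (hypothetical layout).
    traces: (time_seconds, x, y) activity positions on the floor plan.
    Returns a Counter mapping (col, row) grid cells to hit counts.
    """
    times = [t for t, w in utterances if w.lower() == word.lower()]
    grid = Counter()
    for t, x, y in traces:
        # A trace co-occurs if it lies within `window` seconds of any utterance.
        if any(abs(t - u) <= window for u in times):
            grid[(int(x // cell), int(y // cell))] += 1
    return grid

# Toy data invented for illustration; not taken from the study.
utterances = [(10.0, "water"), (50.0, "ball"), (100.0, "Water")]
traces = [(8.0, 2.5, 3.1), (12.0, 2.9, 3.7), (49.0, 7.0, 1.0), (101.0, 0.2, 0.4)]
grid = wordscape(utterances, traces, "water")
```

Each grid cell's count could then be drawn as height or brightness over the apartment blueprint, which is roughly what a wordscape depicts.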

To give a sense of scope, the digital “memory” files that this visualization was based on include:

  • 90,000 hours of video
  • 140,000 hours of multi-track audio
  • a 70-million-word transcript

…in short, the equivalent of most of what we experience during a year or two of our childhoods.

Considering this exercise of imagination, what would you do to harvest usable information out of that huge amount of opaque data?


Visualizing the Final Frontier

If there was ever any doubt in your mind that both the art and science of large-scale data visualization have swiftly evolved in the last five years, look no further than some of the work going on at NASA.

The space agency released a video that puts together some of the most stunning massive data visualizations of everything from cosmic super-events to trends closer to our corner of the universe.

The Scientific Visualization Studio at Goddard Space Flight Center is home to some of the most bleeding-edge innovations in big data wrangling to produce stunning visual representations. The visualization center works closely with scientists in the creation of visualization products, systems, and processes in order to promote a greater understanding of Earth and Space Science research activities at Goddard Space Flight Center and within the NASA research community.

There is a great deal more where this image and video wonderment came from at the organization’s visualization studio site.

All the visualizations created by the SVS (currently totalling over 4,200) are accessible to you through the site. Where possible, the original digital images used to make these animations have been made accessible. Lastly, high and low resolution stills, created from the visualizations, are included, with previews for selective downloading.


Uncovering the Runner’s City

Okay, it’s fair to say this isn’t exactly “cutting edge” in terms of data volume or even the aesthetics of the visualization, but if there’s one thing the following visualization demonstrates, it’s that user-contributed data is the next wave in the personalization of everything.


From dining to real estate to fitness trends, visualization is the key to understanding our surroundings in ways raw, static demographic data will never allow.

After seeing Cooper Smith’s visualizations of data from runners in New York City (below), Eric Fischer wanted to see what similar data sets would look like for other cities. He notes that Nike+ doesn’t have public GPS logs, but MapMyRun does, “if you are willing to spend several hours clicking through search results to hit the ‘Download’ buttons, so that’s what I did to get the tracks for these 771 runs (from June 13 through August 9) in San Francisco.”


As Open Source Planning has pointed out, uploaded runs come from a fairly small, self-selected group of people, the most obvious result of which is the total absence of the southeastern corner of the city from this map. 
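One way to make that coverage gap measurable is to bin GPS points into a coarse grid and count how many distinct runs touch each cell. The sketch below assumes the tracks have already been parsed into latitude/longitude pairs; the `runs_per_cell` function, the 0.01-degree cell size, and the sample coordinates are hypothetical and say nothing about how MapMyRun actually exports its data.

```python
import math
from collections import defaultdict

def runs_per_cell(tracks, cell_deg=0.01):
    """Count distinct runs passing through each lat/lon grid cell.

    tracks: dict mapping a run id to a list of (lat, lon) points.
    Returns a dict mapping (lat_index, lon_index) to a run count.
    """
    cells = defaultdict(set)
    for run_id, points in tracks.items():
        for lat, lon in points:
            key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
            cells[key].add(run_id)  # a set, so repeat visits by one run count once
    return {key: len(ids) for key, ids in cells.items()}

# Invented sample tracks, roughly in the San Francisco area.
tracks = {
    "run_a": [(37.775, -122.425), (37.776, -122.424)],
    "run_b": [(37.775, -122.425), (37.805, -122.415)],
}
density = runs_per_cell(tracks)
```

Cells that never appear in the result are the places no uploaded run passed through, which is how a blank corner of the map would show up in the numbers.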

This is not so different from the use of social data to create dynamic demographic maps, which we have discussed in more detail elsewhere.


Crossing Fault Lines

The following visualization was created using open government data: a real-time feed of worldwide earthquake activity that can be harnessed to observe long-term trends or, as is the case with the visualization below, to watch a dramatic event unfold.

The image shows a snapshot in time—the devastating earthquake in Chile on February 27, 2010 as it happened.

The creator used the government earthquake data in conjunction with Processing.org, which provides an open source platform for users who want to create data-driven images, animations and interactions.

Initially developed to serve as a software sketchbook and to teach fundamentals of computer programming within a visual context, Processing also has evolved into a tool for generating finished professional work.

  • Interactive programs using 2D, 3D or PDF output
  • OpenGL integration for accelerated 3D
  • For GNU/Linux, Mac OS X, and Windows
  • Projects run online or as double-clickable applications

Today, there are tens of thousands of students, artists, designers, researchers, and hobbyists who use Processing for learning, prototyping, and production. Definitely worth checking out if this is something new to you.
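For readers curious about what sits underneath an "as it happened" earthquake animation, here is a minimal sketch of the playback logic, written in Python rather than Processing. The record layout, the `frames` and `marker_radius` helpers, and the sample events are illustrative placeholders, not real catalog entries and not the creator's code.

```python
from datetime import datetime, timedelta

def frames(events, start, end, step):
    """Yield (frame_time, events_so_far) pairs for an animated playback."""
    ordered = sorted(events, key=lambda e: e["time"])
    t = start
    while t <= end:
        # Each frame shows every event that has occurred up to time t.
        yield t, [e for e in ordered if e["time"] <= t]
        t += step

def marker_radius(magnitude, base=2.0):
    """Assumed styling rule: the marker radius doubles per magnitude unit above 4."""
    return base * 2 ** (magnitude - 4)

# Placeholder records for illustration only.
events = [
    {"time": datetime(2010, 2, 27, 6, 30), "mag": 8.8},
    {"time": datetime(2010, 2, 27, 7, 0), "mag": 6.0},
    {"time": datetime(2010, 2, 27, 8, 30), "mag": 5.0},
]
playback = list(frames(events,
                       datetime(2010, 2, 27, 6, 0),
                       datetime(2010, 2, 27, 9, 0),
                       timedelta(hours=1)))
```

A drawing loop would then render one frame per tick, sizing each marker with `marker_radius`, so that the main shock and the events after it appear on the map in sequence.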


Seeking Water Underfoot

For a planet with a reputation for being blue, Earth faces some significant challenges when it comes to conjuring freshwater.

This visualization reveals the freshwater stores that NASA’s GRACE (Gravity Recovery and Climate Experiment) satellites detect from space and shows how that data can be used to evaluate groundwater gains and losses, critical information in the effort to conserve the water that people depend upon.

Texas endured its driest year ever in 2011, and southern Alabama and Georgia have continued to suffer serious drought in 2012.

Climate change is predicted to make drought more frequent in the southern United States, putting a strain on groundwater resources and creating a need for better understanding of where water lies—even if it’s not immediately visible.


Visualizing Professions

A couple of weeks ago we brought you a story about the big data infrastructure at LinkedIn. While we talked more about the software, hardware and conceptual environment at the company, we could easily have spent several more articles marveling at how its massive amounts of data are yielding visualizations that map far more than just personal networks.

Image Credit: Luc Legay

Beyond the personal connections, visualizations are being used at LinkedIn to gain a bird’s eye view of everything from the skill set maps (as seen below) of various industries to the relative value of a name in a worker’s propensity to lead a company.

LinkedIn data scientist Scott Nicholson told us that visualization is creating some of the most distinctive opportunities for his company to view the ecosystem of work and to add value to the user experience.

“We have a view that no one else has,” he said. “We have data that lets us understand what actions people take professionally, and then we take that one step further to see how we can personalize their experience on LinkedIn based on that behavior.”


Keeping Britain Moving

Visualizations are finding their way into news media in the form of rich graphics and mind-blowing videos. One such “oldie” that sparked mainstream recognition of the power of visualizing massive data is from the BBC’s special series on unseen networks throughout Britain.

Sit back and let the gravity of what you’re watching sink in for a moment before you bother thinking about the sheer aesthetic thrill ride of these visualizations.

What you’re looking at are the many ways that data is moving Britain. GPS traces from taxi cabs and airline flights, the neural paths of telephone communications, internet traffic bursting from computer to computer: just about every data route emerges at once, showcasing an unseen, defined highway.

With all that data on display, patterns emerged – zero air traffic in no-fly zones and taxis taking alternate routes to avoid heavy traffic.
