May 17, 2012

GPUs Tackle Massive Data of the Hive Mind

Nicole Hemsoth

From simulating billions of objects in a galaxy to uncovering new sources of fossil fuels to accurately predicting financial risk in real time, GPU computing is opening doors to speeding the solutions to big data problems in an increasing number of industries.

Each year at the GPU Technology Conference (GTC), a few thousand developers, end users and GPU computing enthusiasts gather to discuss and share their progress with using GPU acceleration on data-intensive science, research and enterprise problems via CUDA and other approaches, including OpenCL.

At the core of the presentations each year are some compelling use cases in fields as diverse as energy exploration, biomedical and pharmaceutical development, government, finance, retail and beyond. Last year GPU-boosted astronomical and genomics applications were the stars of the show, but this year’s pet use case is certainly its own animal.

At GTC 2012 in San Jose this year, the spotlight fell on the role of GPUs in the calculations and visualizations of award-winning Princeton researcher Iain Couzin. Dr. Couzin, who works in the field of ecology and evolutionary biology, claims that GPU computing has revolutionized his research.

Couzin’s groundbreaking work homes in on the collective action of living dynamic groups, including clusters of living things outside of mere schools of fish or clusters of cells. Couzin says that GPU computing has democratized his research, a statement that harkens back to NVIDIA CEO Jen-Hsun Huang’s opening keynote on Tuesday about how GPUs are opening new possibilities in science and enterprise with unprecedented processing power for the price.

Couzin’s research in collective behavior stretches back over a decade, but he says he had always been limited by the scale and speed of simulations running on small CPU-only clusters. He says he first stepped into GPU computing at the dawn of the GPGPU era, when he installed a GeForce card for $400. That single act led him down the path of tapping into the capabilities of CUDA to explore what was now possible, even on his small in-office cluster.

Instead of relying on 2D models or simulations that could take a few weeks to run, Couzin says harnessing the early single-precision performance of the first generation of GeForce and later Tesla cores in his relatively small cluster has allowed him to tap into enormous data sets to run complex, multi-layered simulations of everything from schools of fish instantly reacting to the invasion of a predator to the collective, fast reaction of human cells to a surface wound.

Ever since that first foray into GPUs via the early GeForce cards, Couzin and his Princeton team have been refining their massively parallel models of collective behavior using CUDA and, to a much lesser extent, OpenCL. He says the thousands of processing cores allow his team to move from the small collections of collectively moving creatures he was once limited to, up to millions of moving parts in real time.

Couzin explains that simulating self-propelled particles within a given space is a heavy load for a CPU, and that doubling the number of individuals in the simulations quadruples the computation time. Not only is this a slow process, but, as he describes, “the spatio-temporal variability in the environment, feedback between individuals and the environment together with how individuals evolve on evolutionary time scales makes it virtually impossible to use traditional method of CPU computing.”
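To make the scaling concrete, here is a minimal Python sketch of why doubling the number of individuals roughly quadruples the work. It assumes a naive all-pairs neighbor search, in which every individual is checked against every other; the article does not say which interaction model Couzin's simulations use, and the function name, interaction radius and random positions below are purely illustrative.

```python
import random


def count_pair_checks(n, radius=0.1, seed=0):
    """Naive all-pairs neighbor search over n random 2D agents.

    Hypothetical illustration only: returns (distance checks performed,
    number of neighboring pairs found within `radius`).
    """
    rng = random.Random(seed)
    agents = [(rng.random(), rng.random()) for _ in range(n)]
    radius_sq = radius * radius

    checks = 0
    neighbor_pairs = 0
    for i in range(n):
        xi, yi = agents[i]
        # Compare agent i with every later agent exactly once.
        for j in range(i + 1, n):
            xj, yj = agents[j]
            checks += 1
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= radius_sq:
                neighbor_pairs += 1
    return checks, neighbor_pairs


if __name__ == "__main__":
    for n in (250, 500, 1000):
        checks, _ = count_pair_checks(n)
        print(f"{n} agents: {checks} distance checks")
```

The number of checks is n(n-1)/2, so each doubling of the population multiplies the work by about four. Because each of those checks is independent of the others, this is the kind of workload that can be spread across the thousands of cores on a GPU.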


Far from being critical to understanding only biological and evolutionary keys to how species interact to ensure survival, this research presents an incredible series of implications for the future of science—but also for business purposes.

As Couzin explains, “A fundamental problem in a wide range of biological disciplines is understanding how functional complexity at a macroscopic scale (such as the functioning of a biological tissue) results from the actions and interactions among the individual components (such as the cells forming the tissue).” He says that, because they can be readily observed and manipulated, animal groups present unrivaled opportunities to link the behavior of individuals with the functioning and efficiency of the dynamic group-level properties.

Some of his ongoing projects include:

If you have the time, it’s well worth watching the entire keynote address from GTC 2012, where he puts his work in context and sheds light on tangential aspects of collective behavior, including how to best find love.
