October 11, 2012

Brown University Advances Genomics, Big Data Research


Genomics, a field in which researchers try to pinpoint a relatively small number of important genes in a sea of DNA, is a perfect testing ground for big data. Further, the process of sequencing genes has become exponentially cheaper and faster over the last decade, deepening the sea in which genomics researchers have to fish.

A few months ago, we highlighted an algorithm presented by C. Titus Brown of Michigan State University, which essentially produced a reduced map of the genomes under study.

Now, a new genomics-based formula is garnering national attention. Last year, Brown University computer science professors Eli Upfal, Ben Raphael, and Fabio Vandin developed an algorithm called HotNet which, according to its website, is “an algorithm for finding significantly altered subnetworks in a large gene interaction network.” The algorithm was ultimately used to find mutated genes in cancerous cells, attracting the attention of the medical industry.

As a result of this research, the National Science Foundation and the National Institutes of Health have awarded the professors $1.5 million in additional funding. With the funding, the Brown University team hopes to make the algorithm more accurate at determining which mutations are important.

After all, not all mutated genes will necessarily contribute to the development of cancer. Therein lies the challenge: not only finding the mutations but obtaining statistical certainty that a particular mutation out of many is relevant.
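To make the idea of statistical certainty concrete, here is a minimal Python sketch of one generic way to ask whether a gene is mutated more often than chance alone would explain. It is not HotNet, which analyzes subnetworks of a gene interaction network; it is a simple per-gene binomial test with a Bonferroni correction. The gene names, sample count, and background mutation rate are hypothetical values chosen purely for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def significant_genes(mutation_counts, n_samples, background_rate, alpha=0.05):
    """Keep genes mutated in more samples than the background rate explains.

    mutation_counts maps a gene name to the number of samples in which it
    is mutated. The Bonferroni correction divides alpha by the number of
    genes tested, so testing many genes does not inflate false positives.
    """
    threshold = alpha / len(mutation_counts)
    results = {}
    for gene, count in mutation_counts.items():
        p_value = binomial_tail(count, n_samples, background_rate)
        if p_value <= threshold:
            results[gene] = p_value
    return results


# Hypothetical data: 100 tumor samples and an assumed 1% background rate.
counts = {"GENE_A": 12, "GENE_B": 2, "GENE_C": 1}
hits = significant_genes(counts, n_samples=100, background_rate=0.01)
print(sorted(hits))
```

With these made-up numbers only GENE_A clears the corrected threshold: seeing a gene mutated in one or two of 100 samples is unremarkable at a 1% background rate, which is why a raw mutation count alone says little about relevance.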

Of course, while these algorithms would be remarkably useful to the healthcare industry, the team has higher aspirations. Upfal and his colleagues hope to eventually extend these capabilities beyond cancer data to other large datasets.

“These datasets have all the good and bad properties of Big Data,” said Upfal. “They’re big, noisy, and require very complicated statistical analysis to obtain useful information.”

If that process of filtering through masses of worthless or irrelevant information to find the nuggets of insight sounds familiar, it may be because it describes almost every large dataset that a company working with big data has had to handle.

