November 21, 2011

The Path to Personalized Medicine

Datanami Staff

When we consider the verticals that are addressing some of the most complex big data problems, the life sciences industry is often one of the first to spring to mind. From drug discovery and development to targeted personalized medicine approaches that are tailored to a specific genome, this is an area that is truly data-driven.

The problem, of course, is that keeping advancements in treatment and drug options affordable, and making good on the long-held goal of routine genome-based individualized medicine, requires new ways of thinking about these data-intensive processes: everything from computational and programming efficiency to robust storage and, of course, powerful analytical abilities that mind the power envelope while delivering the needed compute horsepower.

According to Dr. Patricia F. Dimond, DNA sequencing is undergoing two important transformations. First, the cost of sequencing is going down, with expenses dropping by roughly 50% every five months. This is bringing the ability to sequence individual patient genomes to around $1,000 and placing the hope of personalized medicine into reach.
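To make the halving arithmetic concrete, here is a minimal Python sketch of what a 50% drop every five months implies. Only the five-month halving period and the $1,000 target come from the figures above; the $16,000 starting cost and the function names are hypothetical, chosen purely for illustration.

```python
import math

HALVING_PERIOD_MONTHS = 5.0  # from the article: cost drops ~50% every five months


def projected_cost(initial_cost: float, months: float,
                   halving_period: float = HALVING_PERIOD_MONTHS) -> float:
    """Cost after `months`, assuming it halves once per `halving_period`."""
    return initial_cost * 0.5 ** (months / halving_period)


def months_to_reach(initial_cost: float, target_cost: float,
                    halving_period: float = HALVING_PERIOD_MONTHS) -> float:
    """Months until the cost falls to `target_cost` under the same assumption."""
    return halving_period * math.log2(initial_cost / target_cost)


if __name__ == "__main__":
    start = 16_000.0  # hypothetical starting cost in dollars, not from the article
    print(projected_cost(start, 20))      # cost after 20 months
    print(months_to_reach(start, 1_000))  # months until the $1,000 mark
```

Under these assumptions, a genome that costs $16,000 today would reach the $1,000 mark after four halvings, or 20 months.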

A second movement is one that many are well aware of—the need for compute, storage and complex data management and analysis tools continues to skyrocket as typical data generation figures for genomic sequencing hover in the petabyte range.

According to Dimond, “the petabyte crisis and the need to use all that data to discover novel therapeutics provide ready-made forcing functions to move companies into the cloud.” Dimond says that many firms are already taking advantage of cloud resources “to bolster storage and analysis of the huge amounts of data generated from research and clinical development involving next-generation sequencing.”

Dimond points to several examples of how the cloud is being used to propel big data research, including the creation of DNAnexus, which relies on Google’s cloud storage to provide a vast well of DNA sequence data that is made available in the Sequence Read Archive (SRA) database. The DNAnexus platform lets users tap pre-sequenced data for further genetic analysis, including DNA mapping, RNA sequencing, variant analysis and visualization, and also lets them leverage existing SRA data for their own pending projects.

In Dimond’s opinion, the cloud could be the cure for the big data woes the life sciences industry is facing. As another example, she points to Dell, which made an announcement this month about a new effort to bring cancer programs into virtualized spaces in conjunction with the Neuroblastoma and Medulloblastoma Translational Research Consortium (NMTRC) with support from the Translational Genomics Research Institute (TGen).

The effort will take advantage of TGen’s platform and the Dell cloud to help NMTRC refine cancer treatment strategies for young patients who are taking part in a focused NMTRC trial across 10 pediatric cancer treatment centers in the U.S.

The study will focus on finding the least toxic, most effective ways to target pediatric cancer treatment, requiring a complete genetic tumor analysis from each patient in order to understand pathway blockages for specific drugs.

According to Dimond, “This will generate more than 200 billion measurements per patient that must be analyzed, shared, and stored, DELL predicted. Data computation and analysis of this information would have required weeks to months to process and thus would have limited the depth and number of pediatric cancer patients who could be included in the clinical trial. But Dell expects that the time needed for such large-scale studies will be reduced to just days through the use of Dell’s cloud.”

For research efforts like this, which involve collaboration across organizations, treatment centers and an array of platforms, the cloud is providing a solid foundation for big data analytics with truly mission-critical end goals. While many drug companies are already using a variety of cloud paradigms (public, private and hybrid), an effort like this could set the stage for further exploration of the cloud to handle big data problems that require immediate results.

More information is available in Dimond’s full article about the data-intensive needs of cancer researchers that are being addressed by cloud computing.