April 28, 2014

Next-Generation DNA Sequencing Performance at Scale

Nicole Hemsoth

Compute at the NGS core

Compute lies at the core of next-generation sequencing (NGS) and assembly workflows. In these workflows, raw output from next-generation sequencers is passed to systems that perform the computationally intensive assembly of genomes (the work of turning fragmented digital representations of genomes into whole genome maps).
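To make the assembly step concrete, here is a toy Python sketch of turning overlapping fragments into a single sequence. It uses a simple greedy overlap merge, chosen only for illustration. Production assemblers such as Velvet work differently (Velvet builds de Bruijn graphs) and must cope with sequencing errors, repeats and reverse complements, all of which this sketch ignores. The function names and the `min_len` parameter are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that is also a
    prefix of `b`, or 0 if no such overlap has at least `min_len` bases."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Toy assembler: repeatedly merge the two fragments with the largest
    suffix/prefix overlap until no pair overlaps by `min_len` or more."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        # Join the best pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    fragments = ["ATGCGT", "CGTTAC", "TACGGA"]
    print(greedy_assemble(fragments))
```

Even this toy version compares every fragment with every other fragment on each pass, which hints at why assembly over hundreds of gigabytes of reads demands large amounts of compute and memory.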

Cray® XC30™ supercomputers and Cray® CS300™ cluster supercomputers are suited to the requirements of research institutions and clinics that perform sequencing and assembly on a daily basis. What sets the company’s technologies apart is that they can meet a range of requirements: Cray systems are designed to handle everything from special-purpose compute needs to diverse sets of applications.

In facilities that support multiple applications, the Cray XC30 supercomputer provides a robust and scalable architecture for bioinformatics. For institutions looking for a dedicated assembly system, the Cray CS300 cluster supercomputer series (in particular, the CS300 Large Memory System incorporating vSMP Foundation™ from ScaleMP™) supports assembler applications, such as Velvet, that require large amounts of shared memory.

Regardless of the choice, Cray systems can be used to run community-standard applications such as Galaxy to manage NGS workflows and provide result visualization and analysis.

The need for workflow-driven storage

An NGS workflow involves repeated manipulation of raw data: DNA fragment files output by sequencers, measuring 300 gigabytes or more, are assembled into whole genome maps and used for research and clinical purposes.

In environments using the XC30 supercomputer, the Cray® Sonexion® scale-out Lustre® storage system simplifies deployment and management with storage that delivers exact levels of performance in an integrated and preconfigured package.

Organizations deploying the CS300 system for NGS can choose from a range of storage solutions from Cray, DDN, NetApp and other manufacturers.

Often forgotten is the fundamental problem of cost-effectively managing and maintaining an ever-increasing collection of data sets associated with NGS workflows. The explosion of data from NGS — from raw sequence data to final results data — has unleashed an unprecedented data management responsibility. For many organizations, data growth is outpacing the ability to manage and archive it. This situation has created a need for an archival system to transparently migrate data to different tiers of storage — from high-performance scratch parallel file systems to capacity-optimized disk and tape archives.
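As a rough illustration of what migrating data across tiers can look like, the Python sketch below assigns files to a tier based on how long they have gone unaccessed. It is a minimal sketch under stated assumptions, not a description of how any particular archiving product works: the tier names, the 30-day and 180-day thresholds, and the `FileRecord`, `choose_tier` and `plan_migration` names are all hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file in an NGS data set."""
    path: str
    size_bytes: int
    last_access: float  # seconds since the epoch


def choose_tier(rec: FileRecord, now: float,
                scratch_days: int = 30, disk_days: int = 180) -> str:
    """Pick a tier from idle time alone (thresholds are assumptions)."""
    idle_days = (now - rec.last_access) / SECONDS_PER_DAY
    if idle_days < scratch_days:
        return "scratch"        # high-performance parallel file system
    if idle_days < disk_days:
        return "capacity_disk"  # capacity-optimized disk
    return "tape"               # tape archive


def plan_migration(records: list[FileRecord],
                   now: float) -> dict[str, list[str]]:
    """Group file paths by the tier the policy would place them on."""
    plan: dict[str, list[str]] = {"scratch": [], "capacity_disk": [], "tape": []}
    for rec in records:
        plan[choose_tier(rec, now)].append(rec.path)
    return plan


if __name__ == "__main__":
    now = 1_000 * SECONDS_PER_DAY
    files = [
        FileRecord("run42/reads.fastq", 300 * 10**9, now - 5 * SECONDS_PER_DAY),
        FileRecord("run17/assembly.fa", 3 * 10**9, now - 60 * SECONDS_PER_DAY),
        FileRecord("run01/reads.fastq", 280 * 10**9, now - 400 * SECONDS_PER_DAY),
    ]
    print(plan_migration(files, now))
```

A real policy would likely weigh more than idle time, for example file size, project status or retention rules, but the shape is the same: a rule maps each file to a tier, and the archive moves data accordingly without users changing how they reference it.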

Cray Tiered Adaptive Storage (TAS), powered by Versity, is the only complete and open archiving solution built for enterprise-class Linux® environments, including Cray XC30 and CS300 systems. TAS is designed to meet NGS customers’ massive scalability needs. Cray preconfigures, integrates and tests all hardware and software to provide a ready-to-deploy system.

Cray solutions for analysis

Data that has been sequenced, assembled and captured is of little value unless it can be analyzed. Together, Cray and YarcData provide a comprehensive set of platforms to perform visualization and analysis. Widely used genome analytics applications such as Galaxy and BLAST (NCBI and AbokiaBLAST) are available for the XC30 and CS300 systems.

As researchers and healthcare professionals expand the practice of precision medicine, rapid analysis of genomic information will increasingly play a role in patient diagnosis and treatment. Big data graph analytics can help healthcare professionals, analysts and scientists take full advantage of their data. It enables the capture and exploration of relationships among vast data sources, something impossible to achieve with a search approach, and turns data’s latent value into realized value.

YarcData’s Urika® graph analytics appliance is purpose-built for discovery and enables new insights in real time. It addresses the limitations of commodity hardware, scaling to meet increasing volumes of data and quickly updating relationships as new data streams in.

Why Cray?

The ability to turn pioneering hardware and software technologies into renowned supercomputing solutions is the work of decades, and no one else has more experience than Cray. It’s why leading users across industries and disciplines repeatedly choose Cray. From technical enterprise- to petaflop-sized solutions, Cray systems enable tremendous scientific achievement by increasing productivity, reducing risk and decreasing time to solution.

http://www.cray.com/bio-itworld/

Vendors: Cray

Tags: big data, cray, DNA sequencing
