November 2, 2011

Genomics Project Taps SGI for Big Data Needs

Datanami Staff

When it comes to data-intensive applications, genomics research projects represent a shining example. While some of these applications are certainly compute-intensive, life science research generates an incredible amount of data in a short period of time, leading to complex storage and system needs.

The amount of data that rolls off of sequencing machines can be in the many-terabyte range, and furthermore, regulatory guidelines for some applications and genomics research centers require storage of genomic data for long periods of time.

This week the Institute for Chemical Research, which is housed at Kyoto University in Japan, boosted its capability to handle the needs of the GenomeNet project out of Kyoto University’s Bioinformatics Center, which is forming a high performance cloud to handle genomic and related research projects.

SGI’s Altix UV 1000 is the company’s top shared memory system and is often chosen for its ability to scale. The Altix UV boasts up to 256 sockets (2560 cores, 4096 threads) and claims architectural support for up to 262,144 cores. With 16TB of shared global memory in a single system image, the company has a reasonable case that the system belongs high on the list for those with both data-intensive and HPC applications.

SGI offers the Altix UV in 10, 100 and 1000 models, meaning the Institute for Chemical Research went for the high end. This system ships as a fully integrated cabinet-level product with up to 256 sockets and the full 16TB of shared memory across four racks, for a total of 24.6 teraflops in a single system image.

Scientific Computing reports that the Institute chose the solution to achieve a 6x performance improvement over its previous SGI system. It will be configured with more than 3072 Xeon E7-series cores, 48TB of memory and roughly 840TB of storage.

As Allison Proffitt noted, “The system consists of two servers: one for computational chemistry and one for GenomeNet calculations. The computational chemistry server consists of two nodes with 512 cores and 8TB of shared memory. Applications such as quantum chemistry and molecular dynamics will be utilized on the UV 1000, enabling users to run Gaussian, CASTEP, Discover and other programs to examine molecular structures and their specifications and characteristics. The GenomeNet calculation server also consists of two nodes with 1024 cores and 16TB of shared memory.”
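
The quoted configuration can be checked against the reported totals with a little arithmetic. The sketch below assumes, since the quote does not say so explicitly, that the core and memory figures are per node and that each server has two nodes. Under that reading the sums come to 3072 cores and 48TB of memory, in line with the figures reported above. The dictionary keys are illustrative names only.

```python
# Assumption: the quoted figures (512 cores / 8TB and 1024 cores / 16TB)
# are per node, with two nodes in each server. The quote does not state this.
servers = {
    "computational chemistry": {"nodes": 2, "cores_per_node": 512, "memory_tb_per_node": 8},
    "GenomeNet": {"nodes": 2, "cores_per_node": 1024, "memory_tb_per_node": 16},
}

# Sum cores and shared memory across every node of both servers.
total_cores = sum(s["nodes"] * s["cores_per_node"] for s in servers.values())
total_memory_tb = sum(s["nodes"] * s["memory_tb_per_node"] for s in servers.values())

print(total_cores)      # 3072
print(total_memory_tb)  # 48
```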
