January 12, 2017

Clemson Software Optimizes Big Data Transfers

Data-intensive science is not a new phenomenon, as the high-energy physics and astrophysics communities can certainly attest, but today more and more scientists are facing steep data and throughput challenges fueled by soaring data volumes and the demands of global-scale collaboration. With data generation outpacing network bandwidth improvements, moving data digitally from point A to point B, whether for processing, storage or analysis, is by no means a solved problem, as evidenced by the continuation, or what could even be called the revitalization, of sneakernets.

Even for those scientists fortunate enough to have access to the highest-speed networks, such as Internet2, the 100 Gigabit Ethernet research and education infrastructure, it takes a certain level of expertise to get the most out of data transfers. Recognizing that their advanced networking capabilities were not always fully exploited, a group of Clemson University researchers has come up with a way to optimize transfers for everyone.

Not surprisingly, the work is coming out of the Clemson genetics and biochemistry department, which has had a front-row seat to the past decade’s data deluge. In a news writeup, Clemson’s Jim Melvon observes that while high-energy physics is often cited as the poster child for data-intensive science, genomics is catching up. And as in the computational physics community, long-distance data sharing and collaboration are essential for life science researchers.

To read the rest of the story, see