September 4, 2012

In-Memory Tweaks Boost Proteomics Research

Nicole Hemsoth

From genome sequencing to mass spectrometry to collect information for proteomics work, researchers have honed the craft of data collection for the life sciences—now the real questions of large-scale analysis are foremost on the minds of many in this forward-thinking community.

The list of analytical solutions and platforms gets longer each year, but when we think about column-based in-memory databases, the varied needs of business versus scientific analytics tend to come to mind.

According to a team of German researchers who worked with the SAP Innovation Centers in Germany and Palo Alto, the business community isn’t the only user base that can tap sophisticated in-memory data mining tools. The researchers state that the analytical performance of these engines is a promising prospect for the data mining needs of the life sciences community.

More specifically, this in-memory experiment focused on bringing a business analytics platform into the proteomics fold. This molecular biology branch requires complex analysis across several terabytes of protein data to identify diseases and other traits via biomarkers.

The team behind the business-to-scientific analytics flip focused on the HANA, SAP’s in-memory data management platform. Upon fine-tuning the platform to handle the vast wells of life sciences data, they concluded that an in-memory database approach meets “some inherent requirements of data mining in the life sciences.” More specifically, they claim that for their specific area, proteomics research, HANA and the implementation of a “flexible data analysis toolbox can be used by life sciences researchers to easily design and evaluate their analysis models.”

Life sciences data is not only massive in volume, it requires complex, extensive analysis that requires ultra-performance from the data management side of the fence. This is usually handled with heterogeneous systems, applications and scripts, but with the diversity and distribution of the data across different files, formats and systems, the challenges are significant. This means that what these users require are systems that support heterogeneity of “application logic and data sources.”

The advent of analytical platforms that allow for analysis of big, diverse, distributed data in near-real time creates new possibilities. The team claims that the SAP HANA in-memory database is different from standard disk-based solutions since it stores primary data directly in memory to boost performance and allow for the speedy calculations. This, they say, “enables fast execution of complex ad-hoc queries even on large datasets and includes multiple data processing engines and languages for different application domains.”

The researchers said that an enterprise platform like SAP’s HANA, which serves as the foundation for their future enterprise applications, can be tailored to meet scientific needs. They say it already has the basis for a solid analytical platform built-in with complex data mining capabilities and machine learning functionality as well as the aforementioned ability to sling ad-hoc queries across massive datasets in relatively short timeframes.

The buzz around in-memory approaches to complex analysis for both science and business has picked up in pace over the last year or two. In addition to SAP HANA, there are a number of other companies that are focusing on different areas of the market, including SAS, Kognitio, ParAccel, IBM, VoltDB and others.

Related Articles

Pulling Big Insurance Into the In-Memory Fold

World’s Top Data-Intensive Systems Unveiled

IBM Fellow Tracks HPC, Big Data Meld

Six Super-Scale Hadoop Deployments

Applications: Data Mining, Enterprise Analytics, Research Analytics

Technologies: Frameworks

Sectors: Biosciences, Science

Vendors: ParAccel, SAP

Tags: HANA, in-memory, life sciences, proteomics, research, SAP, sap innovation

Only registered users may comment. Register using the form below.

Check off newsletters you would like to receive*
- HPCwire
- EnterpriseTech
- Datanami
- Technology Conferences & Events
- Advanced Computing Job Bank
- Technology Product Showcase
Email*
Name*
First Last
Organization*
Job Function*
Industry*
Country*
City*
State*
Province*
- Please check here to receive valuable email offers from Datanami on behalf of our select partners.

In-Memory Tweaks Boost Proteomics Research

Join the discussion Cancel reply

Only registered users may comment. Register using the form below.

May 6, 2024

May 3, 2024

May 2, 2024

May 1, 2024

Sponsored Partner Content

Get your Data AI Ready – Celebrate One Year of Deep Dish Data Virtual Series!

Supercharge Your Data Lake with Spark 3.3

Learn How to Build a Custom Chatbot Using a RAG Workflow in Minutes [Hands-on Demo]

Overcome ETL Bottlenecks with Metadata-driven Integration for the AI Era [Free Guide]

Gartner® Hype Cycle™ for Analytics and Business Intelligence 2023

The Art of Mastering Data Quality for AI and Analytics

Leading Solution Providers

Tabor Network

Sponsored Whitepapers

Top 6 Strategies for Reducing Data Warehouse Costs

Building an Operational Data Warehouse for Real-time Analytics

Sponsored Multimedia

The Power of DataOps: Bring Automation to Life
No Comments

Tactical Steps for Cloud Migration
No Comments

Immuta Data Access Platform
No Comments

Data Mesh: Fact or Fiction?
No Comments

Contributors

Featured Events

AI & Big Data Expo North America 2024

CDAO Canada Public Sector 2024

AI Hardware & Edge AI Summit Europe

AI Hardware & Edge AI Summit 2024

CDAO Government 2024

In-Memory Tweaks Boost Proteomics Research

Join the discussion Cancel reply

Only registered users may comment. Register using the form below.

May 6, 2024

May 3, 2024

May 2, 2024

May 1, 2024

Most Read Features

Most Read News In Brief

Most Read This Just In

Sponsored Partner Content

Leading Solution Providers

Tabor Network

Sponsored Whitepapers

Sponsored Multimedia

Contributors

Featured Events

Share

Copy short link