September 5, 2012

YarcData Architect on Hadoop’s Fatal Flaw

Datanami Staff

Systems like Hadoop and MapReduce are great at slicing problems into multiple pieces, evaluating each little piece, and plugging them back into the whole accordingly, much like an integral in calculus. But what if those little pieces interact with each other constantly, like sections of an ocean?

According to YarcData’s Solutions Architect, James Maltby, Hadoop and MapReduce are less suited to storing such tightly interconnected data, which is naturally represented as a graph, than his company’s uRIKA database.

“Many graphs are tightly connected and not easily cut up into small pieces,” said Maltby. “A good example might be a map of genomic networks, which may contain 500 times as many connections as data nodes. Many MapReduce steps are required to solve this problem, and performance suffers. In contrast, uRIKA stores its graph in a large, shared memory pool, and no partitioning is necessary at all.”
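The partitioning problem Maltby describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is a random one rather than a real genomic network, it uses 50 connections per node rather than 500 to keep the example small, and `modulo_partition` is a hypothetical placement rule, not how Hadoop or uRIKA assigns data.

```python
import random

def dense_random_graph(num_nodes, edges_per_node, seed=0):
    """Build a random, densely connected edge list (illustrative stand-in)."""
    rng = random.Random(seed)
    edges = []
    for u in range(num_nodes):
        for _ in range(edges_per_node):
            v = rng.randrange(num_nodes)
            if v != u:  # skip self-loops
                edges.append((u, v))
    return edges

def modulo_partition(node, num_parts):
    """Hypothetical placement rule: node id modulo the partition count."""
    return node % num_parts

def cut_fraction(edges, num_parts, assign):
    """Fraction of edges whose two endpoints land in different partitions."""
    cut = sum(1 for u, v in edges
              if assign(u, num_parts) != assign(v, num_parts))
    return cut / len(edges)

edges = dense_random_graph(num_nodes=200, edges_per_node=50)
frac_8_parts = cut_fraction(edges, 8, modulo_partition)
frac_1_part = cut_fraction(edges, 1, modulo_partition)
print(f"8 partitions: {frac_8_parts:.0%} of edges cross a partition boundary")
print(f"1 shared pool: {frac_1_part:.0%} of edges cross a partition boundary")
```

In this toy setup, most edges cross a partition boundary once the graph is split eight ways, and each crossing would require communication between workers. Keeping everything in a single pool leaves no edges cut, which is the property Maltby attributes to a shared-memory design.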

Genomics is one of the more complicated and more exciting big data research fields. Medical scientists are working on genomics in the hope of ascertaining precisely where diseases originate. However, the vast number of genes per genome and the many connections those genes make among themselves make genomics a complex big data problem. Slicing that problem severs those all-important connections.

Further, social networking data is intrinsically interconnected, as people frequently post in reaction to someone else’s post. Relational databases do not represent such data well, according to Maltby. “When the data is irregular or graph-structured, as in complex financial instruments or social network, the relational database becomes unwieldy and performance suffers.”

Explaining how a semantic graph database differs from a relational one, Maltby said, “In a semantic graph database like uRIKA, the joins are implicit and built into the graph structure, so writing complex ‘what-if’ queries is easier, and performance is much improved.”
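The idea of joins being built into the graph structure can be illustrated with a small Python sketch. It is not uRIKA's query interface. The `follows` data and both function names are hypothetical, chosen only to contrast a table self-join with following stored edges.

```python
from collections import defaultdict

# Hypothetical "who follows whom" relation, stored as table rows.
follows = [("ann", "bob"), ("bob", "cat"), ("bob", "dan"), ("cat", "eve")]

def two_hop_join(rows, start):
    """Relational style: self-join the table on first.dst == second.src.

    Every hop rescans the rows (or needs an index) to find matches.
    """
    return sorted({d2
                   for s1, d1 in rows if s1 == start
                   for s2, d2 in rows if s2 == d1})

def build_adjacency(rows):
    """Graph style: store each node's outgoing edges with the node."""
    adjacency = defaultdict(set)
    for src, dst in rows:
        adjacency[src].add(dst)
    return adjacency

def two_hop_graph(adjacency, start):
    """Follow stored edges twice; no explicit join condition is written."""
    return sorted({d2
                   for d1 in adjacency.get(start, ())
                   for d2 in adjacency.get(d1, ())})

adjacency = build_adjacency(follows)
join_result = two_hop_join(follows, "ann")
graph_result = two_hop_graph(adjacency, "ann")
print(join_result, graph_result)
```

Both functions return the same two-hop neighbors. The difference is that the graph version reads the connections directly from each node, whereas the join version has to restate the matching condition for every additional hop.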

Of course, in-memory semantic graph databases other than uRIKA exist. Per Maltby, what differentiates uRIKA is its scalability and its performance, which stems from operating in-memory. “uRIKA has the largest scaled sharable memory system in the industry, with up to 512 terabytes of RAM. Typical systems run from 2 to 32 terabytes of RAM.”

Not only does uRIKA reportedly scale to 16 times as much data as its nearest competitor, it also boasts an impressive input/output rate. “uRIKA is highly parallel, working on tens of thousands of parallel threads, working on the problem at the same time. And perhaps most importantly for big data problems, uRIKA has a high-speed I/O system. It’s capable of reading or writing up to 350 terabytes an hour.”
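For a sense of scale, the quoted figures can be put through some back-of-the-envelope arithmetic. The sketch below assumes decimal units (1 terabyte = 1,000 gigabytes) and that the peak I/O rate is sustained throughout, neither of which the article states.

```python
# Figures quoted by Maltby in the article.
MAX_RAM_TB = 512          # largest uRIKA configuration
TYPICAL_MAX_RAM_TB = 32   # upper end of "typical systems"
IO_TB_PER_HOUR = 350      # peak read or write rate

# Ratio behind the "16 times" comparison.
scale_factor = MAX_RAM_TB / TYPICAL_MAX_RAM_TB

# Time to fill the largest memory pool at the peak rate (assumed sustained).
hours_to_fill = MAX_RAM_TB / IO_TB_PER_HOUR
minutes_to_fill = hours_to_fill * 60

# The same I/O rate in gigabytes per second (assumes 1 TB = 1,000 GB).
gb_per_second = IO_TB_PER_HOUR * 1000 / 3600

print(f"scale factor: {scale_factor:.0f}x")
print(f"time to fill {MAX_RAM_TB} TB: about {minutes_to_fill:.0f} minutes")
print(f"I/O rate: about {gb_per_second:.0f} GB/s")
```

Under those assumptions, loading a full 512-terabyte memory pool would take roughly an hour and a half, and the quoted I/O rate works out to a little under 100 gigabytes per second.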

As big data problems grow more complex and interconnected, graph databases grow more important. Maltby and YarcData hope their uRIKA system will become the early standard-bearer for semantic graph databases.

Related Articles

MapReduce Makes Further Inroads in Academia

Study Stacks MySQL, MapReduce and Hive

Six Super-Scale Hadoop Deployments

Technologies: Frameworks

Sectors: Biosciences, Science

Vendors: Cray

Tags: data, Hadoop, james maltby, mapreduce, urika, yarcdata
