April 30, 2012

The Application Angle to Unstructured Data

Datanami Staff

This week, Tom Leyden from Amplidata broke the interview-as-marketing-vehicle mold when he thoughtfully addressed the big data needs he’s hearing about, the shifting nature of applications, and the role of cloud models.

Leyden joked that the buzz around big data is happening because “cloud computing is getting old, the recession is not over and the industry needs new hype.”

More seriously, however, he said that while right now the focus seems to be directed at big data analytics, the real meat of the issue is in all of the big unstructured data.

For an object storage company like Amplidata, the growing recognition of the value in big unstructured data marks a clear opportunity. As the company’s Paul Speciale wrote not long ago, “Corporations have previously viewed big unstructured data as a burden and therefore a cost, they have turned to the lowest-cost media available for storage of this data: tape. Now that there is a realization that there is tremendous business value in unstructured data, they understand that keeping it dormant and hard to access on tape is indeed a highly inefficient choice.”

According to Leyden, “companies are turning dead tape archives into live disk archives and investigating ways to actively use the archives (rather than just spending money on tape and not accessing the data ever).” He says that the key technology here is erasure coding, an alternative to RAID that provides much more reliability with less overhead and cost. Of course, at the heart of this, at least if you ask an object storage guy like Leyden, is the belief that object storage is the latest, greatest way to store massive amounts of complex, unstructured data and that it provides a leg up for the (oftentimes legacy) applications that need to tap such data.
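To make the overhead claim concrete, here is a minimal sketch of the arithmetic behind a "k data fragments plus m parity fragments" layout. The layouts and their names are illustrative assumptions, not figures from Leyden or Amplidata, and the calculation assumes a code that can rebuild the data from any k surviving fragments (as Reed-Solomon-style codes do).

```python
from fractions import Fraction


def storage_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Raw capacity consumed per unit of user data for a k+m layout."""
    if data_fragments < 1 or parity_fragments < 0:
        raise ValueError("need at least one data fragment and no negative parity")
    return Fraction(data_fragments + parity_fragments, data_fragments)


def describe(name: str, data_fragments: int, parity_fragments: int) -> str:
    """Summarize how many fragment losses a layout survives and what it costs."""
    overhead = storage_overhead(data_fragments, parity_fragments)
    return (f"{name}: survives {parity_fragments} lost fragments, "
            f"stores {float(overhead):.2f}x the user data")


# Hypothetical layouts chosen only for comparison (k data, m redundant).
LAYOUTS = {
    "3-way replication": (1, 2),
    "RAID 6 style (8+2)": (8, 2),
    "wider erasure code (16+4)": (16, 4),
}

for layout_name, (k, m) in LAYOUTS.items():
    print(describe(layout_name, k, m))
```

In this sketch the 16+4 layout consumes the same 1.25x raw capacity as the 8+2 layout while surviving twice as many fragment losses, which is the kind of trade-off the erasure coding argument rests on. Real systems also have to weigh rebuild time and network traffic, which this arithmetic ignores.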

He says that as for cloud, it’s still up in the air whether companies can actually save money, but it does lend itself to business agility. Then again, he said, for an object storage company like his own, the meaning of the word cloud, at least in the context of enterprise big data, is in question. He says that “in the storage industry Amplidata is seeing the start of a paradigm shift from file-based storage to object storage (no file system, a programmable REST API, cloud storage).”

Leyden says, “This is probably just one phenomenon that is added to these numbers. Most enterprises still run on legacy applications for the most part. As the shift is turning to applications in the cloud, we will probably see a big wave of migrations of legacy applications to the cloud, especially as object storage helps facilitate this. How do we explain a factor 6 growth for the cloud industry? Applications.”

On that note, he claims that the term “unstructured big data” is in itself difficult to pin down due to the diversity of the data as well as the all-important applications. For instance, he points to “big science data”, which covers, for example, genomics research projects (both analytics and unstructured data). Then there is also “big enterprise data”, which is mostly the massive amounts of documents and other unstructured data generated by companies. On the other side there’s specific “big entertainment data”, which is unique to the film industry, where improved film quality has had a big impact on storage requirements. At yet another end of the spectrum are “big data streams”, which refer to large volumes of data generated by cloud applications such as Twitter and Facebook.

In the end, Leyden says, it’s not about the data, the structure or the vehicle for data processing and transmission (cloud or otherwise); the emphasis should be on the applications when making important “big data” decisions.

