October 17, 2018

Condusiv: System Capabilities Challenged As Data Warehouses Expand to Lakes

Oct. 17, 2018 — According to a report from the Aberdeen Group, the average company is experiencing data volume growth of more than 50% per year, drawn from an average of 33 different sources. The primary reasons organizations cite for aggregating this data are to increase operational efficiency; make data available from departmental silos and legacy systems; lower transaction costs; and offload capacity from mainframes or data warehouse operations.1 “The use of what are called ‘data lakes,’” says James D’Arezzo, CEO of Condusiv Technologies, “is an increasing contributor to this staggering growth rate.” D’Arezzo, whose company is a world leader in I/O reduction and SQL database performance, adds, “For organizations to make productive use of data in these volumes, it is vital that their IT managers take steps to optimize basic system functions.”

The term “data lake” is comparatively new, and there is still some confusion between a data lake and a data warehouse. The primary difference is that information placed into a data warehouse must first be structured into folders, rows, and columns, whereas a data lake is a repository for all kinds of data, structured or unstructured. Structure is applied only when the data is queried by a user.2
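By way of illustration (this sketch is not from the report, and the record fields and names shown are hypothetical), the minimal Python example below keeps raw, differently shaped records exactly as they arrive and imposes a common structure only at query time; a warehouse-style load would instead require every record to fit a fixed schema before it was stored.

    import json

    # A "data lake" in miniature: raw records are stored exactly as they arrive,
    # regardless of shape (schema-on-read). Field names are hypothetical.
    raw_lake = [
        '{"customer": "Acme", "amount": 120.5, "currency": "USD"}',
        '{"cust_id": 7, "total": "99.95"}',                      # different field names and types
        '{"customer": "Globex", "amount": 42, "notes": "rush order"}',
    ]

    def query_order_totals(lake):
        """Apply structure only at query time: map whichever fields are
        present onto a common (customer, amount) shape."""
        rows = []
        for line in lake:
            rec = json.loads(line)
            customer = rec.get("customer") or rec.get("cust_id", "unknown")
            amount = float(rec.get("amount", rec.get("total", 0)))
            rows.append({"customer": customer, "amount": amount})
        return rows

    for row in query_order_totals(raw_lake):
        print(row)

In a real data lake the records would live in files or object storage rather than an in-memory list, and the query would typically be run by an engine such as Hadoop or Spark, but the schema-on-read principle is the same.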

Many data experts see the use of data lakes as a vital next step in making strategic use of information. “In today’s world,” says Michael Hiskey, Head of Strategy at data hub software firm Semarchy, “a data lake is the foundation of information management. When built successfully, it can empower all end-users, even nontechnical ones, to use data and unlock its power. In a word, the data lake makes data science possible.”3

For all its potential power, however, the great strength of a data lake (its ability to absorb data of virtually any kind from virtually any source) is also a weakness. Its constituent pieces of differently structured data must undergo considerable processing and preparation before they can be combined and analyzed to produce meaningful insight, and that work consumes significant system resources. Suppose, one industry observer explains, you are running a job on Hadoop: a machine learning engine could take up quite a few CPU cycles, real-time analytics could be extremely memory intensive, and transforming or prepping data for analytics might be equally I/O intensive.4

Meanwhile, notes D’Arezzo, the need for breadth (again, the data lake’s reason for existence) must co-exist with the need for speed. In a world in which the term “big data” is rapidly being replaced by “fast data,”5 all organizations are struggling to get the most out of their necessarily limited computational resources. Left unaddressed, this strain will inevitably degrade system performance as data lakes expand.

“The temptation,” D’Arezzo says, “will be to throw money at the problem in the form of additional hardware. But that won’t work, partly because it’s inefficient, and partly because data volume is growing a lot faster than IT budgets. Both financially and in terms of overall system performance, it makes better sense to optimize the capacity of the hardware you already have. We’ve developed software solutions that can improve overall system throughput by 30% to 50%, or more—without the need for new hardware.”

About Condusiv Technologies 

Condusiv Technologies is a world leader in software-only storage performance solutions for virtual and physical server environments, enabling systems to process more data in less time for faster application performance. Condusiv guarantees to solve the toughest application performance challenges with faster-than-new performance via V-locity for virtual servers and Diskeeper or SSDkeeper for physical servers and PCs. With over 100 million licenses sold, Condusiv solutions are used by 90% of Fortune 1000 companies and almost three-quarters of Forbes Global 100 companies to increase business productivity and reduce data center costs while extending the life of existing hardware. Condusiv CEO Jim D’Arezzo has had a long and distinguished career in the high-tech arena.

Condusiv was founded in 1981 by Craig Jensen as Executive Software. Over 37 years, he has transformed thought leadership in file system management and caching into enterprise software. For more information, visit http://www.condusiv.com.

1.    Lock, Michael, “Angling for Insight in Today’s Data Lake,” Aberdeen Group, October 2017.
2.    Patrizio, Andy, “What is a data lake? Flexible big data management explained,” InfoWorld, September 24, 2018.
3.    Hiskey, Michael, “Building a Successful Data Lake: An Information Strategy Foundation,” Data Center Knowledge, September 11, 2018.
4.    Kleyman, Bill, “A Deep Dive Into Data Lakes,” Data Center Frontier, August 29, 2018.
5.    Kolsky, Esteban, “What to do with the data? The evolution of data platforms in a post big data world,” ZDNet, September 13, 2018.


Source: Condusiv Technologies
