February 2, 2021

Varada Open-Sources its Workload Analyzer to Help Data Teams Optimize Data Lake Queries

TEL AVIV, Israel, Feb. 2, 2021 - Varada, a data lake query acceleration innovator, today announced that it has open-sourced its Workload Analyzer for Presto, including both Trino (formerly known as PrestoSQL) and PrestoDB, making the source code available to everyone via GitHub. The Workload Analyzer is a free, easy-to-use tool that provides visibility into how Big Data and analytics workloads are performing and gives users insights into how to improve performance and optimize resources.

“Presto democratized Big Data, exponentially expanding the number of business users that can ask questions to a Big Data infrastructure and enlarging the number of underlying data sources they can query,” said Ori Reshef, vice president of products at Varada. “But as the number of users within an organization grows, the challenge of DataOps teams is to keep queries running quickly, delivering results in a timely way so that those users can do their jobs. Unfortunately, DataOps teams are only able to get bits and pieces of the information they need to optimize resources from Presto itself. So Varada built the Workload Analyzer to give DataOps teams deep and actionable insights.”

The Workload Analyzer collects details and metrics on every query, aggregates and extracts information, and delivers dozens of charts describing all the facets of cluster performance. For the first time, data engineers have a holistic view of their cluster and can drill down into pain points to determine which queries to optimize and how.

The Workload Analyzer is compatible with PrestoDB and Trino. The Workload Analyzer script runs safely within the Presto cluster in the user’s Virtual Private Cloud (VPC), collecting and analyzing query statistics (JSONs). No data leaves the cluster and the tool does not require any external resources. The Workload Analyzer has already been tested on dozens of massive-scale production clusters, with zero impact on query performance.
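
As a rough illustration of the kind of aggregation described above, the following Python sketch groups per-query statistics by hour of day. It is not the Workload Analyzer's code: the record fields (`query_id`, `created`, `cpu_seconds`) and the function name `hourly_cpu` are hypothetical, and the query JSONs produced by PrestoDB and Trino follow their own, richer schema.

```python
import json
from collections import defaultdict
from datetime import datetime


def hourly_cpu(records):
    """Count queries and sum CPU seconds per hour of day (0-23).

    `records` is a list of dicts with hypothetical keys
    "created" (ISO 8601 timestamp) and "cpu_seconds" (float).
    """
    totals = defaultdict(lambda: {"queries": 0, "cpu_seconds": 0.0})
    for rec in records:
        hour = datetime.fromisoformat(rec["created"]).hour
        totals[hour]["queries"] += 1
        totals[hour]["cpu_seconds"] += rec["cpu_seconds"]
    return dict(totals)


# Hypothetical sample records, standing in for collected query statistics.
sample = json.loads("""
[
  {"query_id": "q1", "created": "2021-02-02T09:15:00", "cpu_seconds": 12.5},
  {"query_id": "q2", "created": "2021-02-02T09:40:00", "cpu_seconds": 3.0},
  {"query_id": "q3", "created": "2021-02-02T14:05:00", "cpu_seconds": 40.0}
]
""")

for hour, stats in sorted(hourly_cpu(sample).items()):
    print(hour, stats["queries"], stats["cpu_seconds"])
```

A per-hour summary like this is the sort of input a team might use when deciding on scaling rules; the actual tool presents such information as charts.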

Using the Workload Analyzer, data teams can:

- Learn how resources are used on an hourly and weekly basis and define scaling rules
- Identify heavy spenders and improve the pipeline
- Improve predicate pushdown and significantly reduce IO and CPU
- Identify the “hottest” data
- Improve JOIN performance
- Provide a better production roll-out experience and identify upgrade risks upfront
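
To make one item in the list above concrete, identifying "heavy spenders" can be thought of as ranking users by the resources their queries consume. The sketch below shows that idea in Python under stated assumptions: the `user` and `cpu_seconds` fields and the `top_spenders` helper are hypothetical and are not taken from the Workload Analyzer itself.

```python
from collections import Counter


def top_spenders(records, n=3):
    """Return the n users with the highest total CPU seconds.

    `records` is a list of dicts with hypothetical keys
    "user" (str) and "cpu_seconds" (float).
    """
    cpu_by_user = Counter()
    for rec in records:
        cpu_by_user[rec["user"]] += rec["cpu_seconds"]
    # most_common sorts by accumulated value, largest first.
    return cpu_by_user.most_common(n)


# Hypothetical per-query records.
records = [
    {"user": "etl", "cpu_seconds": 120.0},
    {"user": "bi", "cpu_seconds": 30.0},
    {"user": "etl", "cpu_seconds": 60.0},
    {"user": "adhoc", "cpu_seconds": 45.0},
]

for user, cpu in top_spenders(records, n=2):
    print(user, cpu)
```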

Presto: A Tool of Choice for Data-driven Companies

Presto is an open source distributed SQL query engine for running interactive analytic queries. Presto offers many benefits, most notably its ability to quickly run queries on a wide variety of data sources all at once, including ‘raw,’ unmodeled data. With this capability, as well as other unique advantages, Presto has quickly become a tool of choice for many significant data-driven companies.

The Varada Commitment to the Trino and PrestoDB Communities

“As part of our deep commitment to the PrestoDB and Trino communities, Varada decided to release a standalone, open source version of our Workload Analyzer tool so that any Presto user can evaluate potential performance improvements in their cluster,” said Eran Vanounou, CEO of Varada. “The tool will help PrestoDB and Trino users optimize their clusters on their own using their existing solutions. Of course, we anticipate that after discovering the existing inefficiencies within their clusters, many users will want to further evaluate how adding an indexing layer to PrestoDB or Trino can help them vastly improve performance. We will be more than happy to demonstrate how the Varada Data Platform can do just that.”

Varada leverages Presto in its innovative query acceleration engine, the Varada Data Platform. A big data infrastructure solution for fast analytics on thousands of dimensions, the Varada Data Platform became generally available in December 2020. Varada’s proprietary indexing layer runs on top of Presto, improving Presto’s query response time by 10x to 100x.

About Varada

The Varada mission is to enable data practitioners to go beyond the traditional limitations imposed by data infrastructure and instead zero in on the data and answers they need—with complete control over performance, cost and flexibility. In Varada’s world of big data, every query can find its optimal plan, with no prior preparation and no bottlenecks, providing consistent performance at a petabyte scale. Varada was founded by veterans of the Dell EMC XtremIO core team, and is dedicated to leveraging the data lake architecture to take on the challenge of data and business agility. Varada has been recognized in the Cool Vendors in Data Management report by Gartner, Inc. For more information, visit: https://varada.io/

Source: Varada
