July 15, 2022

Alluxio Featured Speaker and Premier Sponsor at PrestoCon Day

SAN MATEO, Calif., July 15, 2022 — Alluxio, the developer of the open source data orchestration platform for data-driven workloads such as large-scale analytics and AI/ML, has announced its participation in PrestoCon Day, a day dedicated to all things Presto that takes place virtually on Thursday, July 21, 2022. Alluxio will also host a Presto Committer Virtual Office Hour to answer questions about Presto and Alluxio.

Alluxio Sessions at PrestoCon

July 21 at 11:40 am PT – “Architecting your Data Platform with Presto and Alluxio in Heterogenous Environments,” by Adit Madan, director of product management at Alluxio 

As the cloud is evolving and the adoption of a hybrid-cloud or multi-cloud approach grows, the data architecture must adapt to heterogeneous environments. In this talk, Adit Madan shares insights on how to architect a data platform with Presto and Alluxio that provides agility and simplicity to your data team.

July 21 at 11:45 am PT – “Dynamic UDF Framework and its Applications,” by Rongrong Zhong at Alluxio and Yanbing Zhang, software engineer at Bytedance. 

In this talk, Rongrong and Yanbing will discuss a microservice that they built at Uber to analyze Presto queries. The Presto query engine does not provide endpoints for query analysis, so one has to either execute the query or gather insights from its explain plan. They will cover three topics: (1) the work required to perform query analysis in a microservice that uses Presto as a library; (2) predicate analysis of queries to produce data formatting recommendations that improve query performance; and (3) use of the analysis service for query result cache invalidation, where the analysis determines whether the results from a previous run of a query are still valid and can be reused.

July 21 at 1:15 pm PT – “Speeding up Presto at Uber with Alluxio Caching,” by Chen Liang, senior software engineer at Uber, and Beinan Wang, software engineer at Alluxio

At Uber, Presto is heavily used as one of the primary data analytics tools, and Presto’s query performance has a profound impact on production. As part of the Presto optimization effort, Uber explored Alluxio as a caching solution. Alluxio is an open source data orchestration platform that many compute frameworks use as a caching layer. Alluxio caching is currently enabled on ~2000 nodes across 6 clusters at Uber. This session will present Uber’s journey integrating Alluxio cache into Presto, review the specific challenges encountered and how they were addressed, and share the resulting performance improvements. Lastly, it will discuss plans, next steps, and potential future collaboration opportunities with the community.

View all the sessions in the full program schedule.

PrestoCon Day is a free virtual event and registration is open.

About Alluxio

Proven at global web scale in production for modern data services, Alluxio is the developer of open source data orchestration software for the cloud. Alluxio moves data closer to data analytics and machine learning compute frameworks in any cloud across clusters, regions, and clouds, providing memory-speed data access to files and objects. Intelligent data tiering and caching deliver greater performance and reliability to customers in financial services, high tech, retail and telecommunications. Alluxio is in production use today at eight out of the top ten internet companies. Venture-backed by Andreessen Horowitz, Seven Seas Partners, Volcanic Ventures, and Hillhouse Capital, Alluxio was founded at UC Berkeley’s AMPLab by the creators of the Tachyon open source project. For more information, contact [email protected].


Source: Alluxio
