December 11, 2020

LinkedIn’s Translation Engine Linked to Presto

George Leopold

A SQL translation engine unveiled this week by LinkedIn integrates with open-source SQL query engines such as Presto, a combination aimed at querying bulging data lakes.

The Microsoft unit’s Coral engine handles analysis and rewrite along with translation duties. Along with Presto, Coral integrates with Spark and Pig. LinkedIn said Thursday (Dec. 10) Coral is also integrated with Dali Catalog, LinkedIn’s data access tool that defines and “evolves” a data set.

LinkedIn describes Dali Catalog as its common data layer enabling high-velocity data while abstracting the details of data access from compute engines.

The catalog includes Dali tables and views. A view is a “relation” that refers to logic applied on base tables. Dali views enable data transformation, cleaning and aggregation from multiple sources as well as adding semantic meaning to data, LinkedIn said in a blog post unveiling Coral.
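The idea of a view as logic layered over base tables can be illustrated with a small, generic example. The sketch below uses Python's built-in sqlite3 module as a stand-in engine; it is not Dali or Coral, and the table name `page_views` and view name `views_by_country` are made up for illustration. The view cleans a column (trimming and upper-casing) and aggregates over it, so readers of the view never touch the raw table.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in; Dali itself is not involved.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table with messy country codes.
    CREATE TABLE page_views (member_id INTEGER, country TEXT);
    INSERT INTO page_views VALUES (1, ' us'), (2, 'US'), (3, 'de'), (4, NULL);

    -- The view holds the cleaning and aggregation logic, not the data.
    CREATE VIEW views_by_country AS
        SELECT UPPER(TRIM(country)) AS country, COUNT(*) AS views
        FROM page_views
        WHERE country IS NOT NULL
        GROUP BY UPPER(TRIM(country));
""")

# Consumers query the view as if it were a table.
rows = conn.execute(
    "SELECT country, views FROM views_by_country ORDER BY country"
).fetchall()
print(rows)
```

Because the logic lives in the view definition, a fix to the cleaning rule changes results for every consumer without any of them rewriting a query.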

Dali views also are readable in Hive, Spark, Pig and Presto.

Coral is designed to make Dali views “more user-friendly, agile, secure and portable,” its maintainers added. Portability, meaning the view definition language is not tied to the underlying engine, is achieved through view virtualization, view translation and rewrite as well as Coral’s integration with Presto, Spark and Pig.

For view virtualization, the Coral module is used to access database, table and view information. The Dali Catalog uses a Coral Hive module to interface with view definitions stored in Hive. The module also houses a parser, validator and converter to handle representations of Hive query language view definitions.
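The parse, validate and convert stages described above can be pictured with a toy pipeline. The following sketch is an assumption-laden illustration, not Coral's API: the in-memory `TABLES` and `VIEWS` dictionaries stand in for a metastore, `ViewPlan` is a hypothetical engine-neutral representation, and the regular expression handles only a single-table SELECT with an optional WHERE clause.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-ins for a metastore: base table schemas and stored view text.
TABLES = {"page_views": ["member_id", "country"]}
VIEWS = {"us_views": "SELECT member_id FROM page_views WHERE country = 'US'"}


@dataclass(frozen=True)
class ViewPlan:
    """Toy engine-neutral representation of a view definition."""
    columns: Tuple[str, ...]
    table: str
    predicate: Optional[str]


# Deliberately tiny grammar: SELECT <cols> FROM <table> [WHERE <predicate>]
_PATTERN = re.compile(
    r"^SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)(?:\s+WHERE\s+(?P<pred>.+))?$",
    re.IGNORECASE,
)


def parse(sql: str):
    """Parser stage: split the view text into columns, table and predicate."""
    match = _PATTERN.match(sql.strip())
    if match is None:
        raise ValueError(f"unsupported view definition: {sql!r}")
    cols = tuple(c.strip() for c in match.group("cols").split(","))
    return cols, match.group("table"), match.group("pred")


def validate(cols, table):
    """Validator stage: check the table and columns against the catalog."""
    if table not in TABLES:
        raise LookupError(f"unknown table: {table}")
    unknown = [c for c in cols if c not in TABLES[table]]
    if unknown:
        raise LookupError(f"unknown columns in {table}: {unknown}")


def load_view(name: str) -> ViewPlan:
    """Converter stage: produce the neutral representation for a stored view."""
    cols, table, pred = parse(VIEWS[name])
    validate(cols, table)
    return ViewPlan(cols, table, pred)


plan = load_view("us_views")
print(plan)
```

Keeping validation separate from parsing means a view that references a missing table or column fails when it is loaded, before any engine tries to run it.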

(Figure omitted. Source: LinkedIn)

“Coral rewrites view definitions into a number of engine-compliant languages and SQL dialects,” LinkedIn said. “During that rewrite, it maps functions to their equivalent ones in the target engines so they are semantically equivalent.”
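Function mapping of this kind can be sketched in a few lines. The example below is a simplified illustration rather than Coral's implementation: the `HIVE_TO_PRESTO` table is a hand-picked assumption covering three Hive functions with Presto counterparts (`nvl` to `coalesce`, `instr` to `strpos`, `ucase` to `upper`), and the rewrite works on raw text with a regular expression, whereas a production translator would operate on a parsed query tree so that string literals and argument differences are handled correctly.

```python
import re

# Illustrative mapping only; not taken from Coral's own function registry.
HIVE_TO_PRESTO = {
    "nvl": "coalesce",
    "instr": "strpos",
    "ucase": "upper",
}

# Matches an identifier immediately followed by "(", i.e. a function call.
_CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def rewrite_functions(sql: str, mapping: dict) -> str:
    """Replace known function names; leave everything else untouched."""
    def swap(match):
        name = match.group(1)
        return mapping.get(name.lower(), name) + "("
    return _CALL.sub(swap, sql)


hive_sql = (
    "SELECT ucase(nvl(country, 'n/a')) FROM page_views "
    "WHERE instr(url, 'jobs') > 0"
)
presto_sql = rewrite_functions(hive_sql, HIVE_TO_PRESTO)
print(presto_sql)
```

A lookup table works here because each pair takes the same arguments in the same order. Functions whose arguments differ between engines would need a per-function rewrite rule instead of a rename.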

Meanwhile, the Presto integration rewrites Dali view definitions into Presto-compliant SQL queries. Similarly, the Coral Spark implementation rewrites view definitions for the Spark engine.

LinkedIn said it has worked with the Presto community to integrate Coral functionality into the Presto Hive connector, a step that would enable the querying of complex views using Presto.

Ongoing enhancements include developing more “frontend” query APIs, including those suitable for querying graph data and defining machine learning features. LinkedIn said work is also underway to support Dali views in streaming and online query engines.

The Coral code repository is available on GitHub.


