September 7, 2023

Starburst Introduces Python DataFrame Support for Complex Data Transformation and Data Application Workloads

BOSTON, Sept. 7, 2023 – Starburst today extended its support for Python, the most widely used multi-purpose, high-level programming language, with PyStarburst, and announced a new integration with the open source Python library Ibis, built in collaboration with composable data systems builder and Ibis maintainer Voltron Data.

For Starburst and Trino developers and data engineers, this announcement means that they no longer need to offload data to frameworks like PySpark and Snowpark to handle complex transformation workloads. Instead, teams can leverage a single, powerful MPP engine for both their analytical and transformation workloads – reducing the cost and complexity of their stack.

PyStarburst provides a syntax familiar to PySpark and Snowpark users for writing and running production-grade ETL pipelines and data transformations. This makes it easy both to build new pipelines with PyStarburst and to migrate existing PySpark and Snowpark pipelines to Starburst without rewriting code.
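As a rough illustration of the DataFrame style described above, the sketch below chains transformation steps in Python and hands the engine a single SQL statement only when results are requested. It does not use the PyStarburst API: the `LazyFrame` class and its methods are hypothetical names invented for this example, and the standard-library `sqlite3` module stands in for the query engine.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyFrame:
    """Hypothetical lazy DataFrame; an illustration, not the PyStarburst API."""

    table: str
    columns: tuple = ("*",)
    predicates: tuple = ()

    def filter(self, predicate: str) -> "LazyFrame":
        # Record the condition; nothing runs on the engine yet.
        return LazyFrame(self.table, self.columns, self.predicates + (predicate,))

    def select(self, *columns: str) -> "LazyFrame":
        # Record the projection; still nothing is executed.
        return LazyFrame(self.table, columns, self.predicates)

    def to_sql(self) -> str:
        # Compile the recorded steps into one SQL statement.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.predicates)
        return sql

    def collect(self, conn: sqlite3.Connection) -> list:
        # Only here is work pushed down to the engine.
        return conn.execute(self.to_sql()).fetchall()


# sqlite3 is an assumed stand-in for the real engine, with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 120.0), (2, "west", 80.0), (3, "east", 40.0)],
)

pipeline = (
    LazyFrame("orders")
    .filter("region = 'east'")
    .filter("amount > 50")
    .select("id", "amount")
)
print(pipeline.to_sql())
print(pipeline.collect(conn))
```

The design point is that the Python code only describes the transformation, while filtering and projection run inside the engine, so the data does not have to be moved to a separate framework first.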

“Many data engineers prefer writing code over SQL for transformations, and many software engineers are used to building data applications in Python. With PyStarburst, we’re giving them the freedom to do so with the increased productivity and performance of Starburst’s enterprise-grade Trino,” said Martin Traverso, CTO of Starburst.

For developers and data engineers looking to build scalable data applications, the new Ibis integration provides a uniform Python API that can execute queries on more than 18 different engines – including DuckDB, pandas, PostgreSQL, and now Starburst Galaxy. This means you can scale from development on a laptop to production in Galaxy without rewriting a single line of code.
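The idea of one expression running unchanged on several engines can be sketched in a few lines. This is a conceptual illustration under stated assumptions, not the Ibis API: `TotalBy`, `run_in_memory`, and `run_on_sqlite` are hypothetical names, a plain Python list plays the role of the laptop-side backend, and the standard-library `sqlite3` module plays the role of the production engine.

```python
import sqlite3
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TotalBy:
    """Hypothetical backend-neutral expression: sum `value` grouped by `key`."""

    table: str
    key: str
    value: str


def run_in_memory(expr: TotalBy, tables: dict) -> dict:
    # "Laptop" backend: evaluate the expression over plain Python rows.
    totals = defaultdict(float)
    for row in tables[expr.table]:
        totals[row[expr.key]] += row[expr.value]
    return dict(totals)


def run_on_sqlite(expr: TotalBy, conn: sqlite3.Connection) -> dict:
    # "Engine" backend: translate the same expression to SQL and push it down.
    sql = (
        f"SELECT {expr.key}, SUM({expr.value}) "
        f"FROM {expr.table} GROUP BY {expr.key}"
    )
    return {key: total for key, total in conn.execute(sql)}


# Made-up sample data, loaded into both backends.
rows = [
    {"region": "east", "amount": 120.0},
    {"region": "west", "amount": 80.0},
    {"region": "east", "amount": 40.0},
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (:region, :amount)", rows)

# The expression is written once and evaluated by either backend.
expr = TotalBy(table="orders", key="region", value="amount")
local_result = run_in_memory(expr, {"orders": rows})
engine_result = run_on_sqlite(expr, conn)
print(local_result == engine_result)
```

Because the expression carries no backend-specific detail, swapping the execution target is a change to the call site rather than to the analysis logic.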

“At Starburst everything is built with openness in mind, and we are interoperable with nearly any data environment, so we’re extending that commitment to our programming languages. The partnership with Voltron Data and Ibis was a natural fit,” said Harrison Johnson, Head of Technology Partnerships at Starburst.

Together, Ibis and Starburst Galaxy empower users to write portable Python code that executes on Starburst’s high-performance data lake analytics engine, operating on data from more than 50 supported sources. Users will now be able to build analytic expressions across multiple data sources with reusable scripts that execute at any scale.

“Python users struggle to bridge the gap between prototypes on their laptops and production apps running on platforms like Starburst Galaxy. Ibis makes it much easier to bridge this gap,” said Josh Patterson, CEO of Voltron Data. “With Ibis, you can write Python code once and run it anywhere, with any supported backend execution engine. You can move seamlessly from crunching gigabyte-scale test data on your laptop to crunching petabyte-scale data in production using Starburst Galaxy.”

To learn more about Starburst, including its offerings and integrations, please visit the Starburst website.

About Starburst

For data-driven companies, Starburst offers a full-featured data lake analytics platform, built on open source Trino. Our platform includes the capabilities needed to discover, organize, and consume data without the need for time-consuming and costly migrations. We believe the lake should be the center of gravity and the starting point for querying disparate data. With Starburst, teams can access more complete data, lower the cost of infrastructure, use the tools best suited to their specific needs, and avoid vendor lock-in. Trusted by companies like Comcast, Grubhub, and Priceline, Starburst helps companies make better decisions faster on all their data.

About Voltron Data

Voltron Data offers a new way to design and build composable data systems. Founded in 2021, the global team is led by data engineers and core open source maintainers who have driven innovation in the data analytics ecosystem for the last 15 years. Today, Voltron Data offers a set of modular components built on open standards that help organizations augment existing data systems, unlock language interoperability, and take advantage of hardware acceleration.

Source: Starburst
