March 24, 2023

Apache Flink 1.17 Update Drives Streaming Data Warehouses

Jaime Hampton

The folks at the Apache Flink project have announced a 1.17.0 release of the popular open source distributed framework for streaming data use cases.

“Apache Flink is the leading stream processing standard, and the concept of unified stream and batch data processing is being successfully adopted in more and more companies. Thanks to our excellent community and contributors, Apache Flink continues to grow as a technology and remains one of the most active projects in the Apache Software Foundation,” the Apache Flink project management committee said in an announcement.

The Flink 1.17 release is geared towards optimizing streaming data warehouses, a modern data storage and processing approach for handling real-time or near-real-time data streams. Many traditional data warehouses are primarily batch-oriented, loading data only at scheduled times. Streaming warehouses instead continuously ingest, process, and analyze data as it is generated, allowing analytics and decision-making to draw on the newest data available. Flink is a popular choice for implementing streaming warehouses because the framework was specifically designed for large-scale, low-latency data stream processing.

The 1.17 release includes several features and improvements for data stream processing. Work on streaming SQL semantics addresses challenges with non-deterministic operations by fixing incorrect optimization plans and functional issues. A new experimental feature informs SQL users of potential correctness risks and offers optimization suggestions. Checkpointing has also been improved for speed, stability, and usability, and a new REST interface allows users to manually trigger checkpoints with a user-specified checkpoint type while a job is running.
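As a rough illustration of the manual checkpoint trigger mentioned above, the Python sketch below builds (but does not send) such a REST request. The endpoint path `/jobs/<job_id>/checkpoints`, the `checkpointType` field, and the values `CONFIGURED`, `FULL`, and `INCREMENTAL` are assumptions not stated in this article and should be checked against the Flink REST API documentation. The function name, host, and job ID are hypothetical.

```python
import json
from urllib.request import Request

# Assumed set of accepted checkpoint types; verify against the Flink REST docs.
ASSUMED_CHECKPOINT_TYPES = {"CONFIGURED", "FULL", "INCREMENTAL"}


def build_checkpoint_trigger(base_url: str, job_id: str,
                             checkpoint_type: str = "FULL") -> Request:
    """Build a POST request asking a running job to take a checkpoint.

    The request is only constructed here, not sent, so the sketch can be
    inspected without a running cluster.
    """
    if checkpoint_type not in ASSUMED_CHECKPOINT_TYPES:
        raise ValueError(f"unexpected checkpoint type: {checkpoint_type}")
    body = json.dumps({"checkpointType": checkpoint_type}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/jobs/{job_id}/checkpoints",  # assumed path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical JobManager address and job ID, for illustration only.
    req = build_checkpoint_trigger("http://localhost:8081", "abc123")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Against a live cluster, the request could be sent by passing it to `urllib.request.urlopen`.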

Watermark alignment has also been improved to strengthen coordination and reduce excessive buffering by downstream operators. Additionally, the FRocksDB update brings improvements to RocksDBStateBackend, including shared memory between slots and support for the Apple M1 chip.
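The idea behind watermark alignment can be shown with a toy simulation. This is not Flink code. It is a minimal Python sketch that assumes a simplified model in which a source pauses whenever its watermark runs more than `max_drift` ahead of the slowest active source. All class, function, and parameter names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    name: str
    timestamps: List[int]  # event timestamps this source emits, in order
    position: int = 0
    watermark: int = 0

    def exhausted(self) -> bool:
        return self.position >= len(self.timestamps)


def run(sources: List[Source], max_drift: float) -> int:
    """Read all sources round by round and return the largest watermark gap
    seen between the fastest and slowest active source."""
    max_gap = 0
    while any(not s.exhausted() for s in sources):
        active = [s for s in sources if not s.exhausted()]
        slowest = min(s.watermark for s in active)
        for s in active:
            if s.watermark > slowest + max_drift:
                continue  # too far ahead: pause this source for the round
            s.watermark = s.timestamps[s.position]
            s.position += 1
        gap = max(s.watermark for s in active) - min(s.watermark for s in active)
        max_gap = max(max_gap, gap)
    return max_gap


def make_sources() -> List[Source]:
    # One source advances event time ten times faster than the other.
    return [
        Source("fast", [10 * i for i in range(1, 11)]),
        Source("slow", list(range(1, 11))),
    ]


if __name__ == "__main__":
    print("gap without alignment:", run(make_sources(), math.inf))
    print("gap with max_drift=5: ", run(make_sources(), 5))
```

In this model the gap between the two watermarks stands in for the data a downstream operator has to buffer while waiting for the slower source, so capping the drift caps the buffering.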

Flink 1.17 also includes updates to support batch processing. Flink SQL gains a new delete and update API for batch mode, enabling row-level modifications in external storage systems, and the release improves the stability and performance of batch workloads. Flink 1.17 also introduces a “gateway mode” for SQL Client, enabling users to submit queries to a SQL Gateway for advanced functionality. Additionally, users can now manage job lifecycles through SQL statements.

Apache Flink continues to garner interest due to its ability to run stream processing with very large state or high throughput. In a recent article, Robert Metzger, a member of the Apache Flink PMC, notes that “In 2022 alone, a total of at least $55 million has been invested by venture capitalists into startups building companies around Apache Flink.” Companies investing in Flink include Confluent, which recently acquired Immerok, and AWS, which offers Flink as a hosted service.

“Flink is hot because the community of data scientists and infrastructure engineers have decided that the future is Flink. We have all the ingredients: well-funded startups, well-resourced enterprises loaded with engineering talent, a battle-tested and open-source technology, and a huge market that is rapidly emerging from an early state into one that is looking to modernize data stacks to become real-time,” wrote Metzger.

Confluent to Develop Apache Flink Offering with Acquisition of Immerok

Preventing the Next 9/11 Goal of NORAD’s New Streaming Data Warehouse

Applications: Data Management, Enterprise Analytics

Vendors: Apache Flink

Tags: Apache Flink, real-time streaming data, streaming data warehouses
