May 6, 2020

Step One in Kafka’s Metamorphosis Revealed

Alex Woodie


Confluent today revealed that elastic scaling and billing is the first stage of Project Metamorphosis, its eight-part plan to morph its Confluent Cloud and Confluent Platform — and Apache Kafka to a lesser extent — to the cloud.

Cloud computing and real-time streaming of event data are two of the most compelling technological innovations to occur over the past decade, according to Jay Kreps, the co-creator of Apache Kafka and the CEO and founder of Confluent.

However, event streaming platforms and cloud computing don’t necessarily play well together, Kreps says in a video blog posted to Confluent’s website today. “That’s what Project Metamorphosis is about: bringing those two things together,” he says. “We think it’s time to fundamentally re-imagine what event streaming can be in the cloud, and provide an experience without all the incidental complexity and manual labor.”

Confluent revealed Project Metamorphosis two weeks ago in conjunction with the massive $250 million Series E funding haul. Today, the company pulled the curtain back on the first of eight planned announcements around the project, and it has to do with enabling elastic scaling of Confluent Platform running in the cloud.

According to Kreps, Confluent already supported cloud elasticity with the Basic and Standard Confluent Cloud options, which scale up to support data throughput of 100 MB per second and offer automatic balancing of data. With today’s announcement, Confluent is adding elasticity and dynamic load management to its Dedicated cloud offering, which scales up to handle multiple gigabytes of throughput per second.

Confluent CEO and co-founder Jay Kreps says Kafka currently is a poor fit for today’s cloud architectures

“You can dynamically create clusters and grow and shrink them as you need more or less capacity, and Confluent Cloud will do the hard work of balancing the load and moving the data placement around the cluster to take advantage of the capacity that you have,” Kreps says.

Currently, this elasticity is supported with just the core components of Confluent Cloud. But the plan calls for Confluent to add the same type of dynamic scaling – and the associated billing – to other components of its cloud offering, including Kafka Connect, which links the core Kafka event streaming system to outside databases, file systems, and applications; as well as KSQL, which adds the ability to manage and query state data in conjunction with the event data (which is handled with Kafka).

But wait, there’s more! “We’re not stopping there,” Kreps adds. “We don’t want these to be just scalable. But we want the scaling to be done for you. Over the course of the year, we’ll be working on making these systems autoscale, so they can automatically expand and contract to meet their needs and reduce the cost to you.”

Customers who run Confluent’s brand of Kafka on their own servers will be able to get the same elasticity and auto-scaling options soon, Kreps promises. That’s because the next major release of the Confluent Platform will support automatic data balancing and data placement within Confluent clusters, he says.

“When combined with Confluent Operator, which supports operations on Kubernetes, you can get a similar cloud-like elasticity in the private cloud and on-prem environments, allowing you to dynamically launch and expand clusters on demand and get that same data balancing capability natively within the cluster,” Kreps says in the video blog.

But what about users of Apache Kafka? While the open source software doesn’t add to Confluent’s bottom line, the company (which exercises a lot of control over the Apache project) will be working with the open source community to expose some of these capabilities in the open source version of Kafka.

“Some of the work that we’re doing to make that [elasticity and auto-scaling in a cloud-native manner] possible has spilled over into Apache Kafka itself,” Kreps says, specifically pointing to the KIP-500 program to remove Kafka’s dependencies on ZooKeeper.

“This is really important to support elastic usage patterns,” Kreps says. “ZooKeeper is Kafka’s biggest bottleneck and the dependencies on ZooKeeper, which is really kind of a legacy component, makes elastic operations difficult. This effort to remove that is very active. Design work is happening; code is being written; and we believe these changes will make it possible to dramatically scale up the number of partitions and topics that Kafka can support, as well as significantly simplifying the operation for those self-managing Kafka, by making it a single self-contained deployment without multiple tiers that all have to be tuned and secured independently.”

When ZooKeeper is fully eradicated from Kafka, the software will be able to scale to millions of partitions, Kreps says.

The company plans to unveil seven more components of Project Metamorphosis over the next seven months. If the story of Gregor Samsa is any indication, Kafka will resemble something entirely different when it’s all said and done.

Confluent Reveals ksqlDB, a Streaming Database Built on Kafka

Higher Abstractions, Lower Complexity in Kafka’s Future
