March 28, 2016

Transactional Streaming? You Need State

John Hugg

The distinction between traditional operational systems and event/stream processing has begun to blur. Stream-oriented approaches offer novel ways to build applications that yesterday would have used a more traditional stack, such as LAMP or something similar. Rather than have monolithic clients fetch, process and update data over a network, developers are building pipelines that push data through fixed processing. It’s easier to scale, easier to reason about and is friendly to federated architectures and the super-trendy microservices.

This new world order comes with a catch. These streaming systems need to manage state, whether we’re talking about simple aggregations to prepare dashboards, or full-on system-of-record databases for storing account balances. To be truly operational, it’s not enough to make a decision, but it has to be recorded, and it has to be based on the best information.

Unfortunately, most stream processing systems have little to no integrated state management, leading developers to combine stateful systems like databases with streaming systems in multi-cluster setups. These cobbled-together hybrid systems offer flexibility but are complex to develop and deploy, may mask bugs and display surprising behavior when components fail. The superficially “simple” task of connecting systems together requires both distributed systems expertise and tremendous effort to hammer down unforeseen production issues.

But it is possible to integrate event processing with state management in a consistent, transactional way where one event equals one ACID transaction. You need to anticipate and answer questions:

How can integration and atomic processing simplify failure management? How does transactionality simplify building applications?
How can users leverage stronger consistency to do more complex math and calculations within event processing problems?
How can we move from struggling to count to dynamic counting and aggregation in real time?

Many streaming systems focus on at-least-once or at-most-once delivery. Systems that offer stronger guarantees are often very limited in how they interact with state. Can stronger consistency help achieve exactly-once semantics?

Ultimately, the latency reduction from integration can mean the difference between decisions that control how an event is processed or reactive decisions that only affect the future. Integrating operational stateful components can reduce latency and increase throughput, which can mean the difference between managing a handful of servers or a fleet.

The data management solutions of tomorrow will manage state and processing together. Building distributed systems over unreliable networks is very difficult, and the best systems will show up with the right mix of abstraction, flexibility and best-practices to help developers succeed at any scale.

I’ll discuss these topics in my Wednesday session, “Transactional streaming: If you can compute it, you can probably stream it,” which goes from 1:50 to 2:30 in room LL20 B. You can read more about the session here.

About the author: John Hugg, founding engineer & Manager of Developer Relations at VoltDB, has spent his entire career working with databases and information management. In 2008, John was lured away from a PhD program by Mike Stonebraker to work on what became VoltDB. As the first engineer on the product, he liaised with a team of academics at MIT, Yale, and Brown who were building H-Store, VoltDB’s research prototype. Then John helped build the world-class engineering team at VoltDB to continue development of the open source and commercial products.

Applications: Enterprise Analytics, Predictive Analytics

Technologies: Frameworks, Mobile, Network, Storage

Sectors: Other

Vendors: VoltDB

Tags: Hadoop, stateful computing, streaming analytics

Only registered users may comment. Register using the form below.

Check off newsletters you would like to receive*
- HPCwire
- EnterpriseTech
- Datanami
- Technology Conferences & Events
- Advanced Computing Job Bank
- Technology Product Showcase
Email*
Name*
First Last
Organization*
Job Function*
Industry*
Country*
City*
State*
Province*
- Please check here to receive valuable email offers from Datanami on behalf of our select partners.

Transactional Streaming? You Need State

Join the discussion Cancel reply

Only registered users may comment. Register using the form below.

May 10, 2024

May 9, 2024

May 8, 2024

May 7, 2024

Sponsored Partner Content

Get your Data AI Ready – Celebrate One Year of Deep Dish Data Virtual Series!

Supercharge Your Data Lake with Spark 3.3

Learn How to Build a Custom Chatbot Using a RAG Workflow in Minutes [Hands-on Demo]

Overcome ETL Bottlenecks with Metadata-driven Integration for the AI Era [Free Guide]

Gartner® Hype Cycle™ for Analytics and Business Intelligence 2023

The Art of Mastering Data Quality for AI and Analytics

Leading Solution Providers

Tabor Network

Sponsored Whitepapers

Top 6 Strategies for Reducing Data Warehouse Costs

Building an Operational Data Warehouse for Real-time Analytics

Sponsored Multimedia

The Power of DataOps: Bring Automation to Life
No Comments

Tactical Steps for Cloud Migration
No Comments

Immuta Data Access Platform
No Comments

Data Mesh: Fact or Fiction?
No Comments

Contributors

Featured Events

AI & Big Data Expo North America 2024

CDAO Canada Public Sector 2024

AI Hardware & Edge AI Summit Europe

AI Hardware & Edge AI Summit 2024

CDAO Government 2024

Transactional Streaming? You Need State

Join the discussion Cancel reply

Only registered users may comment. Register using the form below.

May 10, 2024

May 9, 2024

May 8, 2024

May 7, 2024

Most Read Features

Most Read News In Brief

Most Read This Just In

Sponsored Partner Content

Leading Solution Providers

Tabor Network

Sponsored Whitepapers

Sponsored Multimedia

Contributors

Featured Events

Share

Copy short link