June 25, 2024

Prefect 3.0 Enhances Data Workflows with Transactional and Flexible Orchestration

June 25, 2024 — Prefect today announced the open-source technical preview of Prefect 3.0. In this blog post written by CEO Jeremiah Lowin, Prefect introduces innovative solutions for data teams to write resilient code, automate with flexibility, and efficiently run on provisioned resources. Prefect 3.0 offers a new framework for building resilient workflows by design, emphasizing transactional semantics, flexible orchestration, and portable execution.


Today, we’re very excited to announce the open-source technical preview of Prefect 3.0.

Data teams face a relentless challenge. Modern businesses rely on a complex web of workflows and automations, making it critical to build resilient systems that are easy to trust. Prefect 3.0 is our answer to that challenge, providing a framework for building workflows that are resilient by design. Our third-generation workflow engine enables new levels of confidence in automation through the ability to:

  • Apply transactional semantics to workflow execution, ensuring data consistency and minimizing the cost of failure through the ability to roll back errors to a clean state.
  • Model and streamline any workflow. By open-sourcing our events and automations engine, we allow data teams to take action on any event, from a suspicious transaction to a network security alert.
  • Enable task autonomy across infrastructure with native execution on third-party distributed platforms like Ray and Dask, extending Prefect’s ability to run any function, anywhere, with right-sized resources.

In addition, we are announcing ControlFlow, our new open-source framework for building agentic workflows, built on top of Prefect 3.0. ControlFlow allows AI engineers to orchestrate and control AI agents across multiple tasks, with consistent history and context throughout.

The Relentless Challenge: A Growing Long Tail of Data Work

In our data-driven era, trust is the bedrock upon which every decision, insight, and product rests. While companies once fixated on ensuring the integrity of data warehouses and ETL pipelines, this focus has shifted as the warehouse has become increasingly commoditized. Today, business value is created in an ever-expanding long tail of custom, interconnected data workflows that span diverse use cases from personalized customer experiences, to fraud detection, to dynamic pricing.

This long tail of automation processes presents three key challenges. First, the proliferation of data work expands the surface area for failures and misalignments that can erode trust in automated systems, particularly in the era of AI. Second, the diversity of modern data processes requires flexible automation capable of reliably handling heterogeneous workloads, a capability existing tools lack. Third, the heterogeneous compute resources needed to run data processes have exploded, creating deep coordination friction that slows businesses.

To navigate this landscape, data teams need tooling that seamlessly blends three critical characteristics: resilience to withstand failure, flexibility to handle bespoke processes, and portability to ensure efficient execution on the right infrastructure. Achieving this seemingly impossible combination is key to building robust, trustworthy data flows that can power an organization’s competitive edge.

Prefect 3.0: Enabling the Resilient Enterprise

Since our inception, we’ve created significant workflow innovations to help users manage risk around moments of technological failure that erode trust. Prefect 1.0 introduced functional orchestration with a key focus on developer experience, and Prefect 2.0 introduced dynamic orchestration for data workflows, allowing them to adapt in real time to changing conditions.

With Prefect 3.0, we offer a framework for building workflows that are resilient by design. This isn’t just a collection of new features. It’s a new way of thinking about data workflows, built on three pillars to power the next generation of resilient workflows:

  • Transactional orchestration
  • Flexible orchestration
  • Portable orchestration

Transactional Orchestration

Prefect 3.0 brings transactional semantics to your Python workflows, allowing you to group tasks into atomic units and define failure modes. If any part of a transaction fails, the entire transaction can be rolled back to a clean state. This elevates failure handling to a first-class operation, letting you build resilience directly into your workflow code. Our approach to transactional orchestration also makes your workflows or tasks automatically idempotent: they can be rerun without duplication or inconsistency across any environment. Idempotency is game-changing for data pipelines because a failed run can be retried safely. With Prefect 3, idempotency comes standard, so transactional workflows are resilient and protect data integrity by default.
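
The two ideas in this section, rollback to a clean state and idempotent reruns, can be illustrated with a minimal plain-Python sketch. This is not Prefect's API: the names transaction, on_rollback, run_once, and the in-memory store are hypothetical stand-ins chosen for illustration, and a real system would persist its state rather than keep it in a dict.

```python
from contextlib import contextmanager


class Transaction:
    """Collects undo callbacks and runs them in reverse order on failure."""

    def __init__(self):
        self._undo = []

    def on_rollback(self, fn, *args):
        # Register how to undo a step that has already happened.
        self._undo.append((fn, args))

    def rollback(self):
        # Undo the most recent step first.
        while self._undo:
            fn, args = self._undo.pop()
            fn(*args)


@contextmanager
def transaction():
    txn = Transaction()
    try:
        yield txn
    except Exception:
        txn.rollback()  # any failure inside the block restores a clean state
        raise


store = {}          # hypothetical stand-in for external state (files, tables)
completed = set()   # hypothetical stand-in for a persistent result store


def write(name, value):
    store[name] = value


def delete(name):
    store.pop(name, None)


def run_once(key, fn, *args):
    """Idempotency by key: skip the call if this key already completed."""
    if key in completed:
        return False
    fn(*args)
    completed.add(key)
    return True


def pipeline(fail):
    with transaction() as txn:
        write("raw", [1, 2, 3])
        txn.on_rollback(delete, "raw")
        if fail:
            raise ValueError("validation failed")
        write("clean", [1, 2, 3])
        txn.on_rollback(delete, "clean")
```

Calling pipeline(fail=True) raises the error but leaves store empty, because the write of "raw" is undone. Calling run_once twice with the same key performs the work only the first time.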

Flexible Orchestration

Building on our commitment to modeling even the most complex enterprise workflows, we’re open-sourcing the event-driven engine that was previously only available in Prefect Cloud. This engine has been battle-tested by some of the largest companies in the world, and you can now build event-driven workflows and automations, which are a powerful way to build near-real-time systems. But we’re not stopping there. Prefect 3 is a fully multi-modal orchestrator, with native support for not just batch and event-driven workflows, but also for embedded or interactive workflows, human-in-the-loop situations, and background tasks. You can combine any of these capabilities to model your workflows in the most natural way possible.
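
As a rough illustration of the event-driven pattern described above, the following plain-Python sketch maps event names to actions that run when a matching event arrives. It does not use Prefect's events or automations engine; the Automations class, the on and emit methods, and the event name are all hypothetical.

```python
class Automations:
    """Minimal trigger registry: event name -> list of actions."""

    def __init__(self):
        self._triggers = {}

    def on(self, event_name):
        # Decorator that registers an action for a given event name.
        def register(action):
            self._triggers.setdefault(event_name, []).append(action)
            return action
        return register

    def emit(self, event_name, payload):
        # Run every action registered for this event and collect the results.
        return [action(payload) for action in self._triggers.get(event_name, [])]


automations = Automations()


@automations.on("transaction.suspicious")
def freeze_account(payload):
    return f"froze account {payload['account_id']}"
```

Emitting "transaction.suspicious" with an account_id runs freeze_account; emitting an event with no registered actions does nothing and returns an empty list.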

Portable Orchestration

We’ve redesigned the Prefect 3 engine to be completely portable. This extends our existing capabilities around task runners and work pools to support running any function, anywhere, whether that’s a laptop, a lambda, or even a legacy orchestrator. You don’t even need to set up a full workflow. Just decorate the function and call it, and you’ll get the full benefits of orchestration and observability. This is our third-generation orchestration engine, and it is the fastest and most scalable one we’ve ever built, in many cases reducing overhead by more than 90% compared to Prefect 2.
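
The "decorate the function and call it" pattern can be sketched in plain Python as follows. This is an illustration of the general idea only, not Prefect's engine: the observed decorator, the run_log list, and the state names are assumptions made for the example.

```python
import functools
import time

run_log = []  # hypothetical stand-in for an orchestration backend


def observed(fn):
    """Wrap a function so each call records its name, state, and duration."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        record = {"name": fn.__name__, "state": "Running"}
        run_log.append(record)
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
        except Exception:
            record["state"] = "Failed"
            raise
        else:
            record["state"] = "Completed"
            return result
        finally:
            # Runs on both success and failure.
            record["seconds"] = time.perf_counter() - start

    return wrapper


@observed
def add(a, b):
    return a + b
```

The decorated function is still called like any other Python function, for example add(1, 2), and each call leaves a record in run_log that a monitoring layer could read.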

ControlFlow: Take Control of your AI Agents

Prefect 3 offers a fundamental shift in how data engineers can build workflows with confidence. But there’s another force transforming our industry where trust is sorely needed – one with the potential to redefine how we work, build, and innovate. That force, of course, is AI.

That’s why we’re excited to announce our new, open-source framework: ControlFlow.

ControlFlow lets you build agentic LLM workflows that you can actually trust. At its core, it’s based on a very simple premise: AI agents are most effective when applied to small, well-defined tasks, and run off the rails otherwise. By splitting workflows into discrete tasks and composing them, we can get all the benefits of complex behavior without the risks of too much autonomy. This significantly mitigates hallucinations and unexpected behavior, while making it easier to debug, monitor, and of course, control your agents.

ControlFlow allows AI engineers to respond to failure programmatically, just like Prefect users have been doing for years. And because ControlFlow is built upon Prefect 3, you can make use of transactions to roll back an agent’s memory to a clean state when trouble occurs.

Get Started with Next Generation Workflows

Whether you’re a data engineer, an AI engineer, or an internal tools engineer, I hope you’ll join us in building the next generation of data workflows. Workflows that instill trust in the long tail of data work by enabling resilient code by design, flexible automation, and portable execution across any infrastructure.


Source: Jeremiah Lowin, Prefect

Datanami