March 31, 2023

Meet Maxime Beauchemin, a 2023 Person to Watch

Alex Woodie

When it comes to prolific contributors to open source projects in the big data space, Maxime Beauchemin is definitely somebody you should know. As a data engineer at Airbnb, Beauchemin created multiple tools that he subsequently released to the world, including Apache Airflow, the popular data pipeline creation and management tool, and Apache Superset, which provides BI and analytics capabilities. He is also the founder and CEO of Preset, the commercial entity behind Superset.

We recently caught up with Beauchemin, who we named a Person to Watch for 2023.

Datanami: You’ve created two successful open source projects, Apache Superset and Apache Airflow. What do you attribute the success to? What made them successful?

Maxime Beauchemin: Most people are familiar with the idea of “product market fit” (PMF), a term coined by Marc Andreessen more than 15 years ago, and I like to think of a proxy for it in open source that I’d call “project community fit” (PCF). So it’s not just about the quality of the project, or how much you invest into it, it’s about building the right thing at the right time for the right people, and riding the momentum. I think reading about PMF and doing the mind exercise to translate the ideas to an open source project is fairly straightforward and informs finding PCF fairly well. The dynamics aren’t identical but they’re similar. If anything, open source has better network effects (because it’s free by definition, and welcomes contributions) and snowballs better than a product in a market.

In any case, the ideas behind PMF were foreign to me when I started both projects at Airbnb back in 2014/2016. I just wanted to build something that was going to be useful at Airbnb, and put it out there just in case someone outside of Airbnb might be interested in picking it up and collaborating or even just using it. My thinking was “if I’m building something for Airbnb that’s not a competitive advantage, why limit my impact to Airbnb?” Looking back, I think what worked for me was to build with passion, and to engage as directly as possible with anyone showing any kind of interest, whether it’d be on GitHub, email, Slack, or looking for conversation. For a long time, I honored and handled every single touch point. I also went beyond just writing software and did a lot of things that I’d now call “product marketing”: finding good names for the project, doing some decent messaging/positioning, building half-decent websites with nice screenshots, maintaining decent docs, …

Both projects hit a point where I couldn’t keep up. From that point on, the projects have a life of their own. That’s OSS “escape velocity.” Feels great to reach this point!

Datanami: Do you think data engineering gets the respect it deserves? Why does it seem perpetually overlooked in the data space?

Beauchemin: The world isn’t always a fair place, but I think generally things (people, ideas, concepts, projects) tend to get the respect they deserve over time. In many ways, historically, data engineering (maybe thinking about the pre-pipeline-as-code era, call it the drag-and-drop ETL days) didn’t show a lot of self-respect either, especially when measured from the perspective of software engineering.

Arguably, data engineering didn’t come into being until the mid-2010s, tried to catch up on and integrate software engineering practices, and while doing so missed out on the DevOps movement, only to try to catch up on some of that over the past five years or so through the lagging DataOps movement. I think the gap in respect is reasonable when measured against software engineering practices, but is that fair!? We don’t measure other functions by the standard of SWE practices.

In the end, respect should be based on business impact, not solely on code/PDLC rigor and maturity. On the impact front, there are some real problems too. I talk about it in an article titled “The Downfall of the Data Engineer,” and some of these problems are preventing data engineering from delivering more impact and getting respect from the organization as a whole.

Datanami: Is it getting easier or harder to be a data engineer in 2023?

Beauchemin: Clearly easier. The role is better defined, the stack/tooling has evolved, best practices are increasingly well defined, and expectations around the role are clearer than ever before. Oh, and the modern data stack is amazing: you can get started in minutes, get a world-class, scale-to-infinity cloud data warehouse set up in minutes, set up Apache Superset instantly on top of it using Preset, do data integration with Airbyte or Fivetran without a hitch, set up Airflow through Astronomer, dbt Cloud. All this infrastructure is at your fingertips, pay-as-you-go and frankly amazing! The pool of articles and resources around best practices is only increasing too, communities exist now, … So much easier than it used to be.

Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?

Beauchemin: I’m a huge snowboarder. I grew up riding 50 days a year in the Quebec City scene in the ’90s, and recently moved to Tahoe to be able to get back into riding regularly. Before the move, going to ride from the Bay Area while having three young kids was very difficult, so I didn’t ride much for the past decade. But now I’m back on the mountain! Oh, and the kids are getting good now, so we often ride together!

You can read the rest of the interviews with the 2023 class of Big Data Wire’s People to Watch here.

Applications: Data Management

Technologies: Middleware

Vendors: Airbnb, Preset

Tags: 2023 People to Watch, Airbnb, Apache Airflow, Apache Superset, Big Data Wire People to Watch, Data engineering, data pipeline, Maxime Beauchemin, open source, product market fit, project community fit
