November 21, 2023

Expanso Raises $7.5M to Transform Big Data Landscape with Distributed Data Approach

SEATTLE, Nov. 21, 2023 — Expanso, a startup built to help enterprises manage their ever growing data needs with a distributed approach to big data processing powered by its open-source software Bacalhau, has raised $7.5 million in seed funding led by General Catalyst and Hetz Ventures, along with Array Ventures.

Based in Seattle, Expanso was co-founded by alumni of Google, AWS, and Microsoft. The company will focus on open-source solutions for enterprises, addressing what CEO David Aronchick believes is an enormous but overlooked challenge: “Actually making use of enterprise data.”

Distributed big data processing can be complex and challenging. One of the biggest challenges is the time and cost of transferring data from distributed nodes to a centralized data lake, which makes it difficult to respond to new data inflows in real time. Further, many platforms, while powerful, require converting existing code to new frameworks just to access the data, let alone extract insights. Distributed big data processing systems are also often exposed to security and compliance risks, such as leaks of personally identifiable information (PII), regulatory violations, and data breaches.

The open-source software Bacalhau, developed and backed by Expanso, is built on the principle of “Compute Over Data,” which means that it brings the processing jobs to where the data is, rather than moving the data to the cloud first. This has a number of advantages, including:

  • Reduced costs: Moving large amounts of data to and from the cloud is expensive.
  • Enhanced speed: Bacalhau processes data locally, removing cloud transfer latency and boosting performance for data-heavy applications.
  • Increased security: Not moving the data reduces the risk of data breaches and other security incidents.
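
As a rough illustration of the “Compute Over Data” idea, the Python sketch below compares how many bytes cross the network when raw data is centralized versus when a small job runs at each node and only its summary is returned. It is not Bacalhau's API: the node names, the synthetic readings, and the helper functions (`make_node_data`, `summarize`, `payload_bytes`) are all hypothetical, and transfer size is approximated by the length of a JSON encoding.

```python
import json
import random


def make_node_data(seed, n=10_000):
    """Generate synthetic readings held locally by one node (hypothetical data)."""
    rng = random.Random(seed)
    return [rng.gauss(20.0, 5.0) for _ in range(n)]


def summarize(readings):
    """The 'job' sent to each node: reduce raw readings to a small summary."""
    return {"count": len(readings), "sum": sum(readings), "max": max(readings)}


def payload_bytes(obj):
    """Approximate transfer size as the length of the JSON encoding."""
    return len(json.dumps(obj).encode("utf-8"))


# Three hypothetical nodes, each holding its own data.
nodes = {f"node-{i}": make_node_data(seed=i) for i in range(3)}

# Centralized approach: every raw reading crosses the network.
centralized_bytes = sum(payload_bytes(data) for data in nodes.values())

# Compute-over-data approach: only per-node summaries cross the network.
summaries = {name: summarize(data) for name, data in nodes.items()}
distributed_bytes = sum(payload_bytes(s) for s in summaries.values())

# The small summaries are enough to compute global statistics centrally.
total_count = sum(s["count"] for s in summaries.values())
global_mean = sum(s["sum"] for s in summaries.values()) / total_count
global_max = max(s["max"] for s in summaries.values())

print(f"centralized transfer: {centralized_bytes} bytes")
print(f"summaries transfer:   {distributed_bytes} bytes")
print(f"global mean: {global_mean:.3f}, global max: {global_max:.3f}")
```

The sketch works only because the count, sum, and maximum can be combined from partial results; jobs that need every raw record in one place would not see the same reduction in transfer.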

Further, with Bacalhau, users can streamline their existing workflows without extensive rewriting by running arbitrary Docker containers and WebAssembly (WASM) images as tasks. The software can run on-premises or in any cloud, including Amazon Web Services (AWS), Microsoft Azure, Google Cloud, Oracle Cloud, and many more.

“Infrastructure built to meet data where it is, even if distributed around the world, is long overdue. What Expanso is building with Bacalhau is intended to revolutionize the way big data is processed and global compute jobs are executed, while unlocking an entirely new class of applications,” Expanso CEO David Aronchick said. “We’re excited to partner with General Catalyst, Hetz Ventures, and Array Ventures and use this funding to accelerate the development of Bacalhau and Expanso, and bring it to even more users.”

“Expanso brings compute to the data, enabling businesses to operate securely at their operational pace and maximize the utility of valuable data. In less than a year, Dave and his team of exceptional technologists and entrepreneurs, have achieved significant milestones, with the platform now in use with various sectors, including some of the world’s largest defense organizations. We are proud to support Expanso as they work to enhance the impact of distributed data for businesses worldwide,” said Quentin Clark, Managing Director of General Catalyst.

Developers can use the tools they already know and enjoy, like Python, R, and DuckDB, with almost no changes. Nearly anything that can be containerized can run on the network. “A missing part of the modern data stack is the ability to process data where it is being created rather than have to centralize everything first,” said Jordan Tigani, CEO and co-founder of MotherDuck. “Bacalhau fills in that missing link, allowing large numbers of remote workers to use DuckDB to filter, summarize, and transform data at the edge before communicating results to MotherDuck in the cloud.”

Bacalhau offers a free demo network that has been live for nearly six months. Since launching, the network has handled more than 1.5 million jobs for design partners such as the University of Maryland, BOINC, New Atlantis Foundation, and many more.

Bacalhau is available today as open-source software, and its source code is hosted in a public GitHub repository.

About Expanso

Expanso, a software company building upon their open-source software Bacalhau, offers a unique platform for fast, affordable, and secure computation. The Company orchestrates jobs to run where the data is generated and stored, eliminating costly and risky data transfers and storage. Expanso’s ‘Compute Over Data’ model keeps data stationary, simplifies management, and speeds up processing, enabling better focus on analytics and data science while reducing expenses and enhancing data security.

Source: Expanso