October 10, 2023

Immuta Expands Hooks Into Starburst for Better Data Mesh Governance


Starburst and Trino users will have an easier time hammering their distributed data governance policies into place thanks to a deeper set of data mesh integrations between the open data analytics engines and Immuta, the data security platform provider.

Data governance rarely plays a starring role when organizations begin adopting a new technology, such as Starburst or Trino, the open source query engine that Starburst helps develop. But if the new analytic tool is successful and usage begins to grow, the need for data governance can quickly become apparent.

“Generally, when you start, it’s a little bit unclear. They just want to get things off the ground,” says Moritz Plassnig, chief product officer for Immuta. “But when you hit that scale issue, you really start to think through, what are the different domains? Who can bring in data? Who should be allowed to control what happens with the data? Who are those individuals?”

In an ideal world, organizations would start new data projects with a foundation in mind. Perhaps they would take to heart the four pillars of the data mesh, and adopt core data building blocks, such as data governance tools. But the reality is that few organizations do that, which means that data mesh concepts and the tools that enable them, such as data governance tools, are often shimmied into the stack after the fact.

Today’s announcement from Immuta makes that after-the-fact shimmying a bit easier. The company has eliminated a significant amount of work previously required to hook its ABAC capabilities into enterprise Starburst environments as well as data lakes using the open source Trino query engine.

According to Plassnig, the integration eliminates some of the tedious change management tasks that users previously had to do to enable Immuta to govern what data sets Starburst and Trino users can access and what they can do with that data.

“It was a little bit more difficult [previously],” the Immuta CPO tells Datanami. “You had to go through a lot of change management and repoint queries and a lot of things to basically make it work with Immuta, because they started without Immuta. Now it just seamlessly works with Starburst, so you don’t have to do any change management. You basically take over the access controls and then continue with Immuta. That ultimately removes friction and improves time to value for the customer.”

The advantages of using Immuta in a data mesh remain the same. By defining access control policies centrally in Immuta, users eliminate the need to define and maintain those policies in the analytics engines themselves. That’s particularly important for customers that have large and distributed teams of data analysts and that value open analytic environments, such as Starburst and Trino, which are query engines that don’t have storage hardwired to them. Customers that use Starburst and Trino often use many other open source query engines against their data too, such as Spark, Presto, Dremio, Flink, and others.

“What’s unique about Starburst or Trino is, if you [build a data mesh] in Snowflake or Databricks or Redshift or BigQuery or any of those technologies, you do it with those technologies,” Plassnig says. “Starburst gives you a little bit more flexibility because you can build the data mesh across all of the technologies I just mentioned, and many more. So with Starburst you could build a data mesh where you use some data from Snowflake, some from Databricks, and then maybe you still have some data in some traditional on-premises legacy data stores. That’s where Starburst and Trino can really shine because they’re ultimately a query engine that can take data and make data available from all those different places.”


Immuta’s attribute-based access control (ABAC) approach provides much more flexibility than traditional role-based access control (RBAC) environments, primarily because each attribute-based policy can cover many more users, Plassnig says. That makes it significantly easier for the data governance environment to scale as usage grows, both in the number of users and in the number of distributed data sets that users are accessing.

“You might need one policy or five policies [with ABAC] instead of 50 policies in the past, so you have much less ongoing maintenance,” he says. “That is ultimately better for you, because it also helps you understand better who can do what with the data. If you have thousands of policies, at some point it’s impossible for a human to understand how is this specific individual governed? You have to look through all the policies. It makes it impossible.”
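The policy-count difference Plassnig describes can be illustrated with a toy model. The Python sketch below is not Immuta's API or policy language; every name in it (`User`, `Dataset`, `abac_allows`, `rbac_grants`, the `department` and `region` attributes) is a hypothetical chosen for illustration. It assumes access should be granted when a user's department and region match the tags on a data set, and compares one attribute-based rule against the per-combination role grants an RBAC setup would need for the same outcome.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    # Hypothetical user attributes, e.g. {"department": "finance", "region": "eu"}
    attributes: dict = field(default_factory=dict)


@dataclass
class Dataset:
    name: str
    # Hypothetical data set tags using the same keys as the user attributes
    tags: dict = field(default_factory=dict)


def abac_allows(user: User, dataset: Dataset) -> bool:
    """A single attribute-based rule: department and region must both match."""
    return all(
        user.attributes.get(key) == dataset.tags.get(key)
        for key in ("department", "region")
    )


departments = ["finance", "marketing", "sales"]
regions = ["us", "eu"]

# One data set per department/region combination (illustrative data only).
datasets = [
    Dataset(f"{d}_{r}", {"department": d, "region": r})
    for d in departments
    for r in regions
]

# RBAC equivalent: one role, with its own grant, per combination.
rbac_grants = {
    f"{d}_{r}_reader": {f"{d}_{r}"}
    for d in departments
    for r in regions
}

alice = User("alice", {"department": "finance", "region": "eu"})
readable = [ds.name for ds in datasets if abac_allows(alice, ds)]

print(f"RBAC grants to maintain: {len(rbac_grants)}")
print(f"ABAC rules to maintain: 1, alice can read: {readable}")
```

In this toy setup, adding a fourth department or a third region adds more role grants on the RBAC side, while the single attribute-based rule stays unchanged, which is the scaling behavior the quote points to.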

Immuta is also making it easier to replicate the data domains that naturally exist in customers’ Starburst and Trino environments into the Immuta tool itself, which will help them construct data domains as part of their data mesh, and ultimately deliver self-service access to data.

“There’s a capability inside of Immuta that is not specific to Starburst but is very important whenever you build a data mesh, where you can basically replicate those data domains that you have in your company inside Immuta,” Plassnig says. “Then you can assign data to different domains. You can have different data owners in all the different domains. That makes it more applicable…and ultimately helps companies wrap their heads around how to do a data mesh.”
