October 4, 2023

Amazon DataZone Goes GA

AWS today announced that DataZone, the data management service it unveiled a year ago to streamline access to trusted data, is now open for business. The company has added several new features designed to improve the user experience.

More than a decade into the big data boom, data management remains a challenge for many companies seeking to get the most out of their data. The advent of artificial intelligence systems, including GenAI, is exacerbating poor data management practices, thereby putting pressure on vendors like AWS to come up with easier-to-use data management solutions.

DataZone is AWS’s latest attempt to thread that needle and create a place where data producers and data consumers can come together. The offering, which it first unveiled at re:Invent last year, is composed of four layers: a data portal, a data catalog, data projects and environments, and a governance and access control layer.

The data portal is a Web application where different users can go to catalog, discover, govern, share, and analyze data in a self-service fashion, AWS says on its blog today. Users gain access to the portal by authenticating themselves through AWS Identity and Access Management (IAM) or a third-party IAM tool that can be integrated through AWS’s IAM Identity Center.

Next to the portal, DataZone users will find a business data catalog, which is where data producers define their data taxonomy, or terms, in a language that is specific to their business. The catalog is also where data consumers go to search for interesting data they may want to use in their analysis.

Users work within DataZone through data projects or data environments, which are where people, data assets, and analytics tools come together to satisfy a use case. In addition to assigning AWS infrastructure to the AWS tools, projects also provide a place where users can collaborate, exchange data, and share data assets.

Finally, administrators can stay in control of DataZone activities through the governance and access control layer. This is where the administrators grant or deny data access requests, also called subscription requests. This access control layer works in concert with the permissions set in underlying data stores, such as Amazon Redshift or AWS Lake Formation.

For the GA release, AWS made several improvements. For starters, it added the capability to customize the metadata generation using machine learning, as well as the option to link multiple glossary terms to individual columns. Users can also now bring any type of asset to the catalog using APIs. AWS also bolstered the data projects feature by enabling users to add multiple capabilities and analytics tools to the same project. Users can also now attach data subscription terms to assets when they are published, streamlining access for Amazon Redshift and AWS data lake users via Amazon EventBridge.

DataZone is generally available in 11 AWS Regions in nine countries. Customers can get a free trial for DataZone that supports 50 users for three months. For more info, see aws.amazon.com/datazone/.
