October 17, 2018

Analysts Get A Guiding Hand

(Pewara Nicropithak/Shutterstock)

It can be tough knowing what you can and cannot do in analytics. While it may be possible to build a query that considers a person’s race during a credit check, it is also illegal (not to mention unethical). Now data catalog provider Alation is stepping up by providing more automation to help users conduct analyses in a responsible manner.

Earlier this year Alation rolled out a new feature in its Compose query building environment called TrustCheck that gives users instant feedback on whether the query they’re writing is kosher or not. Datanami caught up with Alation Vice President of Marketing Stephanie McReynolds during the recent Strata Data Conference in New York City to get the lowdown on TrustCheck and how it works.

“What’s interesting about TrustCheck is we’re trying to capture the user when they’re developing the logic or the query and alert them to different data governance issues that might be embedded in that project,” McReynolds says. “As you’re writing a SQL query in Compose, it will highlight different statements in the SQL in either red, yellow, or green. If it’s red, you must be doing something that’s against the rules, or maybe the context of what you’re doing is against the rules.”

It’s better to catch users in the act of writing a questionable query rather than wait for the governance team to comb through the audit logs to track down the source of the violation, McReynolds says. Plus, the real-time feedback provides a way to educate users about what is okay and what is not, which is a more efficient way to modify behavior in the long run.

“More often, it’s not a security policy violation, but a data-usage policy violation, such as ‘Don’t join gender and income if you’re running a credit check,’” McReynolds says. “Or it might be a nuance of the data, so it might be in yellow. Did this ETL process run last night? If you’re using this for some real-time decision making, pause. It will be updated at this moment in time.”
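
To make the red, yellow, and green idea concrete, here is a minimal sketch in Python of how a check like this could work. It is not Alation's implementation or API: the `FORBIDDEN_COMBINATIONS` table, the `trust_check` function, and the purpose, column, and table names are all hypothetical, chosen to mirror the two examples McReynolds gives (a forbidden column combination and a table whose ETL job has not run).

```python
# Hypothetical usage policy: column combinations that may not be used
# together for a given purpose. Not taken from any real product.
FORBIDDEN_COMBINATIONS = {
    "credit_check": [{"gender", "income"}],
}


def trust_check(purpose, columns, tables, stale_tables):
    """Return a (light, reason) pair for a query described by its metadata.

    purpose      -- what the analysis is for, e.g. "credit_check"
    columns      -- column names the query references
    tables       -- table names the query reads
    stale_tables -- tables whose last refresh did not run (assumed input)
    """
    referenced = set(columns)

    # Red: the query combines columns that the policy forbids for this purpose.
    for combo in FORBIDDEN_COMBINATIONS.get(purpose, []):
        if combo <= referenced:
            return ("red", f"policy forbids combining {sorted(combo)} for {purpose}")

    # Yellow: nothing is forbidden, but some of the data may be out of date.
    stale = sorted(set(tables) & set(stale_tables))
    if stale:
        return ("yellow", f"tables not refreshed: {stale}")

    # Green: no known issue with this query.
    return ("green", "no known issues")


if __name__ == "__main__":
    print(trust_check("credit_check", ["gender", "income"], ["customers"], []))
    print(trust_check("marketing", ["income"], ["orders"], ["orders"]))
```

The red check runs before the yellow one, so a query that both breaks a policy and reads stale data is reported as a violation first. A real tool would presumably parse the SQL itself to find the columns and tables; here they are passed in directly to keep the sketch short.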

Most users who violate data usage policies are not bad actors, she adds. Most often, they violate policies because they don’t know any better or they’ve forgotten what the policies are. With TrustCheck, the company can provide a gentle reminder of what is okay and what is not.

Customers who are coding within Alation’s own Compose environment get a traffic light indicator on their queries. TrustCheck is also available in analytic tools from two third-party providers, Tableau and Salesforce (in its Einstein Analytics product), which have adopted Alation’s API to query the Alation data catalog for metadata about which data sets have been certified for specific uses.

In both of those products, Alation pushes the TrustCheck references directly into the user interface. In Tableau, a user will see a blue ribbon when they’re doing something good, whereas in Einstein Analytics users see a green check. The company is open to working with other BI vendors who want to adopt the API, McReynolds says.

The TrustCheck suggestions aren’t always about data policy violations. For example, a user could be warned that they’re about to run a join of two massive tables that could take hours to complete. There’s nothing unethical about running the join, but the systems administrator sure won’t appreciate her server being brought to a screeching halt.
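
A warning of this kind could be driven by nothing more than table statistics. The Python sketch below is an illustration under stated assumptions, not a description of how TrustCheck decides: the `join_warning` function, the row counts, and the one-billion-row threshold are all hypothetical. It uses the fact that a join of two tables can never return more rows than the product of their row counts, and flags the query when that worst case exceeds the threshold.

```python
def join_warning(left_rows, right_rows, max_rows=1_000_000_000):
    """Flag a join whose worst-case output size exceeds max_rows.

    left_rows, right_rows -- row counts of the two tables (assumed to come
                             from catalog statistics)
    max_rows              -- hypothetical threshold set by an administrator
    """
    # Upper bound on the join output: every left row matches every right row.
    worst_case = left_rows * right_rows
    if worst_case > max_rows:
        return "yellow"  # legal, but potentially very expensive
    return "green"


if __name__ == "__main__":
    print(join_warning(10_000_000, 5_000_000))
    print(join_warning(100, 200))
```

The worst-case bound is deliberately pessimistic. Most joins on a selective key return far fewer rows, so a production check would more likely rely on the database optimizer's own cost estimate.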

Alation is hoping to capitalize on TrustCheck as another benefit of investing in its data catalog software.

“I think we’re the first vendor to have this capability to interrupt the workflow and say ‘Hey look, did you realize you’re making this mistake?’” McReynolds says. “It’s one in a set of features that you’ll see us come out with in the next year or two years where we’re taking a different approach to data governance.”
