Immuta Introduces Apache Spark Ecosystem Support and Automated Governance Reporting for Data Science Programs
COLLEGE PARK, Md., April 10, 2018 -- Immuta has unveiled new features of its data management platform, including native Apache SparkSQL policy enforcement and automated governance reporting. These advancements in Immuta v2.1 empower organizations to process, secure and audit data at massive scale in Apache Spark, and give them greater visibility and control over how data is being used through integrated compliance reporting.
According to Gartner, by 2020, most data and analytics use cases will require connecting to distributed data sources, leading enterprises to double their investments in data access, policy enforcement and metadata management. Unfortunately, the traditional methods for operationalizing data require complex data engineering, manual policy enforcement, and labor-intensive reporting, all of which slow down data science teams and hamper innovation.
Immuta’s platform enables algorithm-driven enterprises to quickly connect to data with any tool, dynamically control data from any source, and fully comply with any regulation to enable fast, legal and ethical data science. This latest product release extends Immuta’s capabilities for the Spark ecosystem to enable massive-scale processing with native policy enforcement within Spark. These policies include dynamic row- and column-level controls, time windowing, minimization, purpose limitations, and automated differential privacy.
“The ability to proactively manage and apply policy controls to data will increasingly be a fundamental requirement for enterprise AI adoption, and up until now, no tool existed that would allow organizations to use Spark in highly regulated environments,” said Steve Touw, Chief Technology Officer at Immuta. “With the new SparkSQL policy enforcement within the Immuta platform, we are enabling organizations to significantly increase the speed and scale at which they process, secure and audit data. This is key to meeting compliance and regulatory requirements from the GDPR, HIPAA and other laws, and we’re excited to be at the forefront in this market.”
Key benefits of SparkSQL with Immuta include:
- Native SparkSQL policy enforcement occurs during raw file reads in Spark, allowing column and row controls typically limited to Hive or Impala tables to be enforced in Spark jobs;
- Access to data on-cluster and joins with data external to the cluster without the need to configure drivers in SparkSQL, making all data immediately available within Spark in a compliant manner;
- The scale and speed of SparkSQL combined with the policy enforcement, entitlements and auditing provided by Immuta.
“Regulated data access and dynamic policy enforcement are key enablers for Apache Spark users to build algorithms that are legal and ethical. Immuta’s new functionality is a significant step in expanding Spark’s role as a unified analytics engine for big data and AI,” said David Cook, Chief Information Security Officer at Databricks. “By making these innovations available for the Spark ecosystem, more enterprises can automate access to their data and manage the risk of cloud-based data science programs.”
Please watch this brief demo video to learn more about Immuta’s Spark support: https://vimeo.com/262488742.
Automated Governance Reporting
Immuta also announced Automated Governance Reports (AGR), a feature that allows Immuta “governors” or compliance officers to generate instantaneous records of all activities taking place throughout their data science workflows. Previously, lawyers and compliance personnel needed to undertake lengthy review processes to understand who was accessing what data, when they accessed it, and how. AGR allows compliance and legal teams to generate common reports based on the Immuta audit logs rather than having to rely on teams of IT personnel and database administrators to comb their own audit logs. Reports give governors clear audits of who is touching what data, how often, for what purpose, what policies are applied where, who has access to what data, and more.
Please watch this brief demo video to learn more about Immuta’s Automated Governance Reporting: https://vimeo.com/262488500.
Immuta is the fastest way for algorithm-driven enterprises to accelerate the development and control of machine learning and advanced analytics. The company’s hyperscale data management platform provides data scientists with rapid, personalized data access to dramatically improve the creation, deployment and auditability of machine learning and AI. Founded in 2014, Immuta is headquartered in College Park, Maryland. For more information, visit www.immuta.com and follow us on Twitter (twitter.com/ImmutaData) and LinkedIn (www.linkedin.com/company/