Paxata Announces Apache Spark-Powered Data Preparation Runtime Fabric with Native Kubernetes Support
REDWOOD CITY, Calif., Aug. 16, 2018 -- Paxata, a pioneer in self-service data preparation for analytics, has announced the general availability of its Fall ’18 release, the next major update to the company’s award-winning Adaptive Information Platform. The latest release includes a new Adaptive Workload Management capability, which delivers an elastic resource allocation service on a number of orchestration frameworks, including Microsoft Azure HDInsight, Kubernetes, and Apache Hadoop YARN. The new offering also enables dynamic scaling of large data prep workloads across ephemeral clusters to lower cost and improve performance.
“With this release and Paxata’s elastic resource allocation, we can dynamically scale our data preparation workloads, increase our visibility of current and historical data and improve our overall efficiency,” said Byron Hernandez, Data Analyst II, Cox Automotive. “This, in addition to having the flexibility to define our own interactive data volumes, will help us manage our enterprise requirements effectively.”
Building upon Paxata’s Spark-based data prep engine, which provides the industry’s fastest interactive experience, Paxata now offers the industry’s only data prep runtime fabric that can dynamically allocate, execute, and free up processing resources to dramatically reduce infrastructure and compute costs when running automated batch jobs. Paxata is the first data prep engine on Kubernetes. According to Gartner, Kubernetes is not an application platform in and of itself, as it relies on external components for enterprise-grade, production deployments.
The new Adaptive Workload Management feature also lets enterprises define their own interactive data volumes. Unlike other solutions that limit interactive data prep to small, fixed samples, Paxata is the only solution with the flexibility to size the interactive dataset per tenant, meaning organizations can align its use with specific enterprise use cases and requirements.
“Today, enterprise data preparation scenarios vary in scale, complexity, and urgency. Paxata’s Adaptive Workload Management feature, and our support for orchestration frameworks such as Kubernetes, provides our customers the flexibility to define their own resource cost-curves,” said Nenshad Bardoliwalla, Co-Founder and Chief Product Officer at Paxata. “We are staying true to our vision; yet, adapting to enterprise data volumes which gives our customers the freedom to choose the amount of data they want to carve out for interactive, real time scenarios vs. regular, repeated workloads that operate in batch. This flexibility is truly unique in the market and reinforces our paradigm of interactive data prep with the totality of data, while giving our customers a way to allocate compute resources that are best suited for their requirements.”
“Azure Kubernetes Service (AKS) has the unique ability to support high performance containerized workloads, elastic scaling and portability across diverse cloud environments, which makes it an ideal orchestrator to standardize on,” said John ‘JG’ Chirapurath, General Manager, Azure Data & Artificial Intelligence. “We believe Paxata’s move to support AKS for dynamically scaling large data prep workloads across shared, ephemeral clusters will result in customers benefiting from optimized performance, cost, and portability.”
To learn more about this new release, join “Introducing Paxata Fall ’18 Release” on Tuesday, August 28, 2018.
1. “Using Kubernetes to Orchestrate Container-Based Cloud and Microservices Applications,” Gartner, Inc., April 24, 2018.
At Paxata, we turn raw data into trustworthy information at the speed of thought. We provide an Adaptive Information Platform that equips business leaders and analysts with an enterprise-grade, self-service data preparation system for analytics, operations, and regulatory requirements. Business analysts work within an intuitive, visual application to access, explore, shape, collaborate on, and publish data with clicks, not code, with complete governance and security. IT is able to support the scale of data volumes and variety, enterprise and cloud data sources, and business scenarios for immediate and repeatable data service needs. Built on Apache Spark™ and optimized to run in hybrid, multi-cloud environments, Paxata leverages automated artificial intelligence, elastic cloud architecture and distributed computing to deliver an immersive business consumer experience that automates the data-to-insight pipeline.