Alluxio Reimagines Architecture for Multi-Tenant Environments at Scale
SAN MATEO, Calif., Nov. 16, 2022 -- Alluxio, the developer of the open source data orchestration platform for data-driven workloads such as large-scale analytics and AI/ML, today announced the immediate availability of version 2.9 of its Data Orchestration Platform. The new release strengthens Alluxio’s position as the key layer between compute engines and storage systems in three ways: support for a scale-out, multi-tenant architecture through a new cross-environment synchronization feature; enhanced manageability through significantly improved tooling and guidelines for deploying Alluxio on Kubernetes; and improved security and performance through a strengthened S3 API and POSIX API.
“We have been working with Alluxio on several key projects across our data platform,” said Nirav Chotai, Senior DevOps Manager, Rakuten Group. “Since our infrastructure is spread across regions, compute engines and storage types, we envision Alluxio will continue to play a critical role to help scale the platform further. We are excited to leverage the latest release with several improvements, especially the new Kubernetes operator for our multi-tenant environment.”
“We are running one thousand nodes of Alluxio to optimize model training jobs and interactive queries,” said Peng Chen, Engineer Manager in the Big Data Team, Tencent. “Alluxio has become the de-facto choice for large internet companies to accelerate the development of their data analytics and AI applications. We are excited about the enhanced Kubernetes feature of the new release, which will make managing Alluxio even easier.”
“We have been using Alluxio as the data cache layer on top of multiple data centers to speed up the data access performance,” said Luo Li, Director of Data Infrastructure, Shopee. “Alluxio’s architecture enables us to support data ‘servitization.’ Furthermore, Alluxio has reduced our data infrastructure team’s management overhead, especially for data distributed in multiple data centers, or even across countries.”
“Tenant-dedicated satellite clusters have become more common while architecting data platforms,” said Adit Madan, Director of Product Management, Alluxio. “Alluxio’s ability to actively synchronize metadata across multiple environments is significant, making the adoption of such an architecture easier than ever.”
Tenant isolation provides the scale and economic benefits of a multi-tenant architecture while rigorously preventing different teams from competing for access to shared data lake storage. With the new cross-environment synchronization feature, Alluxio’s architecture gains significantly improved scalability and manageability, enabling data platform teams to deploy multiple per-tenant Alluxio clusters between compute and storage clusters in any environment, based on workload capacity.
Running Alluxio on Kubernetes helps standardize deployment methodologies across cloud, multi-cloud, hybrid-cloud, and on-premises environments. This release introduces the Alluxio operator, which simplifies deploying, configuring, provisioning, and managing multiple Alluxio clusters, reducing DevOps complexity. Alluxio on Kubernetes also makes the data stack portable to any environment, preventing vendor lock-in.
Lastly, in Alluxio 2.9, authentication and access policies are centralized in the communication between compute engines and Alluxio through the S3 API. As a result, Alluxio provides a unified security experience across heterogeneous storage, whether on-premises or in the cloud.
“Alluxio’s data orchestration platform aims to simplify, secure, and accelerate data access in heterogeneous analytics environments,” said Kevin Petrie, VP of Research, Eckerson Group. “These v2.9 enhancements seek to give new analytics users, applications, and projects the resources they need, with less effort and higher confidence in meeting SLAs. Alluxio does this by helping enterprises manage metadata, containerized deployments, and the security of its APIs more effectively.”
Alluxio 2.9 Community and Enterprise Editions feature new capabilities, including:
Multi-Environment Cluster Synchronization
Alluxio 2.9 introduces a new cross-environment synchronization feature, which makes one Alluxio cluster aware of another by automatically syncing metadata between them. Alluxio clusters deployed across any environment can achieve tenant-level isolation while their metadata stays in sync at scale. This is particularly useful when adopting a satellite architecture, in which compute clusters are segregated across team-level tenants for isolation. With this feature, a multi-tenant architecture built on Alluxio allows the platform to scale out and onboard new use cases without a central resource bottleneck, helping to meet SLAs and simplifying metadata management operations.
Extended Manageability for Kubernetes
Alluxio 2.9 adds the Alluxio operator for Kubernetes. Administrators can now deploy and manage Alluxio on Kubernetes through the operator using custom resource definitions (CRDs). The operator offers configuration management for deployment, connections to under storage, configuration updates, and uninstallation. Using the Alluxio operator removes the burden of deploying Alluxio in different environments, greatly reduces manual work, and simplifies DevOps when managing multiple instances of Alluxio.
Enhanced S3 API Security with Better User Experience
Alluxio 2.9 further strengthens its S3 API, providing applications with a unified security model and a better user experience. By adopting the open authentication protocol for the S3 API, Alluxio verifies users before their requests are processed. This new feature allows data platform teams to connect to more advanced identity management systems (such as PingFederate) and to use single sign-on (SSO) to enhance the user experience. With a uniform authentication and authorization model, applications connected to Alluxio are portable across on-premises, hybrid, and multi-cloud environments.
Free downloads of Alluxio 2.9 open source Community Edition and trials of Alluxio Enterprise Edition are immediately available here: https://www.alluxio.io/download.
Proven at global web scale in production for modern data services, Alluxio is the developer of open source data orchestration software for the cloud. Alluxio moves data closer to data analytics and machine learning compute frameworks across clusters, regions, and clouds, providing memory-speed data access to files and objects. Intelligent data tiering and caching deliver greater performance and reliability to customers in financial services, high tech, retail, and telecommunications. Alluxio is in production use today at eight out of the top ten internet companies. Venture-backed by Andreessen Horowitz, Seven Seas Partners, Volcanic Ventures, and Hillhouse Capital, Alluxio was founded at UC Berkeley’s AMPLab by the creators of the Tachyon open source project. For more information, contact [email protected].