
October 28, 2020

The Object (Store) of Your Desire


As the data explosion continues to reverberate across the land, organizations are turning to object storage systems to keep it tidy and organized. Public clouds run the biggest and most popular object stores, but there are plenty of other options available for folks who want to store their own data on-prem and avoid the cost and lock-in associated with public clouds.

Unstructured data is growing at a rate of 30% to 60% annually, which in turn is driving major growth in object storage adoption, according to IT industry analyst Gartner, which published its latest Magic Quadrant for Distributed File Systems and Object Storage two weeks ago. (The analyst group says the two product categories are in the process of becoming one, with “universal storage” that can behave as either object or file storage.)

Gartner shared some predictions about the growth of object stores in its report. For instance, between now and 2024, the amount of unstructured data stored on object storage systems or distributed file systems will triple, Gartner says. The percentage of leaders saying they will deploy hybrid cloud storage will increase from 15% today to about 40% in four years, it found, and half of the world’s unstructured data will be in a software-defined storage system, compared to less than 20% today.
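A quick back-of-the-envelope check shows how the tripling forecast relates to the 30% to 60% annual growth range cited above. The sketch below assumes simple annual compounding over a four-year window (2020 to 2024); that assumption, and the function names, are illustrative and not taken from the Gartner report.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Total growth factor after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


def annual_rate_for(multiple: float, years: int) -> float:
    """Constant annual growth rate needed to reach `multiple` in `years`."""
    return multiple ** (1.0 / years) - 1.0


YEARS = 4  # assumed window: 2020 through 2024

# Tripling in four years implies roughly 31.6% growth per year.
print(f"rate needed to triple: {annual_rate_for(3.0, YEARS):.1%}")

# The cited 30%-60% range compounds to roughly 2.9x and 6.6x over four years.
for rate in (0.30, 0.60):
    print(f"{rate:.0%} per year -> {growth_multiple(rate, YEARS):.2f}x")
```

Under these assumptions, tripling sits near the low end of the cited growth range, so the two figures are at least roughly consistent with each other.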

Applications increasingly are being designed from the beginning to pull data from object stores and distributed file systems, usually over a RESTful interface, such as Amazon’s Simple Storage Service (S3), which has become the de facto standard for both types of storage systems. The standardization on the S3 API has helped grease the wheels for the massive shift in storage to object stores. Other factors in object storage’s favor are nearly infinite scalability, low cost, and high availability, thanks to erasure coding. About the only downside is rudimentary performance.
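For readers unfamiliar with erasure coding, the idea is to split an object into fragments and store extra parity fragments, so that the object survives the loss of a drive or node. The following is a minimal sketch using a single XOR parity fragment, which can repair exactly one lost fragment. Production object stores typically use stronger schemes (such as Reed-Solomon codes) that tolerate several simultaneous losses; the function names here are hypothetical and do not reflect any vendor's implementation.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # zero-pad so fragments align
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    return frags + [reduce(xor_bytes, frags)]


def recover(fragments: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None) by XOR-ing the survivors."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def decode(fragments: List[Optional[bytes]], original_len: int) -> bytes:
    """Reassemble the original data, dropping the parity fragment and padding."""
    full = recover(fragments)
    return b"".join(full[:-1])[:original_len]


payload = b"unstructured data in an object store"
stored = encode(payload, k=4)  # 4 data fragments + 1 parity fragment
stored[2] = None               # simulate losing one drive or node
assert decode(stored, len(payload)) == payload
```

With four data fragments and one parity fragment, this layout uses 25% extra capacity, compared with 200% for keeping three full replicas, which is one reason erasure coding is associated with both availability and low cost.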

When companies have tens of petabytes to store, object storage and distributed file systems are the leading contenders for the job. The question for organizations, then, is no longer whether they should store their massive collection of unstructured data in an object storage system (or a distributed file system), but rather, which one should they choose.

On-Prem or Cloud

Whether to use public cloud or stay on prem is one of the biggest factors that goes into selecting a new big data repository, according to Mark Pastor, the director of product and solution marketing at Quantum, which sells the ActiveScale object storage system.


“Educating people on what object storage…clearly was the task five, six, seven years ago,” he tells Datanami. “There’s still some need for that. But I think the question now is, do I buy on-premise or do I take advantage of some of the public clouds?”

Amazon S3 and Microsoft Azure Data Lake Storage (ADLS) are the leading cloud object stores, with Google Storage providing an S3-compatible interface to a cloud object store as well. If customers use any of the applications or data services in these public clouds, their data invariably will be stored in one of these repositories (ADLS is the only one not to be fully S3 compatible).

As companies have become more familiar with these cloud object stores, they have begun to look for other object stores they can use for on premise workloads, Pastor says. Not every company wants to run in the cloud.

“I do encounter a lot of customers who are very motivated to maintain an on-premise infrastructure for a lot of their workflows,” he says. “Because of the hype of public cloud, some people try it and some people continue to like it. But some people might say there’s more cost associated with that than I thought there was, or I’ve decided I’m going to keep my data longer and need to define a strategy on how to bring it back and keep it on premise, just for financial purposes a lot of times. It doesn’t feel like the market is going to go 100% one way or the other.”

Of course, customers can run the object store of their choice on AWS, Azure, or Google Cloud, and some vendors have partnered with the cloud providers to offer hosted data storage services to reduce the amount of work for the client.

But not all object stores and distributed file systems are alike. While most of them support an S3 API, there are differences among them, and companies should study them to determine which storage platform is the best match for their workload at hand. In some cases, a single storage system can exhibit both file and object personalities, which enables multiple applications to interact with the storage repository using the most appropriate protocol for the task at hand.

In any event, there are some familiar faces in the latest Gartner Magic Quadrant.

Object and File Leaders

Dell Technologies owns the top spot, thanks to its merger with EMC and its longstanding leadership position in this space with the Isilon file system, which is now called Dell PowerScale, and Dell EMC ECS.

Gartner Magic Quadrant for Distributed File Systems and Object Storage 2020 (Image courtesy Gartner)

IBM is also in the leader’s quadrant, thanks to its Spectrum Storage offering (which runs on-prem and in the cloud), as well as its IBM Cloud Object Storage (COS) offering, which is based on its $1.3 billion acquisition of Cleversafe in 2015. Gartner likes IBM’s move to support Spectrum Scale running in Red Hat OpenShift containers (really, who doesn’t?), but cautions about the lack of options for running COS on anything but IBM’s own cloud.

Qumulo seems to be gaining traction with its Qumulo File System, which can run on-prem and in public clouds, according to Gartner. Recent highlights are support for NVMe hardware, as well as the SMBv3 protocol, according to Gartner, which sees Qumulo used mostly in commercial HPC, analytics, and hybrid cloud storage.

Rounding out the leader’s quadrant is Scality, whose RING offering functions as an integrated storage solution that combines file and object paths, while supporting public clouds and on prem deployments. “Scality is best suited for multipetabyte geographical deployments of unstructured data for content distribution, media, backup and archiving requiring an SDS solution,” Gartner notes.

Object and File Visionaries

In the Visionaries quadrant is Pure Storage, whose FlashBlade offering combines file and object storage paths. Gartner likes Pure Storage’s approach to scaling storage and performance by simply adding blades. FlashBlade is best suited for commercial HPC, analytics and backup where recovery time objective (RTO) performance is critical, Gartner says.

NetApp is another Visionary with its StorageGRID offering, which is an object storage system that’s available as hardware or as software and can be deployed on-premise or in the Azure and AWS clouds (no GCP, apparently). While StorageGRID offers an NFS connector (providing file system access), customers have complained about its performance, according to Gartner, which says NetApp is best suited for cloud storage, archiving, backup, and hybrid cloud.

Red Hat made the cut for Visionary in Gartner’s quadrant, even though it’s now part of IBM. Ceph Storage, which is part of Red Hat OpenShift Container Storage (OCS), provides block, object, and file storage and is a good pick for content delivery and hybrid cloud, while Gluster is best used for archiving, backup, home directories, and serving rich media.

Object and File Challengers

Object storage systems increasingly are running on Flash and NVMe storage

Gartner identified four vendors for the Challengers quadrant. Hitachi Vantara sells the Hitachi Content Platform (HCP), which is an object store that is sold either as an appliance or as software that can be deployed on-prem or the public cloud (or in a hybrid configuration). Analytics, cloud storage, and backup and archive are the best use cases for HCP.

Nearby is Cloudian, which develops an object storage system called HyperStore that runs on-prem and in clouds. The offering is designed for high-throughput object workloads, although there is a file system add-on called HyperFile. Gartner says HyperStore is best suited for analytics, archiving, and backup, as well as cloud and hybrid cloud storage.

Quantum, which we mentioned above, is also on this Magic Quadrant by virtue of its March 2020 deal with Western Digital to acquire the ActiveScale product (which was based on Amplidata). ActiveScale is designed for high-volume workloads, and is widely used for scientific, medical research, and media and entertainment workloads. Gartner likes that Quantum has plans to integrate ActiveScale with its StorNext File System and backup portfolio.

Huawei rounds out the Challengers section with two products, including the OceanStor 9000 V5 file system and its block and object storage offering, OceanStor 100D (formerly FusionStorage). The Chinese tech giant’s object storage system is best suited for private cloud, content distribution and archiving, Gartner says, while its file system is widely used for video surveillance, commercial HPC, and rich-media distribution.

Object and File Niche Players

Another Chinese firm, Inspur, is the top contender in the Niche Players quadrant. Inspur sells a unified file and object storage solution called AS13000G5, which Gartner says is best suited for backup and archiving, commercial HPC, hybrid cloud, and analytic workloads.

DDN is listed here thanks to EXAScaler, a distributed file system based on Lustre that is used for HPC and analytics use cases. Gartner likes the fact that DDN has bolstered EXAScaler with support for NFS, SMB, and S3 protocols. DDN also sells WOS (Web Object Scaler), its object software system, but only to existing customers, Gartner says.

Having hybrid cloud storage is a common desire among enterprises (via Shutterstock)

Last but not least is Caringo, developer of Swarm, an object store that also supports file interfaces. Gartner says Caringo is best suited for backup and archive, commercial HPC, and analytics.

There were several honorable mentions in Gartner’s list, including Cohesity, MinIO, Nutanix, VAST, and WekaIO. In some cases, one of these solutions might be a better fit, depending on what the client is looking to accomplish.

Nutanix offers object and file storage with AOS, its distributed storage offering. MinIO is an up-and-coming open source object storage system that was created by the co-founder of GlusterFS. Cohesity, which was founded by one of the co-founders of Nutanix, combines NFS, SMB, and S3 access with its Helios system. VAST specializes in speed with its all-Flash file and object store. WekaIO, meanwhile, chases ultra-low latency use cases with its parallel file system, which is often used in HPC.

The market for object and distributed file systems is as vibrant as it ever was. Like kids in a candy store, buyers have abundant options to choose from, which is great news, even if it makes shopping for new object or file systems a bit harder.

