Amazon S3 Dominance Spawns Storage Challengers
When it comes to storing data, Amazon’s Simple Storage Service (S3) is a runaway success. With trillions of files under management spanning many exabytes worth of data, S3 is the largest digital storage system in the known universe. But things rarely stay the same for long in the IT business, and S3’s outright dominance is spawning attacks from several angles.
The first line of attack comes from other object storage systems. Compared to traditional file systems, which suffer from scalability limitations, object storage systems such as S3 have no theoretical limit to the number of files or the amount of data they can hold. For data sets in excess of 1 petabyte, or when large unstructured files (dubbed binary large objects, or BLOBs) must be archived for a long period of time, a clustered object storage system is often the best and most affordable option on the market.
But increasingly, next-generation Web and mobile application developers have begun utilizing object stores as the primary data store, completely bypassing traditional databases and file systems. This phenomenon is driven largely by the success of Amazon S3. Instead of having a Windows or Linux application utilizing traditional file system protocols like NFS or SMB to access data via a network attached storage (NAS) or a storage area network (SAN) array, next-gen app developers are using Web services protocols like REST to transact on data stored in the cloud. That data almost uniformly exists in the AWS cloud as S3 buckets.
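To make the contrast concrete, here is a minimal Python sketch of what addressing an object over REST looks like, as opposed to opening a path on a mounted NFS or SMB share. The endpoint, bucket, and key are hypothetical, and the request is only constructed, not sent; a real S3 request would also need to be signed, which is left out here.

```python
from urllib.parse import quote
from urllib.request import Request


def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style object URL: <endpoint>/<bucket>/<key>."""
    # quote() keeps "/" in the key and percent-encodes characters such as spaces.
    return f"{endpoint.rstrip('/')}/{quote(bucket)}/{quote(key)}"


def build_put_request(endpoint: str, bucket: str, key: str, body: bytes) -> Request:
    """Prepare (but do not send) an HTTP PUT that would store `body` as an object."""
    return Request(object_url(endpoint, bucket, key), data=body, method="PUT")


# File-style access would instead look like:
#     open("/mnt/nas/reports/q3 final.csv", "wb").write(body)
# Object-style access names the same data by endpoint, bucket, and key.
req = build_put_request(
    "https://s3.example.com",   # hypothetical endpoint
    "my-bucket",                # hypothetical bucket
    "reports/q3 final.csv",     # hypothetical key
    b"region,revenue\n",
)
print(req.get_method(), req.full_url)
```

The point of the sketch is that the application never sees a mount point or a file handle, only HTTP verbs applied to a URL.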
While AWS is by far the dominant public cloud today, there’s a lot of innovation happening outside of Amazon, in places like Microsoft Azure, Google Cloud Storage, and Pivotal Cloud Foundry. Enterprises are also actively embracing the notion of hybrid infrastructure, whereby some applications and databases sit in the public cloud and some of them sit behind a firewall in a data center controlled by the company.
Throw in a good dose of next-generation virtualization and containerization using an array of technologies like Docker, Kubernetes, Mesosphere, and others, and you have the recipe for a major shift in enterprise IT infrastructure.
And herein lies the problem: While every cloud is embracing containers, these other clouds don’t support the S3 protocol. Instead, they are using their own object, or BLOB, storage systems and protocols.
It’s true that many third-party object storage system providers are adopting the S3 protocol because it’s become so popular and prevalent among next-gen app developers. Amazon is open about sharing the details of the S3 standard, even if it’s not necessarily open source. But you’re not going to find Microsoft or Google handing over the reins of innovation and control to Amazon and its S3 protocol (although some of the cloud providers do support some elements of the S3 syntax).
S3 for Azure and GCP
This gap in public cloud object storage is something that Minio founder and CEO Anand Babu (AB) Periasamy views as a market opportunity. Periasamy, who created the Gluster file system in the 1990s, is an unabashed supporter of S3, which he considers a work of simple elegance. This is why the Minio object storage system uses S3 as its underlying storage protocol.
“Amazon was smart about letting go of all the legacy baggage and focusing on what the important aspect of what a storage system has to do,” Periasamy tells Datanami. “When I saw that, I found no good reason to re-implement the wheel. It was simpler. It was sophisticated. They understood what the core problems were.”
What Minio adds to the equation is the capability to expand S3-compatible storage to any public cloud or on-premise storage system. By implementing Minio software-defined storage, either on the cloud or via on-premise clusters developed by partners, customers who developed cloud-based applications to run on AWS and use S3 for storage can now easily move them to other public cloud providers.
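A rough way to picture that portability: if the storage layer speaks the same S3-style protocol everywhere, moving an application can come down to changing the endpoint it points at. The sketch below is illustrative only; the class, the bucket name, and the on-premise host are hypothetical, and it builds URLs rather than talking to any real service.

```python
from dataclasses import dataclass, replace
from urllib.parse import quote


@dataclass(frozen=True)
class ObjectStoreConfig:
    """Hypothetical application setting for an S3-compatible store."""
    endpoint: str  # base URL of the S3-compatible service
    bucket: str

    def url_for(self, key: str) -> str:
        """Path-style URL for an object in this store."""
        return f"{self.endpoint.rstrip('/')}/{self.bucket}/{quote(key)}"


# The application as originally written against AWS.
aws = ObjectStoreConfig(endpoint="https://s3.amazonaws.com", bucket="app-data")

# The same application pointed at an assumed on-premise S3-compatible cluster.
# Only the endpoint changes; bucket names and object keys stay the same.
on_prem = replace(aws, endpoint="https://minio.internal.example:9000")

for store in (aws, on_prem):
    print(store.url_for("logs/2017/app.log"))
```

Everything above the endpoint, including the keys the application reads and writes, is left untouched, which is the property the S3-compatible vendors are selling.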
“When I talk to these customers, what I see is they are adopting Docker, Kubernetes, Cloud Foundry, or Mesosphere in their private cloud environment,” Periasamy says. “But they’re also talking about spinning out some of their workload into Azure or the Google cloud. They’re already using Amazon. Their applications are written for Amazon APIs, and they are now looking to [figure out] how can I bring that application back into our private cloud infrastructure or onto Azure.”
Minio yesterday announced a $20 million Series A funding round, which pairs nicely with solid growth in its Apache 2.0 licensed product. The company has seen more than 10 million downloads of its open source object storage system since January, making it one of the more active projects on GitHub this year.
When one vendor becomes too dominant, it usually opens up room for competing offerings, Periasamy says. “From our world view, what we saw was it was becoming to look at Android versus iOS, which is S3,” he says. “It was important in the product to embrace multiple layers outside of Amazon, and make them all look like standard locations.”
This is critical, he says, because of all the other innovation happening outside of AWS, such as Google’s BigQuery and Microsoft’s machine learning and artificial intelligence services on Azure.
S3-Like Scale with NFS
A second area where S3 dominance is giving rise to new storage regimes comes to us from Qumulo, which has developed a scale-out file system with object-like features.
Yesterday Qumulo (which was founded by former Isilon engineers) launched the second major release of its File Fabric. QF2, as it is called, adds several features that will help it be more like an object system, including support for running on AWS and a new continuous replication feature for on-prem and cloud clusters.
With these features added to its NFS- and SMB-loving file system, Qumulo says it has created a new product category, which it dubs a universal-scale file system.
Ben Gitenstein, Qumulo’s director of product management, says a universal-scale file system offers some of the same benefits found in an object storage system, including elastic scalability and support for cloud deployments, but does so without the drawbacks of object storage, such as the need to rewrite applications to use object protocols, namely Amazon’s S3.
“Our thesis is, what if you made file scale like object and offered the same hardware portability like object and run in the cloud like object, but it’s still file?” Gitenstein says. “That’s our fundamental thesis.”
Practically all on-premise applications were developed to support file systems, namely NFS and SMB (as well as CIFS, which is considered a version of SMB). While a new generation of cloud-native applications read and write data to storage using the S3 API, most existing applications developed to run on Windows, Mac, and Linux still speak SMB or NFS.
“There’s a lot of customers in the world who are actually file customers who are told they have to go buy object, and they’re told they have to go buy object for one of four reasons,” Gitenstein says. The first reason is that the customer needs tremendous scalability. Secondly, they need greater hardware portability. Thirdly, they want to run on the cloud. Lastly, they want to write and manipulate custom metadata, an area where object storage systems are generally considered preferable to file systems.
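The metadata point is the easiest of the four to illustrate. In S3, user-defined metadata travels with the object as HTTP headers carrying the `x-amz-meta-` prefix. The short Python sketch below shows how an application might turn its own tags into such headers; the helper name and the sample tags are hypothetical, and nothing is actually uploaded.

```python
from typing import Dict


def metadata_headers(metadata: Dict[str, str]) -> Dict[str, str]:
    """Map application-level tags to S3-style user-metadata headers."""
    # S3 stores user-defined metadata under headers prefixed with "x-amz-meta-".
    return {f"x-amz-meta-{name.lower()}": value for name, value in metadata.items()}


# Hypothetical tags for a seismic survey file.
headers = metadata_headers({"Project": "north-sea-survey", "Instrument": "streamer-7"})
print(headers)
```

A conventional file system has no direct equivalent of attaching arbitrary key-value pairs to every write, which is why metadata is one of the reasons customers are steered toward object storage.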
“Our argument, and what we proved with our customers, is that tradeoff is only true because existing file hasn’t innovated,” Gitenstein says. “Customers say, I have outgrown existing file system so I have to either break up my infrastructure in this terrible way, or I have to rewrite everything in object and move to object. Or I can use object with a file gateway on top, in which case I basically lose all the rich metadata of object and I don’t get the full file semantics because it’s just a gateway, it’s not file at the file level.”
With support for cloud deployments and geographic data replication in QF2, Qumulo thinks that customers no longer have to sacrifice the benefits of traditional file systems, which at this point include familiarity and stability, to get the benefits associated with object storage. “What’s different about us is we scale to the same breadth and depth as object does,” Gitenstein says. “But at its heart it’s an actual file system. It’s not an object system with a file gateway on top.”
Qumulo doesn’t think AWS customers are going to rip apart their existing EC2 and S3-based apps and move them to Qumulo’s file system. But for organizations in data-intensive industries such as entertainment, oil and gas exploration, and life sciences that are considering shifting to S3 to get object scalability, the availability of scalable NFS via Qumulo could prove to be a better and more cost-efficient solution.
“We don’t think of ourselves as competing with Amazon,” Gitenstein says. “The customers that we sell to all know and love file. They just want it to scale to the same level as S3 does, and give them the same access to cloud that S3 does.”