Seeking Ideal Clouds for ML Workloads
As machine learning increasingly works its way into the world, technologists must find ways to integrate the new applications and incorporate the new workloads with existing systems. Many agree that the cloud is a good place for running big-data style workloads, but finding the right cloud for your particular workload may be harder than you originally envisioned.
Machine learning isn’t new, but as the heart of predictive analytics, it’s experiencing a renaissance in the age of big data. According to a recent Evans Data survey, more than one third of big data developers are using ML algorithms to crunch through massive volumes of data in search of patterns that are invisible to the human eye, with adoption strongest in financial services, manufacturing, and IoT use cases.
But as ML makes inroads, technologists are discovering that actually running ML workloads is not as straightforward as it may first appear. For starters, ML applications typically have two distinct phases: training the models on historical data, then using the models to “score,” or process, new data. The actual compute workloads that originate from these two phases are different, and that’s one of the reasons why so many companies are turning to the cloud, where capacity can be resized between phases, to run them.
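The two phases can be sketched in a few lines of plain Python (this is an illustrative example, not any particular vendor's code): training is a batch computation over the full historical dataset, while scoring is a cheap per-record call, which is why the two generate such different compute footprints.

```python
# Minimal sketch of the two ML phases: a batch-heavy training pass over
# historical data, then lightweight scoring of new records.

def train(points):
    """Training phase: fit y = a*x + b by ordinary least squares over
    the whole historical batch (CPU/RAM-intensive at real data scale)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def score(model, x):
    """Scoring phase: apply the fitted model to one new record
    (cheap per call, but often high-throughput in production)."""
    a, b = model
    return a * x + b

# Historical data follows y = 2x + 1 exactly, so the fit recovers it.
model = train([(0, 1), (1, 3), (2, 5), (3, 7)])
print(score(model, 10))  # → 21.0
```

The training step touches every historical point at once, while scoring needs only the two fitted parameters; at production scale that asymmetry is what makes a single fixed hardware footprint a poor fit for both.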
One proponent of leveraging the cloud for new ML workloads is Jason Stowe, CEO of Cycle Computing. The Greenwich, Connecticut-based company develops tools that help to simplify the movement of data and application workloads to and from its customers’ internal systems and the big public cloud providers, including Amazon AWS, Microsoft Azure, and Google Cloud Platform.
While the company has traditionally served customers looking to solve HPC problems, big data is now driving many of the workloads Cycle customers are looking to run, Stowe tells Datanami in a recent interview.
“We see big data workloads driving a lot of interesting cloud adoption, primarily because the workloads have different footprints than what people traditionally have sitting on their data center floor in house,” Stowe says. “We still have customers who are running entirely internally. But almost all of them have converted now into doing some form of cloud-based workloads.”
The cloud is an ideal place to run big data workloads, including ML applications, because its dynamic flexibility enables capacity to match the application workloads’ varying needs, Stowe says.
“Data-oriented applications are a driver for this because you need a different RAM to CPU ratio than you might have otherwise,” he explains. “Most of our new customer base is essentially seeking to be able to deliver the no queue wait time for their end users, and they’re looking to AWS or Google or Azure to do that.”
Which Cloud for Big Data?
But not all cloud providers are equal, and figuring out which cloud is the best for a given big data workload is not always an easy or straightforward question to answer. Stowe’s advice for would-be analytic cloud shoppers is to do the homework, decide whether you’re looking for the fastest environment or the cheapest environment, and then test, test, test.
“The key point for big compute with multiple clouds is that workload fit and performance are moving targets,” he says. “For example, right now Azure has a fast interconnect with Infiniband, so we orchestrate MPI there for some customers.”
The only way to know for sure which cloud environment is the best for any given big data workload is to actually measure how the workload runs on the cloud. That might not sound enticing for a company that expected the cloud to smooth over these sorts of questions. But it’s important to know that the cloud isn’t a “one size fits all” solution.
“Because every application is different, it’s hard to extrapolate off of 50 applications where it might run best,” Stowe says. “So [in CycleCloud], you can actually spin the applications up in each environment and figure out whether it’s the fastest place to get the answer or is it the cheapest place to get the answer.”
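The “spin it up in each environment” approach boils down to a small comparison over two axes, time to result and total cost. A hedged sketch, with entirely hypothetical cloud names and prices:

```python
# Sketch of the "test, test, test" approach: run the same workload in
# each candidate cloud, record wall-clock time and hourly price, then
# pick a winner on each axis. All names and numbers are hypothetical.

def pick_clouds(runs):
    """runs maps cloud name -> (hours_to_result, usd_per_hour)."""
    fastest = min(runs, key=lambda c: runs[c][0])            # time axis
    cheapest = min(runs, key=lambda c: runs[c][0] * runs[c][1])  # cost axis
    return fastest, cheapest

benchmarks = {
    "cloud_a": (2.0, 4.00),   # 2 h at $4.00/h -> $8.00 total
    "cloud_b": (5.0, 1.00),   # 5 h at $1.00/h -> $5.00 total
    "cloud_c": (3.0, 2.50),   # 3 h at $2.50/h -> $7.50 total
}

fastest, cheapest = pick_clouds(benchmarks)
print(fastest, cheapest)  # → cloud_a cloud_b
```

Note that the fastest environment and the cheapest one need not coincide, which is exactly why Stowe advises deciding which axis matters before shopping.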
At the end of the day, the cloud is a liberating force for big data workloads. Companies no longer need to invest millions of dollars in hardware to run a given application, only to find that new applications bring new performance characteristics that don’t fit with existing hardware.
“When you’re working in an external cloud, a lot of the data center technology decisions end up becoming less important,” Stowe says. “Really, you want to focus down on primarily those two axes: time to result and cost. You essentially enable users to not worry about anything other than time to result and cost.”
That’s considerably different from an internal environment, where cost was far from the only thing companies had to weigh. “You’d have to figure out if you had the cooling or power required to put this environment or workload into production,” Stowe says. “Can I put that many GPUs into my data center without frying things? That’s a classic example.”
While the cloud can help save lots of money, matching the right cloud to the right application is essential, especially if you’re on a budget or your customers have high performance expectations.