August 4, 2020

Splicing a Pause Button into Cloud Machines


Splice Machine develops a machine learning-enabled SQL database that is based on a closely engineered collection of distributed components, including HBase, Spark, and Zookeeper, not to mention H2O, TensorFlow, and Jupyter. Customers use it to build complex AI apps that include transactional, analytical, and ML components. The company just announced a Kubernetes operator for customers running in private cloud environments. So what’s CEO Monte Zweben’s favorite new feature?

The pause button.

“How about that pause button?” Zweben said during a demo of Splice Machine’s Kubernetes Ops Center. “When you pause on Splice Machine, it drains Kubernetes nodes and makes them available for other applications to use.”

Support for Kubernetes is not new at Splice Machine. The company relied on Mesos for some time before pivoting to Kubernetes a couple of years ago. Since then, the company has used Kubernetes to manage customer environments as part of its software-as-a-service (SaaS) offering. Now with Kubernetes Ops Center, which was unveiled last week, customers running the platform on their own gear in their own data center (or in a private cloud) can also leverage Kubernetes to maximize their compute resources.

The pause button is placed prominently at the top of the Kubernetes Ops Center screen. When pressed, it instructs the Kubernetes distribution (Rancher and OpenShift are currently supported, with more on the way) to essentially put Splice on ice and prevent it from consuming any more resources.

Companies are wasting billions of dollars on underutilized cloud resources

This is a big deal considering the amount of resources that customers are wasting in the cloud. A report issued last week by Pepperdata, a provider of tuning solutions for big data applications, found that big companies were wasting millions of dollars, and that even smaller companies could save hundreds of thousands of dollars by tuning their applications (in particular, Apache Spark) to make better use of cloud resources.

Hitting the pause button in Splice Machine is one way to achieve savings.

“I think it’s a powerful thing that we’re offering on premises,” Zweben said. “Even on prem, if you’ve got a small set of virtualization going on, if you can pause and give up your resource to another user, that’s pretty powerful.”

The pause button is pressed frequently for the AWS cluster that Splice Machine uses for its demos. Before getting on a call with a prospect or a journalist, Zweben hits the restore button, and the cluster quickly comes back online. “If we’re not demoing this cluster, why pay for the infrastructure?” Zweben said. “I just checked in five to 10 minutes before we talked and I hit the restore button and it comes back, just like it was.”

Zweben couldn’t put a dollar amount on the savings, but said they are substantial. “It is more than 50% savings when you’re shutting a cluster off overnight,” he said. “We do that on our trials. We have an automatic trial mechanism, where you can come to Splice, and get it for a few weeks for free. If somebody is not active during their trial, we just auto-pause it.”
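The arithmetic behind a figure like "more than 50%" is easy to sketch. The snippet below is a rough illustration only, not Splice Machine's billing model: it assumes compute is billed by the hour, that billing stops entirely while the cluster is paused, and that storage costs are separate and unaffected. The function name and the 14-hour schedule are hypothetical.

```python
def compute_savings_fraction(paused_hours_per_day: float) -> float:
    """Fraction of daily compute cost avoided while paused.

    Assumes hourly compute billing that stops completely during a pause;
    storage costs are assumed to continue and are not modeled here.
    """
    if not 0 <= paused_hours_per_day <= 24:
        raise ValueError("paused_hours_per_day must be between 0 and 24")
    return paused_hours_per_day / 24.0


# Hypothetical schedule: the cluster is paused for 14 hours every night.
savings = compute_savings_fraction(14)
print(f"Compute savings: {savings:.0%}")
```

Under these assumptions, any pause longer than 12 hours a day crosses the 50% mark for compute spending.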

With Kubernetes running herd on compute resources, Splice Machine is free to concentrate on more important things, like ensuring that all the complex distributed components function as a seamless unit.

“All of the Splice Machine clusters have that elasticity where you can turn it off, and it basically doesn’t consume resources,” Zweben said. “The ability to separate storage and compute in that way saves an enormous amount of money.”

The split between on-prem and cloud customers is roughly 50/50 for new accounts, Zweben said. The nature of Splice Machine’s customer base – one of its credit card customers runs its data center in an underground bunker protected by armed guards – precludes the cloud from being adopted more often.

In addition to enabling elasticity, the Kubernetes Ops Center supports Helm Charts, which allow customers to augment their Splice Machine environment with other capabilities. For example, a customer could package a new machine learning model or a Kafka queue as Helm Charts, and integrate them into Splice Machine via Kubernetes.

“The ability for them to add this componentry extremely quickly and to be managed within the same infrastructure–this is really creating a new level of agility that you didn’t have before,” Zweben said.

Kubernetes is a hot technology at the moment, but it’s just one piece of the puzzle in Splice Machine’s big game. The San Francisco company’s end goal is delivering an AI platform that can do all “three legs of the stool” – transactional, analytical, and machine learning workloads – and thereby enable smaller companies to succeed with AI.

“There’s too many moving parts today for AI to really be brought into the world at scale,” Zweben said. “Right now you still have leaders building AI system, not your traditional companies, in production. Operationalizing it has been too hard. We’re democratizing it. That’s why we put these components together to make it easy to scale for AI.”

Splice Machine was born in the days of Hadoop, and uses some of the same underlying data processing engines that were distributed in that platform. But Splice Machine has surpassed the capabilities of that earlier platform by ensuring tight integration with those engines in support of its customers’ enterprise AI initiatives, not to mention elastic scaling via Kubernetes.

The way that Splice Machine engineered HBase (for storage) and Spark (for analytics), along with its support for ACID SQL transactions, are the core differentiators that make it a strong platform for building real-time AI applications, according to Zweben.

“Doing table scans as the basis of an analytical workload is abysmally slow in HBase, and so, in Splice Machine, we engineered at a very low level the access to the HBase storage with a wrapper of transactionality around it, so you’re only seeing what’s been committed in the database based on ACID semantics,” Zweben explained.

“That goes under the cover at a very well-engineered level, looking at the HBase storage and grabbing that into Spark dataframes,” he continued. “We’ve engineered tightly integrated connectivity for performance. I don’t think anybody is going to be able to do that easily without the same level of effort that we put into it, especially being transactionally consistent with ACID compliance, like Splice Machine is.”
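The idea of a scan that sees only committed data can be illustrated with a toy snapshot filter. This is a conceptual sketch, not Splice Machine's implementation or its HBase access layer: the `CellVersion` type, the `snapshot_scan` function, and the integer commit timestamps are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class CellVersion:
    row_key: str
    value: str
    commit_ts: Optional[int]  # None means the writing transaction has not committed


def snapshot_scan(versions: Iterable[CellVersion], snapshot_ts: int) -> Dict[str, str]:
    """Return the newest committed value per row as of snapshot_ts."""
    latest: Dict[str, CellVersion] = {}
    for v in versions:
        # Skip uncommitted writes and writes committed after the reader's snapshot.
        if v.commit_ts is None or v.commit_ts > snapshot_ts:
            continue
        current = latest.get(v.row_key)
        if current is None or v.commit_ts > current.commit_ts:
            latest[v.row_key] = v
    return {key: v.value for key, v in latest.items()}


# Hypothetical stored versions for two rows.
versions = [
    CellVersion("acct-1", "100", commit_ts=5),
    CellVersion("acct-1", "80", commit_ts=9),
    CellVersion("acct-1", "0", commit_ts=None),   # in-flight write, never visible
    CellVersion("acct-2", "50", commit_ts=12),    # committed after the snapshot
]
result = snapshot_scan(versions, snapshot_ts=10)
print(result)
```

In this sketch, a reader with a snapshot at timestamp 10 sees only the value committed at timestamp 9 for `acct-1`; the in-flight write and the later commit to `acct-2` are filtered out before the rows would be handed to an analytical engine.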

Splice Machine holds patents on the work, which took years to develop, and it’s being well-received by companies in financial services, healthcare, retail, government, and other sectors. The new Kubernetes operator doesn’t necessarily help with the core database development effort, but it definitely helps with managing the whole kit and caboodle in support of AI.

And, of course, Kubernetes enables that pause button, which is a big deal when running this stuff in the real world.
