Splicing a Pause Button into Cloud Machines
Splice Machine develops a machine learning-enabled SQL database that is based on a closely engineered collection of distributed components, including HBase, Spark, and Zookeeper, not to mention H2O, TensorFlow, and Jupyter. Customers use it to build complex AI apps that include transactional, analytical, and ML components. The company just announced a Kubernetes operator for customers running in private cloud environments. So what’s CEO Monte Zweben’s favorite new feature?
The pause button.
“How about that pause button?” Zweben said during a demo of Splice Machine’s Kubernetes Ops Center. “When you pause on Splice Machine, it drains Kubernetes nodes and makes them available for other applications to use.”
Support for Kubernetes is not new at Splice Machine. The company relied on Mesos for some time before pivoting to Kubernetes a couple of years ago. Since then, the company has used K8S to manage customer environments as part of its software as a service (SaaS) offering. Now with Kubernetes Ops Center, which was unveiled last week, customers running the platform on their own gear in their own data center (or in a private cloud) can also leverage Kubernetes to maximize their compute resources.
The pause button is placed prominently at the top of the Kubernetes Ops Center screen. When pressed, it instructs the Kubernetes distribution (Rancher and OpenShift are currently supported, with more on the way) to essentially put Splice on ice and prevent it from consuming any more resources.
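To make the idea concrete, here is a minimal sketch of what a pause-and-restore cycle amounts to: remember how many replicas each component was running, scale everything to zero so the compute is released, and put the counts back on restore. This is a toy model, not Splice Machine's implementation. The `Workload` and `Cluster` classes, the component names, and the CPU figures are all hypothetical, and no real Kubernetes API is called.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Workload:
    """Hypothetical stand-in for one scalable component (a set of pods)."""
    name: str
    replicas: int
    cpu_per_replica: float  # CPU cores requested by each replica


@dataclass
class Cluster:
    """Toy model of a database cluster that can be paused and restored."""
    workloads: List[Workload]
    saved_replicas: Dict[str, int] = field(default_factory=dict)

    def cpu_in_use(self) -> float:
        """Total CPU cores currently requested by all replicas."""
        return sum(w.replicas * w.cpu_per_replica for w in self.workloads)

    def pause(self) -> None:
        """Record each replica count, then scale to zero to release compute."""
        if self.saved_replicas:
            return  # already paused; do not overwrite the original counts
        for w in self.workloads:
            self.saved_replicas[w.name] = w.replicas
            w.replicas = 0

    def restore(self) -> None:
        """Bring every workload back to the replica count it had before pause."""
        for w in self.workloads:
            w.replicas = self.saved_replicas.get(w.name, w.replicas)
        self.saved_replicas.clear()


# Hypothetical component names and sizes, for illustration only.
cluster = Cluster([
    Workload("storage-servers", replicas=4, cpu_per_replica=2.0),
    Workload("analytics-executors", replicas=8, cpu_per_replica=4.0),
])
cluster.pause()    # compute is released for other applications
cluster.restore()  # the cluster comes back at its previous size
```

In this model the data itself is untouched by a pause. Only the compute footprint changes, which is what lets the cluster come back "just like it was."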
This is a big deal considering the amount of resources that customers are wasting in the cloud. A report issued last week by Pepperdata, a provider of tuning solutions for big data applications, found that big companies were wasting millions of dollars, and that even smaller companies could save hundreds of thousands of dollars by tuning their applications (in particular, Apache Spark) to make better use of cloud resources.
Hitting the pause button in Splice Machine is one way to achieve savings.
“I think it’s a powerful thing that we’re offering on premises,” Zweben said. “Even on prem, if you’ve got a small set of virtualization going on, if you can pause and give up your resource to another user, that’s pretty powerful.”
The pause button is pressed frequently for the AWS cluster that Splice Machine uses for its demos. Before getting on a call with a prospect or a journalist, Zweben hits the restore button, and the cluster quickly comes back online. “If we’re not demoing this cluster, why pay for the infrastructure?” Zweben said. “I just checked in five to 10 minutes before we talked and I hit the restore button and it comes back, just like it was.”
Zweben couldn’t put a dollar amount on the savings, but said that they are substantial. “It is more than 50% savings when you’re shutting a cluster off overnight,” he said. “We do that on our trials. We have an automatic trial mechanism, where you can come to Splice, and get it for a few weeks for free. If somebody is not active during their trial, we just auto-pause it.”
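The arithmetic behind a figure like that is simple to sketch. Assuming, hypothetically, that compute is billed by the hour and that only the compute portion of the bill stops while a cluster is paused, the saved fraction is the paused share of the day multiplied by the compute share of the bill. The function name and both parameters below are illustrative, not anything Splice Machine publishes.

```python
HOURS_PER_DAY = 24.0


def pause_savings(paused_hours_per_day: float, compute_share: float = 1.0) -> float:
    """Fraction of the daily bill saved by pausing compute.

    paused_hours_per_day: hours per day the cluster sits paused.
    compute_share: assumed fraction of the bill that is compute; the rest
        (for example, storage) is assumed to keep billing while paused.
    """
    if not 0.0 <= paused_hours_per_day <= HOURS_PER_DAY:
        raise ValueError("paused_hours_per_day must be between 0 and 24")
    if not 0.0 <= compute_share <= 1.0:
        raise ValueError("compute_share must be between 0 and 1")
    return (paused_hours_per_day / HOURS_PER_DAY) * compute_share


# A cluster paused for 14 hours a day, with the whole bill assumed to be compute.
overnight = pause_savings(14)
print(f"{overnight:.0%} of the daily bill saved")
```

Under these assumptions, savings above 50% require the cluster to be paused for more than 12 hours a day, which is plausible for a cluster used only during business hours.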
With Kubernetes riding herd on compute resources, Splice Machine is free to concentrate on more important things, like ensuring that all the complex distributed components function as a seamless unit.
“All of the Splice Machine clusters have that elasticity where you can turn it off, and it basically doesn’t consume resources,” Zweben said. “The ability to separate storage and compute in that way saves an enormous amount of money.”
The split between on-prem and cloud customers is roughly 50/50 for new accounts, Zweben said. The nature of Splice Machine’s customer base – one of its credit card customers runs its data center in an underground bunker protected by armed guards – precludes the cloud from being adopted more often.
In addition to enabling elasticity, the Kubernetes Ops Center supports Helm Charts, which allow customers to augment their Splice Machine environment with other capabilities. For example, a customer could package a new machine learning model or a Kafka queue as Helm Charts, and integrate them into Splice Machine via Kubernetes.
“The ability for them to add this componentry extremely quickly and to be managed within the same infrastructure – this is really creating a new level of agility that you didn’t have before,” Zweben said.
Kubernetes is a hot technology at the moment, but it’s just one piece of the puzzle in Splice Machine’s big game. The San Francisco company’s end goal is delivering an AI platform that can do all “three legs of the stool” – transactional, analytical, and machine learning workloads – and thereby enable smaller companies to succeed with AI.
“There’s too many moving parts today for AI to really be brought into the world at scale,” Zweben said. “Right now you still have leaders building AI systems, not your traditional companies, in production. Operationalizing it has been too hard. We’re democratizing it. That’s why we put these components together to make it easy to scale for AI.”
Splice Machine was born in the days of Hadoop, and uses some of the same underlying data processing engines that were distributed in that platform. But Splice Machine has surpassed the capabilities of that earlier platform by ensuring tight integration with those engines in support of its customers’ enterprise AI initiatives, not to mention elastic scaling via Kubernetes.
The way Splice Machine engineered HBase (for storage) and Spark (for analytics), along with its ACID support for SQL transactions, is what sets it apart as a platform for building real-time AI applications, according to Zweben.
“Doing table scans as the basis of an analytical workload is abysmally slow in HBase, and so, in Splice Machine, we engineered at a very low level the access to the HBase storage with a wrapper of transactionality around it, so you’re only seeing what’s been committed in the database based on ACID semantics,” Zweben explained.
“That goes under the cover at a very well-engineered level, looking at the HBase storage and grabbing that into Spark dataframes,” he continued. “We’ve engineered tightly integrated connectivity for performance. I don’t think anybody is going to be able to do that easily without the same level of effort that we put into it, especially being transactionally consistent with ACID compliance, like Splice Machine is.”
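The visibility rule Zweben describes, in which an analytical read sees only committed data, can be illustrated with a small snapshot-read sketch. This is a generic, simplified multi-version model and not Splice Machine's patented mechanism. The `Version` type, the integer timestamps, and the in-memory `table` layout are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    """One stored version of a row (hypothetical layout)."""
    value: str
    commit_ts: Optional[int]  # None while the writing transaction is uncommitted


def snapshot_scan(table: Dict[str, List[Version]], snapshot_ts: int) -> Dict[str, str]:
    """Return, per row key, the latest value committed at or before snapshot_ts.

    Uncommitted versions and versions committed after the snapshot are skipped,
    so the scan never observes in-flight or later writes.
    """
    result: Dict[str, str] = {}
    for key, versions in table.items():
        visible = [
            v for v in versions
            if v.commit_ts is not None and v.commit_ts <= snapshot_ts
        ]
        if visible:
            result[key] = max(visible, key=lambda v: v.commit_ts).value
    return result


# Row "a" has two committed versions and one still in flight;
# row "b" has only an uncommitted write.
table = {
    "a": [Version("v1", 10), Version("v2", 20), Version("v3", None)],
    "b": [Version("x", None)],
}
rows_for_analytics = snapshot_scan(table, snapshot_ts=15)
```

A scan taken at timestamp 15 sees only the first committed version of row "a" and nothing for row "b". That is the property an analytical engine needs if it is to read a consistent view while transactions continue to write.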
Splice Machine holds patents on the work, which took years to develop, and it’s being well-received by companies in financial services, healthcare, retail, government, and other sectors. The new Kubernetes operator doesn’t necessarily help with the core database development effort, but it definitely helps with managing the whole kit and caboodle in support of AI.
And, of course, Kubernetes enables that pause button, which is a big deal when running this stuff in the real world.