Splicing a Pause Button into Cloud Machines
Splice Machine develops a machine learning-enabled SQL database that is based on a closely engineered collection of distributed components, including HBase, Spark, and Zookeeper, not to mention H2O, TensorFlow, and Jupyter. Customers use it to build complex AI apps that include transactional, analytical, and ML components. The company just announced a Kubernetes operator for customers running in private cloud environments. So what’s CEO Monte Zweben’s favorite new feature?
The pause button.
“How about that pause button?” Zweben said during a demo of Splice Machine’s Kubernetes Ops Center. “When you pause on Splice Machine, it drains Kubernetes nodes and makes them available for other applications to use.”
Support for Kubernetes is not new at Splice Machine. The company relied on Mesos for some time before pivoting to Kubernetes a couple of years ago. Since then, the company has used K8S to manage customer environments as part of its software as a service (SaaS) offering. Now with Kubernetes Ops Center, which was unveiled last week, customers running the platform on their own gear in their own data center (or in a private cloud) can also leverage Kubernetes to maximize their compute resources.
The pause button is placed prominently at the top of the Kubernetes Ops Center screen. When pressed, it instructs the Kubernetes distribution (Rancher and OpenShift are currently supported, with more on the way) to essentially put Splice on ice and prevent it from consuming any more resources.
This is a big deal considering the amount of resources that customers are wasting in the cloud. A report issued last week by Pepperdata, a provider of tuning solutions for big data applications, found that big companies were wasting millions of dollars, and that even smaller companies could save hundreds of thousands of dollars by tuning their applications (in particular, Apache Spark) to make better use of cloud resources.
Hitting the pause button in Splice Machine is one way to achieve savings.
“I think it’s a powerful thing that we’re offering on premises,” Zweben said. “Even on prem, if you’ve got a small set of virtualization going on, if you can pause and give up your resource to another user, that’s pretty powerful.”
The pause button is pressed frequently for the AWS cluster that Splice Machine uses for its demos. Before getting on a call with a prospect or a journalist, Zweben hits the restore button, and the cluster quickly comes back online. “If we’re not demoing this cluster, why pay for the infrastructure?” Zweben said. “I just checked in five to 10 minutes before we talked and I hit the restore button and it comes back, just like it was.”
Zweben couldn’t put a dollar amount on the savings, but said they are substantial. “It is more than 50% savings when you’re shutting a cluster off overnight,” he said. “We do that on our trials. We have an automatic trial mechanism, where you can come to Splice, and get it for a few weeks for free. If somebody is not active during their trial, we just auto-pause it.”
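The arithmetic behind a figure like that is simple to sketch. The snippet below is a rough illustration only, not Splice Machine's accounting: it assumes compute is billed by the hour, ignores storage and any other charges that continue while a cluster is paused, and uses a hypothetical 13-hour overnight window. The function name `compute_savings` is invented for this example.

```python
def compute_savings(paused_hours_per_day: float) -> float:
    """Return the fraction of daily compute hours that go unbilled while paused."""
    if not 0 <= paused_hours_per_day <= 24:
        raise ValueError("paused_hours_per_day must be between 0 and 24")
    return paused_hours_per_day / 24


# Hypothetical schedule (an assumption, not from the article):
# the cluster is paused from 7 p.m. to 8 a.m., or 13 hours a night.
overnight = compute_savings(13)
print(f"{overnight:.0%} of compute hours saved")
```

Under those assumptions, any pause window longer than 12 hours a day clears the 50% mark on compute hours.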
With Kubernetes running herd on compute resources, Splice Machine is free to concentrate on more important things, like ensuring that all the complex distributed components function as a seamless unit.
“All of the Splice Machine clusters have that elasticity where you can turn it off, and it basically doesn’t consume resources,” Zweben said. “The ability to separate storage and compute in that way saves an enormous amount of money.”
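What that separation means in practice can be shown with a toy model. The sketch below is purely illustrative and makes no claim about how Splice Machine or Kubernetes implement pausing: the `Cluster` class, its fields, and the node and storage figures are all hypothetical. It only demonstrates the idea that pausing releases compute while stored data stays put, so a restore can bring the cluster back as it was.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Toy model of a cluster whose compute and storage are tracked separately."""
    compute_nodes: int      # nodes currently held by the cluster
    storage_gb: int         # persisted data, untouched by pause and restore
    _saved_nodes: int = 0   # node count to bring back on restore

    def pause(self) -> int:
        """Release all compute nodes and return how many were freed."""
        freed = self.compute_nodes
        if freed:
            self._saved_nodes = freed
            self.compute_nodes = 0
        return freed

    def restore(self) -> None:
        """Reclaim the node count that was in use before the pause."""
        if self.compute_nodes == 0:
            self.compute_nodes = self._saved_nodes


# Hypothetical demo cluster: 4 nodes and 500 GB of stored data.
demo = Cluster(compute_nodes=4, storage_gb=500)
freed = demo.pause()
paused_state = (demo.compute_nodes, demo.storage_gb)
demo.restore()
restored_state = (demo.compute_nodes, demo.storage_gb)
```

In this model the freed nodes are what other applications could pick up during the pause, and the storage figure never changes.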
The split between on-prem and cloud customers is roughly 50/50 for new accounts, Zweben said. The nature of Splice Machine’s customer base – one of its credit card customers runs its data center in an underground bunker protected by armed guards – precludes the cloud from being adopted more often.
In addition to enabling elasticity, the Kubernetes Ops Center supports Helm Charts, which allow customers to augment their Splice Machine environment with other capabilities. For example, a customer could package a new machine learning model or a Kafka queue as Helm Charts, and integrate them into Splice Machine via Kubernetes.
“The ability for them to add this componentry extremely quickly and to be managed within the same infrastructure–this is really creating a new level of agility that you didn’t have before,” Zweben said.
Kubernetes is a hot technology at the moment, but it’s just one piece of the puzzle in Splice Machine’s big game. The San Francisco company’s end goal is delivering an AI platform that can do all “three legs of the stool” – transactional, analytical, and machine learning workloads – and thereby enable smaller companies to succeed with AI.
“There’s too many moving parts today for AI to really be brought into the world at scale,” Zweben said. “Right now you still have leaders building AI system, not your traditional companies, in production. Operationalizing it has been too hard. We’re democratizing it. That’s why we put these components together to make it easy to scale for AI.”
Splice Machine was born in the days of Hadoop, and uses some of the same underlying data processing engines that were distributed in that platform. But Splice Machine has surpassed the capabilities of that earlier platform by ensuring tight integration with those engines in support of its customers’ enterprise AI initiatives, not to mention elastic scaling via Kubernetes.
The way that Splice Machine engineered HBase (for storage) and Spark (for analytics), and its enablement of ACID capabilities for SQL transactions, are core differentiating factors that weigh in Splice Machine’s favor for being a platform on which to build real-time AI applications, according to Zweben.
“Doing table scans as the basis of an analytical workload is abysmally slow in HBase, and so, in Splice Machine, we engineered at a very low level the access to the HBase storage with a wrapper of transactionality around it, so you’re only seeing what’s been committed in the database based on ACID semantics,” Zweben explained.
“That goes under the cover at a very well-engineered level, looking at the HBase storage and grabbing that into Spark dataframes,” he continued. “We’ve engineered tightly integrated connectivity for performance. I don’t think anybody is going to be able to do that easily without the same level of effort that we put into it, especially being transactionally consistent with ACID compliance, like Splice Machine is.”
Splice Machine holds patents on the work, which took years to develop, and it’s being well-received by companies in financial services, healthcare, retail, government, and other sectors. The new Kubernetes operator doesn’t necessarily help with the core database development effort, but it definitely helps with managing the whole kit and caboodle in support of AI.
And, of course, Kubernetes enables that pause button, which is a big deal when running this stuff in the real world.