May 5, 2021

Performance, Complexity Dog K8S Growth

A new survey from Pepperdata shows that Kubernetes is doing what it was predicted to do: becoming the de facto standard runtime for big data applications such as Spark, Kafka, and Presto. However, customers are still having issues with the complexity and management of K8S environments, the survey found.

More than three quarters of the 800 folks surveyed for Pepperdata’s “2021 Kubernetes & Big Data Report” reported using Kubernetes (or K8S), the open source container orchestration software that was originally developed by Google.

This should not come as a surprise, as K8S can dramatically simplify the operations effort required to run containerized applications. Instead of manually matching application utilization needs to infrastructure resources, K8S essentially automates all of that work. In large environments, when users have many applications competing for resources on a cluster, K8S’s automation can pay big dividends.

Kubernetes is most popular in private cloud environments, Pepperdata’s survey finds (Image source: Pepperdata)

This is why K8S has become so popular, and has outlasted other orchestration systems, including Docker Swarm, Apache Mesos, Hadoop’s YARN, and other offerings from server makers. All of the major public clouds have adopted K8S, and it has become a standard deployment option for on-prem systems too.

Surprisingly, private cloud was the most popular deployment scenario for K8S, accounting for 47% of survey-takers, according to Pepperdata’s survey. That was followed by on-prem at 35% and the public cloud at 18%.

In terms of workloads, Apache Spark was the most popular application running on survey-takers’ K8S clusters, with a 30% share, followed by Apache Kafka at 25% and Presto at 23%. Deep learning workloads, such as TensorFlow or PyTorch, were used by 18% of survey-takers, it found.

Organizations cited several reasons for using K8S, including: improving resource utilization (30%); preparing to move to the cloud (23%); shortening deployment cycles (18%); and making applications and platforms cloud agnostic (15%).

Spark is the most popular workload on K8S, according to Pepperdata’s survey (Image source: Pepperdata)

However, it wasn’t all puppies and rainbows with K8S in Pepperdata’s study. While the software can automate many of the technical tasks that operators and administrators would otherwise have to do, K8S introduces its own level of complexity to cluster environments, and engineers with specialty skills are typically required to run it. Monitoring what is going on in an abstracted K8S environment, including mapping application resource demands to the actual underlying hardware, is best described as “opaque.”

The biggest challenge in moving to K8S was the initial deployment, Pepperdata’s study found. That was followed by migration, monitoring and alerting, complexity and increased cost, and reliability.

While it’s clear that K8S adoption is booming, there are also some growing pains associated with the new tech, says Ash Munshi, the CEO of Pepperdata.

“Kubernetes is increasingly being adopted by our customers for big data applications. As a result, we see customers experiencing performance challenges,” Munshi said in a press release. “This survey clearly indicates that these problems are universal and there is a need to better optimize these big data workloads.”


Datanami