March 13, 2019

Can AIOps Save IT Management?


IT professionals are struggling to keep up with the demands of modern applications and infrastructure. Rising use of containers and distributed systems – not to mention the big data explosion itself – is putting enormous pressure on IT staffs to keep it all running. Luckily, artificial intelligence will be lending a hand through an emerging discipline called AIOps.

Automation has always been a component of the IT management gig. For decades, IBM, HPE, BMC, and CA – the so-called “Big 4” of systems management – offered tools that automate common data center tasks, such as monitoring logs, scheduling jobs, executing backups, and distributing and applying patches.

Until recently, Fortune 2000 firms got most of what they needed from the Big 4 to keep on top of their servers, storage, and networks (application performance management and security have traditionally been separate). There was always manual work required to keep it all running smoothly – the “Sneakernet” of last resort – but the companies managed to get by.

This approach worked in the past, but it’s crumbling under the weight of the increased complexity found in today’s public and private data centers. Containerization technologies like Kubernetes and Docker, hybrid cloud infrastructure, and micro-services architectures all bring their own benefits to system architects, developers, and the businesses that consume the applications that run on them. But these technologies also obfuscate what’s going on under the covers, rendering older rules-based approaches to IT automation obsolete.
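To illustrate why fixed rules struggle in such environments, here is a minimal sketch that contrasts a static threshold with a baseline computed from each workload's own recent history. It is not drawn from any vendor's product; the function names, the 80% limit, the three-sigma cutoff, and the sample values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def static_rule(value, limit=80.0):
    """Old-style rule: alert whenever the metric crosses a fixed limit."""
    return value > limit


def baseline_rule(history, value, sigmas=3.0):
    """Alert when value strays more than `sigmas` standard deviations
    from the mean of this workload's own recent history."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    mu = mean(history)
    sd = stdev(history)
    if sd == 0:
        return value != mu
    return abs(value - mu) > sigmas * sd


# Hypothetical CPU utilization samples (%), illustrative only.
batch_node_history = [88, 91, 90, 89, 92, 90]  # busy batch node, ~90% is normal
web_pod_history = [12, 14, 13, 12, 15, 13]     # lightly loaded container

# The fixed rule fires on normal batch load and misses the container spike;
# the per-workload baseline does the opposite.
print(static_rule(91), baseline_rule(batch_node_history, 91))  # True False
print(static_rule(45), baseline_rule(web_pod_history, 45))     # False True
```

The same fixed limit cannot suit both workloads, which is the kind of gap that grows as the number of distinct components rises.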

That pain has driven the need for a new approach to IT operations. Some people call it AIOps.

Enter the AIOps

AIOps refers to the next generation of IT operations technology that uses machine learning and AI techniques to deliver greater clarity and automation to IT pros tasked with keeping the servers up, the network on, and the applications generating revenue for the business.


All of the Big 4 vendors are embracing AIOps, which you would expect. But they have competition from a new crop of vendors, such as FixStream, OpsRamp, and Moogsoft, as well as established players like Splunk and AppDynamics, which is owned by Cisco.

Here’s how Gartner defines AIOps:

“AIOps platforms utilize big data, modern machine learning and other advanced analytics technologies to directly and indirectly enhance IT operations (monitoring, automation and service desk) functions with proactive, personal and dynamic insight. AIOps platforms enable the concurrent use of multiple data sources, data collection methods, analytical (real-time and deep) technologies, and presentation technologies.”

Matt Chotin, senior director of product and strategy at AppDynamics, says that AIOps platforms have become a necessity for companies that manage their own IT infrastructure.

“Their technology environments are getting too large and too dynamic to manage manually,” he tells Datanami. “We’re talking about multiple clouds, we’re talking about IoT, and containers. And now everybody is expected to be more agile and constantly be releasing into these new environments.

“AIOps is a way for companies to start managing this complexity and helping them navigate this challenge,” he continues. “We see AIOps as a philosophy. It values being predictive over being reactive, about actually getting answers presented to you as opposed to spending all your time doing investigations, and being able to start taking action over analysis paralysis.”

New Focus for AppD

AppDynamics traditionally focused on application performance management (APM), where it is an undisputed leader. Following its 2017 acquisition by Cisco, AppDynamics transformed into a provider of AIOps solutions for the enterprise.

AppDynamics still surfaces application usage trends and statistics up through a GUI, but correlations are increasingly being made automatically under the covers using machine learning.

The transformation makes sense for Cisco, which has well-respected business lines in networking hardware and software; servers, with its Unified Computing System (UCS) line; and security, a business it built up through acquisitions over the years. Being so close to the server, network, and security layers gives AppDynamics a wealth of data to crunch to find interesting correlations that could impact applications. Cisco calls it the “central nervous system for IT.”

AppDynamics uses traditional machine learning technology (as opposed to neural nets) to find correlations among the different streams of operations-related data that it crunches. This allows it to have a greater understanding of how the various applications relate to one another.
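As a rough illustration of what finding correlations among operations data streams can look like, the sketch below computes pairwise Pearson correlation between a few metric series and reports the strongly related pairs. This is a generic statistical technique, not AppDynamics' actual method; the stream names, sample values, and the 0.9 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def correlated_pairs(streams, threshold=0.9):
    """Return (name_a, name_b, r) for stream pairs with |r| >= threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(streams.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Hypothetical per-minute samples; names and values are illustrative only.
streams = {
    "app_latency_ms": [120, 125, 130, 180, 240, 310, 300, 150],
    "network_retransmits": [2, 3, 3, 9, 15, 22, 21, 5],
    "disk_queue_depth": [4, 1, 5, 2, 4, 1, 5, 2],
}

for name_a, name_b, r in correlated_pairs(streams):
    print(f"{name_a} <-> {name_b}: r={r}")
```

With this sample data, only the application latency and network retransmit series clear the threshold, hinting that the two layers should be investigated together. Correlation alone does not establish which one is the cause.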

“The app developer doesn’t want to think about the network that much, but obviously it’s hugely important,” Chotin says. “Being able to tie all this information together [lets us avoid] playing the blame game, of just saying ‘Well, the application seems to be OK, so it’s got to be the network’s fault.’”

Virtualization, Fragmentation

VMware kicked off the virtualization wave nearly two decades ago, allowing companies to carve up their x86 servers to get more useful work out of them. That bolstered utilization, but it also pushed IT operations folks further away from the underlying iron.

As applications become more virtualized and data becomes more fragmented, tracing performance issues becomes more difficult.

With Docker, Kubernetes, and other containerization technologies proliferating in today’s cloud and on-prem data centers, IT pros benefit from a more powerful abstraction layer atop the servers. However, tools like Kubernetes have also magnified the level of complexity, says Todd Brannon, senior director of data center marketing at Cisco.

“It makes it harder,” he says. “Instead of one machine to keep an eye on, you have hundreds or thousands of containers that might comprise some sort of service of the business.”

Data is also becoming more fragmented, which adds to the complexity. “Data is no longer centered in the data center,” Brannon says. “It’s being produced and analyzed and consumed in greater quantities outside of the traditional data center than ever before, and 5G is going to accelerate that.”

Keeping this distributed morass of data-hungry applications well fed and cared for is an enormous burden on IT professionals. The best way to alleviate that burden is through AIOps, Brannon says.

“Not only are the applications breaking down and becoming more fragmented or distributed, the data itself is more distributed, and so that presents an enormous challenge for IT because it’s the connections between all these things that can throw off problems,” he says. “So you have to go [with] AIOps types of toolsets to help you manage and process an increasingly fragmented landscape.”
