Guiding a Successful AIOps Implementation – 6 Critical Considerations

There’s a great AI scene in the sci-fi movie Elysium that nicely illustrates the important connection between man and machine. It’s the bit where the hero (Matt Damon) visits his mannequin-style parole officer after a run-in with robotic law enforcement. Extending his parole period because of the misdemeanour and detecting elevated stress levels, the AI-infused dummy asks the timeless question – “Would you like to talk to a human?”

Brilliant – even in a dystopian future it’s reassuring to know there’s still a place for humankind doing some of the hard stuff.

Back in today’s reality, IT operations leaders are getting all warm and fuzzy about the prospect of employing AI to ease the burden of managing increasingly complex IT systems and applications. Whether it’s machine learning algorithms to detect complex performance problems or natural language processing bots for customer support, managers are starting to buy into the notion of an AI-driven operational future.

Yet even with the AI promise of delivering better and lower-cost support, faster problem resolution and fully automated and intelligent workflows, the uptake of AI for IT operations (AIOps) has to date been relatively slow. This may be partly due to grandiose marketing claims not quite matching the actual data science on offer, but also to the simple fact that most organizations are hesitant when AI failure in an operational context is too much for the business to tolerate. While we’re willing to laugh off the occasional stuff-ups from Alexa-style virtual assistants, we’re not quite ready to relinquish full operational control over business-critical digital systems to AI; the cost of failure is just too high.

But maintaining the operational status quo doesn’t work either. Today’s complex systems have rendered many traditional monitoring practices inadequate, resulting in both financial and human capital cost increases. As organizations persist with obsolete tools and practices, staff become trapped in reactive break-fix mode, failing to acquire the critical skills and insights now needed to optimize complex systems in line with business goals and objectives.

This catch-22 dynamic suggests that the best applications of AIOps will be those that harness the smarts of a human-machine ‘collective’. To lower the cost of failure, staff must adopt the new skills necessary to apply machine learning algorithms when and where they have the biggest impact. As for complete AIOps systems themselves, well, they’ll eliminate many burdensome tasks across IT operations, while surfacing the obscured insights needed to optimize and enhance complex systems.

Achieving the benefits of collective intelligence requires an AIOps implementation that balances AI power with a hefty dose of pragmatism. Here are six important things to consider:

  1. Start Small, Think Big – going all-in on AIOps by attempting to fully automate business-critical applications based on machine learning could be a recipe for disaster. Better to first seek out applications where the cost of failure is lower and use AI successes (and failures) to drive incremental improvements across the organization.
  2. Not all AIOps is Equal – accurately and precisely pinpointing problem root-cause across complex and interdependent systems is far more challenging than highlighting performance anomalies across known seasonal demand patterns. Always consider that the simplest AIOps feature could be the one that delivers the biggest business return, so seek out solutions that deliver diagnostic, predictive and prescriptive analysis, including: anomaly detection, advanced root-cause analysis, optimization insights (e.g. cloud capacity and cost analysis), and intelligent automation.
  3. Outcomes over Outputs – the best AIOps solutions will deliver highly-prized predictive analysis. Always ensure that systems can take these outputs to guide human actions (e.g. during stressful on-call situations) or even trigger fully automated workflows. Without this, the full business value of AIOps will never be realized.
  4. Avoid the Build Your Own Trap – attempting to build an all-singing, all-dancing intelligent automation system that relies on fine-grained analysis across multi-dimensional application stacks will be a highly complex (and costly) exercise. A better bet is partnering with providers that can back a prescriptive analytics vision with hard-earned experience at the coalface of enterprise IT.
  5. Consider the Fine Print – however much vendors try to play down operational overhead, enterprise-wide AIOps offerings can involve massive data collection and analysis requirements. Organizations should always consider the data trade-offs and constraints of commercial AIOps solutions and the approaches providers use to address them – in terms of technical architecture, data modelling approaches, even licensing.
  6. Black Box or Pandora’s Box? – while it’s natural to assume that AIOps solutions shouldn’t require commensurate increases in data science expertise, it’s important to consider that a closed or black box system can easily become shelfware – especially when results can’t be trusted. To mitigate this, work with vendor solutions that have the flexibility to support your unique data requirements and that attempt to make the AI processes clear and understandable. This way staff will grow to trust the system and expand it to support many more use-cases.
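To make the simpler end of that spectrum concrete, here is a minimal Python sketch of anomaly detection against a known seasonal pattern, paired with a routing step that keeps a human in the loop where the cost of failure is high. It is an illustration only, not how any particular AIOps product works: the function names (seasonal_anomalies, next_action), the z-score approach, the threshold of 3 and the sample data are all assumptions made for this example.

```python
from statistics import mean, stdev


def seasonal_anomalies(values, period, threshold=3.0, min_sigma=1e-9):
    """Return indices of samples that deviate from their seasonal peers.

    values:    metric samples taken at a fixed interval (hypothetical data)
    period:    samples per season, e.g. 24 for hourly data with a daily cycle
    threshold: z-score above which a sample is flagged (assumed default)
    min_sigma: floor on the spread so a perfectly flat baseline cannot
               cause a division by zero
    """
    flagged = []
    for slot in range(period):
        # Every sample taken at the same position in the cycle.
        indices = list(range(slot, len(values), period))
        for i in indices:
            # Leave the tested sample out so a spike cannot hide itself.
            peers = [values[j] for j in indices if j != i]
            if len(peers) < 2:
                continue  # not enough history to judge this slot
            sigma = max(stdev(peers), min_sigma)
            if abs(values[i] - mean(peers)) / sigma > threshold:
                flagged.append(i)
    return sorted(flagged)


def next_action(is_anomaly, business_critical):
    """Turn an output into an outcome: decide who acts on the finding."""
    if not is_anomaly:
        return "none"
    # Start small: automate only where the cost of failure is low.
    return "page_on_call" if business_critical else "run_automation"


# Six cycles of a four-step demand pattern with slight cycle-to-cycle drift.
pattern = [10, 20, 30, 20]
samples = [pattern[i % 4] + (i // 4) % 2 for i in range(24)]
samples[14] = 90  # inject one spike

for index in seasonal_anomalies(samples, period=4):
    print(index, next_action(True, business_critical=True))
```

Root-cause analysis across interdependent systems has no comparably short sketch, which is rather the point of the second consideration above.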

All these points illustrate the important role staff have in a successful implementation of AIOps. Whether it’s scrutinizing AI capability against vendor claims or understanding the optimum application in a variety of operational contexts, people will still play a critical role.

Perhaps, though, it’s the strength of an important and developing symbiotic association between AIOps systems and IT operations staff that should become the real benchmark of success. While machine learning algorithms will reduce the burden of monitoring by separating real problems from false alarms, they’ll also improve continually over time based on staff interactions, automated feedback and a constant stream of more varied data types.

Unlike the dummy in the aforementioned sci-fi movie that just handballs tricky stuff, true AIOps systems will utilize a collective form of intelligence that’s now needed to perform modern IT operations functions neither human nor machine can achieve alone.

About the Author:

Kieran Taylor has 20 years of high-tech product marketing experience with a focus on application performance management, cloud computing, content delivery networking and wide area network technologies. He is presently Head of Product Marketing for Broadcom’s AIOps segment and is responsible for thought leadership and sales enablement for AIOps, Operational Intelligence, APM and Infrastructure Management. Previously, he led product marketing teams at Adobe, Akamai, DataPower/IBM and Nortel Networks. His career began as an editor of high-tech publications at McGraw-Hill.