What are AIOps, DevOps, and MLOps?
In this introductory blog post, we will learn the concepts and main differences among AIOps, DevOps and MLOps. We will also examine why MLOps processes are important to machine learning and the personas involved in its lifecycle.
AIOps represents the intersection of IT operations and Artificial Intelligence.
It defines the application of AI to enhance IT operations. More specifically, AIOps uses machine learning, big data and analytics to:
- Collect, analyse and aggregate data from the vast amount of information generated by multiple IT infrastructure components, applications and performance monitoring systems;
- Intelligently extract signal from noise to identify and predict events and patterns related to system performance, security and availability issues;
- Diagnose the root causes of those issues and report them to IT teams to ensure rapid response and remediation or, in some cases, resolve them automatically without relying on human intervention.
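To make "extracting signal out of noise" a little more concrete, here is a minimal sketch, not how any particular AIOps platform works. It assumes a single metric (hypothetical response latencies in milliseconds) and flags a sample as anomalous when it deviates from the mean of the preceding window by more than a chosen number of standard deviations. The function name, window size and threshold are all illustrative assumptions.

```python
from statistics import mean, stdev

def find_anomalies(samples, window=5, threshold=3.0):
    """Flag samples that deviate strongly from the preceding window.

    Returns a list of (index, value) pairs. The window size and
    threshold are illustrative defaults, not recommended settings.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        # Skip flat baselines to avoid dividing by zero.
        if sigma > 0 and abs(samples[i] - mu) / sigma > threshold:
            anomalies.append((i, samples[i]))
    return anomalies

# Hypothetical latency readings with one obvious spike.
latencies_ms = [102, 98, 101, 99, 100, 103, 97, 100, 450, 101, 99, 102]
print(find_anomalies(latencies_ms))  # [(8, 450)]
```

A real platform would correlate many such signals across components before raising an alert; this sketch only shows the basic idea of separating an unusual event from normal variation.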
There are several advantages to replacing multiple fragmented, manually operated IT operations systems with a single, intelligent, automated and sometimes autonomous AIOps platform.
- It equips teams to respond promptly, and even proactively, to slowdowns, security threats and outages with far less effort.
- It bridges the gap between an increasingly diverse, dynamic and difficult-to-monitor IT landscape and user expectations of little or no interruption in application performance and availability.
Many IT practitioners consider AIOps to be the future of IT operations management.
In the case of software engineering, the intersection between IT operations, software development and application delivery forms what we call DevOps.
To start from the beginning, let’s take a look at the lifecycle of traditional software. It is arguably quite straightforward. At its simplest, you:
- develop,
- test,
- and deploy the software,
- and then release a new version with features, updates, and/or fixes as needed.
To facilitate development, testing, deployment and subsequent releases, traditional software development can rely on DevOps. Its processes can then be automated, relying less on manual steps that are time-consuming and error-prone. Among the DevOps processes, we highlight the following concepts:
- Continuous Integration (CI), which automates the integration of code changes from multiple contributors into a single software project;
- Continuous Delivery (CD), which covers delivery and deployment: once the software has passed through the various stages of the process, it is ready to be pushed out the door and delivered to end users;
- Continuous Testing (CT), which is the process of executing automated tests as part of the software delivery pipeline.
These are employed to reduce development time while continuously delivering new releases and maintaining software quality.
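As a rough illustration of how these three practices fit together, the sketch below models a pipeline as an ordered list of stages in which a failing stage stops everything after it. The stage names and the `run_pipeline` helper are hypothetical; real CI/CD tools express the same idea through their own configuration formats.

```python
from typing import Callable, List, Tuple

# A stage is a (name, step) pair; the step returns True on success.
Stage = Tuple[str, Callable[[], bool]]

def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            break  # a failed stage blocks everything after it
        completed.append(name)
    return completed

# Hypothetical stand-ins for real build, test and deploy commands.
healthy = [("integrate", lambda: True), ("test", lambda: True), ("deliver", lambda: True)]
broken = [("integrate", lambda: True), ("test", lambda: False), ("deliver", lambda: True)]

print(run_pipeline(healthy))  # ['integrate', 'test', 'deliver']
print(run_pipeline(broken))   # ['integrate']
```

The point of the gate is that a change which fails its automated tests never reaches end users, which is how quality is maintained while releases stay frequent.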
In machine learning systems, we have a different process called MLOps. It integrates software engineering, data engineering and data science processes.
Let’s take a step back and understand what happens in ML systems.
When it comes to Machine Learning (or ML) systems, it is easy to assume that the ML workflow will follow a process similar to the one in software engineering. After all, it should just be a matter of developing and training an ML model, deploying it and releasing a new version as required, right? Not quite.
The environment in which ML systems operate complicates things, which is why we need a different approach from that of traditional software engineering.
For starters, the ML systems themselves are very different from traditional software due to their data-driven, non-deterministic behaviour. In software engineering the code drives the behaviour of the system. In Machine Learning systems, the data will drive the behaviour of the system. As such, any change in the behaviour of the data — for instance a change in its distribution — will lead the Machine Learning system to operate differently.
In an ever-changing world, ML practitioners must anticipate that the real-world data on which production ML models run inference will inevitably change.
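One simple way to picture a change in distribution is to compare the data a model was trained on with the data it sees in production. The sketch below is an assumption-laden illustration rather than a standard method: it measures how far the mean of live data has moved from the training mean, in units of the training standard deviation, and treats a shift above an arbitrary threshold as drift. The feature values, function names and threshold are all hypothetical.

```python
from statistics import mean, stdev

def mean_shift(reference, live):
    """Distance between the live and reference means,
    measured in reference standard deviations."""
    return abs(mean(live) - mean(reference)) / stdev(reference)

def has_drifted(reference, live, threshold=2.0):
    """Report drift when the shift exceeds an illustrative threshold."""
    return mean_shift(reference, live) > threshold

# Hypothetical feature values seen at training time and in production.
training = [20, 22, 21, 19, 23, 20, 21, 22]
production_stable = [21, 20, 22, 21, 20, 22]
production_shifted = [30, 31, 29, 32, 30, 31]

print(has_drifted(training, production_stable))   # False
print(has_drifted(training, production_shifted))  # True
```

Production systems usually rely on richer statistical tests over many features, but the principle is the same: the code has not changed, yet the system's behaviour will, because its inputs have.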
To support the complexities involved in productionising ML systems, a special kind of DevOps for ML has emerged: MLOps, which stands for ML operations.
MLOps helps to manage the constant changes in ML systems and the subsequent need for model redeployments. MLOps embraces DevOps continuous integration and continuous delivery, but its "CT" stands for continuous training: automatically retraining and redeploying models as the data changes. Testing still takes place, extended to cover data and models as well as code.
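The continuous training idea can be sketched as a loop that monitors a deployed model on each new batch of data and retrains it when its error grows too large. Everything here is a deliberately toy assumption: the "model" just predicts the mean of its training targets, retraining uses only the latest batch, and the error threshold is arbitrary.

```python
from statistics import mean

class MeanModel:
    """Toy model that always predicts the mean of its training targets."""

    def __init__(self, targets):
        self.prediction = mean(targets)

    def error(self, targets):
        """Mean absolute error of the constant prediction on a batch."""
        return mean(abs(t - self.prediction) for t in targets)

def continuous_training(initial_data, batches, max_error=2.0):
    """Retrain whenever the live error exceeds max_error.

    Returns the model version that was in service after each batch.
    """
    model = MeanModel(initial_data)
    version = 1
    versions = []
    for batch in batches:
        if model.error(batch) > max_error:
            model = MeanModel(batch)  # retrain on the most recent data
            version += 1              # stands in for a redeployment
        versions.append(version)
    return versions

# Hypothetical data: the second batch reflects a change in the real world.
batches = [[10, 11, 9], [15, 16, 14], [15, 14, 16]]
print(continuous_training([10, 11, 9, 10], batches))  # [1, 2, 2]
```

In practice, the retraining step would itself be a pipeline with data validation and model evaluation before anything is redeployed.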