This article describes the ModelOp Life Cycle (MLC) Manager and the MLC Process, and how they are used to drive automation and repeatability for a robust ModelOps program.
The ModelOp Life Cycle Manager (MLC Manager) automates operations related to the deployment, monitoring, and governance of models so that you can get them into service quickly, keep track of how each model is performing, and have easy access to the entire history of each model. For a large enterprise, there are hundreds or thousands of models, each of which has differing business requirements and different pathways to production. The MLC Manager provides flexibility with how you manage and automate portions of a model’s life cycle to meet the disparate needs across groups -- all in a central, governed location.
There are two core concepts to how ModelOp Center achieves enterprise-scale automation and repeatability: the MLC Manager and the MLC Process. The subsequent sections provide more detail on each, and then dive into several scenarios of how to leverage the MLC Manager and MLC Processes.
The MLC Manager is a low-code automation framework that executes, monitors, and manages MLC Processes. The MLC Manager is built on top of Camunda: a leading Java-based framework supporting Business Process Model and Notation (BPMN) for workflow and process automation. The MLC Manager is the answer to a number of obstacles faced by teams:
Reduces the time it takes to get a model from the model factory into production by defining a consistent methodology within your business to move the model through each required step, and track its progress throughout your organization.
Scales the functions necessary to manage the hundreds or thousands of models across the enterprise, controlling the most important tasks and processes for a variety of different models.
Incorporates visualization tools to display real-time status and availability of system processes and resources.
The MLC Process encodes and automates a set of steps in a model’s life cycle, which can range from model registration, to submitting models for full productionization, to continuous production testing, and eventual retirement. The MLC Manager executes and monitors each MLC Process, and automatically captures metadata and information about the model’s journey through the MLC Process.
An MLC Process can apply to an individual model or to a set of models that share common criteria, such as business unit, model language, or the model framework they employ. In either case, the MLC Process provides a consistent methodology for managing the various pathways of a model’s journey in an enterprise, across all models and all groups. This could include highly regulated models that are subject to strict government requirements, or rapid-deployment, internal-use-only models that require only a minimal process.
A typical ModelOp Center implementation will have more than one MLC Process. Each MLC Process is defined in any BPMN compliant editor, such as Camunda Modeler, as a BPMN file.
MLC Processes leverage the standard elements of a Camunda BPMN along with custom delegates that interface with ModelOp Center. This allows the flexibility to orchestrate complex operations within ModelOp Center. The common entities within an MLC Process include:
Signal events - events that initiate the MLC Process or trigger an action from within an MLC Process. These can be triggered by events such as a change to a model, or by a timer.
Tasks - there are a variety of tasks within an MLC Process:
User tasks - manual tasks for specific users to perform, such as approvals. These pause the progress of the workflow until completed.
External service calls - used to integrate and interact with other systems.
Script tasks - run custom code, including inline Groovy. Typically, you use variables and model metadata to determine parameters for calls to ModelOp Center.
ModelOp Center calls - specific calls to ModelOp Center that automate interactions with the model including Batch Jobs (see: Model Batch Jobs and Tests) and Model Deployments.
Gateway - decision logic gates that control the flow based on information in the process, such as model metadata, test results, etc.
The automated operations within an MLC Process include collecting key metrics to help calculate Key Performance Indicators (KPI), such as how long it takes to get Models into Business or get changes approved.
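As a rough illustration of this kind of KPI, the sketch below computes the elapsed time between life cycle events for a single model. The event names (`submitted`, `approved`, `deployed`) and the dictionary layout are hypothetical and chosen only for illustration; they are not ModelOp Center's actual metadata schema.

```python
from datetime import datetime, timedelta

# Hypothetical event log for one model: event name -> ISO-8601 timestamp.
# These names are illustrative assumptions, not a ModelOp Center schema.
events = {
    "submitted": "2024-03-01T09:00:00",
    "approved": "2024-03-03T15:30:00",
    "deployed": "2024-03-04T10:00:00",
}


def elapsed(events: dict[str, str], start: str, end: str) -> timedelta:
    """Return the time between two recorded life cycle events."""
    return datetime.fromisoformat(events[end]) - datetime.fromisoformat(events[start])


time_to_approval = elapsed(events, "submitted", "approved")
time_to_production = elapsed(events, "submitted", "deployed")
print(time_to_approval, time_to_production)
```

Aggregating these durations across many models (for example, taking a median per business unit) would give the enterprise-level figures described above.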
The MLC Manager provides flexibility with how you manage and automate the various life cycles of models across the enterprise. Each model in the enterprise can take a wide variety of paths to production, have different patterns for monitoring, and have various continuous improvement or retirement steps.
MLC Processes are often triggered by external events. Some typical examples of this include:
a model is marked ready for productionization
a time based event
new data arrives in a location
a notification is received
a manual intervention by a user
In fact, the ModelOp Command Center has several UI features that leverage MLC Processes under the hood, including submitting the model for productionization from the Model Details screen or executing a Batch Job from the Runtimes screen. MLC Processes can therefore automate many different parts of a model’s life cycle. The following sections describe the most typical of these processes in more detail.
MLC Processes can automate the productionization of a model, regardless of whether the path to production is simple or complex. For example, you can use an MLC Process to deploy a newly registered model into your QA runtime, run the model through a series of tests, trigger an automated security scan, and seek appropriate approvals before it is deployed into Production. MLC Processes can be created in a flexible manner to meet the needs of your team. They can be configured to automatically locate an available runtime that is compatible with the current model, or a specific group of runtimes can be targeted by tags. See this article for more details on how this is accomplished. The example in On Model Changed includes these deployment pieces.
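The tag-based targeting described above can be pictured with a small sketch. The `Runtime` class, its fields, and the tag values are hypothetical simplifications made for this example; the real lookup is performed by ModelOp Center and may apply additional compatibility rules.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Runtime:
    # Hypothetical, simplified view of a runtime; not the ModelOp Center data model.
    name: str
    tags: set[str] = field(default_factory=set)
    available: bool = True


def find_runtime(runtimes: list[Runtime], required_tags: set[str]) -> Optional[Runtime]:
    """Return the first available runtime that carries every required tag."""
    for runtime in runtimes:
        if runtime.available and required_tags <= runtime.tags:
            return runtime
    return None


runtimes = [
    Runtime("qa-1", {"qa", "python"}, available=False),
    Runtime("qa-2", {"qa", "python"}),
    Runtime("prod-1", {"production", "python"}),
]
target = find_runtime(runtimes, {"qa"})
print(target.name if target else "no compatible runtime")
```

Returning `None` rather than raising an error leaves it to the caller to decide what happens when no runtime matches, for instance by creating a review task.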
Model Refresh & Retraining
After the initial deployment, it’s important to have a way to rapidly retrain or trigger the refresh of a model to ensure it is performing optimally. Retraining can be automated within an MLC process to run on a schedule or when new labeled data is available. Using the same MLC process, the new candidate model can be compared against the current deployed model using a Champion/Challenger Model Comparison. Finally, the MLC Process can automate the steps required for Change Management including re-testing and approvals. The example On Model Changed demonstrates how you can build these operations into an MLC Process.
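A Champion/Challenger comparison ultimately reduces to a decision rule over test metrics. The sketch below shows one possible rule, assuming a single higher-is-better metric. The metric name `auc` and the `min_gain` margin are illustrative assumptions, not values prescribed by ModelOp Center.

```python
def pick_model(champion: dict[str, float], challenger: dict[str, float],
               metric: str = "auc", min_gain: float = 0.01) -> str:
    """Return "challenger" only if it beats the champion by at least min_gain.

    Assumes a higher metric value is better.
    """
    gain = challenger[metric] - champion[metric]
    return "challenger" if gain >= min_gain else "champion"


current = {"auc": 0.84}
candidate = {"auc": 0.86}
print(pick_model(current, candidate))
```

Requiring a minimum gain avoids replacing a deployed model over differences that are too small to matter, which keeps the Change Management steps from being triggered unnecessarily.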
Approval & Tasks
Throughout the MLC Process, you can include User Tasks to direct specific team members or roles to review and approve changes to the models. ModelOp Center integrates with existing IT task management systems, such as JIRA, and ticketing systems, such as ServiceNow, to incorporate these user-based tasks. For each of these externally created tasks or approvals, the MLC Process can inject model-specific metadata to provide context for the task or approval.
You can monitor models using MLC Processes by automatically running Model Batch Jobs and Tests on a model. You can run Batch Jobs on a schedule or based on new labeled or ground truth data becoming available. For example, you can run a Batch Metrics Job to calculate the statistical performance and/or determine if the model has started to produce ethically biased predictions, and then use decision criteria to determine which action to take. A common pattern is to generate an alert into ModelOp Center for the ModelOp Support Team to triage.
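The decision criteria mentioned above might look like the following sketch, which checks one performance metric and one bias metric and decides whether to raise an alert. The metric names, threshold values, and action labels are all hypothetical; in an MLC Process this logic would typically sit in a gateway or script task.

```python
# Hypothetical thresholds; real values come from the business requirements.
THRESHOLDS = {"min_f1": 0.80, "min_disparate_impact": 0.80}


def breached_checks(metrics: dict[str, float]) -> list[str]:
    """List which monitoring checks the latest metrics fail."""
    breaches = []
    if metrics["f1"] < THRESHOLDS["min_f1"]:
        breaches.append("performance")
    if metrics["disparate_impact"] < THRESHOLDS["min_disparate_impact"]:
        breaches.append("bias")
    return breaches


def choose_action(metrics: dict[str, float]) -> str:
    """Raise an alert if any check is breached, otherwise take no action."""
    return "raise_alert" if breached_checks(metrics) else "no_action"


latest = {"f1": 0.83, "disparate_impact": 0.72}
print(choose_action(latest), breached_checks(latest))
```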
This section describes some specific examples of MLC Processes in detail.
On Model Changed MLC Process
On Model Changed is an MLC Process that incorporates several of the patterns described earlier in a single MLC Process for managing the creation of a model, or changes to an existing model.
Model Submitted - a deployable model object has been created, which is a snapshot in time of the model with the typical goal of moving the model into production. Clicking Submit in the Model Details page triggers the start of this MLC process.
Training - based on a metadata flag, a Training Job can be initiated to train the model. A polling task automatically checks for the Training Job to finish before proceeding.
Testing - an automated, reproducible Metrics Job executes, and then the results are persisted in ModelOp Center.
Approval Based on Type - based on the metadata, you can route the model to the correct group to approve the changes. The details of the model, including all of the core information about the model, the changes to the model and the test results, are passed on to the reviewer.
Model Deployment - the MLC Process finds a ModelOp Runtime that is tagged with the appropriate tag, and attempts to deploy the model onto that runtime. If there are any errors during deployment, a task is created to review the error.
Error handling - when models are rejected, errors occur while running the tests, or the model fails to deploy, the process creates human tasks so that reviewers can examine the reasons for failure and take the appropriate actions.
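The control flow of the steps above can be sketched as a plain function. This is only a model of the branching, not the BPMN definition itself: the metadata keys (`needs_training`, `type`), the approver group names, and the trace labels are hypothetical, and the test, approval, and deployment steps are passed in as stand-in callables.

```python
from typing import Callable


def on_model_changed(
    model: dict,
    run_tests: Callable[[dict], bool],
    approve: Callable[[dict, str], bool],
    deploy: Callable[[dict], bool],
) -> list[str]:
    """Walk the steps in order and return a trace of what happened."""
    trace = ["submitted"]
    if model.get("needs_training"):
        # Stand-in for starting a Training Job and polling until it finishes.
        trace.append("trained")
    if not run_tests(model):
        return trace + ["review_task:test_failure"]
    trace.append("tested")
    # Route the approval to a group based on model metadata.
    group = "risk" if model.get("type") == "regulated" else "data_science"
    if not approve(model, group):
        return trace + ["review_task:rejected"]
    trace.append(f"approved_by:{group}")
    if not deploy(model):
        return trace + ["review_task:deploy_failure"]
    return trace + ["deployed"]


print(on_model_changed(
    {"needs_training": True, "type": "regulated"},
    run_tests=lambda m: True,
    approve=lambda m, g: True,
    deploy=lambda m: True,
))
```

Every failure path ends in a review task rather than an exception, mirroring the error-handling behavior described in the last step.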
Daily Backtesting MLC Process
The Daily Backtesting MLC Process is an example of a simple monitor that retrieves the exact version of a model that is currently deployed, and tests it against a new set of labeled data from the night before.
Start Event - a timed start event initiates the monitor every morning at a specific time.
Run Test - next, a Test Job evaluates the model with the new data.
Parse Test Results - parses the raw test results into a standardized format and captures the data within Model Manager for future review.
Exceeded Threshold - if the business threshold is not met, an alert is sent to the support team to investigate.
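The last two steps can be illustrated with a short sketch that parses raw test output into a standardized form and then applies a business threshold. The raw JSON layout, the metric name `accuracy`, and the `0.90` minimum are assumptions made for this example; the actual Test Job output format may differ.

```python
import json


def parse_test_results(raw: str) -> dict[str, float]:
    """Normalize raw JSON output into lower-case metric names with float values."""
    return {name.lower(): float(value) for name, value in json.loads(raw).items()}


def exceeded_threshold(results: dict[str, float],
                       metric: str = "accuracy", minimum: float = 0.90) -> bool:
    """Return True when the metric falls below the business minimum (alert is due)."""
    return results[metric] < minimum


# Hypothetical raw output with inconsistent casing and a string-typed number.
raw_output = '{"Accuracy": "0.87", "AUC": 0.91}'
results = parse_test_results(raw_output)
if exceeded_threshold(results):
    print("alert: accuracy below business threshold", results)
```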
Create & Deploy a New MLC Process Using Camunda Modeler
When the MLC Process is ready, select the Deploy icon to deploy it to the MLC Manager. Note: while the URL will be highly dependent on your environment’s exact setup, it likely uses a path such as http://<moc-base-url>/mlc-service/rest
4. Verify that your new MLC Process is registered with MLC Manager. Go to the Command Center and click the Models icon in the left column.
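Because the MLC Manager is built on Camunda, registration can likely also be checked over REST. The sketch below assumes that the REST base path used for deployment exposes Camunda's standard `GET /process-definition` endpoint in your environment, which may not hold for every installation, and that no extra authentication is needed. The host name and the process key `on_model_changed` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def definition_query_url(rest_base: str, process_key: str) -> str:
    """Build the URL that asks Camunda for the latest definition with this key."""
    query = urllib.parse.urlencode({"key": process_key, "latestVersion": "true"})
    return f"{rest_base.rstrip('/')}/process-definition?{query}"


def is_registered(definitions: list[dict], process_key: str) -> bool:
    """Check a decoded /process-definition response for the expected key."""
    return any(d.get("key") == process_key for d in definitions)


def fetch_definitions(url: str) -> list[dict]:
    """Perform the GET request; requires network access to the MLC Manager."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


# Hypothetical base URL and process key, for illustration only.
url = definition_query_url("http://moc.example.com/mlc-service/rest", "on_model_changed")
print(url)

# Example of the response shape Camunda returns (a JSON array of definitions).
sample_response = [{"id": "on_model_changed:3:abc", "key": "on_model_changed", "version": 3}]
print(is_registered(sample_response, "on_model_changed"))
```

In a live environment, `is_registered(fetch_definitions(url), "on_model_changed")` would combine the two steps; the sample payload is used here so the example runs without network access.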