This article describes how to use the ModelOp Command Center as the central station for monitoring your models. It also describes the Model Lifecycle (MLC) processes that generate alerts, and how to respond to the alerts, tasks, and notifications that the MLC reports. The primary audience for this article is the ModelOps Support Team.
ModelOp Center enables comprehensive monitoring of a deployed model through several mechanisms:
Runtime monitoring
Backtest metrics monitoring
Alerting & notifications
Runtime Monitoring
To see how your model is performing, open the detailed, real-time view of the Runtime information for the deployed model. This view includes live monitors for the infrastructure, data throughput, model logs, and lineage.
To open Runtime monitoring, navigate to the Runtime where the model is deployed: Runtimes → Runtime Dashboard → <Runtime where your model is deployed>
The Runtime monitor displays the following information about the Runtime environment:
Endpoint throughput - volume of data through the deployed model
CPU Utilization - User CPU utilization and Kernel CPU usage
System Resource Usage - real-time memory usage
Lineage of the deployment - MLC Process metadata that details the deployment information and history
Logs - A live scroll of the model logs
Backtest Metrics Monitoring
While some models may allow for inline efficacy monitoring, most models do not obtain ground truth until a future date, which necessitates the use of regular backtesting. ModelOp Center allows you to define metrics functions that can be used to execute regular backtests. An MLC process can automate the regular execution of a backtest to compute statistical metrics. See Model Efficacy Metrics and Monitoring for more details on defining and executing backtesting metrics.
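As an illustration of what a backtest metrics function might compute, the following is a minimal sketch in plain Python. It assumes a binary classifier whose records carry a "prediction" field and a "label" field holding the ground truth that arrived after scoring. The function name `backtest_metrics` and the record layout are hypothetical and do not represent the ModelOp Center metrics interface.

```python
from typing import Dict, List


def backtest_metrics(records: List[Dict[str, int]]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall from labeled backtest records.

    Assumption: each record holds a binary "prediction" and the binary
    "label" (ground truth) that became available after scoring.
    """
    tp = sum(1 for r in records if r["prediction"] == 1 and r["label"] == 1)
    tn = sum(1 for r in records if r["prediction"] == 0 and r["label"] == 0)
    fp = sum(1 for r in records if r["prediction"] == 1 and r["label"] == 0)
    fn = sum(1 for r in records if r["prediction"] == 0 and r["label"] == 1)
    total = len(records)
    # Guard every ratio against an empty denominator.
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical sample: one record of each outcome (TP, FP, TN, FN).
sample = [
    {"prediction": 1, "label": 1},
    {"prediction": 1, "label": 0},
    {"prediction": 0, "label": 0},
    {"prediction": 0, "label": 1},
]
result = backtest_metrics(sample)
```

A scheduled process could call a function of this shape on each new batch of labeled data and store the returned values for comparison over time.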
Alerting & Notifications
Alerts, tasks, and notifications provide visibility into the information produced by model monitoring and the actions that need to be taken as a result. These “messages” are surfaced through the Command Center Dashboard and can also be tied into enterprise ticketing and alerting systems.
The types of messages generated from Model Monitoring include:
Alerts - test failures, model errors, and other situations that require a response.
For details about viewing and responding to test failures, see Responding to Notifications, Tasks & Alerts on this page.
Tasks - user tasks such as approve a model, acknowledge a failed test, etc.
For details about viewing and responding to tasks, see Responding to Notifications, Tasks & Alerts on this page.
Notifications - includes system status, runtime status and errors, model errors, and other information generated by ModelOp Center automatically.
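To make the distinction between the three message types concrete, here is a small sketch of how monitoring results could be turned into messages. It assumes that a metric falling below a configured minimum counts as a failed test, which raises an alert plus a task to acknowledge it, and that a missing metric is reported as a notification. The `Message` class, the `messages_from_metrics` function, and these rules are hypothetical; they are not taken from ModelOp Center.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Message:
    kind: str  # "alert", "task", or "notification"
    text: str


def messages_from_metrics(
    metrics: Dict[str, float], minimums: Dict[str, float]
) -> List[Message]:
    """Compare reported metrics with minimum thresholds and emit messages."""
    messages: List[Message] = []
    for name, minimum in minimums.items():
        value = metrics.get(name)
        if value is None:
            # Assumption: a missing metric is informational only.
            messages.append(
                Message("notification", f"metric '{name}' was not reported")
            )
        elif value < minimum:
            # A failed test produces an alert and a task to acknowledge it.
            messages.append(
                Message("alert", f"{name}={value:.3f} is below minimum {minimum:.3f}")
            )
            messages.append(
                Message("task", f"acknowledge failed test for '{name}'")
            )
    return messages


# Hypothetical values: accuracy fails its threshold, recall is missing.
generated = messages_from_metrics(
    {"accuracy": 0.70}, {"accuracy": 0.80, "recall": 0.50}
)
kinds = [m.kind for m in generated]
```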
Responding to Notifications, Tasks & Alerts
ModelOp Center integrates with the ticketing system of your choice, such as ServiceNow or Jira. Model Lifecycle processes can be configured to generate alerts on these events and assign them to the appropriate party for further investigation.
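The routing idea described above can be sketched as a function that turns an alert into a generic ticket and picks an assignee. The field names, the `build_ticket` function, and the routing table are hypothetical and do not correspond to the ServiceNow or Jira APIs; a real integration would map this dictionary onto the fields the chosen ticketing system expects.

```python
from typing import Dict


def build_ticket(
    model_name: str, alert_text: str, assignees: Dict[str, str]
) -> Dict[str, str]:
    """Build a generic ticket for an alert and assign it to a party.

    Assumption: `assignees` maps model names to owning teams and always
    contains a "default" entry used when a model has no specific owner.
    """
    assignee = assignees.get(model_name, assignees["default"])
    return {
        "summary": f"[Model alert] {model_name}",
        "description": alert_text,
        "assignee": assignee,
    }


# Hypothetical routing table and alert.
routing = {"default": "modelops-support", "churn-model": "churn-data-science"}
ticket = build_ticket("churn-model", "accuracy is below the minimum", routing)
fallback = build_ticket("fraud-model", "model error during scoring", routing)
```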