Operational Monitoring
This article describes how to use the ModelOp Command Center to enable operational monitoring, focused on ensuring that models are available and running within SLAs on the target runtimes.
Introduction
Operational performance monitors include:
Runtime Monitoring:
Model availability and SLA performance
Data throughput and latency with inference execution
Model Data Monitoring:
Input (and output) data adherence to the model's defined schema
Runtime Monitoring
To get real-time insight into how your model is performing, you can click into a detailed, real-time view of the Runtime information for the deployed model. This includes real-time monitors about the infrastructure, data throughput, model logs, and lineage, where available.
To see Runtime Monitoring, navigate to the runtime where your model is deployed: Runtimes → Runtime Dashboard → <Runtime where your model is deployed>
The Runtime monitor displays the following information about the Runtime environment:
Endpoint throughput - volume of data through the deployed model
CPU Utilization - User CPU utilization and Kernel CPU usage
System Resource Usage - real-time memory usage
Lineage of the deployment - MLC Process metadata that details the deployment information and history
Logs - A live scroll of the model logs
Model Data Monitoring
While not required, ModelOp Center provides its own runtime out of the box, which can validate the model's incoming and outgoing data for adherence to a defined schema. The schema describes the structure that the model expects, ensuring that erroneous data is not accidentally processed by the model and does not cause model stability errors or downtime.
Overview
ModelOp Center enforces strict typing of engine inputs and outputs at two levels: stream input/output, and model input/output. Types are declared using AVRO schema.
To support this functionality, ModelOp Center’s Model-Manage maintains a database of named AVRO schemas. Python and R models must then reference their input and output schemas using smart comments. (PrettyPFA and PFA models instead explicitly include their AVRO types as part of the model format.) Stream descriptors may either reference a named schema from Model Manage, or they may explicitly declare schemas.
In either case, ModelOp Center performs the following type checks:
Before starting a job: the input stream’s schema is checked for compatibility against the model’s input schema, and the output stream’s schema is checked for compatibility against the model’s output schema.
When incoming data is received: the incoming data is checked against the input schemas of the stream and model.
When an output is produced by the model: the outgoing data is checked against the model's and the stream's output schemas.
Failures of any of these checks are reported: schema incompatibilities between the model and the input or output streams will produce an error, and the engine will not run the job. Input or output records that are rejected due to schema incompatibility appear as messages in the ModelOp runtime logs.
Examples
The following model takes in a record with three fields (name, x, and y), and returns the product of the two numbers.
# modelop.schema.0: input_schema.avsc
# modelop.schema.1: output_schema.avsc

def action(datum):
    # Read the three input fields from the incoming record
    my_name = datum['name']
    x = datum['x']
    y = datum['y']
    # Emit a record containing the name and the product of the two numbers
    yield {'name': my_name, 'product': x * y}
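For a quick local sanity check, the action function can be exercised directly. It is a Python generator, so results are read by iterating over it; the sample record below is purely illustrative:

# Illustrative local test of the action function with a sample record.
sample = {'name': 'sample', 'x': 2.0, 'y': 3.0}
for result in action(sample):
    print(result)  # {'name': 'sample', 'product': 6.0}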
The corresponding input and output AVRO schema are:
{
  "type": "record",
  "name": "input",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "x", "type": "double"},
    {"name": "y", "type": "double"}
  ]
}
and
{
  "type": "record",
  "name": "output",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "product", "type": "double"}
  ]
}
So, for example, this model may take as input the JSON record
{"name": "sample", "x": 2.0, "y": 3.0}
and score this record to produce
{"name": "sample", "product": 6.0}
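The same record-level validation performed by the runtime can be approximated locally as a sanity test. The sketch below is illustrative only: it uses the third-party fastavro library, which is not ModelOp Center's internal mechanism, and the record values are the made-up examples from above.

# Illustrative only: validate the example records against the input AVRO schema
# using the third-party fastavro library (not ModelOp Center's own mechanism).
from fastavro import parse_schema
from fastavro.validation import validate

input_schema = parse_schema({
    "type": "record",
    "name": "input",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "x", "type": "double"},
        {"name": "y", "type": "double"},
    ],
})

good_record = {"name": "sample", "x": 2.0, "y": 3.0}
bad_record = {"name": "sample", "x": "not-a-number", "y": 3.0}

print(validate(good_record, input_schema, raise_errors=False))  # True
print(validate(bad_record, input_schema, raise_errors=False))   # False: "x" is not a double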
Note that in the model's smart comments, the CLI commands, and the stream descriptor schema references alike, schemas are referenced by their name in Model Manage, not by filename or any other property.
Next Article: Drift Monitoring >