Preparing a Model for the Executive Dashboard

Overview

When a model owner is ready to productionize a model, they should ensure that the model has all prerequisites in place so that it can be monitored by the “Dashboard model” and therefore visualized in the ModelOp Center home “Executive Dashboard.” For more background on the Executive Dashboard, see this article.

The objective of this article is to describe the typical steps that a model owner takes to prepare a model to appear as a row in the Executive Dashboard:

 

Note that a Model Owner can see a model-specific view of the main Executive Dashboard elements on the Business Model details page:

 

Pre-Requisites

To help a Model Owner verify that the model is ready for the Executive Dashboard, ModelOp provides an interactive Jupyter notebook. This notebook can be found here.

Please download and load this Jupyter notebook before proceeding, as it will be used throughout the remainder of the article.

Default Dashboard Included Monitors

As background, the default “Dashboard model” contains a number of individual monitors that are commonly used across models throughout the enterprise. While the “Dashboard model” can be customized to the needs of the business, the tables below provide details of the default monitors and the metrics calculated, as well as the inputs required.

These items are required for the model to be monitored by the “Dashboard model.” Please note that not all of the monitors and metrics below will apply to every model; the “Dashboard model” simply skips inapplicable items and outputs only the metrics that are relevant. The model owner should apply as many relevant monitors and metrics as possible.

Business KPI & Inferences

Metric: Business KPI

Description: The cumulative business value for the Business model.

Required Inputs for a Given Business Model:

  • Extended Schema: requires the following fields defined in the schema:

    • “positiveClassLabel”

    • “isAmountField”

    • “label_value”

    • “score”

  • Model Metadata: requires the following elements set in the “Details” tab of the Business Model:

    • TRUE POSITIVE RATE BASELINE METRIC

    • TRUE NEGATIVE RATE BASELINE METRIC

    • TRUE POSITIVE COST MULTIPLIER

    • TRUE NEGATIVE COST MULTIPLIER

    • FALSE POSITIVE COST MULTIPLIER

    • FALSE NEGATIVE COST MULTIPLIER

Metric for Evaluation (from Dashboard Model Test Result): actualROIAllTime

Metric: Daily Inferences

Description: The count of inferences processed by the given Business model over the period.

Required Inputs for a Given Business Model:

  • Comparator Data

Metric for Evaluation (from Dashboard Model Test Result): allVolumetricMonitorRecordCount
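The exact formula the Dashboard model uses to compute actualROIAllTime is not reproduced here. As a purely illustrative sketch (not the actual Dashboard model implementation), the cost multipliers above can be thought of as weights applied to each confusion-matrix outcome, with the field flagged by “isAmountField” supplying an optional per-record amount:

def illustrative_business_value(records, multipliers, positive_class_label=1):
    """Rough sketch only: weight each scored record by the configured cost
    multipliers. The real actualROIAllTime computation may differ."""
    total = 0.0
    for rec in records:
        actual = rec["label_value"]
        predicted = rec["score"]
        amount = rec.get("amount", 1.0)  # hypothetical amount column flagged by isAmountField
        if predicted == positive_class_label and actual == positive_class_label:
            total += multipliers["TRUE POSITIVE COST MULTIPLIER"] * amount
        elif predicted != positive_class_label and actual != positive_class_label:
            total += multipliers["TRUE NEGATIVE COST MULTIPLIER"] * amount
        elif predicted == positive_class_label and actual != positive_class_label:
            total += multipliers["FALSE POSITIVE COST MULTIPLIER"] * amount
        else:
            total += multipliers["FALSE NEGATIVE COST MULTIPLIER"] * amount
    return total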

Heatmap Monitors

Monitor name: Output integrity

Description: Determines that all input records received a corresponding output inference/score, using a unique identifier to ensure the model produced the appropriate output for all inputs.

Required Inputs for a Given Business Model:

  • Extended Schema: requires the following fields defined in the schema:

    • Role=”identifier” for one field

  • Comparator Data

Metric for evaluation: identifiers_match

Heatmap criteria:

  • identifiers_match = true → GREEN

  • identifiers_match = false → RED

  • identifiers_match is NULL or Monitor error → GRAY

 

Monitor name: Data drift

Description: Calculates the p-value from a Kolmogorov-Smirnov test for each feature and compares the maximum value against the thresholds (from the DMN) to determine the status.

Required Inputs for a Given Business Model:

  • Extended Schema: requires the following fields defined in the schema:

    • Role=”driftCandidate” for one or more fields

  • Comparator Data

  • Baseline Data

Metric for evaluation: max(<feature_1>: <p-value>, ..., <feature_n>: <p-value>), i.e. the maximum of the p-values across all features

Heatmap criteria:

  • max(p-value) < 0.05 → RED

  • 0.05 <= max(p-value) <= 0.1 → YELLOW

  • 0.1 < max(p-value) <= 1.0 → GREEN

  • max(p-value) IS NULL or test fails → GRAY

Monitor name: Concept drift

Description: Calculates the p-value from a Kolmogorov-Smirnov test for the output score column(s) and compares the maximum value against the thresholds (from the DMN) to determine the status.

Required Inputs for a Given Business Model:

  • Extended Schema: requires the following fields defined in the schema:

    • Role=”driftCandidate” for one or more fields

  • Comparator Data

  • Baseline Data

Metric for evaluation: max(<score_column>: <p-value>), i.e. the maximum of the p-values across the score columns (usually there is only one, but there can be multiple)

Heatmap criteria:

  • max(p-value) < 0.05 → RED

  • 0.05 <= max(p-value) <= 0.1 → YELLOW

  • 0.1 < max(p-value) <= 1.0 → GREEN

  • max(p-value) IS NULL or test fails → GRAY

Monitor name: Statistical performance

Description: Calculates the performance metrics (e.g. AUC or RMSE) for the model using ground truth and compares them against the thresholds (from the DMN) to determine the status.

Required Inputs for a Given Business Model:

  • Extended Schema: requires the following fields defined in the schema:

    • “score”

    • “label_value”

    • “positiveClassLabel” (for classification models)

  • Comparator Data

Metric for evaluation: <auc>

Heatmap criteria:

  • <auc> >= 0.7 → GREEN

  • 0.6 < <auc> < 0.7 → YELLOW

  • <auc> <= 0.6 → RED

  • <auc> IS NULL or test fails → GRAY

Monitor name: Characteristic Stability

Description: Calculates the characteristic stability index for each feature and compares the maximum value against the thresholds (from the DMN) to determine the status.

Required Inputs for a Given Business Model:

  • Extended Schema: requires the following fields defined in the schema:

    • Role=”driftCandidate” for one or more fields

    • (Optional) specialValues

  • Comparator Data

  • Baseline Data

Metric for evaluation: max(<feature_1>: <stability_index>, ..., <feature_n>: <stability_index>), i.e. the maximum of the stability indexes across all features

Heatmap criteria:

  • max(stability_index) >= 0.2 → RED

  • 0.1 < max(stability_index) < 0.2 → YELLOW

  • max(stability_index) <= 0.1 → GREEN

  • max(stability_index) IS NULL or test fails → GRAY

Monitor name: Ethical Fairness

Description: Calculates the maximum and minimum proportional parity across the protected classes and compares the max and min values against the thresholds (from the DMN) to determine the status.

Required Inputs for a Given Business Model:

  • Extended Schema: requires the following fields defined in the schema:

    • protectedClass

  • Comparator Data

Metric for evaluation: max(ppr_disparity) and min(ppr_disparity) across all protected classes

Heatmap criteria:

  • max(ppr_disparity) > 1.2 or min(ppr_disparity) < 0.8 → RED

  • max(ppr_disparity) <= 1.2 and min(ppr_disparity) >= 0.8 → GREEN
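The thresholds above come from the Dashboard model's DMN decision table (see Testing and Configuring Thresholds below). As a plain-Python illustration of what the drift and stability criteria encode, assuming nothing beyond the thresholds listed in the tables above:

def drift_status(max_p_value):
    """Map the maximum drift p-value to a heatmap status per the table above."""
    if max_p_value is None:
        return "GRAY"          # missing value or monitor error
    if max_p_value < 0.05:
        return "RED"
    if max_p_value <= 0.1:
        return "YELLOW"
    return "GREEN"

def stability_status(max_stability_index):
    """Map the maximum characteristic stability index to a heatmap status."""
    if max_stability_index is None:
        return "GRAY"
    if max_stability_index >= 0.2:
        return "RED"
    if max_stability_index > 0.1:
        return "YELLOW"
    return "GREEN"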

 

 

Dashboard Model Jupyter Notebook Guide

Credential Setup

The Jupyter Notebook begins by ensuring it can connect to the ModelOp Center environment. The first step is to set the ModelOp Center URL (moc_url in the image below) as well as the S3 access information. The example credentials are for demonstration purposes only. Running the cell below will produce a UI for editing the URL and credentials. Modify the values as needed and click the Update Values button. If other credentials are required to access specific data systems for the Model’s assets, update the cell as needed. This data may reside in any supported datastore, such as a SQL database/data warehouse, S3 object store, HDFS, etc.
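For reference, a minimal sketch of what such a credential-editing cell can look like is shown below. The widget names, default values, and the settings dictionary are placeholders, not the notebook's actual variable names:

import ipywidgets as widgets
from IPython.display import display

# Hypothetical recreation of the credential-editing cell; values are placeholders.
moc_url_w = widgets.Text(value="https://modelop-center.example.com", description="moc_url")
s3_key_w = widgets.Text(value="EXAMPLE_ACCESS_KEY", description="s3 key")
s3_secret_w = widgets.Password(value="EXAMPLE_SECRET_KEY", description="s3 secret")
update_btn = widgets.Button(description="Update Values")

settings = {}

def on_update(_):
    # Persist the edited values so later cells can use them
    settings.update(
        moc_url=moc_url_w.value,
        s3_access_key=s3_key_w.value,
        s3_secret_key=s3_secret_w.value,
    )

update_btn.on_click(on_update)
display(moc_url_w, s3_key_w, s3_secret_w, update_btn)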

 

 

The following two cells ensure a valid connection to ModelOp Center. If running in a secured environment, the notebook will open a pop-up window to the OAuth provider's sign-in page. Enter credentials there so that subsequent requests have the required authorization to access the model data. The pop-up window will close automatically once valid credentials are entered.

Choose Target Model

In the Choose Target Model section, the notebook guides the user to choose a model to test the dashboard model against. The notebook calls out to ModelOp Center and retrieves a list of model candidates. Select the model to test against.

Run the cell in the Load Target Model section to retrieve information about the model selected. The notebook will display a link back to the model in ModelOp Center.
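Behind the scenes, this is a REST lookup against ModelOp Center. The sketch below is only a conceptual illustration; the endpoint path, authentication header, and response shape are assumptions rather than the documented API, and the notebook already performs this step for you:

import requests

def list_model_candidates(moc_url, token):
    # Hypothetical endpoint path; the real ModelOp Center API route may differ.
    resp = requests.get(
        f"{moc_url}/api/businessModels",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # assumed to be a list of business model summaries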

Check for Valid Schema

Run the cell in the Check for valid schema section to verify that the model contains a valid schema. The notebook provides a link back to the schema.

If the schema is invalid, the notebook generates a message like the one below. The schema can be modified and the model reloaded to iterate on this check.
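Conceptually, this check confirms that the extended schema declares the fields and roles listed in the monitor tables above. A simplified sketch follows, assuming the schema has been loaded as a list of field dictionaries with “name” and “role” keys; the notebook's actual validation is more thorough:

# Required roles and fields drawn from the default monitor tables above.
REQUIRED_ROLES = {"identifier", "driftCandidate"}
REQUIRED_FIELDS = {"score", "label_value"}

def check_schema(schema_fields):
    roles = {f.get("role") for f in schema_fields}
    names = {f.get("name") for f in schema_fields}
    problems = []
    for role in REQUIRED_ROLES - roles:
        problems.append(f"no field carries role '{role}'")
    for name in REQUIRED_FIELDS - names:
        problems.append(f"schema is missing field '{name}'")
    return problems  # an empty list means this basic check passes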

Preparing Assets

The next step is preparing the data assets; the notebook guides the user through this process. In the Examine model assets section, run the first cell. If the model has a data asset with the role TRAINING_DATA that should be used as BASELINE_DATA, check the checkbox.

Run the following cell to generate two tables. The first table indicates the assets needed to run the dashboard model. The second table shows the assets available for use on the model. The matching asset roles are highlighted in yellow to indicate which assets will be used by the dashboard model. The link to the assets can be used to make modifications in ModelOp Center.

If the needed assets cannot be found, a message like the following will appear indicating the error(s). Use the link to modify the model assets as needed.

Once assets are properly configured, run the subsequent two cells under Validate the Baseline Data and Validate the Comparator Data to validate that the notebook can retrieve the assets for BASELINE_DATA and COMPARATOR_DATA, and that the data looks as expected.
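Conceptually, the asset check matches the roles required by the Dashboard model against the roles of the data assets registered on the target model, honoring the TRAINING_DATA-as-BASELINE_DATA checkbox described above. A simplified sketch, where the asset dictionaries and the "assetRole" key are assumptions about the notebook's internal representation:

REQUIRED_ASSET_ROLES = {"BASELINE_DATA", "COMPARATOR_DATA"}

def missing_asset_roles(model_assets, use_training_as_baseline=False):
    # model_assets is assumed to be a list of dicts with an "assetRole" key
    roles = {a.get("assetRole") for a in model_assets}
    if use_training_as_baseline and "TRAINING_DATA" in roles:
        roles.add("BASELINE_DATA")  # mirror the checkbox: reuse training data as the baseline
    return REQUIRED_ASSET_ROLES - roles  # empty set means all required data assets are present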

Testing the Dashboard Model

In the cell below Define Dashboard Model, provide the dashboard model code (or use the example provided). Run the cell after making any updates, and then run the subsequent cells under Generate Example Job, Run Model Init Function, and Run Model Metrics Function.

The final cell will produce the results of running the Dashboard model against the target model. The results include logs, a heat map (if applicable), and the raw JSON Model Test Result.
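A Dashboard-style monitor typically follows the common ModelOp pattern of functions annotated with # modelop.init and # modelop.metrics comments, where the metrics function yields a single dictionary that becomes the Model Test Result. The skeleton below is only an outline of that structure, not the example model shipped with the notebook, and the metrics signature assumes one baseline and one comparator dataframe are attached to the job:

# Skeleton only; the example Dashboard model provided in the notebook is more complete.

# modelop.init
def init(job_json):
    # ModelOp Center supplies job metadata here (exact contents are an assumption)
    global JOB
    JOB = job_json

# modelop.metrics
def metrics(baseline_df, comparator_df):
    # Compute whichever monitors apply to the target model, then yield one
    # dictionary; its keys become the metrics surfaced on the Executive Dashboard.
    yield {"allVolumetricMonitorRecordCount": len(comparator_df)}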

Testing and Configuring Thresholds

The thresholds for the heat map are configured in dashboard_model.dmn and evaluated by the dashboard monitor via evaluated_results = mlc_api.evaluate_results(monitor_results, "dashboard_model.dmn"). To test with different threshold values, modify the DMN values and re-run the cells from that point forward.