This article provides background on the different types of Batch Jobs in ModelOp Center and explains how to operationalize models for batch scoring.

Table of Contents

Batch Job Overview

Batch Job Types

A Batch Job is an execution of a model against a set of input records for scoring or for testing. You can build Batch Jobs into an MLC Process, execute a job manually from the Command Center, or run a Batch Job from the command line. The main types of Batch Jobs are detailed in the following table.

| Type | Description |
| --- | --- |
| Scoring Job | Executes the Scoring Function to yield predictions for each of the records in the input data. This can be used for testing or for production batch scoring jobs. |
| Metrics Job | Executes the Metrics Function against labeled test data. This yields efficacy metrics and/or bias detection and interpretability metrics. |
| Training Job | Executes the Training Function to train or re-train a model. The output is typically a trained model artifact or other type of attachment. |
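To make the distinction between training and scoring concrete, the following is a minimal Python sketch of the kind of functions a Training Job and a Scoring Job might invoke. The function names (`train`, `score`), the threshold "model", and the record layout are all hypothetical and chosen only for illustration; they are not ModelOp Center's actual interface.

```python
from typing import Dict, Iterable, Iterator


def train(records: Iterable[Dict]) -> Dict:
    """Training Function sketch: learn a threshold and return it as the model artifact.

    Assumes each record has a numeric "amount" and a binary "label" (hypothetical layout),
    and that both classes appear at least once.
    """
    rows = list(records)
    positives = [r["amount"] for r in rows if r["label"] == 1]
    negatives = [r["amount"] for r in rows if r["label"] == 0]
    # Use the midpoint between the two class means as the decision threshold.
    pos_mean = sum(positives) / len(positives)
    neg_mean = sum(negatives) / len(negatives)
    return {"threshold": (pos_mean + neg_mean) / 2}


def score(model: Dict, records: Iterable[Dict]) -> Iterator[Dict]:
    """Scoring Function sketch: yield one prediction per input record."""
    for r in records:
        yield {**r, "prediction": int(r["amount"] > model["threshold"])}


if __name__ == "__main__":
    labeled = [
        {"amount": 10.0, "label": 0},
        {"amount": 20.0, "label": 0},
        {"amount": 80.0, "label": 1},
        {"amount": 90.0, "label": 1},
    ]
    model = train(labeled)
    for row in score(model, [{"amount": 15.0}, {"amount": 85.0}]):
        print(row)
```

In this sketch the training step produces a small dictionary standing in for the trained model artifact, and the scoring step consumes that artifact to produce one output record per input record.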

Batch Job Scenarios

You can use Batch Jobs for several different scenarios in a Model’s Life Cycle as detailed in the following table.

| Scenario | Job Type | Description |
| --- | --- | --- |
| Testing a Model | Scoring Job | Use the Scoring Job to score test data so that you can conduct functional, performance, or system testing of the model execution code. You can enable or disable schema checking during this test. |
| Model Back-Test/Evaluation | Metrics Job | The Metrics Job executes the Metrics Function against labeled test data to generate evaluation metrics such as F1, a Confusion Matrix, ROC Curve, AUC, etc. See Model Efficacy Metrics and Monitoring for more information. |
| Ethical Fairness Detection | Metrics Job | Use the Metrics Job to run the Metrics Function against labeled data to generate metrics that detect ethical fairness. See Model Governance: Bias & Interpretability for more information. |
| Re-Training/Refresh | Training Job | When new labeled data is available, use a Training Job to create a new trained model artifact. This can be automated in an MLC Process. See Model Lifecycle Manager: Automation for more information. |
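As an illustration of the back-test scenario, the sketch below computes a binary confusion matrix and F1 score from labeled, already-scored records. The function name `metrics` and the record fields `label` and `prediction` are assumptions made for this example, not names defined by ModelOp Center.

```python
from typing import Dict, Iterable


def metrics(records: Iterable[Dict]) -> Dict:
    """Metrics Function sketch: confusion matrix and F1 for binary labels.

    Assumes each record carries a ground-truth "label" and a model "prediction",
    both encoded as 0 or 1 (hypothetical field names).
    """
    tp = fp = fn = tn = 0
    for r in records:
        if r["prediction"] == 1 and r["label"] == 1:
            tp += 1
        elif r["prediction"] == 1 and r["label"] == 0:
            fp += 1
        elif r["prediction"] == 0 and r["label"] == 1:
            fn += 1
        else:
            tn += 1

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {
        "confusion_matrix": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    scored = [
        {"label": 1, "prediction": 1},
        {"label": 1, "prediction": 0},
        {"label": 0, "prediction": 0},
        {"label": 0, "prediction": 1},
        {"label": 1, "prediction": 1},
    ]
    print(metrics(scored))
```

Metrics such as ROC Curve and AUC need predicted scores rather than hard 0/1 predictions, so they are left out of this sketch.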

Input & Output Data Sets for Batch Jobs

Batch Jobs require input data in order to run. You can upload your input data, provide a reference to an S3 or HDFS location for your data, or specify a SQL statement to define your input data set. These options are available via the ModelOp Center UI, MLC, CLI, or API.
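The input options above (an uploaded file, an S3 or HDFS reference, or a SQL statement) can be told apart programmatically when you prepare jobs in your own tooling. The helper below is a hypothetical sketch: the single "spec" string and the function name `classify_input_source` are assumptions for illustration and do not correspond to a ModelOp Center API.

```python
from urllib.parse import urlparse


def classify_input_source(spec: str) -> str:
    """Return "s3", "hdfs", "sql", or "upload" for a hypothetical input spec string."""
    scheme = urlparse(spec).scheme.lower()
    if scheme in ("s3", "hdfs"):
        # A URL-style reference to externally stored data.
        return scheme
    if spec.lstrip().lower().startswith("select "):
        # Treat anything that begins with SELECT as a SQL statement.
        return "sql"
    # Anything else is assumed to be a local file to upload.
    return "upload"


if __name__ == "__main__":
    for spec in (
        "s3://example-bucket/data/input.json",
        "hdfs://namenode:8020/data/input.json",
        "SELECT * FROM scoring_input",
        "input.json",
    ):
        print(spec, "->", classify_input_source(spec))
```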

Manually Create a Batch Job in the Command Center

  1. Click Jobs in the left column.

  2. Click Create a New Batch Job. A list of models appears.

  3. Select the model for which you want to run a batch job.

  4. Select the specific snapshot (version) for your batch job.

  5. Propose a Runtime for your batch job, or allow an MLC to find the best runtime for the job.

  6. Choose the type of Job (Training, Scoring, or Metrics).

  7. Optionally, choose to use input and/or output schemas for the job.

  8. Add Input Assets by uploading a file, providing a file reference to an S3 or HDFS location, or providing a SQL statement.

  9. Specify the Output Asset(s) by providing a file reference to an S3 or HDFS location, providing a SQL statement, or choosing to embed the output.

  10. Review the Job information and select “Run Job” when ready.

  • The Job Details screen displays the status of the job as it runs.

Create a Batch Job from the CLI

  1. Install the ModelOp CLI if it is not already installed. See the ModelOp CLI Reference for install instructions.

  2. Type moc job create [batchjob | testjob | trainingjob] <deployable-or-deployed-model-uuid> input.json output.json, where:

    1. batchjob is a Scoring Job as described earlier in this article

    2. testjob is a Metrics Job as described earlier in this article

    3. trainingjob is a Training Job as described earlier in this article

    4. deployable-or-deployed-model-uuid is the uuid of a model already registered with ModelOp Center (see the ModelOp CLI Reference for how to find these uuids)

    5. input.json is the name of the data set to run the job against

    6. output.json is the name of the output file

  3. Type moc job result <uuid>, where <uuid> is the unique identifier generated by the command in the previous step. If the output file is embedded, the results are displayed in the terminal. If the job uses an external file asset (S3) for the output, the result is a link to the S3 object where the results are placed.
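If you script these CLI steps, it can help to build the command lines in one place. The sketch below only assembles the argument lists for the two commands described above; the helper names (`build_create_command`, `build_result_command`) and the friendly job-type keys are hypothetical, and the code does not call the CLI itself.

```python
import shlex
from typing import List

# Maps the job types described in this article to the CLI keywords shown above.
JOB_TYPES = {
    "scoring": "batchjob",
    "metrics": "testjob",
    "training": "trainingjob",
}


def build_create_command(job_type: str, model_uuid: str,
                         input_file: str, output_file: str) -> List[str]:
    """Build the argument list for: moc job create <type> <model-uuid> <input> <output>."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"unknown job type: {job_type!r}")
    return ["moc", "job", "create", JOB_TYPES[job_type],
            model_uuid, input_file, output_file]


def build_result_command(job_uuid: str) -> List[str]:
    """Build the argument list for: moc job result <uuid>."""
    return ["moc", "job", "result", job_uuid]


if __name__ == "__main__":
    # Placeholder uuid: substitute the uuid of a model registered with ModelOp Center.
    cmd = build_create_command(
        "scoring", "<deployable-or-deployed-model-uuid>", "input.json", "output.json"
    )
    print(shlex.join(cmd))
```

With the ModelOp CLI installed, a list built this way could be passed to `subprocess.run(cmd, check=True)`. The format in which the CLI prints the new job's uuid is not covered here, so extracting it for `moc job result` is left to the reader.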