Triggering Batch Scoring Jobs

Follow these steps to run a Batch Scoring Job on a Batch Deployed Model.

Pre-requisites:

Deploy a Model as Batch

Prior to running a Batch Scoring job, you should have a Model Deployed as Batch. To do so, please refer to the section https://modelopdocs.atlassian.net/wiki/spaces/dv25/pages/1655341915/Operationalizing+Models%3A+Batch#Operationalize-a-Model---Batch-Deployment-in-a-ModelOp-Runtime.

Prepare Runtimes

Identify the target Runtimes across the requisite Environments.

Note: this step may already have been completed as part of the pre-requisites. Runtime matching also happens at Job scheduling, so the engine must find a matching Runtime when the Job is created, not only at deployment time.

For each target Runtime, complete the following:

  1. Add “Environment/Stage Tags”. Based on the environments/stages required (see pre-requisites), add the necessary “environment/stage tag” to the runtime.

    1. Example: add a “DEV” tag to the Runtime in their development environment, an “SIT” tag to the Runtime in their SIT environment, a “UAT” tag to the Runtime in their UAT environment, and ultimately a “PROD” tag to the Runtime in their Prod environment

  2. Add “Model Service Tags”. The Model “Service” tag will be used to identify that this specific runtime is designed to be a target runtime for that particular model. Add the appropriate “Model Service Tag” to the runtime.

    1. Example: for a 3rd party credit card model, add a “cc-fraud” Model Service Tag to the “Dev”, “SIT”, “UAT”, and “Prod” runtimes.
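
As a rough mental model of how the two tags work together, the sketch below filters a list of runtimes down to those that carry both a stage tag and a model service tag. The `Runtime` class, the runtime names, and the rule that both tags must be present are illustrative assumptions, not the engine's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Runtime:
    """Hypothetical stand-in for a runtime and the tags attached to it."""
    name: str
    tags: set = field(default_factory=set)


def matching_runtimes(runtimes, stage_tag, service_tag):
    """Return the runtimes that carry both the stage tag and the service tag.

    Assumption: a runtime is a candidate only when both tags are present.
    """
    wanted = {stage_tag, service_tag}
    return [r for r in runtimes if wanted <= r.tags]


# Hypothetical runtimes tagged as in the examples above.
runtimes = [
    Runtime("runtime-dev", {"DEV", "cc-fraud"}),
    Runtime("runtime-prod", {"PROD", "cc-fraud"}),
    Runtime("runtime-other", {"PROD"}),
]

prod_targets = matching_runtimes(runtimes, "PROD", "cc-fraud")
print([r.name for r in prod_targets])
```

A runtime tagged only “PROD” is not selected here, which is why both tags are added in the steps above.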

Running a Batch Job with MLC

Trigger Job Creation:

Launch MLC via REST API

Example request:

curl --request POST 'http://gateway/mlc-service/rest/signalResponsive' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {{token}}' \
  --data-raw '{
    "name": "com.modelop.mlc.definitions.Signals_DEPLOYED_BATCH_JOB",
    "variables": {
      "TAG": { "value": "model-service-tag" },
      "MODEL_STAGE": { "value": "PROD" }
    }
  }'

 

Example request body that also passes input assets:

{
  "name": "com.modelop.mlc.definitions.Signals_DEPLOYED_BATCH_JOB",
  "variables": {
    "TAG": { "value": "model-service-tag" },
    "MODEL_STAGE": { "value": "PROD" },
    "INPUT_ASSETS": {
      "value": "[{\"name\": \"input_data.json\",\"assetType\": \"EXTERNAL_FILE\",\"repositoryInfo\": {\"repositoryType\": \"S3_REPOSITORY\",\"secure\": false,\"host\": \"modelop\",\"port\": 9000,\"region\": \"default-region\"},\"fileUrl\": \"http://modelop:9000/modelop/input_data.json\",\"filename\": \"input_data.json\",\"fileFormat\":\"JSON\"}]",
      "type": "Object",
      "valueInfo": {
        "objectTypeName": "java.util.ArrayList<com.modelop.sdk.dataobjects.v2.assets.ExternalFileAsset>",
        "serializationDataFormat": "application/json"
      }
    }
  }
}

Notice that the INPUT_ASSETS value is an escaped string holding the serialized asset list, and that valueInfo carries the serialization information.
For more information, see “Variables in the REST API” in the Camunda docs.
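
Escaping the asset list by hand is error-prone, so the following is a minimal sketch of building the same body programmatically. It serializes the asset list to a string first and then serializes the whole signal, which produces the escaped value shown above. The helper names `build_signal` and `build_request` are hypothetical, and `http://gateway` and the token are placeholders as in the curl example.

```python
import json
import urllib.request

# Asset description copied from the example body above.
input_assets = [{
    "name": "input_data.json",
    "assetType": "EXTERNAL_FILE",
    "repositoryInfo": {
        "repositoryType": "S3_REPOSITORY",
        "secure": False,
        "host": "modelop",
        "port": 9000,
        "region": "default-region",
    },
    "fileUrl": "http://modelop:9000/modelop/input_data.json",
    "filename": "input_data.json",
    "fileFormat": "JSON",
}]


def build_signal(tag, stage, assets):
    """Build the signal body; the asset list is embedded as a JSON string."""
    return {
        "name": "com.modelop.mlc.definitions.Signals_DEPLOYED_BATCH_JOB",
        "variables": {
            "TAG": {"value": tag},
            "MODEL_STAGE": {"value": stage},
            "INPUT_ASSETS": {
                # A string, not a list: it gets escaped when the body is dumped.
                "value": json.dumps(assets),
                "type": "Object",
                "valueInfo": {
                    "objectTypeName": "java.util.ArrayList<com.modelop.sdk.dataobjects.v2.assets.ExternalFileAsset>",
                    "serializationDataFormat": "application/json",
                },
            },
        },
    }


def build_request(signal, token, base_url="http://gateway"):
    """Prepare (but do not send) the POST to the signalResponsive endpoint."""
    return urllib.request.Request(
        f"{base_url}/mlc-service/rest/signalResponsive",
        data=json.dumps(signal).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


signal = build_signal("model-service-tag", "PROD", input_assets)
request = build_request(signal, token="<token>")
# Sending is left to the caller, e.g. urllib.request.urlopen(request)
```

Serializing twice (assets first, then the full body) is what yields the backslash-escaped quotes inside the INPUT_ASSETS value.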

Launch MLC via MOC CLI

  • Make sure to have the MOC CLI installed.

  • Create a JSON file with a structure similar to the one described in the body of the request above.

  • Trigger signal with the following command:
    moc mlc trigger --file <local file>

Additional details

moc mlc trigger -h

Trigger/launch an MLC process by providing signal object json body using --file or --body flag.

Usage:
  moc mlc trigger [flags]

Examples:
  # Trigger mlc using signal object from a file
  moc mlc trigger --file ./path/to/file/signal.json

  # Trigger mlc using raw json
  moc mlc trigger --body {"name":"com.modelop.mlc.definitions.Signals_start_data_drift","variables":{"TAG":{"value":"model_a","type":"Object","valueInfo":{"objectTypeName":"java.lang.String","serializationDataFormat":"application/json"}}}}

Flags:
      --body string   Provide JSON body for launching the MLC
  -f, --file string   Use json from the file for launching the MLC
  -h, --help          help for trigger
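
The file-based flow can be scripted. The sketch below writes a signal body to a local file and assembles the `moc mlc trigger --file` command documented above. The helper names and the `signal.json` file name are hypothetical, and the command is only built here, not executed.

```python
import json
import tempfile
from pathlib import Path

# Signal body matching the REST example earlier on this page.
signal = {
    "name": "com.modelop.mlc.definitions.Signals_DEPLOYED_BATCH_JOB",
    "variables": {
        "TAG": {"value": "model-service-tag"},
        "MODEL_STAGE": {"value": "PROD"},
    },
}


def write_signal_file(body, directory):
    """Write the signal body to signal.json in `directory` and return its path."""
    path = Path(directory) / "signal.json"
    path.write_text(json.dumps(body, indent=2), encoding="utf-8")
    return path


def trigger_command(signal_path):
    """Return the argument list for `moc mlc trigger --file <local file>`."""
    return ["moc", "mlc", "trigger", "--file", str(signal_path)]


workdir = tempfile.mkdtemp()
signal_path = write_signal_file(signal, workdir)
command = trigger_command(signal_path)
# To run it (requires the MOC CLI on PATH):
#   import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting issues with the file path.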

Follow up:

To follow up on the process triggered via MLC, there are several points of validation. The following diagram identifies them, and each one is explained below.

 

  1. Use the ‘processInstanceId’ returned by the signalResponsive endpoint mentioned above to call the following endpoint and retrieve the ‘jobId’ from the JSON response.
    http://gateway/model-manage/api/jobHistories/search/findAllByJobMLCS_ProcessInstanceRootProcessInstanceId?processInstanceId={processInstanceId}

  2. Use the ‘jobId’ returned by the previous call, to check the status of the job in the following endpoint. 
    http://gateway/model-manage/api/jobs/{jobId}
    The job state returned by this endpoint shows whether the job finished successfully, finished in error, or is still running.

  3. If the job never ran because of an error during the MLC, follow up on the incidents of the running process instance through this endpoint.
    http://gateway/mlc-service/rest/incident?processInstanceId={processInstanceId}
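
The three validation points above can be wrapped in small helpers. The following is a minimal sketch that only builds the URLs listed above and fetches JSON from them. The function names are hypothetical, `http://gateway` is the placeholder host used on this page, and sending the bearer token on GET requests is an assumption carried over from the POST example. Where the ‘jobId’ sits inside the job history response is not shown here, so extracting it is left to the reader.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://gateway"  # placeholder host used throughout this page


def job_history_url(process_instance_id):
    """URL for step 1: find the job history for a process instance."""
    return (
        f"{BASE_URL}/model-manage/api/jobHistories/search/"
        "findAllByJobMLCS_ProcessInstanceRootProcessInstanceId"
        f"?processInstanceId={quote(process_instance_id, safe='')}"
    )


def job_url(job_id):
    """URL for step 2: check the status of a job."""
    return f"{BASE_URL}/model-manage/api/jobs/{quote(job_id, safe='')}"


def incident_url(process_instance_id):
    """URL for step 3: list incidents for a process instance."""
    return (
        f"{BASE_URL}/mlc-service/rest/incident"
        f"?processInstanceId={quote(process_instance_id, safe='')}"
    )


def get_json(url, token):
    """GET a URL with a bearer token (assumed) and decode the JSON response."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `get_json(incident_url(process_instance_id), token)` would return the incident list when the job never started.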

 

Running a Batch Job with the CLI

Trigger Job Creation:

Create Job via MOC CLI

  • Make sure to have the MOC CLI installed.

  • Create a JSON file with a structure similar to the one described in the body of the request above.

  • Retrieve the deployment id
    moc deployment ls <storedModel name> --state deployed --tag <target stage>

  • Create the job with the following command:
    moc job create deployedbatch <deployedModel ID> <input_file> <output_file> [flags]
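
When these two commands are scripted, the argument lists can be assembled as in the sketch below. It only mirrors the command shapes shown above; the helper names and the example values (model name, deployment id, file names) are hypothetical, and nothing is executed.

```python
import shlex


def list_deployments_command(stored_model_name, stage_tag):
    """Arguments for `moc deployment ls <storedModel name> --state deployed --tag <target stage>`."""
    return ["moc", "deployment", "ls", stored_model_name,
            "--state", "deployed", "--tag", stage_tag]


def create_job_command(deployed_model_id, input_file, output_file):
    """Arguments for `moc job create deployedbatch <deployedModel ID> <input_file> <output_file>`."""
    return ["moc", "job", "create", "deployedbatch",
            deployed_model_id, input_file, output_file]


# Hypothetical values for illustration only.
ls_cmd = list_deployments_command("cc-fraud-model", "PROD")
create_cmd = create_job_command("example-deployment-id", "input_data.json", "output_data.json")

# shlex.join renders each list as a copy-pasteable shell command line.
print(shlex.join(ls_cmd))
print(shlex.join(create_cmd))
# To run them (requires the MOC CLI on PATH):
#   import subprocess; subprocess.run(create_cmd, check=True)
```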

Additional details

moc job create deployedbatch -h

More information about this CLI command is available in the moc job docs.

Follow up:

The above command returns the jobId, which can be used in the following REST API endpoint to query the job status:
http://gateway/model-manage/api/jobs/{jobId}
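
Checking the status usually means polling until the job reaches a final state. The sketch below shows one way to do that. The status field name `jobStatus` and the terminal state names are assumptions and should be adjusted to match the actual response of the jobs endpoint. The `fetch_job` callable is a hypothetical hook that performs the GET request and returns the decoded JSON.

```python
import time


def wait_for_job(fetch_job, job_id,
                 status_field="jobStatus",                            # assumed field name
                 terminal_states=("COMPLETE", "ERROR", "CANCELLED"),  # assumed values
                 interval_seconds=5.0, max_attempts=60,
                 sleep=time.sleep):
    """Poll fetch_job(job_id) until the job reports a terminal state.

    Returns the terminal state, or raises TimeoutError after max_attempts.
    """
    for attempt in range(max_attempts):
        job = fetch_job(job_id)
        state = job.get(status_field)
        if state in terminal_states:
            return state
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} checks")


# Usage with a stubbed fetcher instead of a real HTTP call.
_responses = iter([{"jobStatus": "RUNNING"}, {"jobStatus": "COMPLETE"}])
final_state = wait_for_job(lambda _job_id: next(_responses), "job-1",
                           sleep=lambda _seconds: None)
print(final_state)
```

Injecting `fetch_job` and `sleep` keeps the loop testable without network access or real delays.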