Background

Avro Schemas

 

ModelOp Center uses the Avro specification for schema checking. The specification is available here:

 

https://avro.apache.org/docs/current/spec.html

 

For most models, records, arrays, and simple types suffice, but in some cases other parts of the specification are a more natural fit.

 

An Example

 

The following is an input schema for the Consumer Credit demo model:

 

And for the output schema:

 

 

 

These schemas specify the allowed field names and types. The following JSON objects would be valid for each respective schema:

 

 

And on the output:

Missing Values

 

Under an Avro schema, a value may not be missing if the field type is one of the typical simple types (int, float, string, etc.). If a field is allowed to be missing, use a union type that includes "null". For instance, if the field "annual_inc" is allowed to be missing, the corresponding line in the input schema could be replaced with:

 

{"name": "annual_inc", "type": ["null", "float"]},
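
To illustrate how the union type changes what is accepted, here is a minimal sketch in Python. The `conforms` helper and the `PRIMITIVE_CHECKS` table are hypothetical names written for this illustration; they are not the Avro library or the validator ModelOp Center uses, and they cover only a handful of primitive types. Accepting Python integers for "float" is an assumption of this sketch.

```python
# Simplified, illustrative type check. This is NOT the Avro library or
# ModelOp Center's validator; it covers only a few primitive types.
PRIMITIVE_CHECKS = {
    "null": lambda v: v is None,
    "boolean": lambda v: isinstance(v, bool),
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    # Assumption: integers are accepted where a float is expected.
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def conforms(value, avro_type):
    """Return True if value matches a primitive type name or a union (list) of them."""
    if isinstance(avro_type, list):
        # A union accepts the value if any one of its branches accepts it.
        return any(conforms(value, branch) for branch in avro_type)
    return PRIMITIVE_CHECKS[avro_type](value)


strict_field = {"name": "annual_inc", "type": "float"}
nullable_field = {"name": "annual_inc", "type": ["null", "float"]}

print(conforms(None, strict_field["type"]))       # False: null is not a float
print(conforms(None, nullable_field["type"]))     # True: the "null" branch matches
print(conforms(52000.0, nullable_field["type"]))  # True: the "float" branch matches
```

With the plain "float" type a missing value is rejected, while the union accepts either a missing value or a number.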

 

ModelOp Runtime – Schema Enforcement

 

Batch Jobs

 

Batch jobs can be run from the ModelOp Center UI or from the CLI with schema checking enabled. When schema checking is enabled, records that do not conform to the provided schema are filtered out. If an input record fails the check against the input schema, it is rejected by the schema and is not scored. If an output fails the output schema check, the record has been scored, but the result is not written to the output file. This prevents a nonconforming output from reaching a downstream application where it could cause errors.
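
The filtering behavior described above can be sketched as follows. This is an illustration of the described logic only, not ModelOp Center's implementation: `run_batch`, `input_ok`, `output_ok`, and `score` are hypothetical names, and the two check functions are toy stand-ins for real Avro schema validation.

```python
def run_batch(records, input_ok, score, output_ok):
    """Score records, dropping any that fail the input or output check."""
    written = []
    rejected_inputs = 0
    rejected_outputs = 0
    for record in records:
        if not input_ok(record):
            rejected_inputs += 1  # rejected by schema; never scored
            continue
        result = score(record)
        if not output_ok(result):
            rejected_outputs += 1  # scored, but withheld from the output
            continue
        written.append(result)
    return written, rejected_inputs, rejected_outputs


# Toy stand-ins for the schema checks and the model (all hypothetical).
def input_ok(record):
    return isinstance(record.get("annual_inc"), float)


def output_ok(result):
    return isinstance(result.get("probability"), float)


def score(record):
    if record["annual_inc"] > 0:
        return {"probability": record["annual_inc"] / 100000.0}
    return {"probability": "n/a"}  # malformed output for the sake of the example


records = [{"annual_inc": 50000.0}, {"annual_inc": None}, {"annual_inc": 0.0}]
print(run_batch(records, input_ok, score, output_ok))
# ([{'probability': 0.5}], 1, 1)
```

The second record is dropped before scoring, and the third is scored but its result is withheld because it fails the output check.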

 

REST

 

If a model is deployed to a ModelOp Center Runtime as a REST endpoint with schema checking enabled, any request to that Runtime that fails either the input or the output schema check returns a 400 error with a "rejected by schema" message.

 

Out-of-Bounds Values

 

Avro schemas check types, not value ranges. To have an out-of-bounds value rejected, it is recommended to add a check in the model source code that changes the type of that field. For instance, if a model is supposed to output a probability as above, then in the (Python) source code, a user could insert a line:

 

output['probability'] = output.probability.apply(lambda x: x if (x < 1) and (x > 0) else "Out of Bounds")

 

Standard Avro schema checking will then reject the record, because the numerical probability has been replaced by a string.
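
The effect can be seen in a small record-level sketch that uses plain Python rather than a DataFrame. The helper names `mark_out_of_bounds` and `matches_float_type` are hypothetical, and the type check is a stand-in for validating the field against a float type in the output schema.

```python
def mark_out_of_bounds(probability):
    """Keep a probability strictly between 0 and 1; otherwise swap in a string."""
    return probability if 0 < probability < 1 else "Out of Bounds"


def matches_float_type(value):
    """Toy stand-in for checking a value against a float field in the output schema."""
    return isinstance(value, float)


for raw in (0.42, 1.7):
    output = {"probability": mark_out_of_bounds(raw)}
    print(output, matches_float_type(output["probability"]))
# {'probability': 0.42} True
# {'probability': 'Out of Bounds'} False
```

The in-range value keeps its numeric type and passes, while the out-of-range value becomes a string and fails the type check.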
