This article explains how to prepare a model for production within ModelOp Center. It is intended to help data scientists prepare models for deployment and operationalization.

Introduction

ModelOp Center provides a standard framework for defining models for deployment and managing them as first-class enterprise assets. Enterprise model deployment requires production candidates to be packaged as software assets. Model development notebooks cannot be deployed into production. ModelOp Center provides the abstractions required to deploy, monitor, and govern a model.

This article details how to prepare a model for production within ModelOp Center and, specifically, how to run a model in a ModelOp Runtime.

Additionally, please see the following Jupyter notebooks for a step-by-step code walk-through:

Loan Default Predictor: using the open source German Credit data set
House Pricing Predictor: using the open source Housing data set

Standard Model Abstractions in ModelOp Center

Of the several assets that define how a model will perform upon deployment, only a Model Source with a Scoring Function is required to initially register a model. The next section describes the available functions.

The list of available abstractions includes:

Model Source - Code that is called as the model is deployed, scored, trained, or validated. The Model Source can be backed with a git repo where it can be versioned and managed. See the next section for details.
Model Functions - Special functions within the Model Source which the runtime uses for specific tasks. See the next section for details.
Attachments - External files to be utilized during code execution. The contents of the attachments get extracted into the current working directory of the model source. This abstraction allows data scientists to load in different versions of the trained model artifact without changing the code used for scoring. Examples include any files or binaries that will be referenced during scoring including the trained model artifact.
Schemas - Define the data structure and features the model uses on its input as well as its output. The schema can reject the record received for scoring in order to prevent the model from erring. This also serves as the contract between the data pipeline and the model, as well as the data scientist and the data engineer.
Model Platform - Details the dependencies the model requires once deployed. This metadata is used to determine which runtime will be used during the deployment of the model. Note: this information is automatically captured based on the Jupyter Notebook environment if using the Jupyter Notebook plugin (see Integrate with Jupyter).
See Model Governance: Standard Model Definition for additional detail.
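
To illustrate the idea of a schema as a contract, the following plain-Python sketch rejects records that are missing fields or carry the wrong types before they reach the model. This is not ModelOp Center's schema format or validation mechanism; the field names and the checking function are hypothetical.

```python
# Hypothetical input contract: field name -> expected Python type.
INPUT_SCHEMA = {"id": int, "content": str}


def conforms(record, schema):
    """Return True if record has exactly the schema's fields with matching types."""
    if set(record) != set(schema):
        return False
    return all(isinstance(record[name], expected) for name, expected in schema.items())


good = {"id": 7, "content": "Quarterly numbers attached."}
bad = {"id": "7", "content": "Quarterly numbers attached."}  # id has the wrong type

# Only conforming records are passed on for scoring.
accepted = [r for r in (good, bad) if conforms(r, INPUT_SCHEMA)]
print(accepted)
```

Rejecting a malformed record at the boundary keeps type errors out of the scoring code itself.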

Functions within Model Source

Within the Source Code, you can designate the specific entry points, or functions, into the model code. The scoring function is the only function required for models that are to be deployed as REST. The other functions (metrics, training) define the other steps of a model’s life cycle and operationalization.

These functions are executed at different stages of the Model Life Cycle using Batch Jobs. Batch Jobs, including scoring, test (metrics), and training jobs, can be run manually as part of testing or automated within an MLC Process. This provides a flexible and scalable framework for deploying models into the business.

The function examples in the following sections walk through the Model Source for a lasso regression model that detects emails that are forwarded outside the organization. These functions can be mapped from the Model Source using the UI or using smart tag comments. Smart tag comments take the form # modelop.<function_type>, where function_type is one of init, score, metrics, or train.

See Model Lifecycle Manager: Automation for more information about MLC Processes.
See Model Batch Jobs and Tests for details about Batch Jobs.
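
As a minimal sketch of the smart tag convention described above, the following Model Source marks one function for each of the four function types. The function names and the toy threshold model are illustrative assumptions and are not part of the email example; only the # modelop.<function_type> comments come from the convention above.

```python
# Hypothetical minimal Model Source; names and logic are illustrative only.

model_state = {}


# modelop.init
def begin():
    # Runs once at deployment: set up whatever scoring needs.
    model_state["threshold"] = 0.5


# modelop.score
def action(record):
    # Emits one prediction per input record.
    flagged = int(record["score"] > model_state["threshold"])
    yield {"id": record["id"], "prediction": flagged}


# modelop.metrics
def metrics(records):
    # Emits summary metrics computed from labeled records.
    correct = sum(
        int(r["score"] > model_state["threshold"]) == r["label"] for r in records
    )
    yield {"accuracy": correct / len(records)}


# modelop.train
def train(records):
    # Toy "training": place the threshold at the mean score.
    model_state["threshold"] = sum(r["score"] for r in records) / len(records)
```

Each tag sits on the line directly above the function it designates, matching the examples in the sections that follow.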

Init Function

The init function is executed when the model is initially deployed into the runtime. It initializes the model and loads any dependencies for scoring. If the model uses an attachment, the Init Function typically loads the trained model artifact for scoring. In this example of a regression model, the model artifacts are loaded from the pickle file with the trained model weights.

Code Block

language	py

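import pickle

import nltk

# modelop.init
def begin():
    # Load the trained artifacts once at deployment and share them with the other functions
    global lasso_model_artifacts
    with open('lasso_model_artifacts.pkl', 'rb') as f:
        lasso_model_artifacts = pickle.load(f)
    # Download the tagger data used during preprocessing
    nltk.download('averaged_perceptron_tagger')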

Scoring Function

The score function returns (or yields) predictions from input records. It executes when a record is sent to the endpoint of the runtime the model is deployed in. It can also be called using a Scoring Batch Job in the Command Center, and from the CLI. In this example of a regression model, the action function tokenizes an incoming email, applies a bag-of-words transformation, applies a TF-IDF transformation to the bag of words, and then uses a LASSO-regularized logistic regression to produce a classification.

NOTE: the function below must use yield rather than return. This ensures that the thread is not ended, so that the runtime can continue to receive scoring requests.

Code Block

language	py

# modelop.score
def action(x):
    lasso_model = lasso_model_artifacts['lasso_model']
    dictionary = lasso_model_artifacts['dictionary']
    threshold = lasso_model_artifacts['threshold']
    tfidf_model = lasso_model_artifacts['tfidf_model']
    
    x = pd.DataFrame(x, index=[0]) 
    cleaned = preprocess(x.content)
    corpus = cleaned.apply(dictionary.doc2bow)
    corpus_sparse = gensim.matutils.corpus2csc(corpus).transpose()
    corpus_sparse_padded = pad_sparse_matrix(sp_mat = corpus_sparse, 
                                             length=corpus_sparse.shape[0], 
                                             width = len(dictionary))
    tfidf_vectors = tfidf_model.transform(corpus_sparse_padded)

    probabilities = lasso_model.predict_proba(tfidf_vectors)[:,1]

    predictions = pd.Series(probabilities > threshold, index=x.index).astype(int)
    output = pd.concat([x, predictions], axis=1)
    output.columns = ['content', 'id', 'prediction']
    output = output.to_dict(orient='records')
    yield output
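
Because the scoring function uses yield, calling it returns a generator rather than a result, which matters when exercising the function outside the runtime. The sketch below uses a hypothetical stand-in scoring function (not the email model) to show how results can be pulled from it in a local test. How the ModelOp Runtime itself drives the function is not shown here.

```python
# Hypothetical stand-in for a scoring function; the logic is illustrative only.
def action(record):
    prediction = int(record["amount"] > 100)
    yield {"id": record["id"], "prediction": prediction}


def score_locally(score_fn, records):
    """Collect every item yielded by score_fn for each input record."""
    results = []
    for record in records:
        # Calling score_fn only creates a generator; iterating runs the body.
        results.extend(score_fn(record))
    return results


outputs = score_locally(action, [{"id": 1, "amount": 250}, {"id": 2, "amount": 40}])
print(outputs)
```

Forgetting to iterate is a common local-testing mistake: action(record) on its own produces a generator object and no prediction.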

Metrics Function

The metrics function calculates metrics around the model's performance based on labeled data. These metrics can include, for example, goodness-of-fit (back-test), bias, and interpretability. The function executes when a Test Batch Job is run on the model. In the example below, the metrics function calculates the confusion matrix, ROC curve, AUC, and F2 score of the model.

For more details on how to write monitoring metrics for a model, see Model Efficacy Metrics and Monitoring.
The Metrics Function can also be used to calculate bias and interpretability, as discussed in Model Governance: Bias & Interpretability.

NOTE: per above, the below function must use yield as opposed to return.

Code Block

language	py

# modelop.metrics
def metrics(x):
    lasso_model = lasso_model_artifacts['lasso_model']
    dictionary = lasso_model_artifacts['dictionary']
    threshold = lasso_model_artifacts['threshold']
    tfidf_model = lasso_model_artifacts['tfidf_model']

    actuals = x.flagged
    
    cleaned = preprocess(x.content)
    corpus = cleaned.apply(dictionary.doc2bow)
    corpus_sparse = gensim.matutils.corpus2csc(corpus).transpose()
    corpus_sparse_padded = pad_sparse_matrix(sp_mat = corpus_sparse, 
                                             length=corpus_sparse.shape[0], 
                                             width = len(dictionary))
    tfidf_vectors = tfidf_model.transform(corpus_sparse_padded)

    probabilities = lasso_model.predict_proba(tfidf_vectors)[:,1]

    predictions = pd.Series(probabilities > threshold, index=x.index).astype(int) 
    
    confusion_matrix = sklearn.metrics.confusion_matrix(actuals, predictions)
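    # Pass the predicted probabilities (not the thresholded labels) so the ROC curve covers all thresholds
    fpr, tpr, thres = sklearn.metrics.roc_curve(actuals, probabilities)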

    auc_val = sklearn.metrics.auc(fpr, tpr)
    f2_score = sklearn.metrics.fbeta_score(actuals, predictions, beta=2)

    roc_curve = [{'fpr': f, 'tpr': t} for f, t in zip(fpr, tpr)]
    labels = ['Compliant', 'Non-Compliant']
    cm = matrix_to_dicts(confusion_matrix, labels)
    test_results = dict(roc_curve=roc_curve,
                   auc=auc_val,
                   f2_score=f2_score,
                   confusion_matrix=cm)    
    yield test_results
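
The metrics example calls a helper, matrix_to_dicts, that is not shown in this article. A minimal sketch of what such a helper might look like follows, assuming it turns a square confusion matrix into one dictionary per row, keyed by the class labels of the columns. The real helper may use a different layout.

```python
def matrix_to_dicts(matrix, labels):
    """Convert a square matrix into a list of per-row dictionaries.

    Assumed layout (hypothetical): rows are in the same order as labels,
    and each key is the class label for that column.
    """
    rows = []
    for row in matrix:
        # int() converts NumPy integer types to plain Python ints
        rows.append({label: int(count) for label, count in zip(labels, row)})
    return rows


cm = matrix_to_dicts([[50, 3], [4, 12]], ["Compliant", "Non-Compliant"])
print(cm)
```

Converting counts to plain Python integers keeps the yielded result made of built-in types only.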

Training Function

The train function trains and retrains the model. It executes when a Training Batch Job is run on the model. The Training Function also provides context and traceability on the model's origins. When a training job executes, any files written to the temporary directory outputDir/ (as in the example below) are written to S3 as an External File Asset. The outputDir/ directory is created at job execution on the container where the runtime is running, so it persists only until the job completes.

Code Block

language	py

import functools
import pickle

import gensim
import nltk
import sklearn.feature_extraction.text
import sklearn.linear_model

# modelop.train
def train(data):
    y_train = data.flagged
    # remove_proper_nouns is a helper defined elsewhere in the Model Source
    removed_proper_nouns = data.content.astype(str).apply(remove_proper_nouns)

    CUSTOM_FILTERS = [lambda x: x.lower(),
                      gensim.parsing.preprocessing.strip_tags,
                      gensim.parsing.preprocessing.strip_punctuation]

    removed_punctuation = removed_proper_nouns.apply(
        functools.partial(gensim.parsing.preprocess_string, filters=CUSTOM_FILTERS))

    stemmer = nltk.stem.porter.PorterStemmer()

    # Remove stop words, words of length less than 2, and words with non-alphabet characters.
    cleaned = removed_punctuation.apply(lambda x: list(map(gensim.parsing.preprocessing.remove_stopwords, x)))
    cleaned = cleaned.apply(lambda x: list(filter(lambda y: len(y) > 1, x)))
    cleaned = cleaned.apply(lambda x: list(filter(lambda y: y.isalpha(), x)))
    cleaned = cleaned.apply(lambda x: list(map(stemmer.stem, x)))

    # Create a dictionary (key, value pairs of ids with words) from the words that appear in the corpus.
    dictionary = gensim.corpora.dictionary.Dictionary(documents=cleaned)
    dictionary.filter_extremes(no_below=5, no_above=0.4)

    # Produce a sparse bag-of-words matrix from the word-document frequency counts
    corpus = cleaned.apply(dictionary.doc2bow).to_list()
    corpus_sparse = gensim.matutils.corpus2csc(corpus).transpose()

    # Train a tf-idf transformer and transform the training data
    tfidf_model = sklearn.feature_extraction.text.TfidfTransformer()
    train_tfidf = tfidf_model.fit_transform(corpus_sparse)

    # Define and fit an L1-regularized logistic regression model.
    # The 'liblinear' solver is specified because the default solver does not support the l1 penalty.
    logreg = sklearn.linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                                     class_weight='balanced', max_iter=2500,
                                                     random_state=740189)
    logreg_model = logreg.fit(X=train_tfidf, y=y_train)

    # Placeholder assumption: the original example does not show how the threshold is chosen.
    thresh = 0.5

    lasso_model_artifacts = dict(lasso_model=logreg_model,
                                 dictionary=dictionary,
                                 tfidf_model=tfidf_model,
                                 threshold=thresh)

    # Files written to outputDir/ are persisted as External File Assets when the job completes
    with open('outputDir/lasso_model_artifact.pkl', 'wb') as f:
        pickle.dump(lasso_model_artifacts, f)
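
The artifact-writing step at the end of the training function can be tried locally. The sketch below assumes a local run in which outputDir/ does not yet exist, so it creates the directory itself (under a temporary directory, to avoid touching the working directory). Within a training job the runtime creates outputDir/, per the description above. The helper name and the artifact contents are hypothetical.

```python
import os
import pickle
import tempfile


def save_artifacts(artifacts, output_dir, filename="lasso_model_artifact.pkl"):
    """Pickle artifacts into output_dir and return the file path."""
    os.makedirs(output_dir, exist_ok=True)  # only needed outside the runtime
    path = os.path.join(output_dir, filename)
    with open(path, "wb") as f:
        pickle.dump(artifacts, f)
    return path


with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_artifacts({"threshold": 0.5}, os.path.join(tmp, "outputDir"))
    # Read the file back to confirm the round trip before the directory is removed.
    with open(saved_path, "rb") as f:
        restored = pickle.load(f)

print(restored)
```

Reading the pickle back in the same way the init function does is a quick check that the training output and the scoring input agree on structure.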

Preparing Production Ready Spark Models

See Spark Details: Preparing a Spark Model.

Next Article: Register a Model >
