Integrate with AWS SageMaker

ModelOp allows users to connect with AWS SageMaker in order to view and operate their models, jobs, and endpoint configurations / endpoints.


AWS SageMaker preparation

Model

Navigate to your Models in AWS and select your target model.

 

As you can see, the model has certain settings, such as:

  1. Name, ARN, Creation Time

  2. A container in which a training job was run

  3. Certain network properties and tags

You will use your target model’s name and your AWS credentials to import the model into ModelOp Center.

Endpoint configuration / Endpoints

Endpoint configuration

The AWS endpoint configuration is equivalent to a Snapshot/Deployable model in ModelOp Center. When creating an endpoint configuration, you must associate it with your target model. You can also add the relevant tags for your instance. Below is an example of an endpoint configuration.

 

Endpoint configurations include:

  1. Name, ARN, Creation Time

  2. Production variants such as a model

    1. An Endpoint configuration could have 1 or more models associated with it.

  3. Tags

 

Endpoint

Endpoints are the equivalent of a Deployed Model running on a runtime in ModelOp Center. On the endpoint configuration page, you can create an endpoint for a given endpoint configuration.

 

Endpoints include:

  1. Name, ARN, Creation Time, Status

  2. Endpoint Configuration Settings from your Endpoint Configuration

  3. Monitor

 

As part of the configuration, make sure that the AWS service account is allowed to perform the required SageMaker, CloudWatch, and CloudWatch Logs operations, as in the following IAM policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sagemaker:ListEndpointConfigs",
        "sagemaker:DescribeEndpointConfig",
        "sagemaker:ListTags",
        "sagemaker:ListEndpoints",
        "sagemaker:DescribeEndpoint",
        "sagemaker:DescribeModel",
        "sagemaker:Search",
        "sagemaker:ListTransformJobs",
        "sagemaker:DescribeTransformJob",
        "sagemaker:CreateTransformJob",
        "sagemaker:StopTransformJob",
        "sagemaker:CreateEndpoint",
        "sagemaker:CreateEndpointConfig",
        "sagemaker:DeleteEndpoint",
        "sagemaker:DeleteEndpointConfig",
        "sagemaker:UpdateEndpoint",
        "sagemaker:UpdateEndpointWeightsAndCapacities"
      ],
      "Resource": [ "*" ]
    },
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:ListMetrics",
        "cloudwatch:GetMetricData"
      ],
      "Resource": [ "*" ]
    },
    {
      "Sid": "Statement2",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogStreams",
        "logs:GetLogEvents"
      ],
      "Resource": [ "*" ]
    }
  ]
}
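
One way to confirm that a policy document grants the actions listed above is to compare its allowed actions against a required set. The sketch below is a minimal check, assuming the policy is available as a JSON string. The helper name `missing_actions` and the abbreviated `REQUIRED_ACTIONS` subset are illustrative and are not part of ModelOp Center or AWS tooling. It matches action names exactly on `Allow` statements and does not evaluate wildcards, `Deny` statements, or conditions.

```python
import json

# Subset of the actions from the policy above, used for illustration only.
REQUIRED_ACTIONS = {
    "sagemaker:DescribeModel",
    "sagemaker:ListEndpointConfigs",
    "sagemaker:DescribeEndpointConfig",
    "sagemaker:ListEndpoints",
    "sagemaker:DescribeEndpoint",
    "sagemaker:CreateTransformJob",
    "cloudwatch:GetMetricData",
    "logs:GetLogEvents",
}


def missing_actions(policy_json, required=REQUIRED_ACTIONS):
    """Return the required actions that no Allow statement lists explicitly."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]
    allowed = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM permits a single action string
            actions = [actions]
        allowed.update(actions)
    return sorted(set(required) - allowed)


# Deliberately incomplete example policy, to show what the check reports.
example_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["sagemaker:DescribeModel"], "Resource": ["*"]},
        {"Effect": "Allow", "Action": "logs:GetLogEvents", "Resource": "*"},
    ],
})
print(missing_actions(example_policy))
```

An empty list means every action in the required set is granted by name. A non-empty list names the actions to add, or to verify separately if the policy relies on wildcards.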

ModelOp Center AWS SageMaker Configuration

To leverage the AWS SageMaker integration, the ModelOp Center “SageMaker Service” must be running and configured, per the instructions below.

The credentials required depend on the authentication type used. ModelOp Center supports the BASIC and ASSUME_ROLE_WITH_WEB_IDENTITY types.

 

For the BASIC authentication type, apply the following in the ModelOp Center SageMaker configuration:

sagemaker-credentials:
  storedCredentials:
    - group:
      authenticationType: BASIC
      accessKeyId: xxxxxxxxxxxxxxx
      secretAccessKey: xxxxxxxxxxxxxxxxxxxxxxxxxx
      region: us-east-2

  • accessKeyId and secretAccessKey:

    • These act as a username and password for AWS SageMaker so both are required to authenticate. These will be unique to a given user.

  • region:

    • This is the AWS location of your SageMaker instance. The region information can be found in the AWS account webpage or on the header bar of the AWS webpage.
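
For orientation, the three BASIC fields correspond to the static credentials that AWS SDKs accept. The sketch below assumes the stored credential entry has already been loaded into a Python dict. `boto3_client_kwargs` is a hypothetical helper, and this page does not describe how ModelOp Center itself consumes these values. The keyword names `aws_access_key_id`, `aws_secret_access_key`, and `region_name` are the ones accepted by `boto3.client`.

```python
def boto3_client_kwargs(entry):
    """Map a BASIC stored-credential entry to boto3.client keyword arguments."""
    if entry.get("authenticationType") != "BASIC":
        raise ValueError("expected authenticationType BASIC")
    required = ("accessKeyId", "secretAccessKey", "region")
    missing = [key for key in required if not entry.get(key)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return {
        "aws_access_key_id": entry["accessKeyId"],
        "aws_secret_access_key": entry["secretAccessKey"],
        "region_name": entry["region"],
    }


entry = {
    "authenticationType": "BASIC",
    "accessKeyId": "xxxxxxxxxxxxxxx",                 # placeholder, not a real key
    "secretAccessKey": "xxxxxxxxxxxxxxxxxxxxxxxxxx",  # placeholder, not a real secret
    "region": "us-east-2",
}
kwargs = boto3_client_kwargs(entry)
# With boto3 installed, a client could then be created with:
#   boto3.client("sagemaker", **kwargs)
print(sorted(kwargs))
```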

 

For the ASSUME_ROLE_WITH_WEB_IDENTITY authentication type, apply the following in the ModelOp Center SageMaker configuration:

sagemaker-credentials:
  eksServiceAccountCredentials:
    - path:
      roleArn: <AWS-ROLE-ARN>
      webIdentityTokenFile: <AWS-Injected-file>

  • roleArn:

    • This is the Amazon Resource Name (ARN) of the role. It can be found in the AWS console.

  • webIdentityTokenFile:

    • The file containing the token that should be used in order to authenticate the user.
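
Conceptually, this authentication type exchanges the token in `webIdentityTokenFile` for temporary credentials through the AWS STS `AssumeRoleWithWebIdentity` operation. The sketch below only assembles the request parameters (`RoleArn`, `RoleSessionName`, `WebIdentityToken`) from the two configured values. The helper name, the session name, and the example ARN and token are assumptions for illustration, and the code does not call AWS.

```python
import os
import tempfile


def web_identity_request(role_arn, token_file, session_name="modelop-sagemaker"):
    """Build the parameters for an STS AssumeRoleWithWebIdentity request."""
    with open(token_file, encoding="utf-8") as handle:
        token = handle.read().strip()
    if not token:
        raise ValueError("web identity token file is empty")
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,  # hypothetical session name
        "WebIdentityToken": token,
    }


# Demonstration with a throwaway token file and a made-up role ARN.
with tempfile.NamedTemporaryFile("w", suffix=".token", delete=False) as tmp:
    tmp.write("example-token\n")
params = web_identity_request("arn:aws:iam::123456789012:role/example-role", tmp.name)
os.unlink(tmp.name)
print(params["RoleArn"])
# With boto3 installed, the request could be sent with:
#   boto3.client("sts").assume_role_with_web_identity(**params)
```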

Assume Role by group configuration (optional)

Optionally, ModelOp Center supports a granular group-to-role mapping, in which a specific AWS IAM Role is used for a specific ModelOp Center user group (from the user’s Identity Directory).

This mapping lets the organization control the access level that each model owner gets when importing a SageMaker model into ModelOp Center.
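
The mapping idea can be illustrated with a small lookup. The sketch below is hypothetical: the group names, role ARNs, and the `role_for_user` helper are invented for illustration, and the actual ModelOp Center configuration keys for this feature are not shown on this page.

```python
# Hypothetical group-to-role table; the names and ARNs are made up.
GROUP_ROLES = {
    "fraud-models": "arn:aws:iam::123456789012:role/fraud-sagemaker",
    "credit-models": "arn:aws:iam::123456789012:role/credit-sagemaker",
}


def role_for_user(user_groups, group_roles=GROUP_ROLES):
    """Return the role ARN of the first user group that has a mapping, else None."""
    for group in user_groups:
        if group in group_roles:
            return group_roles[group]
    return None


print(role_for_user(["analysts", "credit-models"]))
```

A user whose groups have no mapping gets `None` here. What ModelOp Center does in that case (for example, falling back to default credentials) is not specified on this page.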

Import into the ModelOp Center UI

  1. To register a model in ModelOp Center, navigate to the Business Models tab and select “Add Business Model” in the top right corner.

  2. Provide the model name and optional description.

  3. It is recommended that the credentials be configured within the environment, as described in the configuration section above. If they are not, choose Enter Credentials and enter the region, access key, and secret key at this step.

Please note that the “Enter Credentials” option in the screen above can be hidden by setting the following property in the gateway-service configuration yaml:

 

When a SageMaker model is registered with ModelOp Center, it is represented and handled as a Stored Model.

  1. The same model specific information available on the AWS console is available in ModelOp Center, including the model name, ARN, tags, creation time, and links to training jobs.

  2. Within ModelOp Center, an Endpoint Configuration is represented as a snapshot. Each snapshot includes its name, tags, creation time, endpoint configuration specific fields, model details, tests and custom metadata. If there is no endpoint configuration for your model, one will be created by default.

  3. If a SageMaker Endpoint was created, it is represented in ModelOp Center as a “SageMaker Runtime” with a snapshot deployed. The engine details show the most relevant information that is available on the AWS console.

  4. All previously run jobs can be viewed and new jobs can be created within ModelOp Center.

  5. Finally, when ModelOp Center imports the model, all of the TRANSFORM and TRAINING jobs from AWS for that model are imported as well.

 

Import Elements

Importing a SageMaker model will import the following elements into ModelOp Center:

  • Model Information on the Registry

  • Related Training Jobs

  • Related Batch Transform Jobs

  • Related Endpoint configurations

  • Related Endpoints

 

AWS SageMaker Job import will include:

  • Job execution dates, execution time, and other available basic info

  • Input data S3 URL

  • Output data S3 URL

  • Link to the SageMaker console for the job

 

AWS SageMaker Endpoint Config Import will include:

  • Basic details of the endpoint config.

  • Link to the SageMaker console for the endpoint config

 

AWS SageMaker Endpoint Import will include:

  • Basic details of the endpoint.

  • Link to the SageMaker console for the endpoint

 

MLCs

SageMaker deploy model

In this MLC, there is an option to run a transform job and/or deploy a SageMaker endpoint. The MLC is triggered by:

  • Importing a SageMaker model with existing Endpoint Configurations.

OR

  • Creating an Endpoint Configuration in SageMaker on an already imported model.

 

Once the MLC is triggered it will follow this flow:

  1. Verifies that the required assets (documentation, schema, training data, test result comparator, test data asset) are present.

  2. If Test Data Asset is available, the MLC runs a test SageMaker transform job.

  3. Creates a Jira approval ticket.

  4. If approved, a SageMaker Endpoint will be created, which translates to a new Runtime.

 

In these steps, a transform job is run if a Test Data Asset is present; once the job has completed successfully, the model is sent to be deployed. If you do not wish to run a transform job, do not include Test Data in your model; simply approve the ticket requesting deployment and the model will proceed to deployment. Deploying a SageMaker model creates a SageMaker Endpoint.
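
The flow described above can be summarized as a small decision function. This is a sketch of the documented behavior, not the MLC definition itself; the function name, parameter names, and step labels are illustrative.

```python
def deploy_mlc_steps(is_sagemaker_model, has_test_data,
                     transform_succeeded=True, approved=True):
    """Return the ordered steps the deploy MLC would take for one snapshot."""
    if not is_sagemaker_model:
        # The MLC triggers for every new snapshot but exits for other models.
        return ["exit: not a SageMaker model"]
    steps = ["verify required assets"]
    if has_test_data:
        steps.append("run test transform job")
        if not transform_succeeded:
            # A failed transform job stops the flow before deployment.
            steps.append("create Jira ticket notification")
            return steps
    steps.append("create Jira approval ticket")
    if approved:
        steps.append("create SageMaker Endpoint (new Runtime)")
    return steps


for step in deploy_mlc_steps(is_sagemaker_model=True, has_test_data=True):
    print(step)
```

Calling the function with `has_test_data=False` skips the transform step and goes straight to the approval ticket, which mirrors the option of deploying without Test Data.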

 

There are cases in which this MLC does not deploy an Endpoint. These include:

  1. The MLC triggers for every new Snapshot, but exits quickly if the model is not a SageMaker model.

  2. If the SageMaker transform job runs and fails, the MLC creates a Jira ticket notification and does not try to deploy to an Endpoint.