Integrate with Git
ModelOp Center seamlessly integrates with existing enterprise source code management (SCM) systems, such as Bitbucket, GitHub, and Git, so that enterprises can leverage existing IT investments.
Introduction
ModelOp Center integrates with Git platforms to externalize the management of model assets and to allow distributed development of those assets. With this integration, Data Scientists can develop assets in the platform (IDE) of their choice, using a widely accepted technology such as Git.
ModelOp Center allows different configured levels of interaction with Git:
No Git interaction: Model assets are not stored in a Git repository but rather are stored in a local database.
Local Git repository: ModelOp Center can create and work in a local Git repository, where changes to an asset are recorded as commits. A remote Git repository can be added to the asset at any point.
Remote Git repository: A remote Git repository can be configured so that external changes are synced back into the ModelOp Center working directory. A configuration option controls whether the commits in the repo are pushed to the remote.
Remote import
Given a remote repository URL to clone, ModelOp Center clones down the repo and creates a “Model” with all files represented as assets within it. All non-source files are imported as external assets and loaded into the configured external file asset repository instance. ModelOp Center works with multiple external file asset repository instances, including MinIO, S3, HDFS and Azure Blob Storage. The user can select the desired repository instance during import or it will be automatically determined based on the selected OAuth2/LDAP group, if configured.
All source assets will be loaded into the “Model” instance, and have their repository information loaded appropriately.
Source files are automatically determined (by file extension):
Any other type of file is considered a non-source file.
Note that the following assets are considered “meta-assets” and are only loaded to store additional information in the model metadata:
metadata.json
required_assets.json
external_assets.json
Setting the following property to “true” allows these files to also appear as model assets as they are imported.
# Enable the meta-asset load into the model assets list (metadata, required_assets, external_assets)
model-manage:
  meta-assets:
    add-to-model: false
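As a minimal sketch of the setting described above, the fragment below sets the property to “true” so that metadata.json, required_assets.json, and external_assets.json are also listed as model assets on import. The nesting assumes the usual YAML layout for the dotted property model-manage.meta-assets.add-to-model; where this fragment lives in your deployment configuration is not specified in this section.

```yaml
# Sketch only: same property as above, with the value flipped to true
model-manage:
  meta-assets:
    # true = meta-assets also appear in the model assets list after import
    add-to-model: true
```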
ModelOp Center allows importing a Git repository through the ModelOp Center Dashboard:
Click on the “Inventory” menu item.
You can add a “Use Case” or directly import a model by switching the option to “Implementations” (see Use Cases).
In the top right corner, click on “+ Add Implementation” (see Add an Implementation).
Click on the “Git” option.
Provide the details of the “Remote Repository Url”, “Branch”, and “Access Group”. Other attributes like “Model Name”, “Description”, “Model Methodology”, and “Type” are optional, as they are inferred from the imported model by default.
Please note that the URL to use depends on your Git platform. Generally, you can find this URL by clicking the “Clone” button and selecting the “HTTPS clone” option.
Click on “Next”.
The following screen contains an additional form, required by your Organization, to be filled out upon model registration (see Custom Form Administration).
Click on “Submit”.
Note: these steps assume the repository is public without authentication. The following sections detail how to configure the integration when authentication is required.
Git Integration
To integrate with an authenticated remote repository, ModelOp Center must be configured to use a Service Account to pull from the specified repository. This Service Account should have read and/or write access to the repositories to be imported and integrated with ModelOp Center.
Git Credentials
The username and password to integrate with the Git repository are set as properties in the container definition for model-manager. These can be obscured using Kubernetes Secrets or other existing credential management systems.
model-manage.git.username=<The git username>
model-manage.git.password=<The username's passphrase>
The above configuration will be used to access all remote repositories from ModelOp Center, so the Service Account must be added to every integrated repository.
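One way to keep these values out of the container definition is a Kubernetes Secret, as mentioned above. The following is a sketch under stated assumptions, not the documented ModelOp deployment: the Secret name, the key names, and the environment variable names are all hypothetical, and how environment variables map onto model-manage.git.username and model-manage.git.password depends on how your model-manager container is configured.

```yaml
# Hypothetical Secret holding the Service Account credentials
apiVersion: v1
kind: Secret
metadata:
  name: modelop-git-credentials   # hypothetical name
type: Opaque
stringData:
  git-username: <The git username>
  git-password: <The username's passphrase>
---
# Fragment of the model-manager container spec (not a complete manifest)
containers:
  - name: model-manager
    env:
      - name: GIT_USERNAME        # hypothetical variable name
        valueFrom:
          secretKeyRef:
            name: modelop-git-credentials
            key: git-username
      - name: GIT_PASSWORD        # hypothetical variable name
        valueFrom:
          secretKeyRef:
            name: modelop-git-credentials
            key: git-password
```

The secretKeyRef indirection means the credentials live only in the Secret, so they can be rotated without editing the container definition.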
Alternatively, it is possible to add more granularity to the credentials used for git.
model-manage:
  git:
    username: <Overall git username> # <-- same as specified above
    password: <Overall git passphrase> # <-- same as specified above
    storedCredentials:
      - context: https://github.com/
        username: <first context user>
        password: <first context passphrase>
      - context: https://gitlab.com/
        username: <second context user>
        password: <second context passphrase>
The above allows ModelOp Center to use different Git credentials depending on the URL the instance is trying to reach.
Note: Credentials are selected using the same criteria that git itself uses, so Git Config (refer to the subsection below) can be leveraged to enable additional features like useHttpPath (refer to the git docs, Git - gitcredentials Documentation) and achieve a configuration with one set of credentials per repository, if really needed.
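A per-repository setup might look like the sketch below. It assumes that useHttpPath has been enabled through Git Config, that a context may carry the full repository path, and that storedCredentials nests under model-manage.git; the repository URLs and placeholder names are hypothetical.

```yaml
# Sketch only: one set of credentials per repository
model-manage:
  git:
    storedCredentials:
      # Hypothetical repository URLs; the path is only significant
      # when git's credential.useHttpPath is enabled
      - context: https://github.com/example-org/model-a.git
        username: <model-a user>
        password: <model-a passphrase>
      - context: https://github.com/example-org/model-b.git
        username: <model-b user>
        password: <model-b passphrase>
```

Without useHttpPath, git matches credentials on protocol and host only, so both entries above would be indistinguishable.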
Local Repository Settings
ModelOp Center also provides additional parameters to configure the behavior of the local repository.
Container: "model-manager":
Git Config
It is possible to add git-specific configurations to control how ModelOp Center’s git repository behaves. Please refer to git scm for more details on the git configuration and how it can be used to achieve a certain behavior.
The idea behind ModelOp Center’s git config customization is that any git config variable can be mapped in the following way.
For example, suppose the desired git config file is the following:
Then the expected ModelOp Center configuration would be the following:
Please refer to git-scm for a more comprehensive list of variables that can be set through git config.
Load on startup
Model Manage can be configured to automatically import repositories upon startup.
Please note that this import operation is performed only once per model in this environment. If the model has already been imported, ModelOp Center will not attempt to import it again or to perform any of the valid post-import operations.
Valid options
repositoryBranch - Indicates the remote git repository branch to import from.
repositoryRemote - Indicates the remote git repository clone URL to import from.
createBaseSnapshot (optional) - Indicates if an initial Snapshot is desired to be created right after import. Value is “false” by default.
group (optional) - Indicates the group that this model will be imported as. If this value is not present it will default to ModelOp’s default group (configured in the property oauth2.group-base-access.default-access-group, ‘modelop’ by default).
deployedModel.runtimeName (optional) - The name of the target runtime to deploy as batch right after import.
deployedModel.schedule.quartzSchedule (optional) - A valid Quartz expression to schedule the execution of a provided signal name.
deployedModel.schedule.signalActionName (optional) - A valid Signal name to trigger for a given schedule.
runtimeWaitTimeout (optional) - The amount of time in milliseconds to wait in the background for the runtime to be available so that we can proceed to deploy the model after import (if the deployedModel section was provided). This value is 10 minutes by default (600000 ms).
A snapshot (and deployment) will only be created if deployedModel.runtimeName is not empty. If this is set, a background thread will wait (up to a configurable amount of time) for the runtime to register with model-manage so that it can be used as the engine for the deployed model. The name, type, and group are also used in the target runtime for the snapshot.
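The options above could be combined for a single repository roughly as follows. This is a sketch only: the enclosing property key for the startup import is not shown in this section and is deliberately omitted, the nesting is inferred from the dotted option names, and the repository URL, branch, runtime name, and signal name are hypothetical values.

```yaml
# Options for one repository to import on startup (enclosing key omitted)
repositoryRemote: https://github.com/example-org/example-model.git  # hypothetical URL
repositoryBranch: main                       # hypothetical branch
createBaseSnapshot: true                     # default is false
group: modelop                               # the default group
runtimeWaitTimeout: 600000                   # 10 minutes, the default
deployedModel:
  runtimeName: example-runtime               # hypothetical runtime name
  schedule:
    quartzSchedule: "0 0 2 * * ?"            # Quartz expression: every day at 02:00
    signalActionName: example.signal.name    # hypothetical signal name
```

Because deployedModel.runtimeName is set, this configuration would also wait for the named runtime to register and then deploy the model, as described above.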
View Asset Git Details
ModelOp Center provides multiple ways to see the details of the git integration for various assets.
ModelOp Center Dashboard
The model’s assets repository configuration can be seen within the ModelOp Center Dashboard.
Click on the “Models” menu item.
From the list on the main panel, select the desired model to inspect.
Click the Repository tab.
Note that ModelOp Center provides details of the last sync with the backing git repository. The sync rate is set in the ModelOp Center core configuration files, but is typically 2-3 minutes by default. The user can click “Sync Git” to force a git sync immediately.
Jupyter Notebook Plugin
Git configuration is available directly within a Notebook via the Jupyter Notebook Plugin. When registering or opening a model, these details are available in the “View” button, a ModelOp specific Cell Toolbar button.
Within the Jupyter Notebook, click on the menu “View”.
Click on “Cell Toolbar”.
Select the “ModelOp Model” toolbar option.
On the desired model asset (cell) click on the toolbar button “Asset Details”. This button will be visible for ModelOp registered models only.
If the selected asset does not have a repository configured previously, it will allow the user to configure it.
If the selected asset already had a repository configured, these values can be modified as well.