ModelOp Center seamlessly integrates with existing Spark environments, such as Databricks or Cloudera, allowing enterprises to leverage existing IT investments in their data platforms.

Overview

To integrate seamlessly with existing Spark environments, ModelOp Center offers a Spark runtime service. This service is responsible for submitting Spark jobs to a pre-defined Spark cluster, monitoring their statuses, and updating them in model-manage accordingly. Additionally, it supports auto-enrollment with Eureka and model-manage, along with secure interactions via OAuth2.

The Spark runtime usually runs outside the K8s fleet, typically but not exclusively on an edge node.

Please check the following page for additional information on configuring the Spark runtime service via Helm: Configuring the Spark Runtime via Helm

Pre-requisites:

The node hosting the spark-runtime-service must meet the following criteria:

Ensure Apache Spark is installed on the host machine.
- Currently validated Spark and Hadoop versions:
  - Spark 2.4
  - Hadoop 2.6
Ensure the following environment variables are set:
- SPARK_HOME
- HADOOP_HOME
- JAVA_HOME
- HADOOP_CONF_DIR
Ensure the Hadoop cluster configuration files are available (e.g.):
- hdfs-site.xml
- core-site.xml
- mapred-site.xml
- yarn-site.xml
Ensure the host machine can communicate with the Spark cluster (e.g.):
- master-node
- yarn
  - nodeManager:
    - remote-app-log-dir
    - remote-app-log-dir-suffix
  - resourceManager:
    - hostname
    - address
- hdfs
  - host
Ensure the host machine can communicate with ModelOp Center and the ModelOp Center Eureka (Registry).
Security
- Kerberos:
  - krb5.conf
  - Principal
  - keytab
  - jaas.conf (optional)
  - jaas-conf-key (optional)
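
The environment and configuration prerequisites above can be verified on the host before the service is started. The following is a minimal sketch in Python, assuming the Hadoop configuration files sit directly under HADOOP_CONF_DIR. The function name check_prerequisites is hypothetical and is not part of ModelOp Center.

```python
import os
from pathlib import Path

# Environment variables and Hadoop files named in the prerequisites list.
REQUIRED_ENV_VARS = ["SPARK_HOME", "HADOOP_HOME", "JAVA_HOME", "HADOOP_CONF_DIR"]
HADOOP_CONF_FILES = ["hdfs-site.xml", "core-site.xml", "mapred-site.xml", "yarn-site.xml"]


def check_prerequisites(env=None):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper: `env` defaults to the process environment and can be
    replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_ENV_VARS:
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    conf_dir = env.get("HADOOP_CONF_DIR")
    if conf_dir:
        # Assumption: the configuration files live directly in HADOOP_CONF_DIR.
        for file_name in HADOOP_CONF_FILES:
            if not (Path(conf_dir) / file_name).is_file():
                problems.append(f"{file_name} not found in {conf_dir}")
    return problems
```

Calling check_prerequisites() with no arguments inspects the current process environment and returns one message per missing variable or file.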

Service	Port
Spark	7077, 18088
Yarn	8032
HDFS	8020
ModelOp Center	8090, 8761 (Eureka)
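
Connectivity to the ports listed above can be probed from the host with a plain TCP connection attempt. The sketch below is illustrative only: the host names are placeholders to be replaced with the real cluster and ModelOp Center addresses, and the helper names are hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to the result of a connection attempt."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}


# Placeholder host names (assumptions); the ports come from the table above.
EXAMPLE_ENDPOINTS = [
    ("spark-master.example.com", 7077),
    ("spark-master.example.com", 18088),
    ("yarn-rm.example.com", 8032),
    ("hdfs-namenode.example.com", 8020),
    ("modelop-center.example.com", 8090),
    ("modelop-center.example.com", 8761),
]
```

Running check_endpoints(EXAMPLE_ENDPOINTS) with real host names returns a dictionary that shows which endpoints are reachable. A successful TCP connection only proves that the port is open, not that the service behind it is healthy.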

Kerberos glossary:

krb5.conf - Tells the host machine how to talk to Kerberos: where to find the Kerberos server and what rules to follow.

keytab - Secret key for the host machine. The host machine uses it to prove that it is allowed to execute specific actions in the Kerberos environment.

jaas.conf - File that tells the host machine how to interact with other applications or programs (such as Kerberos). It helps the host machine locate the keytab.

Core components

Roles and responsibilities of the spark-runtime-service core components:

ModelOpJobMonitor

Monitors jobs of type MODEL_BATCH_JOB, MODEL_BATCH_TEST_JOB and MODEL_BATCH_TRAINING_JOB in CREATED state with SPARK_RUNTIME as the runtime type
- Updates the job status from CREATED to WAITING
- Submits the job for execution to the Spark cluster
- Updates the job with the Spark application id generated by the Spark cluster
Monitors jobs of type MODEL_BATCH_JOB, MODEL_BATCH_TEST_JOB and MODEL_BATCH_TRAINING_JOB in WAITING or RUNNING state with SPARK_RUNTIME as the runtime type
- Uses the Spark application id to monitor the job status while the job is still running on the Spark cluster
- Updates the job status based on the latest Spark application status
- Updates the job with the logs generated by the Spark cluster
- Updates the job with the output data (if the output data contains embedded asset(s))
- Cleans the job’s temporary working directory
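
The status update step above can be pictured as a mapping from the state reported by the cluster to a job status. The sketch below assumes a YARN-managed cluster and uses YARN's application states and final statuses. Only CREATED, WAITING and RUNNING are job states named on this page; COMPLETE, ERROR and CANCELLED are hypothetical names chosen for illustration, and the function is not ModelOp Center code.

```python
# YARN application states that mean the application has not started running yet.
PENDING_YARN_STATES = {"NEW", "NEW_SAVING", "SUBMITTED", "ACCEPTED"}


def next_job_status(yarn_state, final_status="UNDEFINED"):
    """Translate a YARN application state into a job status.

    WAITING and RUNNING come from this page; COMPLETE, ERROR and CANCELLED
    are assumed names for the terminal job states.
    """
    if yarn_state in PENDING_YARN_STATES:
        return "WAITING"
    if yarn_state == "RUNNING":
        return "RUNNING"
    if yarn_state == "FINISHED":
        # FINISHED only says the application ended; the final status says how.
        return "COMPLETE" if final_status == "SUCCEEDED" else "ERROR"
    if yarn_state == "FAILED":
        return "ERROR"
    if yarn_state == "KILLED":
        return "CANCELLED"
    raise ValueError(f"unknown YARN application state: {yarn_state}")
```

Keeping the mapping in one pure function makes the monitoring loop easy to test: the loop polls the cluster with the stored Spark application id, passes the reported state through the function, and writes the result back to the job.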

PySparkPreprocessorService

Translates jobs of type MODEL_BATCH_JOB/MODEL_BATCH_TEST_JOB/MODEL_BATCH_TRAINING_JOB into a PySparkJobManifest
- Creates temporary files used during execution, such as the ModelOpPySparkDriver, primary source code, metadata, attachments, other assets and non-primary source code
  - The ModelOpPySparkDriver file is the main driver of the Spark job
  - The primary source code file is what will be executed in the Spark cluster
- Creates temporary HDFS file(s) (if the input data contains embedded asset(s))
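
To make the preprocessing step more concrete, the sketch below writes a job's files into a temporary working directory and records their locations. The real structure of PySparkJobManifest is not documented on this page, so every field name, file name and function here is an assumption made for illustration.

```python
import shutil
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class PySparkJobManifest:
    """Hypothetical manifest layout; the actual class may differ."""
    working_dir: Path
    driver_file: Path
    primary_source_file: Path
    additional_files: List[Path] = field(default_factory=list)


def build_manifest(job_id: str, driver_code: str, primary_source: str,
                   other_assets: Dict[str, str]) -> PySparkJobManifest:
    """Write the driver, primary source and other assets to a temp directory."""
    working_dir = Path(tempfile.mkdtemp(prefix=f"job-{job_id}-"))
    driver_file = working_dir / "ModelOpPySparkDriver.py"
    driver_file.write_text(driver_code)
    primary_source_file = working_dir / "primary_source.py"  # assumed file name
    primary_source_file.write_text(primary_source)
    additional_files = []
    for name, content in other_assets.items():
        # Keep only the base name so an asset cannot escape the working directory.
        path = working_dir / Path(name).name
        path.write_text(content)
        additional_files.append(path)
    return PySparkJobManifest(working_dir, driver_file, primary_source_file, additional_files)


def clean_working_dir(manifest: PySparkJobManifest) -> None:
    """Remove the temporary working directory once the job has finished."""
    shutil.rmtree(manifest.working_dir, ignore_errors=True)
```

Grouping all temporary files under a single directory per job keeps the later cleanup to one recursive delete.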

SparkLauncherService

Builds the SparkLauncher from the content of the PySparkJobManifest for execution
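
The service builds a SparkLauncher for this step. As an illustration of what such a submission contains, the sketch below assembles the equivalent spark-submit command line from a driver file and its supporting files. The function name and the defaults (yarn as master, cluster deploy mode) are assumptions; the flags themselves are standard spark-submit options.

```python
from typing import List, Optional


def build_spark_submit_args(driver_file: str,
                            py_files: Optional[List[str]] = None,
                            files: Optional[List[str]] = None,
                            master: str = "yarn",
                            deploy_mode: str = "cluster",
                            app_name: Optional[str] = None) -> List[str]:
    """Assemble a spark-submit argument list (hypothetical helper)."""
    args = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if app_name:
        args += ["--name", app_name]
    if py_files:
        # Extra Python modules shipped to the cluster and added to the Python path.
        args += ["--py-files", ",".join(py_files)]
    if files:
        # Non-code files (metadata, attachments) placed in each executor's working dir.
        args += ["--files", ",".join(files)]
    # The application file (the driver) comes last.
    args.append(driver_file)
    return args
```

For example, build_spark_submit_args("ModelOpPySparkDriver.py", py_files=["primary_source.py"]) yields a list that can be passed to subprocess.run on a host where spark-submit is on the PATH.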

ModelManageRegistrationService

Auto-enrolls the Spark runtime service as a SPARK_RUNTIME with model-manage

LoadRuntimesListener

Sends a heartbeat to Eureka to maintain an alive/healthy service status
Re-registers if the spark-runtime-service is lost for some reason

KerberizedYarnMonitorService

Authenticates the principal with Kerberos before attempting to use the YarnClient
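
How the service performs this authentication internally is not described on this page. As a way to verify the principal and keytab from the Security prerequisites on the host, the sketch below runs the command-line equivalent with MIT Kerberos kinit. The helper names and the example principal are hypothetical.

```python
import os
import subprocess
from typing import List, Optional


def build_kinit_command(principal: str, keytab_path: str) -> List[str]:
    """Return the kinit invocation that obtains a ticket from a keytab."""
    return ["kinit", "-k", "-t", keytab_path, principal]


def kerberos_login(principal: str, keytab_path: str,
                   krb5_conf: Optional[str] = None) -> None:
    """Obtain a Kerberos ticket; raises CalledProcessError if kinit fails."""
    env = dict(os.environ)
    if krb5_conf:
        # KRB5_CONFIG points MIT Kerberos at a non-default krb5.conf.
        env["KRB5_CONFIG"] = krb5_conf
    subprocess.run(build_kinit_command(principal, keytab_path), env=env, check=True)
```

A call such as kerberos_login("spark-runtime@EXAMPLE.COM", "/etc/security/keytabs/spark-runtime.keytab", "/etc/krb5.conf") uses placeholder values. If it succeeds, klist should show a valid ticket for the principal.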
