Integrate with Spark

ModelOp Center seamlessly integrates with existing Spark environments, such as Databricks or Cloudera, allowing enterprises to leverage existing IT investments in their data platforms.

Table of Contents

 

Overview

To enable integration with existing Spark environments, ModelOp Center provides a Spark runtime microservice. This component is in charge of submitting Spark jobs into a pre-defined Spark cluster, monitoring their statuses, and updating them at model-manage accordingly. Also, it supports auto-enrollment at Eureka and model-manage, as well as OAuth2 interaction.

Spark runtime service should be able to run outside the K8s fleet, likely as an Edge Node.

 

Spark-Runtime-Service core components:

 

ModelOpJobMonitor for ModelBatchJobs:

  • Monitor the job repository for CREATED ModelBatchJobs with SparkRuntime engine.

    • Launch SparkSubmit Job using the SparkRuntime engine data.

    • Update Job status from CREATED to WAITING.

    • Update Job by appending SparkApplication Job ID.

  • Monitor job repository for WAITING & RUNNING ModelBatchJobs with SparkRuntime engine.

    • Extract SparkApplication Job ID and use it to query Job status at Spark cluster.

    • Update Job status and job repository accordingly with Spark cluster Job updates.

PySparkPreprocessor:

  • Component in charge of translating a ModelBatchJob into a PySparkJobManifest.

  • Read all the ExternalFileAsset inputs & outputs leaving them available for the PySparkJob as lists during runtime.

  • From ModelBatchJobs:

    • Extract and write StoredModel primary source code as tmp file to be executed inside SparkSubmit.

SparkSubmitService:

  • Build SparkSubmit execution out of PySparkJobManifest.

  • Read/Recover PySparkJob output from different sources.

  • Clean tmp file once SparkSubmit finished or failed.

  • Support Kerberos authentication.

ApplicationListener:

  • Auto-enroll as Spark engine at model-manage.

  • Support secured OAuth2 to talk with Eureka and model-manage.

Health Check (goal to leverage existing SpringCloud library actuator functionality):

  • Maintain alive/health status with Eureka.

  • If spark cluster experiences connectivity issues, it should be able to remove itself from Eureka and model-manage.