Yarn plugin: Overview

Goal

The aim of the Yarn plugin is to manage the life cycle of Yarn services.

By 'Yarn services', we mean Yarn jobs running indefinitely, such as Spark Streaming jobs.

These are also known as 'Yarn Long Running Services' or 'Yarn Long Running Jobs'.

This plugin is NOT intended to manage Batch jobs.

How it works

This is achieved by:

All these operations are performed on a specific node included in the cluster. This node is designated as a yarn_relay.

Requirement

There are some requirements for the launching script.

Yarn services deployment

The deployment of the services themselves is NOT in the scope of this plugin. Typically, this consists of:

This must be done at least on the Yarn relay node, and possibly on one or several other nodes for resiliency.

All these tasks can be achieved using the HADeploy folders and files specifications.

The templating mechanism and the Maven repository support built into the files plugin will be of great help here.

Actions stop, start and status

The Yarn plugin introduces three new actions:

hadeploy --src ..... --action start

will start all services described by the yarn_services list, and

hadeploy --src ..... --action stop

will kill the same services, while

hadeploy --src ..... --action status

will display current status of the services, in a rather primitive form.

Also, the Yarn plugin kills all running services in one of the first steps of the removal action (--action remove).

Of course, all of this applies only to services HADeploy is aware of (defined with yarn_services). Other services will not be impacted.

Services shutdown

When HADeploy is instructed to halt all services (--action stop), by default it uses the RM REST API to set each application to the 'KILLED' state. This is equivalent to a yarn application --kill command.
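As a rough illustration of what this default corresponds to, the sketch below requests the same state change by hand with curl, using the ResourceManager REST endpoint /ws/v1/cluster/apps/{appid}/state. The function names, the RM address and the application ID are placeholders chosen for this example and are not part of HADeploy. Authentication options, which most clusters require for this call (for example SPNEGO on a Kerberos cluster), are deliberately left out.

```shell
# Build the ResourceManager URL that exposes an application's state.
# $1: RM host:port (placeholder supplied by the caller)
# $2: Yarn application ID
rm_app_state_url() {
    printf 'http://%s/ws/v1/cluster/apps/%s/state' "$1" "$2"
}

# Ask the ResourceManager to move the application to the KILLED state.
# Assumption: no authentication options are passed; add them as your
# cluster requires.
rm_kill_app() {
    curl -s -X PUT \
        -H 'Content-Type: application/json' \
        -d '{"state": "KILLED"}' \
        "$(rm_app_state_url "$1" "$2")"
}

# Usage (placeholder values):
# rm_kill_app rm.example.com:8088 application_1500000000000_0001
```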

An alternative way to shut down a Yarn job is to provide a script that issues the kill command, and to reference this script with the killing_cmd attribute. This can be used in the following case:

Notifications: Services restart

Let's say we now want to update the service's jar or one of the associated configuration files.

We can modify it and trigger a new deployment. HADeploy will notice the modification and push the new version to the target hosts. However, the running services will be unaffected.

We could restart them manually, but HADeploy provides a mechanism to automate this: add a notify attribute to the files definition. See the example below.

Ranger support

Ranger handling of Yarn jobs is based on Yarn queue management. HADeploy allows you to define such permissions using yarn_ranger_policies.

Example

Here is a snippet describing the deployment of a simple Yarn service, 'datastep':


vars:
  yarn_launcher_host: en1
  basedir: "/opt/datastep"
  user: dsrunner
  group: dsrunner
  datastep_version: "0.1.0-SNAPSHOT"

yarn_relay:
  host: ${yarn_launcher_host}

maven_repositories:
- name: myrepo
  snapshots_url: http://myrepo.mydomain.com/nexus/repository/maven-snapshots/
  releases_url: http://myrepo.mydomain.com/nexus/repository/maven-releases/

folders:
- { path: "${basedir}", scope: "${yarn_launcher_host}", owner: "${user}", group: "${group}", mode: "755" }

files:
- { scope: "${yarn_launcher_host}", src: "mvn://myrepo/com.mydomain/datastep/${datastep_version}/uber", 
    notify: ['yarn://datastep'], dest_folder: "${basedir}", owner: "${user}", group: "${group}", mode: "0644" }

- { scope: "${yarn_launcher_host}", src: "tmpl://submit.sh", dest_folder: "${basedir}", 
    notify: ['yarn://datastep'], owner: "${user}", group: "${group}", mode: "0744" }

- { scope: "${yarn_launcher_host}", src: "tmpl://kill.sh", dest_folder: "${basedir}", 
    notify: ['yarn://datastep'], owner: "${user}", group: "${group}", mode: "0744" }

yarn_services:
- name: datastep
  launching_cmd: ./submit.sh
  killing_cmd: ./kill.sh
  launching_dir: ${basedir}

And here is what could be a simplistic submit script template:

#!/bin/bash

{% if kerberos is defined and kerberos %}
kinit -kt /etc/security/keytabs/{{user}}.keytab {{user}}
{% endif %}

spark-submit --name datastep --master yarn --deploy-mode cluster --class com.mydomain.datastep.Main \
    --conf "spark.yarn.submit.waitAppCompletion=false" \
    {{basedir}}/datastep-{{datastep_version}}-uber.jar

{% if kerberos is defined and kerberos %}
kdestroy
{% endif %}

And a killing script:

#!/bin/bash

{% if kerberos is defined and kerberos %}
kinit -kt /etc/security/keytabs/{{user}}.keytab {{user}}
{% endif %}

APPLICATION_ID=$(yarn application --appStates RUNNING --list 2>/dev/null | awk "{ if (\$2==\"datastep\") print \$1 }")

if [ "$APPLICATION_ID" = "" ]
then
    echo "?? Not running"
else
    yarn application --kill ${APPLICATION_ID} 2>/dev/null
    echo "$APPLICATION_ID Killed!"
fi

{% if kerberos is defined and kerberos %}
kdestroy
{% endif %}
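The application lookup used in the killing script can be isolated into a small function and checked without a cluster. This is only a sketch: find_app_id is a hypothetical helper, and the sample lines fed to it merely imitate the ID and name columns that the script's awk filter relies on. Real listings carry more columns.

```shell
# Print the ID (first column) of every listed application whose name
# (second column) equals $1. The application listing is read on stdin.
find_app_id() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Illustrative input only, standing in for the output of
# "yarn application --appStates RUNNING --list".
printf '%s\n' \
    'application_1500000000000_0001 datastep SPARK' \
    'application_1500000000000_0002 other SPARK' | find_app_id datastep
```

Passing the name through an awk variable avoids the escaped quotes of the inline version and lets the same filter serve several services.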

The deployment snippet above is of course not complete, as it lacks at least the target cluster definition.

Please refer to yarn_relay and yarn_services for a complete description, and to files for the notify syntax.

Of course, before the services can be launched (--action start), a deployment must be performed first (--action deploy).
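This ordering can be captured in a small bash wrapper. The sketch below rests on assumptions: run_actions and the HADEPLOY_DRY_RUN variable are hypothetical names introduced for this example, and the source file is whatever you normally pass to --src.

```shell
# Run the given HADeploy actions in order against one source file,
# stopping at the first action that fails.
# $1: value for --src; remaining arguments: action names.
# With HADEPLOY_DRY_RUN=1 the commands are printed instead of executed.
run_actions() {
    local src="$1"
    local action
    shift
    for action in "$@"; do
        if [ "${HADEPLOY_DRY_RUN:-0}" = "1" ]; then
            echo "hadeploy --src $src --action $action"
        else
            hadeploy --src "$src" --action "$action" || return 1
        fi
    done
}

# Deploy first, then start the services and display their status:
# run_actions <your source file> deploy start status
```

Running it once with HADEPLOY_DRY_RUN=1 shows the exact commands before anything touches the cluster.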