ranger_relay

Synopsis

All Apache Ranger commands will be performed using Ranger HTTP/REST API. This definition will provide informations for HADeploy to use this interface.

There should be only one entry of this type in the HADeploy definition file.

Attributes

ranger_relay is a map with the following attributes:

Name req? Description
host yes From which host these commands will be launched. Typically, an edge node in the cluster. But may be a host outside the cluster.
ranger_url yes The Ranger base URL to access Ranger API. Same host:port as the Ranger Admin GUI. Typically:
http://myranger.server.com:6080
or
https://myranger.server.com:6182
ranger_username no The user name to log on the Ranger Admin. Must have enough rights to manage policies
ranger_password no The password associated with the admin_username. May be encrypted. Refer to encrypted variables
principal no A Kerberos principal allowing all Ranger related operation to be performed. See below
local_keytab_path no A local path to the associated keytab file. This path is relative to the embeding file. See below
relay_keytab_path no A path to the associated keytab file on the relay host. See below
tools_folder no Folder used by HADeploy to store keytab if needed.
Default: /tmp/hadeploy_<user>/ where user is the ssh_user defined for this relay host.
validate_certs no Useful if Ranger Admin connection is using SSL. If no, SSL certificates will not be validated. This should only be used on personally controlled sites using self-signed certificates.
Default: yes
ca_bundle_relay_file no Useful if Ranger Admin connection is using SSL. Allow to specify a CA_BUNDLE file, a file that contains root and intermediate certificates to validate the Ranger Admin certificate in .pem format.
This file will be looked up on the relay host system, on which this module will be executed.
ca_bundle_local_file no Same as above, except this file will be looked up locally, relative to the main file. It will be copied on the relay host at the location defined by ca_bundle_relay_file
hdfs_service_name no In most cases, you should not need to set this parameter. It defines the Ranger Admin HDFS service, typically: <yourClusterName>_hadoop.
It must be set if there are several such services defined in your Ranger Admin configuration, to select the one you intend to use.
hbase_service_name no In most cases, you should not need to set this parameter. It defines the Ranger Admin HBase service, typically: <yourClusterName>_hbase.
It must be set if there are several such services defined in your Ranger Admin configuration, to select the one you intend to use.
kafka_service_name no In most cases, you should not need to set this parameter. It defines the Ranger Admin Kafka service, typically: <yourClusterName>_kafka.
It must be set if there are several such services defined in your Ranger Admin configuration, to select the one you intend to use.
hive_service_name no In most cases, you should not need to set this parameter. It defines the Ranger Admin Hive service, typically: <yourClusterName>_hive.
It must be set if there are several such services defined in your Ranger Admin configuration, to select the one you intend to use.
yarn_service_name no In most cases, you should not need to set this parameter. It defines the Ranger Admin Yarn service, typically: <yourClusterName>_yarn.
It must be set if there are several such services defined in your Ranger Admin configuration, to select the one you intend to use.
storm_service_name no In most cases, you should not need to set this parameter. It defines the Ranger Admin Storm service, typically: <yourClusterName>_storm.
It must be set if there are several such services defined in your Ranger Admin configuration, to select the one you intend to use.
policy_name_decorator no To distinguish Ranger policy managed by HADeploy, a naming convention is applied by default. The policy name, as it will appears in the GUI Ranger interface will be in the form HAD[<policyName>], where <policyName> is the name of the policy as you provide it.
This is achieved by wrapping the name with this pattern, where {0} is substituted with the policy name.
For python aware reader, this is performed as:
"HAD[{0}]".format(policyName).
If you just want to have raw policy name, simply define the parameter with {0}.
Default: HAD[{0}]
no_log no Boolean. Prevent some messages to be displayed, as they can contains sensitive information. This can make debugging somewhat more difficult, so temporary setting this swith to False may be useful to get more information (Including sensitive ones!).
Default True
when no Boolean. Allow conditional deployment of this item.
Default True

Kerberos authentication

When principal and ..._keytab_path variables are defined, Kerberos authentication will be activated for all Ranger operations. This means a kinit will be issued with provided values before any Ranger access, and a kdestroy issued after. This has the following consequences:

Regarding the keytab file, two cases:

Example

A simple configuration:

ranger_relay:
  host: en1
  ranger_url:  http://ranger.mycluster.mycompany.com:6080
  ranger_username: admin
  ranger_password: admin

A more secure configuration, with https, certificate validation and encrypted password.

encrypted_vars:
  ranger_password: |
    $ANSIBLE_VAULT;1.1;AES256
    34396662613462623565323936616330623661623065343033646136643635653430636238613962
    3537343131346462343138343064313937646366363435340a633532366162623838376436366362
    61393033343932303636653066336130616132383463373934396265306364363562613565613165
    6163613739303430650a356136353865623534643237646166393230613933396166663963633538
    3664

ranger_relay:
  host: en1
  ranger_url:  https://ranger.mycluster.mycompany.com:6182
  ranger_username: admin
  ranger_password: "{{ranger_password}}"  
  ca_bundle_local_file: cert/ranger_mycluster_cert.pem
  ca_bundle_relay_file: /etc/security/certs/ranger_mycluster_cert.pem

When using such encryption feature, you will need to provide a password when launching HADeploy. Otherwise you will have an error like the following:

The offending line appears to be:

  vars:
      rangerPassword: !vault |
                      ^ here

More detail on how to encrypt a value and providing a password on execution at encrypted variables

A configuration using Kerberos authentication:

ranger_relay:
  host: en1
  ranger_url:  http://ranger.mycluster.mycompany.com:6080
  principal: sa
  local_keytab_path: ./sa-gate17.keytab  

CA_BUNDLE

Internally, HADeploy use the python requests API to access Ranger. The provided ca_bundle_relay_file will be used as the verify parameter of all HTTP requests. More info here.

In some cases, a CA_BUNDLE may be simply the certificate of the Ranger server, in PEM format.

To grab this certificate, you may use a tiny python program like the following:

import ssl

if __name__ == '__main__':
    cert = ssl.get_server_certificate(("ranger.mycluster.corp.com", 6182), ssl_version=ssl.PROTOCOL_SSLv23)
    print cert
    f = open("cert.pem", "w")
    f.write(cert)
    f.close()