MapR

DSS supports MapR clusters with the following versions:

  • MapR Core Components: versions 5.2.0 to 6.0.1
  • MapR Ecosystem Pack (MEP): versions 1.1.x, 3.0.x, 4.1.x and 5.0.0

Security

  • Connecting to MapR clusters secured with MapR security is supported through a custom installation sequence described below.
  • Multi-user security is not supported.

Connecting to secure MapR clusters

DSS can connect to secure MapR clusters using a permanent service ticket, which a cluster administrator issues ahead of time and which DSS locates through the MAPR_TICKETFILE_LOCATION environment variable.
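As a rough illustration of this mechanism, the sketch below checks, from the DSS service account, that MAPR_TICKETFILE_LOCATION is set, is an absolute path, and points to a readable file. The function name check_mapr_ticket and its return codes are hypothetical and not part of DSS or MapR. It only inspects the environment and the file system and does not validate the ticket contents.

```shell
# Hypothetical helper (not part of DSS or MapR): sanity-check the ticket
# location that the environment currently advertises.
# Return codes (assumed convention): 0 ok, 1 unset, 2 not absolute, 3 unreadable.
check_mapr_ticket() {
    # The variable must be set and non-empty.
    if [ -z "${MAPR_TICKETFILE_LOCATION:-}" ]; then
        echo "MAPR_TICKETFILE_LOCATION is not set" >&2
        return 1
    fi
    # The path should be absolute so it resolves from any working directory.
    case "$MAPR_TICKETFILE_LOCATION" in
        /*) ;;
        *)
            echo "MAPR_TICKETFILE_LOCATION is not an absolute path" >&2
            return 2
            ;;
    esac
    # The current account must be able to read the ticket file.
    if [ ! -r "$MAPR_TICKETFILE_LOCATION" ]; then
        echo "Ticket file is missing or not readable: $MAPR_TICKETFILE_LOCATION" >&2
        return 3
    fi
    return 0
}
```

Once the variable has been defined for the DSS service account, running check_mapr_ticket && echo "ticket file found" in a new login shell should print the confirmation message.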

The installation sequence thus becomes the following:

  • Open a shell session to a cluster administrator account (typically mapr).

  • Create a permanent service ticket for the service account used by DSS as follows:

    maprlogin generateticket -type service -user DSS_USER -out DSS_TICKET_FILE
    

    This creates a permanent service ticket (default duration: 10,000 years). You can adjust the ticket lifetime with additional options to maprlogin generateticket, such as -duration.

    You can check the service ticket generated with:

    maprlogin print -ticketfile DSS_TICKET_FILE
    
  • Store this ticket file in a location that is accessible to the DSS service account and private to it (not readable by other users).

  • Define the following environment variable in the persistent session initialization file for the DSS service account (.bash_profile, .profile or equivalent):

    export MAPR_TICKETFILE_LOCATION=/ABSOLUTE/PATH/TO/DSS_TICKET_FILE
    
  • Switch to the DSS service account, and run the DSS installation or upgrade command as usual:

    /PATH/TO/dataiku-dss-VERSION/installer.sh ARGS ...
    

    This script will detect that the cluster is secure, and warn you that it will not automatically run the Hadoop integration step.

  • Run the install-hadoop-integration command with no arguments:

    /PATH/TO/DSS_DATADIR/bin/dssadmin install-hadoop-integration
    

    This script will warn you that you did not specify a Kerberos principal and keytab even though the cluster is secure. Press <Enter> to confirm.

    The Hadoop integration step should proceed without errors, using the ticket file to authenticate to the cluster.

  • You can then run the Spark and/or R integration steps using the standard procedures, as needed.

  • Start DSS and connect to the user interface using an administrator account:

    /PATH/TO/DSS_DATADIR/bin/dss start
    
  • Complete the installation by configuring the HiveServer2 and, optionally, Impala connection parameters as appropriate for your cluster.

    If using the default HiveServer2 authentication mode for secure MapR (MapR-SASL), the HiveServer2 connection parameters should be:

    • Principal : leave empty
    • Extra URL : auth=maprsasl;saslQop=auth-conf
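To test the same HiveServer2 settings outside of DSS, the sketch below builds a JDBC URL carrying the MapR-SASL parameters listed above. The function name hive_jdbc_url is hypothetical, and the default port (10000) and database (default) are assumptions to adapt to your cluster. How DSS itself assembles its connection URL from the "Extra URL" field is not shown here.

```shell
# Hypothetical helper: print a HiveServer2 JDBC URL that carries the
# MapR-SASL parameters from the "Extra URL" setting.
# Arguments: host [port] [database]; port and database defaults are assumptions.
hive_jdbc_url() {
    host=$1
    port=${2:-10000}
    db=${3:-default}
    # Session parameters are appended to the URL, separated by semicolons.
    printf 'jdbc:hive2://%s:%s/%s;auth=maprsasl;saslQop=auth-conf\n' \
        "$host" "$port" "$db"
}
```

For example, from the DSS service account (with MAPR_TICKETFILE_LOCATION set), beeline -u "$(hive_jdbc_url HIVESERVER2_HOST)" should open a session if the ticket is accepted. HIVESERVER2_HOST is a placeholder for your HiveServer2 host name.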

Others