MapR¶
Warning
Support for MapR has been removed. We recommend that users plan a migration toward a Kubernetes-based infrastructure.
DSS used to support MapR clusters with the following versions:
MapR Core Components: versions 5.2.0 to 6.1.0
MapR Ecosystem Pack (MEP): versions 3.0.x, and 4.1.x to 6.0.0
Security¶
Connecting to MapR clusters secured with MapR security is supported through a custom installation sequence described below.
User isolation is not supported.
Connecting to secure MapR clusters¶
DSS can connect to secure MapR clusters through a permanent service ticket, issued ahead of time by a cluster administrator, and accessed through the MAPR_TICKETFILE_LOCATION environment variable.
The installation sequence thus becomes the following:
Open a shell session to a cluster administrator account (typically mapr).
Create a permanent service ticket for the service account used by DSS as follows:
maprlogin generateticket -type service -user DSS_USER -out DSS_TICKET_FILE
This creates a permanent service ticket (default duration 10 000 years). You can further adjust this with options to maprlogin.
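For example, to issue a shorter-lived ticket instead (assuming your maprlogin version supports the -duration option, expressed as days:hours:minutes), you could run:
maprlogin generateticket -type service -user DSS_USER -out DSS_TICKET_FILE -duration 365:0:0
Note that with a shorter duration the ticket must be regenerated before it expires, or DSS will lose access to the cluster.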
You can check the service ticket generated with:
maprlogin print -ticketfile DSS_TICKET_FILE
Store this ticket file in a location accessible to, and private to, the DSS service account.
Define the following environment variable in the persistent session initialization file for the DSS service account (.bash_profile, .profile or equivalent):
export MAPR_TICKETFILE_LOCATION=/ABSOLUTE/PATH/TO/DSS_TICKET_FILE
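To check that the variable is picked up in a fresh session, you can, for example, log in as the DSS service account and print the ticket it points to:
maprlogin print -ticketfile "$MAPR_TICKETFILE_LOCATION"
This should display the same ticket information as in the previous step.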
Switch to the DSS service account, and run the DSS installation or upgrade command as usual:
/PATH/TO/dataiku-dss-VERSION/installer.sh ARGS ...
This script will detect that the cluster is secure, and warn you that it will not automatically run the Hadoop integration step.
Run the install-hadoop-integration command with no arguments:
/PATH/TO/DSS_DATADIR/bin/dssadmin install-hadoop-integration
This script will warn you that you did not specify a Kerberos principal and keytab even though the cluster is secure. Press <Enter> to confirm.
The Hadoop integration step should proceed without errors, using the ticket file to authenticate to the cluster.
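As an optional sanity check, you can verify from the DSS service account that the ticket grants access to the cluster filesystem, for example:
hadoop fs -ls /
If this lists the top-level MapR-FS directories without authentication errors, the ticket is being used correctly.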
You can then run the Spark and/or R integration steps using the standard procedures, as needed.
Start DSS and connect to the user interface using an administrator account:
/PATH/TO/DSS_DATADIR/bin/dss start
Complete the installation by configuring HiveServer2 and optionally Impala connection parameters as suitable for your cluster.
If using the default HiveServer2 authentication mode for secure MapR (MapR-SASL), the HiveServer2 connection parameters should be:
Principal: leave empty
Extra URL: auth=maprsasl;saslQop=auth-conf
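For reference, with these settings the resulting HiveServer2 JDBC URL typically has the following form (HIVESERVER2_HOST and the port are placeholders for your cluster, and the exact URL may differ):
jdbc:hive2://HIVESERVER2_HOST:10000/default;auth=maprsasl;saslQop=auth-conf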
Limitations¶
Using S3 as a Hadoop filesystem (see Hadoop filesystems connections (HDFS, S3, EMRFS, WASB, ADLS, GS)) is not supported
Validation of Hive recipes with “UNION” or “UNION ALL” statements is not possible