Prerequisites and limitations

Prerequisites

Hadoop

DSS multi-user security is supported on:

  • Cloudera CDH 5.8 to 5.11
  • Hortonworks HDP 2.5 and 2.6
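
When checking a cluster against the ranges above, note that version numbers must be compared component by component: read as a decimal number, 5.10 would sort below 5.8. The following is a minimal sketch of such a check. The names `SUPPORTED`, `parse_version` and `is_supported` are illustrative and not part of DSS, and it assumes the ranges are inclusive and that maintenance releases (such as 5.11.2) count as part of their minor version.

```python
# Supported ranges as listed above, stored as inclusive (major, minor) bounds.
SUPPORTED = {
    "CDH": ((5, 8), (5, 11)),
    "HDP": ((2, 5), (2, 6)),
}


def parse_version(text):
    """Turn a string such as '5.10.1' into (5, 10).

    Only the major and minor components are kept; the input must contain
    at least 'major.minor'.
    """
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


def is_supported(distribution, version):
    """Return True if the distribution and version fall in a listed range."""
    bounds = SUPPORTED.get(distribution.upper())
    if bounds is None:
        return False
    low, high = bounds
    # Tuple comparison is component-wise, so (5, 10) correctly sorts after (5, 8).
    return low <= parse_version(version) <= high
```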

The following configuration is required on your Hadoop cluster:

  • ACL support must be enabled.
  • Kerberos security must be enabled.
  • You need a keytab for the dssuser, as described in Connecting to secure clusters.
  • You need administrator access to your Hadoop cluster to set up multi-user security (specifically, to set up the DSS impersonation authorization).
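
The first two requirements can be checked from the cluster's client configuration files. The sketch below reads the standard Hadoop properties `dfs.namenode.acls.enabled` (from hdfs-site.xml) and `hadoop.security.authentication` (from core-site.xml). The helper names are hypothetical, and treating an absent property as "disabled" is an assumption that matches the Hadoop 2 defaults; your distribution's management console remains the authoritative source.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return {name: value} from the text of a Hadoop *-site.xml file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_cluster_config(core_site_xml, hdfs_site_xml):
    """Report whether ACLs and Kerberos appear to be enabled."""
    core = read_hadoop_properties(core_site_xml)
    hdfs = read_hadoop_properties(hdfs_site_xml)
    # Assumption: a missing property means the Hadoop 2 default applies
    # (ACLs off, "simple" authentication).
    acls = hdfs.get("dfs.namenode.acls.enabled", "false").lower() == "true"
    auth = core.get("hadoop.security.authentication", "simple").lower()
    return {"acls_enabled": acls, "kerberos_enabled": auth == "kerberos"}
```

To use it, pass the contents of the two files as strings, for example by reading them from the Hadoop configuration directory of the DSS host.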

Spark

Multi-user security is only supported when Spark runs in yarn-client mode.
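
As an illustration, the check below decides whether a set of Spark properties describes yarn-client mode. It is a sketch, not DSS code: the function name is hypothetical, and it assumes the settings are available as a plain dictionary. It accepts both the older `yarn-client` master value and the combination of `spark.master=yarn` with `spark.submit.deployMode=client` (client being Spark's default deploy mode).

```python
def runs_in_yarn_client_mode(conf):
    """Return True if the given Spark properties select yarn-client mode."""
    master = conf.get("spark.master", "")
    # Spark uses the "client" deploy mode when none is specified.
    deploy_mode = conf.get("spark.submit.deployMode", "client")
    if master == "yarn-client":
        return True
    return master == "yarn" and deploy_mode == "client"
```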

Hive security

  • You must have Sentry or Ranger enabled.
  • HiveServer2 impersonation must be disabled (this is the default setting).

DSS can work with a restricted-access Hive metastore (i.e., when only HiveServer2 can talk to the Metastore server), but due to limitations in Spark, a restricted-access metastore disables the following features:

  • Using a HiveContext in Spark code recipes (SQLContext remains available)
  • Using table definitions from the Hive metastore in Spark code recipes (including SparkSQL)
  • Running some visual recipes on Spark (since they require HiveContext-only features)

See Interaction with Hive and Impala, Interaction with Spark, and DSS and Hive for more information.

Local machine

  • ACL support must be enabled on the filesystem which hosts the DSS data directory.
  • You need root access to set up multi-user security.

Accounts

For each UNIX / Hadoop user that will be impersonated by a DSS end-user (see Concepts for more details), the following requirements must be met:

  • The user must have a valid UNIX account.
  • The user must have a valid shell (i.e., must be able to perform shell actions).
  • The user must have a writable home directory on HDFS.
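
The first two requirements can be verified on the DSS host with Python's standard `pwd` module. The sketch below is illustrative only: the function names and the `NON_LOGIN_SHELLS` list are assumptions to adapt to your system, and it does not check the HDFS home directory, which needs access to the cluster.

```python
import pwd

# Shells that conventionally deny shell access.
# Assumption: adjust this list to match your system.
NON_LOGIN_SHELLS = {
    "/sbin/nologin",
    "/usr/sbin/nologin",
    "/bin/false",
    "/usr/bin/false",
}


def has_usable_shell(shell):
    """Return True unless the shell is a known 'no login' placeholder.

    An empty field is accepted, since passwd(5) defines it as the
    system default shell.
    """
    return shell not in NON_LOGIN_SHELLS


def check_unix_account(username, lookup=pwd.getpwnam):
    """Check that the account exists and has a usable shell.

    `lookup` defaults to the local account database and can be replaced
    for testing.
    """
    try:
        entry = lookup(username)
    except KeyError:
        return {"account_exists": False, "usable_shell": False}
    return {
        "account_exists": True,
        "usable_shell": has_usable_shell(entry.pw_shell),
    }
```

Calling `check_unix_account("alice")` for each user to be impersonated (where `alice` is a placeholder name) gives a quick report of accounts that need attention.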

Groups

Each group of users in DSS should have a matching group of users locally and on Hadoop.
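
A simple way to audit this is to compare the three lists of group names. The sketch below assumes that "matching" means "has the same name", which may not hold if you use a custom mapping; the function name and the way the lists are obtained are left to you and are not part of DSS.

```python
def missing_groups(dss_groups, local_groups, hadoop_groups):
    """Return {group: [places where it is missing]} for each DSS group.

    Groups present both locally and on Hadoop are omitted from the result.
    """
    local_groups = set(local_groups)
    hadoop_groups = set(hadoop_groups)
    report = {}
    for group in sorted(set(dss_groups)):
        missing = []
        if group not in local_groups:
            missing.append("local")
        if group not in hadoop_groups:
            missing.append("hadoop")
        if missing:
            report[group] = missing
    return report
```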

LDAP

While manual configuration of all user accounts is fully possible, we recommend that you use an LDAP directory as a single source of truth for all user and group mappings, in DSS, on UNIX, and on Hadoop.

DSS

Migrating a DSS instance which was previously running with regular security is strongly discouraged. We highly recommend starting with an empty DSS instance when setting up multi-user security.

“Downgrading” a DSS instance from multi-user security to regular security is not supported.

Required information

In addition to the above prerequisites, you need to gather some information.

You will need to obtain an initial list of UNIX groups that your end users belong to. Only users belonging to these groups will be allowed to use the impersonation mechanisms.
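
If the end users already have accounts on the DSS host, the initial list can be drafted from the local account database. The sketch below uses the standard `pwd`, `grp` and `os.getgrouplist` calls. The function names are illustrative, and it assumes that the host resolves the same users and groups as your directory, which you should confirm before relying on the result.

```python
import grp
import os
import pwd


def unix_groups_of(username):
    """Return the names of all UNIX groups the user belongs to."""
    entry = pwd.getpwnam(username)
    names = set()
    for gid in os.getgrouplist(username, entry.pw_gid):
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # The gid has no name in the group database; skip it.
            continue
    return names


def collect_groups(usernames, groups_of=unix_groups_of):
    """Return the sorted union of the groups of all given users.

    `groups_of` can be replaced, for example to query a directory instead
    of the local host.
    """
    result = set()
    for name in usernames:
        result |= set(groups_of(name))
    return sorted(result)
```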

Limitations

Unsafe features

When multi-user security is enabled, the following features are not available for end-users unless they have the “Write unsafe code” permission:

  • Write custom models in machine learning
  • Write custom partition dependency functions
  • Write Python UDFs in data preparation

For more information about the “Write unsafe code” permission, see Write unsafe code: details.

Hadoop distributions

Multi-user security is not supported on MapR and EMR.

HDFS datasets

Writing in “append” mode to an HDFS dataset is only possible if the same end-user always performs the writes. Appending by multiple Hadoop end-users is not supported.