Java runtime environment

Customizing Java runtime options

The main backend of Data Science Studio is a Java application, and its runtime options can be customized.

What can be customized

All Java options can be customized via environment files.

Most often, you will want to customize the -Xmx option, which sets the maximum heap size of the Java process.

By default, -Xmx is set to 2 GB. This may not be enough when exploring samples of large datasets, in which case the backend can crash.

Customization instructions

Runtime options (like other environment variables) are set in the DATA_DIR/bin/env-site.sh file, where DATA_DIR is the Data Science Studio data directory.

The runtime options are stored in the DKU_BACKEND_JAVA_OPTS variable. Its default value is defined in DATA_DIR/bin/env-default.sh.

Note

On Mac OS X, the DATA_DIR is always: $HOME/Library/DataScienceStudio/dss_home

  • Open the DATA_DIR/bin/env-default.sh file
  • Copy the line starting with export DKU_BACKEND_JAVA_OPTS
  • Open the DATA_DIR/bin/env-site.sh file
  • Paste the DKU_BACKEND_JAVA_OPTS line and modify it to your needs
  • Restart Data Science Studio: DATA_DIR/bin/dss restart
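The copy-and-modify steps above can also be scripted. The following is only a sketch: raise_backend_xmx is a hypothetical helper (not part of Data Science Studio), and it assumes the default line in env-default.sh already contains an -Xmx option with a numeric value. If it does not, the line is copied unchanged and you need to edit it by hand.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Data Science Studio.
# Copies the default DKU_BACKEND_JAVA_OPTS line into env-site.sh
# and replaces its -Xmx value with the requested one.
raise_backend_xmx() {
    data_dir=$1   # the DSS data directory (DATA_DIR)
    new_xmx=$2    # new maximum heap size, e.g. 4g

    default_file="$data_dir/bin/env-default.sh"
    site_file="$data_dir/bin/env-site.sh"

    # Find the default export line; stop if there is none.
    line=$(grep '^export DKU_BACKEND_JAVA_OPTS=' "$default_file" | head -n 1)
    [ -n "$line" ] || return 1

    # Rewrite only the -Xmx value and append the result to env-site.sh.
    # Running this twice appends two lines; the last one takes effect.
    printf '%s\n' "$line" \
        | sed "s/-Xmx[0-9][0-9]*[kKmMgG]*/-Xmx$new_xmx/" >> "$site_file"
}
```

For example, raise_backend_xmx "$HOME/Library/DataScienceStudio/dss_home" 4g would apply the change on Mac OS X. Check the resulting line in env-site.sh, then restart with DATA_DIR/bin/dss restart.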

Customizing the JVM

Data Science Studio requires an installation of Java Development Kit version 7 or 8. Supported versions are OpenJDK (http://openjdk.java.net) and Oracle JDK (http://www.oracle.com/technetwork/java/javase/downloads/index.html).

As part of the standard Data Science Studio installation, the installer looks for a suitable version of Java in standard locations. If none is found, the dependency installation phase pulls in the OpenJDK 7 system package appropriate for the distribution.

You can force Data Science Studio to use a specific version of Java (for example, when there are several versions installed on the server, or when you manually installed Java in a non-standard place) by setting the DKUJAVABIN environment variable while running the DSS installer script. This variable should point to the java binary to use. For example:

$ DKUJAVABIN=/usr/local/bin/java dataiku-dss-VERSION/installer.sh <INSTALLER_OPTIONS>

Note that the installer script stores this value in the file DATA_DIR/bin/env-default.sh, so you do not need to define it permanently for the Linux user account running the Studio.
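Before passing a path in DKUJAVABIN, it can help to confirm that it points to a working java binary and to see which version it reports. The sketch below uses a hypothetical helper, check_java_bin, which is not part of the DSS installer. It relies only on the standard java -version option.

```shell
#!/bin/sh
# Hypothetical helper, not part of the DSS installer.
# Prints the first line of the version banner for the given java binary.
check_java_bin() {
    java_bin=$1

    # The path must be a regular, executable file.
    if [ ! -f "$java_bin" ] || [ ! -x "$java_bin" ]; then
        echo "not an executable file: $java_bin" >&2
        return 1
    fi

    # java -version writes to stderr, so merge it into stdout.
    "$java_bin" -version 2>&1 | head -n 1
}

# Usage sketch, with the same placeholder path as the example above:
#   check_java_bin /usr/local/bin/java
```

If the reported version is a supported JDK 7 or 8, the same path can then be given to the installer through DKUJAVABIN.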