Customizing DSS installation¶
Installation configuration file¶
The installation process for Data Science Studio can be customized through the DATADIR/install.ini
configuration file.
This file is initialized with default values when the data directory is first created. It can be edited to specify a number of non-default installation options, which are then preserved upon upgrades.
Modifying this file requires running a post-installation command to propagate the changes, and restarting DSS, as follows:
# Stop DSS
DATADIR/bin/dss stop
# Edit installation options
vi DATADIR/install.ini
# Regenerate DSS configuration according to the new settings
DATADIR/bin/dssadmin regenerate-config
# Restart DSS
DATADIR/bin/dss start
The install.ini
installation configuration file is a standard INI-style Python configuration file with [section]
headers followed by key = value
entries.
The following entries are set up by the initial installation and are mandatory:
[general]
# DSS node type (design node, api node...)
nodetype = design
[server]
# DSS base port
port = 11200
Additional installation options are described throughout this manual.
Configuring HTTPS¶
By default, DSS listens to HTTP connections on the given base port, i.e. is accessible at address http://DSS_HOST:DSS_PORT
. Using installation
configuration directives, you can switch DSS to accepting HTTPS connection instead, i.e. answering https://DSS_HOST:DSS_PORT
.
You will need to generate and provide a SSL server certificate and private key file matching the domain name used by end users to reach DSS. You can
then configure DSS to switch to HTTPS by adding the following entries to the [server]
section of the install.ini
installation configuration
file:
[server]
ssl = true
ssl_certificate = PATH_TO_CERTIFICATE_FILE
ssl_certificate_key = PATH_TO_PRIVATE_KEY_FILE
ssl_ciphers = recommended
You should then regenerate DSS configuration and restart DSS, as described in Installation configuration file.
Note
The optional ssl_ciphers = recommended
configuration key restricts the set of SSL ciphers accepted by DSS to a safe subset, for better protection
against known attacks, while staying compatible with most recent browsers and DSS-supported Linux platforms.
Setting this key to default
(or omitting it altogether) does not configure any restriction on the accepted SSL ciphers, which then fall back to the default list built into the nginx server.
Note
You can also expose DSS to users over HTTPS by interposing a reverse proxy. This option is mandatory if you want to use default HTTPS port 443, as DSS cannot run with the superuser privileges necessary to listen on this port.
Note
If all DSS users access it over HTTPS, you can enforce session cookies security as described in Advanced security options.
Configuring IPv6 support¶
By default, DSS listens to IPv4 connections only. Using the following installation configuration directive, you can configure DSS to listen to IPv6 connections to its base port, in addition to IPv4 connections.
[server]
ipv6 = true
You should then regenerate DSS configuration and restart DSS, as described in Installation configuration file.
Configuring log file rotation¶
Main DSS processes log files¶
DSS processes write their log files to directory DATADIR/run
:
backend.log |
Main DSS process (backend) |
hproxy.log |
Hadoop connectivity process (hproxy, optional) |
nginx.log |
HTTP server (nginx) |
ipython.log |
Python / R notebook server (ipython) |
supervisord.log |
Process control and supervision |
By default, these log files are rotated when they reach a given size, and purged after a given number of rotations. The following installation configuration directives can be used to customize this behavior:
[logs]
# Maximum file size, default 100MB.
# Suffix multipliers "KB", "MB" and "GB" can be used in this value.
logfiles_maxbytes = SIZE
# Number of retained files, default 10.
logfiles_backups = NUMBER_OF_FILES
You should then regenerate DSS configuration and restart DSS, as described in Installation configuration file.
Additional DSS log files¶
In addition to the main log files described above, DSS generates two additional log files in directory DATADIR/run
, which are handled differently:
nginx/access.log
: This is the access log for DSS HTTP server. Under normal utilization this file grows only slowly compared to the previous ones. It is not rotated automatically, but can be rotated manually through the standard nginx procedure, or using the manual log file rotation command described below.frontend.log
: This is a low-level log for debug purposes only. It is rotated independently of the others, on a non-configurable schedule.
Manual log file rotation¶
The following command forces DSS to close and reopen its log files (main DSS processes log files and nginx access log). Combined with standard tools
like logrotate(8)
, and the possibility to disable automatic log rotation as described above, this lets you take full control over the DSS log
rotation process, and integrate it in your log file handling framework.
# Use standard Unix commands to rename DSS current log files
...
# Force DSS to reopen new log files
DATADIR/bin/dss reopenlogs