DSS behind a reverse proxy

There are several cases where you might want to run DSS behind a reverse proxy:

  • Expose DSS on the standard HTTP port (80)
  • Expose DSS over HTTPS (on port 443)

Note about Websockets

Data Science Studio uses the WebSocket protocol for parts of its user interface. This web protocol is fairly recent, and not yet supported by all HTTP proxies.

Make sure that any forward or reverse proxy sitting between Data Science Studio and its users supports WebSocket and is configured to pass it through.

At the time of writing, this includes nginx version 1.3.13 and above (see nginx websocket proxying) and Apache 2.4.5 and above (with mod_proxy_wstunnel).

See Troubleshooting websockets for related details and troubleshooting advice.

HTTP deployment behind an nginx reverse proxy

The following configuration snippet can be adapted to expose the Data Science Studio interface through an external nginx web server. This accommodates deployments where users should reach DSS through a base URL other than its native host and port (for example, on the standard HTTP port 80, or under a different host name).

# nginx reverse proxy configuration for Dataiku Data Science Studio
# requires nginx version 1.4 or above
server {
    # Host/port on which to expose Data Science Studio to users
    listen 80;
    server_name dss.example.com;
    location / {
        # Base url of the Data Science Studio installation
        proxy_pass http://DSS_HOST:DSS_PORT/;
        proxy_redirect off;
        # Allow long queries
        proxy_read_timeout 3600;
        proxy_send_timeout 600;
        # Allow large uploads
        client_max_body_size 0;
        # Allow protocol upgrade to websocket
        proxy_http_version 1.1;
        proxy_set_header Host $http_host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
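
The configuration above sends "Connection: upgrade" to DSS on every request. A variant described in the nginx WebSocket proxying documentation sets that header only when the client actually asks for an upgrade. The sketch below is an optional alternative, not a DSS requirement; the variable name $connection_upgrade is an arbitrary choice, and DSS_HOST and DSS_PORT are the same placeholders as above.

```nginx
# Optional variant (assumption: this file is included in the http context,
# which is where the map directive must live).
# $connection_upgrade is "upgrade" when the client sent an Upgrade header,
# and "close" otherwise.
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    listen 80;
    server_name dss.example.com;
    location / {
        proxy_pass http://DSS_HOST:DSS_PORT/;
        proxy_redirect off;
        proxy_read_timeout 3600;
        proxy_send_timeout 600;
        client_max_body_size 0;
        proxy_http_version 1.1;
        proxy_set_header Host $http_host;
        proxy_set_header Upgrade $http_upgrade;
        # Only request a protocol switch when the client asked for one
        proxy_set_header Connection $connection_upgrade;
    }
}
```

After editing, the syntax can be checked with "nginx -t" before reloading the server.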

Warning

Data Science Studio does not currently support being remapped to a base URL with a non-empty path prefix (that is, to http://HOST:PORT/PREFIX/ where PREFIX is not empty).

HTTPS/SSL deployment behind an nginx reverse proxy

DSS can also be accessed using secure HTTPS connections, provided you have a valid certificate for the host name on which it should be visible (some browsers do not accept secure WebSocket connections using untrusted certificates).

You can configure this by deploying an nginx reverse proxy server, either on the same host as Data Science Studio or on a different one, using a variant of the following configuration snippet:

# nginx SSL reverse proxy configuration for Dataiku Data Science Studio
# requires nginx version 1.4 or above
server {
    # Host/port on which to expose Data Science Studio to users
    listen 443 ssl;
    server_name dss.example.com;
    ssl_certificate /etc/nginx/ssl/dss_server_cert.pem;
    ssl_certificate_key /etc/nginx/ssl/dss_server.key;
    location / {
        # Base url of the Data Science Studio installation
        proxy_pass http://DSS_HOST:DSS_PORT/;
        proxy_redirect off;
        # Allow long queries
        proxy_read_timeout 3600;
        proxy_send_timeout 600;
        # Allow large uploads
        client_max_body_size 0;
        # Allow protocol upgrade to websocket
        proxy_http_version 1.1;
        proxy_set_header Host $http_host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
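
The HTTPS configuration above listens on port 443 only, so users who type a plain http:// URL get no answer from the proxy. If that matters for your deployment, a companion server block can redirect them. This is a minimal sketch rather than part of the DSS-provided configuration; it assumes port 80 is free on the proxy host and reuses the example host name dss.example.com.

```nginx
# Optional companion block (assumption: nothing else listens on port 80
# for this server_name on the proxy host).
server {
    listen 80;
    server_name dss.example.com;
    # Send every plain-HTTP request to the same path over HTTPS
    return 301 https://$host$request_uri;
}
```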

Deployment behind an Apache reverse proxy

The following configuration snippet can be used to forward DSS through an Apache HTTP server:

# Apache reverse proxy configuration for Dataiku Data Science Studio
# requires Apache version 2.4.5 or above
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_wstunnel_module modules/mod_proxy_wstunnel.so

<VirtualHost *:80>
    ServerName dss.example.com
    ProxyPass / ws://DSS_HOST:DSS_PORT/
    ProxyPassReverse / ws://DSS_HOST:DSS_PORT/
    ProxyPreserveHost on
    ProxyTimeout 3600
</VirtualHost>

Warning

Data Science Studio does not currently support being remapped to a base URL with a non-empty path prefix (that is, to http://HOST:PORT/PREFIX/ where PREFIX is not empty).