Using the APIs outside of DSS

Most of the Python APIs offered by DSS can be used outside of DSS. This allows you to:

  • Automate various kinds of tasks related to DSS

  • Develop recipes, webapps, or any other kind of DSS code in your favorite IDE, outside of DSS.

Using the dataikuapi REST client package

Please see The main DSSClient class

Important note: if you follow the instructions below for using the dataiku package, you can also use the dataiku.api_client() idiom to get a handle to a REST client. You’ll still need to install the package, however.
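As an illustration of this idiom, here is a minimal sketch that points the dataiku package at a remote instance and then asks the REST client for the list of project keys. The helper name list_remote_project_keys is hypothetical, and the URL and API key are placeholders you supply. The import is done inside the function so that the module can still be loaded on a machine where the package is not installed yet.

```python
def list_remote_project_keys(dss_url, api_key):
    """Return the project keys visible to `api_key` on the DSS at `dss_url`.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Imported lazily; the package must have been installed from your DSS instance.
    import dataiku

    # Tell the dataiku package which instance to talk to.
    dataiku.set_remote_dss(dss_url, api_key)

    # Get a handle to the REST client without building one by hand.
    client = dataiku.api_client()
    return client.list_project_keys()
```

A call would look like list_remote_project_keys("http(s)://DSS_HOST:DSS_PORT/", "Your API Key secret").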

Using the dataiku package

The dataiku package cannot be installed by name through pip. Instead, it is obtained directly from your DSS instance, which serves it as a tar.gz archive that pip can install.

Installing the package

Directly through pip

Run:

pip install http(s)://DSS_HOST:DSS_PORT/public/packages/dataiku-internal-client.tar.gz

If you use HTTPS without a proper certificate, you may need to add --trusted-host=DSS_HOST:DSS_PORT to your pip command line.
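If you install the package from a script, for example when provisioning a development environment, the command line can be assembled programmatically. The following is a minimal sketch: build_install_command and its parameters are hypothetical names, and only the URL path and the --trusted-host option come from the instructions above.

```python
import sys


def build_install_command(host, port, use_https=False, trust_host=False):
    """Build the pip command line that installs the dataiku package from DSS.

    Hypothetical helper; only the URL path and the --trusted-host option
    are taken from the installation instructions.
    """
    scheme = "https" if use_https else "http"
    url = f"{scheme}://{host}:{port}/public/packages/dataiku-internal-client.tar.gz"

    # Run pip through the current interpreter so that the package lands in
    # the environment that is executing this script.
    command = [sys.executable, "-m", "pip", "install"]
    if trust_host:
        # Only needed when using HTTPS without a proper certificate.
        command.append(f"--trusted-host={host}:{port}")
    command.append(url)
    return command
```

The returned list can be passed to subprocess.check_call to actually run the installation.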

In a requirements.txt file

In your requirements.txt file, add a line:

http(s)://DSS_HOST:DSS_PORT/public/packages/dataiku-internal-client.tar.gz

Then update your requirements with pip install -r requirements.txt.

If you use HTTPS without a proper certificate, you may need to add --trusted-host=DSS_HOST:DSS_PORT to your pip command line.

Manually with download

  • Download the package’s tar.gz file from your DSS instance: http(s)://DSS_HOST:DSS_PORT/public/packages/dataiku-internal-client.tar.gz

  • Install it with pip install dataiku-internal-client.tar.gz

Setting up the connection with DSS

In order to connect to DSS, you’ll need to supply:

  • The URL of DSS

  • A REST API key in order to perform actions

We strongly recommend that you use a personal API key. Please see Public API Keys for more information.

There are three ways to supply this information:

  • Through code:

import dataiku

dataiku.set_remote_dss("http(s)://DSS_HOST:DSS_PORT/", "Your API Key secret")

  • Through environment variables. Before starting Python, export the following environment variables:

export DKU_DSS_URL="http(s)://DSS_HOST:DSS_PORT/"
export DKU_API_KEY="Your API key secret"

  • Through a configuration file. Create or edit the file ~/.dataiku/config.json (or %USERPROFILE%/.dataiku/config.json on Windows), and add the following content:

{
  "dss_instances": {
    "default": {
      "url": "http(s)://DSS_HOST:DSS_PORT/",
      "api_key": "Your API key secret"
    }
  },
  "default_instance": "default"
}
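If you prefer to generate the configuration file rather than edit it by hand, the sketch below writes the same JSON layout with the standard library only. The helper name write_dss_config is hypothetical, and the sketch assumes that the instance being written should become the default one. It keeps any other instances already declared in the file.

```python
import json
from pathlib import Path


def write_dss_config(url, api_key, config_path=None):
    """Create or update the configuration file with a "default" DSS instance.

    Hypothetical helper; the JSON layout mirrors the example above.
    """
    if config_path is None:
        # On Windows, Path.home() typically resolves to %USERPROFILE%.
        config_path = Path.home() / ".dataiku" / "config.json"
    config_path = Path(config_path)

    # Preserve any instances that are already declared in the file.
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))

    config.setdefault("dss_instances", {})["default"] = {
        "url": url,
        "api_key": api_key,
    }
    # Assumption: the instance written here should be the default one.
    config["default_instance"] = "default"

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config_path
```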

You can now use most of the functions of the dataiku package from your own machine, independently from the DSS installation.
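For instance, once the connection settings have been supplied in one of the three ways above, reading a dataset from your own machine could look like the following minimal sketch. The helper name preview_dataset and its parameters are hypothetical. The project key is passed explicitly on the assumption that, outside DSS, there may be no current project from which to infer it.

```python
def preview_dataset(project_key, dataset_name, max_rows=100):
    """Return the first rows of a DSS dataset as a pandas DataFrame.

    Hypothetical helper; assumes the connection to DSS is already configured.
    """
    # Imported lazily; the package must have been installed from your DSS instance.
    import dataiku

    # Name the project explicitly rather than relying on a current project.
    dataset = dataiku.Dataset(dataset_name, project_key=project_key)

    # Only fetch the first `max_rows` rows to keep the transfer small.
    return dataset.get_dataframe(limit=max_rows)
```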

If at some point you need to clear the connection settings, you can do so with the following code:

dataiku.clear_remote_dss()

This clears the connection settings. If you are running the client inside a DSS instance, it will then target the API of that instance. Otherwise, you will need to set up the connection again by following the procedure described in Setting up the connection with DSS above.
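If you only need the remote connection for a limited block of code, setting and clearing it can be paired in a context manager. This is a minimal sketch, not a feature of the package: the name remote_dss is hypothetical, and it only combines the set_remote_dss and clear_remote_dss calls shown above.

```python
from contextlib import contextmanager


@contextmanager
def remote_dss(url, api_key):
    """Configure the remote DSS connection for the duration of a with-block.

    Hypothetical helper built on dataiku.set_remote_dss and
    dataiku.clear_remote_dss.
    """
    # Imported lazily; the package must have been installed from your DSS instance.
    import dataiku

    dataiku.set_remote_dss(url, api_key)
    try:
        # Hand the configured module to the caller.
        yield dataiku
    finally:
        # Always clear the settings, even if the block raised an exception.
        dataiku.clear_remote_dss()
```

It would be used as: with remote_dss("http(s)://DSS_HOST:DSS_PORT/", "Your API Key secret") as dku: followed by the code that needs the connection.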

Advanced options

Disabling SSL certificate check

If your DSS has SSL enabled, the packages will verify the certificate. In order for this to work, you may need to add the root authority that signed the DSS SSL certificate to your local trust store. Please refer to your OS or Python manual for instructions.

If this is not possible, you can also disable checking the SSL certificate:

  • Through code:

import dataiku

dataiku.set_remote_dss("http(s)://DSS_HOST:DSS_PORT/", "Your API Key secret", no_check_certificate=True)

  • Through environment variables: Not supported at the moment

  • Through configuration file: Modify the configuration file as follows:

{
  "dss_instances": {
    "default": {
      "url": "http(s)://DSS_HOST:DSS_PORT/",
      "api_key": "Your API key secret",
      "no_check_certificate": true
    }
  },
  "default_instance": "default"
}
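To apply this change to an existing configuration without editing it by hand, a small helper can set the flag on one instance and leave the rest untouched. This is a minimal sketch using only the standard library; the name disable_certificate_check is hypothetical, and it works on the JSON text so that you decide where to read it from and write it to.

```python
import json


def disable_certificate_check(config_text, instance="default"):
    """Return the configuration JSON with certificate checking disabled.

    Hypothetical helper; only the "no_check_certificate" key comes from
    the configuration example above.
    """
    config = json.loads(config_text)
    instances = config.get("dss_instances", {})
    if instance not in instances:
        raise KeyError(f"No DSS instance named {instance!r} in the configuration")

    # Set the flag on the chosen instance only; other settings are preserved.
    instances[instance]["no_check_certificate"] = True
    return json.dumps(config, indent=2)
```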