Using the APIs outside of DSS
Most of the Python APIs offered by DSS can be used outside of DSS. This allows you to:
Automate various kinds of tasks related to DSS
Develop recipes, webapps, or any other kind of DSS code in your favorite IDE, outside of DSS
Using the dataikuapi REST client package
Please see The main DSSClient class.
Important note: if you follow the instructions below for using the dataiku package, you can also use the dataiku.api_client() idiom to get a handle to a REST client. You will still need to install the package, however.
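As a minimal sketch of the dataiku.api_client() idiom mentioned above, the function below obtains a REST client handle and lists the project keys it can see. It assumes the dataiku package is installed and the connection to DSS has already been configured as described later on this page. The function name is illustrative only.

```python
def list_project_keys_via_dataiku():
    """Return the project keys visible to the configured API key."""
    # Imported lazily so this file can be loaded even on a machine
    # where the dataiku package has not been installed yet.
    import dataiku

    # Handle to a REST client, using the connection settings that were
    # supplied through code, environment variables, or the config file.
    client = dataiku.api_client()
    return client.list_project_keys()
```

Calling list_project_keys_via_dataiku() requires a reachable DSS instance and a valid API key.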
Using the dataiku package
The dataiku package is not available by name through pip. Instead, it is obtained directly from your DSS instance.
Installing the package
Directly through pip
Run:
pip install http(s)://DSS_HOST:DSS_PORT/public/packages/dataiku-internal-client.tar.gz
If you use HTTPS without a proper certificate, you may need to add --trusted-host=DSS_HOST:DSS_PORT to your pip command line.
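For scripted setups, the pip invocation above can be assembled from Python. The following is a sketch under stated assumptions: the helper names, and the host and port used in the usage note, are hypothetical. Only the package path and the --trusted-host option come from the instructions above.

```python
import subprocess
import sys

# Path of the client package on the DSS instance, as given above.
PACKAGE_PATH = "/public/packages/dataiku-internal-client.tar.gz"


def build_pip_command(host, port, https=False, trusted=False):
    """Build the pip command that installs the package from DSS."""
    scheme = "https" if https else "http"
    url = f"{scheme}://{host}:{port}{PACKAGE_PATH}"
    # Run pip through the current interpreter so the package lands in
    # the same environment as the calling script.
    cmd = [sys.executable, "-m", "pip", "install", url]
    if trusted:
        # Only for HTTPS without a proper certificate.
        cmd.append(f"--trusted-host={host}:{port}")
    return cmd


def install_dataiku_package(host, port, https=False, trusted=False):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.check_call(build_pip_command(host, port, https, trusted))
```

For example, install_dataiku_package("dss.example.com", 11200) would install from a hypothetical instance at that address.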
In a requirements.txt file
In your requirements.txt file, add a line:
http(s)://DSS_HOST:DSS_PORT/public/packages/dataiku-internal-client.tar.gz
Then update your requirements with pip install -r requirements.txt.
If you use HTTPS without a proper certificate, you may need to add --trusted-host=DSS_HOST:DSS_PORT to your pip command line.
Manually with download
Download the package’s tar.gz file from your DSS instance:
http(s)://DSS_HOST:DSS_PORT/public/packages/dataiku-internal-client.tar.gz
Install it with:
pip install dataiku-internal-client.tar.gz
Setting up the connection with DSS
In order to connect to DSS, you’ll need to supply:
The URL of DSS
A REST API key in order to perform actions
We strongly recommend that you use a personal API key. Please see Public API Keys for more information.
There are three ways to supply this information:
Through code:
import dataiku
dataiku.set_remote_dss("http(s)://DSS_HOST:DSS_PORT/", "Your API Key secret")
Through environment variables. Before starting your Python, export the following environment variables:
export DKU_DSS_URL="http(s)://DSS_HOST:DSS_PORT/"
export DKU_API_KEY="Your API key secret"
Through configuration file. Create or edit the file ~/.dataiku/config.json (or %USERPROFILE%/.dataiku/config.json on Windows), and add the following content:
{
  "dss_instances": {
    "default": {
      "url": "http(s)://DSS_HOST:DSS_PORT/",
      "api_key": "Your API key secret"
    }
  },
  "default_instance": "default"
}
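The configuration file can also be generated from a script. This is a minimal sketch, not an official tool: the function name and its config_path parameter are hypothetical, and the function overwrites any existing file. The JSON layout mirrors the content shown above.

```python
import json
from pathlib import Path


def write_dss_config(url, api_key, config_path=None):
    """Write a single-instance config file and return its path."""
    # Path.home() resolves to the user's home directory, which is
    # %USERPROFILE% on Windows.
    if config_path is None:
        path = Path.home() / ".dataiku" / "config.json"
    else:
        path = Path(config_path)
    path.parent.mkdir(parents=True, exist_ok=True)

    config = {
        "dss_instances": {
            "default": {"url": url, "api_key": api_key},
        },
        "default_instance": "default",
    }
    # Note: this replaces the whole file, including other instances.
    path.write_text(json.dumps(config, indent=2))
    # Best effort: the file holds a secret, so restrict it to the
    # current user (this has limited effect on Windows).
    path.chmod(0o600)
    return path
```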
You can now use most of the functions of the dataiku package from your own machine, independently from the DSS installation.
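When relying on the environment variables described above, a script can check that both are set before going further. The sketch below uses only the standard library. The helper name is hypothetical, and the variable names are the ones listed above.

```python
import os

# Environment variables used to supply the DSS URL and the API key.
REQUIRED_VARS = ("DKU_DSS_URL", "DKU_API_KEY")


def missing_dss_env(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

A script could call missing_dss_env() at startup and stop with a clear message if the returned list is not empty.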
If at some point you need to clear the connection settings, you can do so with the following code:
dataiku.clear_remote_dss()
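One possible way to pair set_remote_dss() with clear_remote_dss() is a small context manager, so that the settings are cleared even if the enclosed code raises. This is a sketch: the remote_dss helper is hypothetical and not part of the dataiku package. It does not restore settings that were in place before the block.

```python
from contextlib import contextmanager


@contextmanager
def remote_dss(url, api_key):
    """Set the remote DSS connection for the duration of a with-block."""
    # Imported lazily so this helper can be defined even where the
    # dataiku package is not installed.
    import dataiku

    dataiku.set_remote_dss(url, api_key)
    try:
        yield dataiku
    finally:
        # Always clear the connection settings, even on error.
        dataiku.clear_remote_dss()
```

Usage would look like: with remote_dss("http(s)://DSS_HOST:DSS_PORT/", "Your API Key secret") as dku: followed by calls on dku inside the block.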
Advanced options
Disabling SSL certificate check
If your DSS has SSL enabled, the packages will verify the certificate. In order for this to work, you may need to add the root authority that signed the DSS SSL certificate to your local trust store. Please refer to your OS or Python manual for instructions.
If this is not possible, you can also disable checking the SSL certificate:
Through code:
import dataiku
dataiku.set_remote_dss("http(s)://DSS_HOST:DSS_PORT/", "Your API Key secret", no_check_certificate=True)
Through environment variables: Not supported at the moment
Through configuration file: Modify the configuration file as follows:
{
  "dss_instances": {
    "default": {
      "url": "http(s)://DSS_HOST:DSS_PORT/",
      "api_key": "Your API key secret",
      "no_check_certificate": true
    }
  },
  "default_instance": "default"
}
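If the configuration file already exists, the flag can be switched on without retyping the rest of the file. The helper below is a hypothetical sketch that assumes the file follows the layout shown above and that the named instance is already present.

```python
import json
from pathlib import Path


def set_no_check_certificate(config_path, instance="default", enabled=True):
    """Set the no_check_certificate flag for one instance in the file."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Raises KeyError if the instance is not defined in the file.
    config["dss_instances"][instance]["no_check_certificate"] = enabled
    path.write_text(json.dumps(config, indent=2))
    return config
```

Disabling certificate verification removes protection against impersonation of the DSS server, so adding the signing authority to the local trust store remains the preferable option where possible.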