Connections

DSS connections can be listed, created, modified and deleted through the API. These operations are restricted to API keys with the “admin rights” flag.

A list of the connections can be obtained with the dataikuapi.dssclient.DSSClient.list_connections() method:

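import pprint
from dataikuapi import DSSClient
prettyprinter = pprint.PrettyPrinter(indent=4)
client = DSSClient(host, apiKey)  # host and apiKey are placeholders for your instance URL and admin API key
dss_connections = client.list_connections()
prettyprinter.pprint(dss_connections)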

outputs

{   'filesystem_managed': {   'allowManagedDatasets': True,
                               'allowMirror': False,
                               'allowWrite': True,
                               'allowedGroups': [],
                               'maxActivities': 0,
                               'name': 'filesystem_managed',
                               'params': {   'root': '${dip.home}/managed_datasets'},
                               'type': 'Filesystem',
                               'usableBy': 'ALL',
                               'useGlobalProxy': True},
    'hdfs_root': {   'allowManagedDatasets': True,
                     'allowMirror': False,
                     'allowWrite': True,
                     'allowedGroups': [],
                     'maxActivities': 0,
                     'name': 'hdfs_root',
                     'params': {'database': 'dataik', 'root': '/'},
                     'type': 'HDFS',
                     'usableBy': 'ALL',
                     'useGlobalProxy': False},
    'local_postgress': {   'allowManagedDatasets': True,
                           'allowMirror': False,
                           'allowWrite': True,
                           'allowedGroups': [],
                           'maxActivities': 0,
                           'name': 'local_postgress',
                           'params': {   'db': 'testdb',
                                         'host': 'localhost',
                                         'password': 'admin',
                                         'port': '5432',
                                         'properties': {   },
                                         'user': 'admin'},
                           'type': 'PostgreSQL',
                           'usableBy': 'ALL',
                           'useGlobalProxy': False},
    ...
}
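Since the result shown above is a plain dict keyed by connection name, it can be filtered with ordinary Python. The following is a minimal sketch, not part of the dataikuapi package: the helper name connections_of_type is hypothetical, and the sample dict is a trimmed copy of the output above so that the example runs without a DSS instance.

```python
def connections_of_type(connections, conn_type):
    """Return the sorted names of connections whose 'type' equals conn_type.

    `connections` is assumed to be shaped like the output above:
    {name: {'type': ..., ...}, ...}. Illustrative helper, not a dataikuapi call.
    """
    return sorted(
        name
        for name, definition in connections.items()
        if definition.get("type") == conn_type
    )


# Trimmed copy of the sample output, used instead of a live list_connections() call
sample_connections = {
    "filesystem_managed": {"type": "Filesystem", "usableBy": "ALL"},
    "hdfs_root": {"type": "HDFS", "usableBy": "ALL"},
    "local_postgress": {"type": "PostgreSQL", "usableBy": "ALL"},
}

postgres_names = connections_of_type(sample_connections, "PostgreSQL")
print(postgres_names)
```

Against a real instance, the same helper could presumably be called on the value returned by client.list_connections().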

Connections can be created with the create_connection() method:

new_connection_params = {'db': 'mysql_test', 'host': 'localhost', 'password': 'admin',
                         'properties': [{'name': 'useSSL', 'value': 'true'}], 'user': 'admin'}
new_connection = client.create_connection('test_connection', type='MySql', params=new_connection_params,
                                          usable_by='ALLOWED', allowed_groups=['data_scientists'])
prettyprinter.pprint(client.list_connections()['test_connection'])

outputs

{   'allowManagedDatasets': True,
    'allowMirror': True,
    'allowWrite': True,
    'allowedGroups': ['data_scientists'],
    'maxActivities': 0,
    'name': 'test_connection',
    'params': {   'db': 'mysql_test',
                   'host': 'localhost',
                   'password': 'admin',
                   'properties': {   },
                   'user': 'admin'},
    'type': 'MySql',
    'usableBy': 'ALLOWED',
    'useGlobalProxy': True}
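The definition printed above includes the connection password in clear text. When such output ends up in logs or notebooks, it may be preferable to mask it first. The sketch below is one possible approach; redact_definition and its default list of secret keys are assumptions made for illustration and are not part of dataikuapi.

```python
import copy


def redact_definition(definition, secret_keys=("password",)):
    """Return a deep copy of a connection definition with secrets masked.

    Only keys directly under 'params' are inspected. The default
    secret_keys tuple is an assumption; extend it for other connection types.
    """
    redacted = copy.deepcopy(definition)
    params = redacted.get("params", {})
    for key in secret_keys:
        if key in params:
            params[key] = "********"
    return redacted


# Trimmed copy of the definition shown above
definition = {
    "name": "test_connection",
    "type": "MySql",
    "params": {"db": "mysql_test", "host": "localhost",
               "password": "admin", "user": "admin"},
}

safe = redact_definition(definition)
print(safe["params"])
```

The original dict is left untouched, so it can still be passed back to DSS unchanged.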

To modify a connection, it is advised to first retrieve the connection definition with a get_definition() call, alter the definition, and set it back into DSS:

connection_definition = new_connection.get_definition()
connection_definition['usableBy'] = 'ALL'
connection_definition['allowWrite'] = False
new_connection.set_definition(connection_definition)
prettyprinter.pprint(new_connection.get_definition())

outputs

{   'allowManagedDatasets': True,
    'allowMirror': True,
    'allowWrite': False,
    'allowedGroups': ['data_scientists'],
    'maxActivities': 0,
    'name': 'test_connection',
    'params': {   'db': 'mysql_test',
                   'host': 'localhost',
                   'password': 'admin',
                   'properties': {   },
                   'user': 'admin'},
    'type': 'MySql',
    'usableBy': 'ALL',
    'useGlobalProxy': True}
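The retrieve, alter and set-back pattern above can be wrapped in a small helper. This is a sketch under stated assumptions: update_connection is a hypothetical name, the handle only needs the get_definition() and set_definition() methods used above, and FakeConnection is a stand-in so the example runs without a DSS instance. Rejecting keys that are absent from the current definition is a design choice of this sketch, meant to catch typos such as 'usabelBy'.

```python
def update_connection(connection, **changes):
    """Apply top-level changes to a connection definition and save it back."""
    definition = connection.get_definition()
    # Refuse keys that the current definition does not contain (likely typos)
    unknown = sorted(key for key in changes if key not in definition)
    if unknown:
        raise KeyError("Unknown definition keys: " + ", ".join(unknown))
    definition.update(changes)
    connection.set_definition(definition)
    return definition


class FakeConnection:
    """Stand-in for a connection handle; not a dataikuapi class."""

    def __init__(self, definition):
        self._definition = dict(definition)

    def get_definition(self):
        return dict(self._definition)

    def set_definition(self, definition):
        self._definition = dict(definition)


fake = FakeConnection(
    {"name": "test_connection", "usableBy": "ALLOWED", "allowWrite": True}
)
updated = update_connection(fake, usableBy="ALL", allowWrite=False)
print(updated)
```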

Connections can be deleted through their handle:

connection = client.get_connection('test_connection')
connection.delete()
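In cleanup scripts it can be convenient to delete a connection only if it is actually listed. The sketch below uses only the list_connections(), get_connection() and delete() calls shown above, and assumes list_connections() returns a dict keyed by connection name as in the earlier output. delete_connection_if_exists, FakeClient and FakeHandle are hypothetical names; the fakes exist only so the example runs without a DSS instance.

```python
def delete_connection_if_exists(client, name):
    """Delete the named connection if it is listed; return True if deleted."""
    if name not in client.list_connections():
        return False
    client.get_connection(name).delete()
    return True


class FakeHandle:
    """Stand-in for a connection handle; not a dataikuapi class."""

    def __init__(self, client, name):
        self._client = client
        self._name = name

    def delete(self):
        del self._client.connections[self._name]


class FakeClient:
    """Stand-in for DSSClient exposing only the calls used above."""

    def __init__(self, names):
        self.connections = {name: {"name": name} for name in names}

    def list_connections(self):
        return dict(self.connections)

    def get_connection(self, name):
        return FakeHandle(self, name)


fake_client = FakeClient(["test_connection", "filesystem_managed"])
first = delete_connection_if_exists(fake_client, "test_connection")
second = delete_connection_if_exists(fake_client, "test_connection")
print(first, second)
```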

Detailed examples

This section contains more advanced examples on Connections.

Mass-change filesystem Connections

You can programmatically switch all Datasets of a Project from a given filesystem Connection to a different one, thus reproducing the “Change Connection” action available in the Dataiku Flow UI.

import dataiku

def mass_change_connection(project, origin_conn, dest_conn):
    """Mass change dataset connections in a project (filesystem connections only)
    """

    all_datasets = project.list_datasets()
    # list_datasets() returns a list, so iterate over it directly
    for d in all_datasets:
        ds = project.get_dataset(d["name"])
        ds_def = ds.get_definition()
        # Only filesystem datasets that currently use the origin connection are changed
        if ds_def["type"] == "Filesystem":
            if ds_def["params"]["connection"] == origin_conn:
                # Point the dataset at the destination connection, then save
                ds_def["params"]["connection"] = dest_conn
                ds.set_definition(ds_def)

client = dataiku.api_client()
project = client.get_default_project()
mass_change_connection(project, "FSCONN_SOURCE", "FSCONN_DEST")
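Before running a mass change, it may help to list which datasets would be affected. The sketch below applies the same selection logic as mass_change_connection without writing anything. datasets_on_connection is a hypothetical helper, and FakeProject and FakeDataset are stand-ins exposing only list_datasets(), get_dataset() and get_definition(), so the example runs without a DSS instance.

```python
def datasets_on_connection(project, conn_name, dataset_type="Filesystem"):
    """Return names of datasets of dataset_type that use connection conn_name."""
    names = []
    for item in project.list_datasets():
        definition = project.get_dataset(item["name"]).get_definition()
        if definition["type"] != dataset_type:
            continue
        if definition["params"].get("connection") == conn_name:
            names.append(item["name"])
    return names


class FakeDataset:
    """Stand-in for a dataset handle; not a dataikuapi class."""

    def __init__(self, definition):
        self._definition = definition

    def get_definition(self):
        return self._definition


class FakeProject:
    """Stand-in for a project handle; not a dataikuapi class."""

    def __init__(self, definitions):
        self._definitions = definitions

    def list_datasets(self):
        return [{"name": name} for name in self._definitions]

    def get_dataset(self, name):
        return FakeDataset(self._definitions[name])


fake_project = FakeProject({
    "orders": {"type": "Filesystem", "params": {"connection": "FSCONN_SOURCE"}},
    "clients": {"type": "Filesystem", "params": {"connection": "FSCONN_OTHER"}},
    "events": {"type": "PostgreSQL", "params": {"connection": "FSCONN_SOURCE"}},
})
affected = datasets_on_connection(fake_project, "FSCONN_SOURCE")
print(affected)
```

Only "orders" matches here: "clients" uses a different connection and "events" is not a filesystem dataset.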

Reference documentation

dataikuapi.dss.admin.DSSConnection(client, name)

A connection on the DSS instance.

dataikuapi.dss.admin.DSSConnectionInfo(data)

A class holding read-only information about a connection.

dataikuapi.dss.admin.DSSConnectionListItem(...)

An item in a list of connections.

dataikuapi.dss.admin.DSSConnectionSettings(...)

Settings of a DSS connection.

dataikuapi.dss.admin.DSSConnectionDetailsReadability(data)

Handle on settings for access to connection details.