The main DSSClient class
The REST API Python client makes it easy to write client programs for the DSS REST API in Python. It is distributed in the dataikuapi Python package.
The client is the entrypoint for many of the capabilities listed in this chapter.
For more details about the two Dataiku packages, see Python APIs, Using the APIs inside of DSS and Using the APIs outside of DSS.
Creating a client from inside DSS
To work with the API, a connection needs to be established with DSS, by creating a DSSClient
object. Once the connection is established, the DSSClient object serves as the entry point to the other calls.
The Python client can be used from inside DSS. In that case:
- It’s preinstalled, you don’t need to do anything
- You don’t need to provide any API key, as the API client will automatically inherit connection credentials from the current context
import dataiku
client = dataiku.api_client()
# client is now a DSSClient and can perform all authorized actions.
# For example, list the project keys for which you have access
client.list_project_keys()
Creating a client from outside DSS
To work with the API, a connection needs to be established with DSS, by creating a DSSClient
object. Once the connection is established, the DSSClient object serves as the entry point to the other calls.
When running outside of DSS, you first need to install the client from pip:
pip install dataiku-api-client
This installs the client in the system-wide Python installation, so if you are not using virtualenv, you may need to replace pip with sudo pip.
Note that this will always install the latest version of the API client. You might need to request a version compatible with your version of DSS.
When connecting from the outside world, you need an API key. See Public API Keys for more information on how to create an API key and the associated privileges.
You also need to connect using the base URL of your DSS instance.
import dataikuapi
host = "http://localhost:11200"
apiKey = "some_key"
client = dataikuapi.DSSClient(host, apiKey)
# client is now a DSSClient and can perform all authorized actions.
# For example, list the project keys for which the API key has access
client.list_project_keys()
Disabling SSL certificate check
If your DSS has SSL enabled, the package will verify the certificate. In order for this to work, you may need to add the root authority that signed the DSS SSL certificate to your local trust store. Please refer to your OS or Python manual for instructions.
If this is not possible, you can also disable checking the SSL certificate by setting client._session.verify = False
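As a minimal sketch of this fallback, the helper below builds a client and optionally turns off certificate verification. The function name `connect` and its `verify_ssl` parameter are hypothetical; only `dataikuapi.DSSClient(host, api_key)` and the `client._session.verify` attribute come from this page. Skipping verification removes the protection against a spoofed server, so it is probably best kept to test setups.

```python
def connect(host, api_key, verify_ssl=True):
    """Create a DSSClient, optionally skipping SSL certificate verification.

    `connect` and `verify_ssl` are illustrative names, not part of dataikuapi.
    """
    # Imported lazily so that the helper can be defined even where the
    # package is not installed yet.
    import dataikuapi

    client = dataikuapi.DSSClient(host, api_key)
    if not verify_ssl:
        # Same switch as described above: the client no longer checks
        # the certificate presented by DSS.
        client._session.verify = False
    return client
```

A call such as `connect("https://dss.example.com", "some_key", verify_ssl=False)` would then return a client that accepts a self-signed certificate (the URL and key are placeholders).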
Reference API doc
Also see Reference API documentation of dataikuapi.
-
class
dataikuapi.
DSSClient
(host, api_key=None, internal_ticket=None, extra_headers=None)¶ Entry point for the DSS API client
-
list_futures
(as_objects=False, all_users=False)¶ List the currently-running long tasks (a.k.a futures)
Parameters: - as_objects (boolean) – if True, each returned item will be a dataikuapi.dss.future.DSSFuture
- all_users (boolean) – if True, returns futures for all users (requires admin privileges). Else, only returns futures for the user associated with the current authentication context (if any)
Returns: list of futures. If as_objects is True, each future in the list is a dataikuapi.dss.future.DSSFuture. Else, each future in the list is a dict. Each dict contains at least a ‘jobId’ field
Return type: list of dataikuapi.dss.future.DSSFuture or list of dict
-
list_running_scenarios
(all_users=False)¶ List the running scenarios
Parameters: all_users (boolean) – if True, returns scenarios for all users (requires admin privileges). Else, only returns scenarios for the user associated with the current authentication context (if any) Returns: list of running scenarios, each one as a dict containing at least a “jobId” field for the future hosting the scenario run, and a “payload” field with scenario identifiers Return type: list of dicts
-
get_future
(job_id)¶ Get a handle to interact with a specific long task (a.k.a future). This notably allows aborting this future.
Parameters: job_id (str) – the identifier of the desired future (which can be returned by list_futures() or list_running_scenarios()) Returns: A handle to interact with the future Return type: dataikuapi.dss.future.DSSFuture
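The listing and handle calls can be combined to abort long tasks. The sketch below is one possible way to do it: `abort_my_futures` and `should_abort` are hypothetical names, and the `abort()` method on the returned handle is assumed from the statement that a future handle "notably allows aborting this future".

```python
def abort_my_futures(client, should_abort=lambda future: True):
    """Abort the current user's long tasks selected by `should_abort`.

    Returns the list of job ids for which an abort was requested.
    """
    aborted = []
    # With as_objects=False (the default), each item is a dict that
    # contains at least a 'jobId' field.
    for future in client.list_futures():
        if should_abort(future):
            job_id = future["jobId"]
            # get_future() returns a DSSFuture handle; abort() is assumed
            # to be the method that cancels the task.
            client.get_future(job_id).abort()
            aborted.append(job_id)
    return aborted
```

Passing a predicate keeps the selection logic outside the helper, so the same function can abort everything or only the tasks matching a given job id.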
-
list_running_notebooks
(as_objects=True)¶ List the currently-running Jupyter notebooks
Parameters: as_objects (boolean) – if True, each returned item will be a dataikuapi.dss.notebook.DSSNotebook
Returns: list of notebooks. If as_objects is True, each entry in the list is a dataikuapi.dss.notebook.DSSNotebook. Else, each item in the list is a dict which contains at least a “name” field. Return type: list of dataikuapi.dss.notebook.DSSNotebook or list of dict
-
get_root_project_folder
()¶ Get a handle to interact with the root project folder.
Returns: A dataikuapi.dss.projectfolder.DSSProjectFolder to interact with this project folder
-
get_project_folder
(project_folder_id)¶ Get a handle to interact with a project folder.
Parameters: project_folder_id (str) – the project folder ID of the desired project folder Returns: A dataikuapi.dss.projectfolder.DSSProjectFolder to interact with this project folder
-
list_project_keys
()¶ List the project keys (=project identifiers).
Returns: list of project keys, as strings Return type: list of strings
-
list_projects
()¶ List the projects
Returns: a list of projects, each as a dict. Each dict contains at least a ‘projectKey’ field Return type: list of dicts
-
get_project
(project_key)¶ Get a handle to interact with a specific project.
Parameters: project_key (str) – the project key of the desired project Returns: A dataikuapi.dss.project.DSSProject to interact with this project
-
get_default_project
()¶ Get a handle to the current default project, if available (i.e. if dataiku.default_project_key() is valid)
-
create_project
(project_key, name, owner, description=None, settings=None, project_folder_id=None)¶ Creates a new project, and returns a project handle to interact with it.
Note: this call requires an API key with admin rights or the rights to create a project
Parameters: - project_key (str) – the identifier to use for the project. Must be globally unique
- name (str) – the display name for the project.
- owner (str) – the login of the owner of the project.
- description (str) – a description for the project.
- settings (dict) – Initial settings for the project (can be modified later). The exact possible settings are not documented.
- project_folder_id (str) – the project folder ID in which the project will be created (root project folder if not specified)
Returns: A dataikuapi.dss.project.DSSProject project handle to interact with this project
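Because project keys must be globally unique, a script that may run more than once usually checks for the key before creating the project. The sketch below shows one way to do that with the calls documented above; the helper name `get_or_create_project` is hypothetical.

```python
def get_or_create_project(client, project_key, name, owner, description=None):
    """Return a handle on `project_key`, creating the project if needed.

    Creation requires an API key with admin rights or the right to
    create projects, as noted above.
    """
    # list_project_keys() returns the keys as a list of strings.
    if project_key in client.list_project_keys():
        return client.get_project(project_key)
    return client.create_project(project_key, name, owner,
                                 description=description)
```

Note that the check and the creation are two separate calls, so two scripts racing on the same key could still collide; the sketch does not try to handle that case.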
-
list_apps
()¶ List the apps
Returns: a list of apps, each as a dict. Each dict contains at least an ‘appId’ field Return type: list of dicts
-
get_app
(app_id)¶ Get a handle to interact with a specific app.
Parameters: app_id (str) – the id of the desired app Returns: A dataikuapi.dss.app.DSSApp to interact with this app
-
list_plugins
()¶ List the installed plugins
Returns: list of dict. Each dict contains at least an ‘id’ field
-
install_plugin_from_archive
(fp)¶ Install a plugin from a plugin archive (as a file object)
Parameters: fp (object) – A file-like object pointing to a plugin archive zip
-
install_plugin_from_store
(plugin_id)¶ Install a plugin from the Dataiku plugin store
Parameters: plugin_id (str) – identifier of the plugin to install Returns: A DSSFuture representing the install process
-
install_plugin_from_git
(repository_url, checkout='master', subpath=None)¶ Install a plugin from a Git repository. DSS must be set up to allow access to the repository.
Parameters: - repository_url (str) – URL of a Git remote
- checkout (str) – branch/tag/SHA1 to check out. For example “master”
- subpath (str) – Optional, path within the repository to use as plugin. Should contain a ‘plugin.json’ file
Returns: A DSSFuture representing the install process
-
get_plugin
(plugin_id)¶ Get a handle to interact with a specific plugin
Parameters: plugin_id (str) – the identifier of the desired plugin Returns: A dataikuapi.dss.plugin.DSSPlugin
-
sql_query
(query, connection=None, database=None, dataset_full_name=None, pre_queries=None, post_queries=None, type='sql', extra_conf=None, script_steps=None, script_input_schema=None, script_output_schema=None, script_report_location=None, read_timestamp_without_timezone_as_string=True, read_date_as_string=False)¶ Initiate a SQL, Hive or Impala query and get a handle to retrieve the results of the query. Internally, the query is run by DSS. The database to run the query on is specified either by passing a connection name, or by passing a database name, or by passing a dataset full name (whose connection is then used to retrieve the database)
Parameters: - query (str) – the query to run
- connection (str) – the connection on which the query should be run (exclusive of database and dataset_full_name)
- database (str) – the database on which the query should be run (exclusive of connection and dataset_full_name)
- dataset_full_name (str) – the dataset on the connection of which the query should be run (exclusive of connection and database)
- pre_queries (list) – (optional) array of queries to run before the query
- post_queries (list) – (optional) array of queries to run after the query
- type (str) – the type of query: either ‘sql’, ‘hive’ or ‘impala’
Returns: A dataikuapi.dss.sqlquery.DSSSQLQuery query handle
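The connection, database and dataset_full_name parameters are mutually exclusive, which is easy to get wrong in calling code. The wrapper below is a minimal sketch that checks this before delegating to sql_query; `run_query` and `query_type` are hypothetical names, and the "exactly one target" rule is an interpretation of the description above.

```python
def run_query(client, query, connection=None, database=None,
              dataset_full_name=None, query_type="sql"):
    """Start a query after checking that exactly one target is given."""
    targets = [connection, database, dataset_full_name]
    if sum(target is not None for target in targets) != 1:
        raise ValueError("Pass exactly one of connection, database "
                         "or dataset_full_name")
    if query_type not in ("sql", "hive", "impala"):
        raise ValueError("Unsupported query type: %r" % (query_type,))
    # Delegates to the documented call and returns its DSSSQLQuery handle.
    return client.sql_query(query, connection=connection, database=database,
                            dataset_full_name=dataset_full_name,
                            type=query_type)
```

Failing early on the client side gives a clearer error than a rejected request, at the cost of duplicating a rule that DSS itself presumably enforces.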
-
list_users
()¶ List all users set up on the DSS instance
Note: this call requires an API key with admin rights
Returns: A list of users, as a list of dicts Return type: list of dicts
-
get_user
(login)¶ Get a handle to interact with a specific user
Parameters: login (str) – the login of the desired user Returns: A dataikuapi.dss.admin.DSSUser user handle
-
create_user
(login, password, display_name='', source_type='LOCAL', groups=None, profile='DATA_SCIENTIST')¶ Create a user, and return a handle to interact with it
Note: this call requires an API key with admin rights
Parameters: - login (str) – the login of the new user
- password (str) – the password of the new user
- display_name (str) – the displayed name for the new user
- source_type (str) – the type of new user. Admissible values are ‘LOCAL’ or ‘LDAP’
- groups (list) – the names of the groups the new user belongs to (defaults to [])
- profile (str) – The profile for the new user, can be one of READER, DATA_ANALYST or DATA_SCIENTIST
Returns: A dataikuapi.dss.admin.DSSUser user handle
-
get_own_user
()¶
-
list_groups
()¶ List all groups set up on the DSS instance
Note: this call requires an API key with admin rights
Returns: A list of groups, as a list of dicts Return type: list of dicts
-
get_group
(name)¶ Get a handle to interact with a specific group
Parameters: name (str) – the name of the desired group Returns: A dataikuapi.dss.admin.DSSGroup group handle
-
create_group
(name, description=None, source_type='LOCAL')¶ Create a group, and return a handle to interact with it
Note: this call requires an API key with admin rights
Parameters: - name (str) – the name of the new group
- description (str) – (optional) a description of the new group
- source_type – the type of the new group. Admissible values are ‘LOCAL’ and ‘LDAP’
Returns: A dataikuapi.dss.admin.DSSGroup group handle
-
list_connections
()¶ List all connections set up on the DSS instance
Note: this call requires an API key with admin rights
Returns: All connections, as a dict of connection name to connection definition Return type: dict
-
get_connection
(name)¶ Get a handle to interact with a specific connection
Parameters: name (str) – the name of the desired connection Returns: A dataikuapi.dss.admin.DSSConnection connection handle
-
create_connection
(name, type, params=None, usable_by='ALL', allowed_groups=None)¶ Create a connection, and return a handle to interact with it
Note: this call requires an API key with admin rights
Parameters: - name – the name of the new connection
- type – the type of the new connection
- params (dict) – the parameters of the new connection, as a JSON object (defaults to {})
- usable_by – the type of access control for the connection. Either ‘ALL’ (=no access control) or ‘ALLOWED’ (=access restricted to users of a list of groups)
- allowed_groups (list) – when using access control (that is, setting usable_by=’ALLOWED’), the list of names of the groups whose users are allowed to use the new connection (defaults to [])
Returns: A dataikuapi.dss.admin.DSSConnection connection handle
-
list_code_envs
()¶ List all code envs set up on the DSS instance
Note: this call requires an API key with admin rights
Returns: a list of code envs. Each code env is a dict containing at least “name”, “type” and “language”
-
get_code_env
(env_lang, env_name)¶ Get a handle to interact with a specific code env
Parameters: - env_lang – the language (PYTHON or R) of the desired code env
- env_name (str) – the name of the desired code env
Returns: A dataikuapi.dss.admin.DSSCodeEnv code env handle
-
create_code_env
(env_lang, env_name, deployment_mode, params=None)¶ Create a code env, and return a handle to interact with it
Note: this call requires an API key with admin rights
Parameters: - env_lang – the language (PYTHON or R) of the new code env
- env_name – the name of the new code env
- deployment_mode – the type of the new code env
- params – the parameters of the new code env, as a JSON object
Returns: A dataikuapi.dss.admin.DSSCodeEnv code env handle
-
list_clusters
()¶ List all clusters set up on the DSS instance
Returns: list of clusters (name, type, state)
-
get_cluster
(cluster_id)¶ Get a handle to interact with a specific cluster
Parameters: cluster_id – the name of the desired cluster Returns: A dataikuapi.dss.admin.DSSCluster cluster handle
-
create_cluster
(cluster_name, cluster_type='manual', params=None)¶ Create a cluster, and return a handle to interact with it
Parameters: - cluster_name – the name of the new cluster
- cluster_type – the type of the new cluster
- params – the parameters of the new cluster, as a JSON object
Returns: A dataikuapi.dss.admin.DSSCluster cluster handle
-
list_global_api_keys
()¶ List all global API keys set up on the DSS instance
Note: this call requires an API key with admin rights
Returns: All global API keys, as a list of dicts
-
get_global_api_key
(key)¶ Get a handle to interact with a specific Global API key
Parameters: key (str) – the secret key of the desired API key Returns: A dataikuapi.dss.admin.DSSGlobalApiKey API key handle
-
create_global_api_key
(label=None, description=None, admin=False)¶ Create a Global API key, and return a handle to interact with it
Note: this call requires an API key with admin rights
Parameters: - label (str) – the label of the new API key
- description (str) – the description of the new API key
- admin (boolean) – whether the new API key has admin rights (True or False)
Returns: A dataikuapi.dss.admin.DSSGlobalApiKey API key handle
-
list_meanings
()¶ List all user-defined meanings on the DSS instance
Note: this call requires an API key with admin rights
Returns: A list of meanings. Each meaning is a dict Return type: list of dicts
-
get_meaning
(id)¶ Get a handle to interact with a specific user-defined meaning
Note: this call requires an API key with admin rights
Parameters: id (str) – the ID of the desired meaning Returns: A dataikuapi.dss.meaning.DSSMeaning meaning handle
-
create_meaning
(id, label, type, description=None, values=None, mappings=None, pattern=None, normalizationMode=None, detectable=False)¶ Create a meaning, and return a handle to interact with it
Note: this call requires an API key with admin rights
Parameters: - id – the ID of the new meaning
- type – the type of the new meaning. Admissible values are ‘DECLARATIVE’, ‘VALUES_LIST’, ‘VALUES_MAPPING’ and ‘PATTERN’
- description – (optional) the description of the new meaning
- values – (optional) when type is ‘VALUES_LIST’, the list of values, or a list of {‘value’:’the value’, ‘color’:’an optional color’}
- mappings – (optional) when type is ‘VALUES_MAPPING’, the mapping, as a list of objects with this structure: {‘from’: ‘value_1’, ‘to’: ‘value_a’}
- pattern – (optional) when type is ‘PATTERN’, the pattern
- normalizationMode – (optional) when type is ‘VALUES_LIST’, ‘VALUES_MAPPING’ or ‘PATTERN’, the normalization mode to use for value matching. One of ‘EXACT’, ‘LOWERCASE’, or ‘NORMALIZED’ (not available for ‘PATTERN’ type). Defaults to ‘EXACT’.
- detectable – (optional) whether DSS should consider assigning the meaning to columns set to ‘Auto-detect’. Defaults to False.
Returns: A dataikuapi.dss.meaning.DSSMeaning meaning handle
-
list_logs
()¶ List all available log files on the DSS instance. This call requires an API key with admin rights
Returns: A list of log file names
-
get_log
(name)¶ Get the contents of a specific log file. This call requires an API key with admin rights
Parameters: name (str) – the name of the desired log file (obtained with list_logs()) Returns: The full content of the log file, as a string
-
log_custom_audit
(custom_type, custom_params=None)¶ Log a custom entry to the audit trail
Parameters: - custom_type (str) – value for customMsgType in audit trail item
- custom_params (dict) – value for customMsgParams in audit trail item (defaults to {})
-
get_variables
()¶ Get the DSS instance’s variables, as a Python dictionary
This call requires an API key with admin rights
Returns: a Python dictionary of the instance-level variables
-
set_variables
(variables)¶ Updates the DSS instance’s variables
This call requires an API key with admin rights
It is not possible to update a single variable; you must set all of them at once. Thus, you should only use a variables parameter that has been obtained using get_variables().
Parameters: variables (dict) – the new dictionary of all variables of the instance
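Since all variables must be sent at once, changing a single value comes down to a read, modify, write sequence. The sketch below illustrates that pattern; `update_instance_variables` is a hypothetical helper name, and both calls require an API key with admin rights.

```python
def update_instance_variables(client, **changes):
    """Change some instance-level variables while keeping the others."""
    # Start from the full dictionary, as set_variables() replaces
    # everything with what it receives.
    variables = client.get_variables()
    variables.update(changes)
    client.set_variables(variables)
    return variables
```

For example, `update_instance_variables(client, env="staging")` would set a variable named `env` (a placeholder name) and leave every other variable untouched. Concurrent updates from another script between the read and the write would be overwritten.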
-
get_general_settings
()¶ Gets a handle to interact with the general settings.
This call requires an API key with admin rights
Returns: a dataikuapi.dss.admin.DSSGeneralSettings handle
-
create_project_from_bundle_local_archive
(archive_path, project_folder=None)¶ Create a project from a bundle archive. Warning: this method can only be used on an automation node.
Parameters: - archive_path (string) – Path on the local machine where the archive is
- project_folder (A dataikuapi.dss.projectfolder.DSSProjectFolder) – the project folder in which the project will be created or None for root project folder
-
create_project_from_bundle_archive
(fp, project_folder=None)¶ Create a project from a bundle archive (as a file object). Warning: this method can only be used on an automation node.
Parameters: - fp (file-like) – A file-like object pointing to a bundle archive zip
- project_folder (A dataikuapi.dss.projectfolder.DSSProjectFolder) – the project folder in which the project will be created or None for root project folder
-
prepare_project_import
(f)¶ Prepares import of a project archive. Warning: this method can only be used on a design node.
Parameters: f (file-like) – the input stream, as a file-like object Returns: a TemporaryImportHandle to interact with the prepared import
-
get_apideployer
()¶ Gets a handle to work with the API Deployer
Return type: DSSAPIDeployer
-
catalog_index_connections
(connection_names=None, all_connections=False, indexing_mode='FULL')¶ Triggers an indexing of multiple connections in the data catalog
Parameters: - connection_names (list) – list of connections to index, ignored if all_connections=True (defaults to [])
- all_connections (bool) – index all connections (defaults to False)
-
get_scoring_libs_stream
()¶ Get the scoring libraries jar required for scoring with model jars that don’t include libraries. You need to close the stream after download. Failure to do so will result in the DSSClient becoming unusable.
Returns: a jar file, as a stream Return type: file-like
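Because an unclosed stream leaves the client unusable, it helps to tie the close to a `with` block. The sketch below saves the jar to disk that way; `download_scoring_libs` is a hypothetical name, and the code assumes the returned object supports `read()` and `close()`, as the "file-like" return type suggests.

```python
import shutil
from contextlib import closing


def download_scoring_libs(client, target_path):
    """Save the scoring libraries jar to `target_path`."""
    # closing() calls stream.close() even if the copy fails, which is
    # the requirement stated above.
    with closing(client.get_scoring_libs_stream()) as stream:
        with open(target_path, "wb") as out:
            # Copy in chunks rather than loading the whole jar in memory.
            shutil.copyfileobj(stream, out)
    return target_path
```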
-
get_auth_info
(with_secrets=False)¶ Returns various information about the user currently authenticated using this instance of the API client.
This method returns a dict that may contain the following keys (may also contain others):
- authIdentifier: login for a user, id for an API key
- groups: list of group names (if context is a user)
- secrets: list of dicts containing user secrets (if context is a user)
Parameters: with_secrets (boolean) – return user secrets Returns: a dict Return type: dict
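One common use of this call is a quick group membership check. The sketch below is a hypothetical helper (`is_member_of` is not part of dataikuapi); it reads the groups key defensively, since that key is only present when the context is a user.

```python
def is_member_of(client, group_name):
    """Tell whether the authenticated user belongs to `group_name`."""
    info = client.get_auth_info()
    # 'groups' may be absent, for instance when authenticated with an
    # API key, so default to an empty list.
    return group_name in info.get("groups", [])
```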
-
get_auth_info_from_browser_headers
(headers_dict, with_secrets=False)¶ Returns various information about the DSS user authenticated by the dictionary of HTTP headers provided in headers_dict.
This is generally only used in webapp backends
This method returns a dict that may contain the following keys (may also contain others):
- authIdentifier: login for a user, id for an API key
- groups: list of group names (if context is a user)
- secrets: list of dicts containing user secrets (if context is a user)
Parameters: - headers_dict (dict) – dictionary of HTTP headers
- with_secrets (boolean) – return user secrets
Returns: a dict Return type: dict
-
get_ticket_from_browser_headers
(headers_dict)¶ Returns a ticket for the DSS user authenticated by the dictionary of HTTP headers provided in headers_dict.
This is only used in webapp backends
This method returns a ticket to use as an X-DKU-APITicket header
Parameters: headers_dict (dict) – dictionary of HTTP headers Returns: a string Return type: string
-
create_personal_api_key
(label)¶ Creates a personal API key corresponding to the user doing the request. This can be called if the DSSClient was initialized with an internal ticket or with a personal API key
Parameters: label (string) – label for the new API key Returns: a dict of the new API key, containing at least “secret”, i.e. the actual secret API key Return type: dict
-
push_base_images
()¶ Push base images for Kubernetes container-execution and Spark-on-Kubernetes
-
apply_kubernetes_namespaces_policies
()¶ Apply Kubernetes namespaces policies defined in the general settings
-
get_licensing_status
()¶ Returns a dictionary with information about licensing status of this DSS instance
Return type: dict
-
get_object_discussions
(project_key, object_type, object_id)¶ Get a handle to manage discussions on any object
Parameters: - project_key (str) – identifier of the project to access
- object_type (str) – DSS object type
- object_id (str) – DSS object ID
Returns: the handle to manage discussions
Return type: dataikuapi.discussion.DSSObjectDiscussions
-
-
class
dataikuapi.dssclient.
TemporaryImportHandle
(client, import_id)¶ -
execute
(settings=None)¶ Executes the import with provided settings.
Parameters: settings (dict) – Dict of import settings (defaults to {}). The following settings are available:
- targetProjectKey (string): Key to import under. Defaults to the original project key
- remapping (dict): Dictionary of connection and code env remapping settings. See example of remapping dict:
"remapping" : {
    "connections": [
        { "source": "src_conn1", "target": "target_conn1" },
        { "source": "src_conn2", "target": "target_conn2" }
    ],
    "codeEnvs" : [
        { "source": "src_codeenv1", "target": "target_codeenv1" },
        { "source": "src_codeenv2", "target": "target_codeenv2" }
    ]
}
Warning: You must check the ‘success’ flag
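Putting prepare_project_import() and execute() together, an import with connection remapping might look like the sketch below. `import_project` and `connection_remap` are hypothetical names, and the code assumes that execute() returns a dict carrying the ‘success’ flag mentioned in the warning.

```python
def import_project(client, archive_fp, target_project_key=None,
                   connection_remap=None):
    """Import a project archive and fail loudly if DSS reports an error.

    `connection_remap` maps source connection names to target names.
    """
    settings = {}
    if target_project_key is not None:
        settings["targetProjectKey"] = target_project_key
    if connection_remap:
        # Same structure as the remapping example above.
        settings["remapping"] = {
            "connections": [
                {"source": source, "target": target}
                for source, target in sorted(connection_remap.items())
            ]
        }
    handle = client.prepare_project_import(archive_fp)
    result = handle.execute(settings)
    # Assumption: the result is a dict with a 'success' entry.
    if not result.get("success"):
        raise RuntimeError("Project import failed: %r" % (result,))
    return result
```

The archive is passed as an open file-like object, for example `open("project.zip", "rb")` (a placeholder path), and the call only works on a design node, as noted for prepare_project_import().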
-