Experiment Tracking
For an introduction to Experiment Tracking in DSS, please see Experiment Tracking.
Experiment Tracking in DSS uses the MLflow Tracking API.
This section focuses on Dataiku-specific extensions to the MLflow API.
API Reference
class dataikuapi.dss.mlflow.DSSMLflowExtension(client, project_key)
A handle to interact with specific endpoints of the DSS MLflow integration.
Do not create this directly; use dataikuapi.dss.project.DSSProject.get_mlflow_extension() instead.
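A minimal sketch of obtaining the handle through the project rather than constructing it. The helper name `get_mlflow_extension` is hypothetical, and the URL and API key mentioned in the docstring are placeholders; the sketch assumes `client` is a `dataikuapi.DSSClient`.

```python
def get_mlflow_extension(client, project_key):
    """Return the MLflow extension handle for one DSS project.

    `client` is assumed to be a dataikuapi.DSSClient, created for example with
    dataikuapi.DSSClient("https://dss.example.com:11200", "my-api-key")
    (placeholder URL and key).
    """
    project = client.get_project(project_key)
    # The handle comes from the project; it is not instantiated directly.
    return project.get_mlflow_extension()
```

The examples for the methods below take such a handle as their first argument, named `mlflow_ext`.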
list_models(run_id)
Returns the list of models of the given run.
- Parameters
run_id (str) – run_id for which to return a list of models
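As an illustration, the call can be repeated over several runs. The helper `models_by_run` is hypothetical, and because the structure of the returned list is not described here, the sketch stores each result untouched.

```python
def models_by_run(mlflow_ext, run_ids):
    """Map each run id to whatever list_models() returns for it."""
    result = {}
    for run_id in run_ids:
        # run_id is documented as a string, so coerce defensively.
        result[str(run_id)] = mlflow_ext.list_models(str(run_id))
    return result
```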
list_experiments(view_type='ACTIVE_ONLY', max_results=1000)
Returns the list of experiments in the DSS project for which MLflow integration is set up.
- Parameters
view_type (str) – ACTIVE_ONLY, DELETED_ONLY or ALL
max_results (int) – max results count
- Return type
dict
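A small sketch that checks `view_type` against the three documented values before calling the API. The wrapper name `list_experiments_checked` is hypothetical; the layout of the returned dict is not documented here, so it is passed back unchanged.

```python
VIEW_TYPES = ("ACTIVE_ONLY", "DELETED_ONLY", "ALL")

def list_experiments_checked(mlflow_ext, view_type="ACTIVE_ONLY", max_results=1000):
    """Call list_experiments() after validating view_type locally."""
    if view_type not in VIEW_TYPES:
        raise ValueError("view_type must be one of %s" % ", ".join(VIEW_TYPES))
    # The result is documented as a dict; its keys are not assumed here.
    return mlflow_ext.list_experiments(view_type=view_type, max_results=max_results)
```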
rename_experiment(experiment_id, new_name)
Renames an experiment.
- Parameters
experiment_id (str) – experiment id
new_name (str) – new name
restore_experiment(experiment_id)
Restores a deleted experiment.
- Parameters
experiment_id (str) – experiment id
restore_run(run_id)
Restores a deleted run.
- Parameters
run_id (str) – run id
garbage_collect()
Permanently deletes the experiments and runs marked as “Deleted”.
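Because garbage collection is permanent, anything worth keeping has to be restored first. The sketch below shows one possible ordering; the helper `restore_then_purge` is hypothetical, and restoring experiments before runs is a cautious assumption rather than a documented requirement.

```python
def restore_then_purge(mlflow_ext, experiment_ids=(), run_ids=()):
    """Restore the given experiments and runs, then purge what is still deleted."""
    for experiment_id in experiment_ids:
        mlflow_ext.restore_experiment(experiment_id)
    for run_id in run_ids:
        mlflow_ext.restore_run(run_id)
    # Everything still marked as "Deleted" is removed for good at this point.
    mlflow_ext.garbage_collect()
```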
create_experiment_tracking_dataset(dataset_name, experiment_ids=[], view_type='ACTIVE_ONLY', filter_expr='', order_by=[], format='LONG')
Creates a virtual dataset exposing experiment tracking data.
- Parameters
dataset_name (str) – name of the dataset
experiment_ids (list(str)) – list of ids of experiments to filter on. No filtering if empty
view_type (str) – one of ACTIVE_ONLY, DELETED_ONLY and ALL. Default is ACTIVE_ONLY
filter_expr (str) – MLflow search expression
order_by (list(str)) – list of order by clauses. Default is ordered by start_time, then runId
format (str) – LONG or JSON. Default is LONG
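The parameters above can be combined as in the following sketch, which validates the two enumerated arguments locally before calling the API. The wrapper name `create_tracking_dataset` and its `fmt` argument are hypothetical; `fmt` is forwarded as the documented `format` parameter.

```python
VIEW_TYPES = ("ACTIVE_ONLY", "DELETED_ONLY", "ALL")
FORMATS = ("LONG", "JSON")

def create_tracking_dataset(mlflow_ext, dataset_name, experiment_ids=None,
                            view_type="ACTIVE_ONLY", filter_expr="",
                            order_by=None, fmt="LONG"):
    """Create a virtual experiment tracking dataset with validated arguments."""
    if view_type not in VIEW_TYPES:
        raise ValueError("view_type must be one of %s" % ", ".join(VIEW_TYPES))
    if fmt not in FORMATS:
        raise ValueError("format must be one of %s" % ", ".join(FORMATS))
    return mlflow_ext.create_experiment_tracking_dataset(
        dataset_name,
        # Empty list means no filtering on experiments.
        experiment_ids=list(experiment_ids or []),
        view_type=view_type,
        filter_expr=filter_expr,
        # Empty list keeps the default ordering (start_time, then runId).
        order_by=list(order_by or []),
        format=fmt,
    )
```

A call such as `create_tracking_dataset(mlflow_ext, "tracking_data", filter_expr="metrics.accuracy > 0.9", order_by=["metrics.accuracy DESC"])` would keep only runs above a threshold, assuming the runs logged a metric named `accuracy`; the dataset and metric names are illustrative.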
clean_experiment_tracking_db()
Cleans the experiments, runs, params, metrics, tags, etc. for this project.
This call requires an API key with admin rights.
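Since this call wipes the project's tracking data, a script may want an explicit opt-in before running it. The guard below is a hypothetical wrapper, not part of the API, and it does not check admin rights itself; that is left to DSS.

```python
def clean_tracking_db(mlflow_ext, confirm=False):
    """Clean the project's experiment tracking data only when confirm=True."""
    if not confirm:
        raise RuntimeError("Refusing to clean experiment tracking data without confirm=True")
    # Requires an API key with admin rights on the DSS side.
    mlflow_ext.clean_experiment_tracking_db()
```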
set_run_inference_info(run_id, prediction_type, classes=None, code_env_name=None, target=None)
Sets the type of the model, and optionally other information useful to deploy or evaluate it.
prediction_type must be one of:
- REGRESSION
- BINARY_CLASSIFICATION
- MULTICLASS
- OTHER
Classes must be specified if and only if the model is a BINARY_CLASSIFICATION or MULTICLASS model.
This information is used to filter saved models by prediction type and to prefill the classes when deploying an MLflow model through the GUI as a version of a DSS Saved Model.
- Parameters
prediction_type (str) – prediction type (see doc)
run_id (str) – run_id for which to set the classes
classes (list) – ordered list of classes (not for all prediction types, see doc). Every class will be converted by calling str().
code_env_name (str) – name of an adequate DSS python code environment
target (str) – name of the target
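The rules on `prediction_type` and `classes` can be checked locally before the call, as in this sketch. The wrapper name `set_inference_info_checked` is hypothetical, and treating an empty class list as missing is an assumption of the sketch.

```python
PREDICTION_TYPES = ("REGRESSION", "BINARY_CLASSIFICATION", "MULTICLASS", "OTHER")
CLASSIFICATION_TYPES = ("BINARY_CLASSIFICATION", "MULTICLASS")

def set_inference_info_checked(mlflow_ext, run_id, prediction_type, classes=None,
                               code_env_name=None, target=None):
    """Validate the documented constraints, then call set_run_inference_info()."""
    if prediction_type not in PREDICTION_TYPES:
        raise ValueError("prediction_type must be one of %s" % ", ".join(PREDICTION_TYPES))
    needs_classes = prediction_type in CLASSIFICATION_TYPES
    # Classes are required if and only if the model is a classifier.
    if needs_classes and not classes:
        raise ValueError("classes are required for %s" % prediction_type)
    if not needs_classes and classes is not None:
        raise ValueError("classes must not be given for %s" % prediction_type)
    mlflow_ext.set_run_inference_info(
        run_id, prediction_type,
        classes=classes, code_env_name=code_env_name, target=target,
    )
```

For a binary classifier, a call might look like `set_inference_info_checked(mlflow_ext, run_id, "BINARY_CLASSIFICATION", classes=["no", "yes"], target="churn")`, where the class labels and target name are illustrative.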