API for managed folders

This class lets you interact with managed folders in Python recipes and notebooks. See Managed folders for more information.

class dataiku.core.managed_folder.Folder(lookup, project_key=None)

This is a handle to interact with a managed folder.

Note: this class is also available as dataiku.Folder

get_info(sensitive_info=False)

Get information about the location and settings of this managed folder

Return type:dict

get_partition_info(partition)

Get information about a given partition of this managed folder

Return type:dict

get_path()

Gets the filesystem path of this managed folder. This method can only be called for managed folders that are stored on the local filesystem of the DSS server.

For non-filesystem managed folders (HDFS, S3, …), you need to use the various read/download and write/upload APIs.

is_partitioning_directory_based()

Whether the partitioning of the folder is based on sub-directories

list_paths_in_partition(partition='')

Gets the filesystem paths of the folder for the given partition (or for the entire folder)

list_partitions()

Gets the partitions in the folder

Return type:list
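
As an illustration of how list_partitions() and list_paths_in_partition() fit together, here is a minimal sketch that builds a mapping from each partition to its paths. The helper name paths_by_partition is hypothetical and not part of the dataiku API; it only assumes a folder object exposing the two methods documented above.

```python
def paths_by_partition(folder):
    """Map each partition identifier to the list of paths it contains.

    `folder` is expected to be a dataiku.Folder-like handle. This helper is
    an illustrative sketch, not part of the dataiku package.
    """
    return {
        partition: folder.list_paths_in_partition(partition)
        for partition in folder.list_partitions()
    }
```

In a recipe or notebook, it would typically be called as paths_by_partition(dataiku.Folder("my_folder")), where "my_folder" is a placeholder name.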
get_partition_folder(partition)

Gets the filesystem path of the directory corresponding to the partition (if the partitioning is directory-based)

get_id()

Gets the identifier of this managed folder

get_name()

Gets the name of this managed folder

file_path(filename)

Gets the filesystem path for a given file within the folder. This method can only be called for managed folders that are stored on the local filesystem of the DSS server.

For non-filesystem managed folders (HDFS, S3, …), you need to use the various read/download and write/upload APIs.

Parameters:filename (str) – Name of the file within the folder
read_json(filename)

Reads a JSON file within the folder and returns its parsed content

Parameters:filename (str) – Path of the file within the folder
Return type:list or dict, depending on the content of the file
write_json(filename, obj)

Writes a JSON-serializable (mostly dict or list) object as JSON to a file within the folder

Parameters:
  • filename (str) – Path of the target file within the folder
  • obj (object) – JSON-serializable object to write (generally dict or list)
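
read_json and write_json can be combined into a read-modify-write step. The sketch below assumes the file holds a JSON object (a dict); update_json is a hypothetical helper name, not a dataiku function.

```python
def update_json(folder, filename, updates):
    """Merge `updates` into the JSON object stored at `filename`.

    Illustrative helper (not part of dataiku): reads the file with
    read_json, updates the resulting dict, and writes it back with
    write_json. Returns the updated content.
    """
    content = folder.read_json(filename)
    if not isinstance(content, dict):
        raise TypeError("expected a JSON object in %s" % filename)
    content.update(updates)
    folder.write_json(filename, content)
    return content
```

Note that this is not atomic: another process writing the same file between the read and the write would have its changes overwritten.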
clear()

Removes all files from the folder

clear_partition(partition)

Removes all files from a specific partition of the folder.

clear_path(path)

Removes a file or directory from the folder
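
For selective cleanup, clear_path can be driven by the output of list_paths_in_partition. The following sketch removes every file whose path starts with a given prefix. It assumes that the paths returned by list_paths_in_partition can be passed directly to clear_path; clear_paths_with_prefix is a hypothetical helper name.

```python
def clear_paths_with_prefix(folder, prefix):
    """Remove every file in the folder whose path starts with `prefix`.

    Illustrative helper (not part of dataiku). Assumes the paths listed by
    list_paths_in_partition() are valid arguments for clear_path().
    Returns the list of removed paths.
    """
    removed = []
    for path in folder.list_paths_in_partition():
        if path.startswith(prefix):
            folder.clear_path(path)
            removed.append(path)
    return removed
```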

get_path_details(path='/')

Get details about a specific path (file or directory) in the folder

Return type:dict
get_download_stream(path)

Gets a file-like object that allows you to read a single file from this folder.

with folder.get_download_stream("myfile") as stream:
    data = stream.readline()
    print("First line of myfile is: %s" % data)
Return type:file-like
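
Because the stream is file-like, large files can be processed in chunks rather than loaded into memory at once. As a sketch, the hypothetical helper below computes a SHA-256 digest of a file in the folder; it assumes the stream supports read(size) and can be used as a context manager, as in the example above.

```python
import hashlib


def sha256_of_file(folder, path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file stored in a managed folder.

    Illustrative helper (not part of dataiku). Reads the download stream in
    chunks of `chunk_size` bytes so that the whole file never has to fit in
    memory.
    """
    digest = hashlib.sha256()
    with folder.get_download_stream(path) as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```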
upload_stream(path, f)

Uploads the content of a file-like object to a specific path in the managed folder. If the file already exists, it will be replaced.

# This copies a local file to the managed folder
with open("local_file_to_upload", "rb") as f:
    folder.upload_stream("name_of_file_in_folder", f)
Parameters:
  • path (str) – Target path of the file to write in the managed folder
  • f – file-like object open for reading
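
Any file-like object open for reading can be passed, not only files on disk. A minimal sketch, using an in-memory io.BytesIO buffer to upload text that was generated in the recipe; upload_text is a hypothetical helper name, and the UTF-8 default is an assumption of this example.

```python
import io


def upload_text(folder, path, text, encoding="utf-8"):
    """Upload an in-memory string to `path` in a managed folder.

    Illustrative helper (not part of dataiku): encodes the text to bytes,
    wraps it in a BytesIO buffer, and hands the buffer to upload_stream.
    """
    buffer = io.BytesIO(text.encode(encoding))
    folder.upload_stream(path, buffer)
```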
upload_file(path, file_path)

Uploads a local file to a specific path in the managed folder. If the file already exists, it will be replaced.

Parameters:
  • path (str) – Target path of the file to write in the managed folder
  • file_path – Absolute path to a local file
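
To copy a whole local directory tree, upload_file can be called once per file. The sketch below walks a directory with os.walk and mirrors its layout in the managed folder. The helper name upload_directory and the use of "/" as the separator in target paths are assumptions of this example.

```python
import os


def upload_directory(folder, local_dir, target_prefix=""):
    """Upload every file under `local_dir` to a managed folder.

    Illustrative helper (not part of dataiku). Each file keeps its path
    relative to `local_dir`, optionally placed under `target_prefix`.
    Returns the list of target paths that were written.
    """
    base = os.path.abspath(local_dir)
    uploaded = []
    for root, _dirs, files in os.walk(base):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, base)
            # Build a "/"-separated target path, dropping empty components
            parts = [target_prefix.strip("/")] + relative.split(os.sep)
            target = "/".join(p for p in parts if p)
            folder.upload_file(target, local_path)
            uploaded.append(target)
    return uploaded
```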
upload_data(path, data)

Uploads binary data to a specific path in the managed folder. If the file already exists, it will be replaced.

Parameters:
  • path (str) – Target path of the file to write in the managed folder
  • data – str or unicode data to upload
get_writer(path)

Get a writer object to write incrementally to a specific path in the managed folder. If the file already exists, it will be replaced.

Parameters:path (str) – Target path of the file to write in the managed folder
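
A writer is useful when the content is produced piece by piece. The following is a sketch only: it assumes the object returned by get_writer can be used as a context manager and exposes a write() method accepting bytes, which this page does not spell out. write_lines is a hypothetical helper name.

```python
def write_lines(folder, path, lines, encoding="utf-8"):
    """Write an iterable of text lines incrementally to `path`.

    Illustrative helper (not part of dataiku). Assumes the writer returned
    by get_writer() supports the `with` statement and a write(bytes) method.
    Returns the number of lines written.
    """
    count = 0
    with folder.get_writer(path) as writer:
        for line in lines:
            writer.write((line + "\n").encode(encoding))
            count += 1
    return count
```

Passing a generator as lines keeps memory usage flat, since each line is encoded and written before the next one is produced.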
get_last_metric_values(partition='')

Get the set of last values of the metrics on this folder, as a dataiku.ComputedMetrics object

get_metric_history(metric_lookup, partition='')

Get the set of all values a given metric took on this folder

Parameters:
  • metric_lookup – metric name or unique identifier
  • partition – optionally, the partition for which the values are to be fetched

save_external_metric_values(values_dict)

Save metrics on this folder. The metrics are saved with the type “external”

Parameters:values_dict – the values to save, as a dict. The keys of the dict are used as metric names
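
As an example of an external metric, the sketch below counts the files in the folder and saves the count. The helper name save_file_count_metric and the metric name "file_count" are hypothetical choices made for this illustration.

```python
def save_file_count_metric(folder, metric_name="file_count"):
    """Save the number of files in the folder as an external metric.

    Illustrative helper (not part of dataiku). The dict key passed to
    save_external_metric_values becomes the metric name.
    Returns the computed count.
    """
    count = len(folder.list_paths_in_partition())
    folder.save_external_metric_values({metric_name: count})
    return count
```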
save_external_check_values(values_dict)

Save checks on this folder. The checks are saved with the type “external”

Parameters:values_dict – the values to save, as a dict. The keys of the dict are used as check names