Other topics

Managing dependencies

Packages

Python and R dependencies (packages) are not managed by DSS: the user must ensure that the DSS Python or R environment has the necessary packages.

DSS can, however, inform the user about dependencies. To do so, add a requirements.json file at the root of the plugin (beside the plugin.json file). This file simply declares the required packages, and it is presented to the administrator when they install the plugin.

A requirements.json file looks like this:

{
    "python" : [
        {"name":"pandas", "version":">=0.16.2"},
        {"name":"sqlalchemy", "version":">=0.1"}
    ],
    "R" : [
        {"name":"dplyr", "version":"1.0.0"}
    ]
}
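Because DSS only displays this declaration, an administrator may want to turn it into something an installer understands. The following is a minimal sketch, not part of DSS: the helper name to_pip_requirements is hypothetical, and treating a bare version such as "1.0.0" as an exact pin is an assumption.

```python
import json


def to_pip_requirements(requirements):
    """Turn the "python" entries of a requirements.json dict into pip-style strings."""
    lines = []
    for package in requirements.get("python", []):
        version = package.get("version", "")
        # Assumption: a bare version number (no operator) means an exact pin.
        if version and version[0].isdigit():
            version = "==" + version
        lines.append(package["name"] + version)
    return lines


# Same content as the requirements.json example above.
EXAMPLE = json.loads("""
{
    "python": [
        {"name": "pandas", "version": ">=0.16.2"},
        {"name": "sqlalchemy", "version": ">=0.1"}
    ],
    "R": [
        {"name": "dplyr", "version": "1.0.0"}
    ]
}
""")

# Print one requirement per line, ready to be pasted into a requirements file.
print("\n".join(to_pip_requirements(EXAMPLE)))
```

The R entries are ignored on purpose: they would need an R-side installer, which this sketch does not cover.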

Code environments

A plugin can contain the definition of a code environment that holds the list of packages it requires for its execution. A plugin can contain only one such definition, either Python or R. For a Python code environment, set up the following hierarchy in the plugin (use “Add components” on the Definition tab of your development plugin):

plugin root
+---code-env
    +---python
        |   desc.json
        +---spec
            |   environment.spec  (optional, Conda spec)
            |   requirements.txt

While the environment.spec and requirements.txt files contain the list of desired packages, the desc.json file contains the environment characteristics:

{
    "pythonInterpreter": "PYTHON27",
    "installCorePackages": false,
    "installJupyterSupport": false,
    "basePackagesInstallMethod": "PRE_BUILT",
    "conda": false
}
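To make the layout concrete, here is a small sketch that writes the same hierarchy into a temporary directory. It is only an illustration of where each file goes: DSS does not need this script (the “Add components” action creates the files), and the function name scaffold_python_code_env is hypothetical. The desc.json values are copied from the example above.

```python
import json
import os
import tempfile


def scaffold_python_code_env(plugin_root, packages):
    """Create code-env/python/desc.json and spec/requirements.txt under plugin_root."""
    python_dir = os.path.join(plugin_root, "code-env", "python")
    spec_dir = os.path.join(python_dir, "spec")
    os.makedirs(spec_dir, exist_ok=True)

    # Environment characteristics, as in the desc.json example above.
    desc = {
        "pythonInterpreter": "PYTHON27",
        "installCorePackages": False,
        "installJupyterSupport": False,
        "basePackagesInstallMethod": "PRE_BUILT",
        "conda": False,
    }
    desc_path = os.path.join(python_dir, "desc.json")
    with open(desc_path, "w") as f:
        # json.dump never emits a trailing comma, so the file is valid JSON.
        json.dump(desc, f, indent=4)

    # One package requirement per line.
    with open(os.path.join(spec_dir, "requirements.txt"), "w") as f:
        f.write("\n".join(packages) + "\n")
    return desc_path


# Build the hierarchy in a throwaway directory standing in for the plugin root.
plugin_root = tempfile.mkdtemp()
desc_path = scaffold_python_code_env(plugin_root, ["pandas>=0.16.2"])
print(desc_path)
```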

Shared code

The requirements.json file works for “standard” packages that are available in repositories. However, you will often also want to share some code between multiple datasets and recipes of the same plugin.

For this shared code, you can create a python-lib/ folder at the root of the plugin. This folder is automatically added to the PYTHONPATH of all custom recipes and datasets of this plugin. For an example, have a look at the code of our Pipedrive connector.

Code from this folder can also be imported from regular Python recipes or notebooks using the following functions. This makes it possible to package Python modules inside plugins.

dataiku.use_plugin_libs(plugin_id)

Adds the python-lib/ folder of the plugin to the PYTHONPATH.

dataiku.import_from_plugin(plugin_id, package_name)

Imports a package from the python-lib/ folder of the plugin and returns the module.
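The mechanism behind this is ordinary Python path handling. The sketch below reproduces it outside DSS with a temporary folder. It is not the DSS implementation, and the module name my_shared_helpers and its normalize function are made up for the illustration.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for a plugin root containing a python-lib/ folder.
plugin_root = tempfile.mkdtemp()
lib_dir = os.path.join(plugin_root, "python-lib")
os.makedirs(lib_dir)

# A tiny shared module, as a plugin author might write it (hypothetical name).
with open(os.path.join(lib_dir, "my_shared_helpers.py"), "w") as f:
    f.write("def normalize(name):\n    return name.strip().lower()\n")

# Putting the folder on the import path is what makes the module importable.
sys.path.insert(0, lib_dir)
importlib.invalidate_caches()

# Import by name and get the module object back.
helpers = importlib.import_module("my_shared_helpers")
print(helpers.normalize("  Pipedrive "))
```

Inside a regular DSS recipe or notebook, the documented entry points are dataiku.use_plugin_libs(plugin_id) followed by a normal import, or dataiku.import_from_plugin(plugin_id, package_name) to get the module directly.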

Resource files

You may also create a resource folder at the root of your plugin (beside the plugin.json file) to hold resource files useful for your plugin, for example data files.

This resource folder is meant to be read-only. To get the path of the resource folder:

Custom settings UI

By default, DSS will present a form with a field for each parameter defined in the .json of the custom recipe/dataset, but it is possible to have a more elaborate and interactive interface.

Interactions in auto-generated forms

Two mechanisms are available to bring some interactivity to the forms DSS generates from the parameter list:

  • a parameter of type SEPARATOR can be added to define a section in the form.
  • a visibilityCondition field can be added to a parameter to have it shown/hidden depending on a condition on the other parameters (prefixed with model.)

For example, the following JSON definition of the parameters:

"params": [
    {
        "name": "sep1",
        "label": "Authentication",
        "type": "SEPARATOR"
    },
    {
        "name": "useToken",
        "label" : "Authenticate with token",
        "type": "BOOLEAN"
    },
    {
        "name": "username",
        "label" : "Login",
        "type": "STRING",
        "visibilityCondition" : "!model.useToken"
    },
    {
        "name": "password",
        "label" : "Password",
        "type": "PASSWORD",
        "visibilityCondition" : "!model.useToken"
    },
    {
        "name": "token",
        "label" : "Token",
        "type": "STRING",
        "visibilityCondition" : "model.useToken"
    },
    {
        "name": "sep3",
        "label": "Reads",
        "type": "SEPARATOR"
    },
    {
        "name": "fetchSize",
        "label" : "Fetch size",
        "type": "INT"
    }
]

produces the following form where the fields Token and Login/Password are shown/hidden depending on the state of the Authenticate with token checkbox:

../../_images/advanced_form_lp.png ../../_images/advanced_form_token.png
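The show/hide logic of this example can be mimicked in a few lines. The sketch below is not how DSS evaluates visibilityCondition: it only understands the two forms used above ("model.x" and "!model.x"), and the function names is_visible and visible_names are hypothetical. Separators are left out of PARAMS for brevity.

```python
def is_visible(param, model):
    """Return True if the parameter should be shown for the given model values."""
    condition = param.get("visibilityCondition")
    if condition is None:
        # No condition: the field is always shown.
        return True
    negated = condition.startswith("!")
    key = condition[1:] if negated else condition
    if not key.startswith("model."):
        raise ValueError("unsupported condition in this sketch: " + condition)
    value = bool(model.get(key[len("model."):]))
    return (not value) if negated else value


def visible_names(params, model):
    """List the names of the parameters that are visible, in declaration order."""
    return [p["name"] for p in params if is_visible(p, model)]


# The non-separator parameters of the example above.
PARAMS = [
    {"name": "useToken", "type": "BOOLEAN"},
    {"name": "username", "type": "STRING", "visibilityCondition": "!model.useToken"},
    {"name": "password", "type": "PASSWORD", "visibilityCondition": "!model.useToken"},
    {"name": "token", "type": "STRING", "visibilityCondition": "model.useToken"},
    {"name": "fetchSize", "type": "INT"},
]

print(visible_names(PARAMS, {"useToken": True}))
print(visible_names(PARAMS, {"useToken": False}))
```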

Fully custom forms

The more advanced option is to have a completely custom form by providing two parameters in the custom recipe/dataset JSON descriptor:

  • in paramsTemplate : a .html file in the resource/ folder at the plugin root
  • in paramsModule : an optional Angular module, defined in a .js file in the js/ folder at the plugin root (the name of the file doesn’t matter)

The HTML is loaded with Angular, and the parameter values should be set in the config object. Additional files from the plugin’s resource folder can be accessed by referencing them with /plugins/__plugin_name__/resource/__file_to_get__. This is useful to load CSS stylesheets, images, or HTML files to use as Angular templates. Typically, one can add a <link /> element to load some CSS rules, like:

<link href="/plugins/my_plugin/resource/my_form.css" rel="stylesheet" type="text/css">

For example, the form from the previous section could be done in a fully custom way with

<div ng-controller="MyCustomFormController" style="margin: 10px 0px;">
    <label><input name="useToken" type="checkbox" ng-model="config.useToken" /> Use token</label>
    <label for="login" ng-if="!config.useToken">
        <div style="width: 80px; display: inline-block;">Login</div>
        <input name="login" type="text" ng-model="config.login" style="width: 80px;" />
    </label>
    <label for="password" ng-if="!config.useToken">
        <div style="width: 80px; display: inline-block;">Password</div>
        <input name="password" type="password" ng-model="config.password" style="width: 80px;"/>
    </label>
    <label for="token" ng-if="config.useToken">
        <div style="width: 80px; display: inline-block;">Token</div>
        <input name="token" type="text" ng-model="config.token" style="width: 80px;"/>
    </label>
    <label for="fetchSize" >
        <div style="width: 80px; display: inline-block;">Fetch size</div>
        <input name="fetchSize" type="number" ng-model="config.fetchSize" style="width: 80px;"/>
    </label>
    <div>
        <span>{{checkResult.hasAuthentication == null ? 'Not checked' : (checkResult.hasAuthentication ? 'Form complete' : 'Fill credentials')}}</span>
        <button ng-click="check()" class="btn btn-default">Recheck</button>
    </div>
</div>

and

var app = angular.module('myplugin.module', []);

app.controller('MyCustomFormController', function($scope) {
    $scope.checkResult = {};
    $scope.check = function() {
        var hasAuthentication = function(config) {
            // coerce to a boolean so that an empty form yields false rather than undefined
            return config.useToken ? !!config.token : !!(config.login && config.password);
        };
        $scope.checkResult = {
            hasAuthentication : hasAuthentication($scope.config)
        };
    };
    $scope.check();
});

where the setup in the custom dataset JSON is

"paramsTemplate" : "form.html",
"paramsModule" : "myplugin.module",

This produces a form like:

../../_images/custom_form.png

Fetching data for custom forms

A fully custom form will often need to fetch data to be presented. A simple example would be a way to select one of the values of a given column in the input dataset of a recipe. For this example, code able to read the dataset and compute the list of distinct values is needed.

A custom form can call a do() method defined in a Python file. This method is executed on the backend’s machine and thus has access to the project’s data. The do() method is called from the JavaScript running in the browser by using the callPythonDo() method on the Angular scope of the form. The Python file containing the code for the do() method needs to be in the plugin’s resource folder, and must be referenced from the .json in a paramsPythonSetup field.

For example, a form asking to choose a column and a value from this column could be done with:

<div ng-controller="FoobarController">
    <div class="control-group" >
        <label class="control-label">Column</label>
        <div class="controls" > <!-- basic text field with typeahead for the column selection, as you would get for a COLUMN parameter in a generated form -->
            <input type="text" ng-model="config.filterColumn" ng-required="true" bs-typeahead="columnsPerInputRole['input_role']"/>
            <span class="help-inline">Column to filter on</span>
        </div>
    </div>
    <div class="control-group" >
        <label class="control-label">Value</label>
        <div class="controls" >
            <select dku-bs-select="{liveSearch:true}" ng-model="config.filterValue" ng-options="v for v in choices"></select>
            <span class="help-inline">Value to keep</span>
        </div>
    </div>
</div>

and

var app = angular.module('foobar', []);

app.controller('FoobarController', function($scope) {
    var updateChoices = function() {
        // the parameter to callPythonDo() is passed to the do() method as the payload
        // the return value of the do() method comes back as the data parameter of the first function()
        $scope.callPythonDo({}).then(function(data) {
            // success
            $scope.choices = data.choices;
        }, function(data) {
            // failure
            $scope.choices = [];
        });
    };
    updateChoices();
    $scope.$watch('config.filterColumn', updateChoices);
});

and a Python callback

from dataiku import Dataset

# payload is sent from the JavaScript's callPythonDo()
# config and plugin_config are the recipe/dataset and plugin configured values
# inputs is the list of input roles (in case of a recipe)
def do(payload, config, plugin_config, inputs):
    role_name = 'input_role'
    # get dataset name then dataset handle
    dataset_full_names = [i['fullName'] for i in inputs if i['role'] == role_name]
    if len(dataset_full_names) == 0:
        return {'choices' : []}
    dataset = Dataset(dataset_full_names[0])
    # get name of column providing the choices
    column_name = config.get('filterColumn', '')
    if len(column_name) == 0:
        return {'choices' : []}
    # check that the column is in the schema
    schema = dataset.read_schema()
    schema_columns = [col for col in schema if col['name'] == column_name]
    if len(schema_columns) != 1:
        return {'choices' : []}
    # get the data and build the set of distinct values
    # (the built-in set is used because the sets module does not exist in Python 3)
    choices = set()
    for row in dataset.iter_tuples(sampling='head', limit=10000, columns=[column_name]):
        choices.add(row[0])
    return {'choices' : list(choices)}
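The core of this callback can be exercised without a running DSS instance by passing the dataset in as an argument. The sketch below is one possible way to do that, under stated assumptions: FakeDataset is a hypothetical stand-in that implements only read_schema() and iter_tuples(), compute_choices is a made-up helper name, and sorting the result is an addition for stable output that the callback above does not perform.

```python
class FakeDataset:
    """Hypothetical stand-in exposing only the two methods the callback uses."""

    def __init__(self, schema, rows):
        self._schema = schema
        self._rows = rows

    def read_schema(self):
        return self._schema

    def iter_tuples(self, sampling=None, limit=None, columns=None):
        # Keep only the requested columns, in schema order, for the first rows.
        wanted = [i for i, col in enumerate(self._schema) if col["name"] in columns]
        for row in self._rows[:limit]:
            yield tuple(row[i] for i in wanted)


def compute_choices(dataset, config):
    """Same steps as the callback above, with the dataset handle injected."""
    column_name = config.get("filterColumn", "")
    if not column_name:
        return {"choices": []}
    matches = [col for col in dataset.read_schema() if col["name"] == column_name]
    if len(matches) != 1:
        return {"choices": []}
    values = set()
    for row in dataset.iter_tuples(sampling="head", limit=10000, columns=[column_name]):
        values.add(row[0])
    # Sorted (by string form) only to make the output deterministic in this sketch.
    return {"choices": sorted(values, key=str)}


fake = FakeDataset(
    schema=[{"name": "id"}, {"name": "country"}],
    rows=[(1, "FR"), (2, "US"), (3, "FR")],
)
print(compute_choices(fake, {"filterColumn": "country"}))
```

In the plugin itself, do() would build the real Dataset handle from inputs and delegate to such a helper, which keeps the DSS-specific part small.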