dsscli tool¶
dsscli is a command-line tool that can perform a variety of runtime administration tasks on DSS. It can be used directly by a DSS administrator, or incorporated into automation scripts.
Most dsscli operations are performed through the DSS public API and can thus also be performed using the DSS public API Python client. You can also directly query the DSS REST API.
Running dsscli¶
dsscli is made of a large number of commands. Each command performs a single administration task and takes its own arguments and options.
From the DSS data directory, run ./bin/dsscli <command> <arguments>
Running ./bin/dsscli -h will list the available commands.
Running ./bin/dsscli <command> -h will show the detailed help of the selected command.
For example, to list jobs history in project MYPROJECT, use ./bin/dsscli jobs-list MYPROJECT
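When dsscli is called from an automation script, the invocation pattern above can be wrapped in a small helper. The following Python sketch is one possible way to do it; the function names dsscli_argv and run_dsscli are illustrative and not part of DSS, and the data directory path is whatever your installation uses.

```python
import subprocess


def dsscli_argv(datadir, command, *arguments):
    """Build the argument list for <datadir>/bin/dsscli <command> <arguments>."""
    executable = datadir.rstrip("/") + "/bin/dsscli"
    return [executable, command, *(str(a) for a in arguments)]


def run_dsscli(datadir, command, *arguments):
    """Run a dsscli command from the data directory and capture its output."""
    return subprocess.run(
        dsscli_argv(datadir, command, *arguments),
        cwd=datadir,          # run from the DSS data directory
        capture_output=True,  # keep stdout/stderr for the caller
        text=True,
    )
```

For instance, run_dsscli(datadir, "jobs-list", "MYPROJECT") would run the jobs-list example above and return a CompletedProcess whose stdout holds the listing.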
dsscli vs dssadmin¶
Another command-line tool is available in the DSS data directory for performing management tasks on DSS: ./bin/dssadmin
dssadmin is mostly for installation-related commands (for example, setting up R, Spark or Hadoop integration)
dsscli is mostly for “day-to-day” routine operations (creating users, running jobs, …)
Security-related commands¶
dsscli provides commands to:
Create, delete, list and edit users
Create, delete, list and edit groups
Create, delete, list and edit API keys
user-create¶
dsscli user-create [-h] [--email EMAIL]
                           [--source-type SOURCE_TYPE]
                           [--display-name DISPLAY_NAME]
                           [--user-profile USER_PROFILE] [--group GROUP]
                           login password
SOURCE_TYPE must be either LOCAL to create a regular user (in the DSS users database), LDAP to create a user that will authenticate through LDAP, or LOCAL_NO_AUTH for authentication through SSO. See Configuring LDAP authentication and Single Sign-On for more information. Note that even for LDAP and LOCAL_NO_AUTH, dsscli expects a “password” argument. Enter any random string; it will be ignored. The default is LOCAL.
USER_PROFILE is one of the possible user profiles defined by your license. For most DSS licenses, it is one of READER, DATA_ANALYST or DATA_SCIENTIST. The default is specified by your configuration.
The --group GROUP argument can be specified multiple times to place the user in multiple groups.
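As a rough illustration of how these options combine, the Python sketch below assembles a user-create command line, repeating --group once per group. The helper name user_create_argv and the login and group names in the usage note are hypothetical; only the flags come from the usage text above.

```python
SOURCE_TYPES = ("LOCAL", "LDAP", "LOCAL_NO_AUTH")


def user_create_argv(login, password, email=None, source_type=None,
                     display_name=None, user_profile=None, groups=()):
    """Build the argument list for `dsscli user-create`."""
    if source_type is not None and source_type not in SOURCE_TYPES:
        raise ValueError("source_type must be one of %s" % (SOURCE_TYPES,))
    argv = ["./bin/dsscli", "user-create"]
    if email is not None:
        argv += ["--email", email]
    if source_type is not None:
        argv += ["--source-type", source_type]
    if display_name is not None:
        argv += ["--display-name", display_name]
    if user_profile is not None:
        argv += ["--user-profile", user_profile]
    for group in groups:
        # --group is repeated once per group
        argv += ["--group", group]
    # login and password are positional and always required, even for
    # LDAP and LOCAL_NO_AUTH users, where the password is ignored
    argv += [login, password]
    return argv
```

Calling user_create_argv("jdoe", "ignored", source_type="LDAP", groups=["analysts", "readers"]) yields a list that can be passed to subprocess.run from the DSS data directory.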
users-list¶
dsscli users-list
user-delete¶
dsscli user-delete [-h] login
positional arguments:
        login       Login to delete
user-edit¶
Modifies the settings of an existing user.
dsscli user-edit [-h] [--password PASSWORD]
                         [--display-name DISPLAY_NAME] [--email EMAIL]
                         [--user-profile USER_PROFILE] [--group GROUP]
                         login
Each of password, display name, email, user-profile and groups can be modified independently.
For example, running dsscli user-edit --email mynewemail@company.com user will only modify the email address, and leave all other fields unmodified.
All groups are modified at once: thus to modify groups, you need to pass a new list of groups, which will be the new complete list.
It is not possible to modify the password of an LDAP or LOCAL_NO_AUTH user.
groups-list¶
Lists all groups in DSS
dsscli groups-list [-h] [--with-permissions] [--output OUTPUT]
                       [--no-header]
If --with-permissions is specified, additional columns are added to the output with global permissions, as detailed in Main project permissions.
group-create¶
Creates a group by name
dsscli group-create [-h] [--description DESCRIPTION]
                      [--source-type SOURCETYPE] [--admin ADMIN]
                      [--may-manage-code-envs MAYMANAGECODEENVS]
                      [--may-create-code-envs MAYCREATECODEENVS]
                      [--may-write-unsafe-code MAYWRITEUNSAFECODE]
                      [--may-write-safe-code MAYWRITESAFECODE]
                      [--may-create-projects MAYCREATEPROJECTS]
                      [--may-manage-udm MAYMANAGEUDM]
                      [--may-edit-lib-folders MAYEDITLIBFOLDERS]
                      [--may-develop-plugins MAYDEVELOPPLUGINS]
                      [--may-create-authenticated-connections MAYCREATEAUTHENTICATEDCONNECTIONS]
                      name
All of the --may-xxx flags take “true” or “false” as argument, and refer to one of the global permissions as detailed in Main project permissions
SOURCETYPE can be either LDAP or LOCAL. Note that LDAP groups need to declare mappings to LDAP groups to be functional, but this feature is not currently in dsscli. You need to use the DSS API clients.
Adding users in groups is done by editing these users.
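Since every --may-xxx flag expects the literal string “true” or “false”, a script can map Python booleans to those strings. The sketch below shows one way to do this; permission_flags and group_create_argv are illustrative names, and the keyword-to-flag conversion (underscores to hyphens) is a convention of this example only.

```python
def permission_flags(**permissions):
    """Turn may_create_projects=True into ["--may-create-projects", "true"]."""
    argv = []
    for name, value in permissions.items():
        flag = "--" + name.replace("_", "-")
        argv += [flag, "true" if value else "false"]
    return argv


def group_create_argv(name, description=None, source_type=None, **permissions):
    """Build the argument list for `dsscli group-create`."""
    argv = ["./bin/dsscli", "group-create"]
    if description is not None:
        argv += ["--description", description]
    if source_type is not None:
        argv += ["--source-type", source_type]
    argv += permission_flags(**permissions)
    argv.append(name)  # the group name is the only positional argument
    return argv
```

The helper does not check that a keyword corresponds to a flag that dsscli actually accepts, so misspelled permissions only surface when the command runs.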
group-edit¶
Edits the settings of a group by name
dsscli group-edit [-h] [--description DESCRIPTION]
                      [--source-type SOURCETYPE] [--admin ADMIN]
                      [--may-manage-code-envs MAYMANAGECODEENVS]
                      [--may-create-code-envs MAYCREATECODEENVS]
                      [--may-write-unsafe-code MAYWRITEUNSAFECODE]
                      [--may-write-safe-code MAYWRITESAFECODE]
                      [--may-create-projects MAYCREATEPROJECTS]
                      [--may-manage-udm MAYMANAGEUDM]
                      [--may-edit-lib-folders MAYEDITLIBFOLDERS]
                      [--may-develop-plugins MAYDEVELOPPLUGINS]
                      [--may-create-authenticated-connections MAYCREATEAUTHENTICATEDCONNECTIONS]
                      name
All of the --may-xxx flags take “true” or “false” as argument, and refer to one of the global permissions as detailed in Main project permissions
SOURCETYPE can be either LDAP or LOCAL. Note that LDAP groups need to declare mappings to LDAP groups to be functional, but this feature is not currently in dsscli. You need to use the DSS API clients.
Adding users in groups is done by editing these users.
api-keys-list¶
Lists global API keys
api-key-create¶
Creates a global API key
dsscli api-key-create [-h] [--output OUTPUT] [--no-header]
                          [--description DESCRIPTION] [--label LABEL]
                          [--admin ADMIN]
                          [--may-manage-code-envs MAYMANAGECODEENVS]
                          [--may-create-code-envs MAYCREATECODEENVS]
                          [--may-write-unsafe-code MAYWRITEUNSAFECODE]
                          [--may-write-safe-code MAYWRITESAFECODE]
                          [--may-create-projects MAYCREATEPROJECTS]
                          [--may-manage-udm MAYMANAGEUDM]
                          [--may-edit-lib-folders MAYEDITLIBFOLDERS]
                          [--may-develop-plugins MAYDEVELOPPLUGINS]
                          [--may-create-authenticated-connections MAYCREATEAUTHENTICATEDCONNECTIONS]
The --admin flag and all of the --may-xxx flags take “true” or “false” as argument, and refer to one of the global permissions as detailed in Main project permissions
api-key-edit¶
Edits a global API key
dsscli api-key-edit [-h] [--output OUTPUT] [--no-header]
                          [--description DESCRIPTION] [--label LABEL]
                          [--admin ADMIN]
                          [--may-manage-code-envs MAYMANAGECODEENVS]
                          [--may-create-code-envs MAYCREATECODEENVS]
                          [--may-write-unsafe-code MAYWRITEUNSAFECODE]
                          [--may-write-safe-code MAYWRITESAFECODE]
                          [--may-create-projects MAYCREATEPROJECTS]
                          [--may-manage-udm MAYMANAGEUDM]
                          [--may-edit-lib-folders MAYEDITLIBFOLDERS]
                          [--may-develop-plugins MAYDEVELOPPLUGINS]
                          [--may-create-authenticated-connections MAYCREATEAUTHENTICATEDCONNECTIONS]
                          key
The --admin flag and all of the --may-xxx flags take “true” or “false” as argument, and refer to one of the global permissions as detailed in Main project permissions
Jobs-related commands¶
These commands are used to trigger, list and abort jobs and scenarios
jobs-list¶
Lists jobs for a given project, both running ones and past ones
dsscli jobs-list [-h] [--output OUTPUT] [--no-header] project_key
Returns a list like:
| Job id | State | From scenario |
|---|---|---|
| Build_dataset1_2017-10-25T13-05-23.615 | RUNNING | |
| Build_dataset1_2017-10-25T13-05-23.615 | FAILED | |
| Build_dataset1_2017-10-25T12-45-32.864 | FAILED | |
| Build_Other_2017-10-25T12-43-31.463 | DONE | |
build¶
Runs a DSS job to build one or several datasets, saved models or managed folders
build [-h] [--output OUTPUT] [--no-header] [--wait]
                     [--mode MODE] [--dataset DATASET [DATASET ...]]
                     [--folder FOLDER] [--model MODEL]
                     project_key
Specifying outputs to build¶
To build “dataset1” and “dataset2” in project “PROJECT1”, run: dsscli build PROJECT1 --dataset dataset1 --dataset dataset2
For partitioned datasets, use the following syntax:
dsscli build PROJECT1 --dataset dataset1 partition1
For multiple partitions, use the regular partition specification syntax:
dsscli build PROJECT1 --dataset dataset1 FR,EN
dsscli build PROJECT1 --dataset dataset1 2017-01-02/2017-01-14
dsscli build PROJECT1 --dataset dataset1 '2017-01-02/2017-01-14|FR,2017-01-02/2017-01-30|EN'
Quote any specification that contains | so that the shell does not interpret it as a pipe.
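The partition syntax can also be assembled programmatically. The Python sketch below builds the build command as an argument list, so no shell interpretation takes place, and uses shlex.join to show the equivalent shell-safe command line. The names partition_id and build_argv are illustrative, and the joining rules (| between dimensions, comma between partitions) are inferred from the examples in this section.

```python
import shlex


def partition_id(*dimension_values):
    """Join the values of several partitioning dimensions with '|'."""
    return "|".join(dimension_values)


def build_argv(project_key, datasets):
    """datasets is a list of (dataset_name, [partition, ...]) pairs."""
    argv = ["./bin/dsscli", "build", project_key]
    for name, partitions in datasets:
        argv += ["--dataset", name]
        if partitions:
            # several partitions of one dataset are comma-separated
            argv.append(",".join(partitions))
    return argv


argv = build_argv("PROJECT1", [
    ("dataset1", [
        partition_id("2017-01-02/2017-01-14", "FR"),
        partition_id("2017-01-02/2017-01-30", "EN"),
    ]),
])
# shlex.join quotes the specification because it contains '|'
command_line = shlex.join(argv)
```

Passing argv directly to subprocess.run avoids quoting issues altogether; command_line is only useful for logging or for pasting into a terminal.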
Build modes¶
Use --mode to switch between the different build modes of the DSS flow. The argument must be one of:
RECURSIVE_BUILD
NON_RECURSIVE_FORCED_BUILD
RECURSIVE_FORCED_BUILD
RECURSIVE_MISSING_ONLY_BUILD
Other¶
The --wait argument makes dsscli wait for the end of the job (either success or failure) before returning. If the job fails or is aborted, dsscli returns with a non-zero exit code.
If not waiting, dsscli prints the new job id.
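Because a waited-for build reports failure through its exit code, a calling script only needs to inspect the return code. The Python sketch below shows that pattern; build_and_wait_argv and job_succeeded are illustrative names, and the two stand-in commands exist only so that the helper can be tried without a DSS instance.

```python
import subprocess
import sys


def build_and_wait_argv(project_key, dataset, mode=None):
    """Build the argument list for a blocking `dsscli build`."""
    argv = ["./bin/dsscli", "build", project_key, "--dataset", dataset, "--wait"]
    if mode is not None:
        argv += ["--mode", mode]
    return argv


def job_succeeded(argv):
    """Run a command and return True if its exit code is 0."""
    return subprocess.run(argv).returncode == 0


# Stand-ins for a succeeding and a failing job (not dsscli commands)
ALWAYS_OK = [sys.executable, "-c", "raise SystemExit(0)"]
ALWAYS_FAILS = [sys.executable, "-c", "raise SystemExit(1)"]
```

In a real script, job_succeeded(build_and_wait_argv("PROJECT1", "dataset1")) would be run from the DSS data directory and return False when the job fails or is aborted.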
job-abort¶
Aborts a running job
dsscli job-abort [-h] project_key job_id
The job_id is the first column returned by the dsscli jobs-list command
job-status¶
Gets the status of a job
dsscli job-status [-h] [--output OUTPUT] [--no-header]
                          project_key job_id
Scenarios-related commands¶
scenarios-list¶
Lists scenarios of a project
dsscli scenarios-list [-h] [--output OUTPUT] [--no-header]
                              project_key
scenario-runs-list¶
Lists previous and current runs of a scenario
dsscli scenario-runs-list [-h] [--output OUTPUT] [--no-header]
                                  [--limit LIMIT] [--only-finished-runs]
                                  project_key scenario_id
--only-finished-runs limits output to runs that are finished (either succeeded, failed or aborted)
--limit limits the number of returned runs. The default is 10.
scenario-run¶
Runs a scenario
dsscli scenario-run [-h] [--output OUTPUT] [--no-header] [--wait]
                            [--no-fail] [--params RUN_PARAMS]
                            project_key scenario_id
If the scenario was already running, the run is cancelled, and a flag is returned in the output.
If --wait is passed, the command waits for the scenario to complete and fails if the scenario fails, unless --no-fail is passed. It also fails if the run is cancelled because the scenario was already running.
--params is an optional file containing run parameters as a JSON dict. Use ‘-’ for stdin.
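Run parameters can be fed through stdin instead of a file. The Python sketch below prepares both the argument list and the JSON text to send on stdin; scenario_run_call is an illustrative name, and the project key, scenario id and parameter names in the usage note are hypothetical.

```python
import json


def scenario_run_call(project_key, scenario_id, params=None,
                      wait=False, no_fail=False):
    """Return (argv, stdin_text) for `dsscli scenario-run`."""
    argv = ["./bin/dsscli", "scenario-run"]
    if wait:
        argv.append("--wait")
    if no_fail:
        argv.append("--no-fail")
    stdin_text = None
    if params is not None:
        # '-' tells dsscli to read the JSON dict from stdin
        argv += ["--params", "-"]
        stdin_text = json.dumps(params)
    argv += [project_key, scenario_id]
    return argv, stdin_text
```

A caller could then run argv, stdin_text = scenario_run_call("MYPROJECT", "my_scenario", params={"key": "value"}, wait=True) followed by subprocess.run(argv, input=stdin_text, text=True) from the DSS data directory.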
scenario-abort¶
dsscli scenario-abort [-h] project_key scenario_id
Aborts the current run of a scenario, if any. Does not fail if the scenario was not running
Projects-related commands¶
projects-list¶
Lists all project keys and names. For example:
./bin/dsscli projects-list
| Project key | Name |
|---|---|
| DKU_HAIKU_STARTER | Haiku Starter for Administrator |
project-export¶
Exports a project as a zip archive to the specified path. Set the optional flags to modify export options.
project-export [-h] [--uploads] [--no-uploads]
                              [--managed-fs] [--no-managed-fs]
                              [--managed-folders] [--no-managed-folders]
                              [--input-managed-folders]
                              [--no-input-managed-folders]
                              [--input-datasets] [--no-input-datasets]
                              [--all-datasets] [--no-all-datasets]
                              [--analysis-models] [--no-analysis-models]
                              [--saved-models] [--no-saved-models]
                              project_key path
positional arguments:
  project_key           Project key to export
  path                  Target archive path
Example:
./bin/dsscli project-export DKU_HAIKU_STARTER DKU_HAIKU_STARTER.zip
Exporting with options: {"exportManagedFolders": false, "exportAllDatasets": false, "exportManagedFS": false, "exportAllInputDatasets": false, "exportUploads": true, "exportAnalysisModels": true, "exportSavedModels": true, "exportAllInputManagedFolders": false}
project-import¶
Imports a project from a project zip file.
project-import [-h] [--project-key PROJECT_KEY]  [--remap-connection OLD_CONNECTION=NEW_CONNECTION] path
    positional arguments:
      path                  Source archive path
    optional arguments:
      --project-key PROJECT_KEY
                            Override project key
      --remap-connection OLD_CONNECTION=NEW_CONNECTION
                            Remap a connection
Example (the imported project will have the project key IMPORTED_PROJECT):
./bin/dsscli project-import --project-key=IMPORTED_PROJECT --remap-connection filesystem-managed=limited-filesystem MY_PROJECT.zip
Uploading archive ...
Importing ...
Import successful
project-delete¶
Deletes a project. Returns nothing on success.
project-delete [-h] project_key
positional arguments:
  project_key  Project key of project to delete
Example:
./bin/dsscli project-delete DKU_HAIKU_STARTER
bundle-export (Design node only)¶
Creates a new bundle for the specified project. If the bundle_id already exists for the project, you will receive an error.
bundle-export [-h] project_key bundle_id
positional arguments:
  project_key  Project key for which to export a bundle
  bundle_id    Identifier of the bundle to create
./bin/dsscli bundle-export DKU_HAIKU_STARTER v1
Start exporting bundle v1 ...
Export completed
bundles-list-exported (Design node only)¶
Returns a table of bundle ids for the specified project.  If --with-data is specified, the full export manifest will be returned in JSON array format.
bundles-list-exported [-h] [--with-data] project_key
positional arguments:
  project_key           Project key for which to list bundles
optional arguments:
  --with-data           Retrieve full information for each bundle
Example:
./bin/dsscli bundles-list-exported DKU_HAIKU_STARTER
Bundle id
-----------
v1
v2
v3
bundle-download-archive (Design node only)¶
Downloads a bundle as a zip file.
bundle-download-archive [-h] project_key bundle_id path
positional arguments:
  project_key  Project key for which to export a bundle
  bundle_id    Identifier of the bundle to create
  path         Target file (- for stdout)
Example:
./bin/dsscli bundle-download-archive DKU_HAIKU_STARTER v1 dku_haiku_bundle_v1.zip
project-create-from-bundle (Automation node only)¶
Creates a project on the Automation node based on a project bundle archive generated from the Design node.
project-create-from-bundle archive_path
positional arguments:
  archive_path          Archive path
Example:
./bin/dsscli project-create-from-bundle ../DESIGN_DATADIR/dku_haiku_bundle_v1.zip
Project key
-----------------
DKU_HAIKU_STARTER
bundles-list-imported (Automation node only)¶
Lists all bundles per project on the automation node. If --with-data is specified, the full export manifest in JSON array format for each bundle is returned.
bundles-list-imported [-h] project_key
positional arguments:
  project_key           Project key for which to list bundles
Example:
./bin/dsscli bundles-list-imported DKU_HAIKU_STARTER
Bundle id
-----------
v1
bundle-import (Automation node only)¶
Imports a bundle on the automation node from a zip file archive. If the project does not already exist, use project-create-from-bundle.
bundle-import [-h] project_key archive_path
positional arguments:
  project_key           Project key for which to import a bundle
  archive_path          Archive path
Example:
./bin/dsscli bundle-import DKU_HAIKU_STARTER ~/DESIGN_DATADIR/dku_haiku_bundle_v1.zip
Project key          Bundle id
-------------------  -----------
DKU_HAIKU_STARTER    v1
bundle-activate (Automation node only)¶
Activates a bundle on the automation node. Connection and code environment re-mappings should happen prior to activation.
bundle-activate [-h] project_key bundle_id
positional arguments:
  project_key  Project key for which to activate a bundle
  bundle_id    Identifier of the bundle to activate
Example:
./bin/dsscli bundle-activate DKU_HAIKU_STARTER v1
{
  "aborted": false,
  "unknown": false,
  "alive": false,
  "runningTime": 0,
  "hasResult": true,
  "result": {
    "neededAMigration": false,
    "anyMessage": false,
    "success": false,
    "messages": [],
    "warning": false,
    "error": false,
    "fatal": false
  },
  "startTime": 0
}
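A script that activates bundles will usually want to inspect this JSON report. The Python sketch below parses the output shown above and collects any problem indicators. The meaning of the individual fields is not documented in this section, so treating warning, error, fatal and aborted as problem indicators is an assumption of this example, and reported_problems is an illustrative name.

```python
import json

# Output copied from the bundle-activate example above
SAMPLE_OUTPUT = """
{
  "aborted": false,
  "unknown": false,
  "alive": false,
  "runningTime": 0,
  "hasResult": true,
  "result": {
    "neededAMigration": false,
    "anyMessage": false,
    "success": false,
    "messages": [],
    "warning": false,
    "error": false,
    "fatal": false
  },
  "startTime": 0
}
"""


def reported_problems(output_text):
    """List the problem indicators that are set in a bundle-activate report."""
    report = json.loads(output_text)
    result = report.get("result") or {}
    # Assumption: these three booleans signal increasing severity
    problems = [level for level in ("warning", "error", "fatal")
                if result.get(level)]
    if report.get("aborted"):
        problems.append("aborted")
    return problems
```

With the sample report, reported_problems returns an empty list, since none of the inspected flags is set.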
Datasets related commands¶
datasets-list¶
Lists the Project key, Name and Type for all datasets in a specified project.
datasets-list [-h] project_key
positional arguments:
  project_key           Project key for which to list datasets
Example:
./bin/dsscli datasets-list DKU_HAIKU_STARTER
| Project key | Name | Type |
|---|---|---|
| DKU_HAIKU_STARTER | Orders | FilesInFolder |
dataset-schema-dump¶
Outputs the Name, Type, Meaning and Max. length for all columns in a dataset schema. Meaning and Max. length will only be returned if they were modified from the default.
dataset-schema-dump [-h] project_key name
positional arguments:
  project_key           Project key of the dataset
  name                  Dataset for which to dump the schema
Example:
./bin/dsscli dataset-schema-dump DKU_HAIKU_STARTER Orders
The output would look like:
| Name | Type | Meaning | Max. length |
|---|---|---|---|
| tshirt_quantity | bigint | Integer | 200 |
dataset-list-partitions¶
Lists all partition values for a specified dataset.
dataset-list-partitions [-h] project_key name
positional arguments:
  project_key           Project key of the dataset
  name                  Dataset for which to list partitions
For example, for a dataset with two partitioning dimensions, purchase_date and merchant_id, the output would look like:
| purchase_date | merchant_id |
|---|---|
| 2020-12-15 | 437 |
dataset-clear¶
Clears the specified dataset, or only the given partitions if --partitions is specified.
dataset-clear [-h] [--partitions PARTITIONS] project_key name
positional arguments:
  project_key           Project key of the dataset
  name                  Dataset to clear
optional arguments:
  --partitions PARTITIONS List of partitions to clear
Example:
./bin/dsscli dataset-clear DKU_HAIKU_STARTER Orders_enriched
dataset-delete¶
Deletes the specified dataset.
dataset-delete [-h] project_key name
positional arguments:
  project_key  Project key of the dataset
  name         Dataset to delete
Example:
./bin/dsscli dataset-delete DKU_HAIKU_STARTER Orders_enriched
Managed folders related commands¶
managed-folders-list¶
Lists all managed folders for the specified project.
managed-folders-list [-h] project_key
positional arguments:
  project_key           Project key of the managed folders
Example:
./bin/dsscli managed-folders-list DKU_HAIKU_STARTER
| Name | Type | Id |
|---|---|---|
| Orders | Filesystem | O2ue6CX3 |
managed-folder-list-contents¶
Lists the contents of a particular managed folder. managed_folder_id is the Id returned in a call to managed-folders-list.
managed-folder-list-contents [-h] project_key managed_folder_id
positional arguments:
  project_key           Project key of the managed folders
  managed_folder_id     Managed folder id
Example:
./bin/dsscli managed-folder-list-contents DKU_HAIKU_STARTER O2ue6CX3
| Path | Size | Last Modified |
|---|---|---|
| /orders_2017-01.csv | 40981 | 2021-01-25T18:49:48+00:00 |
managed-folder-get-file¶
Allows you to download a specified file from a managed folder to your server. If no --output-file is specified, the contents of the file_path file will be output to the console.
managed-folder-get-file [-h] [--output-file OUTPUT_FILE] project_key managed_folder_id file_path
positional arguments:
  project_key           Project key of the managed folders
  managed_folder_id     Managed folder id
  file_path             File path
optional arguments:
  --output-file OUTPUT_FILE
                        Path to output file
Example:
./bin/dsscli managed-folder-get-file DKU_HAIKU_STARTER O2ue6CX3 orders_2017-01.csv --output-file my_local_orders.csv
Connections-related commands¶
connections-list¶
Lists all connections with their type and flags, like this:
./bin/dsscli connections-list
| Name | Type | Allow write | Allow managed datasets | Usable by | Credentials mode |
|---|---|---|---|---|---|
| filesystem_managed | Filesystem | True | True | ALLOWED | GLOBAL |
Code env related commands¶
code-envs-list¶
Lists the Name, Language and Type of all code environments.
Example:
./bin/dsscli code-envs-list
| Name | Language | Type |
|---|---|---|
| python36 | PYTHON | DESIGN_MANAGED |
code-env-update¶
Allows you to perform an “update” for a particular code environment. If the --force-rebuild-env flag is included, it will clear the code environment and rebuild it from scratch. Nothing is returned on success.
code-env-update [-h] [--force-rebuild-env] lang name
positional arguments:
  lang                 Language of code env to update
  name                 Name of code env to update
optional arguments:
  --force-rebuild-env  Force rebuilding of the code env
Example (PYTHON is the language and python36 is the code environment name):
./bin/dsscli code-env-update PYTHON python36 --force-rebuild-env
API services related commands¶
api-services-list¶
Lists all API services per project.
api-services-list [-h] project_key
Returns a list of service Ids, Endpoints and Public flags. Example:
./bin/dsscli api-services-list DKU_HAIKU_STARTER
| Id | Public? | Endpoints |
|---|---|---|
| Tutorial_Deployment | Yes | High_Revenue_Customers (STD_PREDICTION) |
api-service-package-create¶
Creates a package for an API service based on a project and service_id, which is the same as the Id returned from the api-services-list call. If no --name is specified, the version number will automatically be set to the next package version number available for the service. If no --path is specified, the package will be downloaded to your current directory.
api-service-package-create [-h] [--name NAME] [--path PATH]  project_key service_id
positional arguments:
  project_key  Project key containing service
  service_id   API service to package
optional arguments:
  --name NAME  Name for the package (default: auto-generated)
  --path PATH  Path to download the package to (default: current directory)
Examples:
./bin/dsscli api-service-package-create DKU_HAIKU_STARTER Tutorial_Deployment
Downloading package to v3.zip
./bin/dsscli api-service-package-create DKU_HAIKU_STARTER Tutorial_Deployment --name v7 --path /Users/ssinick/Documents/
Downloading package to /Users/ssinick/Documents/v7.zip
Controlling dsscli output¶
All dsscli commands that display results take two additional arguments:
--no-header removes column headers from the display to make the output easier to parse (each line of the output directly corresponds to one data item)
--output json formats the output as JSON for machine consumption (the default is --output fancy)
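For scripts, the JSON output is usually the most convenient form to consume. The Python sketch below appends --output json to an existing argument list and decodes the result. The exact structure of the JSON document is not described in this section, so the sketch only assumes that stdout holds a single valid JSON document; with_json_output and parse_json_output are illustrative names.

```python
import json


def with_json_output(argv):
    """Return a copy of argv that requests JSON output."""
    return [*argv, "--output", "json"]


def parse_json_output(stdout_text):
    """Decode the JSON document printed by a dsscli command."""
    return json.loads(stdout_text)
```

A caller would typically run subprocess.run(with_json_output(argv), capture_output=True, text=True) from the DSS data directory and pass the resulting stdout to parse_json_output.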