FTP¶

Data Science Studio can use FTP servers to:

Read and write datasets
Read and write managed folders

Note

You can use the DSS Download recipe to cache the contents from a FTP server.

This can provide better performance if you need to read FTP files a lot of time, and don’t mind the copy of the data which is made into a DSS managed folder.

By default, the download recipe will still check the FTP server for updates when its output folder is rebuilt. This behavior can be disabled.

Creating a FTP connection¶

Note

Creating FTP connections can only be done by DSS administrators (except if you use “personal connections”)

Interactive with a FTP server first requires the definition of a connection to the remote server, as follows:

Go to Administration > Connections
Click the “New connection” button and select FTP
Enter a name for the new connection, and the required connection parameters
Save the new connection

FTP connection parameters¶

Name	Description
Host	Host name or IP address of the FTP server to access (Mandatory)
User	FTP username to use, or empty for an anonymous FTP connection
Password	FTP password to use, or empty for an anonymous FTP connection
Use passive mode	Check to use FTP “passive” data transfer mode (default). Using FTP passive mode is often mandatory when there is a firewall between the Data Science Studio server and the FTP server.
Path	Path to the remote folder to use once connected to the FTP server. Start with a `/` to specify an absolute path, without `/` the path will be relative to the startup directory.
Writable	Check to allow DSS to write datasets on this server. Those datasets will be written in a subfolder or the `path` (or the startup directory if `path` is empty) bearing the name of the dataset.
Allow managed datasets	Check to allow DSS to write managed datasets on this server. Requires `Writable`.
Allow managed folders	Check to allow DSS to write managed folders on this server. Requires `Writable`.
Use global proxy	When checked, use the global proxy for this connection. Uncheck this if the FTP server is directly accessible. If you have an HTTP proxy, passive mode is mandatory.

Creating FTP datasets¶

After setting up a FTP connection, simply add a new dataset to your project, choosing the “FTP” type. Select your FTP connection.

If necessary, specify a path (subpath of the connection’s path if it is not empty) or click “Browse” and select a file or directory.

If the final path a directory, the data is the union of all the data in all the files in that directory (including sub-directories). The preview displayed in the dataset creation screen will only present data from the first non-empty file.

Use the FTP dataset for writing¶

Two cases are supported:

In a folder:
- the data will be written in possibly multiple files
- the content of the folder is wiped before writing
- writing a managed dataset requires a directory
In a file:
- you must create the file beforehand (it may be empty)
- the file is emptied before writing