“Files in folder” dataset¶
A managed folder is a generic object in the DSS flow that contains files. It can be read and written by Python or R recipes, and files can be added to it either manually or via an API.
In some setups, it is useful to recreate a dataset from such a managed folder. To do that, use the files in folder dataset.
- Select the folder to read from
- If needed, select a pattern to select the files to read from the folder. If you don’t select a pattern, all files in the folder are read.
This enables advanced setups like:
- Having a Python dataset download files from a files-oriented data store that DSS cannot read. The Python recipe downloads the files to a managed folder. This way, it does not have to deal with parsing the files.
- Having a files-in-folder dataset do the parsing and extraction from the intermediate folder.