Excel

Dataiku DSS handles Excel spreadsheets saved in XLS and XLSX format, and automatically detects them.

It also detects which sheets in the file are likely to contain data, and which row is likely to hold the column headers. You can also manually configure which sheet(s) to include, using one of the following options:

  • All: to include all sheets. Useful if the Excel file can change over time and gain new sheets in the future

  • By name: to include specific sheets based on their exact names

  • By indices: to include specific sheets based on their positions. To include multiple sheets, separate their indices with commas (ex: 1,2,4 or 1,5), use ranges (ex: 5-7, 2-, -5), or mix both (ex: 1,2,5-7).

  • By pattern: to include specific sheets whose names match a regular expression pattern (ex: [0-9]{4} or sales-.*). Every sheet whose name contains a match for the pattern is selected. To include only sheets whose full name matches the pattern, anchor it with ^ and $.
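
The index and pattern options above can be illustrated with a small Python sketch. It is not DSS code: the function names and sample sheet names are hypothetical, and it assumes that positions are 1-based and ranges are inclusive, which the text above does not state. It also uses Python's re module, whose regular expression dialect may differ in details from the one DSS uses.

```python
import re


def select_by_indices(sheet_names, spec):
    """Return the sheets selected by a spec such as "1,2,5-7", "2-" or "-5".

    Assumption (not confirmed by the DSS documentation): positions are
    1-based and ranges are inclusive on both ends.
    """
    count = len(sheet_names)
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An open-ended range defaults to the first or last sheet.
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else count
        else:
            start = end = int(part)
        # Ignore positions that fall outside the workbook.
        for position in range(max(start, 1), min(end, count) + 1):
            selected.add(position)
    return [sheet_names[p - 1] for p in sorted(selected)]


def select_by_pattern(sheet_names, pattern):
    """Return the sheets whose name contains a match for the pattern."""
    regex = re.compile(pattern)
    return [name for name in sheet_names if regex.search(name)]


# Hypothetical workbook, listed in sheet order.
sheets = ["2023", "2024", "sales-eu", "old-sales-us", "notes-20245"]

print(select_by_indices(sheets, "1,3-4"))       # ['2023', 'sales-eu', 'old-sales-us']
print(select_by_indices(sheets, "4-"))          # ['old-sales-us', 'notes-20245']
print(select_by_pattern(sheets, "[0-9]{4}"))    # ['2023', '2024', 'notes-20245']
print(select_by_pattern(sheets, "^[0-9]{4}$"))  # ['2023', '2024']
print(select_by_pattern(sheets, "sales-.*"))    # ['sales-eu', 'old-sales-us']
print(select_by_pattern(sheets, "^sales-.*$"))  # ['sales-eu']
```

The unanchored pattern [0-9]{4} also selects notes-20245, because four consecutive digits appear somewhere in that name. Adding ^ and $ restricts the selection to sheets whose entire name is four digits.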

When creating a dataset with multiple sheets, all sheets are expected to have the same schema (same number of columns and same column names).

When uploading an Excel file containing multiple sheets, you can create either a single dataset for multiple sheets or one dataset per sheet. For more advanced capabilities when importing multiple Excel sheets, you can check out the Excel sheet importer plugin.

Exporting

Dataiku DSS offers built-in capabilities to export a dataset into an Excel spreadsheet.

If some columns have been configured with conditional formatting “color by rules”, DSS can color the cells accordingly when exporting the dataset to Excel. Note that while cells are colored in the exported Excel file, the rules themselves are not exported.

To export a dataset into an existing Excel file acting as a template, see Excel Templater.