Extract failed rows
The “extract failed rows” recipe creates a new dataset containing the records that failed the Data Quality rules defined on the input dataset.
The output dataset includes all columns from the original dataset, plus additional columns appended for each Data Quality rule.
See also
For more information about Data Quality, see the related article in the Knowledge Base.
Create a recipe
There are two ways to create an “extract failed rows” recipe.
The first option is to start the extraction from the ‘Current status’ tab of a dataset’s Data Quality view, using the ‘More actions’ menu (vertical dots).
The second option is to click the ‘+Recipe’ button in the flow. Alternatively, if you have selected a dataset, go to the right panel’s Action tab and select ‘Other recipes’ > ‘Extract failed rows’.
Supported Data Quality rules
The extract failed rows recipe currently supports five rule types:
Column values are not empty
Column values are empty
Column values are unique
Column values in set
Column values are valid according to meaning (only with DSS engine)
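To make the semantics of the first four rule types above concrete, here is a minimal plain-Python sketch of how failed rows could be identified and flagged. It is an illustration only, not how DSS implements the recipe: the rule tuples, the `extract_failed_rows` helper, the `failed_<rule>` column names, and the definition of "empty" (None or an empty string) are all assumptions made for this example. The meaning-based rule is left out because it depends on DSS meanings.

```python
from collections import Counter

# Hypothetical rule definitions: (rule name, column, check kind, extra argument).
RULES = [
    ("id_not_empty", "id", "not_empty", None),
    ("id_unique", "id", "unique", None),
    ("country_in_set", "country", "in_set", {"FR", "US"}),
    ("note_empty", "note", "empty", None),
]


def is_empty(value):
    """Treat None and the empty string as empty (an assumption of this sketch)."""
    return value is None or value == ""


def extract_failed_rows(rows, rules):
    """Return rows failing at least one rule, with one boolean column per rule."""
    # Uniqueness needs a pass over the whole column before rows can be judged.
    counts = {
        column: Counter(row[column] for row in rows)
        for _, column, kind, _ in rules
        if kind == "unique"
    }
    failed = []
    for row in rows:
        out = dict(row)  # keep every original column
        any_failure = False
        for name, column, kind, arg in rules:
            value = row[column]
            if kind == "not_empty":
                ok = not is_empty(value)
            elif kind == "empty":
                ok = is_empty(value)
            elif kind == "unique":
                ok = counts[column][value] == 1
            elif kind == "in_set":
                ok = value in arg
            else:
                raise ValueError(f"unsupported rule kind: {kind}")
            out[f"failed_{name}"] = not ok  # hypothetical column name
            any_failure = any_failure or not ok
        if any_failure:
            failed.append(out)
    return failed


rows = [
    {"id": 1, "country": "FR", "note": ""},
    {"id": 2, "country": "DE", "note": ""},
    {"id": 2, "country": "US", "note": "check"},
    {"id": None, "country": "US", "note": ""},
]
failed_rows = extract_failed_rows(rows, RULES)
```

In this sample, the first row passes every rule and is excluded, while the other three rows are kept with their original columns and one extra flag per rule. The actual names and contents of the columns that DSS appends may differ from this sketch.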
Engines
DSS selects the engine used to execute the recipe based on the input dataset types, choosing between Hive, Impala, SparkSQL, plain SQL, and the internal DSS engine. To see the available engines and select one, click the cogwheel below the “Run” button.