Metrics, checks and Data Quality
Metrics allow you to automate the computation of measurements on flow items (datasets, managed folders, saved models and model evaluation stores). You can use Checks to verify that metric values meet certain conditions.
Data Quality rules are an improvement over the check mechanism for datasets. They allow you to define expectations on a dataset’s contents in a single step and also provide different views to monitor and analyze data quality issues across datasets, projects, and the full Dataiku instance.
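The relationship between a metric and a check described above can be sketched in plain Python. This is only an illustration of the concept and does not use the Dataiku API: the function names `compute_metric` and `check_in_range`, the sample rows, and the status labels are all hypothetical.

```python
from statistics import mean


def compute_metric(rows, column):
    """Metric: average of a numeric column, ignoring missing values."""
    values = [row[column] for row in rows if row.get(column) is not None]
    return mean(values) if values else None


def check_in_range(value, minimum, maximum):
    """Check: test whether a metric value lies within [minimum, maximum].

    The status strings are illustrative labels chosen for this sketch.
    """
    if value is None:
        return "EMPTY"
    return "OK" if minimum <= value <= maximum else "ERROR"


# Hypothetical sample data standing in for a dataset.
rows = [{"price": 10.0}, {"price": 14.0}, {"price": None}]

avg_price = compute_metric(rows, "price")
status = check_in_range(avg_price, 5, 20)
print(avg_price, status)
```

In this sketch the metric is computed first and the check is applied to its value as a separate step. A Data Quality rule such as "Column avg in range" expresses the same expectation in a single step.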
- Metrics
- Checks
- Data Quality Rules
  - Data Quality rule types
    - Column min in range
    - Column avg in range
    - Column max in range
    - Column sum in range
    - Column median in range
    - Column std dev in range
    - Column values are not empty
    - Column values are empty
    - Column values are unique
    - Column values in set
    - Column top N values in set
    - Column most frequent value in set
    - Column values are valid according to meaning
    - Metric value in range
    - Metric value in set
    - File size in range
    - Record count in range
    - Column count in range
    - Python code
    - Compare values of two metrics
    - Plugin rules
  - Rule configuration
  - Data Quality monitoring views
  - Other data quality views
  - Data Quality on partitioned datasets
  - Retro-compatibility with Checks
- Data Quality rule types
- Custom probes and checks
- Data Quality Templates