Concepts¶
When training¶
When you train a machine learning model, a portion of the data is held out in order to evaluate the model's performance (this held-out data is called the test set).
The results of this evaluation are displayed in the Results screens of the models:
Performance metrics
Confusion matrix
Decision charts
Density charts
Lift charts
Error deciles
Partial dependences
Subpopulation analysis
…
For more details on all possible result screens, see Prediction Results.
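
To make the held-out evaluation concrete, here is a minimal sketch using scikit-learn rather than DSS itself: it holds out a test set, trains a model, and computes two of the results listed above (performance metrics and a confusion matrix). The dataset and model choice are purely illustrative.

# Illustrative only: a held-out test set evaluation with scikit-learn.
# DSS performs this step automatically each time a model is trained.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the data; it is never seen during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)

# Evaluate on the held-out test set only.
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

print("Accuracy:", accuracy_score(y_test, y_pred))
print("ROC AUC:", roc_auc_score(y_test, y_proba))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
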
Each time you train a model in the visual analysis, and each time you retrain a saved model through a training recipe, a new version of the model and its associated evaluation is produced.
Subsequent evaluations¶
In addition to the evaluation that is automatically generated when training a model, it can be useful to evaluate the model on a different dataset at a later time.
This is especially useful for detecting Drift, i.e. when a model no longer performs as well, usually because external conditions have changed.
In DSS, subsequent evaluations are created using an Evaluation recipe. These evaluations are stored in a Model Evaluation Store.
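
As a conceptual illustration only (this is not the Evaluation recipe or the Model Evaluation Store API), the following sketch re-evaluates a previously saved model on a newer dataset and compares the metric against the one obtained at training time to flag possible drift. The file names, column names, metric value, and threshold are hypothetical.

# Conceptual sketch: score an already-trained model on newer data and
# compare the metric with the training-time evaluation to flag drift.
import joblib
import pandas as pd
from sklearn.metrics import roc_auc_score

TRAINING_TIME_AUC = 0.93   # metric recorded when the model was trained (hypothetical)
DRIFT_TOLERANCE = 0.05     # acceptable drop before raising an alert (hypothetical)

model = joblib.load("model_v3.joblib")            # previously saved model (hypothetical file)
recent = pd.read_csv("transactions_2024_q2.csv")  # newer data with ground truth (hypothetical file)

X_recent = recent.drop(columns=["label"])
y_recent = recent["label"]

# Score the old model on the new data and compute the same metric.
current_auc = roc_auc_score(y_recent, model.predict_proba(X_recent)[:, 1])

print(f"Training-time AUC: {TRAINING_TIME_AUC:.3f}, current AUC: {current_auc:.3f}")
if TRAINING_TIME_AUC - current_auc > DRIFT_TOLERANCE:
    print("Performance has degraded: possible drift, consider retraining.")
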