When training

When you train a machine learning model, a portion of the data is held out in order to evaluate the performance of the machine learning model (this “held-out” data is then called the test set).

The result of this evaluation operation is what can be seen in the Results screens of the models:

  • Performance metrics

  • Confusion matrix

  • Decision charts

  • Density charts

  • Lift charts

  • Error deciles

  • Partial dependences

  • Subpopulation analysis

For more details on all possible result screens, see Prediction Results

Each time you train a model in the visual analysis, and each time you retrain a saved model through a training recipe, this produces a new version of the model and its associated evaluation.

Subsequent evaluations

In addition to the evaluation that is automatically generated when training a model, it can be useful to evaluate a model on a different dataset, at a later time.

This is especially useful to detect Drift, i.e. when a model does not perform as well anymore, usually because the external conditions have changed.

In DSS, creating subsequent evaluations is done using an Evaluation recipe. These evaluations are stored in a Model Evaluation Store