# Model Document Generator¶

You can use the Model Document Generator to create documentation associated with any trained model. It will generate a Microsoft Word™ .docx file that provide information regarding:

• What the model does

• How the model was built (algorithms, features, processing, …)

• How the model was tuned

• What are the model’s performances

It allows you to prove that you followed industry best practices to build your model.

## Generate a model document¶

Note

To use this feature, the graphical export library must be activated on your DSS instance. Please check with your administrator.

Once a model has been trained, you can generate a document from it with the following steps:

• Go to the trained model you wish to document (either a model trained in a Visual Analysis of the Lab or a version of a saved model deployed in the Flow)

• Click the Actions button on the top-right corner

• Select Export model documentation

• On the modal dialog, select the default template or upload your own template, and click Export to generate the document

• After the document is generated, you will be able to download your generated model document using the Download link

Warning

Any placeholders starting with the keyword design will be linked to the current status of your visual analysis. If you change parameters in your model, DSS will consider that placeholder values are outdated and will replace these placeholders with blanks in the generated documentation. If you delete your visual analysis and then generate a document from a saved model, any placeholders starting with the keyword design will not provide any result.

## Custom templates¶

If you want the document generator to use your own template, you need to use Microsoft Word™ to create a .docx file.

You can create your base document as you would create a normal Word document, and add placeholders when you want to retrieve information from your model.

### Sample templates¶

Instead of starting from scratch, you can modify the default templates:

### Placeholders¶

A placeholder is defined by a placeholder name inside two brackets, like {{design.prediction_type.name}}. The generator will read the placeholder and replace it with the value retrieved from the model.

There are multiples types of placeholders, which can produce text, an image, or a table.

You can format the style of a table placeholder by using a block placeholder and placing an empty table inside it.

### Conditional placeholders¶

If you want to build a single template able to handle several kinds of models, you may need to display information only when they are relevant. For example, you may want to describe what is a binary classification, but this description should only appear on your binary classification models. This can be achieved with a conditional placeholder.

A conditional placeholder contains three elements, each of them needs to be on a separate line:

• a condition

• a text to display if the condition is valid

• a closing element

Example:

{{if design.prediction_type.name == Binary classification}}

The model {{design.prediction_type.name}} is a binary classification.

{{endif design.prediction_type.name}}

{{if design.prediction_type.name != Binary classification}}

The model {{design.prediction_type.name}} is a not binary classification.

{{endif design.prediction_type.name}}


The condition itself is composed of three parts:

• a text placeholder

• an operator (== or !=)

• a reference value

Example:

{{if my.placeholder == myReferenceValue }}


During document generation, the placeholder is replaced by its value and compared to the reference value. If the condition is correct, the text is displayed, otherwise nothing will appear on the final document.

### List of placeholders¶

Placeholders related to the dataset:

Name

Compatibility

Description

dataset.prepare_steps.table

All

Preparation steps used on the dataset

dataset.prepare_steps.status

All

Tell if preparation steps are used on the dataset

Placeholders related to the design phase of a model:

Name

Compatibility

Description

All

Name of the modeling task

design.visual_analysis.name

All

Name of the visual analysis

design.algorithm.name

All

Name of the algorithm used to compute the model

design.target.name

Prediction

Name of the target variable

design.features_count.value

All

Number of columns in the train set

design.model_type.name

All

Type of model (Prediction or Clustering)

design.prediction_type.name

Prediction

Type of prediction (Regression, Binary classification or Multi-class classification)

design.target_proportion.plot

Classification

Proportions of classes in the guess sample

design.training_and_testing_strategy.table

Prediction

Strategy used to train and test the model

design.training_and_testing_strategy.policy.value

Prediction

Name of the policy used to train and test the model

design.sampling_method.value

Prediction

Sampling method named used to train and test the model

design.k_fold_cross_testing.value

Prediction

Tell if the K-fold strategy was used to train and test the model

design.sampling_and_splitting.image

Prediction

Sampling and splitting strategy used to train and test the model

design.train_set.image

Prediction

Explicit train set strategy

design.test_set.image

Prediction

Explicit test set strategy

design.test_metrics.name

Prediction

Metric used to optimize model hyperparameters

design.input_feature.table

All

Summary of input features

design.feature_generation_pairwise_linear.value

Prediction

Display if pairwise linear combinations are used in the feature generation

design.feature_generation_pairwise_polynomial.value

Prediction

Display if pairwise polynomial combinations are used in the feature generation

design.feature_generation_explicit_pairwise.status

Prediction

Display if the model contains explicit pairwise interactions used in the feature generation

design.feature_generation_explicit_pairwise.table

Prediction

Display explicit pairwise interactions used in the feature generation

design.feature_reduction.image

Prediction

Reduction method applied to the model

design.feature_reduction.name

Prediction

Name of the feature reduction applied to the model

design.chosen_algorithm_search_strategy.table

All

List the parameters used to configure the chosen algorithm

design.chosen_algorithm_search_strategy.text

All

Description of the chosen algorithm

design.other_algorithms_search_strategy.table

All

Parameters and description of the other computed algorithms

design.hyperparameter_search_strategy.image

Prediction

Hyperparameter search strategy used to compute the model

design.hyperparameter_search_strategy.table

Prediction

Summary of the hyperparameter search strategy

design.cross_validation_strategy.value

Prediction

Name of cross-validation strategy used on hyperparameters

design.dimensionality_reduction.table

Clustering

Dimensionality reduction used to compute the model

design.dimensionality_reduction.status

Clustering

Tell if a dimensionality reduction was used to compute the model

design.outliers_detection.table

Clustering

Outliers detection strategy used to compute the model

design.outliers_detection.status

Clustering

Tell if an outliers detection strategy was used to compute the model

design.weighting_strategy.method.name

Prediction

Name of the weighting strategy

design.weighting_strategy.variable.name

Prediction

Name of the weighting strategy associated variable

design.weighting_strategy.text

Prediction

Description of a weighting strategy

design.calibration_strategy.name

Classification

Name of the probability calibration method

design.calibration_strategy.text

Classification

Description of a probability calibration strategy

design.backend.name

All

Name of the backend engine

design.partitioned_model.status

All

Tell if the current model is a partitioned model

Placeholders related to the result of a model computation:

Name

Compatibility

Description

result.chosen_algorithm.name

All

Name of the algorithm selected for the mode used in the current document generation

result.train_set.sample_rows_count.value

All

Number of rows in the train set

result.test_metrics.value

Prediction

Value of the test metrics

result.target_value.positive_class.value

Binary classification

Name of the target value representing the positive class

result.target_value.negative_class.value

Binary classification

Name of the target value representing the negative class

result.threshold_metric.name

Binary classification

Name of the threshold (cut off) metric

result.classification_threshold.current.value

Binary classification

Current value of the threshold metric

result.classification_threshold.optimal.value

Binary classification

Optimal value of the threshold metric

result.chosen_algorithm_details.image

All

Summary of the prediction computation

result.cluster_summary.image

Clustering

Summary of the clustering computation

result.cluster_description.image

Clustering

Description of each category of the clustering

result.partitioned.summary.image

Prediction

Summary of the partitioned model execution

result.decision_trees.image

Prediction

Random forest decision tree visualisation

result.decision_trees.status

All

Tell if the model is compatible with a decision tree visualisation

result.regression_coefficients.image

Regression and binary classification

Regression coefficient chart

result.regression_coefficients.status

All

Tell if the model has a regression coefficient chart

result.feature_importance.plot

All

Feature importance chart

result.feature_importance.status

All

Does the model had a feature importance chart

result.individual_explanations.image

Prediction

Individual explanation chart

result.individual_explanations.text

Prediction

Description of the individual explanation chart

result.individual_explanations.status

All

Tell if the model contains an individual explanation chart

result.hyperparameter_search.plot

Prediction

Hyperparameter search results as a chart

result.hyperparameter_search.table

Prediction

Displays data associated to the hyperparameter search as an table

result.hyperparameter_search.status

All

Tell if the model had a hyperparameter search results chart

result.confusion_matrix.image

Classification

Confusion matrix results as a table

result.confusion_matrix_table.text

Binary classification

Description of the confusion matrix

result.confusion_matrix_metrics.plot

Classification

Metrics associated to the confusion matrix

result.confusion_matrix_metrics.text

Classification

Description of the different metrics

result.cost_matrix.image

Binary classification

Cost matrix as a table

result.cost_matrix.text

Binary classification

Description of the cost matrix

result.decision_chart.plot

Binary classification

Decision chart

result.decision_chart.text

Binary classification

Description of the decision chart

result.lift_curve.plot

Binary classification

Lift curve charts

result.lift_curve.text

Binary classification

Description of the lift curve charts

result.calibration.plot

Classification

Calibration curve chart

result.calibration.text

Classification

Description of the calibration curve chart

result.roc_curve.plot

Classification

ROC curve chart

result.roc_curve.text

Classification

Description of the ROC curve chart

result.density_chart.plot

Classification

Density chart

result.density_chart.text

Classification

Description of the density chart

result.hierarchy.plot

Clustering

Hierarchy of the interactive clustering model

result.anomalies.plot

Clustering

Anomalies detected displayed as a heatmap

result.cluster_heat_map.plot

Clustering

Features heatmap

result.scatter.plot

Regression and clustering

Scatter plot chart

result.error_distribution.table

Regression

Error distribution as a table

result.error_distribution.plot

Regression

Error distribution as a chart

result.error_distribution.text

Regression

Description of the error distribution

result.detailed_metrics.table

All

Detailed summary of the model computation

result.ml_assertions.table

Prediction

Assertions definitions and metrics

result.timings.table

All

Time of the different execution steps

result.diagnostics.table

All

All Diagnostics for the current model

result.diagnostics.table.dataset_sanity_checks

All

Diagnostics of type dataset sanity checks for the current model

result.diagnostics.table.modeling_parameters

All

Diagnostics of type modeling parameters for the current model

result.diagnostics.table.runtime

All

Diagnostics of type runtime that check the model training speed

result.diagnostics.table.training_overfit

All

Diagnostics of type training overfit for the current model

result.diagnostics.table.leakage_detection

All

Diagnostics of type leakage detection for the current model

result.diagnostics.table.model_check

All

Diagnostics of type model check for the current model

All

result.training_date.name

All

Date of the training

All

Image of the default leaderboard of the computed models

Classification

Image of the leaderboard of the computed models for the metric accuracy

Classification

Image of the leaderboard of the computed models for the metric precision

Classification

Image of the leaderboard of the computed models for the metric recall

Classification

Image of the leaderboard of the computed models for the metric F1 score

Classification

Image of the leaderboard of the computed models for the metric cost matrix gain

Classification

Image of the leaderboard of the computed models for the metric log loss

Classification

Image of the leaderboard of the computed models for the metric ROC AUC

Binary classification

Image of the leaderboard of the computed models for the metric calibration loss

Binary classification

Image of the leaderboard of the computed models for the metric lift

Regression

Image of the leaderboard of the computed models for the metric EVS

Regression

Image of the leaderboard of the computed models for the metric MAPE

Regression

Image of the leaderboard of the computed models for the metric MAE

Regression

Image of the leaderboard of the computed models for the metric MSE

Regression

Image of the leaderboard of the computed models for the metric RMSE

Regression

Image of the leaderboard of the computed models for the metric RMSLE

Regression

Image of the leaderboard of the computed models for the metric R2 score

Regression

Image of the leaderboard of the computed models for the metric correlation

Clustering

Image of the leaderboard of the computed models for the metric silhouette

Clustering

Image of the leaderboard of the computed models for the metric inertia

Clustering

Image of the leaderboard of the computed models for the metric clusters number

Placeholders related to the DSS configuration:

Name

Compatibility

Description

config.author.name

All

Name of the user that launched the generation

config.author.email

All

E-mail of the user that launched the generation

config.environment.name

All

Name of the DSS environment

config.dss.version

All

Current version of DSS

config.is_saved_model.value

All

Yes when documenting a model exported into the Flow, No otherwise

config.generation_date.name

All

Generation date of the output document

All

URL to access the project

config.output_file.name

All

Name of the generated file

### List of conditional placeholders¶

Placeholders that can be used as conditional placeholders:

• dataset.prepare_steps.status

• design.visual_analysis.name

• design.algorithm.name

• design.target.name

• design.features_count.value

• design.model_type.name

• design.prediction_type.name

• design.training_and_validation_strategy.policy.value

• design.sampling_method.value

• design.k_fold_cross_testing.value

• design.feature_generation_pairwise_linear.value

• design.feature_generation_pairwise_polynomial.value

• design.feature_generation_explicit_pairwise.status

• design.feature_reduction.name

• design.chosen_algorithm_search_strategy.text

• design.cross_validation_strategy.value

• design.dimensionality_reduction.status

• design.outliers_detection.status

• design.weighting_strategy.method.name

• design.weighting_strategy.variable.name

• design.calibration_strategy.name

• design.backend.name

• design.partitioned_model.status

• result.chosen_algorithm.name

• result.train_set.sample_rows_count.value

• result.target_value.positive_class.value

• result.target_value.negative_class.value

• result.threshold_metric.name

• result.classification_threshold.current.value

• result.classification_threshold.optimal.value

• result.decision_trees.status

• result.regression_coefficients.status

• result.feature_importance.status

• result.individual_explanations.status

• result.hyperparameter_search.status

• result.training_date.name

• config.author.name

• config.author.email

• config.environment.name

• config.dss.version

• config.is_saved_model.value

• config.generation_date.name

• config.output_file.name

### Deprecated placeholders¶

Name

Compatibility

Description

Replaced by

result.detailed_metrics.image

All

Detailed summary of the model computation

result.detailed_metrics.table

## Limitations¶

• You need to upgrade the table of contents manually after a document generation

• The Model Document Generator is currently not compatible with ensemble models

• The Model Document Generator is currently not compatible with Keras models

• The Model Document Generator is not compatible with plugin algorithms

• When generating a document from a visual analysis, the model document generator will not display outdated design information. To fix this issue, you can either run a new analysis or revert the design to match the analysis

• When generating a document from a saved model version, some information may be extracted from the original Visual Analysis Design settings. As a result, any placeholders starting with design will be empty when the Visual Analysis was modified or erased

• Each part of a conditional placeholder must necessarily be in a new line

• Table and conditional placeholders are not supported on headers and footers