Model Document Generator

You can use the Model Document Generator to create documentation associated with any trained model. It will generate a Microsoft Word™ .docx file that provide information regarding:

  • What the model does
  • How the model was built (algorithms, features, processing, …)
  • How the model was tuned
  • What are the model’s performances

It allows you to prove that you followed industry best practices to build your model.

Generate a model document

Note

To use this feature, the graphical export library must be activated on your DSS instance. Please check with your administrator.

Once a model has been trained, you can generate a document from it with the following steps:

  • Go to the trained model you wish to document (either a model trained in a Visual Analysis of the Lab or a version of a saved model deployed in the Flow)
  • Click the Actions button on the top-right corner
  • Select Export model documentation
  • On the modal dialog, select the default template or upload your own template, and click Export to generate the document
  • After the document is generated, you will be able to download your generated model document using the Download link

Warning

Any placeholders starting with the keyword design will be linked to the current status of your visual analysis. If you change parameters in your model, DSS will consider that placeholder values are outdated and will replace these placeholders with blanks in the generated documentation. If you delete your visual analysis and then generate a document from a saved model, any placeholders starting with the keyword design will not provide any result.

Custom templates

If you want the document generator to use your own template, you need to use Microsoft Word™ to create a .docx file.

You can create your base document as you would create a normal Word document, and add placeholders when you want to retrieve information from your model.

If you prefer to modify the default template, you can download the default template for prediction or the default template for clustering.

A placeholder is defined by a placeholder name inside two brackets, like {{design.prediction_type.name}}. The generator will read the placeholder and replace it with the value retrieved from the model.

There are multiples types of placeholders, which can produce text, an image, or a table.

../_images/mdg-text.png ../_images/mdg-image.png ../_images/mdg-table.png

You can format the style of a table placeholder by using a block placeholder and placing an empty table inside it.

../_images/mdg-table2.png

Conditional placeholders

If you want to build a single template able to handle several kinds of models, you may need to display information only when they are relevant. For example, you may want to describe what is a binary classification, but this description should only appear on your binary classification models. This can be achieved with a conditional placeholder.

A conditional placeholder contains three elements, each of them needs to be on a separate line:

  • a condition
  • a text to display if the condition is valid
  • a closing element

Example:

{{if design.prediction_type.name == Binary classification}}

The model {{design.prediction_type.name}} is a binary classification.

{{endif design.prediction_type.name}}
{{if design.prediction_type.name != Binary classification}}

The model {{design.prediction_type.name}} is a not binary classification.

{{endif design.prediction_type.name}}

The condition itself is composed of three parts:

  • a text placeholder
  • an operator (== or !=)
  • a reference value

Example:

{{if my.placeholder == myReferenceValue }}

During document generation, the placeholder is replaced by its value and compared to the reference value. If the condition is correct, the text is displayed, otherwise nothing will appear on the final document.

List of placeholders

Placeholders related to the dataset:

Name Compatibility Description
dataset.prepare_steps.table All Preparation steps used on the dataset
dataset.prepare_steps.status All Tell if preparation steps are used on the dataset

Placeholders related to the design phase of a model:

Name Compatibility Description
design.mltask.name All Name of the modeling task
design.visual_analysis.name All Name of the visual analysis
design.algorithm.name All Name of the algorithm used to compute the model
design.target.name Prediction Name of the target variable
design.features_count.value All Number of columns in the train set
design.model_type.name All Type of model (Prediction or Clustering)
design.prediction_type.name Prediction Type of prediction (Regression, Binary classification or Multi-class classification)
design.target_proportion.plot Classification Proportions of classes in the guess sample
design.training_and_testing_strategy.table Prediction Strategy used to train and test the model
design.training_and_testing_strategy.policy.value Prediction Name of the policy used to train and test the model
design.sampling_method.value Prediction Sampling method named used to train and test the model
design.k_fold_cross_testing.value Prediction Tell if the K-fold strategy was used to train and test the model
design.sampling_and_splitting.image Prediction Sampling and splitting strategy used to train and test the model
design.train_set.image Prediction Explicit train set strategy
design.test_set.image Prediction Explicit test set strategy
design.test_metrics.name Prediction Metric used to optimize model hyperparameters
design.input_feature.table All Summary of input features
design.feature_generation_pairwise_linear.value Prediction Display if pairwise linear combinations are used in the feature generation
design.feature_generation_pairwise_polynomial.value Prediction Display if pairwise polynomial combinations are used in the feature generation
design.feature_generation_explicit_pairwise.status Prediction Display if the model contains explicit pairwise interactions used in the feature generation
design.feature_generation_explicit_pairwise.table Prediction Display explicit pairwise interactions used in the feature generation
design.feature_reduction.image Prediction Reduction method applied to the model
design.feature_reduction.name Prediction Name of the feature reduction applied to the model
design.chosen_algorithm_search_strategy.table All List the parameters used to configure the chosen algorithm
design.chosen_algorithm_search_strategy.text All Description of the chosen algorithm
design.other_algorithms_search_strategy.table All Parameters and description of the other computed algorithms
design.hyperparameter_search_strategy.image Prediction Hyperparameter search strategy used to compute the model
design.hyperparameter_search_strategy.table Prediction Summary of the hyperparameter search strategy
design.cross_validation_strategy.value Prediction Name of cross-validation strategy used on hyperparameters
design.dimensionality_reduction.table Clustering Dimensionality reduction used to compute the model
design.dimensionality_reduction.status Clustering Tell if a dimensionality reduction was used to compute the model
design.outliers_detection.table Clustering Outliers detection strategy used to compute the model
design.outliers_detection.status Clustering Tell if an outliers detection strategy was used to compute the model
design.weighting_strategy.method.name Prediction Name of the weighting strategy
design.weighting_strategy.variable.name Prediction Name of the weighting strategy associated variable
design.weighting_strategy.text Prediction Description of a weighting strategy
design.calibration_strategy.name Classification Name of the probability calibration method
design.calibration_strategy.text Classification Description of a probability calibration strategy
design.backend.name All Name of the backend engine
design.partitioned_model.status All Tell if the current model is a partitioned model

Placeholders related to the result of a model computation:

Name Compatibility Description
result.chosen_algorithm.name All Name of the algorithm selected for the mode used in the current document generation
result.train_set.sample_rows_count.value All Number of rows in the train set
result.test_metrics.value Prediction Value of the test metrics
result.target_value.positive_class.value Binary classification Name of the target value representing the positive class
result.target_value.negative_class.value Binary classification Name of the target value representing the negative class
result.threshold_metric.name Binary classification Name of the threshold (cut off) metric
result.classification_threshold.current.value Binary classification Current value of the threshold metric
result.classification_threshold.optimal.value Binary classification Optimal value of the threshold metric
result.chosen_algorithm_details.image All Summary of the prediction computation
result.cluster_summary.image Clustering Summary of the clustering computation
result.cluster_description.image Clustering Description of each category of the clustering
result.partitioned.summary.image Prediction Summary of the partitioned model execution
result.decision_trees.image Prediction Random forest decision tree visualisation
result.decision_trees.status All Tell if the model is compatible with a decision tree visualisation
result.regression_coefficients.image Regression and binary classification Regression coefficient chart
result.regression_coefficients.status All Tell if the model has a regression coefficient chart
result.feature_importance.plot All Feature importance chart
result.feature_importance.status All Does the model had a feature importance chart
result.individual_explanations.image Prediction Individual explanation chart
result.individual_explanations.text Prediction Description of the individual explanation chart
result.individual_explanations.status All Tell if the model contains an individual explanation chart
result.hyperparameter_search.plot Prediction Hyperparameter search results as a chart
result.hyperparameter_search.table Prediction Displays data associated to the hyperparameter search as an table
result.hyperparameter_search.status All Tell if the model had a hyperparameter search results chart
result.confusion_matrix.image Classification Confusion matrix results as a table
result.confusion_matrix_table.text Binary classification Description of the confusion matrix
result.confusion_matrix_metrics.plot Classification Metrics associated to the confusion matrix
result.confusion_matrix_metrics.text Classification Description of the different metrics
result.cost_matrix.image Binary classification Cost matrix as a table
result.cost_matrix.text Binary classification Description of the cost matrix
result.decision_chart.plot Binary classification Decision chart
result.decision_chart.text Binary classification Description of the decision chart
result.lift_curve.plot Binary classification Lift curve charts
result.lift_curve.text Binary classification Description of the lift curve charts
result.calibration.plot Classification Calibration curve chart
result.calibration.text Classification Description of the calibration curve chart
result.roc_curve.plot Classification ROC curve chart
result.roc_curve.text Classification Description of the ROC curve chart
result.density_chart.plot Classification Density chart
result.density_chart.text Classification Description of the density chart
result.hierarchy.plot Clustering Hierarchy of the interactive clustering model
result.anomalies.plot Clustering Anomalies detected displayed as a heatmap
result.cluster_heat_map.plot Clustering Features heatmap
result.scatter.plot Regression and clustering Scatter plot chart
result.error_distribution.table Regression Error distribution as a table
result.error_distribution.plot Regression Error distribution as a chart
result.error_distribution.text Regression Description of the error distribution
result.detailed_metrics.image All Detailed summary of the model computation
result.timings.table All Time of the different execution steps
result.full_log.link All URL to download the model’s logs
result.training_date.name All Date of the training
result.leaderboard.image All Image of the default leaderboard of the computed models
result.leaderboard.accuracy.image Classification Image of the leaderboard of the computed models for the metric accuracy
result.leaderboard.precision.image Classification Image of the leaderboard of the computed models for the metric precision
result.leaderboard.recall.image Classification Image of the leaderboard of the computed models for the metric recall
result.leaderboard.f1.image Classification Image of the leaderboard of the computed models for the metric F1 score
result.leaderboard.cost_matrix.image Classification Image of the leaderboard of the computed models for the metric cost matrix gain
result.leaderboard.log_loss.image Classification Image of the leaderboard of the computed models for the metric log loss
result.leaderboard.roc_auc.image Classification Image of the leaderboard of the computed models for the metric ROC AUC
result.leaderboard.calibration_loss.image Binary classification Image of the leaderboard of the computed models for the metric calibration loss
result.leaderboard.lift.image Binary classification Image of the leaderboard of the computed models for the metric lift
result.leaderboard.evs.image Regression Image of the leaderboard of the computed models for the metric EVS
result.leaderboard.mape.image Regression Image of the leaderboard of the computed models for the metric MAPE
result.leaderboard.mae.image Regression Image of the leaderboard of the computed models for the metric MAE
result.leaderboard.mse.image Regression Image of the leaderboard of the computed models for the metric MSE
result.leaderboard.rmse.image Regression Image of the leaderboard of the computed models for the metric RMSE
result.leaderboard.rmsle.image Regression Image of the leaderboard of the computed models for the metric RMSLE
result.leaderboard.r2.image Regression Image of the leaderboard of the computed models for the metric R2 score
result.leaderboard.correlation.image Regression Image of the leaderboard of the computed models for the metric correlation
result.leaderboard.silhouette.image Clustering Image of the leaderboard of the computed models for the metric silhouette
result.leaderboard.inertia.image Clustering Image of the leaderboard of the computed models for the metric inertia
result.leaderboard.clusters.image Clustering Image of the leaderboard of the computed models for the metric clusters number

Placeholders related to the DSS configuration:

Name Compatibility Description
config.author.name All Name of the user that launched the generation
config.author.email All E-mail of the user that launched the generation
config.environment.name All Name of the DSS environment
config.dss.version All Current version of DSS
config.is_saved_model.value All Yes when documenting a model exported into the Flow, No otherwise
config.generation_date.name All Generation date of the output document
config.project.link All URL to access the project
config.output_file.name All Name of the generated file

List of conditional placeholders

Placeholders that can be used as conditional placeholders:

  • dataset.prepare_steps.status
  • design.mltask.name
  • design.visual_analysis.name
  • design.algorithm.name
  • design.target.name
  • design.features_count.value
  • design.model_type.name
  • design.prediction_type.name
  • design.training_and_validation_strategy.policy.value
  • design.sampling_method.value
  • design.k_fold_cross_testing.value
  • design.feature_generation_pairwise_linear.value
  • design.feature_generation_pairwise_polynomial.value
  • design.feature_generation_explicit_pairwise.status
  • design.feature_reduction.name
  • design.chosen_algorithm_search_strategy.text
  • design.cross_validation_strategy.value
  • design.dimensionality_reduction.status
  • design.outliers_detection.status
  • design.weighting_strategy.method.name
  • design.weighting_strategy.variable.name
  • design.calibration_strategy.name
  • design.backend.name
  • design.partitioned_model.status
  • result.chosen_algorithm.name
  • result.train_set.sample_rows_count.value
  • result.target_value.positive_class.value
  • result.target_value.negative_class.value
  • result.threshold_metric.name
  • result.classification_threshold.current.value
  • result.classification_threshold.optimal.value
  • result.decision_trees.status
  • result.regression_coefficients.status
  • result.feature_importance.status
  • result.individual_explanations.status
  • result.hyperparameter_search.status
  • result.training_date.name
  • config.author.name
  • config.author.email
  • config.environment.name
  • config.dss.version
  • config.is_saved_model.value
  • config.generation_date.name
  • config.output_file.name

Limitations

  • You need to upgrade the table of content manually after a document generation
  • The Model Document Generator is currently not compatible with ensemble models
  • The Model Document Generator is currently not compatible with Keras models
  • The Model Document Generator is currently not compatible with Vertica models
  • The Model Document Generator is not compatible with plugin algorithms
  • When generating a document from a visual analysis, the model document generator will not display outdated design information. To fix this issue, you can either run a new analysis or revert the design to match the analysis
  • When generating a document from a saved model version, some information may be extracted from the original Visual Analysis Design settings. As a result, any placeholders starting with design will be empty when the Visual Analysis was modified or erased
  • Each part of a conditional placeholder must necessary be in a new line
  • Table and conditional placeholders are not supported on headers and footers