You can initialize model training with weights from another model, in order to:
- Update a pre-trained model via transfer learning
- Fine-tune a model previously trained with a higher learning rate
To do so, Keras provides the load_model function and the load_weights method to retrieve previously saved models or weights.
DSS provides methods to retrieve previously trained models, or their location, either from an ML task or from a saved model:
```python
# in dataiku.doctor.deep_learning.load_model
get_keras_model_from_trained_model(session_id=None, analysis_id=None, mltask_id=None)
get_keras_model_location_from_trained_model(session_id=None, analysis_id=None, mltask_id=None)
get_keras_model_from_saved_model(saved_model_id)
get_keras_model_location_from_saved_model(saved_model_id)
```
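For instance, here is a minimal sketch of weight reuse, assuming the location returned by get_keras_model_location_from_saved_model points to a weights file that load_weights can read; the saved model id and the toy architecture are placeholders, and the architecture must match the model whose weights are loaded:

```python
from keras.models import Sequential
from keras.layers import Dense

from dataiku.doctor.deep_learning.load_model import get_keras_model_location_from_saved_model

# Toy architecture; it must match the saved model whose weights are loaded
model = Sequential([
    Dense(64, activation="relu", input_shape=(10,)),
    Dense(1, activation="sigmoid"),
])

# "my_saved_model_id" is a placeholder for a real saved model id
weights_location = get_keras_model_location_from_saved_model("my_saved_model_id")
model.load_weights(weights_location)
```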
Data is preprocessed in batches rather than all at once, which helps prevent excessive memory use, in particular for text and images, which are memory-intensive.
DSS preprocesses the data and produces two sequences, train and validation (what DSS usually calls the test set is called the validation set in Keras terminology), according to the chosen batch size, then calls fit_generator. You can customize how this process is done.
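To make the sequence notion concrete, here is an illustrative keras.utils.Sequence subclass (a toy example, not the code DSS generates) that serves numpy arrays batch by batch:

```python
import numpy as np
from keras.utils import Sequence

class ArraySequence(Sequence):
    """Toy Sequence serving (X, y) pairs in batches."""

    def __init__(self, X, y, batch_size):
        self.X, self.y, self.batch_size = X, y, batch_size

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(len(self.X) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        return self.X[batch], self.y[batch]
```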
The Advanced mode for training (accessible by clicking on the top right of the analysis) allows you to modify the preprocessed data that DSS sends to the model, and to customize the parameters of the call to fit_generator. The two main use cases for the Advanced mode are:
- data augmentation
- using custom Callbacks
You need to fill in two methods: build_sequence and fit_model.
The build_sequence method should return the sequences that will be used to train the model. To do so, you have access to the helpers build_train_sequence_with_batch_size and build_validation_sequence_with_batch_size, functions that each return a sequence for a given batch_size.
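Below is a minimal sketch of build_sequence, assuming it receives the two helpers as arguments (the exact stub that DSS generates may differ):

```python
def build_sequence(build_train_sequence_with_batch_size,
                   build_validation_sequence_with_batch_size):
    batch_size = 16  # arbitrary value for illustration
    train_sequence = build_train_sequence_with_batch_size(batch_size)
    validation_sequence = build_validation_sequence_with_batch_size(batch_size)
    return train_sequence, validation_sequence
```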
You can then modify these sequences at will before training. In particular, you may want to perform some data augmentation. DSS provides a helper to do so, which looks like:
```python
from dataiku.doctor.deep_learning.sequences import DataAugmentationSequence
from keras.preprocessing.image import ImageDataGenerator

original_batch_size = 8
train_sequence = build_train_sequence_with_batch_size(original_batch_size)

augmentator = ImageDataGenerator(
    zoom_range=0.2,
    shear_range=0.5,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)

augmented_sequence = DataAugmentationSequence(
    train_sequence,
    "image_name_preprocessed",
    augmentator,
    n_augmentation=3
)
```
- image_name_preprocessed is the name of the input to augment
- n_augmentation is the number of times the sequence is augmented
ImageDataGenerator is a helper provided by Keras to perform data augmentation on images.
You can provide your own augmentator. It must be an instance of a class that implements a random_transform method that has the following signature:
```python
def random_transform(x, seed=None):
    # returns a numpy array with the same shape as x
```
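For example, here is a toy custom augmentator that adds Gaussian noise; the class name and noise level are made up, and only the random_transform contract matters:

```python
import numpy as np

class NoiseAugmentator(object):
    """Toy augmentator: perturbs each sample with Gaussian noise."""

    def __init__(self, stddev=0.05):
        self.stddev = stddev

    def random_transform(self, x, seed=None):
        rng = np.random.RandomState(seed)
        # Returns a numpy array with the same shape as x
        return x + rng.normal(0.0, self.stddev, size=x.shape)
```

An instance of this class could then replace the ImageDataGenerator in the DataAugmentationSequence call above.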
When you use data augmentation, be aware that the actual batch size of the augmented sequence will be original_batch_size * n_augmentation, so you may want to provide a smaller original_batch_size.
The method fit_model allows you to define custom Keras Callbacks. However, DSS expects that you will call fit_generator with train_sequence and validation_sequence.
As per the Keras documentation:

> A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training.
DSS builds a list of base_callbacks (to compute metrics, interrupt the model if requested in the UI, …) that must be included in the call to fit_generator. You are then free to add any custom callback to this list.
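As a sketch, assuming fit_model receives the model, the two sequences and base_callbacks as arguments (the stub that DSS generates may differ), a custom callback is appended like this:

```python
from keras.callbacks import LambdaCallback

def fit_model(model, train_sequence, validation_sequence, base_callbacks):
    # Illustrative custom callback: print the epoch number after each epoch
    log_epoch = LambdaCallback(
        on_epoch_end=lambda epoch, logs: print("finished epoch", epoch))
    model.fit_generator(train_sequence,
                        validation_data=validation_sequence,
                        epochs=10,
                        callbacks=base_callbacks + [log_epoch])
```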
Many built-in (or custom) Keras Callbacks require a metric to monitor, and their behavior depends on the value of this metric. For example, the EarlyStopping callback tracks a metric and stops the training, even if it is not finished, when it detects that the metric has stopped improving (a loss stops decreasing, or a score stops increasing).
Usually, you define the metrics you want to track in the metrics parameter of the compile function, and you can then retrieve them in the callbacks. In the DSS context, the software computes its own metrics through a base callback. The list of tracked metrics depends on the type of problem (Regression, Classification) and is the following:
- Regression
  - ‘R2 Score’
  - ‘Custom Score’
- Binary Classification
  - ‘F1 Score’
  - ‘Cost Matrix Gain’
  - ‘Log Loss’
  - ‘Cumulative Lift’
  - ‘ROC AUC’
  - ‘Custom score’
- Multiclass Classification
  - ‘F1 Score’
  - ‘Log Loss’
  - ‘ROC AUC’
  - ‘Custom score’
Since metrics are tracked on both the ‘Train’ and ‘Test’ sets, you need to prepend ‘Train ’ or ‘Test ’ to the name of the metric to obtain its full name.
Because these metrics are computed in a base callback, a custom callback that uses them must be placed after the list of base_callbacks provided by DSS, in the list that you pass to fit_generator.
For example, in a binary classification problem, if you want to introduce an early stopping callback monitoring ROC AUC, you can add the following callback to this list:
```python
from keras.callbacks import EarlyStopping

early_stopping_callback = EarlyStopping(monitor="Test ROC AUC",
                                        mode="max",
                                        min_delta=0,
                                        patience=2)
```
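Inside fit_model, this callback would then go after the base callbacks, for instance (reusing the assumed fit_model arguments from the sketch above):

```python
model.fit_generator(train_sequence,
                    validation_data=validation_sequence,
                    epochs=10,
                    callbacks=base_callbacks + [early_stopping_callback])
```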
DSS also provides a helper to retrieve, in the code, the name of the metric used to optimize the model, along with whether it is a loss (lower is better) or a score (greater is better). You can access those variables with:
```python
from dataiku.doctor.deep_learning.shared_variables import get_variable

metric_to_monitor = get_variable("DKU_MODEL_METRIC")
greater_is_better = get_variable("DKU_MODEL_METRIC_GREATER_IS_BETTER")
```
and the previous early stopping callback becomes:
```python
from dataiku.doctor.deep_learning.shared_variables import get_variable
from keras.callbacks import EarlyStopping

metric_to_monitor = get_variable("DKU_MODEL_METRIC")
greater_is_better = get_variable("DKU_MODEL_METRIC_GREATER_IS_BETTER")

early_stopping_callback = EarlyStopping(monitor=metric_to_monitor,
                                        mode="max" if greater_is_better else "min",
                                        min_delta=0,
                                        patience=2)
```