When creating a Deep Learning model, you need to define the architecture of the neural network. To do so, you must fill in two Python functions defined in the “Architecture” tab of the settings.
Build Keras model
The build_model function must return an instance of the Keras Model class.
This function takes two parameters:
- input_shapes is a dictionary of the shapes of the input tensors
- n_classes is the number of classes to predict (for classification models only)
Advanced models can have multiple inputs (see Multiple inputs for more information). Each input has a name.
input_shapes is a dictionary indexed by input name. In most cases, the shape of the inputs is not known before preprocessing, which creates a variable number of columns (through dummification, vectorization, etc.), so the shapes are provided to you to build your model.
If you haven’t used multi-input features, you only have a main input. Thus, to know the shape of your input tensor, simply use:
    input_main = Input(shape=input_shapes["main"])
    x = Dense(64, activation="relu")(input_main)
    ...
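To see why the input shape is only known after preprocessing, here is a toy sketch of how dummification changes the number of columns. The helper `dummified_width` and the sample rows are hypothetical and only illustrate the idea; this is not the preprocessing code DSS runs, and the tuple form of the shape is an assumption.

```python
def dummified_width(rows, numeric_cols, categorical_cols):
    """Count the columns left after one-hot encoding the categorical columns.

    Hypothetical helper for illustration only, not part of DSS.
    """
    width = len(numeric_cols)
    for col in categorical_cols:
        # One new column per distinct category value seen in the data.
        width += len({row[col] for row in rows})
    return width


# Made-up sample data: one numeric column, one categorical column.
rows = [
    {"age": 31, "color": "red"},
    {"age": 45, "color": "blue"},
    {"age": 27, "color": "green"},
]

n_features = dummified_width(rows, ["age"], ["color"])

# Assumed form of the dictionary passed to build_model: one entry per input name.
input_shapes = {"main": (n_features,)}
print(input_shapes)  # {'main': (4,)}
```

The width depends on the categories actually present in the data, which is why the shape cannot be hard-coded in the architecture.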
n_classes: for multiclass classification problems, you may not know the number of target classes in advance, so it is provided to you as well.
You need to be careful with the dimension of the last layer of your network, as it determines the output of the model.
- For regression, the last layer must have a dimension equal to 1, and it should not have any activation unless the variable to predict has special properties
- For binary classification, the last layer should have either a dimension equal to 1 with a sigmoid activation, or a dimension equal to 2 with a softmax activation
- For multiclass classification, the last layer should have a dimension equal to the number of target classes, and a softmax activation.
If this is not respected, training will either fail (mismatch in dimension) or give inconsistent results (if the activation is not appropriate, the output may not be a probability distribution).
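The rules above can be sketched as a small lookup that feeds the last layer of build_model. This is a minimal sketch, not the DSS default architecture: the `output_layer_spec` helper and its `task` labels are hypothetical, the hidden layer size of 64 is taken from the snippet above, and the example hard-codes a multiclass problem.

```python
def output_layer_spec(task, n_classes=None):
    """Return (units, activation) for the last Dense layer.

    `task` is a hypothetical label used for this sketch, not a DSS parameter.
    """
    if task == "regression":
        return 1, None  # one output, no activation
    if task == "binary":
        return 1, "sigmoid"  # the alternative is (2, "softmax")
    if task == "multiclass":
        if n_classes is None:
            raise ValueError("n_classes is required for multiclass models")
        return n_classes, "softmax"
    raise ValueError("unknown task: %r" % (task,))


def build_model(input_shapes, n_classes=None):
    # Imported here so that output_layer_spec stays usable without Keras.
    from keras.layers import Input, Dense
    from keras.models import Model

    # Assumption for this sketch: a multiclass problem with a single "main" input.
    units, activation = output_layer_spec("multiclass", n_classes)

    input_main = Input(shape=input_shapes["main"])
    x = Dense(64, activation="relu")(input_main)
    predictions = Dense(units, activation=activation)(x)

    return Model(inputs=input_main, outputs=predictions)
```

For a regression or binary model, only the arguments passed to the last Dense layer change; the rest of the function stays the same.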
Compile the model
The compile_model function takes the previously created model as input and must compile it. The reason it is separated from the build_model function is that DSS may need to manipulate the model between creation and compilation, in particular for multi-GPU training.
The compile function usually takes a list of metrics to track during training. In the DSS context, it is not necessary to specify them, because DSS already computes the metrics listed in the “Metrics” tab, and they will be available in TensorBoard afterwards.
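As an illustration, a compile_model function can be as short as the sketch below. The optimizer and loss values are only examples and should be chosen to match your problem, and returning the model at the end is an assumption of this sketch.

```python
def compile_model(model):
    # No `metrics` argument: the metrics from the "Metrics" tab are
    # computed by DSS, so they do not need to be listed here.
    model.compile(
        optimizer="rmsprop",              # example value
        loss="categorical_crossentropy",  # example value, suited to a softmax output
    )
    # Assumption: the compiled model is returned to the caller.
    return model
```

With a single sigmoid output, a loss such as "binary_crossentropy" would be the usual Keras choice; for regression, "mse" is a common one.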
The two arguments that need to be carefully filled in are: