Using unmanaged AKS clusters
If you already have an Azure Container Registry (ACR) up and ready, you can skip this section and go to Create your AKS cluster.
Otherwise, follow the Azure documentation on how to create your ACR registry.
We recommend that you pay extra attention to the Azure container registry pricing plan, as it is directly related to the registry storage capacity.
To create your Azure Kubernetes Service (AKS) cluster, follow the Azure documentation on how to create your AKS cluster. We recommend that you allocate at least 16 GB of memory for each cluster node.
Once the cluster is created, you must modify its IAM credentials to grant it access to ACR (Kubernetes secret mode is not supported). This is required for the worker nodes to pull images from the registry.
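As a minimal sketch of this step, assuming a recent Azure CLI and a logged-in account with sufficient rights on both resources, the --attach-acr option of az aks update is one way to grant the cluster pull access to a registry. The function name attach_acr is hypothetical, and your-rg, your-cluster-name and your-registry-name are placeholders for your own resources. Check the Azure documentation for the method that matches your CLI version.

```shell
# Hypothetical helper: grants an existing AKS cluster pull access to an ACR registry.
# Arguments: $1 resource group, $2 AKS cluster name, $3 ACR registry name.
attach_acr() {
  if [ "$#" -ne 3 ]; then
    echo "usage: attach_acr RESOURCE_GROUP CLUSTER_NAME REGISTRY_NAME" >&2
    return 2
  fi
  # Assumption: the installed Azure CLI supports --attach-acr on "az aks update".
  az aks update --resource-group "$1" --name "$2" --attach-acr "$3"
}

# Example call (placeholders, replace with your own values):
# attach_acr your-rg your-cluster-name your-registry-name
```

The call is left commented out so that sourcing the snippet has no side effects.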
Follow the Azure documentation to ensure the following on your local machine (where Dataiku DSS is installed):
- The az command is properly logged in. As of October 2019, this implies running the following command:
az login --service-principal --username client_id --password client_secret --tenant tenant_id
You must use a service principal that has sufficient IAM permissions to write to ACR and full control on AKS.
- The docker command can successfully push images to the ACR repository. As of October 2019, this implies running the following command:
az acr login --name your-registry-name
- The kubectl command can interact with the cluster. As of October 2019, this implies running the following command:
az aks get-credentials --resource-group your-rg --name your-cluster-name
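The three commands above can be chained in a small script. The following is a sketch only: the function name dss_aks_login is hypothetical, all arguments are placeholders, and the final kubectl get nodes call is an added sanity check that is not part of the original procedure.

```shell
# Hypothetical helper: prepares az, docker and kubectl on the DSS machine.
# Arguments: $1 client id, $2 client secret, $3 tenant id,
#            $4 ACR registry name, $5 resource group, $6 AKS cluster name.
dss_aks_login() {
  if [ "$#" -ne 6 ]; then
    echo "usage: dss_aks_login CLIENT_ID CLIENT_SECRET TENANT_ID REGISTRY RESOURCE_GROUP CLUSTER" >&2
    return 2
  fi
  # 1. Log the az command in with the service principal.
  az login --service-principal --username "$1" --password "$2" --tenant "$3" || return 1
  # 2. Let the docker command push images to the ACR registry.
  az acr login --name "$4" || return 1
  # 3. Fetch cluster credentials so that kubectl can reach the cluster.
  az aks get-credentials --resource-group "$5" --name "$6" || return 1
  # 4. Added check (assumption): listing the nodes confirms kubectl can talk to the cluster.
  kubectl get nodes
}

# Example call (placeholders, replace with your own values):
# dss_aks_login client_id client_secret tenant_id your-registry-name your-rg your-cluster-name
```

Each step stops the function on failure, so a non-zero return status points at the first command that did not succeed.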
Go to Administration > Settings > Containerized execution, and add a new execution configuration of type “Kubernetes”.
- In particular, to set up the image registry, the URL must be of the form
- Finish by clicking Push base images.
You’re now ready to run recipes, notebooks and ML models in AKS.
Azure provides GPU-enabled instances with NVIDIA GPUs. Several steps are required in order to use them for containerized execution.
The base image that is built by default (see setup) does not have CUDA support and cannot use NVIDIA GPUs. You need to build a CUDA-enabled base image (see Setting up (Kubernetes)).
Create a new containerized execution configuration dedicated to running GPU workloads. If you specified a tag for the base image, enter it in the “Base image tag” field.
In order for your container execution to be located on nodes with GPU accelerators, and for AKS to configure the CUDA driver on your containers, the corresponding AKS pods must be created with a custom “limit” (in Kubernetes parlance) to indicate that you need a specific type of resource (standard resource types are CPU and Memory). Also, NVIDIA drivers should be mounted in the containers.
To do so:
- in the “Custom limits” section, add a new entry with key:
1 (to request 1 GPU). Make sure the new entry is actually added.
- in “HostPath volume configuration”, mount
Make sure the new entry is actually added, then save the settings.
Follow the Azure documentation on how to create a cluster with GPU accelerators.
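Once the cluster exists, it can help to confirm that its nodes actually advertise a GPU resource before running workloads. The sketch below assumes kubectl is already configured for the cluster. The function name list_node_allocatable is hypothetical. It prints each node name followed by its allocatable resources, so you can see under which resource name, if any, the GPUs are exposed.

```shell
# Hypothetical helper: prints one line per node with its allocatable resources.
# Assumption: kubectl is already pointed at the AKS cluster.
list_node_allocatable() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable}{"\n"}{end}'
}

# Example call:
# list_node_allocatable
```

A GPU node that is set up correctly should list a GPU entry next to the standard cpu and memory resources. If none appears, the GPU device plugin is probably not running on that node.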