Running on Google Kubernetes Engine¶
You can use container execution on Google Kubernetes Engine (GKE), Google's fully managed Kubernetes solution.
Setup¶
Create your GKE cluster¶
Follow Google documentation on how to create your GKE cluster. We recommend that you allocate at least 15 GB of memory for each cluster node. More memory may be required if you plan on running very large in-memory recipes.
You’ll be able to configure the memory allocation for each container and per-namespace using multiple container execution configurations.
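As an illustration, a minimal cluster creation command could look like the following (the cluster name and zone are placeholders; n1-standard-4 nodes provide 15 GB of memory each, matching the recommendation above):

```
# Create a small GKE cluster; n1-standard-4 nodes have 15 GB of memory each
gcloud container clusters create my-cluster \
  --zone us-central1-a \
  --machine-type n1-standard-4 \
  --num-nodes 3
```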
Prepare your local docker and kubectl commands¶
Follow Google documentation to make sure that:
- Your local (on the DSS machine) kubectl command can interact with the cluster. As of July 2018, this implies running gcloud container clusters get-credentials <cluster_id>
- Your local (on the DSS machine) docker command can successfully push images to the gcr.io repository. As of July 2018, this implies running gcloud auth configure-docker
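For example, assuming a cluster named my-cluster in zone us-central1-a (placeholder names), the setup and a quick connectivity check might look like:

```
# Fetch credentials so that kubectl can reach the cluster
gcloud container clusters get-credentials my-cluster --zone us-central1-a

# Let docker authenticate to gcr.io with your gcloud credentials
gcloud auth configure-docker

# Quick sanity check: list the cluster nodes
kubectl get nodes
```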
Create the execution configuration¶
Build the base image as indicated in Setting up.
In Administration > Settings > Container exec, add a new execution config of type “Kubernetes”
On GKE, there is only a single shared image repository URL, gcr.io. Access control is based on image names. Therefore, the repository URL to use is gcr.io/GCP_project_name.

For example, if your GCP project is called my-gke-project, use gcr.io/my-gke-project as the repository URL.
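Purely to illustrate this naming scheme (the image name dku-exec-base below is hypothetical), tagging and pushing an image to the project-scoped repository looks like:

```
# Tag a locally built image for the project-scoped gcr.io repository
docker tag dku-exec-base:latest gcr.io/my-gke-project/dku-exec-base:latest

# Push it so that GKE nodes can pull it
docker push gcr.io/my-gke-project/dku-exec-base:latest
```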
You’re now ready to run recipes and models on GKE.
Using GPUs¶
Google Cloud Platform provides GPU-enabled instances with NVIDIA GPUs. Several steps are required to use them for container execution.
Build a CUDA-enabled base image¶
The base image that is built by default (see Setting up) does not have CUDA support and cannot use NVIDIA GPUs.
You need to build a CUDA-enabled base image.
Then create a new container configuration dedicated to running GPU workloads. If you specified a tag for the base image, report it in the “Base image tag” field.
Create a cluster with GPUs¶
Follow Google documentation for how to create a cluster with GPU accelerators (note: you can also add a GPU-enabled node pool to an existing cluster).
Don’t forget to run the “daemonset” installation procedure, which installs the NVIDIA drivers on the nodes. This procedure needs several minutes to complete.
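As a sketch (cluster and pool names are placeholders, the accelerator type is only an example, and the daemonset manifest URL is taken from Google’s documentation at the time of writing and may change):

```
# Add a GPU node pool to an existing cluster
gcloud container node-pools create gpu-pool \
  --cluster my-cluster --zone us-central1-a \
  --accelerator type=nvidia-tesla-k80,count=1 \
  --num-nodes 1

# Install the NVIDIA driver daemonset (needs several minutes to complete)
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml
```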
Add a custom reservation¶
In order for your container execution to be scheduled on nodes with GPU accelerators, and for GKE to configure the CUDA driver on your containers, the corresponding GKE pods must be created with a custom “limit” (in Kubernetes parlance) indicating that you need a specific type of resource (the standard resource types are CPU and memory).
You must configure this limit in the container execution configuration:
- In the “Custom limits” section, add a new entry with key nvidia.com/gpu and value 1 (to request 1 GPU)
- Don’t forget to actually add the new entry and save the settings
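Once the driver daemonset has finished running, you can check that the GPU resource is visible to Kubernetes; a quick way to do so (assuming kubectl is configured as above):

```
# GPU nodes should report a non-zero nvidia.com/gpu allocatable resource
kubectl describe nodes | grep -i "nvidia.com/gpu"
```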
Deploy¶
You can now deploy your GPU-requiring recipes and models.