Running on Google Kubernetes Engine

You can use Google Kubernetes Engine (GKE), Google’s fully managed Kubernetes solution, as a target for container execution.

Setup

Create your GKE cluster

Follow the Google documentation on how to create your GKE cluster. We recommend that you allocate at least 15 GB of memory for each cluster node. More memory may be required if you plan to run very large in-memory recipes.
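For example, a minimal cluster creation command could look like the following. The cluster name, zone and node count are hypothetical placeholders; n1-standard-4 machines provide 15 GB of memory per node.

# Hypothetical example: adjust the name, zone and sizing to your needs
# n1-standard-4 nodes have 4 vCPUs and 15 GB of memory each
gcloud container clusters create my-dss-cluster \
    --zone europe-west1-b \
    --machine-type n1-standard-4 \
    --num-nodes 3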

You’ll be able to configure the memory allocated to each container, and to do so per namespace, by using multiple container execution configurations.

Prepare your local docker and kubectl commands

Follow the Google documentation to make sure that:

  • Your local (on the DSS machine) kubectl command can interact with the cluster. As of July 2018, this implies running gcloud container clusters get-credentials <cluster_id>
  • Your local (on the DSS machine) docker command can successfully push images to the gcr.io repository. As of July 2018, this implies running gcloud auth configure-docker (see the combined example after this list)
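Put together, the setup and a quick sanity check could look like this. The cluster name and zone are hypothetical placeholders.

# Hypothetical example: point kubectl at your cluster and set up docker credentials
gcloud container clusters get-credentials my-dss-cluster --zone europe-west1-b
gcloud auth configure-docker

# Sanity check: the cluster nodes should be listed
kubectl get nodes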

Create the execution configuration

Build the base image as indicated in Setting up.

In Administration > Settings > Container exec, add a new execution config of type “Kubernetes”.

On GKE, there is only a single shared image repository URL, gcr.io. Access control is based on image names. Therefore, the repository URL to use is gcr.io/GCP_project_name.

For example, if your GCP project is called my-gke-project, use gcr.io/my-gke-project as the repository URL.
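If you are unsure of the project name to use, you can ask the gcloud CLI for the currently configured project (assuming gcloud is set up on the DSS machine):

# Prints the GCP project currently configured in gcloud
gcloud config get-value project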

You’re now ready to run recipes and models on GKE.

Using GPUs

Google Cloud Platform provides instances with NVIDIA GPUs. Several steps are required in order to use them for container execution.

Build a CUDA-enabled base image

The base image that is built by default (see Setting up) does not have CUDA support and cannot use NVIDIA GPUs.

You need to build a CUDA-enabled base image. We recommend that you give this image a specific tag, keep the default base image “pristine”, and include the DSS version number in the image tag.

From the DSS data directory, run

# -c 1 enables CUDA
# -t TAG sets the generated image tag

./bin/dssadmin build-container-exec-base-image -c 1 -t dataiku-exec-base-cuda:X.Y.Z

where X.Y.Z is your DSS version number.
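You can then check that the image was built with the expected tag, for instance:

# Lists local images in the dataiku-exec-base-cuda repository
docker images dataiku-exec-base-cuda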

Warning

After each upgrade of DSS, you must rebuild all base images.

Then create a new container execution configuration dedicated to running GPU workloads, and in the “Base image tag” field, enter dataiku-exec-base-cuda:X.Y.Z.

Note

This image contains CUDA 9.0 and cuDNN 7.0. If you need other CUDA versions, you’ll need to create a custom image. See custom-base-images.

Create a cluster with GPUs

Follow the Google documentation on how to create a cluster with GPU accelerators. (Note: you can also create a GPU-enabled node pool in an existing cluster.)
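For example, adding a GPU node pool to an existing cluster could look like the following. The pool name, cluster name, zone, GPU type and sizing are hypothetical placeholders.

# Hypothetical example: adds a pool of nodes carrying one K80 GPU each
gcloud container node-pools create gpu-pool \
    --cluster my-dss-cluster \
    --zone europe-west1-b \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --num-nodes 1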

Don’t forget to run the NVIDIA driver “DaemonSet” installation procedure. This procedure needs several minutes to complete.
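At the time of writing, the Google documentation installs the drivers by applying the following DaemonSet (check the Google documentation for the current URL):

# Installs the NVIDIA drivers on GPU nodes (Container-Optimized OS)
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml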

Add a custom reservation

In order for your container executions to be scheduled on nodes with GPU accelerators, and for GKE to configure the CUDA driver on your containers, the corresponding GKE pods must be created with a custom “limit” (in Kubernetes parlance) indicating that you need a specific type of resource (the standard resource types are CPU and memory).

You must configure this limit in the container execution configuration:

  • In the “Custom limits” section, add a new entry with key: nvidia.com/gpu and value: 1 (to request 1 GPU; see the check after this list)
  • Don’t forget to add the new entry and save the settings
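Once the GPU nodes are up and the drivers are installed, you can check that they expose the nvidia.com/gpu resource that this limit requests:

# Each GPU node should report nvidia.com/gpu in its capacity and allocatable resources
kubectl describe nodes | grep nvidia.com/gpu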

Deploy

You can now deploy your GPU-requiring recipes and models.