DSS can work with container engines in three ways:
- Parts of the processing tasks of the DSS design and automation nodes can run on one or several hosts, powered by Docker or Kubernetes. For more details, see the Running in containers section of the documentation.
- The DSS API node can run as multiple containers orchestrated by Kubernetes. For more details, see the API Node & API Deployer: Real-time APIs section.
- The entirety of a DSS design or automation node can run as a Docker container. For more details, see Running DSS as a Docker container.
In general, running Dataiku DSS as a container (either by running Docker directly, or through Kubernetes) is incompatible with the ability to leverage containers as a processing engine.
DSS can run certain kinds of processes in Kubernetes. These processes include the following:
- Python and R recipes
- Plugin-provided recipes that are written in Python or R
- Initial training of in-memory machine learning models (when using the “in-memory” engine, see In-memory Python (Scikit-learn / XGBoost))
- Retraining of in-memory machine learning models
- Scoring of in-memory machine learning models when NOT using the “Optimized engine” (see Scoring engines). The optimized engine can run on Spark.
- Evaluation of in-memory machine learning models
Running DSS processes in containers provides several key advantages, namely:
- Improved scalability: Containers provide the ability to scale the processing of “local” code beyond the single DSS design/automation node machine. This is especially true when using Kubernetes.
- Improved computing capabilities: Containers provide the ability to leverage processing nodes that may have different computing capabilities. In particular, you can leverage remote machines that provide GPUs, even though the DSS machine itself does not. This is especially useful for deep learning.
- Ease of resource management: You can restrict the use of resources, such as CPU and memory usage, globally (by using the resource management capabilities of Kubernetes) or individually for each container (e.g., if using Docker directly, you can specify those restrictions per container in DSS).
The base image for containers contains only the basic Python packages for DSS. If you need additional packages that were manually added to the built-in Python environment of DSS, we recommend that you use a code environment. You could also choose to use a custom base image.
- In code recipes and notebooks, using libraries from plugins is not supported in containers. For example, dataiku.import_from_plugin will not work.
- For deep learning models, if you run a GPU-enabled training in a container but the DSS server itself does not have a GPU or CUDA installed, TensorBoard visualization will not work, because it runs locally using the same code environment as the training. This does not prevent the training itself, only the TensorBoard visualization.
DSS can run its workloads either on Kubernetes or directly on Docker. We recommend using Kubernetes in most cases.
A Kubernetes setup offers a lot of flexibility by providing the following:
- A native ability to run on a cluster of machines. Depending on the available resources, Kubernetes automatically places containers on machines.
- An ability to globally control resource usage.
- A capability to auto scale (for managed cloud Kubernetes services).
DSS can natively manage cloud Kubernetes clusters for you (all large cloud providers offer managed Kubernetes services).
A Docker-only configuration is easier to set up, as any recent operating system comes with full Docker execution capabilities. However, Docker itself is single-machine: while DSS can leverage multiple Docker daemons, each workload must explicitly target a single machine.
With Docker, you can manage the resources used by each container, but you cannot globally restrict resources used by the sum of all containers (or all containers of a user).
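As a sketch of this per-container resource management, the snippet below builds a `docker run` invocation using the standard `--cpus` and `--memory` flags. The image name, limit values, and command are illustrative assumptions, not the exact invocation DSS generates; only the Docker flags themselves are standard.

```python
# Sketch: per-container resource limits with the Docker CLI.
# "dss-container-exec-base" and the limit values are hypothetical;
# --cpus and --memory are standard `docker run` flags.

def docker_run_command(image, cpus, memory, command):
    """Build a `docker run` invocation with per-container resource caps."""
    return [
        "docker", "run", "--rm",
        "--cpus", str(cpus),   # cap CPU usage for this container only
        "--memory", memory,    # cap RAM for this container only
        image,
    ] + command

cmd = docker_run_command("dss-container-exec-base", cpus=2, memory="4g",
                         command=["python", "recipe.py"])
print(" ".join(cmd))
```

Note that each such cap applies to one container in isolation, which is exactly why Docker alone cannot enforce a global ceiling across all containers.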
Each activity (such as recipes, machine learning models, etc.) that you run in containers targets a specific “Containerized execution configuration”.
Kubernetes execution configurations indicate:
- The base image to use
- The “context” for the kubectl command. This allows you to target multiple unmanaged Kubernetes clusters or to use multiple sets of Kubernetes credentials.
- Resource restriction keys (as specified by Kubernetes)
- The Kubernetes resource namespace for resource quota management
- The image registry URL
- Permissions — to restrict which user groups have the right to use a specific Kubernetes execution configuration
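To illustrate the namespace-level quota management mentioned above, here is a minimal Kubernetes ResourceQuota manifest expressed as a Python dict. The namespace, quota name, and limit values are assumptions for illustration; the apiVersion/kind/spec.hard structure is standard Kubernetes.

```python
# Sketch: a Kubernetes ResourceQuota for a namespace dedicated to
# containerized execution. Names and values are hypothetical.
resource_quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "dss-quota", "namespace": "dss-exec"},
    "spec": {
        "hard": {
            "requests.cpu": "8",        # total CPU requested across all pods
            "requests.memory": "32Gi",  # total memory requested
            "limits.cpu": "16",         # total CPU limit across the namespace
            "limits.memory": "64Gi",    # total memory limit
        }
    },
}
```

Applying such a quota to the namespace caps the aggregate resource usage of all containers placed there, which is the global control that a Docker-only setup lacks.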
Docker execution configurations indicate:
- The base image to use
- The host of the Docker daemon (by default, runs on the local Docker daemon)
- Resource restriction keys (as specified by Docker)
- Permissions — to restrict which user groups have the right to use a specific Docker execution configuration
- Optionally, the image registry URL
- Optionally, the Docker “runtime” (this is used for advanced use cases like GPUs)
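As a sketch of the “runtime” option, the snippet below selects an alternate Docker runtime via the standard `--runtime` flag. The runtime name "nvidia" is the one typically installed by the NVIDIA container runtime for GPU access; the image and command are hypothetical.

```python
# Sketch: selecting a GPU-capable Docker runtime instead of the
# default runc. "nvidia" and the image name are assumptions.

def docker_run_with_runtime(image, command, runtime="nvidia"):
    """Build a `docker run` invocation that targets a specific runtime."""
    return [
        "docker", "run", "--rm",
        "--runtime", runtime,  # e.g. the NVIDIA runtime for GPU workloads
        image,
    ] + command

cmd = docker_run_with_runtime("dss-container-exec-cuda",
                              ["python", "train.py"])
```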
DSS uses one or multiple Docker images that must be built prior to running any workload.
In most cases, you’ll have a single Docker base image that is used for all container-based executions. At build time, you can choose whether your image includes:
- R support
- CUDA support for execution on GPUs
For advanced use cases, you can build multiple base images. For example:
- One base image with CUDA support and one without
- An image with additional base system packages
Kubernetes execution capabilities are fully compatible with multiple managed code environments. You simply need to indicate for which containerized execution configuration(s) your code environment must be made available. For more information, see Using code envs with containerized execution.