Preparing Code Studio templates¶

Code Studio templates are prepared by the DSS administrator, in the Administration section of DSS. Permissions to use the templates can then be granted to groups.

Sample: deploying JupyterLab¶

In this sample, we will define a Code Studio template to run JupyterLab, in order to interactively edit and debug Python recipes, libraries, …

This assumes that you meet the prerequisites, notably that Elastic AI is set up and that a containerized execution configuration is available.

• In “Administration > Code Studios”, click Create Code Studio template and create a new template named jupyter-lab-template

• In the “Definition” tab, click Add a block and select JupyterLab server

• Click Build

In the “General” tab, you can then grant access to some (or all) DSS groups, to control which users are allowed to start JupyterLab runtimes. In their projects, the selected users can now create JupyterLab Code Studios.

Building blocks¶

Templates are built from blocks, each adding some configuration to the Code Studios. The blocks are applied sequentially, so blocks can rely on the modifications defined in previous blocks.

Standard blocks¶

Visual Studio Code¶

This block adds the VS Code IDE in the Code Studio. Each code env added by an “Add code environment” block is added to VS Code as a debug configuration and is available from the list of interpreters.

You usually do not need to change any setting of this block.

JupyterLab server¶

This block adds the JupyterLab IDE in the Code Studio. Each code env added by an “Add code environment” block is added as a kernel in JupyterLab.

You usually do not need to change any setting of this block.

RStudio¶

This block adds the RStudio Server IDE in the Code Studio.

You usually do not need to change any setting of this block.

The RStudio block is not available on Dataiku Online.

Streamlit¶

This block adds the Streamlit application building framework in the Code Studio and adds an entry point that runs the application. This allows you to both edit and run the application directly from the Code Studio.

The Streamlit application is automatically bootstrapped and can be edited directly from the “Code Studio Versioned Files”, in the streamlit/app.py file.

The “Add code environment” block installs the specified code environment in the designated location in the container.

• If your template contains JupyterLab, the code environment is also automatically registered as a Kernel.

• If your template contains VS Code, the code environment is also automatically registered as a run config (debug panel), and as an interpreter (bottom right menu) in VS Code

Append to DockerFile¶

A Code Studio template is primarily defined by the image that its container runs.

This block allows you to add arbitrary Dockerfile statements to the image. Each instance of this block appends to the Dockerfile of the image that is built.

The image starts from the DSS base image.

Resources listed in the block are copied to the Docker build context, next to the Dockerfile. You can use ${dku.install.dir} and ${dip.home} to point to the DSS installation directory and data directory respectively.

DSS defines the actual entry point of the container: a technical “runner.py” script that is part of the base DSS image. To start HTTP servers in the container, the template must define at least one entry point, using the Add an entry point block, for this technical script to launch. Each entry point can also declare a port on which its HTTP server is expected to listen. That HTTP server is then made available in the DSS UI. To forward the HTTP communication to the DSS UI:

• make sure Expose HTML service is checked. If left unchecked, the HTTP server is accessible by requesting its URL directly but not shown in the UI. The URL is built as http[s]://studio-host:port/dip/code-studios/<project-key>/<code-studio-id>/<exposed-port>/

• choose whether the HTTP server is launched when this template is used in a Webapp

• specify a label to use on tabs in the Code Studio’s “View”

• specify a proxied subpath that is given to a proxy_pass in an Nginx configuration. What to use depends on the capabilities of the HTTP server in the Code Studio, and in particular whether it’s able to handle a path prefix. A value of / works for most cases, but when the server can’t handle a path prefix, you should use $request_uri and force the server in the Code Studio to use a fixed prefix. DSS sets a $DKU_CODE_STUDIO_BROWSER_PATH_<exposed-port> variable in the Code Studio that the server entry point can use to set its path prefix.
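The fixed-prefix case described in the last bullet can be sketched as follows. This is a minimal illustration, not the way DSS or any bundled block implements it: it assumes an entry point that runs a plain Python HTTP server, that the proxied subpath is $request_uri so the server receives the full prefixed path, and that the exposed port is 8000 (a hypothetical value). The helper names browser_path_prefix, strip_prefix and serve are made up for this example.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

PORT = 8000  # hypothetical exposed port declared on the entry point


def browser_path_prefix(port, environ=os.environ):
    """Return the prefix announced for this port, or "" if the variable is unset."""
    return environ.get(f"DKU_CODE_STUDIO_BROWSER_PATH_{port}", "").rstrip("/")


def strip_prefix(path, prefix):
    """Map a prefixed request path to an application-local path."""
    if prefix and (path == prefix or path.startswith(prefix + "/")):
        path = path[len(prefix):]
    return path or "/"


class Handler(BaseHTTPRequestHandler):
    prefix = ""  # filled in by serve() before the server starts

    def do_GET(self):
        # Answer with the path as the application would see it.
        local_path = strip_prefix(self.path, self.prefix)
        body = f"local path: {local_path}\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=PORT):
    """Blocking call: what the entry point command would eventually run."""
    Handler.prefix = browser_path_prefix(port)
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

The entry point command would call serve(). Servers that accept a base-path or URL-prefix option can usually be given the value of the variable directly instead of stripping the prefix by hand.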

While the Code Studio is up, some predefined actions can be triggered inside the container by the user, from the Code Studio’s “Actions” tab. Each action is a command line.

When a Code Studio is created from the template, the template writer can add predefined files inside the code-studio-specific file zones and user-specific file zones on the DSS server’s filesystem. This can for example be used to provide an initial working version of code for a webapp. The files are only added upon creating the Code Studio, not when (re)starting it.

File synchronization¶

This special block defines which files are synchronized with the Code Studio and where. See Concepts for more details about the different file locations.

Each synchronization definition consists of:

• a “zone” of the DSS server’s filesystem

• optionally a sub-folder of that zone

• a target location in the container.

A synchronization can be made one-way by toggling the arrow; in that case, files are copied from the DSS server’s filesystem to the Code Studio, but not the other way around. The block also has a list of exclusions that defines which files in the container are excluded from synchronization. The exclusions follow the gitignore syntax, without the ! negation.
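To get a feel for how gitignore-style exclusions select files, the following sketch approximates the matching in plain Python. It is an illustration only and not the matcher DSS uses: it handles simple wildcards, names that match at any depth, and patterns anchored with a slash, but it ignores finer gitignore rules such as ** and treats a trailing slash like a plain name. The function name is_excluded is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_excluded(rel_path, patterns):
    """Return True if rel_path matches one of the gitignore-like patterns."""
    parts = PurePosixPath(rel_path).parts
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # blank lines and comments select nothing
        if pattern.startswith("!"):
            raise ValueError("negation is not part of the supported syntax")
        pattern = pattern.rstrip("/")  # simplification: directories treated as names
        if "/" in pattern:
            # Anchored pattern: compare against the leading path components.
            wanted = PurePosixPath(pattern.lstrip("/")).parts
            head = parts[:len(wanted)]
            if len(head) == len(wanted) and all(
                fnmatchcase(part, want) for part, want in zip(head, wanted)
            ):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Unanchored pattern: any path component may match.
            return True
    return False
```

For example, with the patterns *.pyc and node_modules/, the path src/app.py is kept while node_modules/pkg/index.js is excluded.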

Kubernetes parameters¶

This special block controls advanced settings.

The container in which the Code Studio runs is spawned as a pod in a single-replica Kubernetes deployment. A proper deployment needs a functioning readiness probe, and providing one is the main purpose of this block. The simplest option is to activate TCP probing; DSS then sets the deployment to probe the first exposed port of the container (see Readiness probes).
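Kubernetes performs the probing itself, so no code is needed in the template. The sketch below only illustrates what a TCP probe checks, namely that something accepts connections on the port, and can help when debugging a server that never becomes ready. The function names tcp_ready and wait_until_ready, and the default attempt count and delay, are arbitrary choices for this example.

```python
import socket
import time


def tcp_ready(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, etc.
        return False


def wait_until_ready(host, port, attempts=30, delay=1.0):
    """Poll the port until it accepts connections or attempts run out."""
    for _ in range(attempts):
        if tcp_ready(host, port):
            return True
        time.sleep(delay)
    return False
```

Note that a TCP probe succeeds as soon as the port is open, even if the HTTP server behind it is not yet able to answer requests.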

This block also tells DSS which URL to use to probe the readiness of the HTTP server inside the container. This probing is independent of the one done by Kubernetes to find out whether the deployment rollout is finished. If you use the JupyterLab, RStudio or Visual Studio Code blocks, this field is not necessary, because these blocks automatically adjust the readiness probe URL.

Finally, the block allows for defining additional exposed ports. However, these ports should preferably be defined from the Add an entry point block.

Environment variables in the pod¶

The pod running the Code Studio receives some parameters using environment variables:

• DKU_CODE_STUDIO_BROWSER_PATH : the path prefix used by DSS in front of the Code Studio in the UI. Its value is defined by DSS, and is currently /code-studios/<project_key>/<code_studio_id> in a code studio.

• DKU_CODE_STUDIO_BROWSER_PATH_{port} : the specific path prefix used for a given exposed port (starts with DKU_CODE_STUDIO_BROWSER_PATH). Its value is /code-studios/<project_key>/<code_studio_id>/<port_number>

• K8S_NODE_NAME, K8S_POD_NAME, K8S_POD_NAMESPACE, K8S_POD_ID : exposed from Kubernetes via the downward API

These variables are also defined for code studios published as webapps, but:

• the values of the DKU_CODE_STUDIO_BROWSER_PATH* variables are different, and correspond to the usual backend path prefixes for webapps, like /web-apps-backends/<project_key>/<web_app_id>...
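Code running in the pod can read these variables to find out where it is running. The following sketch is one possible way to do so; the function name describe_runtime and the returned dictionary layout are hypothetical, and it assumes that the first two path segments after the prefix are the project key and the object identifier, as in the formats listed above.

```python
import os

CODE_STUDIO_PREFIX = "/code-studios/"
WEBAPP_PREFIX = "/web-apps-backends/"


def describe_runtime(environ=os.environ):
    """Summarize the pod's context from the variables described above."""
    path = environ.get("DKU_CODE_STUDIO_BROWSER_PATH", "")
    if path.startswith(CODE_STUDIO_PREFIX):
        mode, rest = "code-studio", path[len(CODE_STUDIO_PREFIX):]
    elif path.startswith(WEBAPP_PREFIX):
        mode, rest = "webapp", path[len(WEBAPP_PREFIX):]
    else:
        mode, rest = "unknown", ""
    # Assumption: project key first, then the Code Studio or webapp id.
    segments = [segment for segment in rest.split("/") if segment]
    return {
        "mode": mode,
        "project_key": segments[0] if len(segments) > 0 else None,
        "object_id": segments[1] if len(segments) > 1 else None,
        "pod": environ.get("K8S_POD_NAME"),
        "namespace": environ.get("K8S_POD_NAMESPACE"),
        "node": environ.get("K8S_NODE_NAME"),
    }
```

Calling describe_runtime() without arguments reads the real environment of the pod; passing a plain dictionary makes the function easy to try outside of a Code Studio.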