DSS 14 Release notes

Migration notes

How to upgrade

Pay attention to the warnings described in Limitations and warnings.

Migration paths to DSS 14

Limitations and warnings

Automatic migration from previous versions is supported (see above). Please pay attention to the following cautions, removal and deprecation notices.

Cautions

Dataiku Cloud Stacks: OS upgrade

For Cloud Stacks setups, the OS for the DSS nodes has been updated from AlmaLinux 8 to AlmaLinux 9.

Custom setup actions may require some updates

Cgroups have moved from V1 to V2. Most configurations (including the out-of-the-box configuration) are migrated automatically. Some specific configurations may require a manual update.
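
To see which cgroups hierarchy a host exposes before reviewing custom configurations, a check along these lines may help. It is a minimal sketch, not part of DSS. It relies on the Linux convention that a unified (V2) mount exposes a `cgroup.controllers` file at its root. The function name and the `cgroup_root` parameter are illustrative, and hybrid setups are reported as V1.

```python
from pathlib import Path


def detect_cgroup_version(cgroup_root="/sys/fs/cgroup"):
    """Return 2 for a unified (V2) hierarchy, 1 for legacy V1, None if unknown."""
    root = Path(cgroup_root)
    # A cgroups V2 (unified) mount exposes cgroup.controllers at its root.
    if (root / "cgroup.controllers").is_file():
        return 2
    # On V1, each controller is mounted in its own subdirectory (cpu, memory, ...).
    if root.is_dir() and any(child.is_dir() for child in root.iterdir()):
        return 1
    return None


if __name__ == "__main__":
    print("cgroups version:", detect_cgroup_version())
```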

Dataiku Cloud Stacks: Removal of old Python versions

Cloud Stacks setups no longer include Python 3.6, Python 3.7 or Python 3.8 by default. All these Python versions are deprecated, and we advise you to upgrade any remaining code that still uses them.

If you need to maintain support for one of these Python versions, setup actions have been added to reinstall these versions, and can be configured in your Instance Template in Fleet Manager.
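
To spot code still running on one of these interpreters, a small check such as the following could be dropped into a recipe or notebook. It is an illustrative sketch only: the names `DEPRECATED_MINORS` and `python_status` are hypothetical, and the version list is simply the one given in these notes.

```python
import sys

# Versions named in these notes: no longer shipped by default, and deprecated.
DEPRECATED_MINORS = {(3, 6), (3, 7), (3, 8)}


def python_status(version_info=sys.version_info):
    """Classify an interpreter as 'deprecated' or 'ok' based on its major.minor version."""
    major_minor = (version_info[0], version_info[1])
    return "deprecated" if major_minor in DEPRECATED_MINORS else "ok"


if __name__ == "__main__":
    print("Python %d.%d: %s" % (sys.version_info[0], sys.version_info[1], python_status()))
```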

Container images: OS upgrade

The OS for container images has been updated from AlmaLinux 8 to AlmaLinux 9. Custom Dockerfile additions may require some updates, whether you build customized base images or use custom “container additions” in code envs.

Prebuilt container images: Removal of old Python versions

The Dataiku prebuilt container images no longer include Python 3.6, Python 3.7 or Python 3.8 by default. All these Python versions are deprecated, and we advise you to upgrade any remaining code that still uses them.

However, a “Container runtime addition” is now available at the code env level to easily restore support for these versions on a per-code-env basis.

On Dataiku Cloud Stacks, these container runtime additions are automatically added to existing code envs running one of these Python versions.

Dataiku Custom: Bump of minimal Java version

Dataiku now requires Java 17.

No action is required on Dataiku Cloud Stacks and Dataiku Cloud. On Dataiku Custom, you may need to install Java 17 prior to upgrading to Dataiku 14.
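
On Dataiku Custom, one way to confirm the installed Java version before upgrading is to parse the output of `java -version`. The sketch below is an assumption-laden helper, not a Dataiku tool: the function names are hypothetical, and it only handles the common `version "17.0.9"` and legacy `version "1.8.0_292"` formats.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output, or None if not found."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report "1.8.0_xxx": the major version is the second field.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def meets_java_requirement(version_output, minimum=17):
    """Return True when the reported major version is at least `minimum`."""
    major = java_major_version(version_output)
    return major is not None and major >= minimum
```

Note that `java -version` writes to standard error, so when capturing it with `subprocess.run`, read the `stderr` attribute of the result rather than `stdout`.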

Support removal

Some features that were previously announced as deprecated are now removed or unsupported.

  • Support for R 3.6

  • Support for MLFlow < 2

  • Support for Python 3.6 and 3.7 for the builtin environment. Builtin environments using these versions will be automatically upgraded. Note that code envs can still use Python 3.6 to 3.8 but these are deprecated

  • Support for Java 11

  • Support for Red Hat Enterprise Linux 7.x

  • Support for CentOS 7.x

  • Support for Oracle Linux 7.x

  • Support for Debian 10.x

  • Support for Ubuntu 18.04

  • Support for SuSE 12

  • Support for SuSE 15 SP4 and below

Deprecation notices

DSS 14 deprecates support for some features and versions. Support for these will be removed in a later release.

  • Support for Python 3.8. As a reminder, Python 3.6 and 3.7 are already deprecated.

  • Govern: Support for PostgreSQL 12, 13 and 14

  • Time Series Forecasting: support for MXNet-based algorithms

  • Support for MLLib

  • Support for AmazonLinux 2

Version 14.0.0 - June 27th, 2025

DSS 14.0.0 is a major upgrade to DSS with major new features.

New feature: New home page

A completely new home page for Dataiku has been introduced.

The home page brings together your projects, your recent items, learning content from Dataiku as well as content from your administrator in a refreshed and modern interface.

The projects listing has been upgraded with new capabilities such as easier moving of projects, a table view for quick mass operations on projects, a tree view for better organization of your projects, and better search across projects.

The data catalog is now directly accessible within the home page for more efficient exploration of your data.

New feature: Dashboards and Charts Theming

The platform now supports the use of reusable themes to enhance the visual presentation of charts and dashboards. These themes facilitate the standardization of colors, fonts, and design elements, thereby ensuring a professional and branded aesthetic. This functionality allows users to establish a consistent visual identity across their visualization assets, effectively reflecting their organization’s branding.

New feature: Automatic throttling for LLM connections

A new system has been added for automatically throttling requests to LLM API providers. This gives you finer and simpler control over the speed at which requests are issued, to better handle cases where you only have limited quotas on your LLM provider.

For more details, please see Rate Limiting.
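
As general background on request throttling, the token bucket below shows one common way to cap the rate at which requests are issued. It is a generic illustration and is not claimed to be how DSS implements its LLM throttling. The class name, its parameters and the injectable `clock` are all assumptions made for the example.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self):
        """Consume one token if available; return True when the request may proceed."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds to wait before the next token becomes available."""
        self._refill()
        return max(0.0, (1.0 - self.tokens) / self.rate)
```

A caller would typically loop on `try_acquire()`, sleeping for `wait_time()` seconds whenever it returns False, before sending the next request to the provider.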

New feature: Enhanced documents embedding

The Embed Documents recipe can now embed PDF, DOCX, PPTX and more using text-only extraction. This does not use a Vision LLM, and is faster and cheaper when the documents do not need a visual interpretation. Note that this requires using a new internal code env (“Document extraction”), which needs to be set up.

The Embed Documents recipe can now avoid re-processing documents you’ve already processed with the new Smart Sync, Upsert and Append modes.

The Embed Documents recipe is now faster, even when using visual extraction.

New feature: AI-Assisted Metadata & Metadata synchronization

A new (optional) AI Assistant now helps you generate dataset metadata and column descriptions.

On SQL datasets, table and column descriptions can now be automatically synchronized to the underlying SQL table. This is enabled by default on Snowflake, Databricks, BigQuery and MySQL.

New feature: Deployment Alerts

In Unified Monitoring, administrators can now configure alerts for Project and API Endpoint deployment statuses, with notifications deliverable via various channels.

New feature: Model evaluation for Time Series forecasting

Dataiku now integrates its Model Evaluation capabilities with time series forecasting models. This enhancement allows you to visualize the evolution of key metrics, set custom alert thresholds—including per-identifier thresholds via code—and detect performance drift over time. Delve into detailed comparisons of forecasts versus actuals, pinpoint problematic identifiers, and gain deeper insights into your model’s behavior.

Agentic AI & RAG

  • New feature: Code tool. You can now directly write your own custom tool in Python, allowing other users to leverage it without writing code. (Note: Agents & Tools require the Advanced LLM Mesh add-on.)

  • Generative AI and Agentic AI now have a dedicated top bar menu

  • Retrieval-augmented models are now shown in your project’s Flow, linked to their parent Knowledge Bank, with configuration of sources for display in Chat UIs

  • Retrieval-augmented models: added support for streaming the response

  • Improved linking of Agents in the Flow with the elements their tools are using, e.g. other Agents

  • Fixed creation of Agents in Flow zone

  • Fixed possible exclusion of Knowledge Banks in recursive rebuilds

  • Fixed failure of Retrieval-Augmented models queries when an image (corresponding to a page from an Embed Document recipe) is not found

  • Fixed retrieval of such images when the project was imported and the KB not rebuilt

  • Fixed usage of shared Agents in Prompt Studio Chats

LLM Mesh

  • AWS Bedrock: Added support for Nova models in Fine Tuning and Agents

  • AWS Bedrock: Added DeepSeek R1

  • Local Hugging Face: simplified LLM IDs for usage in API

  • Cortex LLM: Added ability to use per-project role switching when connecting through OAuth2

Machine Learning

  • Improved auto-detection of time series forecasting settings

  • Fixed image classification model scoring when using get_predictor from a container

  • Fixed horizontal axis labels on the Partial Dependence plot

  • Fixed header names in time series forecasting model metrics table

  • Fixed editing of column descriptions from the analysis page

  • Fixed disabled classes filter in the Design of an Object Detection task

MLOps

  • Added a Prometheus-compatible public API endpoint for retrieving different deployment metrics (only statuses at the moment)

  • Added the ability to import MLflow models into the Flow without writing code

  • Added the ability to use shared datasets as evaluation datasets when importing MLflow models

  • Added support for partially labelled data in the Evaluation Recipe

  • Fixed the Evaluation Recipe on time series asking for a metrics dataset schema update even on a newly created recipe

  • Fixed an issue in the Model Evaluation Store where metrics display settings in the status tab were reset by Status checks settings changes
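
For readers who want to consume the Prometheus-compatible endpoint mentioned above outside of a Prometheus server, the sketch below parses the standard Prometheus text exposition format. It is a simplified illustration: the endpoint URL and the metric names it returns are not given in these notes, so none are assumed here, and the parser leaves escape sequences in label values as-is.

```python
import re

# name, optional {labels}, value, optional integer timestamp
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_prometheus_text(text):
    """Parse Prometheus text exposition into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Each returned tuple can then be filtered by label, for instance to list the deployments whose status sample is non-zero.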

Dataset and Connections

  • New feature: Added ability to set a range of cells when importing Excel files

  • New feature: Added a connector to Treasure Data

  • New feature: Sample datasets directly integrated in DSS allow you to quickly get started. Admins can define their own sample datasets

  • Added a “Push description” button in the schema table of dataset

  • Reduced Excel export size

  • Databricks: Fixed wrongful request for OAuth2 authorization endpoint when working with application-level OAuth2 (client_credentials grant)

  • Conditional formatting: Added ability to define min and max when exporting a numerical column colored by scale

  • Conditional formatting: Fixed possible race condition when used on large datasets (more than 200 columns)

  • Conditional formatting: Added color interpolation on column coloring

  • Conditional formatting: Added ability to define custom colors in scale mode

  • Conditional formatting: Added support for scale coloring on Excel export and scenario mail reporter dataset attachment

  • Conditional formatting: Included rules in dataset export even if unused in current display mode

  • Managed folders: Fixed unexpected scrollback on long file preview

  • Managed folders: Fixed wrong warning on shared managed folders if the user has only read permission on it

  • Improved error reporting when using geo preview on improperly formatted data

  • BigQuery: Added ability to write columns of type Array and Object using Storage API

  • BigQuery: Fixed failure when reading very large tables

  • Replaced failure by warning when failing to synchronise dataset description in DB

  • Fixed call to action “Connect your account” sometimes not appearing when exploring datasets

  • Athena: Fixed issues with JDBC driver version 3

Flow

  • A brand new empty Flow page helps you get started quicker, including using sample datasets directly integrated in DSS.

  • Fixed double scroll on Flow view when using Google Chrome on Windows

  • Added the ability, from the Flow or the project home page, to view the folder containing the project and to move the project to another folder

  • The column search field value of the side panel is now kept when clicking on another dataset

  • Improved resilience of Flow Document Generator with unexpected characters

Visual recipes

  • Join: Fixed post-join computed columns with “anti join” join type

  • Prepare: Fixed “Saved” button not clickable when setting “Error column” in formula step

  • Prepare: Fixed prepare recipe processor recommendations from the column header when column meaning is “Datetime no tz”

  • Prepare: Improved support of negative number in scientific notation with “Convert numeric format” processor

  • Upsert: Fixed upsert with DSS engine on Redshift connection with fast-write enabled

  • Added a keyboard shortcut to run recipes (shift+enter)

  • Fixed some formula functions that silently returned null instead of explicitly failing when called incorrectly

Charts and Dashboards

  • New feature: Manual Binning and Reusable Binned Dimensions: Added the ability to manually define bins for numerical dimensions in charts, along with the ability to create virtual columns from numerical dimensions at the dataset level, that serve as reusable binning configurations.

  • New feature: Dashboards tiles groups: Added the ability to group multiple tiles together. Groups can then be formatted, moved, or resized like single tiles.

  • Charts: Fixed pivot table layouts broken by column names with two dots.

  • Charts: Fixed chart size on loading and double legend

  • Charts: Fixed handling of empty bins in color measure

  • Charts: Fixed Sankey chart so that the whole color mapping is not reset when a column is removed

  • Charts: Fixed tooltips in treemaps

  • Charts: Fixed rounding errors in bar charts when not in one-tick-per-bin mode

  • Charts: Fixed Y axis overflow in binned rectangle charts

  • Charts: Fixed cross-filter exclude for “no value”

  • Dashboards: Added the ability to justify and set vertical alignment of text tiles

  • Dashboards: Fixed saving confirmation modal not disappearing when going to view mode after having created multiple new pages

  • Dashboards: Fixed dataset renaming breaking associated filters
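
To illustrate the idea behind manual binning of a numerical dimension, the sketch below assigns a value to one of several user-defined bins. It is a conceptual example only, not the DSS implementation: the function name is hypothetical, and the boundary convention (lower edge inclusive, upper edge exclusive, open-ended outer bins) is an assumption.

```python
from bisect import bisect_right


def assign_bin(value, edges):
    """Return a label for the manually defined bin that contains `value`.

    `edges` is a sorted list of boundaries. Bins are [edge_i, edge_i+1),
    with open-ended bins below the first edge and from the last edge upward.
    """
    index = bisect_right(edges, value)
    if index == 0:
        return "< {}".format(edges[0])
    if index == len(edges):
        return ">= {}".format(edges[-1])
    return "[{}, {})".format(edges[index - 1], edges[index])
```

With edges `[0, 18, 65]`, a value of 18 falls in the `[18, 65)` bin. Reusing the same list of edges across charts gives the same effect as a shared binning configuration.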

Scenarios and automation

  • Properly surface errors in run conditions

  • Fixed Scenario-level variables not taken into account when running recipes with Containerized DSS engine

  • Fixed “Schedule” side panel button storing current project key in scenario configuration

Deployer

  • Allowed Unified Monitoring period above 1 hour

  • Fixed issue with API services deployed from the automation node when using legacy non-versioned code envs on the automation node

  • Fixed Unified Monitoring log rotation issue under some race conditions

  • Fixed the fetching of project deployments status for out of sync deployments

  • Fixed the handling of Govern status when Govern integration is disabled

  • Fixed the display layout of unsaved test queries responses

  • Fixed spikes in “Response time” charts

Governance

  • Govern now requires PostgreSQL 12.5+ (previous versions revealed a bug impacting Govern). Generally speaking, we recommend using an up-to-date minor version of PostgreSQL, which will include the latest fixes.

  • Improved the performance of the syncing of deleted projects, which also improves performance for syncing new or modified projects

  • On Cloud, added a menu to go back to launchpad

  • Fixed Govern modal not loading for users with restrictive permissions on certain fields

  • Fixed table display when user has no access permissions on the underlying blueprint version

  • Fixed the handling of the deletion of agent versions

  • Fixed LLM tag filter for projects with an Answers webapp

  • Updated Govern installation documentation to upgrade database requirements to Postgres 12.5+, as versions 12.0 to 12.4 revealed a bug impacting Govern. Generally speaking, we recommend using an up-to-date minor version of Postgres, which will include the latest fixes.
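
The PostgreSQL requirements for Govern stated in these notes (12.5 as the minimum, with majors 12, 13 and 14 deprecated) can be expressed as a small check. This is a hedged sketch with hypothetical function names. It expects a version string such as the one returned by `SELECT version()` and only looks at the first `major.minor` pair it finds.

```python
import re

MINIMUM = (12, 5)                  # minimum stated for Govern in these notes
DEPRECATED_MAJORS = {12, 13, 14}   # majors deprecated for Govern in these notes


def parse_pg_version(version_string):
    """Extract (major, minor) from strings like 'PostgreSQL 12.4 on x86_64...' or '15.3'."""
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("no version number found in %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def govern_pg_status(version_string):
    """Return 'unsupported', 'deprecated' or 'ok' for a PostgreSQL version string."""
    version = parse_pg_version(version_string)
    if version < MINIMUM:
        return "unsupported"
    if version[0] in DEPRECATED_MAJORS:
        return "deprecated"
    return "ok"
```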

Data Catalog

  • Added ability to search for charts

  • Fixed default flow zoom when reaching a dataset from the catalog

  • Fixed “Last build” information in data collection when object has never been built

Collaboration

  • Added a button in the project home page side panel to move the project to another folder

  • Fixed invitation and grant emails not being sent when creating project through the public API

  • Column Lineage: Added ability to see and notify data stewards

  • Fixed workspace item name change not reflected in preview

Coding & API

  • Added ability to change the Python interpreter of a code environment

  • Added ability for non-admin users with code environment management permission to create & update internal code environments

  • BigFrames integration: Fixed handling of recipes from partitioned BigQuery dataset to partitioned BigQuery dataset

Stories

  • New feature: Stories Themes: Introduced theme selection for stories, allowing users to change the visual style of presentations. These themes are pre-designed, with administrators having the option to add more. Users can also customize the active theme within a presentation by manually editing visual elements.

  • Added an option to “Correct” in the text assistant

  • Fixed “Copy to clipboard” on generated insights

  • Fixed slide assistant when asking for filters on Snowflake datasets

Git

  • When resolving merge conflicts, it’s now possible to directly accept all changes from either the branch or the base

Elastic AI

  • Upgraded OS for container images to AlmaLinux 9

Cloud Stacks

  • Upgraded OS to AlmaLinux 9

  • Added the possibility to activate Dataiku Stories in the instance templates

  • Added stricter anti-DoS setting on SSH server

  • Removed deprecated fm-cli

  • Upgraded Ansible used for setup tasks to Ansible 11

Misc

  • Regularly remove old Excel export temporary files

  • Python 3.12 is no longer experimental

  • Added support for Debian 12

  • Fixed issues with webapps that handle frontend-side routing (such as Angular) when accessing through the Direct Access URL

  • Fixed horizontal pod autoscaling when deploying webapps from Code Studio

  • Removed the legacy constraint on “urllib3<2” in code envs

  • Fixed erroneous reporting of warnings status for jobs with multiple recipes

  • Fixed cgroup protection of Visual Statistics processes on operating systems with cgroups V2

  • Fixed installation of R integration on Debian 11