DSS 14 Release notes¶
Migration notes¶
How to upgrade¶
For Dataiku Cloud users, your DSS will be upgraded automatically to DSS 14 within pre-announced timeframes
For Dataiku Cloud Stacks users, please see upgrade documentation
For Dataiku Custom users, please see upgrade documentation: Upgrading a DSS instance.
Pay attention to the warnings described in Limitations and warnings.
Migration paths to DSS 14¶
From DSS 13: Automatic migration is supported, with the restrictions and warnings described in Limitations and warnings
From DSS 12: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 12 -> 13
From DSS 11: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 11 -> 12, 12 -> 13
From DSS 10.0: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 9.0: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 8.0: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 7.0: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 7.0 -> 8.0, 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 6.0: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 6.0 -> 7.0, 7.0 -> 8.0, 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 5.1: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 5.1 -> 6.0, 6.0 -> 7.0, 7.0 -> 8.0, 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 5.0: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 5.0 -> 5.1, 5.1 -> 6.0, 6.0 -> 7.0, 7.0 -> 8.0, 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 4.3: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 4.3 -> 5.0, 5.0 -> 5.1, 5.1 -> 6.0, 6.0 -> 7.0, 7.0 -> 8.0, 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 4.2: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 4.2 -> 4.3, 4.3 -> 5.0, 5.0 -> 5.1, 5.1 -> 6.0, 6.0 -> 7.0, 7.0 -> 8.0, 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 4.1: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 4.1 -> 4.2, 4.2 -> 4.3, 4.3 -> 5.0, 5.0 -> 5.1, 5.1 -> 6.0, 6.0 -> 7.0, 7.0 -> 8.0, 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
From DSS 4.0: Automatic migration is supported. In addition to the restrictions and warnings described in Limitations and warnings, you need to pay attention to the restrictions and warnings applying to your previous versions. See 4.0 -> 4.1, 4.1 -> 4.2, 4.2 -> 4.3, 4.3 -> 5.0, 5.0 -> 5.1, 5.1 -> 6.0, 6.0 -> 7.0, 7.0 -> 8.0, 8.0 -> 9.0, 9.0 -> 10.0, 10.0 -> 11, 11 -> 12, 12 -> 13
Migration from DSS 3.1 and below is not supported. You must first upgrade to 5.0. See DSS 5.0 Release notes
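The list above follows a regular pattern: starting from your current release series, you review each intermediate "X -> Y" migration note up to 13, and the final hop to 14 is covered by Limitations and warnings. A minimal sketch of that rule is shown below. The function name and the `SERIES` list are illustrative helpers, not part of any Dataiku tool; the series names are taken from the list above.

```python
# Release series named in the migration paths above, oldest to newest.
SERIES = ["4.0", "4.1", "4.2", "4.3", "5.0", "5.1", "6.0", "7.0",
          "8.0", "9.0", "10.0", "11", "12", "13", "14"]


def migration_notes_to_review(current: str) -> list[str]:
    """Return the "X -> Y" migration notes to read before upgrading to DSS 14.

    The last hop (13 -> 14) is described in "Limitations and warnings",
    so it is not returned here.
    """
    if current not in SERIES[:-1]:
        # 3.1 and below have no direct path: upgrade to 5.0 first.
        raise ValueError(f"No direct migration path from DSS {current} to DSS 14")
    start = SERIES.index(current)
    hops = [f"{a} -> {b}" for a, b in zip(SERIES[start:], SERIES[start + 1:])]
    return hops[:-1]  # drop "13 -> 14"


print(migration_notes_to_review("9.0"))
```

For an instance on DSS 13 the function returns an empty list, since only the warnings in these release notes apply.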
Limitations and warnings¶
Automatic migration from previous versions is supported (see above). Please pay attention to the following cautions, removal and deprecation notices.
Cautions¶
Dataiku Cloud Stacks: OS upgrade¶
For Cloud Stacks setups, the OS for the DSS nodes has been updated from AlmaLinux 8 to AlmaLinux 9.
Custom setup actions may require some updates
Cgroups have moved from V1 to V2. Most configurations (including the out-of-the-box configuration) are migrated automatically. Some specific configurations may require a manual update.
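If a custom setup action depends on the cgroup layout, it can help to check which version a node actually runs. The sketch below relies on a general Linux convention (a cgroup v2 unified hierarchy exposes a `cgroup.controllers` file at its root), not on anything Dataiku-specific. The function name and the default mount point are assumptions for illustration.

```python
from pathlib import Path


def cgroup_version(root: str = "/sys/fs/cgroup") -> int:
    """Return 2 if `root` looks like a cgroup v2 unified hierarchy, else 1.

    Assumption: cgroups are mounted at `root` (the usual location on
    AlmaLinux). On cgroup v2, the root of the hierarchy contains a
    `cgroup.controllers` file; on v1 it does not.
    """
    return 2 if (Path(root) / "cgroup.controllers").is_file() else 1


if __name__ == "__main__":
    print(f"cgroup v{cgroup_version()}")
```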
Dataiku Cloud Stacks: Removal of old Python versions¶
Cloud Stacks setups no longer include Python 3.6, Python 3.7, or Python 3.8 by default. All of these Python versions are deprecated, and we advise you to upgrade any remaining code that still uses them.
If you need to maintain support for one of these Python versions, setup actions have been added to reinstall these versions, and can be configured in your Instance Template in Fleet Manager.
Container images: OS upgrade¶
The OS for container images has been updated from AlmaLinux 8 to AlmaLinux 9. Custom Dockerfile additions may require some updates (either if you build customized base images, or when using custom “container additions” in code envs)
Prebuilt container images: Removal of old Python versions¶
The Dataiku prebuilt container images no longer include Python 3.6, Python 3.7, or Python 3.8 by default. All of these Python versions are deprecated, and we advise you to upgrade any remaining code that still uses them.
However, a “Container runtime addition” is now available at the code env level to re-add support for these versions on a per-code-env basis.
On Dataiku Cloud Stacks, these container runtime additions are automatically added to existing code envs running one of these Python versions.
Dataiku Custom: Bump of minimal Java version¶
Dataiku now requires Java 17.
No action is required on Dataiku Cloud Stacks and Dataiku Cloud. On Dataiku Custom, you may need to install Java 17 prior to upgrading to Dataiku 14.
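On Dataiku Custom, a quick pre-upgrade check is to read the major version from the output of `java -version`. The following is a minimal sketch; the function name is hypothetical, and it assumes the usual `version "..."` banner format, where Java 8 and earlier report themselves as `1.x`.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the Java major version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a Java version string")
    major = int(match.group(1))
    # Java 8 and earlier use the legacy "1.x" scheme, e.g. "1.8.0_292".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


banner = 'openjdk version "11.0.21" 2023-10-17'
if java_major_version(banner) < 17:
    print("Java 17 must be installed before upgrading to DSS 14")
```

Note that `java -version` writes its banner to standard error, so when capturing it with `subprocess.run(["java", "-version"], capture_output=True, text=True)`, read the `stderr` attribute of the result.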
Support removal¶
Some features that were previously announced as deprecated are now removed or unsupported.
Support for R 3.6
Support for MLflow < 2
Support for Python 3.6 and 3.7 for the builtin environment. Builtin environments using these versions will be automatically upgraded. Note that code envs can still use Python 3.6 to 3.8 but these are deprecated
Support for Java 11
Support for Red Hat Enterprise Linux 7.x
Support for CentOS 7.x
Support for Oracle Linux 7.x
Support for Debian 10.x
Support for Ubuntu 18.04
Support for SuSE 12
Support for SuSE 15 SP4 and below
In addition, the following plugins have been removed and cannot be used anymore with DSS 14:
Data Anonymizer. It was superseded long ago by Column Pseudonymization.
Deep Learning on Images. It was superseded long ago by Computer vision.
Feature Factory / Events Aggregator. It was superseded long ago by Generate features.
Model drift monitoring. It was superseded long ago by Drift analysis.
NLG Tasks. It was superseded long ago by Generative AI and LLM Mesh.
OpenAI GPT. It was superseded long ago by Generative AI and LLM Mesh.
Both Forecast plugins. They were superseded long ago by Time series forecasting.
MeaningCloud NLP (MeaningCloud does not exist anymore). See Text & Natural Language Processing for alternatives.
Crowlingo Services NLP. See Text & Natural Language Processing for alternatives.
Advisor
HubSpot
Looker Query Connector
namR Store
Natif Intelligent Document processing
Oncrawl
Deprecation notices¶
DSS 14 deprecates support for some features and versions. Support for these will be removed in a later release.
Support for Python 3.8. As a reminder, Python 3.6 and 3.7 are already deprecated.
Govern: Support for PostgreSQL 12, 13 and 14
Time Series Forecasting: support for MXNet-based algorithms
Support for MLLib
Support for AmazonLinux 2
In addition, the following plugins are deprecated and will be removed in a later release:
List folder Contents. It is superseded by List Folder Contents.
Azure AD Sync. It is superseded by Azure AD.
EMR clusters and Dataproc clusters. The underlying support for these has been removed from DSS.
Text Summarization. It is mostly superseded by Generative AI and LLM Mesh.
Sentiment Analysis. It is mostly superseded by Generative AI and LLM Mesh.
Text Embedding. It is mostly superseded by Generative AI and LLM Mesh and native handling in Visual ML.
Version 14.1.0 - August 12th, 2025¶
DSS 14.1.0 is a release with significant new features, bug fixes, security fixes, and performance improvements.
New feature: Project Standards¶
Dataiku’s new Project Standards feature helps you improve project quality and production-readiness by enabling designers to build and deploy robust, production-grade projects. This feature addresses the challenge of ensuring that projects created by citizen data scientists meet organizational best practices and standards, avoiding late-stage rejections and ensuring overall quality.
Key capabilities of Project Standards include:
Check Library: Create and manage checks on a Dataiku instance, with customizable descriptions, parameters, and mitigation strategies.
Scope Definition: Define which checks are executed in specific projects using project keys, tags, and folders, with support for scope ordering and API package scoping.
Check Execution and Reporting: Get a clean and actionable report of check results, accessible from various contexts (e.g., bundles, API packages, flows). The report can be exported, and administrators can define acceptable thresholds for unpassed checks for deployments.
Project Standards help ensure that projects are built to production standards from the outset, reducing rework and increasing the reliability of deployed solutions.
Agentic AI & RAG¶
New feature: search input strategy. Specify whether a Retrieval-Augmented LLM always augments its query from exact search (typically for batch or programmatic usage), or if it rewrites the query to optimize search / decide when searching is not necessary (typically for multi-turn chat).
New feature: a new Calculator tool can leverage DSS formulas to perform mathematical calculations.
New feature: New tool to call a deployed API Endpoint
New History tab for Agents & Agent Tools
New Python APIs for writing a Knowledge Bank and editing embedding recipes from a Python notebook or recipe
New Python APIs for creating Agents, Agent Tools and Retrieval-Augmented LLMs from a Python notebook or recipe
New Python API to obtain a persistent local workload folder, for use with a Code Agent, Agent Tool or Webapp
Set a similarity score threshold, on both Retrieval-Augmented LLMs and KB Search tools
Define filters on Retrieval-Augmented LLMs
Added a Quick Test to Retrieval-Augmented LLMs
A new Explore Trace button lets you directly open the trace of a Quick Test
Fixed Quick Test on Agents when using large test queries
Fixed Embed Documents recipe handling of files with names of less than 3 characters
Fixed deletion of agent tools and prompt studios when deleting a project
Fixed Send message tool error when called from a Prompt Studio
Fixed possible use of incorrect settings when using Retrieval-Augmented LLMs or “Search KB” Agent Tools when the KB is not yet rebuilt with the changed settings
Fixed project import remapping of connections used by Knowledge Banks
Fixed accumulation of sources across runs in “Search KB” & “Web Search” Agent Tools
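To illustrate the similarity score threshold mentioned in this list, here is a generic sketch of what such a threshold does to retrieved results. It is not Dataiku's implementation: the `Hit` type, the function name, and the assumption that higher scores mean more similar documents are all illustrative.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    text: str
    score: float  # assumed convention: higher means more similar


def apply_score_threshold(hits: list[Hit], threshold: float) -> list[Hit]:
    """Keep only the retrieved chunks whose score reaches the threshold."""
    return [hit for hit in hits if hit.score >= threshold]


hits = [Hit("refund policy", 0.82), Hit("office hours", 0.41)]
kept = apply_score_threshold(hits, threshold=0.5)
print([hit.text for hit in kept])
```

With a threshold in place, a query that matches nothing well yields an empty result rather than loosely related chunks.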
LLM Mesh¶
New feature: custom LLM connections can now be written in Python in plugins
OpenAI: added support for o3, o4-mini
Mistral: added support for Mistral Medium
Set custom headers on OpenAI, Azure OpenAI and Azure LLM connections
Local Hugging Face models: ability to set CUDA_VISIBLE_DEVICES if running locally without containerization
Local Hugging Face models: added support for using vLLM’s pipeline parallelism
Fine-Tuning: added support for fine-tuning gpt-4.1 mini & nano
Fine-Tuning: improved display of model loss graph
Fine-Tuning: use the best checkpoint rather than the latest when fine-tuning a local Hugging Face model
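The difference between keeping the best checkpoint and keeping the latest one can be sketched as follows. This is an illustration only: the release notes do not state which metric defines "best", so the sketch assumes the lowest evaluation loss, and the function name and tuple layout are hypothetical.

```python
def pick_checkpoint(checkpoints: list[tuple[int, float]], strategy: str = "best") -> tuple[int, float]:
    """Select a checkpoint from (training_step, eval_loss) pairs.

    "latest" returns the checkpoint with the highest step.
    "best" returns the one with the lowest evaluation loss (assumed metric).
    """
    if not checkpoints:
        raise ValueError("no checkpoints to choose from")
    if strategy == "latest":
        return max(checkpoints, key=lambda c: c[0])
    if strategy == "best":
        return min(checkpoints, key=lambda c: c[1])
    raise ValueError(f"unknown strategy: {strategy}")


history = [(100, 0.90), (200, 0.40), (300, 0.55)]
print(pick_checkpoint(history, "best"), pick_checkpoint(history, "latest"))
```

In this example the loss rises again after step 200, so the two strategies pick different checkpoints.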
Machine Learning¶
Time series forecasting: added an ETS (Error/Trend/Season) algorithm
Time series forecasting: added ability to specify start & end dates for train/test sets
Time series forecasting: added support for Torch-based versions of DeepAR & SimpleFeedForward algorithms
Improved display of residuals / error distribution of regression models
Added APIs to query the value of forecasted time series on trained models
Added support for sklearn 1.5 in the built-in code environment
Added support for torch 2 on Computer Vision models
Fixed display of time series forecasting model reports when residuals can contain NaN values
Fixed training of time series forecasting STL models when the model’s error rate is so low that the information criterion goes to infinity
Statistics¶
Fixed multi-selection (shift-click) on filtered lists in configuration dialog for multivariate analyses
MLOps¶
Time series forecasting: added ability to run an Evaluation recipe on any number of time steps, not necessarily a multiple of the horizon
Added support for text drift analysis in the evaluation recipe
Fixed handling of DataFrame input with missing columns in exported models
Fixed display of llm_raw_response in the row-by-row analysis of an LLM evaluation
Fixed display of images metadata in the row-by-row comparison of an LLM evaluation
Dataset and Connections¶
New feature: Denodo connection
Added a “generate metadata” button from dataset schema screen
Added a “copy column name” button in dataset column header
Added support of custom cloud storage using Signature Version AWSS3V4
Improved detection of invalid S3 credentials when testing the connection
Fixed dataset display setting, kept in workspace if a conditional formatting rule is applied on the source dataset
Fixed dataset column custom field, preserved when reloading column description
Fixed default max length for BigQuery string columns
Fixed column conditional formatting scale rule export with Excel format when an explore filter is set
Fixed “Switch to bucket region” S3 connection option
Fixed infinite loop when listing tables of an empty Fabric schema
Fixed Edit schema link in side panel for Network, Uploaded, and File-in-folder datasets. Removed it on Sample datasets
Data Quality¶
New feature: Added a recipe to extract all rows not matching data quality rules of a dataset
New feature: Added a dataset schema equality rule
New feature: Added a “dataset schema contains” rule
New feature: Added an “All value in a range” rule
Added an option to clean a specific partition check history
Data Catalog¶
Added ability to customize which columns to display in a Data Collection
Added ability to put a Data Collection in the promoted content of the home page
Added a metadata completeness check to control addition of non-compliant datasets to a Data Collection
Flow and Visual Recipes¶
Flow: Improved “move flow zone” usability and performance
Flow: Edit schema screen can now be reached from the side panel
Flow: Fixed navigation behavior when opening a dataset shared into a flow zone
Flow: Fixed focus when coming from a dataset shared between flow zones
Flow: Fixed search result link for shared managed folder
Flow: Fixed coloring with “Last build duration” flow view when switching from linear to log scale option
Flow: Fixed error when using the “Check consistency” flow action on an empty coding recipe
Prepare: ignore disabled steps when computing recipe status
Prepare: improved display of formula and Python steps in side panel to show their content
Prepare: the visual “if” processor can now be converted to a DSS formula when possible
Prepare: fixed Trim processor with non-breaking space character
Prepare: fixed preview button after generating preparation steps
Prepare: removed daylight saving time shift for recent dates in America/Sao_Paulo or Brazil/East time zones
Prepare: fixed “Convert Unix timestamp” processor with negative timestamps
Prepare: added a default value for the String transformation step when it is added through the column header of the prepare recipe
Join: Fixed “keep unmatched rows” option when used in a SQL or Spark pipeline
Pivot: fixed max modality name length on “count” aggregation
Improved performance of App-as-Recipe runs
Fixed Snowflake to Cloud storages fast paths when the Snowflake account has PREVENT_UNLOAD_TO_INLINE_URL or respectively PREVENT_LOAD_TO_INLINE_URL set
Fixed Azure/Databricks fast path when using OAuth2 on an Azure connection configured with a private app
Coding & API¶
Added support for Pandas 2.3
Added support for numpy 2 in the built-in environment
Added ai_generate_description public API methods to DSSProject, DSSFlowZone and DSSDataset
Added a Python API method to list the plugins used by a project
Added ability to include shared datasets when listing datasets of a project with the Python API
Added ability to list the datasets of a project matching a given set of tags with the Python API
Added a get_data_steward method to dataset settings in Python API
Git¶
Improved navigation between projects when switching branches
Collaboration & Onboarding¶
New feature: Dataiku AI Q&A Assistant in the help center
New feature: administrators can now personalize the promoted content displayed on the home page
Workspace favorite items are now visible from the home page
Fixed display of global tags
Charts¶
Added differentiated URLs for each chart
Added an option to hide empty bins
Added support for negative values in stacked bars
Moved all number formatting settings from measure to the “Format” menu
Added ability to customize “Placement mode” and “Spacing” of Values in chart for each measure
Fixed aggregation sorting when using the SQL engine
Fixed transition from geometry map to scatter map when using a categorical column for the details
Fixed KPI alignment settings getting dropped upon modification of other settings
Fixed chart display when using “Generate one tick per bin” with many bins
Dashboards¶
Added automatic save of Jupyter Notebooks when publishing to dashboards
Improved tile placement
Added keyboard navigation between pages (see Dashboard concepts)
Fixed the moving of tiles from one page to another
Fixed indefinite looping of export when there are groups of tiles
Fixed scroll of text tiles when their text is cropped
Stories¶
Added a “Back to workspace” button (top left bird)
Added support for GIFs
Improved slide navigation
Searching a dataset now filters instead of just highlighting
Fixed slide thumbnail for videos
Scenario and automation¶
Fixed “Refresh statistics & chart cache” step sometimes returning success before the chart cache is actually refreshed
Fixed Dashboard selection in the “Refresh statistics & chart cache” step when there are multiple dashboards with the same name
Deployer¶
New feature: Deployments now include an update history
Added an API on project deployments to execute test scenarios on the target automation node
Added a Horizontal Pod Autoscaling RAM limit option for Kubernetes infrastructures
Added ability to set an arbitrarily high Unified Monitoring batch frequency
Fixed Unified Monitoring’s Project Alerts for multi-node infrastructures
Governance¶
Added a new widget to the home page displaying all tasks assigned to the logged-in user
Added documentation links to the home page
Added support for Project Standards reports
Added support for Bundles’ release notes
Improved the definition of conditions for conditional views
Fixed possible scroll issues on permissions settings page when an edited field is temporarily invalid
Fixed DSS synchronization to skip temporary application instances
Cloud Stacks¶
Azure: Fixed removal of virtual network secondary subnet value
AWS: Fixed provisioning of Load Balancer with name longer than 32 characters
Added a Python API to enable/disable a setup action
Security¶
New feature: administrators can now specify permissions on instance-level messaging channels
Fixed Hugging Face token printed in logs of an uncontainerized fine-tuning recipe
Improved user experience when transferring project ownership
Webapps¶
Fixed the Webapp settings’ Refresh button
Miscellaneous¶
Fixed handling of removed parameters in Development plugins
Added support for comments in definition of global and project variables
Added a search field to the “New component” dialog of plugin development
Fixed project folder when using a project creation macro from a folder’s page
Version 14.0.2 - July 31st, 2025¶
DSS 14.0.2 is a bugfix and feature release
Agentic AI & RAG¶
Fixed Agent Connect on Dataiku Cloud
Improved “Send Message” tool with more advanced templating
Added ability to set a metadata dataset as input on the Embed Documents recipe, to add metadata to the output Knowledge Bank
Added ability to specify a column containing document-level security tokens in the Embed Documents recipe
Added support for document-level security tokens when querying Retrieval-Augmented LLMs via API
Added warning when another user is concurrently editing an Agent
Added ability to install the dependencies required for the Embed Documents recipe’s text-only extraction on RHEL
Added support for “Dataset Lookup” tool usage with Vertex Gemini models
Fixed recursive build not propagated upstream of Agents
Fixed deletion of Retrieval-Augmented LLMs when the source KB is deleted
Fixed document extraction when the window size exceeds the number of pages (or ≥ 2 on image files)
Fixed possible hangs in the Embed Documents recipe
Fixed possible memory overruns in the Embed Documents recipe when processing a very large number of documents
Fixed extraction of PDF files with inconsistent ICC profiles
LLM Mesh¶
Vertex Generative AI: added support for Gemini 2.5 Pro & Flash, Gemini Text Embedding, Text embedding update 005
AWS Bedrock: added support for Claude 4 model family
Anthropic: added support for Claude 4 model family
Fixed JSON mode when streaming completion responses on Local Hugging Face models
Machine Learning¶
Fixed training of custom models that do not output probabilities, when using K-fold cross test
Fixed training of models using custom metrics that need probabilities, when using K-fold cross test
Fixed training of calibrated classifiers using sparse matrices
Fixed listing of prediction models in API Designer
Fixed filtering train/test set with a formula containing a variable
MLOps¶
Added automatic discovery of code environment and containerized execution settings when using MLflow import APIs from a Dataiku Python notebook
Added “number of evaluation windows” as a label in the Model Evaluations for Time Series models
Added the display of worst value for custom metrics in the Model Evaluation Store
Fixed column selection in the results validation settings of the test scenario step
Fixed model export to automatically cast categorical features to strings
Charts and Dashboards¶
Charts: Fixed handling of percentiles producing no data
Charts: Fixed the theme’s default continuous palette not being applied when selecting a theme
Charts: Fixed “in-chart titles” theme typography settings wrongly applied to radar axis labels
Charts: Fixed resetting of axis labels font wrongly enabling the “Add background” option on some charts
Charts: Fixed black and white being added to theme color palette when resetting it
Charts: Fixed formatting settings not being updated upon theme modification
Charts: Fixed mass selection of alphanumeric filter facets
Charts: Fixed KPI disappearing after reverting theme
Dashboards: Fixed application of theme’s general typography settings to the different kinds of tiles
Flow¶
Fixed display of implicit (dotted) flow links between some objects in different zones
Fixed display of implicit (dotted) flow links from objects shared from another project
Dataset and Connections¶
New feature: Experimental support for connecting to Dremio
Fixed listing of files when creating a new “Files in Folder” dataset
Fixed plugin datasets that use date selector settings
Fixed column filter in Explore > “Columns quick view”
Added ability to choose the name of the worksheet when exporting a dataset to Excel
BigQuery: Added an option on connections (overridable at the dataset level) to prevent DSS from writing into a BigQuery table if the dataset is partitioned but the underlying BigQuery table is not
Databricks: Fixed a race condition when synchronizing table and column descriptions when building multiple partitions at the same time
Databricks: Fixed “timestamp_ntz” columns being ignored during schema detection
SQL Server: Fixed errors caused by inserting Infinity values
Azure Blob: Fixed potential OAuth2 token expiration in case of long running activities
Azure Blob: Fixed creation of external datasets from Parquet files when hierarchical namespace is disabled
Sharepoint: Fixed connection becoming unusable when switching from OAuth2 to Private key authentication
RedShift, Oracle, PostgreSQL: Added synchronization of column descriptions when creating a database table
ElasticSearch: Fixed writing in append mode when dataset contains a column of type “Datetime with tz”
Descriptions are now truncated when they are longer than the maximum allowed by the database (ex: 2048 characters for MySQL)
Fixed writing to Delta datasets partitioned by day generating files using date+time pattern
Fixed writing to Delta datasets partitioned by hour generating incorrect results if the server timezone is different from UTC
Fixed the Explore view of an empty CSV based dataset created with Spark
Visual recipes¶
Prepare: Fixed race condition on Snowflake when running a Prepare recipe with steps executed as UDF on a partitioned dataset
Prepare: Added SQL support for “Find and Replace” steps using regular expressions
Prepare: Added SQL support for the asDatetimeTz function in formula steps
Prepare: Fixed “Parse Date” step with patterns using SSS (milliseconds) failing to parse values with 0 milliseconds when run under Spark
Prepare: Fixed asDateOnly function producing incorrect values in formulas when executed on Snowflake with SQL engine
Join: Improved error message displayed in case of mismatch between the configuration of the recipe and the input datasets schemas
Pivot: Fixed incorrect values that could be produced when running with the DSS engine
Distinct: Added a new “on all columns” option to automatically include columns added in upstream recipes in future changes
Window: Fixed misalignment of columns and job failures when the upstream schema changes
Window: Fixed output column type of average of Date only columns
Group: Fixed concat aggregation on string columns when running with the DSS engine
Fixed Databricks engine when Spark is not configured on the DSS instance
Upsert: Added cross‑connection dataset support using DSS engine
Sharepoint: Added “List access” recipe to output per-file authorizations based on DSS user groups and associated Entra groups
Coding & API¶
Fixed possible incorrect URL returned by webapp.get_backend_client()
Added ability to view project libraries with read-only access to project
Governance¶
Added missing field types (Files, Time Series, JSON) in the fields selector for custom filters
Added ability to define an external URL in the Govern integration settings
Improved synchronization performance when objects have been deleted
Fixed application of text wrap option for text fields in a table cell
Fixed the wrapping of the name column in tables
Deployer¶
API deployment: Added ability to use Kubernetes ingressClassName fields instead of the annotation for API deployer infrastructure
API deployment: Added support for setting Kubernetes Topology Spread Constraints both at the infrastructure level and at the deployment level
API deployment: Added support for defining environment variable as Kubernetes Secrets in the API Deployer Infrastructure settings
API deployment: Added ability to specify “Pod disruption budgets” for API deployments on Kubernetes
Elastic AI¶
Added libglvnd-glx in the docker base image (which makes it easier to use packages such as OpenCV out of the box)
Webapps exposition: Added ability to use Kubernetes ingressClassName fields instead of the annotation
Webapps exposition: Added support for setting Kubernetes Topology Spread Constraints
Webapps exposition: Added ability to specify “Pod disruption budgets” for WebApps on Kubernetes
Cloud Stacks¶
Fixed Govern Integration settings consistency with nodes directory in case of an invalid govern node id
Fixed race condition leading to “Device has unexpected filesystem” when reprovisioning DSS
Hadoop¶
New feature: Added support for Cloudera Base on Premises 7.3.1
Stability & Performance¶
Fixed potential instance freeze when calling project.get_job().get_log() on a job with massive logs
Fixed potential instance freeze when a very large dataset is selected as parameter for Dynamic dataset repeat
Added safety limits on SQL notebook and SQL scenario step result size
Fixed potential freeze of the Jupyter subsystem
Misc¶
Dataiku Apps: Fixed refresh of variables in Variable Display tiles after running a job or scenario
Webapps: Fixed Code Studio streamlit backend not correctly detected as started when configured with “launch for webapps”
Fixed sorting of plugins by Name
Fixed inability to select input datasets when creating Python, R, or Spark Scala recipes from notebooks
Added warning messages in recipes and their input & output datasets screens when a partition dependency is set to “All available” while both input and output datasets are partitioned
Version 14.0.1 - July 17th, 2025¶
DSS 14.0.1 is a bugfix release
LLM Mesh¶
Ensured retrieval-augmented LLMs have an active version when activating a bundle
Added containerization support for the text extraction process of Embed Documents recipes
Dataiku applications¶
Added an option to hide the Generative AI menu on app instances
Datasets and connections¶
Fixed BigQuery dataset when project id is not set at dataset level
Fixed ‘DataFormat’ error when importing an Excel export into Power BI
Added a connection setting to use default catalog/schema to “auto-resolve” not-fully-qualified datasets when checking table existence
Folders¶
Fixed file renaming modal
Charts¶
Fixed filtering from a cell value when using custom coloring rules on pivot tables
Flow¶
Fixed dataset creation not taking into account the current flow zone
Code Studios¶
Fixed Code Assistant in VS Code
API Node¶
Fixed API nodes deployment when a package contains endpoints using different code environments
Misc¶
Fixed Event server when using an S3 connection with STS for storage
Fixed possible error when invoking some LLM custom plugins
Version 14.0.0 - June 27th, 2025¶
DSS 14.0.0 is a major upgrade to DSS with major new features.
New feature: New home page¶
A completely new home page for Dataiku has been introduced.
The home page brings together your projects, your recent items, learning content from Dataiku as well as content from your administrator in a refreshed and modern interface.
The projects listing has been upgraded with new capabilities such as easier moving of projects, a table view for quick mass operations on projects, a tree view for better organization of your projects, and better search across projects.
The data catalog is now directly accessible within the home page for more efficient exploration of your data.
New feature: Dashboards and Charts Theming¶
The platform now supports the use of reusable themes to enhance the visual presentation of charts and dashboards. These themes facilitate the standardization of colors, fonts, and design elements, thereby ensuring a professional and branded aesthetic. This functionality allows users to establish a consistent visual identity across their visualization assets, effectively reflecting their organization’s branding.
New feature: Automatic throttling for LLM connections¶
A new system has been added for automatically throttling requests to LLM API providers. This gives you finer and simpler control over the speed at which requests are issued, to better handle cases where you only have limited quotas on your LLM provider.
For more details, please see Rate Limiting
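As a rough illustration of what throttling requests against a limited quota involves, here is a minimal sliding-window limiter sketch in Python. It is not how DSS implements throttling: the class name, its parameters and the injectable clock are all hypothetical and only serve to show the idea.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical illustration: allow at most max_requests per window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock    # injectable so the behavior can be tested without sleeping
        self.sent = deque()   # timestamps of requests still inside the window

    def wait_time(self):
        """Return how many seconds the caller should wait before sending a request."""
        now = self.clock()
        # Forget requests that have left the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # The quota frees up when the oldest recorded request leaves the window.
        return self.window_seconds - (now - self.sent[0])

    def record(self):
        """Register that a request was just sent."""
        self.sent.append(self.clock())
```

A caller would check `wait_time()`, sleep for that duration if it is non-zero, send the request, then call `record()`.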
New feature: Enhanced documents embedding¶
The Embed Documents recipe can now embed PDF, DOCX, PPTX and more using text-only extraction. This does not use a Vision LLM, and is faster and cheaper when the documents do not need a visual interpretation. Note that this requires using a new internal code env (“Document extraction”), which needs to be set up.
The Embed Documents recipe can now avoid re-processing documents you’ve already processed with the new Smart Sync, Upsert and Append modes.
The Embed Documents recipe is now faster, even when using visual extraction
New feature: AI-Assisted Metadata & Metadata synchronization¶
A new (optional) AI Assistant now helps you generate dataset metadata and column descriptions.
On SQL datasets, table and column descriptions can now be automatically synchronized to the underlying SQL table. This is enabled by default on Snowflake, Databricks, BigQuery and MySQL.
New feature: Deployment Alerts¶
In Unified Monitoring, administrators can now configure alerts for Project and API Endpoint deployment statuses, with notifications deliverable via various channels
New feature: Model evaluation for Time Series forecasting¶
Dataiku now integrates its Model Evaluation capabilities with time series forecasting models. This enhancement allows you to visualize the evolution of key metrics, set custom alert thresholds—including per-identifier thresholds via code—and detect performance drift over time. Delve into detailed comparisons of forecasts versus actuals, pinpoint problematic identifiers, and gain deeper insights into your model’s behavior.
Agentic AI & RAG¶
New feature: Code tool. You can now directly write your own custom tool in Python, allowing other users to leverage it without writing code. (Note: Agents & Tools require the Advanced LLM Mesh add-on.)
Generative AI and Agentic AI now have a dedicated top bar menu
Retrieval-augmented models are now shown in your project’s Flow, linked to their parent Knowledge Bank, with configuration of sources for display in Chat UIs
Retrieval-augmented models: added support for streaming the response
Improved linking of Agents in the Flow with the elements their tools are using, e.g. other Agents
Fixed creation of Agents in Flow zone
Fixed possible exclusion of Knowledge Banks in recursive rebuilds
Fixed failure of Retrieval-Augmented models queries when an image (corresponding to a page from an Embed Document recipe) is not found
Fixed retrieval of such images when the project was imported and the KB not rebuilt
Fixed usage of shared Agents in Prompt Studio Chats
LLM Mesh¶
AWS Bedrock: Added support of Nova models in Fine Tuning and Agents
AWS Bedrock: Added DeepSeek R1
Local Hugging Face: simplified LLM IDs for usage in API
Cortex LLM: Added ability to use per-project role switching when connecting through OAuth2
Machine Learning¶
Improved auto-detection of time series forecasting settings
Fixed image classification model scoring when using get_predictor from a container
Fixed horizontal axis labels on the Partial Dependence plot
Fixed header names in time series forecasting model metrics table
Fixed editing of column descriptions from the analysis page
Fixed disabled classes filter in the Design of an Object Detection task
MLOps¶
Added a Prometheus compatible public API endpoint for retrieving different deployment metrics (only statuses at the moment)
Added the ability to import MLflow models into the Flow without writing code
Added the ability to use shared datasets as evaluation datasets when importing MLflow models
Added support for partially labelled data in the Evaluation Recipe
Fixed Evaluation Recipe on time series asking for a metrics dataset schema update even on a newly created recipe
Fixed an issue in the Model Evaluation Store where metrics display settings in the status tab were reset by Status checks settings changes
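For readers who want to consume the Prometheus-compatible endpoint mentioned above outside of a Prometheus server, the sketch below parses a payload in the Prometheus text exposition format. The sample payload, the metric name `deployment_status` and its labels are invented for illustration; the actual names exposed by DSS may differ. The parser is deliberately simplified: it ignores optional timestamps and does not unescape label values.

```python
import re

# Hypothetical sample payload; real metric and label names may differ.
SAMPLE = '''\
# HELP deployment_status Hypothetical deployment status gauge
# TYPE deployment_status gauge
deployment_status{deployment="churn-api",status="HEALTHY"} 1
deployment_status{deployment="fraud-api",status="ERROR"} 1
'''

# metric_name, optional {labels}, then a single numeric value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines this simplified parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples
```

With the sample above, filtering the parsed samples on `labels["status"] == "ERROR"` would isolate the failing deployment.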
Dataset and Connections¶
New feature: Added ability to set a range of cells when importing Excel files
New feature: Added a connector to Treasure Data
New feature: Sample datasets directly integrated in DSS allow you to quickly get started. Admins can define their own sample datasets
Added a “Push description” button in the schema table of datasets
Reduced Excel export size
Databricks: Fixed wrongful request for OAuth2 authorization endpoint when working with application-level OAuth2 (client_credentials grant)
Conditional formatting: Added ability to define min and max when exporting a numerical column colored by scale
Conditional formatting: Fixed possible race condition when used on large datasets (more than 200 columns)
Conditional formatting: Added color interpolation on column coloring
Conditional formatting: Added ability to define custom colors in scale mode
Conditional formatting: Added support of scale coloring on Excel export and scenario mail reporter dataset attachment
Conditional formatting: Included rules in dataset export even if unused in current display mode
Managed folders: Fixed unexpected scrollback on long file preview
Managed folders: Fixed wrong warning on shared managed folders if the user has only read permission on it
Improved error reporting when using geo preview on improperly formatted data
BigQuery: Added ability to write columns of type Array and Object using Storage API
BigQuery: Fixed failure when reading very large tables
Replaced failure by warning when failing to synchronise dataset description in DB
Fixed call to action “Connect your account” sometimes not appearing when exploring datasets
Athena: Fixed issues with JDBC driver version 3
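The conditional formatting items above mention scale coloring with user-defined min and max and color interpolation. One common way to compute such a color is linear interpolation between two RGB endpoints, sketched below. This is a generic illustration, not the DSS implementation; the function names and the clamping behavior are assumptions.

```python
def interpolate_color(value, vmin, vmax, low_rgb, high_rgb):
    """Linearly interpolate between low_rgb and high_rgb for value in [vmin, vmax].

    Values outside the range are clamped to the nearest endpoint (an assumption
    made for this sketch).
    """
    if vmax == vmin:
        return tuple(low_rgb)
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp to [0, 1]
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low_rgb, high_rgb))


def to_hex(rgb):
    """Format an (r, g, b) tuple as a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)
```

For example, a value halfway between min and max lands halfway between the two endpoint colors on each channel.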
Flow¶
A brand new empty Flow page helps you get started quicker, including using sample datasets directly integrated in DSS.
Fixed double scroll on Flow view when using Google Chrome on Windows
Added the ability, from the Flow or from the project home page, to view the folder containing the project and to move the project to another folder
The column search field value of the side panel is now kept when clicking on another dataset
Improved resilience of Flow Document Generator with unexpected characters
Visual recipes¶
Join: Fixed post-join computed columns with “anti join” join type
Prepare: Fixed “Saved” button not clickable when setting “Error column” in formula step
Prepare: Fixed prepare recipe processor recommendations from the column header when column meaning is “Datetime no tz”
Prepare: Improved support of negative number in scientific notation with “Convert numeric format” processor
Upsert: Fixed upsert with DSS engine on Redshift connection with fast-write enabled
Added a keyboard shortcut to run recipes (shift+enter)
Fixed some formula functions that silently returned null instead of explicitly failing when called incorrectly
Charts and Dashboards¶
New feature: Manual Binning and Reusable Binned Dimensions: Added the ability to manually define bins for numerical dimensions in charts, along with the ability to create virtual columns from numerical dimensions at the dataset level, that serve as reusable binning configurations.
New feature: Dashboards tiles groups: Added the ability to group multiple tiles together. Groups can then be formatted, moved, or resized like single tiles.
Charts: Fixed pivot table layouts broken by column names with two dots.
Charts: Fixed chart size on loading and double legend
Charts: Fixed handling of empty bins in color measure
Charts: Fixed Sankey chart so that the whole color mapping is not reset when a column is removed
Charts: Fixed tooltips in treemaps
Charts: Fixed rounding errors in bar charts when not in one-tick-per-bin mode
Charts: Fixed Y axis overflow in binned rectangle charts
Charts: Fixed cross-filter exclude for “no value”
Dashboards: Added the ability to justify and set vertical alignment of text tiles
Dashboards: Fixed saving confirmation modal not disappearing when going to view mode after having created multiple new pages
Dashboards: Fixed dataset renaming breaking associated filters
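To make the idea of manual binning of a numerical dimension concrete, the sketch below assigns values to bins defined by a user-supplied list of edges. It only illustrates the concept: the function name, the half-open interval convention and the label format are assumptions, not the behavior of DSS charts.

```python
from bisect import bisect_right


def assign_bin(value, edges):
    """Return a label for the bin containing value.

    edges must be sorted in ascending order. Bins are half-open:
    (-inf, e0), [e0, e1), ..., [e_last, +inf).
    """
    i = bisect_right(edges, value)
    if i == 0:
        return f"< {edges[0]}"
    if i == len(edges):
        return f">= {edges[-1]}"
    return f"[{edges[i - 1]}, {edges[i]})"
```

Defining the edges once and reusing them across charts is the kind of reuse that a dataset-level binned dimension provides.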
Scenarios and automation¶
Properly surface errors in run conditions
Fixed Scenario-level variables not taken into account when running recipes with Containerized DSS engine
Fixed “Schedule” side panel button storing current project key in scenario configuration
Deployer¶
Allowed Unified Monitoring period above 1 hour
Fixed issue with API services deployed from the automation node when using legacy non-versioned code envs on the automation node
Fixed Unified Monitoring log rotation issue under some race conditions
Fixed the fetching of project deployments status for out of sync deployments
Fixed the handling of Govern status when Govern integration is disabled
Fixed the display layout of unsaved test queries responses
Fixed spikes in “Response time” charts
Governance¶
Govern now requires PostgreSQL 12.5+ (previous versions revealed a bug impacting Govern). Generally speaking, we recommend using an up-to-date minor version of PostgreSQL, which will include the latest fixes.
Improved the performance of the syncing of deleted projects, which also improves performance for syncing new or modified projects
On Cloud, added a menu to go back to launchpad
Fixed Govern modal not loading for users with restrictive permissions on certain fields
Fixed table display when user has no access permissions on the underlying blueprint version
Fixed the handling of the deletion of agent versions
Fixed LLM tag filter for projects with an Answers webapp
Updated Govern installation documentation to upgrade database requirements to Postgres 12.5+, as versions 12.0 to 12.4 revealed a bug impacting Govern. Generally speaking, we recommend using an up-to-date minor version of Postgres, which will include the latest fixes.
Data Catalog¶
Added ability to search for charts
Fixed default flow zoom when reaching a dataset from the catalog
Fixed “Last build” information in data collection when object has never been built
Collaboration¶
Added a button in the project home page side panel to move the project to another folder
Fixed invitation and grant emails not being sent when creating project through the public API
Column Lineage: Added ability to see and notify data stewards
Fixed workspace item name change not reflected in preview
Coding & API¶
Added ability to change the Python interpreter of a code environment
Added ability for non-admin users with code environment management permission to create & update internal code environments
BigFrames integration: Fixed handling of recipes from partitioned BigQuery dataset to partitioned BigQuery dataset
Stories¶
New feature: Stories Themes: Introduced theme selection for stories, allowing users to change the visual style of presentations. These themes are pre-designed, with administrators having the option to add more. Users can also customize the active theme within a presentation by manually editing visual elements.
Added an option to “Correct” in the text assistant
Fixed “Copy to clipboard” on generated insights
Fixed slide assistant when asking for filters on Snowflake datasets
Git¶
When resolving merge conflicts, it’s now possible to directly accept all changes from either the branch or the base
Elastic AI¶
Upgraded OS for container images to AlmaLinux 9
Cloud Stacks¶
Upgraded OS to AlmaLinux 9
Added the possibility to activate Dataiku Stories in the instance templates
Added stricter anti-DoS setting on SSH server
Removed deprecated fm-cli
Upgraded Ansible used for setup tasks to Ansible 11
Misc¶
Regularly remove old Excel export temporary files
Python 3.12 is no longer experimental
Added support for Debian 12
Fixed issues with webapps that handle frontend-side routing (such as Angular) when accessing through the Direct Access URL
Fixed horizontal pod autoscaling when deploying webapps from Code Studio
Removed the legacy constraint on “urllib<2” in code envs
Fixed erroneous reporting of warnings status for jobs with multiple recipes
Fixed cgroup protection of Visual Statistics processes on operating systems using cgroups v2
Fixed installation of R integration on Debian 11