Overview and setup¶
Dataiku provides multiple AI Assistants to help users with various tasks across the Dataiku platform.
The main assistants (“AI Services”) can operate in two modes:
Operated by Dataiku. Processing is done through Dataiku’s cloud-based servers.
Bring your own LLM. You can choose which LLMs to use for the AI Services, and the processing is done through LLM connections that you configure and control using LLM Mesh. Your input is not sent to Dataiku.
Using AI Services requires acceptance of the Dataiku AI Services Terms of Use.
Other assistants (“Third-Party Assistants”) are entirely under customer control. They leverage LLM connections that you configure and control through LLM Mesh. Your code and metadata are not sent to Dataiku, but may be sent to third-party services depending on the LLM you select.
The AI Services¶
The AI Services are:
Cobuild, our flagship AI building agent. Describe what you want to build in plain language, and Cobuild generates a complete Dataiku project — data sourcing, data pipelines, machine learning models, agents, charts, and applications — as a visual flow that teams can inspect, edit, review, and approve.
SQL Assistant, a versatile SQL companion that allows you to generate, refine, and troubleshoot your SQL queries in SQL notebooks.
AI Search, which lets you find and discover relevant data in the Data Catalog.
Generate Metadata, which automatically generates descriptions for your datasets and their columns.
Stories AI, which lets you generate presentations, slides, charts, and images inside Dataiku Stories.
AI Explain, which explains what your Flow or code does, helping you better understand and document your data pipelines and codebases.
Generate Steps, which lets you build steps in a Prepare recipe using natural language.
Requirements¶
Using the Dataiku AI Services is subject to acceptance of our Dataiku AI Services Terms of Use, which are linked from the “AI Services” page in Admin.
Once you have accepted the Terms of Use, you can turn on AI Services.
AI Services require that the Dataiku DSS server be connected to the Internet, in both “Operated by Dataiku” and “Bring your own LLM” modes. The URL to whitelist is https://ai-gateway.api-services.dataiku.io.
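As a quick sanity check of the connectivity requirement above, the following minimal sketch tests whether a machine can open a TCP connection to the gateway host. It is not a Dataiku tool: the helper names `host_and_port` and `can_reach` are hypothetical, and the check assumes a direct outbound connection. If the DSS server reaches the Internet through an HTTP proxy, this test does not reflect that path.

```python
import socket
from urllib.parse import urlparse

# URL taken from the requirement above.
AI_GATEWAY_URL = "https://ai-gateway.api-services.dataiku.io"


def host_and_port(url):
    """Split a URL into (hostname, port), defaulting to 443 for https and 80 otherwise."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def can_reach(url, timeout=5.0):
    """Return True if a direct TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Running `can_reach(AI_GATEWAY_URL)` on the DSS server itself gives a first indication of whether the firewall rule is in place. A `True` result only shows that the port is reachable, not that AI Services are enabled or licensed.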
Use of the AI Services involves metered usage. AI Services may include a limited amount of free usage and/or additional credits subject to fees as set forth in applicable Orders. Please see Usage of AI Services for more details.
“AI Services Operated by Dataiku” mode¶
By default, AI Services use Dataiku’s own server, hosted and managed by Dataiku. In that mode, your input is processed by Dataiku and by Third Party AI Providers.
“Bring your own LLM” mode¶
Alternatively, you can choose which LLMs to use for the AI Services, and the processing is done through an LLM connection that you configure and control through the LLM Mesh.
When users make requests to AI Services in this mode, the requests are directly sent to your choice of third-party AI provider. Dataiku (as a company) does not process the requests and does not generate outputs.
Note: Stories AI is not available in “Bring your own LLM” mode.
Choice of LLM¶
To use AI Services in “Bring your own LLM” mode, you must select an LLM. AI Services, notably Cobuild, require high-performance models that can reliably perform long reasoning and tool calling.
The following models have been tested and provide good results. Note that most of these models can be obtained through multiple providers.
OpenAI GPT 5.5
OpenAI GPT 5.4
OpenAI GPT 5.3-Codex
OpenAI GPT 5.2
Anthropic Sonnet 4.6
Anthropic Opus 4.6
Anthropic Opus 4.7
Anthropic Opus 4.8
Google Gemini 3.5 Flash
The best overall results are obtained with OpenAI GPT 5.4 and 5.5.
The following models have been tested and provide “average” results. Using these models is not recommended, as the experience may vary:
DeepSeek V4 Pro
DeepSeek V4 Flash
The following models have been tested and do not provide good results. Using these models (as well as all inferior models) will lead to a disappointing experience, and is strongly discouraged:
OpenAI GPT 5.1
OpenAI GPT 4.1
OpenAI GPT 4o
Qwen3-235B-A22B
Nemotron 3 Super
DeepSeek V3.1
More models will be added to these lists. As AI Services evolve, some models may change category. We recommend checking this list regularly.
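For administrators who want to flag a model choice in their own tooling, the lists above can be captured as a simple lookup. This is only a sketch: the tier labels (`good`, `average`, `poor`, `untested`) and the function `tier_of` are hypothetical names, and matching is done on the display names used on this page, not on LLM Mesh identifiers. The data is a snapshot of the lists at the time of writing and needs updating as they change.

```python
# Snapshot of the tested-model lists from this page; tier labels are illustrative.
MODEL_TIERS = {
    "good": [
        "OpenAI GPT 5.5", "OpenAI GPT 5.4", "OpenAI GPT 5.3-Codex", "OpenAI GPT 5.2",
        "Anthropic Sonnet 4.6", "Anthropic Opus 4.6", "Anthropic Opus 4.7",
        "Anthropic Opus 4.8", "Google Gemini 3.5 Flash",
    ],
    "average": ["DeepSeek V4 Pro", "DeepSeek V4 Flash"],
    "poor": [
        "OpenAI GPT 5.1", "OpenAI GPT 4.1", "OpenAI GPT 4o",
        "Qwen3-235B-A22B", "Nemotron 3 Super", "DeepSeek V3.1",
    ],
}


def tier_of(model_name):
    """Return the tier for a model display name, or 'untested' if it is not listed."""
    wanted = model_name.strip().lower()
    for tier, names in MODEL_TIERS.items():
        if any(wanted == name.lower() for name in names):
            return tier
    return "untested"
```

For example, `tier_of("OpenAI GPT 5.4")` returns `"good"`, while a model that appears in none of the lists returns `"untested"` and should be evaluated before use.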
Third-Party Assistants¶
These assistants do not go through Dataiku’s AI Services. They may either go through the LLM Mesh or connect directly to third-party services with which you have agreements.
OpenAI Codex is a high-end coding agent, integrated in Code Studios
Claude Code is a high-end coding agent, integrated in Code Studios
OpenCode is a full-featured open-source coding agent, integrated in Code Studios
Gemini CLI is a coding agent, integrated in Code Studios
GitHub Copilot is a powerful coding agent, integrated in Visual Studio Code in Code Studios
AI Code Assistant provides simple Python code generation and explanations in Jupyter Notebooks and in Visual Studio Code in Code Studios