Dataiku Snowflake Optimizer

Snowflake Marketplace Native App

Dataiku and Snowflake make for a great pairing. Dataiku is a platform that helps teams with diverse skill sets build data-driven applications, from simple data cleaning, aggregation, and automation to more complicated machine learning and LLM-driven workflows. Snowflake provides the data storage, scalable warehouse compute, and LLM services that power your Dataiku projects.

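Use this app to monitor and optimize your joint Dataiku Snowflake usage, including Dataiku recipe engines, Snowflake warehouse compute, Snowflake Cortex LLM queries, connection best practices, and ML what-if analysis.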

App Information

Version: 1.0.1
Author: Dataiku
Released: 2025/01/05
Last Updated: 2025/01/05
License: Proprietary

Permissions Setup

In the ‘Security’ tab of the native app:

  1. Grant the app access to your account’s internal SNOWFLAKE database (required to view query history within the ‘Snowflake Warehouse Compute’, ‘Snowflake Cortex LLM Queries’, and ‘Misc Best Practices’ pages).
  2. Grant the app access to an ML prediction model UDF and training table (required to view the ‘ML What-If Analysis’ page).

Figure 1: Grant permissions to the native app within the ‘Security’ page

Within a Snowflake worksheet outside of the native app:

  1. Create an External Access Integration to allow the app to pull information from your Dataiku instance (required to view the ‘Dataiku Recipes’ page). The script can be found on the first page of the native app. Note that your Dataiku instance URL must be reachable from your Snowflake instance for this to work.

Figure 2: Run the first-time setup script in a Snowflake worksheet to view the ‘Dataiku Recipes’ page
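
The script shown on the app’s first page is the authoritative version. Purely as an illustration, the sketch below builds the kind of statements an External Access Integration typically needs in Snowflake: an egress network rule for the Dataiku host, and an integration that allows that rule. The object names `dataiku_network_rule` and `dataiku_access_integration` and the example URL are hypothetical, and the app’s real script may contain further steps.

```python
from urllib.parse import urlparse


def build_setup_sql(dataiku_url: str,
                    rule_name: str = "dataiku_network_rule",
                    integration_name: str = "dataiku_access_integration") -> str:
    """Return illustrative SQL for an egress rule and an access integration.

    The object names are hypothetical placeholders, not the names used by
    the app's own setup script.
    """
    parsed = urlparse(dataiku_url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"Expected a full http(s) URL, got: {dataiku_url!r}")
    # Fall back to the standard port for the scheme when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    host_port = f"{parsed.hostname}:{port}"
    return (
        f"CREATE OR REPLACE NETWORK RULE {rule_name}\n"
        f"  MODE = EGRESS\n"
        f"  TYPE = HOST_PORT\n"
        f"  VALUE_LIST = ('{host_port}');\n"
        f"\n"
        f"CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION {integration_name}\n"
        f"  ALLOWED_NETWORK_RULES = ({rule_name})\n"
        f"  ENABLED = TRUE;\n"
    )


if __name__ == "__main__":
    # Hypothetical instance URL, used only to show the generated statements.
    print(build_setup_sql("https://dataiku.example.com"))
```

The network rule is what makes the reachability requirement concrete: Snowflake only allows outbound traffic to the host and port listed in `VALUE_LIST`, so the Dataiku URL must resolve and accept connections from Snowflake.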

Using the App

Hello and Setup

This page describes the app’s functionality, shows the first-time setup instructions (the External Access Integration script from Permissions Setup), and lets you check the connection to a Dataiku instance once that script has been run.

Figure 3: Intro page with app functionality

Figure 4: Run the first-time setup script in a Snowflake worksheet to view the ‘Dataiku Recipes’ page

Figure 5: Enter your Dataiku instance URL, click ‘Check connection’, and make sure it succeeds

Dataiku Recipes

To use this page, first confirm a working connection to a Dataiku instance on the ‘Hello and Setup’ page.

Note: this page may take a few minutes to load all the recipe information from Dataiku.

The first two charts show which Dataiku recipes can push their compute down to Snowflake.

Figure 6: Visualization of Dataiku Recipes with capability to pushdown compute to Snowflake

The individual recipes that are ready to use Snowflake, or that can be re-mapped, are output into tables, where you can click through to the recipe in Dataiku and change the engine to Snowflake.

Figure 7: Table of Dataiku Recipes that are ready to use Snowflake or can be re-mapped

Snowflake Warehouse Compute

When Dataiku sends queries to Snowflake, it includes an application query tag. As a result, warehouse compute time driven by Dataiku can be tracked by query type (JDBC SQL or Snowpark for Python) and by user.

Figure 8: Snowflake warehouse compute usage
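
A minimal sketch of this kind of breakdown, assuming the query-history rows have already been fetched (for example from the `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY` view, whose `QUERY_TAG`, `USER_NAME`, and `TOTAL_ELAPSED_TIME` columns are used here). The content of Dataiku’s query tag is not described on this page, so the marker strings and sample tags below are hypothetical placeholders, and elapsed time is used only as a rough proxy for warehouse compute. This is not necessarily how the app computes its charts.

```python
from collections import defaultdict

# Hypothetical substrings used to recognize each Dataiku query type in a tag.
TAG_MARKERS = {"snowpark": "Snowpark for Python", "jdbc": "JDBC SQL"}


def classify_tag(query_tag):
    """Map a query tag to a query type, or None if no marker matches."""
    lowered = (query_tag or "").lower()
    for marker, query_type in TAG_MARKERS.items():
        if marker in lowered:
            return query_type
    return None


def compute_seconds_by_type_and_user(rows):
    """Sum elapsed seconds per (query type, user) for recognized queries."""
    totals = defaultdict(float)
    for row in rows:
        query_type = classify_tag(row.get("QUERY_TAG"))
        if query_type is None:
            continue  # Not recognized as a Dataiku query.
        # TOTAL_ELAPSED_TIME is reported in milliseconds.
        totals[(query_type, row["USER_NAME"])] += row["TOTAL_ELAPSED_TIME"] / 1000
    return dict(totals)


# Made-up sample rows; the tag values are illustrative only.
sample_rows = [
    {"QUERY_TAG": "dataiku-jdbc", "USER_NAME": "ALICE", "TOTAL_ELAPSED_TIME": 1500},
    {"QUERY_TAG": "dataiku-jdbc", "USER_NAME": "ALICE", "TOTAL_ELAPSED_TIME": 500},
    {"QUERY_TAG": "dataiku-snowpark", "USER_NAME": "BOB", "TOTAL_ELAPSED_TIME": 4000},
    {"QUERY_TAG": "", "USER_NAME": "CAROL", "TOTAL_ELAPSED_TIME": 9000},
]

if __name__ == "__main__":
    for key, seconds in compute_seconds_by_type_and_user(sample_rows).items():
        print(key, seconds)
```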

Snowflake Cortex LLM Queries

Dataiku’s LLM Mesh has a native connector to Snowflake’s Cortex LLM functions. This page tracks the number of queries that Dataiku users send to Cortex and the LLMs they use.

Figure 9: Snowflake Cortex LLM queries
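
One way to derive such counts from query history is sketched below. It assumes the queries call `SNOWFLAKE.CORTEX.COMPLETE` with the model name as the first string literal, and that rows with `USER_NAME` and `QUERY_TEXT` have already been fetched. How the app itself identifies Cortex queries is not described here, and the sample rows are made up.

```python
import re
from collections import Counter

# Matches SNOWFLAKE.CORTEX.COMPLETE('<model>', ...) and captures the model name.
COMPLETE_CALL = re.compile(
    r"SNOWFLAKE\.CORTEX\.COMPLETE\s*\(\s*'([^']+)'", re.IGNORECASE
)


def count_cortex_calls(rows):
    """Count Cortex COMPLETE calls per (user, model)."""
    counts = Counter()
    for row in rows:
        for model in COMPLETE_CALL.findall(row["QUERY_TEXT"]):
            counts[(row["USER_NAME"], model.lower())] += 1
    return counts


# Made-up sample rows for illustration.
sample_rows = [
    {"USER_NAME": "ALICE",
     "QUERY_TEXT": "SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-large', 'Summarize: ...')"},
    {"USER_NAME": "ALICE",
     "QUERY_TEXT": "select snowflake.cortex.complete( 'Mistral-Large', prompt) from t"},
    {"USER_NAME": "BOB", "QUERY_TEXT": "SELECT 1"},
]

if __name__ == "__main__":
    print(count_cortex_calls(sample_rows))
```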

Miscellaneous Best Practices

This page checks your Snowflake query logs to verify that the Snowflake connections configured in Dataiku have automatic fast-write enabled (to/from cloud object storage). Enabling fast-write allows Dataiku to use COPY INTO (faster) rather than INSERT INTO ... VALUES (slower) when writing data to Snowflake.

Figure 10: A green check means connectivity is configured properly

Figure 11: If you see a table here, check the Dataiku connections referenced and enable fast-write
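
The check described in this section can be approximated with a simple heuristic over query text: statements that begin with INSERT INTO and carry a VALUES clause suggest that rows were written without fast-write. The sketch below is an assumption about how such a scan could look, not the app’s actual logic. It reports target tables, whereas the app reports the Dataiku connections involved.

```python
import re

# An INSERT that carries literal rows; COPY INTO statements do not match.
INSERT_VALUES = re.compile(
    r"^\s*INSERT\s+INTO\s+([\w.$\"]+).*?\bVALUES\b", re.IGNORECASE | re.DOTALL
)


def tables_written_without_fast_write(query_texts):
    """Return the sorted target tables of INSERT INTO ... VALUES statements."""
    tables = set()
    for text in query_texts:
        match = INSERT_VALUES.match(text)
        if match:
            tables.add(match.group(1))
    return sorted(tables)


# Made-up query texts for illustration.
sample_queries = [
    "COPY INTO DB.SCH.ORDERS FROM @my_stage",
    "insert into DB.SCH.CUSTOMERS (id, name) values (1, 'a'), (2, 'b')",
    "INSERT INTO DB.SCH.ARCHIVE SELECT * FROM DB.SCH.ORDERS",
]

if __name__ == "__main__":
    print(tables_written_without_fast_write(sample_queries))
```

An empty result corresponds to the green check; any table names returned point to writes worth investigating.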

ML What-If Analysis

On this page, test how your ML model responds to changes in its input features. Follow step 2 of the Permissions Setup to connect this app to an ML model that you have trained in Dataiku and exported as a Snowflake Java UDF. See the Dataiku documentation for more information.

Figure 12: Choose ‘Classification’ or ‘Regression’ depending on the model. Toggle inputs on the left, and see how model predictions change on the right.
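
The idea behind a what-if analysis can be sketched in a few lines: hold a baseline record fixed, change one input feature, and compare the predictions. The `predict` function below is a hypothetical stand-in with made-up feature names and coefficients. In the app, predictions come from the exported model UDF.

```python
def predict(features: dict) -> float:
    """Hypothetical stand-in for a regression model (made-up coefficients)."""
    return 50.0 + 2.0 * features["tenure_months"] - 10.0 * features["support_tickets"]


def what_if(model, baseline: dict, feature: str, values):
    """Return (value, prediction, change vs. baseline) for each trial value."""
    base_prediction = model(baseline)
    results = []
    for value in values:
        scenario = {**baseline, feature: value}  # Copy; baseline stays unchanged.
        prediction = model(scenario)
        results.append((value, prediction, prediction - base_prediction))
    return results


if __name__ == "__main__":
    baseline = {"tenure_months": 12, "support_tickets": 3}
    for value, prediction, delta in what_if(predict, baseline, "support_tickets", [0, 5]):
        print(f"support_tickets={value}: prediction={prediction}, change={delta}")
```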