Dataiku and Snowflake make for a great pairing. Dataiku is a platform that helps teams with diverse skill sets build data-driven applications, from simple data cleaning, aggregation, and automation to more complex machine learning and LLM-driven workflows. Snowflake provides industry-leading data storage, scalable warehouse compute, and LLM services to power your Dataiku projects.
Use this app to monitor and optimize your joint Dataiku and Snowflake usage, including:

- Dataiku recipes that can push their compute down to Snowflake
- Snowflake warehouse compute driven by Dataiku, by workload type and user
- Cortex LLM queries sent from Dataiku's LLM Mesh
- Fast-write configuration for your Dataiku connections
- What-if analysis for Dataiku ML models exported as Snowflake Java UDFs
| Version | 1.0.1 |
|---|---|
| Author | Dataiku |
| Released | 2025/01/05 |
| Last Updated | 2025/01/05 |
| License | Proprietary |
1. In the “Security” tab of the native app, grant the requested permissions.

Figure 1: Grant permissions to the native app within the ‘Security’ page

2. Within a Snowflake worksheet outside of the native app, run the first-time setup script.

Figure 2: Run the first-time setup script in a Snowflake worksheet to view the ‘Dataiku Recipes’ page
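The exact first-time setup script is displayed inside the app. As an illustration only (the application name below is hypothetical), setup scripts for monitoring apps like this one typically grant the app read access to Snowflake's account usage views:

```sql
-- Hypothetical sketch; run the actual script shown in the app's setup page.
-- Granting imported privileges on the SNOWFLAKE database lets the app
-- read ACCOUNT_USAGE views such as QUERY_HISTORY.
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO APPLICATION dataiku_monitor_app;
```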
This page describes the app’s functionality, shows the first-time setup instructions (step 2 above), and allows you to check a connection to a Dataiku instance (after completing step 2).
Figure 3: Intro page with app functionality
Figure 4: Run the first-time setup script in a Snowflake worksheet to view the ‘Dataiku Recipes’ page
Figure 5: Enter your Dataiku instance URL, click ‘Check connection’, and make sure it succeeds
To use this page, make sure you have confirmed a working connection to a Dataiku instance in the ‘Hello and Setup’ page.
Note: this page may take a few minutes to load all the recipe information from Dataiku.
The first two charts will show Dataiku recipes that:

- are already configured to run on Snowflake, and
- could be re-mapped to run on Snowflake but currently use another engine.
Figure 6: Visualization of Dataiku recipes with the capability to push compute down to Snowflake
The individual recipes from the bottom two cases are listed in tables, where you can click through to each recipe in Dataiku and change its engine to Snowflake.
Figure 7: Table of Dataiku Recipes that are ready to use Snowflake or can be re-mapped
When Dataiku sends queries to Snowflake, an application query tag is included. As a result, warehouse compute time driven by Dataiku can be tracked by type (JDBC SQL or Snowpark for Python) and by user.
Figure 8: Snowflake warehouse compute usage
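The per-user breakdown above can also be approximated directly against `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY`. A hedged sketch (the exact contents of Dataiku's query tag are an assumption here; inspect a few tagged queries on your account first):

```sql
-- Sketch: warehouse time attributed to Dataiku, by user and query tag.
-- Assumption: Dataiku's application query tag contains the string 'dataiku'.
SELECT user_name,
       query_tag,
       SUM(total_elapsed_time) / 1000 AS elapsed_seconds  -- column is in milliseconds
FROM snowflake.account_usage.query_history
WHERE query_tag ILIKE '%dataiku%'
GROUP BY user_name, query_tag
ORDER BY elapsed_seconds DESC;
```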
Dataiku’s LLM Mesh has a native connector to Snowflake’s Cortex LLM functions, so you can track the number of queries Dataiku users send to Cortex and which LLMs they use.
Figure 9: Snowflake Cortex LLM queries
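For an account-level view of the same activity, Snowflake exposes Cortex usage in an account usage view. A hedged sketch (view and column availability depend on your Snowflake release; check your account's `ACCOUNT_USAGE` schema):

```sql
-- Sketch: aggregate Cortex LLM usage by function and model.
SELECT function_name,
       model_name,
       SUM(tokens)        AS total_tokens,
       SUM(token_credits) AS total_credits
FROM snowflake.account_usage.cortex_functions_usage_history
GROUP BY function_name, model_name
ORDER BY total_credits DESC;
```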
This page checks your Snowflake query logs to ensure that your Snowflake connections, configured in Dataiku, have automatic fast-write enabled (to/from cloud object storage). Enabling fast-write allows Dataiku to use COPY INTO (faster) rather than INSERT INTO VALUES (slower) when writing data to Snowflake.
Figure 10: A green check means connectivity is configured properly
Figure 11: If you see a table here, check the Dataiku connections referenced and enable fast-write
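The distinction this page checks for can be illustrated with the two write patterns (table and stage names below are hypothetical):

```sql
-- Fast-write path: data is staged to cloud object storage, then bulk-loaded.
COPY INTO analytics.public.orders
FROM @dataiku_fastwrite_stage/orders/
FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"');

-- Without fast-write: row-by-row inserts over JDBC, much slower at scale.
INSERT INTO analytics.public.orders (id, amount) VALUES (1, 9.99);
```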
On this page, you can test how your ML model responds to changes in input features. Follow permission setup #3 to sync this app with an ML model you’ve trained in Dataiku and exported as a Snowflake Java UDF. See the Dataiku documentation for more information.
Figure 12: Choose ‘Classification’ or ‘Regression’ depending on the model. Toggle inputs on the left, and see how model predictions change on the right.
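Once exported, the model scores like any other SQL function. A minimal sketch, assuming a hypothetical UDF named `dataiku_churn_model` taking two numeric features (all names here are illustrative, not the app's actual interface):

```sql
-- Sketch: score rows with a Dataiku model exported as a Snowflake Java UDF.
SELECT customer_id,
       dataiku_churn_model(tenure_months, monthly_spend) AS churn_score
FROM analytics.public.customers;
```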