DSS provides full support for many databases and experimental support for others. Click on a link for detailed support information for that database.
DSS fully supports the following databases:
DSS has Tier 2 support for the following databases:
In addition, DSS can connect to any database that provides a JDBC driver.
For databases not listed previously, we cannot guarantee that anything will work. Reading datasets often works, but it is rare that writing works out of the box.
You may get better behavior by selecting a specific dialect from the dropdown in the JDBC connection screen.
Before you try to connect to a database, make sure that the proper JDBC driver for it is installed. For information on how to install JDBC drivers, see Custom Dataiku instructions or Dataiku Cloud Stacks for AWS instructions.
The first step in working with SQL databases is to create a connection to your database.
Go to the Administration > Connections page.
Click “New connection” and select your database type.
Enter a name for your connection.
Enter the requested connection parameters. If needed, see the documentation page for your database for more information.
Click Test. DSS attempts to connect to the database and reports whether the attempt was successful.
Save your connection.
For all databases, you can specify arbitrary key/value properties, which DSS passes as-is to the database’s JDBC driver. The available properties depend on the JDBC driver. Refer to the documentation of your JDBC driver for more information.
For all databases for which DSS has a specific connection kind, DSS automatically constructs the JDBC URL from the structured settings. For advanced use cases, you can enable the “Custom JDBC URL” mode and enter your own JDBC URL.
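To illustrate the difference between structured settings and a custom JDBC URL, the following minimal sketch assembles a PostgreSQL-style JDBC URL from a host, port, and database name, and appends driver properties as query parameters. It is not the code DSS uses internally: the `build_jdbc_url` helper, the host name, and the chosen properties are hypothetical, and other drivers use different URL formats.

```python
from urllib.parse import quote, urlencode


def build_jdbc_url(host, port, database, properties=None):
    """Assemble a PostgreSQL-style JDBC URL (hypothetical helper, not DSS code)."""
    url = f"jdbc:postgresql://{host}:{port}/{quote(database, safe='')}"
    if properties:
        # Key/value properties are appended unchanged as query parameters.
        url += "?" + urlencode(properties)
    return url


# Structured settings, as they might be entered in a connection form.
structured = {"host": "db.example.com", "port": 5432, "database": "sales"}
# Driver-specific properties; valid names depend on the JDBC driver.
props = {"ssl": "true", "connectTimeout": "10"}

generated_url = build_jdbc_url(**structured, properties=props)
print(generated_url)
```

In “Custom JDBC URL” mode, you would type a string like the one printed here directly, instead of filling in the individual fields.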
When DSS reads records from the database, it fetches them in batches for improved performance. The “fetch size” parameter sets the number of records in each batch. If you leave this parameter blank, DSS uses a reasonable default. Setting the fetch size to a high value (above a few thousand) can improve performance, especially if your network connection to the database has high latency, at the expense of increased memory usage.
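The trade-off can be sketched with Python's built-in `sqlite3` module, used here only as a stand-in for a remote database. The `read_in_batches` and `count_fetches` helpers and the `events` table are hypothetical. Because SQLite runs in-process, the number of fetch calls stands in for network round trips: a larger fetch size means fewer calls, but more rows held in memory at once.

```python
import sqlite3


def read_in_batches(cursor, fetch_size):
    """Yield lists of rows, pulling fetch_size rows from the cursor at a time."""
    while True:
        batch = cursor.fetchmany(fetch_size)
        if not batch:
            break
        yield batch


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER)")
conn.executemany("INSERT INTO events VALUES (?)", [(i,) for i in range(10_000)])


def count_fetches(fetch_size):
    """Count how many non-empty batches are needed to read the whole table."""
    cursor = conn.execute("SELECT id FROM events")
    return sum(1 for _ in read_in_batches(cursor, fetch_size))


small = count_fetches(100)    # many small batches
large = count_fetches(5_000)  # few large batches
print(small, large)
```

With 10,000 rows, a fetch size of 100 needs 100 batches, while a fetch size of 5,000 needs only 2, each holding 5,000 rows in memory.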
By default, when writing non-partitioned managed datasets, DSS drops the table and recreates it (which avoids schema discrepancy problems). You can enable this option to have DSS TRUNCATE the table instead.
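The following sketch shows why dropping and recreating the table avoids schema discrepancies while truncating does not. It uses `sqlite3` as a stand-in and is not how DSS performs writes: the helper functions and the `sales_out` table are hypothetical, and because SQLite has no TRUNCATE statement, a `DELETE` without a `WHERE` clause plays that role.

```python
import sqlite3


def columns(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def insert_rows(conn, table, schema, rows):
    placeholders = ", ".join("?" for _ in schema)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)


def write_drop_and_recreate(conn, table, schema, rows):
    """Rebuild the table so it always matches the schema being written."""
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} ({', '.join(schema)})")
    insert_rows(conn, table, schema, rows)


def write_truncate(conn, table, schema, rows):
    """Empty the table but keep its existing definition."""
    conn.execute(f"DELETE FROM {table}")  # stand-in for TRUNCATE TABLE
    insert_rows(conn, table, schema, rows)


conn = sqlite3.connect(":memory:")
write_drop_and_recreate(conn, "sales_out", ["id", "name"], [(1, "a")])

# The dataset schema gains a column: drop-and-recreate picks it up.
write_drop_and_recreate(conn, "sales_out", ["id", "name", "country"], [(1, "a", "FR")])
cols_after_drop = columns(conn, "sales_out")

# Truncating keeps the three-column table, so a two-column write fails.
try:
    write_truncate(conn, "sales_out", ["id", "name"], [(2, "b")])
    truncate_failed = False
except sqlite3.OperationalError:
    truncate_failed = True

print(cols_after_drop, truncate_failed)
```

Truncating preserves the existing table definition, which only works as long as the dataset schema still matches the table.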