Microsoft OneLake¶
DSS supports the full range of features on Microsoft OneLake Warehouses:
- Reading and writing datasets
- Executing SQL recipes
- Performing visual recipes in-database
- Using live engine for charts
Note
You might want to start with our resources on data connections in the Knowledge Base.
Installing the JDBC driver¶
Microsoft OneLake uses the Microsoft SQL Server JDBC driver, which can be downloaded from the Microsoft website (at the time of writing, from https://learn.microsoft.com/en-us/sql/connect/jdbc/download-microsoft-jdbc-driver-for-sql-server?view=fabric).
Make sure to select a version which is appropriate for your version of Microsoft OneLake and your version of Java. At the time of writing, these constraints are summarized here.
- Download the “tar.gz” distribution archive (for Unix)
- Unarchive the downloaded file
- Look into the “enu/jars” subdirectory
- The driver is the single JAR file called mssql-jdbc-VERSION.jreX.jar, where X is the corresponding Java version
- Copy the JAR file to the lib/jdbc subdirectory of the DSS data directory
- Restart DSS
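The copy step above can be sanity-checked with a short script. This is only a sketch: the directory path below is hypothetical, so point it at your actual DSS data directory.

```python
import glob
import os

def find_mssql_jdbc_jars(jdbc_dir):
    """Return the mssql-jdbc JAR files present in a lib/jdbc directory."""
    return sorted(glob.glob(os.path.join(jdbc_dir, "mssql-jdbc-*.jar")))

# Hypothetical path; replace with your DSS data directory.
jars = find_mssql_jdbc_jars("/path/to/dss_data_dir/lib/jdbc")
if not jars:
    print("No mssql-jdbc driver found; copy the JAR and restart DSS")
```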
Write into Microsoft OneLake¶
Loading data into OneLake Warehouses using regular SQL “INSERT” statements is very inefficient and should only be used for extremely small datasets.

The recommended way to load data into a OneLake table is through a bulk COPY from files stored in Azure Blob Storage.

DSS can automatically use this fast load method. For this, you need an Azure Blob Storage connection. Then, in the settings of the OneLake connection:

- Enable “Automatic fast-write”
- In “Auto fast write connection”, enter the name of the Azure Blob Storage connection to use
- In “Path in connection”, enter a path relative to the root of the Azure Blob Storage connection, such as “onelake-tmp”. DSS uses this path to store temporary upload files; it should not contain datasets.

DSS will now automatically use the optimal “Azure Blob Storage to OneLake Warehouse” copy mechanism when executing a recipe that needs to load data “from the outside” into OneLake, such as a code recipe.

Note that this does not apply to visual recipes running directly in-database, as the data never leaves the database.
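The staged-copy idea behind fast-write can be sketched conceptually. This is not DSS internals, only an illustration of why bulk loading beats row-by-row INSERTs: rows are first serialized to a file under the temporary path, then loaded with a single bulk COPY INTO statement (the T-SQL bulk-load command on Fabric/Synapse warehouses). The table name and staging URI below are hypothetical.

```python
import csv
import io

def stage_rows_as_csv(header, rows):
    """Serialize rows to CSV, as if staging them on the blob connection."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

def build_copy_statement(table, staged_file_uri):
    """Build the single bulk-load statement that replaces many INSERTs."""
    return (
        f"COPY INTO {table} FROM '{staged_file_uri}' "
        "WITH (FILE_TYPE = 'CSV', FIRSTROW = 2)"
    )

csv_payload = stage_rows_as_csv(["id", "name"], [[1, "a"], [2, "b"]])
sql = build_copy_statement(
    "my_table",  # hypothetical target table
    "https://myaccount.blob.core.windows.net/onelake-tmp/part0.csv",  # hypothetical staged file
)
```

However many rows are staged, only one statement reaches the warehouse, which is what makes the bulk path fast.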
Explicit sync from Azure Blob Storage¶
In addition to the automatic fast-write that happens transparently each time a recipe must write into OneLake, the Sync recipe also has an explicit “Azure Blob Storage to OneLake Warehouse” engine. This is faster than automatic fast-write because it does not first copy to the temporary location in Azure Blob Storage.
Login using OAuth¶
DSS can login using OAuth on Microsoft OneLake. OAuth login can be performed either:
Using a single service account
Using per-user credentials.
Login as a single service account¶
- Make sure that you have at least version 7.2 of the JDBC driver
- Create a new App (Azure Portal > Azure Active Directory > App registrations). DSS will connect with the identity of this app
- In the Overview tab, note the Application (client) ID
- In the Overview tab, note the Directory (tenant) ID and use it to replace the {tenantId} placeholder in the templates for the “OAuth2 Authorization endpoint” and “OAuth2 Token endpoint” fields
- Go to API permissions > Add a permission > APIs my organization uses
- Search for “Azure SQL” and add the “Azure SQL Database” permission (Delegated permissions > user_impersonation)
- Search for “Azure Storage” and add the “Azure Storage” permission (Delegated permissions > user_impersonation)
- Search for “Microsoft Graph” and add the “Microsoft Graph” permission (Delegated permissions > User.Read)
- Create a client secret for this application (App registration > Certificates & Secrets) and note the client secret
- Create a new Microsoft OneLake connection
- From the Fabric Warehouse parameters, in the “About” section, retrieve the value of the “SQL connection string” and use it to fill the “Host” field
- The “Client id” is the Application (client) ID
- The “Client secret” is the secret you created earlier
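The tenant-ID substitution in the endpoint fields can be sketched as follows. The v2.0 URL shapes below are the usual Microsoft identity platform templates; verify them against the defaults your DSS version proposes, and the tenant ID shown is a hypothetical placeholder.

```python
# Usual Microsoft identity platform v2.0 endpoint templates (verify against
# the defaults proposed by your DSS version).
AUTHORIZATION_TEMPLATE = "https://login.microsoftonline.com/{tenantId}/oauth2/v2.0/authorize"
TOKEN_TEMPLATE = "https://login.microsoftonline.com/{tenantId}/oauth2/v2.0/token"

def fill_tenant(template, tenant_id):
    """Replace the {tenantId} placeholder with the Directory (tenant) ID."""
    return template.replace("{tenantId}", tenant_id)

# Hypothetical tenant ID taken from the app's Overview tab.
auth_url = fill_tenant(AUTHORIZATION_TEMPLATE, "00000000-0000-0000-0000-000000000000")
token_url = fill_tenant(TOKEN_TEMPLATE, "00000000-0000-0000-0000-000000000000")
```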
Login with per-user OAuth tokens¶
- Make sure that you have at least version 7.2 of the JDBC driver
- Create a new App (Azure Portal > Azure Active Directory > App registrations). DSS will connect with the identity of this app
- In the Overview tab, note the Application (client) ID
- In the Overview tab, note the Directory (tenant) ID and use it to replace the {tenantId} placeholder in the templates for the “OAuth2 Authorization endpoint” and “OAuth2 Token endpoint” fields
- Go to API permissions > Add a permission > APIs my organization uses
- Search for “Azure SQL” and add the “Azure SQL Database” permission (Delegated permissions > user_impersonation)
- Search for “Azure Storage” and add the “Azure Storage” permission (Delegated permissions > user_impersonation)
- Search for “Microsoft Graph” and add the “Microsoft Graph” permission (Delegated permissions > User.Read)
- Go to Authentication, click Add a platform, select Web; in the Redirect URIs field, enter http://{dss_instance_domain}/dip/api/oauth2-callback and click Configure
- Create a new Microsoft OneLake connection
- From the Fabric Warehouse parameters, in the “About” section, retrieve the value of the “SQL connection string” and use it to fill the “Host” field
- Set “Credentials mode” to “Per user”
- The “Client id” is the Application (client) ID
- Create the connection (you can’t test it yet)
Then, for each user:

- Go to the Fabric Workspace dashboard, click Manage access, and add the user with at least Contributor permissions
- In DSS, go to user profile > connection credentials
- Click the “Edit” button next to the new connection name
- Follow the instructions that appear

Important: the account you log in with must be a “Member” of the Entra ID directory. A “guest” account cannot log in.
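The redirect URI registered in the Authentication step above follows a fixed path on the DSS instance; a small helper makes the shape explicit. The domain used in the example is hypothetical.

```python
def oauth2_redirect_uri(dss_instance_domain, scheme="http"):
    """Build the OAuth2 callback URI to register on the Azure app.

    DSS serves the callback at the fixed path /dip/api/oauth2-callback.
    """
    return f"{scheme}://{dss_instance_domain}/dip/api/oauth2-callback"

# Hypothetical DSS instance domain.
uri = oauth2_redirect_uri("dss.example.com")
```

If your DSS instance is served over HTTPS, register the `https` variant instead.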