Google Drive

Dataiku can interact with Google Drive to read and write files stored on shared folders.

Note

This capability is provided by the “google-drive” plugin, which you need to install. Please see Installing plugins.

This plugin is Not supported

Create a Google Drive preset

Dataiku supports connecting to Google Drive using a Service Account or OAuth2.

With service account credentials, Dataiku will be able to access all resources associated with this service account, independently of the user initiating the preset.

OAuth2 preset access means Dataiku will use the OAuth2 protocol to access the resources in Google Drive. Dataiku will be registered as an OAuth2 client, authorized to request and gain access on behalf of your Dataiku users.

Use a service account if:

  • your Dataiku users don’t have direct access to the resources in Google Drive

  • you don’t need resources access filtering per user

Use OAuth2 if:

  • your Dataiku users have access to your Google Drive resources

  • you do not want your users to access resources via Dataiku in Google Drive which they do not have permission for

Using Service Account

Before connecting to Google Cloud via Dataiku you will have to:

Configure your Service Account preset :

  • navigate to the Google Drive plugin page > Settings > Google Drive Token > + Add preset

  • parse the entire content of your service account private key in JSON format in the Access credentials box

Using OAuth2

The OAuth2 connection is performed using per-user credentials. Each user must grant Dataiku permission to access Google Drive on their behalf. You will first need to create an OAuth2 client in your Google Drive project and configure the credentials in your Dataiku Google Drive preset.

To create an OAuth 2.0 client ID in the console, please refer to the following documentation . When creating your OAuth2 client in google, you will need to:

  • make sure “Google Drive API” is enabled in Google Cloud console’s API Manager

  • select the application type Web application

  • add the following redirect URI DSS_BASE_URL/dip/api/oauth2-callback

Note

For example if Dataiku is accessed at https://dataiku.mycompany.corp/, the OAuth2 redirect URL is https://dataiku.mycompany.corp/dip/api/oauth2-callback

Once created, configure Dataiku to use this OAuth2 client. In the Google Drive plugin page > Settings > Google Single Sign On, do the following:

  • create a new Google Single Sign On preset

  • fill the “Client id”, “Client secret” with the information from your OAuth app

Note

At this point, although the preset is operational, you can’t test it yet as your user hasn’t authorized Dataiku to access Google Drive on their behalf.

Each user, including you, will need to follow these steps to allow Dataiku to access Google Drive on their behalf:

  • go to user Profile and settings > Credentials

  • the user will see that no authorization was given yet to Dataiku for this preset

  • click the “Edit” button next to the new preset name

  • follow the instructions that appear: Google will authenticate and get the user consent to authorize Dataiku to access Google Drive

  • the user will be redirected automatically to Dataiku and will notice that credentials have successfully been obtained for the preset

Usage

Read a dataset

  • If you are using a service account, you must first share the Google Drive document or directory with the service account email address you took note of

  • Take note of the directory ID, which is the section of the directory’s URL after the last slash character

  • Create a new dataset in Dataiku DSS by selecting Dataset > Googledrive > Googledrive

  • Pick your preset for the connection, and in ID of root directory, paste the previously copied directory ID

Write data into Google Drive

  • In your DSS project flow, select Dataset > Folder

  • Choose Google drive in the Store into parameter and click Create. An External code failed error may appear because you have not selected your preset yet.

  • In the Settings tab of the folder, select the authentication type, your preset, add your Drive directory ID, and save. You can now export your dataset into your newly created Google Drive managed folder.

Opening multisheets GoogleSheet documents

By default the plugin transmits GoogleSheet documents to Dataiku as CSV files. In cases where a Google Sheet document contains several sheets, the plugin can be set to return datasets in the Excel format instead. To do so, in your newly created dataset or folder Settings tab, activate the read parameter Convert Google Sheets to .xlsx. In some instances, DSS might have difficulty recognizing the file format. Go in the Format/Preview tab. The Type should be set to Excel and XLSX format should be on.