Importing tables as datasets

The “import tables as datasets” feature is available through the API for both Hive and SQL tables.

Importing SQL tables

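# client is an API client handle, e.g. a dataikuapi.DSSClient
project = client.get_project("MYPROJECT")

# Declare the SQL table to import
import_definition = project.init_tables_import()
import_definition.add_sql_table("my_sql_connection", "schema_of_the_table", "name_of_the_table")

# Step 1: DSS checks the table and prepares dataset names and target connections
prepared_import = import_definition.prepare()
# Step 2: start the import in the background
future = prepared_import.execute()

# Block until the import has finished
import_result = future.wait_for_result()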

Importing Hive tables

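# client is an API client handle, e.g. a dataikuapi.DSSClient
project = client.get_project("MYPROJECT")

# Declare the Hive table to import
import_definition = project.init_tables_import()
import_definition.add_hive_table("hive_database", "hive_table_name")

# Prepare the import, then start it in the background
prepared_import = import_definition.prepare()
future = prepared_import.execute()

# Block until the import has finished
import_result = future.wait_for_result()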
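
Importing several tables at once

Since the import definition holds a list of tables, several tables can be queued before calling prepare(). The following is a minimal sketch of that pattern, not part of the dataikuapi API: the helper name import_sql_tables and its parameters are hypothetical, and it relies only on the methods shown above.

```python
def import_sql_tables(project, connection, tables):
    """Queue several SQL tables in one import and wait for it to finish.

    Illustrative helper, not part of dataikuapi.
    `project` is a project handle as returned by client.get_project().
    `tables` is an iterable of (schema, table_name) pairs.
    """
    import_definition = project.init_tables_import()

    # Add every requested table to the same import definition
    for schema, table in tables:
        import_definition.add_sql_table(connection, schema, table)

    # Same two-step flow as for a single table: prepare, then execute
    prepared_import = import_definition.prepare()
    future = prepared_import.execute()

    # Block until DSS reports the result of the import
    return future.wait_for_result()
```

It could be called as import_sql_tables(project, "my_sql_connection", [("schema_a", "table_1"), ("schema_a", "table_2")]), where the schema and table names are placeholders.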

Reference documentation

class dataikuapi.dss.project.TablesImportDefinition(client, project_key)

Temporary structure holding the list of tables to import

add_hive_table(hive_database, hive_table)

Add a Hive table to the list of tables to import

Parameters
  • hive_database (str) – the name of the Hive database

  • hive_table (str) – the name of the Hive table

add_sql_table(connection, schema, table)

Add a SQL table to the list of tables to import

Parameters
  • connection (str) – the name of the SQL connection

  • schema (str) – the schema of the table

  • table (str) – the name of the SQL table

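add_elasticsearch_index_or_alias(connection, index_or_alias)

Add an Elasticsearch index or alias to the list of tables to import

Parameters
  • connection (str) – the name of the Elasticsearch connection

  • index_or_alias (str) – the name of the index or alias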
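
This page shows no usage example for this method. By analogy with the SQL and Hive examples, a call might look like the sketch below. The wrapper function import_elasticsearch_index is hypothetical and only chains the methods documented on this page; the connection and index names passed to it are whatever exists on your instance.

```python
def import_elasticsearch_index(project, connection, index_or_alias):
    """Import one Elasticsearch index or alias as a dataset.

    Illustrative helper, not part of dataikuapi.
    `project` is a project handle as returned by client.get_project().
    """
    import_definition = project.init_tables_import()
    import_definition.add_elasticsearch_index_or_alias(connection, index_or_alias)

    # Prepare, execute in the background, then wait for the result
    prepared_import = import_definition.prepare()
    future = prepared_import.execute()
    return future.wait_for_result()
```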

prepare()

Run the first step of the import process. In this step, DSS checks the tables whose import you have requested and prepares dataset names and target connections.

Returns

an object that allows you to finalize the import process

Return type

TablesPreparedImport

class dataikuapi.dss.project.TablesPreparedImport(client, project_key, candidates)

Result of preparing a tables import. Call execute() to run the import.

execute()

Starts executing the import in the background and returns a dataikuapi.dss.future.DSSFuture on which to wait for the result

Returns

a future to wait on the result

Return type

dataikuapi.dss.future.DSSFuture