The first task when using Data Science Studio is to define datasets that connect to your data sources.
A dataset is a series of records that share the same schema. It is analogous to a table in SQL.
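To make the table analogy concrete, here is a minimal sketch in plain Python. It does not use the Data Science Studio API; the `Schema` alias, the `conforms` helper, and the sample customer records are all hypothetical and exist only to illustrate "records with the same schema".

```python
from typing import Any

# Hypothetical schema representation: column name -> expected Python type.
Schema = dict[str, type]


def conforms(record: dict[str, Any], schema: Schema) -> bool:
    """Return True if the record has exactly the schema's columns,
    each holding a value of the expected type."""
    return record.keys() == schema.keys() and all(
        isinstance(record[name], col_type) for name, col_type in schema.items()
    )


# Made-up example data: every record carries the same three columns,
# much like rows in a SQL table.
customers_schema: Schema = {"id": int, "name": str, "country": str}

customers = [
    {"id": 1, "name": "Ada", "country": "UK"},
    {"id": 2, "name": "Linus", "country": "FI"},
]

# All records share the schema, so together they form a valid "dataset".
all_conform = all(conforms(r, customers_schema) for r in customers)

# A record missing a column does not fit the schema.
bad = conforms({"id": 3, "name": "Grace"}, customers_schema)
```

In this sketch, `all_conform` is true because both records match the schema, while `bad` is false because the third record lacks the `country` column.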
For a broader overview of the different kinds of datasets, see the Concepts page.