Setting up Hadoop and Spark integration
Data Science Studio can connect to a Hadoop cluster to:
Read and write HDFS datasets
Run Hive queries and scripts
Run Impala queries
Run Pig scripts
Run preparation recipes on Hadoop
In addition, if you set up Spark integration, you can:
Run SparkSQL queries
Run preparation, join, stack and group recipes on Spark
Run PySpark & SparkR scripts
Train & use Spark MLlib models
See Setting up Hadoop integration and Setting up Spark integration.