Running on Hadoop

When both the input and output datasets of a Data Preparation recipe are supported HDFS datasets, the recipe can run entirely on Hadoop, as a MapReduce job.
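
The eligibility condition above can be pictured with a minimal sketch. This is only an illustration, not the product's API: the `Dataset` class, its `storage` field, the `"HDFS"` label, and the `can_run_on_hadoop` helper are all hypothetical names, and the sketch does not model which HDFS formats count as "supported".

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset; not a real product class."""
    name: str
    storage: str  # assumed label for the storage backend, e.g. "HDFS"


def can_run_on_hadoop(input_ds: Dataset, output_ds: Dataset) -> bool:
    """Return True only when both the input and the output are HDFS datasets.

    Assumption: "supported" is reduced here to a simple storage check.
    """
    return input_ds.storage == "HDFS" and output_ds.storage == "HDFS"


# Both sides on HDFS: the recipe is a candidate for running on Hadoop.
logs = Dataset("raw_logs", "HDFS")
cleaned = Dataset("cleaned_logs", "HDFS")
hdfs_to_hdfs = can_run_on_hadoop(logs, cleaned)

# Output stored elsewhere: the condition is not met.
sql_out = Dataset("cleaned_logs_sql", "SQL")
hdfs_to_sql = can_run_on_hadoop(logs, sql_out)
```

In this sketch a single non-HDFS dataset on either side is enough to fail the check, which mirrors the "both input and output" wording above.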

To enable this behavior, go to the Settings / Build tab of the Data Preparation recipe and check “Run on Hadoop”. You do not need to fill in the “Split size” parameter.