Running on Hadoop
When both the input and output datasets of a data preparation recipe are supported HDFS datasets, the recipe can run entirely on Hadoop, as a MapReduce job.
To enable this behavior, go to the Settings / Build tab of the data preparation recipe and check “Run on Hadoop”. You do not need to fill in the “Split size” parameter.
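The eligibility rule described above (both the input and the output must be supported HDFS datasets) can be illustrated with a minimal sketch. This is not part of the product's API: the `Dataset` class, its `is_hdfs` and `is_supported` fields, and the `can_run_on_hadoop` function are hypothetical names introduced only to show the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for a dataset; not a real product class."""
    name: str
    is_hdfs: bool        # assumption: True when the dataset is stored on HDFS
    is_supported: bool   # assumption: True when its format is supported for Hadoop runs


def can_run_on_hadoop(input_ds: Dataset, output_ds: Dataset) -> bool:
    """Return True only if BOTH datasets are supported HDFS datasets."""
    return all(ds.is_hdfs and ds.is_supported for ds in (input_ds, output_ds))


# Example datasets (illustrative values only)
hdfs_in = Dataset("raw_logs", is_hdfs=True, is_supported=True)
hdfs_out = Dataset("clean_logs", is_hdfs=True, is_supported=True)
sql_out = Dataset("clean_logs_sql", is_hdfs=False, is_supported=True)

eligible = can_run_on_hadoop(hdfs_in, hdfs_out)
not_eligible = can_run_on_hadoop(hdfs_in, sql_out)
```

In this sketch the recipe qualifies only when the check passes for both ends; a single non-HDFS or unsupported dataset is enough to rule out the “Run on Hadoop” option.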