Split and unfold

This processor splits the values of a column on a separator and turns each distinct resulting value into its own binary column. This operation is also called ‘dummification’.

You can prefix the names of the new columns by filling in the “Prefix” option.

You can choose the maximum number of columns to create with the “Max nb. columns to create” option.

For example, with the following dataset:

customer_id    events
1              login, product, buy
2              login, product, logout

We get:

customer_id    events_login    events_product    events_buy    events_logout
1              1               1                 1
2              1               1                               1

The unfolded column is deleted.
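The transformation in the example can be sketched in plain Python. This is only an illustration of the behavior described here, not the processor's actual implementation: the function name `split_and_unfold` and its parameters are hypothetical, pieces are assumed to be trimmed of surrounding whitespace, an empty cell is represented by `None`, and the “Max nb. columns to create” option is not modeled.

```python
def split_and_unfold(rows, column, separator=",", prefix=None):
    """Hypothetical sketch: replace `column` with one indicator column per distinct value."""
    # Assumption: new columns are named "<prefix>_<value>", defaulting to the column name.
    prefix = column if prefix is None else prefix

    # Split each cell on the separator; trimming whitespace is an assumption.
    split_values = [
        [part.strip() for part in row[column].split(separator) if part.strip()]
        for row in rows
    ]

    # Collect the distinct values in order of first appearance.
    seen = []
    for parts in split_values:
        for part in parts:
            if part not in seen:
                seen.append(part)

    result = []
    for row, parts in zip(rows, split_values):
        # Drop the unfolded column, keep every other column unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        for value in seen:
            # 1 when the value is present in the row, None (empty cell) otherwise.
            new_row[f"{prefix}_{value}"] = 1 if value in parts else None
        result.append(new_row)
    return result


rows = [
    {"customer_id": 1, "events": "login, product, buy"},
    {"customer_id": 2, "events": "login, product, logout"},
]
unfolded = split_and_unfold(rows, "events")
for row in unfolded:
    print(row)
```

Each printed row contains `customer_id` and the four `events_*` columns, with the original `events` column removed.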

Warning: Limitations

The limitations that apply to the Unfold processor also apply to the Split and Unfold processor.

For more details on reshaping, please see Reshaping.