Fold multiple columns by pattern

This processor takes values from multiple columns and transforms them into one line per column. It is a variant of the Fold multiple columns processor in which the columns to fold are selected by a pattern instead of a list. If the pattern contains a capture group, the captured portion of the column name is used in the output instead of the full column name.

The processor only creates lines for non-empty values: when a cell in a folded column is empty, no line is produced for it.

For example, using “tag_(.*)” as the pattern for the columns to fold, the following dataset:

name     n_connection  tag_1    tag_2   tag_3
Florian  16570         bigdata  python  puns

is transformed into:

name     n_connection  tag      rank
Florian  16570         bigdata  1
Florian  16570         python   2
Florian  16570         puns     3

With capture groups

As another example, consider the following dataset of quarterly scores:

person  age  Q1_score  Q2_score  Q3_score
John    24   3         4         6
Sidney  31             6         9
Bill    33   1                   4

Applying the “Fold multiple columns by pattern” processor with the pattern “.*_score” generates the following result. No line is created for the empty cells (Sidney’s Q1_score and Bill’s Q2_score):

person  age  quarter   score
John    24   Q1_score  3
John    24   Q2_score  4
John    24   Q3_score  6
Sidney  31   Q2_score  6
Sidney  31   Q3_score  9
Bill    33   Q1_score  1
Bill    33   Q3_score  4
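
The folding shown above can be approximated in a few lines of Python. This is only an illustrative sketch, not the processor's actual implementation: the function name fold_by_pattern and its parameters are hypothetical. It also assumes that the pattern must match the whole column name and that None or an empty string counts as an empty value.

```python
import re

def fold_by_pattern(rows, pattern, name_col, value_col):
    """Fold columns whose names match `pattern` into (name, value) lines.

    Assumptions of this sketch: the pattern must match the whole column
    name, and None or "" counts as an empty value.
    """
    regex = re.compile(pattern)
    folded = []
    for row in rows:
        # Columns that do not match are copied unchanged onto every output line.
        kept = {c: v for c, v in row.items() if not regex.fullmatch(c)}
        for column, value in row.items():
            if not regex.fullmatch(column):
                continue
            if value is None or value == "":
                continue  # no line is created for an empty value
            folded.append({**kept, name_col: column, value_col: value})
    return folded

scores = [
    {"person": "John", "age": 24, "Q1_score": 3, "Q2_score": 4, "Q3_score": 6},
    {"person": "Sidney", "age": 31, "Q1_score": None, "Q2_score": 6, "Q3_score": 9},
    {"person": "Bill", "age": 33, "Q1_score": 1, "Q2_score": None, "Q3_score": 4},
]

result = fold_by_pattern(scores, r".*_score", "quarter", "score")
for line in result:
    print(line["person"], line["age"], line["quarter"], line["score"])
```

Run on the quarterly scores dataset, the sketch yields seven lines, one per non-empty score, with the full column name in the quarter column.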

When the pattern contains a capture group, the captured portion of the folded column’s name is used in place of the full name. On the same dataset, using the pattern “(.*)_score” would produce:

person  age  quarter  score
John    24   Q1       3
John    24   Q2       4
John    24   Q3       6
Sidney  31   Q2       6
Sidney  31   Q3       9
Bill    33   Q1       1
Bill    33   Q3       4
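
The effect of the capture group on the output name can be illustrated with Python's re module. This is a hedged sketch: fold_name is a hypothetical helper, and it assumes that the pattern is matched against the whole column name and that only the first capture group is used.

```python
import re

def fold_name(column, pattern):
    """Return the name used for a folded column, or None if it is not folded."""
    match = re.fullmatch(pattern, column)
    if match is None:
        return None
    # With a capture group, use the captured text; otherwise the full name.
    return match.group(1) if match.re.groups >= 1 else match.group(0)

columns = ["person", "age", "Q1_score", "Q2_score", "Q3_score"]
with_group = {c: fold_name(c, r"(.*)_score") for c in columns}
without_group = {c: fold_name(c, r".*_score") for c in columns}
print(with_group)
print(without_group)
```

Under these assumptions, Q1_score maps to Q1 with the pattern “(.*)_score” and stays Q1_score with “.*_score”, while person and age are not folded in either case.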

For more details on reshaping, please see Reshaping.