Graph Clustering Recipe
The Graph clustering recipe computes community assignments on selected node groups and edge groups. It can write either a dataset of nodes or an enriched edge dataset.
The recipe runs on a graph database. For details, see the documentation on graph database recipe settings and on algorithm execution and sampling.
Algorithms
The recipe can compute:
- Fastgreedy
- Multilevel
- Infomap
- Walktrap
Fastgreedy and Multilevel are available only when Directed graph is disabled.
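The availability rule above can be sketched as a small lookup. This is only an illustration of the rule, not the recipe's implementation; the names `ALGORITHMS` and `available_algorithms` are hypothetical.

```python
from __future__ import annotations

# Hypothetical table: algorithm name -> whether it stays selectable when
# "Directed graph" is enabled. The values restate the rule above: Fastgreedy
# and Multilevel are available only when "Directed graph" is disabled.
ALGORITHMS = {
    "Fastgreedy": False,
    "Multilevel": False,
    "Infomap": True,
    "Walktrap": True,
}


def available_algorithms(directed: bool) -> list[str]:
    """Return the algorithm names selectable for the given graph setting."""
    if not directed:
        # Undirected graphs: every algorithm is offered.
        return list(ALGORITHMS)
    # Directed graphs: keep only the algorithms flagged as still available.
    return [name for name, when_directed in ALGORITHMS.items() if when_directed]
```

Calling `available_algorithms(True)` keeps only Infomap and Walktrap, which mirrors what Select all would cover when Directed graph is enabled.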
Input / Output
- Input
Graph folder (Optional): Dataiku Folder that contains your materialized graph database. Leave it empty to run on an unmanaged Neo4j database directly.
- Output
Output dataset: Dataset containing the computed community assignments.
Settings
Node groups
Choose one or more node groups to include in the computation.
Edge groups
Select the edge groups that define the relationships to consider.
Directed graph
Enable this option to treat relationships as directed. Fastgreedy and Multilevel are hidden when this option is enabled because they support only undirected graphs.
Weight property
Optionally select a numeric edge property to use as the relationship weight for clustering. The selected property must exist on all selected edge groups.
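The requirement that the weight property exists on every selected edge group can be checked ahead of time. The sketch below assumes you already have a mapping from edge group names to their property names; the function name, the mapping, and the example group names are all hypothetical, and the check does not verify that the property is numeric.

```python
from __future__ import annotations


def edge_groups_missing_property(
    edge_group_properties: dict[str, set[str]],
    selected_groups: list[str],
    weight_property: str,
) -> list[str]:
    """Return the selected edge groups that do not define weight_property."""
    return [
        group
        for group in selected_groups
        # An unknown group is treated as having no properties at all.
        if weight_property not in edge_group_properties.get(group, set())
    ]


# Hypothetical example data: edge group name -> property names it defines.
properties = {
    "FOLLOWS": {"since", "strength"},
    "MENTIONS": {"count"},
}
missing = edge_groups_missing_property(
    properties, ["FOLLOWS", "MENTIONS"], "strength"
)
```

Here `missing` lists the groups that lack `strength`, so that property would not be a valid weight for this selection.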
Output type
Choose Dataset of nodes to write one row per node, or Dataset of edges to keep an edge dataset enriched with community assignments for both endpoints.
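To make the difference between the two output types concrete, the sketch below builds both shapes from the same toy result. The column names (`node_id`, `community`, `source_community`, and so on) are assumptions chosen for illustration; the actual output schema, including how results from several algorithms are laid out, is not specified here.

```python
from __future__ import annotations

# Hypothetical inputs: one community id per node, and (source, target) edges.
communities = {"a": 0, "b": 0, "c": 1}
edges = [("a", "b"), ("b", "c")]


def node_rows(communities: dict[str, int]) -> list[dict]:
    """Build one row per node, as in the 'Dataset of nodes' output type."""
    return [
        {"node_id": node, "community": community}
        for node, community in communities.items()
    ]


def edge_rows(
    edges: list[tuple[str, str]], communities: dict[str, int]
) -> list[dict]:
    """Build one row per edge, enriched with both endpoints' communities."""
    return [
        {
            "source": source,
            "target": target,
            "source_community": communities[source],
            "target_community": communities[target],
        }
        for source, target in edges
    ]
```

In the edge shape, an edge whose two community columns differ (such as `b` to `c` in this toy data) connects two different communities.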
Clustering algorithms
Use Select all to compute all algorithms supported by the current graph settings, or select individual algorithms.
Advanced parameters
Batch size: Number of result rows processed and written at a time.
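As a rough illustration of what a batch size controls, the sketch below groups a stream of result rows into fixed-size chunks before they would be written. It is a generic batching pattern under assumed names (`batched`, `rows`), not the recipe's actual writer.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(rows: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        # Pull up to batch_size rows; an empty list means the input is exhausted.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical result rows, produced lazily.
rows = ({"node_id": i, "community": i % 2} for i in range(5))
batch_lengths = [len(batch) for batch in batched(rows, batch_size=2)]
```

With five rows and a batch size of 2, the rows are handled in chunks of 2, 2, and 1. A larger batch size means fewer, larger writes, at the cost of holding more rows in memory at once.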