Graph Features Recipe

The Graph features recipe computes node-level graph metrics on selected node groups and edge groups. It can write either a dataset of nodes or an enriched edge dataset.

Use this recipe for new PageRank computations. The standalone Compute PageRank recipe is deprecated and kept for compatibility.

The recipe runs on a graph database. See the documentation on graph database recipe settings and on algorithm execution and sampling.

Algorithms

The recipe can compute:

  • Degree
  • Eigenvector centrality
  • Clustering coefficient
  • Count of triangles
  • Closeness centrality
  • PageRank
  • Square clustering
  • Connected components

Count of triangles and Connected components are available only when Directed graph is disabled.
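To make the two undirected-only metrics concrete, here is a minimal pure-Python sketch of what they measure on a small undirected graph. It illustrates the definitions only and is not the recipe's implementation, which runs on the graph database. The function names and the sample edge list are hypothetical, and the sketch assumes no self-loops.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


def count_triangles(adjacency):
    """Return, for each node, the number of triangles it belongs to.

    A node is in a triangle whenever two of its neighbors are
    themselves connected to each other.
    """
    return {
        node: sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
        for node, neighbors in adjacency.items()
    }


def connected_components(adjacency):
    """Label each node with a component id using breadth-first search."""
    labels = {}
    component_id = 0
    for start in adjacency:
        if start in labels:
            continue  # already reached from an earlier start node
        labels[start] = component_id
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in labels:
                    labels[neighbor] = component_id
                    queue.append(neighbor)
        component_id += 1
    return labels


# Hypothetical sample graph: one triangle (a, b, c) with a tail (d),
# plus a separate pair (e, f).
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "f")]
adjacency = build_adjacency(edges)
triangles = count_triangles(adjacency)
components = connected_components(adjacency)
```

In this sample, nodes a, b, and c each belong to one triangle, d belongs to none, and the graph splits into two components: {a, b, c, d} and {e, f}.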

Input / Output

Input
  • Graph folder (Optional): Dataiku Folder that contains your materialized graph database. Leave it empty to run on an unmanaged Neo4j database directly.

Output
  • Output dataset: Dataset containing the computed graph features.

Settings

Node groups

Choose one or more node groups to include in the computation.

Edge groups

Select the edge groups that define the relationships to consider.

Directed graph

Enable this option to treat relationships as directed. When it is enabled, Count of triangles and Connected components are hidden because they support only undirected graphs.

Output type

Choose Dataset of nodes to write one row per node, or Dataset of edges to keep an edge dataset enriched with graph feature values for both endpoints.
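The difference between the two output types can be sketched in plain Python, using degree as the only feature. This is an illustration under assumptions: the column names (`node_id`, `source`, `target`, `degree`, `source_degree`, `target_degree`) are hypothetical and may not match the columns the recipe actually writes.

```python
def degree_by_node(edges):
    """Count how many edges touch each node (undirected degree)."""
    degree = {}
    for source, target in edges:
        degree[source] = degree.get(source, 0) + 1
        degree[target] = degree.get(target, 0) + 1
    return degree


def node_rows(features):
    """Dataset of nodes: one row per node with its feature value."""
    return [
        {"node_id": node, "degree": value}
        for node, value in sorted(features.items())
    ]


def edge_rows(edges, features):
    """Dataset of edges: one row per edge, enriched for both endpoints."""
    return [
        {
            "source": source,
            "target": target,
            "source_degree": features[source],
            "target_degree": features[target],
        }
        for source, target in edges
    ]


# Hypothetical sample edges.
edges = [("a", "b"), ("b", "c")]
features = degree_by_node(edges)
nodes_output = node_rows(features)
edges_output = edge_rows(edges, features)
```

The node output has three rows (one per node), while the edge output keeps the two original edges and carries the feature values of both endpoints on each row.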

Graph features algorithms

Use Select all to compute all algorithms supported by the current graph settings, or select individual algorithms.

Advanced parameters

  • Batch size: Number of result rows processed and written at a time.
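As a rough illustration of what a batch size controls, the sketch below groups a stream of result rows into fixed-size chunks before they would be written. It is a generic batching pattern, not the recipe's code, and `iter_batches` is a hypothetical helper name.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size rows from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # source exhausted
        yield batch


# Seven hypothetical result rows processed three at a time.
batches = list(iter_batches(range(7), 3))
```

A larger batch size means fewer, bigger writes and more rows held in memory at once; a smaller one does the opposite.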