Agent Interaction Logging

Overview

Agent interaction logging stores each interaction with an agent (or a retrieval-augmented model) as a record in a dataset. These records can then be reused for oversight, analytics, debugging, and evaluation.

A logged interaction includes:

The user input sent to the agent
The final answer returned by the agent
The list of tool calls performed by the agent
Metadata about the interaction, such as the timestamp, the user, or the selected agent ID
Additional technical payloads, such as traces and trajectories
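
As a rough illustration of the list above, one logged interaction can be pictured as a single record with one field per item. This is only a sketch: the class name, the field names, and the sample values below are hypothetical and do not reflect the actual column names of the logging dataset.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List


@dataclass
class LoggedInteraction:
    """Illustrative shape of one logged interaction (all names are hypothetical)."""

    user_input: str                      # the user input sent to the agent
    final_answer: str                    # the final answer returned by the agent
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)
    technical_payload: Dict[str, Any] = field(default_factory=dict)


# Made-up sample values, for illustration only.
example = LoggedInteraction(
    user_input="What is the refund policy?",
    final_answer="Refunds are accepted within 30 days.",
    tool_calls=[{"name": "search_docs", "arguments": {"query": "refund policy"}}],
    metadata={"timestamp": "2024-01-01T12:00:00Z", "user": "alice", "agent_id": "support_agent"},
    technical_payload={"trace": {}, "trajectory": []},
)

# One record corresponds to one row of the dataset.
row = asdict(example)
```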

Note: This section covers the setup and usage of interaction logging for an agent; the same procedure applies to retrieval-augmented models.

Using interaction logs for Agent Evaluation

The Evaluate Agent recipe expects a dataset containing records of agent interactions. An interaction logging dataset is therefore a common upstream input for agent evaluation.

To compute additional metrics, you can enrich the logged interactions with evaluation-specific reference data, for example:

A ground truth answer
A reference list of expected tool calls
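
The enrichment described above amounts to joining logged interactions with a reference table. The sketch below shows one way this could look in plain Python. The function name and the column names (`question`, `ground_truth`, `expected_tool_calls`) are assumptions made for the example, not names required by the Evaluate Agent recipe.

```python
from typing import Any, Dict, List


def enrich_with_references(
    logs: List[Dict[str, Any]],
    references: List[Dict[str, Any]],
    key: str = "question",  # hypothetical join column shared by both inputs
) -> List[Dict[str, Any]]:
    """Return copies of the logged records with reference columns added.

    Records without a matching reference get None in the added columns.
    """
    by_key = {ref[key]: ref for ref in references}
    enriched = []
    for record in logs:
        ref = by_key.get(record[key], {})
        merged = dict(record)  # copy so the input records stay untouched
        merged["ground_truth"] = ref.get("ground_truth")
        merged["expected_tool_calls"] = ref.get("expected_tool_calls")
        enriched.append(merged)
    return enriched


# Made-up sample data, for illustration only.
logs = [
    {"question": "What is the refund policy?", "answer": "30 days."},
    {"question": "Who is the CEO?", "answer": "I don't know."},
]
references = [
    {
        "question": "What is the refund policy?",
        "ground_truth": "Refunds are accepted within 30 days.",
        "expected_tool_calls": ["search_docs"],
    }
]
enriched = enrich_with_references(logs, references)
```

In a project, the same join would more likely be done with a join recipe or a dataframe merge; the point here is only the shape of the enriched record.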

How to set up interaction logging

Interaction logging is configured in the settings of an agent version. You can edit these settings either from the browser or through the Python public API, using dataikuapi.dss.agent.DSSAgentVersionSettings.interaction_logging_selection. To reuse the same interaction logging settings for all agents in a project, define common interaction logging settings at the project level, then select the Inherit mode in the settings of each agent version.

The following options are available.

Agent logging mode

Three modes are available:

Disable interaction logs storage: No interaction logs are stored.
Inherit configuration: Reuses the logging configuration inherited from the project settings.
Define custom configuration: Lets you define an explicit logging dataset and storage behavior.
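
The relationship between the three modes can be sketched as a small resolution function. This is a conceptual model only, not the Dataiku API: the mode constants, the dictionary keys, and the behavior when no project-level configuration exists (returning None here) are all assumptions made for the example.

```python
from typing import Any, Dict, Optional

# Hypothetical identifiers for the three modes described above.
DISABLED = "DISABLED"
INHERIT = "INHERIT"
CUSTOM = "CUSTOM"


def resolve_logging_config(
    agent_settings: Dict[str, Any],
    project_settings: Optional[Dict[str, Any]],
) -> Optional[Dict[str, Any]]:
    """Return the effective logging configuration, or None if nothing is logged."""
    mode = agent_settings["mode"]
    if mode == DISABLED:
        # No interaction logs are stored.
        return None
    if mode == INHERIT:
        # Reuse the project-level configuration (assumed None if undefined).
        return project_settings
    if mode == CUSTOM:
        config = agent_settings.get("config")
        # The output dataset is mandatory for an explicit configuration.
        if not config or not config.get("output_dataset"):
            raise ValueError("A custom configuration requires an output dataset")
        return config
    raise ValueError(f"Unknown logging mode: {mode!r}")


project_level = {"output_dataset": "project_agent_logs"}
inherited = resolve_logging_config({"mode": INHERIT}, project_level)
custom = resolve_logging_config(
    {"mode": CUSTOM, "config": {"output_dataset": "my_agent_logs"}}, project_level
)
disabled = resolve_logging_config({"mode": DISABLED}, project_level)
```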

Output dataset

The Output dataset is the dataset where interaction logs are written. It is mandatory. You can create a new dataset or reuse an existing one. The dataset schema and partitioning must be compatible with interaction logging.

Flushing

To avoid performance issues, interaction logs are not written synchronously to the dataset. Instead, data is buffered and written periodically according to:

the Flush interval, in seconds
the Flush size, in bytes
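
To make the buffering behavior concrete, here is a minimal sketch of a buffer driven by a flush interval and a flush size. It is not the actual implementation: the class and parameter names are hypothetical, and it assumes that reaching either threshold triggers a write, which the description above does not state explicitly.

```python
import time
from typing import Callable, List


class InteractionLogBuffer:
    """Buffers serialized records and writes them in batches (illustrative only)."""

    def __init__(
        self,
        write_batch: Callable[[List[str]], None],  # hypothetical sink, e.g. a dataset writer
        flush_interval_s: float,
        flush_size_bytes: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._write_batch = write_batch
        self._flush_interval_s = flush_interval_s
        self._flush_size_bytes = flush_size_bytes
        self._clock = clock
        self._pending: List[str] = []
        self._pending_bytes = 0
        self._last_flush = clock()

    def add(self, record: str) -> None:
        """Buffer one record, then flush if a threshold has been reached."""
        self._pending.append(record)
        self._pending_bytes += len(record.encode("utf-8"))
        # Simplification: the interval is only checked when a record arrives.
        # A real implementation would likely also flush from a background timer.
        elapsed = self._clock() - self._last_flush
        if self._pending_bytes >= self._flush_size_bytes or elapsed >= self._flush_interval_s:
            self.flush()

    def flush(self) -> None:
        """Write all pending records as one batch and reset the buffer."""
        if self._pending:
            self._write_batch(list(self._pending))
        self._pending = []
        self._pending_bytes = 0
        self._last_flush = self._clock()
```

With this model, a larger flush size or interval means fewer, bigger writes, at the cost of a longer delay before an interaction becomes visible in the dataset.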

Content mode

The content mode controls how much technical payload is stored in the logs.

Full: Writes the full raw response JSON in the corresponding column.
No logs: Writes the raw response JSON in the corresponding column, excluding the log field.
No trace and no logs: Writes the raw response JSON in the corresponding column, excluding both the log and trace fields. In this mode, the dku_trace column is also empty.
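
The effect of the three content modes on the raw response can be sketched as a filtering step. The `log` and `trace` field names come from the descriptions above; the mode constants, the function name, and the assumption that both fields sit at the top level of the response JSON are hypothetical.

```python
import copy
from typing import Any, Dict

# Hypothetical identifiers for the three content modes.
FULL = "FULL"
NO_LOGS = "NO_LOGS"
NO_TRACE_NO_LOGS = "NO_TRACE_NO_LOGS"


def filter_raw_response(raw_response: Dict[str, Any], content_mode: str) -> Dict[str, Any]:
    """Return a copy of the raw response with the fields excluded by the mode."""
    filtered = copy.deepcopy(raw_response)  # never mutate the caller's response
    if content_mode == FULL:
        return filtered
    if content_mode == NO_LOGS:
        filtered.pop("log", None)
        return filtered
    if content_mode == NO_TRACE_NO_LOGS:
        filtered.pop("log", None)
        filtered.pop("trace", None)
        return filtered
    raise ValueError(f"Unknown content mode: {content_mode!r}")


# Made-up response, for illustration only.
raw = {"text": "Refunds are accepted within 30 days.", "log": ["step 1"], "trace": {"spans": []}}
full = filter_raw_response(raw, FULL)
no_logs = filter_raw_response(raw, NO_LOGS)
minimal = filter_raw_response(raw, NO_TRACE_NO_LOGS)
```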
