Agent Interaction Logging

Overview

Agent interaction logging stores each interaction with an agent as a record in a dataset. These records can then be reused for oversight, analytics, debugging, and evaluation.

A logged interaction includes:

  • The user input sent to the agent

  • The final answer returned by the agent

  • The list of tool calls performed by the agent

  • Metadata about the interaction, such as timestamps, the user, or the selected agent ID

  • Additional technical payloads, such as traces and trajectories
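As an illustration, the contents listed above can be pictured as a simple record type. This is a minimal sketch only: the class and field names (LoggedInteraction, ToolCall, user_input, and so on) are hypothetical and do not reflect the actual column names of the logging dataset.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional


@dataclass
class ToolCall:
    # Hypothetical shape of one tool call made by the agent.
    name: str
    arguments: Dict[str, Any]


@dataclass
class LoggedInteraction:
    # Hypothetical field names; the real dataset schema may differ.
    user_input: str
    final_answer: str
    tool_calls: List[ToolCall] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)
    trace: Optional[Dict[str, Any]] = None  # additional technical payload


def to_row(interaction: LoggedInteraction) -> Dict[str, Any]:
    """Flatten an interaction into a plain dict, one key per column."""
    return asdict(interaction)


example = LoggedInteraction(
    user_input="What is the status of order 1042?",
    final_answer="Order 1042 has shipped.",
    tool_calls=[ToolCall(name="lookup_order", arguments={"order_id": 1042})],
    metadata={"user": "alice", "agent_id": "support-agent"},
)
example_row = to_row(example)
```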

Using interaction logs for agent evaluation

The Evaluate Agent recipe expects a dataset containing records of agent interactions. An interaction logging dataset is therefore a common upstream input for agent evaluation.

To compute additional metrics, you can enrich the logged interactions with evaluation-specific reference data, for example:

  • A ground truth answer

  • A reference list of expected tool calls
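One way to picture this enrichment is a join between the logged interactions and a separately maintained reference table. The sketch below is an assumption-laden illustration: the join key (user_input) and the column names (ground_truth, expected_tool_calls) are hypothetical, and in practice the join could equally be done with a visual recipe.

```python
from typing import Any, Dict, List


def enrich_with_references(
    interactions: List[Dict[str, Any]],
    references: List[Dict[str, Any]],
    key: str = "user_input",  # hypothetical join key
) -> List[Dict[str, Any]]:
    """Attach reference data to each logged interaction, matched on `key`."""
    by_key = {ref[key]: ref for ref in references}
    enriched = []
    for row in interactions:
        ref = by_key.get(row[key], {})
        enriched.append({
            **row,
            # Left as None when no reference exists for this interaction.
            "ground_truth": ref.get("ground_truth"),
            "expected_tool_calls": ref.get("expected_tool_calls"),
        })
    return enriched


logged = [
    {"user_input": "Capital of France?", "final_answer": "Paris"},
    {"user_input": "Unreferenced question", "final_answer": "Some answer"},
]
reference = [
    {
        "user_input": "Capital of France?",
        "ground_truth": "Paris",
        "expected_tool_calls": ["search"],
    },
]
enriched_rows = enrich_with_references(logged, reference)
```

Rows without a matching reference keep empty reference columns, so they can still be evaluated with metrics that do not need reference data.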

How to set up interaction logging

Interaction logging is configured in the settings of an agent version. You can edit these settings either from the browser or through the Python public API, using dataikuapi.dss.agent.DSSAgentVersionSettings.interaction_logging_selection.

To reuse the same interaction logging settings for all agents in a project, define common interaction logging settings at the project level, then select the Inherit configuration mode in the settings of each agent version.

The following options are available.

Agent logging mode

Three modes are available:

  • Disable interaction logs storage: No interaction logs are stored.

  • Inherit configuration: Reuses the logging configuration defined in the project settings.

  • Define custom configuration: Lets you define an explicit logging dataset and storage behavior.
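The relationship between the three modes can be sketched as a small resolution function. This is only a conceptual illustration: LoggingMode, resolve_logging_config, and the dict-based configurations are hypothetical names, not part of the Dataiku API.

```python
from enum import Enum
from typing import Any, Dict, Optional


class LoggingMode(Enum):
    # Hypothetical identifiers for the three modes described above.
    DISABLED = "disabled"
    INHERIT = "inherit"
    CUSTOM = "custom"


def resolve_logging_config(
    mode: LoggingMode,
    custom_config: Optional[Dict[str, Any]],
    project_config: Optional[Dict[str, Any]],
) -> Optional[Dict[str, Any]]:
    """Return the configuration that applies, or None when nothing is stored."""
    if mode is LoggingMode.DISABLED:
        return None
    if mode is LoggingMode.INHERIT:
        # The agent version defers entirely to the project-level settings.
        return project_config
    return custom_config


project_level = {"output_dataset": "project_agent_logs"}
agent_level = {"output_dataset": "my_agent_logs"}
```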

Output dataset

The Output dataset is the dataset where interaction logs are written. It is mandatory. You can create a new dataset or reuse an existing one. The dataset schema and partitioning must be compatible with interaction logging.

Flushing

To avoid performance issues, interaction logs are not written synchronously to the dataset. Instead, records are buffered and written periodically, according to two settings:

  • the Flush interval, in seconds

  • the Flush size, in bytes
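The buffering behavior can be illustrated with a generic sketch. It assumes that a flush happens as soon as either threshold is reached, which is a common design but is not confirmed here. BufferedLogWriter and its parameters are hypothetical, and this simplified version only checks the thresholds when a record is appended, whereas a real implementation would typically also flush from a background timer.

```python
import json
import time
from typing import Any, Callable, Dict, List


class BufferedLogWriter:
    """Buffers records and hands them to `sink` in batches (illustrative only)."""

    def __init__(
        self,
        sink: Callable[[List[str]], None],
        flush_interval_s: float,
        flush_size_bytes: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sink = sink
        self._interval = flush_interval_s
        self._max_bytes = flush_size_bytes
        self._clock = clock
        self._buffer: List[str] = []
        self._buffered_bytes = 0
        self._last_flush = clock()

    def append(self, record: Dict[str, Any]) -> None:
        line = json.dumps(record)
        self._buffer.append(line)
        self._buffered_bytes += len(line.encode("utf-8"))
        # Assumption: either threshold being reached triggers a flush.
        too_big = self._buffered_bytes >= self._max_bytes
        too_old = self._clock() - self._last_flush >= self._interval
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._sink(list(self._buffer))
            self._buffer.clear()
            self._buffered_bytes = 0
        self._last_flush = self._clock()
```

A larger interval or size reduces the number of writes at the cost of logs appearing in the dataset later.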

Content mode

The Content mode controls how much technical payload is stored in the logs.

  • Full: Writes the full raw LLM response JSON in the corresponding column.

  • No logs: Writes the raw LLM response JSON in the corresponding column, excluding the log field.

  • No trace and no logs: Writes the raw LLM response JSON in the corresponding column, excluding both the log and trace fields. In this mode, the dku_trace column is also empty.
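The effect of each content mode on the stored response can be sketched as a filter over the response JSON. This illustration assumes that log and trace are top-level keys of the response object. The mode identifiers and the apply_content_mode function are hypothetical.

```python
from typing import Any, Dict, Tuple

# Hypothetical mode identifiers mapped to the fields each mode leaves out.
EXCLUDED_FIELDS: Dict[str, Tuple[str, ...]] = {
    "full": (),
    "no_logs": ("log",),
    "no_trace_no_logs": ("log", "trace"),
}


def apply_content_mode(response: Dict[str, Any], mode: str) -> Dict[str, Any]:
    """Return a copy of `response` without the fields excluded by `mode`."""
    excluded = EXCLUDED_FIELDS[mode]
    return {key: value for key, value in response.items() if key not in excluded}


raw_response = {
    "text": "Order 1042 has shipped.",
    "log": ["step 1", "step 2"],
    "trace": {"spans": []},
}
```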