Agent Interaction Logging¶
Overview¶
Agent interaction logging consists of storing each interaction with an agent as a record in a dataset. These records can then be reused for oversight, analytics, debugging, and evaluation.
A logged interaction includes:
The user input sent to the agent
The final answer returned by the agent
The list of tool calls performed by the agent
Metadata about the interaction, such as timestamps, user, or the selected agent id
Additional technical payloads, such as traces and trajectories
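As an illustration of the items listed above, one logged interaction can be pictured as a single row. This is a minimal sketch only: the class and field names (`InteractionRecord`, `ToolCall`, `user_input`, and so on) are hypothetical and do not reflect the actual column names of the logging dataset.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class ToolCall:
    # Hypothetical shape of one tool call made by the agent
    name: str
    arguments: Dict[str, Any]


@dataclass
class InteractionRecord:
    # Hypothetical field names, chosen to mirror the list above
    user_input: str
    final_answer: str
    tool_calls: List[ToolCall] = field(default_factory=list)
    timestamp: str = ""
    user: str = ""
    agent_id: str = ""
    trace: Optional[Dict[str, Any]] = None  # technical payload, may be absent


record = InteractionRecord(
    user_input="What is the refund policy?",
    final_answer="Refunds are accepted within 30 days.",
    tool_calls=[ToolCall(name="search_kb", arguments={"query": "refund policy"})],
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc).isoformat(),
    user="alice",
    agent_id="support-agent",
)

# Flatten the record (including nested tool calls) into a plain dict,
# which is the shape a dataset row would take.
row = asdict(record)
```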
Using interaction logs for Agent Evaluation¶
The Evaluate Agent recipe expects a dataset containing records of agent interactions. An interaction logging dataset is therefore a common upstream input for agent evaluation.
To compute additional metrics, you can enrich the logged interactions with evaluation-specific reference data, for example:
A ground truth answer
A reference list of expected tool calls
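The enrichment step described above amounts to a join between the logged interactions and a reference table. The sketch below shows the idea in plain Python. The key and column names (`interaction_id`, `ground_truth`, `expected_tool_calls`) are assumptions for illustration, not the names expected by the Evaluate Agent recipe.

```python
def enrich_with_references(interactions, references, key="interaction_id"):
    """Attach reference data to each logged interaction (left join on `key`).

    All column names here are hypothetical placeholders.
    """
    by_key = {ref[key]: ref for ref in references}
    enriched = []
    for row in interactions:
        ref = by_key.get(row[key], {})
        enriched.append({
            **row,
            # Interactions without reference data keep None in these columns
            "ground_truth": ref.get("ground_truth"),
            "expected_tool_calls": ref.get("expected_tool_calls"),
        })
    return enriched


logged = [
    {"interaction_id": "a1", "user_input": "2+2?", "final_answer": "4"},
    {"interaction_id": "b2", "user_input": "Capital of France?", "final_answer": "Paris"},
]
reference = [
    {"interaction_id": "a1", "ground_truth": "4", "expected_tool_calls": ["calculator"]},
]

enriched_rows = enrich_with_references(logged, reference)
```

Rows with no matching reference are kept rather than dropped, so the enriched dataset still covers every logged interaction.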
How to set up interaction logging¶
Interaction logging can be configured in the settings of an agent version. You can edit these settings either from the browser or through the Python public API, using dataikuapi.dss.agent.DSSAgentVersionSettings.interaction_logging_selection.
If you want to reuse the same interaction logging settings for all agents in a project, you can define common interaction logging settings at the project level.
Then, select the Inherit mode in the agent version settings to reuse the project-level configuration.
The following options are available.
Agent logging mode¶
Three modes are available:
Disable interaction logs storage: No interaction logs are stored.
Inherit configuration: Reuses the logging configuration inherited from the project settings.
Define custom configuration: Lets you define an explicit logging dataset and storage behavior.
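The way the three modes interact with the project-level settings can be sketched as a small resolution function. Everything here is hypothetical: the mode identifiers (`DISABLED`, `INHERIT`, `CUSTOM`), the `LoggingConfig` class, and the behavior when no project-level configuration exists are assumptions made for illustration, not the product's internal logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoggingConfig:
    # Hypothetical container for the options described in this section
    output_dataset: str
    flush_interval_s: int
    flush_size_bytes: int
    content_mode: str


def resolve_logging(mode: str,
                    custom: Optional[LoggingConfig] = None,
                    project: Optional[LoggingConfig] = None) -> Optional[LoggingConfig]:
    """Return the effective configuration, or None when nothing is logged."""
    if mode == "DISABLED":
        return None
    if mode == "INHERIT":
        # Assumption: with no project-level configuration, nothing is logged
        return project
    if mode == "CUSTOM":
        if custom is None:
            raise ValueError("custom mode requires an explicit configuration")
        return custom
    raise ValueError(f"unknown logging mode: {mode!r}")


project_cfg = LoggingConfig("project_interaction_logs", 60, 1_048_576, "FULL")
effective = resolve_logging("INHERIT", project=project_cfg)
```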
Output dataset¶
The Output dataset is the dataset where interaction logs are written. It is mandatory. You can create a new dataset or reuse an existing one. The dataset schema and partitioning must be compatible with interaction logging.
Flushing¶
To avoid performance issues, interaction logs are not written synchronously to the dataset. Instead, records are buffered and written periodically, according to two settings:
the Flush interval, in seconds
the Flush size, in bytes
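To make the buffering behavior concrete, here is a minimal sketch of a writer driven by an interval and a size threshold. It assumes that a flush happens as soon as either threshold is reached, which the text above does not state explicitly. The class `BufferedLogWriter` and its parameters are hypothetical and unrelated to the actual implementation.

```python
import json
import time


class BufferedLogWriter:
    """Buffers records and hands them to `sink` in batches (illustrative only)."""

    def __init__(self, sink, flush_interval_s, flush_size_bytes, clock=time.monotonic):
        self._sink = sink                      # callable receiving a list of JSON lines
        self.flush_interval_s = flush_interval_s
        self.flush_size_bytes = flush_size_bytes
        self._clock = clock
        self._buffer = []
        self._buffered_bytes = 0
        self._last_flush = clock()

    def log(self, record):
        line = json.dumps(record)
        self._buffer.append(line)
        self._buffered_bytes += len(line.encode("utf-8"))
        # Assumption: either threshold being reached triggers a flush
        too_big = self._buffered_bytes >= self.flush_size_bytes
        too_old = self._clock() - self._last_flush >= self.flush_interval_s
        if too_big or too_old:
            self.flush()

    def flush(self):
        if self._buffer:
            self._sink(list(self._buffer))
            self._buffer.clear()
        self._buffered_bytes = 0
        self._last_flush = self._clock()
```

In this sketch the interval is only checked when a new record arrives. A production writer would more likely use a background timer so that a quiet agent still gets its last records written.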
Content mode¶
The Content mode controls how much technical payload is stored in the logs.
- Full
Writes the full raw LLM response JSON in the corresponding column.
- No logs
Writes the raw LLM response JSON in the corresponding column, excluding the log field.
- No trace and no logs
Writes the raw LLM response JSON in the corresponding column, excluding both the log and trace fields. In this mode, the dku_trace column is also empty.
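The effect of each content mode on the stored response can be sketched as a simple field filter. The field names `log` and `trace` come from the descriptions above. The mode identifiers and the assumption that these fields sit at the top level of the response JSON are hypothetical.

```python
# Hypothetical mode identifiers mapped to the fields each mode excludes
EXCLUDED_FIELDS = {
    "FULL": (),
    "NO_LOGS": ("log",),
    "NO_TRACE_NO_LOGS": ("log", "trace"),
}


def apply_content_mode(raw_response: dict, mode: str) -> dict:
    """Return a copy of the response without the fields excluded by `mode`.

    Assumes `log` and `trace` are top-level keys of the response.
    """
    excluded = EXCLUDED_FIELDS[mode]
    return {k: v for k, v in raw_response.items() if k not in excluded}


response = {
    "text": "Refunds are accepted within 30 days.",
    "log": ["calling search_kb"],
    "trace": {"spans": []},
}

stored_full = apply_content_mode(response, "FULL")
stored_no_logs = apply_content_mode(response, "NO_LOGS")
stored_minimal = apply_content_mode(response, "NO_TRACE_NO_LOGS")
```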