Reinforcement Learning

This capability provides recipes for training and evaluating reinforcement learning (RL) agents. It requires the Reinforcement Learning plugin, which you need to install; see Installing plugins.

Overview

Reinforcement learning is a good fit when a problem involves sequential decisions, delayed rewards, and an objective focused on long-term outcomes.

The plugin supports two agent families:

  • Q-learning for smaller, fully discrete state/action spaces (the core update rule is sketched after this list).

  • Deep Q-learning (DQN) for richer observations where a tabular Q-table is not practical.
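
For intuition, the tabular update at the heart of the Q-learning family fits in a few lines of Python. This is a minimal sketch of the standard algorithm, not the plugin's implementation; the table shape and hyperparameter values are assumptions.

    import numpy as np

    # Minimal tabular Q-learning update; an illustrative sketch, not the plugin's code.
    n_states, n_actions = 16, 4  # assumed sizes for a small discrete environment
    q_table = np.zeros((n_states, n_actions))

    alpha = 0.1   # learning rate
    gamma = 0.99  # discount factor

    def q_update(state, action, reward, next_state):
        # Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a').
        target = reward + gamma * q_table[next_state].max()
        q_table[state, action] += alpha * (target - q_table[state, action])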

Training recipe

Use the Train recipe to learn a policy by repeatedly interacting with an environment.

Outputs

  • A managed folder containing model artifacts and training metadata.

Parameters

  • Agent: Q-learning or Deep Q-learning (DQN).

  • Environment source: built-in environment list or custom environment ID.

  • Environment kwargs (JSON object, optional): runtime parameters passed to the environment.

  • Q-learning settings: Discount factor, Learning rate, exploration (Epsilon, Decay rate; see the sketch after this list), and episode/step limits.

  • DQN settings: Policy, Total training timesteps, Buffer size, Batch size, exploration settings, and target network update frequency.

  • Training profiles mode (DQN): optionally train multiple profiles from a JSON array.
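
The Epsilon and Decay rate parameters control epsilon-greedy exploration: the agent takes a random action with probability epsilon, and epsilon shrinks as training progresses. Here is a minimal sketch of that standard scheme, assuming a per-episode multiplicative decay; the variable names and schedule are illustrative, not the plugin's internals.

    import random

    epsilon = 1.0       # initial exploration rate (cf. the Epsilon parameter)
    epsilon_min = 0.01  # floor so some exploration always remains (an assumption)
    decay_rate = 0.995  # multiplicative decay per episode (cf. the Decay rate parameter)

    def choose_action(q_row, n_actions):
        # With probability epsilon, explore; otherwise exploit the best known action.
        if random.random() < epsilon:
            return random.randrange(n_actions)
        return max(range(n_actions), key=lambda a: q_row[a])

    # At the end of each episode, shrink epsilon toward its floor.
    epsilon = max(epsilon_min, epsilon * decay_rate)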

Testing recipe

Use the Test recipe to evaluate a trained model without additional learning.

Inputs

  • A managed folder containing trained model artifacts.

Outputs

  • A managed folder with testing JSON files (scores, metadata, and replay information).

Parameters

  • What to test: manual model selection or manifest-based batch testing.

  • Agent: the agent family to evaluate.

  • Environment fields (Environment source, Environment or Custom Environment ID, Module to import, optional Environment kwargs).
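
Conceptually, evaluation runs the trained policy greedily, with no exploration and no parameter updates, and records episode scores. The following is a minimal sketch of such a loop over a Gymnasium environment; the environment ID and the untrained q_table are stand-ins for the real artifacts.

    import gymnasium as gym
    import numpy as np

    env = gym.make("FrozenLake-v1")  # any discrete environment works for this sketch
    # Stand-in for a trained table; in practice it comes from the Train recipe's artifacts.
    q_table = np.zeros((env.observation_space.n, env.action_space.n))

    scores = []
    for episode in range(10):
        state, _ = env.reset()
        total_reward, done = 0.0, False
        while not done:
            action = int(q_table[state].argmax())  # greedy: no exploration, no updates
            state, reward, terminated, truncated, _ = env.step(action)
            total_reward += reward
            done = terminated or truncated
        scores.append(total_reward)
    print(scores)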

Using custom environments

For most business use cases, you will define a custom Gym/Gymnasium environment.

  1. Add your environment Python module to the Dataiku project library.

  2. Ensure that importing the module registers an environment ID (see the sketch after this list).

  3. In Train/Test recipes, set:

    • Environment source = Custom environment ID

    • Custom Environment ID = <registered_id>

    • Module to import = <module_name>

    • Environment kwargs (optional JSON object)
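
A minimal custom environment module might look like the sketch below. The environment's dynamics are a toy placeholder; the essential part is the registration call at the bottom, which runs at import time and makes the ID available to the recipes. Module, class, and parameter names are illustrative.

    # my_envs.py -- a module placed in the project library (names are illustrative)
    import gymnasium as gym
    from gymnasium import spaces

    class InventoryEnv(gym.Env):
        """Toy placeholder environment; replace with your own dynamics."""

        def __init__(self, max_stock=10):
            self.max_stock = max_stock
            self.observation_space = spaces.Discrete(max_stock + 1)
            self.action_space = spaces.Discrete(2)  # 0 = do nothing, 1 = restock
            self._stock = max_stock

        def reset(self, seed=None, options=None):
            super().reset(seed=seed)
            self._stock = self.max_stock
            return self._stock, {}

        def step(self, action):
            # Demand of 1 per step; action 1 restocks one unit.
            self._stock = min(self.max_stock, max(0, self._stock + action - 1))
            reward = 1.0 if self._stock > 0 else 0.0
            terminated = self._stock == 0
            return self._stock, reward, terminated, False, {}

    # Runs at import time: importing this module registers the environment ID.
    gym.register(id="InventoryEnv-v0", entry_point="my_envs:InventoryEnv")

With this module in the project library, you would set Custom Environment ID = InventoryEnv-v0, Module to import = my_envs, and optionally Environment kwargs = {"max_stock": 20}.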

Visualizing testing results

To compare evaluation runs, create a visual webapp with the RL Agent Testing Results webapp template.

Configure:

  • Replay Folder: output folder from the Test recipe.

  • Training Models Folder (optional): output folder from the Train recipe to enable manifest/profile helpers.
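
To inspect results programmatically rather than through the webapp, you can read the testing JSON files with the Dataiku Python API. Here is a small sketch that assumes the Test recipe wrote to a managed folder named rl_test_results; the folder name and file layout are assumptions.

    import json
    import dataiku

    # "rl_test_results" is a placeholder name for the Test recipe's output folder.
    folder = dataiku.Folder("rl_test_results")

    for path in folder.list_paths_in_partition():
        if path.endswith(".json"):
            with folder.get_download_stream(path) as stream:
                results = json.load(stream)
            # The exact layout (scores, metadata, replay info) depends on the plugin's output.
            print(path, results)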