
Models evaluations

Evaluating a machine learning model consists of computing its performance and behavior on a set of data called the Evaluation set. Model evaluations are the cornerstone of MLOps capabilities: they enable Drift analysis, Model Comparisons, and automated retraining of models.
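
As a conceptual sketch of this definition (not Dataiku's implementation), the example below fits a simple classifier and then scores it on a held-out evaluation set with scikit-learn; the file names and the "target" column are hypothetical placeholders.

```python
# Conceptual sketch: measuring a model's performance on an evaluation set,
# i.e. data the model was not trained on. File names and the "target"
# column are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score, log_loss

# Hypothetical files: a training set used to fit the model and a separate
# evaluation set used only to measure performance.
train_df = pd.read_csv("training_set.csv")
eval_df = pd.read_csv("evaluation_set.csv")

model = LogisticRegression(max_iter=1000)
model.fit(train_df.drop(columns=["target"]), train_df["target"])

# Evaluation: score the fitted model on data it has never seen.
X_eval = eval_df.drop(columns=["target"])
y_eval = eval_df["target"]
y_pred = model.predict(X_eval)
y_proba = model.predict_proba(X_eval)[:, 1]

evaluation = {
    "accuracy": accuracy_score(y_eval, y_pred),
    "roc_auc": roc_auc_score(y_eval, y_proba),
    "log_loss": log_loss(y_eval, y_proba),
}
print(evaluation)
```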

Evaluation of LLMs and Agents uses similar concepts, adapted to the specifics of those use cases.

  • Concepts
    • When training
    • Subsequent evaluations
  • Evaluating Dataiku Prediction models
    • Configuration of the Evaluation Recipe
      • Labels
      • Sampling
      • Custom Evaluation Metrics
    • Limitations
  • Evaluating Dataiku Time Series Forecasting models
    • Input dataset
    • Outputs
      • Output dataset
      • Metrics dataset
      • Model Evaluation Store
    • Refitting for statistical models
  • Evaluating other models
    • Configuration of the standalone evaluation recipe
      • Labels
      • Sampling
  • Analyzing evaluation results
    • The evaluations comparison
    • Model Evaluation details
    • Using evaluation labels
  • Automating model evaluations and drift analysis
    • Metrics and Checks
    • Scenarios and feedback loop
    • Feedback loop
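
As a rough, non-Dataiku illustration of the drift analysis and feedback-loop ideas referenced in the last group of sections above, the sketch below compares each numeric feature's distribution in a reference (training-time) sample against a newer evaluation set using a two-sample Kolmogorov-Smirnov test; the file names and the 0.01 alert threshold are assumptions made for the example.

```python
# Conceptual drift check, not Dataiku's built-in drift analysis: flag numeric
# features whose distribution in recent evaluation data differs markedly from
# a reference sample taken at training time. File names are hypothetical.
import pandas as pd
from scipy.stats import ks_2samp

reference_df = pd.read_csv("reference_sample.csv")
evaluation_df = pd.read_csv("evaluation_set.csv")

drifted = {}
for col in reference_df.select_dtypes("number").columns:
    if col in evaluation_df.columns:
        stat, p_value = ks_2samp(reference_df[col].dropna(),
                                 evaluation_df[col].dropna())
        if p_value < 0.01:  # arbitrary alert threshold for this sketch
            drifted[col] = {"ks_statistic": round(stat, 3), "p_value": p_value}

# In an automated feedback loop, a non-empty result could fail a check and
# trigger a retraining scenario or a notification.
print(drifted or "No significant univariate drift detected.")
```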
