Models evaluations

Evaluating a machine learning model consists of computing its performance and behavior on a set of data called the Evaluation set. Model evaluations are the cornerstone of MLOps capabilities: they enable Drift analysis, Model Comparisons, and automated retraining of models.
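Conceptually, an evaluation scores a trained model on a labeled evaluation set and derives performance metrics from the predictions. The short scikit-learn sketch below illustrates only that general idea; it is a hedged stand-in with arbitrary choices (synthetic data, a logistic regression, accuracy and ROC AUC), not the DSS Evaluate recipe itself:

    # Conceptual illustration only: score a trained model on a held-out
    # evaluation set and compute performance metrics from its predictions.
    # The dataset, model, and metrics here are arbitrary assumptions.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, roc_auc_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)
    X_train, X_eval, y_train, y_eval = train_test_split(
        X, y, test_size=0.3, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # "Evaluating" the model = computing its metrics on the evaluation set
    y_pred = model.predict(X_eval)
    y_proba = model.predict_proba(X_eval)[:, 1]
    print("accuracy:", accuracy_score(y_eval, y_pred))
    print("ROC AUC:", roc_auc_score(y_eval, y_proba))

In DSS, the equivalent computation is configured through the evaluation recipes described below, and the resulting metrics are stored so they can be compared over time, monitored for drift, and checked in automation scenarios.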

  • Concepts
    • When training
    • Subsequent evaluations
  • Evaluating DSS models
    • Configuration of the evaluation recipe
      • Labels
      • Sampling
      • Custom Evaluation Metrics
    • Limitations
  • Evaluating Large Language Models
    • Overview
    • Recipe configuration
      • Input dataset
      • Metrics
      • Custom metrics
    • LLM Evaluations
    • Comparisons
  • Evaluating other models
    • Configuration of the standalone evaluation recipe
      • Labels
      • Sampling
  • Analyzing evaluation results
    • The evaluations comparison
    • Model Evaluation details
    • Using evaluation labels
  • Automating model evaluations and drift analysis
    • Metrics and Checks
    • Scenarios and feedback loop
    • Feedback loop