MLflow

ML experiment & model registry

Data Dev Framework

What it's used for

MLflow is an open-source platform for managing the full ML lifecycle — from experiment tracking and model packaging to versioning and deployment. Originally built for traditional ML, it now includes first-class support for LLM tracking and evaluation.

Key use cases include:

  • Experiment tracking: log parameters, metrics, and artifacts for every training run
  • Model registry: version models and manage stage transitions (Staging, Production, Archived); recent MLflow releases deprecate stages in favor of model version aliases and tags
  • Model packaging: package models in a standard format (MLmodel) that can be loaded and served across many environments
  • LLM tracking: log prompts, completions, token usage, and evaluation metrics
  • Model deployment: serve models as REST APIs with built-in inference servers
  • Evaluation: run automated evaluations on models with built-in and custom metrics

MLflow is used by data science and ML engineering teams that need a vendor-neutral, open-source platform for experiment management and the model lifecycle. It is among the most widely adopted open-source MLOps tools and integrates with major ML frameworks such as scikit-learn, PyTorch, TensorFlow, and XGBoost.

MLflow is a Linux Foundation project, originally created and still backed by Databricks, with a large contributor community and many third-party integrations.

Getting started

  1. Install MLflow:
    pip install mlflow
  2. Start the tracking UI:
    mlflow ui
    # Serves the UI at http://localhost:5000 by default; open that URL in a browser
  3. Add tracking to your code:
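    import mlflow
    
    with mlflow.start_run():
        # Record a hyperparameter and a result metric for this run
        mlflow.log_param('learning_rate', 0.01)
        mlflow.log_metric('accuracy', 0.95)
        # Upload a local file as an artifact; 'model.pkl' must already exist on disk
        mlflow.log_artifact('model.pkl')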
  4. For LLM tracking (requires the openai package to be installed):
    import mlflow
    
    mlflow.openai.autolog()
    # Subsequent OpenAI client calls are now logged automatically
  5. For team use, deploy the tracking server or use Databricks Managed MLflow.

Pricing: MLflow is free and open source (Apache 2.0). Databricks Managed MLflow is included with Databricks workspace pricing. Self-hosted server costs depend on your infrastructure.

Case studies

Real MLflow projects

93% incident reduction (14 → 1 per quarter) | Fintech

40-Person ML Team — 2x/Quarter to 8x/Week Deployments

Series C fintech, ML platform

Challenge

A fintech ML team was deploying models twice per quarter due to a manual, fragile deployment process. Every deployment required a war room, and 14 production incidents per quarter were traced to model issues.

Solution

Migrated 3 years of model history to MLflow with full lineage tracking. Built automated evaluation gates: models must pass 15 quality checks in MLflow before promotion. Rollback to any prior model version in under 5 minutes.
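
The case study does not describe how its evaluation gates were built. As a rough sketch of the idea, a gate can be modeled as a list of named checks that all have to pass before a model version is promoted. The check names, thresholds, and metric keys below are hypothetical. In practice the metrics dictionary might come from a logged run, for example mlflow.get_run(run_id).data.metrics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]


@dataclass(frozen=True)
class QualityCheck:
    """A named predicate over a model's logged metrics."""
    name: str
    passes: Callable[[Metrics], bool]


# Hypothetical checks; a real gate would define its own list and thresholds.
CHECKS: List[QualityCheck] = [
    QualityCheck("min_accuracy", lambda m: m.get("accuracy", 0.0) >= 0.90),
    QualityCheck(
        "max_p95_latency_ms",
        lambda m: m.get("p95_latency_ms", float("inf")) <= 200.0,
    ),
    QualityCheck(
        "no_regression_vs_baseline",
        lambda m: m.get("accuracy", 0.0) >= m.get("baseline_accuracy", 0.0),
    ),
]


def evaluate_gate(
    metrics: Metrics, checks: List[QualityCheck] = CHECKS
) -> Tuple[bool, List[str]]:
    """Return (approved, names of failed checks). Missing metrics fail."""
    failed = [check.name for check in checks if not check.passes(metrics)]
    return (not failed, failed)


candidate = {"accuracy": 0.95, "baseline_accuracy": 0.93, "p95_latency_ms": 250.0}
approved, failed_checks = evaluate_gate(candidate)
print(approved, failed_checks)  # False ['max_p95_latency_ms']
```

Returning the names of the failed checks, rather than only a boolean, makes it clear why a promotion was blocked.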

Results

Deployment frequency: 2x/quarter → 8x/week. Production incidents: 14/quarter → 1/quarter. Mean time to recover from model failures: 4 hours → 12 minutes.
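
To compare these figures on a common scale, the arithmetic can be worked through as follows. This sketch assumes 13 weeks per quarter, which the case study does not state.

```python
WEEKS_PER_QUARTER = 13  # assumption: a quarter is roughly 13 weeks

# Deployment frequency, normalized to deployments per quarter.
deploys_before = 2
deploys_after = 8 * WEEKS_PER_QUARTER
deploy_multiplier = deploys_after / deploys_before

# Production incidents per quarter.
incidents_before, incidents_after = 14, 1
incident_reduction_pct = (incidents_before - incidents_after) / incidents_before * 100

# Mean time to recover, in minutes.
mttr_before_min = 4 * 60
mttr_after_min = 12
mttr_multiplier = mttr_before_min / mttr_after_min

print(f"Deployments per quarter: {deploys_before} -> {deploys_after} ({deploy_multiplier:.0f}x)")
print(f"Incident reduction: {incident_reduction_pct:.1f}%")
print(f"Recovery time: {mttr_before_min} min -> {mttr_after_min} min ({mttr_multiplier:.0f}x faster)")
```

Under that assumption, deployments rise from 2 to 104 per quarter (52x), incidents fall by about 92.9%, and recovery is 20x faster.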
