Metrics

With DVC, you can easily capture and plot metrics, visualise performance in charts, and update parameters. This allows you to run and compare many iterations of your ML project.

To find suitable values for the parameters of our ML model, we add a final evaluation stage to our previous pipeline:

$ uv run dvc stage add \
    -n evaluate \
    -d src/dvc_example/evaluate.py \
    -d model.pkl \
    -d data/features \
    -o eval \
    uv run python src/dvc_example/evaluate.py model.pkl data/features

This adds a new stage to the dvc.yaml file:

34  evaluate:
35    cmd: uv run python src/dvc_example/evaluate.py model.pkl data/features
36    deps:
37    - data/features
38    - model.pkl
39    - src/dvc_example/evaluate.py
40    outs:
41    - eval
Line 39

evaluate.py uses DVCLive to write metrics and plots, for example AUC (Area Under the Curve) and the ROC (Receiver Operating Characteristic) curve, to the eval directory. DVC reads these files to compare and visualise the results across iterations. Typically, DVCLive configures metrics and plots in the dvc.yaml file, but in this example, we customise them by combining training and test plots.

Line 41

The metrics and plots are stored in the eval directory, which is a DVC output, so these files do not affect the Git history. In certain cases, however, it can be useful to track selected metrics and plots in Git instead.
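To make the role of the eval directory more concrete, here is a minimal standard-library sketch of what an evaluation script has to do: compute a score per split and write it to a metrics.json file. It is not the tutorial's evaluate.py, which uses DVCLive. The function names roc_auc and write_metrics, the toy labels and scores, and the nested JSON layout are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def roc_auc(labels, scores):
    """Return the ROC AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    # A pair counts 1 if the positive scores higher, 0.5 on a tie, 0 otherwise.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def write_metrics(eval_dir, metrics):
    """Write nested metrics to <eval_dir>/metrics.json and return the path."""
    path = Path(eval_dir) / "metrics.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metrics, indent=4))
    return path


# Toy labels and scores, made up for illustration (not the tutorial's data).
train_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
test_auc = roc_auc([0, 1, 0, 1], [0.7, 0.9, 0.6, 0.3])

# A temporary directory stands in for the stage's eval output directory.
metrics_path = write_metrics(
    Path(tempfile.mkdtemp()) / "eval",
    {"roc_auc": {"train": train_auc, "test": test_auc}},
)
print(metrics_path.read_text())
```

The pairwise count is quadratic in the number of samples, which is fine for a sketch; a real script would normally rely on a library routine.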

Now we can run our evaluations and save the results:

$ uv run dvc repro
'data/data.xml.dvc' didn't change, skipping
Stage 'prepare' didn't change, skipping
Stage 'featurize' didn't change, skipping
Stage 'train' didn't change, skipping
Running stage 'evaluate':
> uv run python src/dvc_example/evaluate.py model.pkl data/features
$ git add .gitignore dvc.lock dvc.yaml pyproject.toml src/dvc_example/evaluate.py
$ git commit -m ':sparkles: Add evaluation step'

With dvc metrics and dvc plots, you can also display and compare the results on the command line:

dvc metrics show

displays metrics with optional formatting, for example:

$ uv run dvc metrics show
Path               avg_prec.test    avg_prec.train    roc_auc.test    roc_auc.train
eval/metrics.json  0.9014           0.95704           0.93196         0.97743

See also

dvc metrics show
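The column names in the output above, such as avg_prec.test, suggest that nested keys of the metrics file are joined with dots. The following sketch illustrates that idea under the assumption that eval/metrics.json holds nested mappings; flatten is a hypothetical helper and not DVC's own implementation.

```python
def flatten(metrics, prefix=""):
    """Flatten nested dicts into {"outer.inner": value} pairs."""
    flat = {}
    for key, value in metrics.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend one level and carry the dotted prefix along.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Values copied from the output above; the nesting itself is an assumption.
metrics = {
    "avg_prec": {"test": 0.9014, "train": 0.95704},
    "roc_auc": {"test": 0.93196, "train": 0.97743},
}

for name, value in flatten(metrics).items():
    print(f"{name:<16}{value}")
```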

dvc metrics diff

shows changes in metrics between commits, for example:

$ uv run dvc metrics diff
Path               Metric          HEAD    workspace    Change
eval/metrics.json  avg_prec.test   -       0.9014       -
eval/metrics.json  avg_prec.train  -       0.95704      -
eval/metrics.json  roc_auc.test    -       0.93196      -
eval/metrics.json  roc_auc.train   -       0.97743      -

See also

dvc metrics diff

dvc plots show

generates an HTML page with plots:

[Figure: DVC plot of eval/plots/images/importance.png for the workspace]

Compare metrics

If you now change the parameters in the params.yaml file and run the pipeline again with dvc repro, you can compare your current working directory with the last commit (HEAD):

$ uv run dvc params diff
Path         Param                   HEAD    workspace
params.yaml  featurize.max_features  100     200
params.yaml  featurize.ngrams        1       2
$ uv run dvc metrics diff
Path               Metric          HEAD     workspace    Change
eval/metrics.json  avg_prec.test   0.9014   0.925        0.0236
eval/metrics.json  avg_prec.train  0.95704  0.97437      0.01733
eval/metrics.json  roc_auc.test    0.93196  0.94602      0.01406
eval/metrics.json  roc_auc.train   0.97743  0.98667      0.00924
$ uv run dvc plots diff
file:///Users/veit/dvc-example/dvc_plots/index.html
[Figure: DVC plot of eval/plots/images/importance.png for the workspace and HEAD]
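The Change column of dvc metrics diff is the workspace value minus the HEAD value. As a rough check, the sketch below recomputes that column from the numbers shown in the output; metrics_diff is a hypothetical helper, not part of DVC, and rounding to five decimal places is an assumption chosen to match the printed precision.

```python
def metrics_diff(old, new, digits=5):
    """Return {metric: (old, new, change)}; change is None if one side is missing."""
    rows = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before is None or after is None:
            change = None  # shown as "-" in the command-line output
        else:
            change = round(after - before, digits)
        rows[name] = (before, after, change)
    return rows


# Values copied from the dvc metrics diff output above.
head = {
    "avg_prec.test": 0.9014,
    "avg_prec.train": 0.95704,
    "roc_auc.test": 0.93196,
    "roc_auc.train": 0.97743,
}
workspace = {
    "avg_prec.test": 0.925,
    "avg_prec.train": 0.97437,
    "roc_auc.test": 0.94602,
    "roc_auc.train": 0.98667,
}

for name, (before, after, change) in metrics_diff(head, workspace).items():
    print(f"{name:<16}{before:<9}{after:<9}{change}")
```

Calling metrics_diff with an empty mapping for the old side gives None for every change, which corresponds to the first run, when HEAD did not yet contain any metrics.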