Display pipelines
DVC represents a pipeline internally as a directed acyclic graph (DAG).
You can use dvc dag to visualise or export pipelines:
$ uv run dvc dag
    +-------------------+
    | data/data.xml.dvc |
    +-------------------+
              *
              *
              *
          +-------+
          | split |
          +-------+
              *
              *
              *
        +-----------+
        | featurize |
        +-----------+
         **        **
       **            *
      *               **
+-------+               *
| train |             **
+-------+            *
         **        **
           **    **
             *  *
        +----------+
        | evaluate |
        +----------+
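The graph printed by dvc dag can be read as a dependency mapping. The following minimal sketch is not DVC's internal implementation. It only illustrates the idea with Python's standard graphlib module. The edges are read off the output above, and the names dependencies and execution_order are chosen for this example.

```python
from graphlib import TopologicalSorter

# Each key is a node of the graph; the value is the set of nodes it depends on.
# The edges are transcribed from the dvc dag output shown above.
dependencies = {
    "split": {"data/data.xml.dvc"},
    "featurize": {"split"},
    "train": {"featurize"},
    "evaluate": {"featurize", "train"},
}


def execution_order(deps):
    """Return one valid processing order (dependencies before dependents)."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(" -> ".join(order))
```

Because the graph is acyclic, such an order always exists. A cycle would make TopologicalSorter raise graphlib.CycleError.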
With dvc dag --dot, a .dot file for Graphviz can also be generated.
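DOT text can also be processed programmatically, for example to find everything downstream of a stage. The sketch below assumes that edges appear as quoted names joined by ->. SAMPLE_DOT is a hand-written, hypothetical sample based on the graph above, not captured DVC output, and the real output may be formatted differently.

```python
import re

# Hypothetical DOT sample, written by hand from the graph shown earlier.
SAMPLE_DOT = '''strict digraph {
"data/data.xml.dvc" -> "split";
"split" -> "featurize";
"featurize" -> "train";
"featurize" -> "evaluate";
"train" -> "evaluate";
}
'''

# Matches lines of the form "source" -> "target"
EDGE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"')


def parse_edges(dot_text):
    """Return a list of (source, target) tuples found in the DOT text."""
    return EDGE.findall(dot_text)


def downstream(edges, node):
    """Return all nodes reachable from node by following the edges."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, set()).add(dst)
    seen, stack = set(), [node]
    while stack:
        for nxt in children.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


edges = parse_edges(SAMPLE_DOT)
print(sorted(downstream(edges, "featurize")))
```

To render an image instead, the .dot file can be passed to Graphviz, for example with dot -Tpng.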
With dvc status, you can see whether pipeline stages or the local and remote
storage have changed:
$ uv run dvc status
evaluate:
    changed deps:
        modified: src/dvc_example/evaluate.py
    changed outs:
        modified: eval
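For scripts, the same information is easier to handle as structured data. The sketch below assumes that dvc status --json yields a mapping from stage names to lists of change entries. SAMPLE_STATUS is a hypothetical sample mirroring the text output above, and the exact JSON shape may differ between DVC versions. The function name summarize is chosen for this example.

```python
import json

# Assumed JSON shape, modelled on the text output above (not captured output).
SAMPLE_STATUS = '''{
  "evaluate": [
    {"changed deps": {"src/dvc_example/evaluate.py": "modified"}},
    {"changed outs": {"eval": "modified"}}
  ]
}'''


def summarize(status_json):
    """Flatten the status into (stage, kind, path, state) tuples."""
    rows = []
    for stage, entries in json.loads(status_json).items():
        for entry in entries:
            if isinstance(entry, str):
                # Plain-text entries carry no path information.
                rows.append((stage, entry, None, None))
                continue
            for kind, paths in entry.items():
                for path, state in paths.items():
                    rows.append((stage, kind, path, state))
    return rows


rows = summarize(SAMPLE_STATUS)
for stage, kind, path, state in rows:
    print(f"{stage}: {kind}: {state} {path}")
```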
In CI jobs, it is usually necessary to check whether the pipeline is up to date
without retrieving or executing anything. With dvc repro --dry, you can find out
which pipeline stages would need to be executed. However, if data is missing,
the command will fail. If missing data should be ignored, you can use
dvc repro --dry --allow-missing:
$ uv run dvc repro --allow-missing --dry
'data/data.xml.dvc' didn't change, skipping
Stage 'prepare' didn't change, skipping
Stage 'featurize' didn't change, skipping
Stage 'train' didn't change, skipping
Stage 'evaluate' is cached - skipping run, checking out outputs
Running stage 'evaluate':
> uv run python src/dvc_example/evaluate.py model.pkl data/features
Use `dvc push` to send your updates to remote storage.
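A CI job could evaluate this output automatically. The following minimal sketch assumes that stages which would be executed are announced by lines beginning with Running stage, as in the output above. The message wording may vary between DVC versions, and the names stages_to_run and SAMPLE_OUTPUT are chosen for this example.

```python
import re

# The dry-run output shown above, copied as sample input.
SAMPLE_OUTPUT = """\
'data/data.xml.dvc' didn't change, skipping
Stage 'prepare' didn't change, skipping
Stage 'featurize' didn't change, skipping
Stage 'train' didn't change, skipping
Stage 'evaluate' is cached - skipping run, checking out outputs
Running stage 'evaluate':
> uv run python src/dvc_example/evaluate.py model.pkl data/features
Use `dvc push` to send your updates to remote storage.
"""

# Assumption: each stage that would run is announced as "Running stage '<name>':"
RUNNING = re.compile(r"^Running stage '([^']+)':", re.MULTILINE)


def stages_to_run(dry_run_output):
    """Return the names of the stages the dry run reports as running."""
    return RUNNING.findall(dry_run_output)


pending = stages_to_run(SAMPLE_OUTPUT)
if pending:
    print("Pipeline is not up to date:", ", ".join(pending))
else:
    print("Pipeline is up to date.")
```

In a real job, the text would come from the captured output of dvc repro --dry --allow-missing, and a non-empty result could be used to fail the build.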