Create project
DVC can be easily initialised with:
$ uv init --package dvc-example
$ cd dvc-example
$ git init
$ git add --all
$ git commit -m ':tada: Initial commit'
$ uv add dvc
$ uv run dvc init
$ git add pyproject.toml .dvc .dvcignore
$ git commit -m ":heavy_plus_sign: Add and initialise DVC"
uv run dvc init
    creates a .dvc/ directory with config, .gitignore and a cache/ directory.
    The first time you run dvc init, you will be informed that DVC collects and transmits anonymised usage statistics. If you want to disable this, you can do so with the command dvc config:
    $ uv run dvc config core.analytics false
    This will disable it for the project. Alternatively, you can use the --global or --system options of dvc config to disable analytics for the active account or for all accounts in the system.
git add pyproject.toml .dvc .dvcignore
    places .dvc/config, .dvc/.gitignore and the updated pyproject.toml under Git version control.
Configure remote storage
Before using DVC, you should set up remote storage that is accessible to everyone who needs the data or models. It plays a role similar to that of a Git server. In practice, it is often simply an NFS mount, which can be integrated as follows, for example:
$ mkdir ~/dvc-storage
$ uv run dvc remote add -d local ~/dvc-storage
Setting 'local' as a default remote.
$ git commit .dvc/config -m ":wrench: Configure local remote"
[main 3e0c8fb] :wrench: Configure local remote
1 file changed, 4 insertions(+)
-d, --default
    Makes this remote the default remote storage
local
    Name of the remote storage
~/dvc-storage
    URL of the remote storage
Other protocols are also supported and can be prefixed to the path, including
ssh:, hdfs: and https:.
This means that another remote data storage location can easily be added, for example with:
$ uv run dvc remote add webserver https://dvc.cusy.io/dvc-example
The corresponding configuration file .dvc/config then looks like this:
[core]
    remote = local
['remote "local"']
    url = /Users/veit/dvc-storage
['remote "webserver"']
    url = https://dvc.cusy.io/dvc-example
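To check which remotes a project knows about without opening the file by hand, the configuration can be read programmatically. The following is only a minimal sketch: it assumes that Python's standard configparser module is a good enough stand-in for this simple file (DVC itself may parse it differently), and the names SAMPLE_CONFIG and list_remotes are hypothetical helpers introduced for illustration.

```python
import configparser

# Sample text mirroring the .dvc/config shown above (assumption: in a real
# project you would read the file contents from .dvc/config instead).
SAMPLE_CONFIG = """\
[core]
remote = local
['remote "local"']
url = /Users/veit/dvc-storage
['remote "webserver"']
url = https://dvc.cusy.io/dvc-example
"""


def list_remotes(config_text):
    """Return (default remote name, {remote name: url}) from DVC config text."""
    # Only "=" is used as delimiter so that URLs containing ":" stay intact;
    # interpolation is disabled so that "%" in values is taken literally.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(config_text)

    remotes = {}
    for section in parser.sections():
        # Remote sections are named like: 'remote "local"' (quotes included).
        cleaned = section.strip("'")
        if cleaned.startswith('remote "') and cleaned.endswith('"'):
            name = cleaned[len('remote "'):-1]
            remotes[name] = parser[section].get("url")

    default = parser.get("core", "remote", fallback=None)
    return default, remotes


default, remotes = list_remotes(SAMPLE_CONFIG)
print("default remote:", default)
for name, url in remotes.items():
    print(f"{name}: {url}")
```

For anything beyond a quick inspection, the dvc remote commands remain the safer way to read and change this file.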
Configure pre-commit
You can check the data managed by DVC with the pre-commit framework before every
git commit and git push, as well as after every git checkout. With
dvc install --use-pre-commit-tool, the .pre-commit-config.yaml file
receives the following checks:
- repo: https://github.com/iterative/dvc
  rev: 3.63.0
  hooks:
    - id: dvc-pre-commit
      additional_dependencies:
        - .[all]
      language_version: python3
      stages:
        - pre-commit
    - id: dvc-pre-push
      additional_dependencies:
        - .[all]
      language_version: python3
      stages:
        - pre-push
    - id: dvc-post-checkout
      additional_dependencies:
        - .[all]
      language_version: python3
      stages:
        - post-checkout
      always_run: true
By default, pre-commit install only sets up the pre-commit hook, so you must
also activate the pre-push and post-checkout hooks:
$ pre-commit install --hook-type pre-commit --hook-type pre-push --hook-type post-checkout
pre-commit installed at .git/hooks/pre-commit
pre-commit installed at .git/hooks/pre-push
pre-commit installed at .git/hooks/post-checkout
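To confirm that all three hooks are actually in place, you can look for the files reported in the output above. The following is a minimal sketch, assuming the default hook location .git/hooks (a custom core.hooksPath or a Git worktree would need a different path); EXPECTED_HOOKS and missing_hooks are hypothetical names used only for this illustration.

```python
from pathlib import Path

# The three hook types installed by the command above.
EXPECTED_HOOKS = ("pre-commit", "pre-push", "post-checkout")


def missing_hooks(repo_root="."):
    """Return the expected hook names that have no file in .git/hooks."""
    # Assumption: hooks live in the default .git/hooks directory.
    hooks_dir = Path(repo_root) / ".git" / "hooks"
    return [name for name in EXPECTED_HOOKS if not (hooks_dir / name).is_file()]


missing = missing_hooks()
if missing:
    print("Missing hooks:", ", ".join(missing))
else:
    print("All DVC-related hook types are installed.")
```

The check only tests for the presence of the files; it does not verify that they were written by pre-commit.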