Cite software#

In their 2016 article Software in the scientific literature, James Howison and Julia Bullard ranked the following ways of referring to software in descending order of reputation:

  1. citing publications that describe the respective software

  2. citing user manuals

  3. citing the software project website

  4. linking to a software project website

  5. mentioning the software name

The situation remains unsatisfactory for the authors of software, especially if they are not the authors of the publication describing it. Conversely, research software is unfortunately not always easy to cite. For example, others can hardly cite your software directly if you send it to them as an email attachment. Even a download link is of little use because it may stop working. It is better to provide a persistent identifier (PID) to ensure the long-term availability of your software. Both Zenodo and figshare accept source code, including binaries, and provide Digital Object Identifiers (DOIs) for them. CiteAs can additionally be used to retrieve citation information for software.
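DOIs are written in several forms: as a bare identifier, with a doi: prefix, or as a resolver URL. The following is a minimal sketch of how such forms could be normalised to a single resolver URL. The function name doi_to_url and the deliberately loose pattern are assumptions made for this illustration and are not part of Zenodo, figshare or CiteAs.

```python
import re

# Loose pattern for DOIs: "10.", a registrant code, "/", then a suffix.
# This is an assumption for the sketch, not an official validation rule.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes that are commonly put in front of a bare DOI.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver URL for a DOI in a common form."""
    cleaned = doi.strip()
    # Strip a resolver or scheme prefix so that only the bare DOI remains.
    for prefix in PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return f"https://doi.org/{cleaned}"


print(doi_to_url("10.5281/zenodo.4147287"))
```

A single canonical form makes it easier to compare identifiers that come from different metadata files.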

Create a DOI with Zenodo#

Zenodo allows software to be archived and provides a DOI for it. The following steps show what is required, using the Jupyter tutorial as an example:

  1. If you haven’t already, create an account on Zenodo, preferably with GitHub.

  2. In Upload ‣ New Upload, under Basic information, activate the Reserve DOI button to reserve a DOI for your upload. Leave the form open so that you can upload your software later.

  3. Create or modify the CodeMeta and Citation File Format files in your software directory.

  4. Include the badge in the README file of your software:

    Markdown:

    [![DOI](https://zenodo.org/badge/307380211.svg)](https://zenodo.org/badge/latestdoi/307380211)
    

    reStructuredText:

    .. image:: https://zenodo.org/badge/307380211.svg
       :target: https://zenodo.org/badge/latestdoi/307380211
    
  5. Now select the repository that you want to archive:

    Enable repositories for Zenodo
  6. Check whether Zenodo has created a webhook in your repository for the Releases event:

    Zenodo webhook
  7. Create a new release:

    GitHub releases
  8. Check that the DOI was created correctly:

    Zenodo release
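The two badge snippets in step 4 differ only in markup. The sketch below generates both from one number, assuming that the number in the badge URLs is the numeric ID of the GitHub repository. The function name zenodo_badge is hypothetical.

```python
def zenodo_badge(repo_id: int) -> dict[str, str]:
    """Build the Zenodo DOI badge markup for a numeric repository ID."""
    image = f"https://zenodo.org/badge/{repo_id}.svg"
    target = f"https://zenodo.org/badge/latestdoi/{repo_id}"
    return {
        # Markdown: a linked image.
        "markdown": f"[![DOI]({image})]({target})",
        # reStructuredText: an image directive with a :target: option.
        "rst": f".. image:: {image}\n   :target: {target}",
    }


badge = zenodo_badge(307380211)
print(badge["markdown"])
print(badge["rst"])
```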

Metadata formats#

The FORCE11 Software Citation Working Group has published a paper that sets out principles for citing scientific software: Software citation principles by Arfon Smith, Daniel Katz and Kyle Niemeyer, 2016. Two projects are currently emerging for structured metadata:

CodeMeta#

CodeMeta is an exchange schema for general software metadata with a reference implementation in JSON for Linking Data (JSON-LD).

A codemeta.json file is expected in the root directory of the software repository. The file can look like this:

{
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "author": [{
        "@type": "Person",
        "givenName": "Stephan",
        "familyName": "Druskat",
        "@id": "http://orcid.org/0000-0003-4925-7248"
    }],
    "name": "My Research Tool",
    "softwareVersion": "2.0",
    "identifier": "https://doi.org/10.5281/zenodo.1234",
    "datePublished": "2017-12-18",
    "codeRepository": "https://github.com/research-software/my-research-tool"
}
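Because codemeta.json is plain JSON, it can be read with standard tools. As a rough illustration of what machine-readable metadata enables, the following sketch turns the example above into a one-line citation. The function format_citation and the citation layout are assumptions for this example and do not follow any official citation style.

```python
import json

# The example file from above, embedded as text to keep the sketch self-contained.
CODEMETA_TEXT = """
{
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "author": [{
        "@type": "Person",
        "givenName": "Stephan",
        "familyName": "Druskat",
        "@id": "http://orcid.org/0000-0003-4925-7248"
    }],
    "name": "My Research Tool",
    "softwareVersion": "2.0",
    "identifier": "https://doi.org/10.5281/zenodo.1234",
    "datePublished": "2017-12-18",
    "codeRepository": "https://github.com/research-software/my-research-tool"
}
"""


def format_citation(codemeta: dict) -> str:
    """Format a simple citation line from a parsed codemeta.json document."""
    authors = "; ".join(
        f"{person['familyName']}, {person['givenName']}"
        for person in codemeta["author"]
    )
    # datePublished is an ISO date, so the first four characters are the year.
    year = codemeta["datePublished"][:4]
    return (
        f"{authors} ({year}). {codemeta['name']} "
        f"(version {codemeta['softwareVersion']}). {codemeta['identifier']}"
    )


citation = format_citation(json.loads(CODEMETA_TEXT))
print(citation)
```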

Citation File Format#

Citation File Format is a schema for software citation metadata in machine-readable YAML format.

A file CITATION.cff should be stored in the root directory of the software repository.

The content of the file can look like this:

cff-version: "1.1.0"
message: "If you use this tutorial, please cite it as below."
authors:
  -
    family-names: Schiele
    given-names: Veit
    orcid: "https://orcid.org/0000-0002-2448-8958"
identifiers:
  -
    type: doi
    value: "10.5281/zenodo.4147287"
keywords:
  - "data-science"
  - jupyter
  - "jupyter-notebooks"
  - "jupyter-kernels"
  - ipython
  - pandas
  - spack
  - pipenv
  - ipywidgets
  - "ipython-widget"
  - dvc
title: "Jupyter tutorial"
version: "0.8.0"
date-released: 2020-10-08
license: "BSD-3-Clause"
repository-code: "https://github.com/veit/jupyter-tutorial"

You can easily adapt the example above to create your own CITATION.cff file or use the cffinit website.
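The CITATION.cff and codemeta.json examples describe largely the same information under different key names. The following sketch maps one to the other for the keys that appear in the two examples. It assumes that the CITATION.cff file has already been parsed into a dictionary, for example with a YAML library. The function cff_to_codemeta and the key pairs are an illustrative, partial mapping and not the official CodeMeta crosswalk.

```python
def cff_to_codemeta(cff: dict) -> dict:
    """Map a parsed CITATION.cff dictionary to CodeMeta terms (partial sketch)."""
    codemeta = {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": cff["title"],
        "author": [
            {
                "@type": "Person",
                "givenName": author["given-names"],
                "familyName": author["family-names"],
                # Only add an @id when the author has an ORCID.
                **({"@id": author["orcid"]} if "orcid" in author else {}),
            }
            for author in cff["authors"]
        ],
    }
    # Optional CFF keys and the CodeMeta terms assumed to correspond to them.
    optional = {
        "version": "softwareVersion",
        "date-released": "datePublished",
        "repository-code": "codeRepository",
        "license": "license",
        "keywords": "keywords",
    }
    for cff_key, codemeta_key in optional.items():
        if cff_key in cff:
            codemeta[codemeta_key] = cff[cff_key]
    # Use the first DOI among the identifiers, written as a resolver URL.
    for identifier in cff.get("identifiers", []):
        if identifier.get("type") == "doi":
            codemeta["identifier"] = f"https://doi.org/{identifier['value']}"
            break
    return codemeta


# A shortened version of the CITATION.cff example above, as a dictionary.
tutorial_cff = {
    "title": "Jupyter tutorial",
    "version": "0.8.0",
    "date-released": "2020-10-08",
    "authors": [
        {
            "family-names": "Schiele",
            "given-names": "Veit",
            "orcid": "https://orcid.org/0000-0002-2448-8958",
        }
    ],
    "identifiers": [{"type": "doi", "value": "10.5281/zenodo.4147287"}],
    "repository-code": "https://github.com/veit/jupyter-tutorial",
}

tutorial_codemeta = cff_to_codemeta(tutorial_cff)
print(tutorial_codemeta["name"], tutorial_codemeta["identifier"])
```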

cff-validator is a GitHub action that checks CITATION.cff files using the R package V8.

There are also some tools for the workflow of CITATION.cff files:

GitHub also offers a service to copy the information from CITATION.cff files in APA and BibTeX format.

Popup on the landing page of a GitHub repository with the option to export APA and BibTeX formats.
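To give an impression of what such an export involves, the following sketch renders a BibTeX entry from citation metadata. It assumes that the CITATION.cff file has already been parsed into a dictionary. The function cff_to_bibtex, the citation key and the selection of fields are choices made for this illustration, and the output is not guaranteed to match what GitHub produces.

```python
def cff_to_bibtex(cff: dict) -> str:
    """Render a BibTeX @software entry from a parsed CITATION.cff dictionary."""
    authors = " and ".join(
        f"{author['family-names']}, {author['given-names']}"
        for author in cff["authors"]
    )
    # date-released is an ISO date, so the first four characters are the year.
    year = str(cff["date-released"])[:4]
    # Citation key built from the first author's family name and the year.
    key = f"{cff['authors'][0]['family-names']}_{year}".replace(" ", "_")
    fields = {
        "author": authors,
        "title": cff["title"],
        "version": cff["version"],
        "year": year,
    }
    # Add the first DOI and the repository URL when they are present.
    doi = next(
        (i["value"] for i in cff.get("identifiers", []) if i.get("type") == "doi"),
        None,
    )
    if doi:
        fields["doi"] = doi
    if "repository-code" in cff:
        fields["url"] = cff["repository-code"]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@software{{{key},\n{body}\n}}"


# A shortened version of the CITATION.cff example, as a dictionary.
example_cff = {
    "title": "Jupyter tutorial",
    "version": "0.8.0",
    "date-released": "2020-10-08",
    "authors": [{"family-names": "Schiele", "given-names": "Veit"}],
    "identifiers": [{"type": "doi", "value": "10.5281/zenodo.4147287"}],
    "repository-code": "https://github.com/veit/jupyter-tutorial",
}

bibtex_entry = cff_to_bibtex(example_cff)
print(bibtex_entry)
```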

When registering a DOI via Zenodo, the CITATION.cff file in the GitHub repository is also used. Zotero also interprets the Citation File Format file in GitHub repositories; however, Zotero can retrieve meta-information about the repository, such as company, programming language etc., even without a Citation File Format file.

Git2PROV#

Git2PROV generates PROV data from the information in a Git repository.

On the command line, the conversion can be easily executed with:

$ git2prov git_url [serialization]

For example:

$ git2prov git@github.com:veit/python4datascience.git PROV-JSON

In total, the following serialisation formats are available:

  • PROV-N

  • PROV-JSON

  • PROV-O

  • PROV-XML

Alternatively, Git2PROV also provides a web server with:

$ git2prov-server [port]
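If the conversion is to be triggered from a script, the command line can be assembled programmatically. The following sketch only uses the syntax shown above, git2prov git_url [serialization]. The helper names are hypothetical, and it is an assumption that git2prov writes its result to standard output.

```python
import subprocess

# The serialisation formats listed above.
SERIALIZATIONS = ("PROV-N", "PROV-JSON", "PROV-O", "PROV-XML")


def git2prov_command(git_url, serialization=None):
    """Build the argument list for a git2prov call."""
    command = ["git2prov", git_url]
    if serialization is not None:
        if serialization not in SERIALIZATIONS:
            raise ValueError(f"unknown serialization: {serialization!r}")
        command.append(serialization)
    return command


def run_git2prov(git_url, serialization=None):
    """Run git2prov and return what it prints (requires git2prov to be installed)."""
    result = subprocess.run(
        git2prov_command(git_url, serialization),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git2prov reports a failure
    )
    return result.stdout


command = git2prov_command("git@github.com:veit/python4datascience.git", "PROV-JSON")
print(" ".join(command))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with unusual repository URLs.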

HERMES#

HERMES simplifies the publication of research software by continuously harvesting existing metadata from Citation File Format, CodeMeta and Git. The metadata is then prepared for deposit in InvenioRDM and Dataverse. Finally, CITATION.cff and codemeta.json are updated for the publication repositories.

  1. Add .hermes/ to the .gitignore file

  2. Provide CITATION.cff file with additional metadata

    Important

    Make sure license is defined in the CITATION.cff file; otherwise, your release will not be accepted as open access by the Zenodo sandbox.

  3. Configure HERMES workflow

    The HERMES workflow is configured in a TOML file, hermes.toml, in which each step gets its own section.

    If you want to configure HERMES to use the metadata from Git and CITATION.cff, and to deposit it in the Zenodo sandbox built on InvenioRDM, the hermes.toml file looks like this:

    hermes.toml#
    # SPDX-FileCopyrightText: 2021 Veit Schiele
    #
    # SPDX-License-Identifier: BSD-3-Clause
    
    [harvest]
    from = [ "git", "cff" ]
    
    [deposit]
    mapping = "invenio"
    target = "invenio"
    
    [deposit.invenio]
    site_url = "https://sandbox.zenodo.org"
    access_right = "open"
    
    [postprocess]
    execute = [ "config_record_id" ]
    
  4. Access token for Zenodo Sandbox

    In order for GitHub Actions to publish your repository in the Zenodo Sandbox, you need a personal access token. To do this, log in to Zenodo Sandbox and create a personal access token in your user profile with the name HERMES workflow and the scopes deposit:actions and deposit:write:

    Zenodo: New personal access token
  5. Copy the newly created token to a new GitHub secret named ZENODO_SANDBOX in your repository: Settings ‣ Secrets and variables ‣ Actions ‣ New repository secret:

    GitHub: New Actions secret
  6. Configure the GitHub action

    The HERMES project provides templates for continuous integration in a special repository: hermes-hmc/ci-templates. Copy the template file TEMPLATE_hermes_github_to_zenodo.yml into the .github/workflows/ directory of your repository and rename it, for example to hermes_github_to_zenodo.yml.

    Then you should go through the file and look for comments marked # ADAPT. Modify the file to suit your needs.

    Finally, add the workflow file to version control and push it to the GitHub server:

    $ git add .github/workflows/hermes_github_to_zenodo.yml
    $ git commit -m ":construction_worker: GitHub action for automatic publication with HERMES"
    $ git push
    
  7. Allow GitHub Actions to create pull requests in your repository

    The HERMES workflow will not publish metadata without your approval. Instead, it will create a pull request so that you can approve or change the metadata that is stored. To enable this, go to Settings ‣ Actions ‣ General in your repository and in the Workflow permissions section, enable Allow GitHub Actions to create and approve pull requests.