How to: Zenodo

October 7, 2021

✈️ Hello, this is your community manager speaking.
Bio-IT community manager, but - since recently - also Zenodo community manager.
What does this mean? Let me explain.

FAIR, sustainable and (eventually) open science

FAIR stands for Findable, Accessible, Interoperable, Reusable. It refers to a list of 15 principles published in The FAIR Guiding Principles for scientific data management and stewardship in 2016, aimed at “improving the infrastructure supporting the reuse of scholarly data”. The principles can be used as a checklist to ensure that scientific data (or software, or any other research product) is prepared and stored in such a way that it could be used again.

FAIR approaches contribute to project sustainability, i.e. they help ensure that each stage of the project’s life cycle is well planned, and that the outputs and benefits of the research outlive the publication. Research, much like other project-based working environments, has its own sustainability issues, including high turnover, a wide range of expertise among the people involved in a project, and the perception that all scientific work starts and ends with publications.

On the contrary - citing an inspiring concept by Wolfgang Huber at a recent EMBL meeting - “Papers are merely an advertisement for research, not an output”. Your data, your software, your training materials and even your meeting and lab-notebook notes - these are outputs of your research! You should worry about how you structure, organise and keep them at least as much as you worry about your manuscript. Major funding programmes such as Horizon Europe have started to recognise the importance and potential impact of this effort, and have adapted their evaluation schemes to take it into consideration. Strengthening research infrastructure and FAIRification is also an EMBL priority, in line with larger-scale efforts from the entire research community, including ELIXIR and de.NBI, the German Network for Bioinformatics Infrastructure.

Despite what one may think, storing your research outputs in safe repositories, licensing them and enriching them with descriptive metadata helps you claim ownership of your research and control who has access to it. Indeed, FAIR science is not always open science. FAIR is meant first to make life easier for you and your collaborators, especially if you plan for it from the start (FAIR by design), and only then to open your work to the outside world, if and when you decide to.

To make this point, among the several actions you can take and platforms you can use to FAIRify your way of working, I would like to start with one that comes last in chronological order: sharing your research outputs.

Why Zenodo, or any other scientific archive

A little grey box on Zenodo’s homepage claims that one should use Zenodo because it is:

Safe — your research is stored safely for the future in CERN’s Data Centre for as long as CERN exists.
Trusted — built and operated by CERN and OpenAIRE to ensure that everyone can join in Open Science.
Citeable — every upload is assigned a Digital Object Identifier (DOI), to make them citable and trackable.
No waiting time — Uploads are made available online as soon as you hit publish, and your DOI is registered within seconds.
Open or closed — Share e.g. anonymized clinical trial data with only medical professionals via our restricted access mode.
Versioning — Easily update your dataset with our versioning feature.
GitHub integration — Easily preserve your GitHub repository in Zenodo.
Usage statistics — All uploads display standards compliant usage statistics.

These features (or at least some of them) also apply to other similar platforms, such as institutional archives, bioRxiv, figshare, Open Science Framework and many more. Each has its own specialisation and, particularly for preprints or data, I recommend prioritising field-specific or institution-specific archives. This helps your work reach the most suitable audience. However, there’s much more to share than preprints. In my opinion (and I encourage you, reader, to interact with this post and start a discussion), Zenodo has three main advantages in specific cases.

It is integrated with GitHub, so you can easily assign Digital Object Identifiers (DOIs) to stable versions of your repository. This makes them citable, so you and others can use your code or data while acknowledging the source.
Even more relevant, I believe, is that Zenodo is non-specialist by definition, so it is a good place to share “public outreach” research outputs. I am referring to presentations, videos and notes of meetings, collaborative documents from conferences, training materials, and so on.
Finally, Zenodo features communities: curated, thematic collections of entries. They can be automatically exported as a whole, and used as pre-filled bibliographies / data collections / thematic stocks of items. The creators write the community description and can accept or reject entries that Zenodo users submit to the community. This can be a useful tool in a variety of scenarios, including ours - and this is why we have created the community that we now introduce.
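
To give an idea of what "exported as a whole" can look like in practice, here is a minimal sketch that turns the records of one community into a plain-text bibliography. It assumes Zenodo's public records search endpoint (https://zenodo.org/api/records) accepts a `communities` query parameter and returns JSON with `hits.hits`, each record carrying a `doi` and a `metadata` block with `title`, `creators` and `publication_date`. The helper names (`community_query_url`, `format_entry`, `fetch_bibliography`) and the community identifier `bio-it` are hypothetical, chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of Zenodo's public records search endpoint.
ZENODO_API = "https://zenodo.org/api/records"


def community_query_url(community_id, size=25):
    """Build the search URL for the records of one community."""
    params = urllib.parse.urlencode({"communities": community_id, "size": size})
    return f"{ZENODO_API}?{params}"


def format_entry(record):
    """Turn one record (a dict in the assumed response shape) into a citation line."""
    meta = record.get("metadata", {})
    authors = "; ".join(c.get("name", "?") for c in meta.get("creators", []))
    year = meta.get("publication_date", "")[:4]
    doi = record.get("doi") or meta.get("doi", "")
    title = meta.get("title", "Untitled")
    return f"{authors} ({year}). {title}. https://doi.org/{doi}"


def fetch_bibliography(community_id):
    """Download the first page of community records and format each one.

    Needs network access; the response layout is an assumption (see above).
    """
    with urllib.request.urlopen(community_query_url(community_id)) as response:
        hits = json.load(response)["hits"]["hits"]
    return [format_entry(record) for record in hits]
```

Calling `fetch_bibliography("bio-it")` would then return one line per entry, ready to paste into a document, provided the identifier matches the community's actual one.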

Bio-IT Zenodo community

We created the Bio-IT Zenodo community as a place to store (and provide DOIs for) materials developed by the Bio-IT community, for the Bio-IT community and beyond. Some examples of items that can be stored here are:

Presentations, posters, flyers and other advertising materials about the Bio-IT project;
Infographics and statistics about Bio-IT services, users, surveys;
Training materials from Bio-IT courses;
Software developed in the framework of the Bio-IT project, if relevant across EMBL and fields.

We welcome your contributions! Submit to the community through this link. We also look forward to hearing your opinion and encourage you to interact with this post, either by starting a discussion in the EMBL chat or by contacting us via email.