UZH News

Reproducible Research

Scientific Community Reflects on Fundamental Research Standards

Reproducibility of research results is a key standard in the sciences. At the inaugural Swiss Reproducibility Conference, researchers will discuss the best ways to meet this requirement – including how to cope with growing data volumes and the pressure to publish.
Florian Meyer
A collage of genomic data: The sheer volume of data as well as natural, biological variability can make it difficult to replicate research results. Researchers are attending a conference to discuss new approaches to reproducibility. (Image: Adobe Stock)

Publishing research results that are comprehensible, verifiable and confirmable is one of the basic tenets of modern empirical science. Just like the peer review process for publications, replicating experiments plays a central role here. Being able to reproduce research results, and for different research groups to achieve matching results under identical conditions, is a prerequisite for scientific findings to be accepted and for scientific statements to be considered credible.

Explainable and disruptive obstacles

In practice, however, it’s not always easy to achieve reproducibility in every case. A Nature survey from 2016 revealed that more than 70 percent of the 1,576 researchers surveyed had been unable to reproduce the experiments of other scientists, and more than half also failed to reproduce their own experiments. This is not necessarily due to dishonest practices or faulty science. In biology, for instance, some results are difficult to reproduce purely because of the variability in the cell material used.

But there are also other obstacles that can sometimes make it difficult to reproduce empirical data, and these are currently the subject of lively discussion within the scientific community. Next week, ETH Zurich will host the inaugural Swiss Reproducibility Conference. Three topics will be on the agenda: Reproducibility and Replication, Transparency and Open Scholarship, and Meta-Research and Assessment. The conference is organized by the Swiss National Science Foundation SNSF and the Swiss Reproducibility Network (SwissRN), whose members include most Swiss universities.

SwissRN aims to promote rigorous research practices and robust research results in Switzerland. “One of the main goals of the conference is to share knowledge of new approaches that actually enhance research quality in everyday work, help achieve reproducible results and promote open science,” says Leonhard Held, professor of biostatistics, Open Science delegate at the University of Zurich and a member of the SwissRN Steering Committee. “The conference will provide researchers with a forum to discuss which new methods and techniques are best suited to achieving comprehensible research results and ensuring that experiments can be replicated,” says Daniel Stekhoven, a mathematician representing ETH Zurich on the SwissRN Steering Committee and one of the conference organizers. For almost 10 years, Stekhoven has headed ETH’s NEXUS technology platform, which provides support to biomedical research projects.

New approaches to a new level of data diversity

Reproducibility refers to verifying research results by repeating the analysis of the same data with identical methods. Replication means repeating a scientific study by doing a new experiment under identical or slightly altered conditions to see whether it arrives at the same conclusion as the original. It has to be noted that different definitions exist for both terms.

“Reproducibility and replication are essential parts of the scientific process. But there are certain challenges to putting them into practice,” Stekhoven says. The reasons for this go beyond the research process itself. Researchers generally come up with a hypothesis and design experiments to test whether or not their assumptions are correct. “What’s changed so dramatically over the past few years,” Stekhoven says, “is that empirical and clinical researchers are now having to process massive volumes of mostly high-dimensional data. What’s more, we’re seeing a major surge in the number of new scientific publications.”

These are both quantitative phenomena that can get in the way of verifying research results. They also call for new statistical methods and expanded approaches to research assessment, without which researchers run the risk of having to retract premature conclusions. In fact, the number of journal articles that are retracted after publication has been rising for some time now. Retraction Watch is a website that tracks which articles have been retracted and why.

A number of trends that negatively impact research quality have been identified. In response, researchers as well as universities, research policy umbrella organizations and funding institutions have launched initiatives and programs to bring more reproducibility and transparency to the research process. The Center for Reproducible Science at the University of Zurich trains scientists in sound research practices and develops new methods for reproducibility and replicability. “One of our current areas of focus is to ensure that the assessment methods used are also transparent, documented and reproducible,” says Held, Director of the Center for Reproducible Science.

Sharing code and pre-registering research proposals

As the amount of data and the number of analysis tools keep growing, computational reproducibility is becoming increasingly relevant. Its objectives are threefold: first, to make it possible to independently verify the results of computer-assisted studies; second, to understand how the software used influences these results; and third, to determine whether computations can be replicated. One approach that is gaining acceptance is the sharing of code.
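The idea behind computational reproducibility can be sketched in a few lines. The script below, a minimal illustration rather than any prescribed workflow, fixes the random seed of an analysis and records the software environment together with a hash of the output, so that an independent rerun can be checked against the original result. All names and values in it are made up for the example.

```python
import hashlib
import json
import platform
import random
import sys

# Fix the random seed so a rerun of the analysis yields identical numbers.
SEED = 42
random.seed(SEED)
sample = [random.gauss(0, 1) for _ in range(1000)]
mean = sum(sample) / len(sample)

# Record the computational environment alongside the result, so that
# others can see which software produced it and verify a reproduction:
# a rerun with the same seed must yield the same digest.
provenance = {
    "python_version": sys.version.split()[0],
    "platform": platform.platform(),
    "seed": SEED,
    "mean": mean,
    "result_digest": hashlib.sha256(json.dumps(sample).encode()).hexdigest(),
}
print(json.dumps(provenance, indent=2))
```

In a real project the same role is played by pinned dependency files, container images and workflow managers; the principle, however, is the one shown here: every source of variation is either fixed or documented.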

In biomedical research and in the life sciences, Stekhoven says, researchers who intend to publish a study in a journal nowadays have to upload the molecular data they used to an archive. For computational methods, by contrast, a brief description in the methods section or appendix is usually sufficient, except when the focus of the journal in question – Bioinformatics, for example – is squarely on methodology. “For computational reproducibility, each paper published would ideally feature a link to the code in a GitHub repository and another link to a data archive.” Such solutions already exist: for molecular data, for instance, there are the European Nucleotide Archive (ENA) and the European Genome-Phenome Archive (EGA).

Reproducibility and replication are two core aspects of open scholarship. But open scholarship (or open science) also encompasses open access to published materials and data. This includes the Open Research Data funding programs launched by the ETH Board and the Swiss Conference of Rectors of Higher Education Institutions. Then there are the data stewards, who help research groups to manage open research data and establish reproducible data workflows. Data exchange is framed by the FAIR principles – Findable, Accessible, Interoperable, Reusable – which facilitate data sharing without demanding unconditional openness.

Another part of the current discussion is an approach for increasing transparency: pre-registration. This means that researchers publish their proposal and the design of the planned experiment before they carry it out, gather data and publish a study. Such pre-registrations can take place with or without review, but they generally receive a digital object identifier (DOI), which means they can be cited like a published article. Pre-registrations increase transparency by allowing the final results to be compared with the original proposal. Moreover, they help counter the publication bias that arises when journals predominantly publish positive results. A reviewed registration ensures that negative results are also published, for example when the hypothesis could not be confirmed. “Pre-registration is an indispensable tool in open scholarship. It’s proven to be an asset for clinical studies and in psychology, and has potential in other empirical research disciplines,” Held says.

New CV format for different performance assessments

With regard to reproducibility in research, Held also welcomes more recent approaches to assessing research and scientific performance that are not based solely on the number of publications and citations. As noteworthy approaches to research assessment, he points to the DORA and CoARA initiatives as well as to a new CV format that the SNSF introduced in 2022. The aims of the latter include finding ways to more adequately gauge the quality of scientific work and the value of research activities that don’t necessarily lead to a publication.

This is where meta-research comes in, because all these new approaches and methods for reproducibility, replication and transparency must themselves be verified to make sure they deliver the desired effect. “All the new approaches that we’re going to discuss share the same aim: to improve research quality,” Held says.

This article by Florian Meyer is a slightly edited version of an article published by ETH Zurich.