Daphne, an eager computer-vision scientist, encounters a novel study highly relevant to her autonomous vehicles research. The study uses computational methods to reveal hidden driving patterns. Aware of autonomous driving’s safety implications, Daphne decides to act responsibly and replicate the study to verify its findings prior to relying on them.
After carefully reading the paper’s methodology, however, Daphne realizes that the original researchers provided no access to the code and only partial access to the data sets they used in their study. Daphne emails Sasha, the corresponding author of the original study, and asks for assistance but gets no response. Daphne now faces a serious dilemma: either conduct an independent study from scratch, which would require tremendous resources, or rely on the existing study even though it cannot be replicated.
Today, computational science, powered by complex models and vast data sets, drives research and innovation across almost every scholarly arena. Daphne’s illustrative story highlights a real challenge scientists face in replicating studies, and underscores the importance of enabling access to code, data and other information. Promoting replicability of scientific studies benefits the scientific community and fosters public trust in science.
Replication is vital to science, and investigators as far back as the pioneering 17th-century chemist Robert Boyle have seen it as fundamental. It helps ensure that published findings are valid. If other scientists can repeat the study and get the same results, we can trust the findings. If not, it’s a red flag that something might be off. Science is cumulative, and replications help build a comprehensive and credible body of knowledge.
Unfortunately, the scientific community has struggled with a “replication crisis” in recent decades, in which scientists in fields from economics to physics find it hard to reproduce results from published studies. In fact, some estimates of the crisis’s severity in specific fields, such as biology and psychology, suggest that a significant portion of the scientific publications in these disciplines is in question.
One of the major challenges driving the replication crisis is that scientists often do not share all the information needed to replicate their work. Access to research materials is especially crucial for the replication of computational studies, given the increasing use of computational methods and the reliance of such studies on large data sets. Unfortunately, such access is far from guaranteed.
There are many reasons why. Sometimes academic considerations, such as avoiding criticism, fearing retraction if a mistake is revealed, and avoiding “scooping,” may drive the decision not to share. In other cases, as the boundary between academic and applied research blurs, commercial considerations lead researchers to keep replication materials private to preserve the prospect of commercializing them. This creates a culture of secrecy, which goes against the fundamental values of openness and sharing in the scientific community.
Intellectual property (IP) law plays a significant role in creating a culture of secrecy in science. Patents and trade secrets secure the economic potential held in research and development (R&D). Patent law does so by providing exclusive rights to inventions. Yet a patented invention must be novel and nonobvious, and prior public disclosure can undermine both requirements, which encourages inventors to conceal information (at least until they file a patent application). This can lead them to limit access to replication materials to maximize their chances of obtaining patent protection. The mechanism by which trade secrets contribute to nonsharing norms is more straightforward: trade-secret law can protect essentially any kind of information as long as it is kept secret.
Thus, while encouraging innovation, IP rights can also discourage sharing of research materials and exacerbate the replication crisis. Nevertheless, these rights matter to researchers, organizations and commercial firms, and benefit them (and the public that enjoys their efforts) in multiple ways, from the financial to the reputational.
To address this conflict, we propose a new policy instrument that could facilitate studies’ replicability without depriving scientists of their IP protection: the conditional access agreement (CAA). In short, the CAA establishes a private, controlled channel of communication for the transfer of replication materials between authors and replicators. This allows for on-demand replicability while maintaining the proprietary potential of a scientific study.
Under the CAA mechanism, when submitting a paper for publication, an author would execute an agreement with the journal, pledging to provide full access to replication materials upon demand by other researchers. The agreement would specify that anyone requesting access to the materials can obtain it only upon signing a nondisclosure agreement (NDA). The NDA would prohibit the use of the replication materials delivered by the original authors for any purpose other than replication. Since patent law and trade-secret law bar protection only in the case of public disclosure, information shared privately under the NDA would not nullify the possibility of obtaining patent protection, nor would it negate trade secrecy.
The CAA policy is feasible thanks to the involvement of scientific journals as powerful intermediaries in the scientific ecosystem. Journals are the gatekeepers of responsible science, and as such, they are continuously involved in a publication’s life cycle, from submission to post-publication. A CAA policy also aligns with journals’ mission of promoting rigorous and credible science. Notably, the CAA policy imposes minimal costs on researchers, replicators and journals, as the infrastructure for such a mechanism already exists, including data repositories and online manuscript submission systems in which the CAA could be embedded.
In the face of a replication crisis, we believe that conditional access agreements facilitated by journals can improve science. Pursuing this path will enhance the replicability of scientific research and increase public trust in science.
This is an opinion and analysis article, and the views expressed by the author or authors are not necessarily those of Scientific American.