The NIH Preprint Pilot: A New Experiment for a New Era

Guest post by Kathryn Funk, program manager for NLM’s PubMed Central.

Over the last several months, we have seen an increase in demand from the research and library communities for broader discovery and distribution of COVID-19 related literature, including early results posted as preprints. Preprints are complete, public drafts of scientific documents that are not yet peer reviewed. They are playing a key role in accelerating dissemination of research on the SARS-CoV-2 virus and COVID-19.

Recognizing the growing interest in preprints, NLM is today launching the first phase of the NIH Preprint Pilot, which will test the viability of making preprints searchable in PubMed Central (PMC) and, by extension, discoverable in PubMed, starting with COVID-19 preprints reporting NIH-supported research.

To be clear, NLM is not building a preprint server for NIH investigators, nor are we developing a comprehensive preprint discovery resource. Rather, through this pilot, we plan to add a curated collection of preprints from eligible preprint servers to our established literature resources. In doing so, our goal is to improve scholarly communications by accelerating and expanding the findability of NIH research results.

With the encouragement of NIH leadership, NLM has been exploring ways to leverage its literature databases to help accelerate the discoverability and maximize the impact of NIH-supported research via preprints. The planned pilot builds on guidance released by NIH in March 2017, which encouraged NIH investigators to use preprints and other interim research products to speed the dissemination of research and enhance the rigor of their work through public comments and new scientific collaborations.

Interest at NIH in the potential of preprints to improve scholarly communication long predates the 2017 guidance. As author Matthew Cobb recounts in “The prehistory of biology preprints: A forgotten experiment from the 1960s,” Errett C. Albritton, an administrator in the NIH Office of Research Accomplishments, established an informal network for the circulation of preprints and other scholarly communications by post to the group’s members. Although this initial “experiment” ended in 1967, support for the open sharing of scientific knowledge has continued at NIH through efforts such as the NIH Public Access Policy, which this pilot now seeks to complement.

Phase One Focus: COVID-19

In the first phase of the current pilot, NLM will make use of the NIH Office of Portfolio Analysis COVID-19 Portfolio tool to help identify preprints relating to the SARS-CoV-2 virus and COVID-19 pandemic. NLM will select preprints that either list an NIH-affiliated author or acknowledge NIH grant support. To accelerate discovery, preprints will be loaded and made searchable in PMC and PubMed once identified as in scope, without additional processing. Standard XML versions will be loaded once the conversion process is completed. This workflow allows for rapid inclusion of preprints in the pilot without asking NIH investigators to separately submit them to PMC.

Recognizing that users come to NLM resources with varying levels of familiarity with scholarly communication practices, we want to make sure that researchers, clinicians, and the public can easily distinguish between preprints and peer-reviewed journal literature. Preprint records in PMC and PubMed will be flagged with large banners that clearly identify them as preprints. The banners will explain that the papers have not been peer reviewed and link to information about the pilot for additional context. Those who want to view only peer-reviewed journal literature will be able to exclude preprint records from search results in PMC using newly created filters.

We’ll closely monitor the early outcomes of the first phase of the pilot as we test and refine our workflows. We hope to be able to expand our scope in the next phase of the pilot to include the full spectrum of NIH-funded research, allow NIH investigators to identify their preprints through simplified reporting in My Bibliography, and establish more automated and faster curation workflows.

NLM will continually monitor the impact of the pilot on the scholarly communications landscape, including how research results are shared, discovered, disseminated, and reported, as well as any evidence of increased awareness and emerging best practices around preprint sharing.

We expect the pilot will run for a minimum of 12 months to give us sufficient time to examine the use of preprints and their importance to scholarly communications in biomedical science. Feedback from stakeholders and lessons learned will inform future NLM efforts related to preprints.

PMC turned 20 in February, and its story over those two decades has been one of innovation, evolution, and expansion, as we strive to build a collection at NLM that represents “the intellectual content and diversity of the world’s biomedical literature, data, and other research objects and information” and to foster open science practices. In launching this new preprint experiment in PMC, with an initial focus on COVID-19-related preprints, NLM hopes to continue to accelerate and expand access to relevant research in support of the ongoing public health emergency response efforts and to learn more about the impact of accelerated discovery and open sharing of research results on scholarly communications.

We encourage you to learn more about the NIH Preprint Pilot and review the pilot overview and related FAQs.

Kathryn Funk is the program manager for PubMed Central. She is responsible for PMC policy as well as PMC’s role in supporting the public access policies of numerous funding agencies, including NIH. Katie received her master’s degree in library and information science from The Catholic University of America.

