Fostering a Culture of Scientific Data Stewardship


Guest post by Jerry Sheehan, Deputy Director, National Library of Medicine.

Making research data broadly findable, accessible, interoperable, and reusable is essential to advancing science and accelerating its translation into knowledge and innovation. The global response to COVID-19 highlights the importance and benefits of sharing research data more openly.

The National Institutes of Health (NIH) has long championed policies that make the results of research available to the public. Last week, NIH released the NIH Policy for Data Management and Sharing (DMS Policy) to promote the management and sharing of scientific data generated from NIH-funded or conducted research. This policy replaces the 2003 NIH Data Sharing Policy.

The DMS Policy was informed by public feedback and requires NIH-funded researchers to plan for the management and sharing of scientific data. It also makes clear that data sharing is a fundamental part of the research process.

Data sharing benefits the scientific community and the public.

For the scientific community, data sharing enables researchers to validate scientific results, increasing transparency and accountability. Data sharing also strengthens collaborations that allow for richer analyses. Strong data-sharing practices facilitate the reuse of hard-to-generate data, such as those acquired during complex experiments or once-in-a-lifetime events like natural disasters or pandemics.

For the public, sound data-sharing practices demonstrate good stewardship of taxpayer funds. Clear, well-written data management and sharing plans promote transparency and accountability to society. They also expand opportunities for data to be accessed and reused by clinicians, students, educators, and innovators in health care and other sectors of the economy.

As an organization dedicated to improving access to data and information to advance biomedical sciences and public health, NLM plays a key role in implementing the new policy and supporting researchers in meeting its requirements. NLM maintains a number of data repositories, such as the Sequence Read Archive and ClinicalTrials.gov, that curate, preserve, and provide access to research data. NLM also maintains a longer list of NIH-supported data repositories that accept different types of data (e.g., genomic, imaging) from different research domains (e.g., cancer, neuroscience, behavioral sciences). Where appropriate domain-specific repositories do not exist, NLM has made clear how researchers can include small datasets (<2GB) with articles deposited in NLM’s PubMed Central (PMC) under the NIH Public Access Policy.

NLM also works with the broader library community to support improved data management and sharing. Supplemental information issued with the new policy makes it clear that research budgets can include costs of data management and sharing, such as those for data curation, formatting data to accepted standards, attaching metadata to foster discoverability, and preparing data for storage in a repository. These are the kinds of services increasingly provided by libraries and librarians in universities and academic medical centers across the country. NLM, through the Network of the National Library of Medicine, offers training in data management and data literacy to health science, public, and other librarians to expand capacity for these important services.

NIH’s DMS Policy applies to all research, funded or conducted in whole or in part by NIH, that results in the generation of scientific data. This includes research funded or conducted through extramural grants, contracts, intramural research projects, or other funding agreements. The DMS Policy does not apply to research and other activities that do not generate scientific data, including training, infrastructure development, and non-research activities.

NIH will continue to engage the research community to support the change and implementation of this new policy, which will go into effect in January 2023. NLM will continue to work within NIH and across the library and information science communities to develop innovative ways to support the policy and advance the effective stewardship of research data. Let us know how else we can support this important policy advance.

Read more about this major policy release in the NIH’s Under the Poliscope blog.

As NLM Deputy Director, Jerry Sheehan shares responsibility with the Director for overall program development, program evaluation, policy formulation, and the direction and coordination of all Library activities. He has made major contributions to the development and implementation of NIH, HHS, and U.S. government-wide policy related to open science, public access to government-funded information, clinical trials registration, and electronic health records.