Guest post by Taunton Paine, MA, Director of the Division of Scientific Data Sharing Policy, NIH Office of Science Policy
Behind the NIH Genomic Data Sharing Policy
In November 2021, NIH published a request for information seeking public input on the future of the NIH Genomic Data Sharing (GDS) Policy. Originally published in 2014, the NIH GDS Policy expanded and refined an existing framework for the broad and responsible sharing of genomic research data that was first created for genome-wide association studies. Since this policy framework was first implemented, NIH has accepted data from more than 1,200 studies into the NIH database of Genotypes and Phenotypes (dbGaP) hosted by NLM and facilitated more than 64,000 additional research uses. Many more studies involving non-human data and human data with study participant consent for full public access have been shared as a result of the GDS Policy through a variety of additional NIH repositories, such as GenBank and the Sequence Read Archive, which are also hosted by NLM.
While the GDS Policy has been remarkably successful at spurring the timely, productive, and secure sharing of genomic data, NIH has devoted substantial effort to keeping this framework relevant, issuing updates as needed and providing extensive guidance to account for trends in science, technology, and society. For example, the policy and related guidance evolved to accommodate a growing shift toward cloud computing in genomic research.
Evolving Priorities: Help Us Shape the Future of Genomic Data Sharing
In October 2020, NIH issued the Final NIH Policy for Data Management and Sharing. The final policy will be effective on January 25, 2023. To better align the GDS Policy with the NIH Policy for Data Management and Sharing, NIH is soliciting input about proposed changes to the GDS Policy. Described below are some of the key issues on which NIH is seeking comment in the request for information.
The use of genomic data in research continues to evolve. Specifically, there is growing interest in the use of human data elements that might be considered identifiable, which cannot currently be submitted to NIH genomic data repositories, and in the ability to match participants’ data across repositories or with data from other sources. The request for information seeks comment on whether NIH should permit these activities, and if so, what additional protections may be necessary.
To reduce the technical burden of analyzing genomic data, NIH has developed resources beyond dbGaP for storing, sharing, and analyzing human genomic data, resulting in an increasingly federated landscape of platforms and repositories hosted by NIH and awardee institutions. To ensure consistency of operations and protections, NIH is proposing core principles for NIH-supported genomic data repositories and platforms.
NIH frequently receives questions about other types of high-dimensional “omics” data, such as microbiomic or proteomic data. The term “omics” describes new and comprehensive approaches for analyzing molecular profiles of humans and other organisms. In some cases, non-genomic data types may pose risks of re-identification similar to those of large-scale genomic data but may not be subject to the GDS Policy in all scenarios. Furthermore, the GDS Policy may not apply in some scenarios even when genomic data are generated, such as for very small studies. As a longer-term consideration, NIH is soliciting views on whether the more specific sharing expectations of the GDS Policy or the protective framework it offers should be adjusted to account for these other data types or scenarios.
We are Listening!
We are working to ensure that the framework established by the GDS Policy keeps pace with the needs of the research enterprise, research participants, and the patients it is ultimately intended to benefit. This RFI may result in updates to the GDS Policy, related guidance, or implementation. That’s why we’re asking you, the community, for your input. Please visit the request for information page today; comments are due by February 28. We look forward to hearing your input and appreciate your efforts!
Taunton Paine, MA, is the Director of the Scientific Data Sharing Policy Division in the Office of Science Policy at the NIH. Taunton has been with the Office of Science Policy since 2011. His division is responsible for issues relating to data sharing policy, including issuance of the recent NIH Data Management and Sharing Policy, oversight of the NIH Genomic Data Sharing Policy, and management of the Data Science Policy Council. He holds a dual master’s degree from Columbia University and the London School of Economics and Political Science, where he studied the history of international relations.