Gearing Up for 2023 Part II: Implementing the NIH Data Management and Sharing Policy

This blog post is by Lyric Jorgenson, PhD, the Acting Director of the NIH Office of Science Policy. It was originally posted on May 12 on the NIH Office of Science Policy Under the Poliscope blog. We encourage you to read it and submit comments and feedback on the draft supplemental information to the NIH Policy for Data Management and Sharing: Protecting Privacy When Sharing Human Research Participant Data by June 27.

Sequels are all the rage these days.  I figure if Marvel can make endless “Avengers” movies, I could start making blog sequels.  Back in the beginning of the year, I wrote Part I of this blog series about how NIH is working to implement the new NIH Data Management and Sharing Policy (DMS Policy).  I mentioned at that time that additional resources were forthcoming.

I should note that when we started to receive comments on what was to become the NIH DMS Policy, one thing in particular stood out to us.  Many commenters told us it would be helpful to have clear information on how to protect the privacy and respect the autonomy of participants when sharing data.  Now, we all know that cliffhangers build anticipation, so without further delay, I want to share with you some of the tools NIH has been working on to answer that call.

First, if you have seen the Avengers movies, you likely will have noticed that they tend to introduce a new villain that the team needs to battle with either new tools (think of OSP with Thor’s Stormbreaker axe) or the help of new superheroes like Captain Marvel. While not exactly a new villain, the lack of consistent consent language to facilitate secondary research with data and biospecimens is certainly a challenge many of our stakeholders have raised and one that we thought we could help address.

NIH has a long history of developing consent language and, as such, our team worked across the agency – and with you! – to develop a new resource that shares best practices for developing informed consents to facilitate data/biospecimen storage and sharing for future use.  It also provides modifiable sample language that investigators and IRBs can use to assist in the clear communication of potential risks and benefits associated with data/biospecimen storage and sharing.  In developing this resource, we engaged with key federal partners, as well as scientific societies and associations.  Importantly, we also considered the 102 comments from stakeholders in response to an RFI that we issued in 2021.

As for our second resource, we are requesting public comment on protecting the privacy of research participants when data is shared. I think I need to be upfront and acknowledge that we have issued many of these types of requests over the last several months and NIH understands the effort that folks take to thoughtfully respond.  With that said, we think the research community will greatly benefit from this resource and we want to hear your thoughts on whether it hits the mark or needs adjustment.

When reviewing the document, please bear in mind that the main purpose is to provide researchers with information on:

  • Operational Principles for Protecting Participant Privacy when Sharing Scientific Data
  • Best Practices for Protecting Participant Privacy when Sharing Scientific Data
  • Points to Consider for Designating Scientific Data for Controlled Access

Comments on the draft will be accepted until June 27, 2022, and full information, including how to submit a comment, can be found here.

Finally, every sequel needs a twist ending! In November 2021, NIH published a request for comments on the future directions of the NIH Genomic Data Sharing Policy.  We are still reviewing the many points and perspectives that were raised, but while we consider next steps, the comments we received are now available on the OSP website.  Okay, so maybe that twist wasn’t as big as, say, Darth Vader revealing he is (spoiler alert) Luke’s father in The Empire Strikes Back, but it’s still pretty good for the science policy world.

With a little more than half a year left until the implementation date of the NIH DMS Policy, we will continue to provide updates and resources over the next several months.

Midnight in the Library

Right now, I am reading The Midnight Library by Matt Haig. It’s a fanciful story of a woman in limbo between life and death who finds herself in a magical library, and each book represents one of the lives she could have lived had she made even one tiny different decision. She then finds herself in many of these lives, experiencing what could have been.

This book got me thinking about how NLM helps people experience lives that could be. I see this on two levels:

The first is the scientific pathway: What if . . . ? What if we knew more about the interactions between evolutionary forces and molecular constraints (like the work of Aravind Iyer, PhD), or fully appreciated the potential of proteins for genome engineering (like the discoveries made by Eugene Koonin, PhD), or could envision how and why proteins fold or switch their folds (as explored by Lauren Porter, PhD), or had the power to enable machines to understand human thought (like the research from Dina Demner-Fushman, MD, PhD)? In addition to the discoveries by our NLM intramural researchers, our vast literature and data repositories hold answers that could change lives: why some genetic structures lead to human characteristics, or why a certain biochemical compound helps prevent infection. We help scientists discover these pathways and connections by providing them with the tools to uncover what could be.

The second is how NLM helps people see their what if using the amazing richness of the resources that we make available through our collections. Our resources—which encompass clinical insights, medical information, care guidelines, and self-management—help clinicians determine how to care for people with complex diseases or diagnose an illness in a timely manner. Our repository of clinical information available through PubMed ensures that those in need can access well-reasoned, recognized guiding principles for their care, and our MedlinePlus web resource provides patients and their families and friends with reliable, up-to-date health information to support and encourage healthy behavioral changes.

As in The Midnight Library, books alone do not inspire discovery, guide clinical care, or inform self-management. In Haig’s novel, a fictional librarian who knows the collection shows the main character how to select books by carefully listening to her goals and needs. It is the main character’s engagement with the books that helps her explore the lives she could have lived. At NLM, we too have librarians—located in Bethesda, Maryland, and around the country through NLM’s Network of the National Library of Medicine—who organize the library’s collections and guide patrons toward the best choice of resources. Our resources must be findable, accessible, interoperable, reusable, and actionable! And then, the person—scientist, clinician, patient—must actively engage with the material.

As we approach the future of data-powered health, guided by the NLM Strategic Plan (2017-2027), we will fulfill our mission to collect biomedical literature, organize it, preserve it, and make it accessible to the world. As the knowledge of health and biomedicine continues to grow faster than we can process, we will turn our attention to applying emerging tools, including machine learning and artificial intelligence, to make it easier to find our materials and more efficient to examine them. Through our Extramural Programs, we will continue to stimulate new ways of presenting information to scientists, clinicians, patients, and the public so they can explore possible lives to be lived and test out their promise of better health for society. What lives can we help you explore?

Meet the NLM Investigators: L. Aravind Iyer, PhD, Uncovers the Language of Our DNA

NLM is home to a robust research enterprise. Before the COVID-19 pandemic, I introduced you to two researchers from our Intramural Research Program (IRP), Dr. Lauren Porter and Dr. Xiaofang Jiang.

Now I would like you to meet another one of our researchers, L. Aravind Iyer, PhD. A member of the NLM IRP, Dr. Iyer is a Senior Investigator in the Computational Biology Branch of the National Center for Biotechnology Information. His research revolves around uncovering the stories and patterns held within DNA and RNA and is aimed at unraveling the evolutionary forces that shape biochemical functioning and biological form.

Just like any other biological structure, DNA and RNA evolve over time, which can tell a complex story of an organism’s past and illustrate relationships between organisms that aren’t obvious.

See the infographic below to learn more about the exciting research happening in Dr. Iyer’s lab.

Infographic titled: Language of Our DNA and RNA. It lists the featured researcher, L. Aravind Iyer, PhD, and his title, Senior Investigator in Computational Biology.

The first column of the infographic reads: What I'm Working On. The text in the first column lists Dr. Iyer's short-term goals to: (1) Decipher evolutionary relationships of organisms (vertical and lateral) and proteins; and (2) Computationally discover biochemical activities of proteins. Next, long-term goals are listed as: (1) Create a unified evolutionary theory for biological conflicts; and (2) Understand the contributions of rapid evolution in conflict on other systems.

The second column is titled: How It Works and lists the following text: (1) Reading an evolving story written in DNA/RNA and protein sequences.

(2) Closing gaps in our understanding by applying computational and statistical methods on databases to compare protein sequences and structures.

(3) Determine vertical (ancestral with a picture of an arrow pointing to descendant) and lateral (one organism with a picture of an arrow pointing to another organism) flow of genetic information.

The third and final column of the infographic is titled: What It Looks Like and has a book in an indecipherable language with a caption that says: Deciphering the language of life written in DNA/RNA and protein sequences.


Now, in his own words, learn more about the man behind the research!

What do you enjoy about working at NLM?
NLM is one of the world’s leading centers (such can be counted on one’s fingers) for deciphering the biochemistry and biology of proteins through computational analysis of sequences and structures. As a national lab, it has an organizational structure and funding framework best suited for the kind of research that I do, which involves an extensive explorative component.

What makes your team unique?
My team embodies a considerable mass of special knowledge regarding protein evolution and function that we accumulated and systematized over a period of several decades. Given that we look at this using various computational methods, my team melds the expertise of people well versed in biology, computer programming, biochemistry, protein structure, and graph-theoretic analysis.

What is your advice for young scientists or people interested in pursuing a career in research?
I think the most interesting discoveries are those that bring together and illuminate disparate areas of inquiry. Hence, spend your early youth acquiring a very diverse knowledge base and technical capacity. Then organize this knowledge into an interconnected network that you can train your intuition on and draw from when confronted with new problems.

When you’re not in the lab, what do you enjoy doing?
Amateur astronomy, reading and writing about history and ancient texts in the original or translations, recreational mathematics, storytelling.

What inspires you?
Lives of past scientists, philosophers, and leaders from around the world. The profound insights found in the works of the ancients.

You’ve read his words, but now you can hear them for yourself. Follow along on the NLM YouTube page for more exciting content from the NLM staff that makes it all possible. If you’d like to learn more about our IRP, view job opportunities, and explore research highlights, I invite you to visit the newly redesigned NLM IRP webpage.

YouTube: Dr. Aravind Iyer and the Protein Universe

Video transcript

[Iyer] Early in my life, I wanted to be a paleontologist. And that’s what actually led me to molecular biology. At one level, I could say that I wish to understand the whole protein universe. Proteins can be divided into evolutionary units. There’s a part of a protein that’s preserved over evolution because natural selection is maintaining that part for some reason. And one realization, which dawned on us starting around the early nineties—and this was a very profound realization for all of biology—is that there is a relatively small number of these evolutionary units of proteins, which we term domains, which constitutes the entire protein universe of all organisms across the tree of life.

If we can understand the functions of these units, then that goes a long way towards understanding what organisms do. And given there are many gaps in our understanding of what organisms do, one way to get at it is to first, find all these domains. The second aspect of it is predicting functions for them. The first phase of my research, we captured most of the low-hanging fruit, which were the big families conserved across all organisms.

Now we are moving on to the more difficult terrain, but the difficult terrain also holds a lot of promise because many un-understood functions are hiding within that difficult terrain, and it gives these offshoots in the form of biotechnological reagents. There are things like restriction enzymes, the CRISPR systems, and DNA modification systems. All of these have become very popular reagents.

NLM is a world leader in the analysis of protein sequences, protein structures, and inferring evolution from these bits of information. And this has been a very long-standing interest of mine, so this is the place to be.

Request for Public Comment: National AI Research and Development Strategic Plan

This blog post by Lynne Parker, Director, National AI Initiative Office, and Rashida Richardson, Senior Policy Advisor for Data and Democracy, was originally posted on the White House OSTP blog.

We encourage you to read it and submit comments on the update to the National Artificial Intelligence Research and Development Strategic Plan by Friday, March 4, 2022.

Artificial Intelligence (AI) is becoming more prevalent in all of our lives. It powers all kinds of tools, from the digital assistants that answer questions on your phone, to breakthroughs in reading X-rays to better spot cancers. The so-called “intelligence” is the result of powerful computers sorting through mountains of data to find patterns, using algorithms designed and optimized by computer scientists.

Like all technology, AI is far from perfect. As we have started using AI for consequential decisions, we have realized that while AI can improve decision making, it too often compounds historical patterns of bias and deepens existing inequality. AI’s reliance on biased data or design processes has led to systems that produce discriminatory, or otherwise harmful, outcomes.

The Office of Science and Technology Policy is engaged in understanding the extraordinary promise of AI as well as its pitfalls. OSTP’s National AI Initiative Office (NAIIO) helps coordinate Federal activities in AI across government. OSTP is co-chairing the National AI Research Resource Task Force to answer Congress’s call to propose a vision for equitably expanding the research community’s access to the computing power, data, and testbed resources necessary to do AI research. OSTP has issued a call for the development of an AI Bill of Rights, and is working closely with both domestic and international partners across bilateral and multilateral venues to advance development, adoption, and oversight of AI in a manner that aligns with our democratic values.

Given the transformative potential of AI, we know it is critical that the American public have a voice in how this technology is used and governed. In late 2020, we initiated a public engagement process that included public listening sessions, a request for information on AI-enabled biometric technology, and stakeholder engagement meetings. Today, our National AI Initiative Office, in coordination with the Networking and Information Technology Research and Development Program of the National Science and Technology Council, is seeking public comments about how we should revise the National Artificial Intelligence Research and Development Strategic Plan. First published in 2016 and updated in 2019, the National AI R&D Strategic Plan identifies scientific and technological needs for AI innovation and investment priorities for Federally-funded AI research. In preparation for the Congressionally mandated 2022 Strategic Plan, this request for information seeks input on the goals, priorities, and metrics that Federal agencies should use to guide AI research and development. 

OSTP’s mission is to “maximize the benefits of science and technology to advance health, prosperity, security, environmental quality, and justice for all Americans.” Our work in AI is intended to maximize its benefits while ensuring that AI-driven systems do not cause harm or impede our pursuit of American ideals.

Through DS-I Africa, NIH is Fostering a New Health Data Science Community

Guest post by Laura K. Povlich, PhD, Program Director at the NIH Fogarty International Center (FIC), and Tiffani B. Lash, PhD, Program Director for the NIH National Institute of Biomedical Imaging and Bioengineering (NIBIB). They co-coordinate the DS-I Africa program with assistance from a trans-NIH Working Group that includes Patricia Brennan, RN, PhD, Director of NLM; Roger Glass, MPH, MD, PhD, Director of FIC; Joshua A. Gordon, MD, PhD, Director of the National Institute of Mental Health; and Bruce Tromberg, PhD, Director of NIBIB.

Advances in data science and data ecosystems that support the mission of the NIH are reshaping biomedical and behavioral research. Enhanced international data ecosystems not only have the potential to support improved healthcare and public health domestically but could also be transformative in low- and middle-income countries. As a step in realizing this potential, the NIH Common Fund established the Harnessing Data Science for Health Discovery and Innovation in Africa (DS-I Africa) program.

The purpose of this program is to leverage data science technologies and prior NIH investments to develop solutions to the continent’s most pressing medical and public health problems through a robust ecosystem of new partners from academic, government, and private sectors.

Building off the success of a virtual symposium in 2020, the NIH recently invested more than $74 million over five years to support 19 DS-I Africa awards that will conduct research and training activities across the continent. The DS-I Africa Open Data Science Platform and Coordinating Center, led out of the University of Cape Town, will catalyze and support the unique continental network of health data scientists, innovators, and researchers that work across the DS-I Africa program. This award will coordinate the 19 awards as a collaborative research consortium that benefits from shared resources and knowledge. Additionally, the Open Data Science Platform will develop into a scalable gateway that aims to lower some of the barriers to collaboration by democratizing access to data and tools.

The DS-I Africa consortium includes African-led multidisciplinary and multisectoral research hubs with projects in several important areas such as antimicrobial resistance, SARS-CoV-2, climate change, mental health, multi-disease morbidity, and more. Research training programs will build the next generation of African health data scientists and innovators. Lastly, research projects on the ethical, legal, and social implications of health data science from an African perspective are a key component of DS-I Africa and will further the policy discussions of these issues on the continent. The consortium will expand throughout the life of the program with the goal of bringing in new partners through pilot projects and possibly other funding mechanisms.

We are excited to see the DS-I Africa consortium grow and to stay apprised of opportunities to connect with other data science communities around the world. Many funding organizations see the potential for data science to transform medicine and public health in Africa, and we hope additional investments will have a synergistic effect in strengthening the health data science ecosystem in Africa. For more information about the DS-I Africa research studies, visit the Harnessing Data Science for Health Discovery and Innovation in Africa Funded Research page.

In addition to her work at the NIH FIC with DS-I Africa, Dr. Povlich also co-coordinates Human Heredity and Health in Africa (H3Africa), which is another NIH Common Fund program. She earned both a BSE in Materials Science and Engineering and a PhD in Macromolecular Science and Engineering from the University of Michigan. Dr. Povlich was previously a Science & Technology Policy Fellow for the American Association for the Advancement of Science (AAAS).

Dr. Lash is the Program Director for the NIH Rapid Acceleration of Diagnostics Tech and Advanced Technology Platforms initiative, NIH Technology Accelerator Challenge, and the NIBIB Point of Care Technologies Research Network. Her research portfolio includes Point of Care Technologies and Digital Health, both with the goal of developing biomedical technologies through collaborative efforts that merge scientific and technological capabilities with clinical need. Dr. Lash has been selected as a science policy fellow for both the AAAS and the National Academy of Engineering. Dr. Lash earned her PhD in Physical Chemistry from North Carolina State University.
