RADx-UP Program Addresses Data Gaps in Underrepresented Communities

Guest post by Richard J. Hodes, MD, Director, National Institute on Aging, and Eliseo Pérez-Stable, MD, Director, National Institute on Minority Health and Health Disparities, NIH.

A few months into the COVID-19 pandemic, we shared how NIH was working to speed innovation in the development, commercialization, and implementation of technologies for COVID-19 through NIH’s Rapid Acceleration of Diagnostics (RADx) initiative.

Two years later, one of the RADx programs—RADx Underserved Populations (RADx-UP)—reflects on lessons learned that have broken the mold of standard research paradigms to address health disparities.

Use of Common Data Elements

RADx-UP has presented unique challenges in terms of data collection, privacy concerns, measurement standardization, principles of data-sharing, and the opportunity to reexamine community-engaged research. Common Data Elements (CDEs) are standardized, precisely defined questions paired with a set of allowable responses, used systematically across different sites, studies, or clinical trials to ensure that the whole is greater than the sum of its parts. They are not commonly used in community-engaged research. Use of CDEs enables data harmonization, aggregation, and analysis of related data across study sites as well as the ability to investigate relationships among data in unrelated data sets. CDEs can also lend statistical power to analyses of data for small subpopulations typically underrepresented in research.

RADx-UP is a community-engaged research program that builds on years of developing partnerships between communities and scientists. RADx-UP has funded 127 research projects with sites in every state and six U.S. territories as well as a RADx-UP Coordination and Data Collection Center (CDCC). RADx-UP assesses the needs and barriers related to COVID-19 testing and increases access to COVID-19 testing in underserved and vulnerable populations experiencing the highest rates of disparities in morbidity and mortality.

The COVID-19 pandemic necessitated establishing RADx-UP and its associated CDEs with unprecedented speed, relying heavily on data elements derived from those already defined in the NIH-based PhenX Toolkit and Disaster Research Response (DR2) resources. The short time frame for this process did not allow for collaboration with and input from RADx-UP investigators and community partners as extensive as would have been ideal. Additionally, many researchers, especially community partners engaged in RADx-UP projects, were not familiar with CDE data collection practices. As a result, CDE questionnaires had to be modified as studies progressed to better suit the needs of the consortium, and investigators new to CDE collection had to be familiarized with these processes quickly. NIH program officers, NIH RADx-UP and CDCC leadership, and engagement impact teams (EITs), which are staff liaisons provided by the CDCC that link RADx-UP research teams to testing, data, and community-engagement resources, helped research teams implement and adjust CDE collection, ensured alignment across consortium research teams, and assisted with other data-related issues that arose.

All RADx programs are required to collect a standardized set of CDEs, including sociodemographic, medical history, and health status elements with the intent to provide researchers rapid access to data for secondary research analyses in the RADx Data Hub, the central repository for RADx data. However, implementation of CDEs in the context of underserved communities in the rapidly evolving COVID-19 pandemic presented complex issues for consideration.

Some of these issues included data privacy, the risk of re-identification of underserved and undocumented populations, and data collection burden on participants as well as researchers. The privacy of health data is protected under federal law. The RADx-UP program instituted measures to ensure program participants’ data remain protected and de-identified using a token-based hashing algorithm methodology that allows researchers to share individual-level participant data without exposing personally identifiable information. To address data collection and respondent burden concerns, projects modified questions to allow some flexibility in expanding response options more appropriate to some underserved communities. The CDCC also developed COLECTIV, a digital interface for projects to directly enter data into the data repository and included gateway questions to relieve respondent burden.
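To make the idea of token-based hashing concrete, here is a minimal sketch in Python. It is not the RADx-UP algorithm, whose details are not described in this post. It assumes a keyed hash (HMAC-SHA256), a hypothetical secret key held by a trusted party, and a hypothetical choice of identifying fields. The function names `normalize` and `make_token` are invented for illustration.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Lowercase the value and keep only letters and digits, so that
    differences in spacing, case, or punctuation do not change the token."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed.casefold() if ch.isalnum())


def make_token(secret_key: bytes, first: str, last: str, dob_iso: str) -> str:
    """Derive a pseudonymous token from identifying fields.

    Assumption: the fields and the key handling are illustrative only.
    The same person yields the same token under the same key, but the
    token cannot be recomputed by someone who does not hold the key.
    """
    message = "|".join([normalize(first), normalize(last), dob_iso])
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()
```

In a design like this, records about the same participant can be linked across data sets by comparing tokens, while names and birth dates never leave the collecting site.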

Respect for Tribal Data Sovereignty

RADx-UP leadership and investigators recognized that additional considerations for tribal sovereignty, practices, and policies needed to be addressed for projects that include American Indian and Alaska Native (AI/AN) participants. Through consultations with the NIH Tribal Advisory Committee and the broader AI/AN community and meetings with an informal RADx-UP AI/AN project working group established by the CDCC, NIH realized that deposition of tribal data into the RADx Data Hub would not meet the cultural, governance, or sovereignty needs of AI/AN RADx research data. In response, NIH hopes to establish a RADx Tribal Data Repository (TDR) responsible for the collection, protection, and sharing of data collected in AI/AN communities with respect for the practices and policies of Tribal data sovereignty. Applications for the repository have been solicited and NIH hopes to make an award for the TDR sometime in FY23.

Rapid Data Sharing

One of the largest hurdles the RADx-UP program has faced is implementing rapid sharing of research data for secondary analyses and to inform decision-making and public health practices related to the COVID-19 pandemic. RADx-UP research teams are expected to share their data on a timely cadence before data collection ends. This is a far more stringent practice than the current standard NIH data-sharing policy, which requires data to be shared at the time of acceptance for publication of the main findings from the final data set. NIH and CDCC staff have worked together with the RADx research community to highlight the importance of rapid data-sharing and to support compliance with it. Within the first six months, a total of 69 Phase 1 projects began transmitting CDE data to the RADx-UP CDCC.

The COVID-19 pandemic posed a tremendous challenge, and NIH responded by collaborating with vulnerable and underserved communities. This collaboration has opened an unprecedented opportunity to build on a now established foundation for future research to address gaps in understanding the broader social, cultural, and structural factors that influence disparities in morbidity and mortality from COVID-19 and other diseases. Data collection and sharing efforts of the RADx-UP initiative comprise a significant contribution. Collaboration among the NIH, research investigators, and communities impacted by COVID-19 has been the catalyst. To learn more about RADx-UP, please see a recent journal article available on PubMed.


Dr. Hodes has served as NIA director since 1993, overseeing studies of the biological, clinical, behavioral, and social aspects of aging. He has devoted his tenure to the development of a strong, diverse, and balanced research program focused on the genetics and biology of aging, basic and clinical studies aimed at reducing disease and disability, and investigation of the behavioral and social aspects of aging. Ultimately, these efforts have one goal — improving the health and quality of life for older people and their families. As a leading researcher in the field of immunology, Dr. Hodes has published more than 250 peer-reviewed papers.

Dr. Pérez-Stable practiced primary care internal medicine for 37 years at the University of California, San Francisco before becoming the Director of NIMHD in 2015. His research interests have centered on improving the health of individuals from racial and ethnic minority communities through effective prevention interventions, understanding underlying causes of health disparities, and advancing patient-centered care for underserved populations. Recognized as a leader in Latino health care and disparities research, he spent 32 years leading research on smoking cessation and tobacco control in Latino populations in the United States and Latin America. Dr. Pérez-Stable has published more than 300 peer-reviewed papers.

Meet the Next Generation of Leaders Advancing Data Science and Informatics at NLM

Guest post by Virginia Meyer, PhD, Training Director for the Intramural Research Program, National Library of Medicine, National Institutes of Health.

Working at NLM means being at the forefront of innovation in the rapidly evolving fields of data science and informatics. Within that environment, the NLM Intramural Research Program (IRP) is dedicated to supporting individuals looking to develop and apply computational approaches to a broad range of problems in biomedicine, molecular biology, and health.

NLM understands that contributions from people of diverse backgrounds, cultures, and histories enable research that has the greatest impact and reaches the widest possible audience. Such a workforce is necessary to drive innovation and scientific advancement and is imperative to ensuring that computational tools and data sets are free from bias. To that end, the Diversity in Data Science and Informatics (DDSI) Summer Internship, a program of the NLM IRP now in its inaugural year, was developed to support and engage young scientists who are dedicated to careers in computational biology and biomedical informatics. It is our hope that time spent in the DDSI program and with Principal Investigators (PIs) will encourage trainees to continue along the path toward becoming leaders in their chosen fields.

Meet four of this year’s DDSI interns and learn about the work they are doing in the NLM IRP!

Will Hibbard
Graduate Student in Biomedical Informatics
University at Buffalo

PI: Olivier Bodenreider, MD, PhD, Computational Health Research Branch, Lister Hill National Center for Biomedical Communications at NLM
Research Area: Natural Language Processing

What interested you most about the DDSI program?
I found out about the program when a teacher recommended it to me out of the blue, and after looking into it, I found a lot of fun research projects I could join. The program offered an opportunity to join research projects in familiar and unfamiliar fields. Ultimately, it was pleasantly outside of my comfort zone and presented the kind of challenge that makes me love research.

What research project are you working on and why?
I ended up working with Dr. Olivier Bodenreider using neural networks to better develop natural language processing in medical databases. I applied to this project because it involved two areas in which I had less experience: ontology and data structures. I pursued this research area because it allowed me the chance to improve in fields that I did not understand well at the time.

Why might someone want to apply to the DDSI program in the future?
This is the kind of experience with challenges that allow you to grow as a person and as a professional. Whether you know the area of research well or have trouble understanding it, this program will give you an opportunity to learn through a practical research project.

What is next for you after you complete your internship?
I will be taking a gap year while I apply to medical school. I am hoping to work in my local oncology institute and medical corridor.

MG Hirsch
PhD Student in Computer Science
University of Maryland, College Park

PI: Teresa Przytycka, PhD, Computational Biology Branch, National Center for Biotechnology Information at NLM
Research Area: Evolutionary Genomics

What interested you most about the DDSI program?
Evolution of gene expression and modeling different modes of evolution is something that I had yet to explore in my PhD research. I thought a summer program would be perfect to learn about it. It also gives me the opportunity to get a feel for working at the NIH and to see whether I would want to consider the NIH Graduate Partnerships Program.

What research project are you working on and why?
I am evaluating the possibility of different modes of gene expression evolution within a tumor. Previous work in the lab considers different models of gene expression evolution between animal species. Many models of evolution assume neutral evolution, that mutations occur and persist randomly; however, we know that mutations that change phenotypes undergo various selective pressures from the environment. Considering this, previous work, resulting in the software EvoGeneX, has fit computational models using Ornstein-Uhlenbeck processes to evaluate potential divergence of gene expression within fly species. My research project is applying this same concept to cancer tumors. After tumorigenesis, cancer cells rapidly accumulate further mutations and diversify into subclones within the same tumor. Owing to the different sets of mutations, these subclones evolve differently. We can hypothesize then that the evolution of the gene expression of subclones can be modeled using the same computational models.
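For readers unfamiliar with the Ornstein-Uhlenbeck (OU) process mentioned above, the following Python sketch simulates one. It is not EvoGeneX and does not fit a model to data. It only illustrates the general idea that a trait, such as an expression level, drifts randomly while being pulled toward an optimum. The function name and the parameter names (`theta` for the optimum, `alpha` for the strength of the pull, `sigma` for the noise scale) are assumptions chosen for this example.

```python
import math
import random


def simulate_ou(x0, theta, alpha, sigma, dt, steps, seed=0):
    """Simulate an OU path with the Euler-Maruyama scheme.

    Each step applies: x += alpha * (theta - x) * dt + sigma * sqrt(dt) * z,
    where z is a standard normal draw. With alpha = 0 the pull toward the
    optimum vanishes and the path is a plain random walk.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    path = [x0]
    for _ in range(steps):
        x = path[-1]
        drift = alpha * (theta - x) * dt
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x + drift + noise)
    return path
```

Comparing how well a model with a pull toward an optimum explains observed values against one without it is one way to ask whether the data look more like constrained or neutral evolution.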

Why might someone want to apply to the DDSI program in the future?
The DDSI program offers extra speaker talks and networking opportunities.

What is next for you after you complete your internship?
I will be finishing my PhD in computer science at UMD.

Sirisha Koirala
Undergraduate Student in Computer Science
University of Maryland, College Park

PI: Zhiyong Lu, PhD, Computational Biology Branch, National Center for Biotechnology Information at NLM
Research Area: Natural Language Processing and Computational Biology

What interested you most about the DDSI program?
I was most interested in the unique ongoing research projects that students had the opportunity to participate in, which I would not have been able to find at other programs. It was very interesting to learn about the ways that artificial intelligence (AI) could be applied to medical practices, and this stood out to me as medicine and AI are two of my main interests.

What research project are you working on and why?
I am working on AI in the prediction of progression in age-related macular degeneration. In my first year of college, I was on the pre-medicine track; however, while gaining greater exposure, I realized that I have a stronger passion for computer science. Within the field of computer science, I have a particular interest in AI, and this project specifically allowed me to combine both of my interests and backgrounds.

Why might someone want to apply to the DDSI program in the future?
The DDSI program provides students who come from underrepresented backgrounds a chance to gain real hands-on experience. As a student who came from a small, all-women’s university where I did not have the availability to engage in such opportunities, this program has helped me significantly. I have been able to get the real-world experience I need to help me excel further in my career preparations, and students who are in similar positions should consider applying for this reason.

What is next for you after you complete your internship?
After I complete my internship, I will be starting my second year of college at the University of Maryland, College Park, where I am pursuing a major in computer science.

Tochi Oguguo
Undergraduate Student in Computer Science and Information Systems
University of Maryland, Baltimore County

PI: Sameer Antani, PhD, Computational Health Research Branch, Lister Hill National Center for Biomedical Communications at NLM
Research Area: Bias in Machine Learning

What interested you most about the DDSI program?
What interests me the most about this program is the amount of experience you gain during the summer. You leave understanding concepts at a higher level and applying lessons to your life outside of research.

What research project are you working on and why?
My research project is about bias in machine learning. By using fair active learning, we teach the machine how to give accurate responses when diagnosing or classifying a dataset or image. Bias is one of the biggest issues in machine learning, especially in health care where inaccurate judgment can be dangerous.

Why might someone want to apply to the DDSI program in the future?
DDSI is a great program to help students and interns learn more about the career paths open to them and to become more resilient people and scientists outside of research.

What is next for you after you complete your internship?
I plan to apply again next summer and keep working in research and machine learning! Also, I will take more classes in information science to help me become a better programmer.

The Importance of Listening—and Listening More—in Strategic Planning

I have spent much of the last year deeply involved in the creation of the very first NIH Digital Strategy Plan—an effort that requires listening across myriad NIH institutions and our stakeholders to glean what each finds important to our mission of seeking out and applying fundamental health care knowledge. I am proud to serve as co-chair of this effort with Andrea T. Norris, MBA, NIH’s Chief Information Officer and Director of NIH’s Center for Information Technology. Since October 2021, more than 40 staff members from across NIH have been working to develop this plan.

The purpose of the NIH Digital Strategy Plan is to identify 10 to 12 key capabilities needed to support NIH’s intramural research and extramural research program management, administration, and operations. Intramural research is NIH’s internal research program, and extramural research is research and training opportunities funded by NIH at universities, medical centers, and other institutions around the world. This helps create a vision for the future in which agency-wide commitments and institute-specific investments unite to form an information infrastructure vigorous enough to support the largest organization in the world that conducts and funds research.

In terms of information infrastructure, NIH is like other large, complex organizations where individual institute, center, office, and division investments complement enterprise-wide resources. We are creating a roadmap to make sure the organization has a robust, efficient, and secure information platform on which to conduct business. Developing the NIH Digital Strategy Plan is a lot like the parable of the blind men and the elephant, where different people can have different perceptions of the same thing. So how do we build a holistic strategic vision for the information technology capabilities needed for the future of NIH?

Actually, in much the same way as the Rajah in the parable gently nudges the six blind men towards truth: by instructing them to share their own vision on the journey of putting all the parts together!

Well, that’s what our committee has been doing. We’ve held over 25 listening and benchmarking sessions with stakeholders across NIH, as well as government and non-governmental enterprises that share a similar mission. We listened, and we asked questions, and then we listened some more.

Listening takes time, attention, and presence. As co-chairs of this effort, Ms. Norris and I have attended almost every session to convey our commitment, our interest, and our realization that the future of NIH depends strongly on the information technologies needed to support it. We’ve gleaned novel ideas and heard echoes of important themes. We’ve heard where there are areas of consensus and some frank differences. We’ve listened, learned, and discussed what we’ve heard. We’ve also listened for things that may not have been said.

In an enterprise this large, no single team can or should do all the listening. Our committee members also attended many of the sessions, and we used our conversations with them to verify or refine what we heard. Committee members also reached out to stakeholders around the country to hear how they viewed their future and what best practices they followed to make IT investments.

Listening works best when guided by questions, verifying what is being heard, and elaborating when needed. Listening is NOT list-making, nor is it a process of determining which themes are raised more often or which ideas gain resonance. Listening is engagement, deepening one’s understanding of a complex environment by reaching deeply within and without.

In a few months, I’ll report on the results of our efforts to build the inaugural strategic plan for digital strategy at NIH. For now, I’ll continue to listen.

Meet the NLM Investigators: Dr. Demner-Fushman Knows the Answers to Your Questions!

Meet my close colleague, Dr. Dina Demner-Fushman! This brilliant researcher is the face behind what many of you have already accessed on NLM’s websites. Many of you will agree with me when I say that having one PhD is extremely impressive–but would you believe she has TWO?! In addition to her master’s degree, Dr. Demner-Fushman has PhDs in immunology and computer science.

Dr. Demner-Fushman and her team use advanced artificial intelligence (AI), natural language processing, and data mining techniques to answer consumers’ questions about a variety of health topics. Did you know that it was Dr. Demner-Fushman’s research that led to the developmental stages of the indexing initiative that produced the current iteration of the MEDLINE resource? This work helps all of us navigate a plethora of NLM resources.

Check out the infographic below to learn more about the innovative, important research happening in Dr. Demner-Fushman’s lab.

Infographic titled: Biomedical Question Answering. The title area features a picture of Dr. Demner-Fushman along with her title and accreditations (MD, PhD): Investigator, Computational Health. The first column of the graphic explores her short and long-term goals for her projects. The center column describes the processes she uses to achieve these goals, and the last column depicts a simple graphic illustrating a Q and A service.

What makes your team unique? Tell us more about the people working in your lab.   

It is a diverse, multicultural team. Some were even born after I got my first IT job checking computers at Hunter College for Y2K compliance. The team is united by the task of enabling computers to understand health-related information needs and the socioeconomic and professional status of people who come to NLM seeking information. It is a group of exceptionally dedicated and talented people. Our diverse backgrounds make us see all possible aspects of addressing the informational and emotional needs of our users. 

What is your advice for young scientists or people interested in pursuing a career in research?  

  • Be proactive: Seek information and take advantage of training opportunities.  
  • Be brave: Admit you don’t know or don’t understand something. Most people will try to help.  
  • Be bold: Reach out to people who you would like to work with or to discuss your ideas.  
  • Be honest.  
  • Be patient: Research implies working hard, sometimes without immediate results. Even if research is your passion and fun, sometimes you have to do things that you might not enjoy or you might fear but still have to do, like giving talks or writing papers.

What do you enjoy about working at NLM?  

The community of dedicated people across all divisions, the mission, and the intellectual freedom.  

Where are you planning to travel to this year?  

I was just in Dublin, Ireland, in May for the 60th meeting of the Association for Computational Linguistics and co-chaired the BioNLP workshop for the 15th time. I loved Dublin when I visited shortly before the pandemic. I enjoyed revisiting a place I loved and discovering new things to love.

What are you reading right now?  

In the Garden of Beasts by Erik Larson. It provides an amazing view of pre-World War II Germany and political relations. I hope some lessons have been learned! 

You’ve read her words, now hear them for yourself. Follow our NLM YouTube page for more exciting content from the NLM staff that make it all possible. If you’d like to learn more about our Intramural Research Program (IRP), view job opportunities, and explore research highlights, I invite you to explore our recently redesigned NLM IRP webpage.

Transcript [Demner-Fushman]*: When people need information, what they really like is to ask a question and get a really good comprehensive answer, and to also know that the answer is true and correct.

When I started my independent clinician career, I had lots of questions, but I was sometimes not even sure if I was getting the right answer. “Question answering” is this system to understand the question, what the question is about, and why it is asked. When the answer is found, it’s usually not a single answer: It’s parts of the answer in different places. It’s multiple answers. So, all of that then needs to be condensed into one comprehensive answer with evidence of where the answer came from. So that’s the focus of my research.

On the surface, very similar questions asked by clinicians and by the public should be answered very differently. Different deep-learning systems are needed to find the answers to the same question asked by two different people.

The long-term goal is one entry point to all the NLM resources. It doesn’t matter who the person is and how they ask their question or look for information. We should be able to recognize what the person needs and provide it. There is no one—other than NLM—who is specifically dedicated to biomedical information retrieval and biomedical question answering. Although it seems industry is doing that kind of research as well, it is not their main focus, whereas we keep people focused on what really matters for health and advancing medicine.

*Transcript edited for clarity

Is Age Really Just a Number?

Last week I turned 69! Can you believe that??? This is so amazing to me—how could I be THAT OLD?? Two years ago (when I was just 67!), I shared that…

In midlife, I think I’m where I’m supposed to be, because I feel like I’m 39, think I look like I’m 49, believe I have a career worthy of someone who’s 59, and am approaching the wisdom of someone who’s 69.

So now that I am 69, I still believe all those things are true—particularly the wisdom part. I am wiser about the speed of change, the value of tempering my vision with a dose of realism, and the importance of understanding people clearly. I still feel youthful, look pretty good for a woman my age, and remain proud of my career.

But suppose I want to pick the number that really represents my age. Age is a very important descriptor of patients and research participants. Across all types of clinical research, one of the most common variables collected is a participant’s age. Age is an important indicator of many things human, from physical capabilities that determine a person’s likely response to a treatment, to potential behavioral or mental health challenges. Knowing participants’ ages helps guide the interpretation of research results, allowing scientists and clinicians to determine the relevance of those results to specific groups of people or to better understand the clinical manifestation of a disease. And knowing the age of a participant provides evidence that our NIH studies appropriately engage people across their lifespan.

You might be surprised to know that there are many ways to represent age. For most of us, age is estimated by counting the number of years since our birth. However, for babies, it may be more important to know the number of days, weeks, or months since birth. Some studies compute age as the difference between the date of birth and the date that the data are collected. In fact, in the PhenX Toolkit, a web-based catalogue of expert-provided recommended measurement protocols, there are almost 200 different ways to measure age in a research study. Sometimes information about age is acquired through self-report of the participant, and other times the information is obtained from some existing document like a patient’s clinical record. The PhenX Toolkit is an enumeration of a wide range of measurement approaches and allows for broad coverage in a way that lets a researcher pick the measure that best represents the phenomena of interest to their study.
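As a small illustration of how the same two dates can yield different "ages," here is a Python sketch that computes age in completed years, completed months, and days from a date of birth and a data collection date. The function names and the conventions in the comments are assumptions for this example. They are not taken from any PhenX Toolkit protocol.

```python
from datetime import date


def age_in_days(birth: date, collected: date) -> int:
    """Number of days between birth and data collection."""
    return (collected - birth).days


def age_in_months(birth: date, collected: date) -> int:
    """Completed calendar months. A month counts only once the
    day-of-month of birth has been reached (a convention chosen here)."""
    months = (collected.year - birth.year) * 12 + (collected.month - birth.month)
    if collected.day < birth.day:
        months -= 1
    return months


def age_in_years(birth: date, collected: date) -> int:
    """Completed years. Under this convention, someone born on
    February 29 turns a year older on March 1 in non-leap years."""
    if collected < birth:
        raise ValueError("collection date precedes date of birth")
    had_birthday = (collected.month, collected.day) >= (birth.month, birth.day)
    return collected.year - birth.year - (0 if had_birthday else 1)
```

For example, a participant born on 2000-07-15 and surveyed on 2022-07-14 is 21 in completed years, even though the difference in calendar years is 22.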

Over the past decade, NLM has supported the creation, identification, and distribution of Common Data Elements (CDEs). CDEs are specialized ways to measure concepts common to two or more research projects in a manner that is consistent across studies. Using a similar approach to measure similar concepts sounds like a no-brainer, right? It improves the rigor and reproducibility of research and allows data collected in different studies to be grouped together, adding power to the interpretation of research efforts. The COVID-19 pandemic illustrated the value of the common approaches to measuring research concepts by allowing us to track this deadly virus and its manifestations across time and people.

NLM established the NIH CDE Repository to serve as a one-stop location for research programs and for NIH Institutes and Centers to house CDEs and make them available to other researchers. Each record includes the definition of the variable as an indicator of the concept, a way to measure the variable (usually a question-and-answer pair with acceptable responses), and machine-readable codes where possible. Recently, the NIH CDE Repository began supporting an NIH governance process that indicates which of the proposed CDEs that have been received are described with sufficient rigor to be designated as NIH-endorsed. This endorsement helps potential users who are seeking good ways to measure complex concepts. NIH-endorsed CDEs support FAIR (findable, accessible, interoperable, and reusable) data sharing. Adherence to FAIR principles provides high-quality, “computation-ready” data with standardized vocabularies and readable metadata retrievable by identifiers that modernize the NIH data ecosystem. When data are collected consistently across studies using CDEs, it’s possible to integrate data from multiple studies, which can make it easier to get meaningful results. CDEs can also make it easier to reuse data for future research by improving the data quality.
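To show why a shared question-and-answer definition makes data from different studies easy to combine, here is a minimal Python sketch. The record layout and the `age_group` element are hypothetical and invented for illustration. They do not reproduce the NIH CDE Repository's actual schema or any NIH-endorsed CDE.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CommonDataElement:
    """One CDE record: a definition, a question, and its allowed answers."""
    name: str
    definition: str
    question: str
    permissible_values: dict  # machine-readable code -> human-readable label

    def validate(self, code):
        """Return the code unchanged, or raise if it is not an allowed response."""
        if code not in self.permissible_values:
            raise ValueError(f"{code!r} is not a permissible value for {self.name}")
        return code


# Hypothetical element for illustration only; not an NIH-endorsed CDE.
AGE_GROUP = CommonDataElement(
    name="age_group",
    definition="Participant age in completed years, grouped into bands.",
    question="What is your age group?",
    permissible_values={"1": "18-39", "2": "40-64", "3": "65 or older"},
)


def pool_responses(cde, *studies):
    """Validate and count coded responses from several studies together."""
    counts = Counter()
    for responses in studies:
        for code in responses:
            counts[cde.validate(code)] += 1
    return dict(counts)
```

Because both studies in a call such as `pool_responses(AGE_GROUP, study_a, study_b)` use the same codes, their responses can be counted together directly, and any response outside the agreed set is rejected rather than silently pooled.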

So if I wanted to be “counted” according to the years-alive mode of assessing age, I guess I am 69. But if you really want to know something else, like how happy I am in my career or how I’m feeling, don’t be surprised if I give a different number!
