Revealing and Preserving Data for Today and Tomorrow

Guest post by Jeffrey S. Reznick, PhD, Chief of the History of Medicine Division (HMD) at the National Library of Medicine (NLM); Kenneth M. Koyle, MA, Deputy Chief of HMD; and Christie Moffatt, MLIS, Program Manager of the HMD Digital Manuscripts Program.

On this International Day for Universal Access to Information, we proudly showcase the globally appreciated role of NLM as a long-standing steward of vast collections of data even as it is now a recognized home of data science at the National Institutes of Health and beyond. A key part of the NLM mission is to provide access to that data and all the biomedical information we hold in our collections, which span ten centuries and originate from nearly every part of our world.

During the past several years, talented staff of the library have recognized this enduring and dedicated stewardship as part of our institution’s data-driven present and future by curating Revealing Data, an ongoing series of posts on the division’s popular blog Circulating Now. This series explores what data-minded researchers from a variety of disciplines are learning from centuries of data preserved in the collections of the NLM and associated with a variety of topics: from 17th-century bills of mortality to tuberculosis in the 19th century to the 1918 influenza pandemic and more recent 20th- and 21st-century public health issues. Circulating Now also explores data-driven conservation research on some of our most treasured collections, research methods and tools for analysis in the study of digitized images and texts, and the origins, purpose, and development of highly regarded NLM resources like GenBank and the Index-Catalogue of the Library of the Surgeon General’s Office.

A fundamental role of the NLM binds these data-driven explorations: its Congressionally mandated mission to collect, preserve, and provide access to past and present medical and scientific information in its multiplicity of formats, and, by extension, the vast amounts of data which reside in them. Generations of dedicated civil servants, including archivists, data scientists, historians, librarians, and many others, have contributed their expertise to preserving the data-rich collections studied by a diverse field of researchers today. Without this commitment and these efforts, so much of this research would not be possible.

The NLM’s work of preservation continues today not only because it is mandated but also because the institution owes such work to future generations so they will be able to undertake their research, reveal new stories about the human condition, and make new discoveries. Today’s preservation work is evolving in tandem with changes to the collections themselves. NLM staff are developing new processes to collect and preserve web content, born-digital records, and digital ephemera while continuing to preserve vast quantities of data stored in paper, parchment, and vellum, some of it centuries old.

Viewed nearly 18,000 times since it was launched in 2017, Revealing Data reveals much more than valued data. It connects us to the very essence of NLM’s mission, its history, and the enduring importance of our institution’s initiative to preserve this data and the contexts in which it was originally created for today and tomorrow.

Dr. Reznick leads all aspects of HMD and has over two decades of leadership experience in federal, nonprofit, and academic spaces. As a cultural historian, he also maintains a diverse, interdisciplinary, and highly collaborative historical research portfolio supported by the library and based on its diverse collections and associated programs. Dr. Reznick is author of three books and numerous book chapters and journal articles including as co-author with Ken Koyle of History matters: in the past, present & future of the NLM, published in 2021 by the Journal of the Medical Library Association.

Before joining NLM, Mr. Koyle served as a medical evacuation helicopter pilot and as a historian in the U.S. Army. He is the co-editor with Dr. Reznick of Images of America: U.S. National Library of Medicine, which is a collaborative work with HMD staff.

Ms. Moffatt leads content development for NLM’s Profiles in Science website, which provides access to 20th-century manuscripts in science, medicine, and public health. As Chair of the Library’s Web Collecting and Archiving Working Group, she supports web archiving on topics and events related to NLM collecting interests, including Global Health Events (Ebola, COVID-19, Monkeypox), HIV/AIDS, and the opioid epidemic, among others.

Traveling a Bridge2AI in a Quest for High-Quality, FAIR Data Sets

This blog was authored by NIH staff who serve on the Bridge to Artificial Intelligence (Bridge2AI) Working Group.

In April 2021, we introduced NIH Common Fund’s Bridge to Artificial Intelligence (Bridge2AI) program to tap the potential of artificial intelligence (AI) for revolutionizing biomedical discovery, increasing our understanding of human health, and improving the practice of medicine. In the past year, Bridge2AI researchers have been creating guidance and standards for the development of ethically sourced, state-of-the-art, AI-ready data sets to help solve some of the most pressing challenges in human health such as uncovering how genetic, behavioral, and environmental factors influence health and wellness. The program will also support the training required to enable the broader biomedical and behavioral research community to leverage AI technologies.

The NIH initiative will support diverse teams and tools to ensure that data sets adhere to FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. Beyond ensuring compliance with FAIR principles, Bridge2AI will develop and disseminate best practices that promote a culture of diversity and continuous ethical inquiry into how data are collected.

The Bridge2AI program will support innovative data-generation projects nationwide to collect complex AI-ready data in four biomedical areas:

Clinical Care Informatics—Intensive care units treat patients with urgent medical conditions such as sepsis and cardiac arrest. This data generation project will collect, integrate, annotate, and share high-resolution physiological data from adult and pediatric critical care patients from 14 health systems that can then be used by AI technologies to identify approaches to improve recovery from acute illness.

Functional Genomics—Within each cell in the human body lies a wealth of information about health, disease, and the impact of environmental factors. This project will generate richly detailed proteomic, genomic, and cellular imaging data to help predict disease mechanisms and associated gene pathways and networks for a variety of health outcomes.

Precision Public Health—The human voice is as unique as a fingerprint and has been found to contain acoustic signatures of human health and disease. This project will collect large-scale multimodal data sets containing voice, genomic, and clinical data, which AI technologies can use to help improve screening for and the diagnosis and treatment of a variety of developmental, neurological, and mental health conditions.

Return to Health—Much can be learned by uncovering how individuals move from a less healthy to a healthier state, a process called salutogenesis. This project will collect data from a diverse population with varying stages of type 2 diabetes to help improve our understanding of chronic disease progression and recovery. To learn more about Bridge2AI and salutogenesis, please view Bridging Our Way to Health Restoration by Helene M. Langevin, MD, director of the National Center for Complementary and Integrative Health.

To support these data generation projects, the Bridge2AI program includes a BRIDGE Center with a range of expertise to support interdisciplinary team science. The center will facilitate development of cross-cutting products such as standards harmonization, ethical AI best practices, and workforce development opportunities for the research community.

One of the goals of Bridge2AI is to foster a culture that will identify, assess, and address ethical issues as an integral part of creating AI-ready data sets. Ethical considerations include informed consent; data privacy; bias in data and its impact on the fairness and trustworthiness of AI applications; equity and justice; and inclusion and transparency in design.

Every component of the Bridge2AI program includes a plan for incorporating diverse perspectives at every step. The BRIDGE Center will serve as a hub for supporting ethical and trustworthy AI development across Bridge2AI with the goal of providing tools, best practices, and resources to address cross-cutting biomedical challenges.

Learn more about Bridge2AI in the press release and video. Find the latest news by visiting the Bridge2AI website and following @NIH_CommonFund on Twitter.

Top Row (left to right):
Patricia Flatley Brennan, RN, PhD, Director, National Library of Medicine
Michael F. Chiang, MD, Director, National Eye Institute
Eric D. Green, MD, PhD, Director, National Human Genome Research Institute

Bottom Row (left to right):
Helene M. Langevin, MD, Director, National Center for Complementary and Integrative Health
Bruce J. Tromberg, PhD, Director, National Institute of Biomedical Imaging and Bioengineering

RADx-UP Program Addresses Data Gaps in Underrepresented Communities

Guest post by Richard J. Hodes, MD, Director, National Institute on Aging, and Eliseo Pérez-Stable, MD, Director, National Institute on Minority Health and Health Disparities, NIH.

A few months into the COVID-19 pandemic, we shared how NIH was working to speed innovation in the development, commercialization, and implementation of technologies for COVID-19 through NIH’s Rapid Acceleration of Diagnostics (RADx) initiative.

Two years later, one of the RADx programs—RADx Underserved Populations (RADx-UP)—reflects on lessons learned that have broken the mold of standard research paradigms to address health disparities.

Use of Common Data Elements

RADx-UP has presented unique challenges in terms of data collection, privacy concerns, measurement standardization, principles of data-sharing, and the opportunity to reexamine community-engaged research. Common Data Elements (CDEs) are standardized, precisely defined questions paired with a set of allowable responses, used systematically across different sites, studies, or clinical trials to ensure that the whole is greater than the sum of its parts. They are not commonly used in community-engaged research. Use of CDEs enables data harmonization, aggregation, and analysis of related data across study sites as well as the ability to investigate relationships among data in unrelated data sets. CDEs can also lend statistical power to analyses of data for small subpopulations typically underrepresented in research.
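
To make the idea concrete, a CDE can be sketched as a small data structure: a precisely worded question paired with a fixed set of allowable responses, which is what lets records from different sites be validated and pooled. The sketch below is illustrative only. The element name, question wording, response codes, and site records are all hypothetical and are not actual RADx-UP CDEs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """A standardized question paired with its allowable responses."""
    name: str
    question: str
    allowed: frozenset


# Hypothetical CDE for illustration only; not an actual RADx-UP element.
TESTED_BEFORE = CommonDataElement(
    name="covid_test_ever",
    question="Have you ever been tested for COVID-19?",
    allowed=frozenset({"yes", "no", "prefer_not_to_answer"}),
)


def validate(cde, record):
    """Return True if the record answers the CDE with an allowable response."""
    return record.get(cde.name) in cde.allowed


def pool(cde, *sites):
    """Count valid responses across all sites, skipping nonconforming records."""
    counts = Counter()
    for site in sites:
        for record in site:
            if validate(cde, record):
                counts[record[cde.name]] += 1
    return counts


# Two hypothetical study sites that collected the same CDE.
site_a = [{"covid_test_ever": "yes"}, {"covid_test_ever": "no"}]
site_b = [{"covid_test_ever": "yes"}, {"covid_test_ever": "maybe"}]  # "maybe" is not an allowed response

pooled = pool(TESTED_BEFORE, site_a, site_b)
print(dict(pooled))
```

Because both sites ask the same question with the same response codes, their counts can be added directly, and a nonconforming answer is caught at validation time rather than during analysis.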

RADx-UP is a community-engaged research program that builds on years of developing partnerships between communities and scientists. RADx-UP has funded 127 research projects with sites in every state and six U.S. territories as well as a RADx-UP Coordination and Data Collection Center (CDCC). RADx-UP assesses the needs and barriers related to COVID-19 testing and increases access to COVID-19 testing in underserved and vulnerable populations experiencing the highest rates of disparities in morbidity and mortality.

The COVID-19 pandemic necessitated establishing RADx-UP and its associated CDEs with unprecedented speed, relying heavily on data elements derived from those already defined in the NIH-based PhenX Toolkit and Disaster Research Response (DR2) resources. The short time frame for this process did not allow for collaboration with and input from RADx-UP investigators and community partners as extensive as would have been ideal. Additionally, many researchers, especially community partners engaged in RADx-UP projects, were not familiar with CDE data collection practices. As a result, CDE questionnaires had to be modified as studies progressed to better suit the needs of the consortium, and investigators new to CDE collection had to be familiarized with these processes quickly. NIH program officers, NIH RADx-UP and CDCC leadership, and engagement impact teams (EITs)—staff liaisons provided by the CDCC that link RADx-UP research teams to testing, data, and community-engagement resources—helped research teams implement and adjust CDE collection, ensured alignment across consortium research teams, and assisted with other data-related issues that arose.

All RADx programs are required to collect a standardized set of CDEs, including sociodemographic, medical history, and health status elements with the intent to provide researchers rapid access to data for secondary research analyses in the RADx Data Hub, the central repository for RADx data. However, implementation of CDEs in the context of underserved communities in the rapidly evolving COVID-19 pandemic presented complex issues for consideration.

Some of these issues included data privacy, the risk of re-identification of underserved and undocumented populations, and data collection burden on participants as well as researchers. The privacy of health data is protected under federal law. The RADx-UP program instituted measures to ensure program participants’ data remain protected and de-identified using a token-based hashing algorithm methodology that allows researchers to share individual-level participant data without exposing personally identifiable information. To address data collection and respondent burden concerns, projects modified questions to allow some flexibility in expanding response options more appropriate to some underserved communities. The CDCC also developed COLECTIV, a digital interface for projects to directly enter data into the data repository and included gateway questions to relieve respondent burden.
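
The post does not describe the hashing methodology in detail, so the following is only a minimal sketch of the general idea of token-based de-identification, under stated assumptions: a keyed hash (HMAC-SHA-256) over normalized identifying fields, with the secret key held by a trusted party. The key, field names, and normalization rules are hypothetical and are not the RADx-UP implementation.

```python
import hashlib
import hmac


def make_token(secret_key: bytes, first: str, last: str, dob: str) -> str:
    """Derive a stable, non-reversible token from identifying fields.

    Assumption: identifiers are normalized (trimmed, lowercased) so the
    same person yields the same token wherever the record is created.
    """
    normalized = "|".join(part.strip().lower() for part in (first, last, dob))
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(secret_key: bytes, record: dict) -> dict:
    """Replace identifying fields with a token and keep only research fields."""
    token = make_token(secret_key, record["first"], record["last"], record["dob"])
    shared = {k: v for k, v in record.items() if k not in {"first", "last", "dob"}}
    shared["token"] = token
    return shared


# Hypothetical key and records, for illustration only.
KEY = b"example-secret-held-by-a-trusted-party"
visit_1 = {"first": "Ana", "last": "Diaz", "dob": "1980-02-03", "test_result": "negative"}
visit_2 = {"first": " ana ", "last": "DIAZ", "dob": "1980-02-03", "test_result": "positive"}

out_1 = deidentify(KEY, visit_1)
out_2 = deidentify(KEY, visit_2)
print(out_1["token"] == out_2["token"])  # records for the same person share a token
print("first" in out_1)                  # identifying fields are not passed along
```

In a design like this, individual-level records can be linked through the token while names and birth dates never leave the collecting site. Using a keyed hash rather than a plain hash makes it harder to recover identities by hashing guessed names.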

Respect for Tribal Data Sovereignty

RADx-UP leadership and investigators recognized that additional considerations for tribal sovereignty, practices, and policies needed to be addressed for projects that include American Indian and Alaska Native (AI/AN) participants. Through consultations with the NIH Tribal Advisory Committee and the broader AI/AN community and meetings with an informal RADx-UP AI/AN project working group established by the CDCC, NIH realized that deposition of tribal data into the RADx Data Hub would not meet the cultural, governance, or sovereignty needs of AI/AN RADx research data. In response, NIH hopes to establish a RADx Tribal Data Repository (TDR) responsible for the collection, protection, and sharing of data collected in AI/AN communities with respect for the practices and policies of Tribal data sovereignty. Applications for the repository have been solicited and NIH hopes to make an award for the TDR sometime in FY23.

Rapid Data Sharing

One of the largest hurdles the RADx-UP program has faced is implementing rapid sharing of research data for secondary analyses and to inform decision-making and public health practices related to the COVID-19 pandemic. RADx-UP research teams are expected to share their data on a timely cadence before data collection ends. This is a far more stringent practice than the current standard NIH data-sharing policy, which requires data to be shared at the time of acceptance for publication of the main findings from the final data set. NIH and CDCC staff have worked together with the RADx research community to highlight the importance of rapid data-sharing and to support compliance with it. Within the first six months, a total of 69 Phase 1 projects began transmitting CDE data to the RADx-UP CDCC.

The COVID-19 pandemic posed a tremendous challenge, and NIH responded by collaborating with vulnerable and underserved communities. This collaboration has opened an unprecedented opportunity to build on a now-established foundation for future research to address gaps in understanding the broader social, cultural, and structural factors that influence disparities in morbidity and mortality from COVID-19 and other diseases. Data collection and sharing efforts of the RADx-UP initiative comprise a significant contribution. Collaboration among the NIH, research investigators, and communities impacted by COVID-19 has been the catalyst. To learn more about RADx-UP, please read a recent journal article available on PubMed.


Dr. Hodes has served as NIA director since 1993, overseeing studies of the biological, clinical, behavioral, and social aspects of aging. He has devoted his tenure to the development of a strong, diverse, and balanced research program focused on the genetics and biology of aging, basic and clinical studies aimed at reducing disease and disability, and investigation of the behavioral and social aspects of aging. Ultimately, these efforts have one goal — improving the health and quality of life for older people and their families. As a leading researcher in the field of immunology, Dr. Hodes has published more than 250 peer-reviewed papers.

Dr. Pérez-Stable practiced primary care internal medicine for 37 years at the University of California, San Francisco before becoming the Director of NIMHD in 2015. His research interests have centered on improving the health of individuals from racial and ethnic minority communities through effective prevention interventions, understanding underlying causes of health disparities, and advancing patient-centered care for underserved populations. Recognized as a leader in Latino health care and disparities research, he spent 32 years leading research on smoking cessation and tobacco control in Latino populations in the United States and Latin America. Dr. Pérez-Stable has published more than 300 peer-reviewed papers.

Meet the Next Generation of Leaders Advancing Data Science and Informatics at NLM

Guest post by Virginia Meyer, PhD, Training Director for the Intramural Research Program, National Library of Medicine, National Institutes of Health.

Working at NLM means being at the forefront of innovation in the rapidly evolving fields of data science and informatics. Within that environment, the NLM Intramural Research Program (IRP) is dedicated to supporting individuals looking to develop and apply computational approaches to a broad range of problems in biomedicine, molecular biology, and health.

NLM understands that contributions from people of diverse backgrounds, cultures, and histories enable research that has the greatest impact and reaches the widest possible audience. Such a workforce is necessary to drive innovation and scientific advancement and is imperative to ensuring that computational tools and data sets are free from bias. To that end, the Diversity in Data Science and Informatics (DDSI) Summer Internship, a program of the NLM IRP now in its inaugural year, was developed to support and engage young scientists who are dedicated to careers in computational biology and biomedical informatics. It is our hope that time spent in the DDSI program and with Principal Investigators (PIs) will encourage trainees to continue along the path toward becoming leaders in their chosen fields.

Meet four of this year’s DDSI interns and learn about the work they are doing in the NLM IRP!

Will Hibbard
Graduate Student in Biomedical Informatics
University at Buffalo

PI: Olivier Bodenreider, MD, PhD, Computational Health Research Branch, Lister Hill National Center for Biomedical Communications at NLM
Research Area: Natural Language Processing

What interested you most about the DDSI program?
I found out about the program when a teacher recommended it to me out of the blue, and after looking into it, I found a lot of fun research projects I could join. The program offered an opportunity to join research projects in familiar and unfamiliar fields. Ultimately, it was pleasantly outside of my comfort zone and presented the kind of challenge that makes me love research.

What research project are you working on and why?
I ended up working with Dr. Olivier Bodenreider using neural networks to better develop natural language processing in medical databases. I applied to this project because it involved two areas in which I had less experience: ontology and data structures. I pursued this research area because it allowed me the chance to improve in fields that I did not understand well at the time.

Why might someone want to apply to the DDSI program in the future?
This is the kind of experience with challenges that allow you to grow as a person and as a professional. Whether you know the area of research well or have trouble understanding it, this program will give you an opportunity to learn through a practical research project.

What is next for you after you complete your internship?
I will be taking a gap year while I apply to medical school. I am hoping to work in my local oncology institute and medical corridor.

MG Hirsch
PhD Student in Computer Science
University of Maryland, College Park

PI: Teresa Przytycka, PhD, Computational Biology Branch, National Center for Biotechnology Information at NLM
Research Area: Evolutionary Genomics

What interested you most about the DDSI program?
Evolution of gene expression and modeling different modes of evolution is something that I had yet to explore in my PhD research. I thought a summer program would be perfect to learn about it. It also gives me the opportunity to get a feel for working at the NIH and to see whether I would want to consider the NIH Graduate Partnerships Program.

What research project are you working on and why?
I am evaluating the possibility of different modes of gene expression evolution within a tumor. Previous work in the lab considers different models of gene expression evolution between animal species. Many models of evolution assume neutral evolution, that mutations occur and persist randomly; however, we know that mutations that change phenotypes undergo various selective pressures from the environment. Considering this, previous work, resulting in the software EvoGeneX, has fit computational models using Ornstein-Uhlenbeck processes to evaluate potential divergence of gene expression within fly species. My research project is applying this same concept to cancer tumors. After tumorigenesis, cancer cells rapidly accumulate further mutations and diversify into subclones within the same tumor. Owing to the different sets of mutations, these subclones evolve differently. We can hypothesize then that the evolution of the gene expression of subclones can be modeled using the same computational models.
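
As an illustration only (this is not the EvoGeneX implementation), the contrast between neutral and constrained evolution can be sketched by simulating an Ornstein-Uhlenbeck process, in which an expression level drifts randomly but is also pulled toward an optimum. Setting the pull strength to zero reduces the model to Brownian motion, the usual stand-in for neutral evolution. All parameter values below are arbitrary assumptions chosen for the demonstration.

```python
import random
import statistics


def simulate_ou(x0, theta, alpha, sigma, dt, steps, rng):
    """Euler-Maruyama simulation of dX = alpha*(theta - X)*dt + sigma*dW.

    alpha > 0 pulls X toward the optimum theta (constrained evolution);
    alpha == 0 reduces to Brownian motion (neutral drift).
    """
    x = x0
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0) * sigma * dt ** 0.5
        x += alpha * (theta - x) * dt + noise
    return x


def endpoint_variance(alpha, replicates=200, seed=0):
    """Variance of the final expression level across independent lineages."""
    rng = random.Random(seed)
    ends = [
        simulate_ou(x0=0.0, theta=1.0, alpha=alpha, sigma=1.0,
                    dt=0.05, steps=200, rng=rng)
        for _ in range(replicates)
    ]
    return statistics.variance(ends)


neutral_var = endpoint_variance(alpha=0.0)      # Brownian motion
constrained_var = endpoint_variance(alpha=1.0)  # pulled toward theta
print(f"neutral: {neutral_var:.2f}, constrained: {constrained_var:.2f}")
```

Under the neutral model the spread among lineages keeps growing with time, while under the constrained model it levels off around the optimum. That difference in spread is the kind of signal a model comparison between the two can pick up.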

Why might someone want to apply to the DDSI program in the future?
The DDSI program offers extra speaker talks and networking opportunities.

What is next for you after you complete your internship?
I will be finishing my PhD in computer science at UMD.

Sirisha Koirala
Undergraduate Student in Computer Science
University of Maryland, College Park

PI: Zhiyong Lu, PhD, Computational Biology Branch, National Center for Biotechnology Information at NLM
Research Area: Natural Language Processing and Computational Biology

What interested you most about the DDSI program?
I was most interested in the unique ongoing research projects that students had the opportunity to participate in, which I would not have been able to find at other programs. It was very interesting to learn about the ways that artificial intelligence (AI) could be applied to medical practices, and this stood out to me as medicine and AI are two of my main interests.

What research project are you working on and why?
I am working on AI in the prediction of progression in age-related macular degeneration. In my first year of college, I was on the pre-medicine track; however, while gaining greater exposure, I realized that I have a stronger passion for computer science. Within the field of computer science, I have a particular interest in AI, and this project specifically allowed me to combine both of my interests and backgrounds.

Why might someone want to apply to the DDSI program in the future?
The DDSI program provides students who come from underrepresented backgrounds a chance to gain real hands-on experience. As a student who came from a small, all-women’s university where I did not have the availability to engage in such opportunities, this program has helped me significantly. I have been able to get the real-world experience I need to help me excel further in my career preparations, and students who are in similar positions should consider applying for this reason.

What is next for you after you complete your internship?
After I complete my internship, I will be starting my second year of college at University of Maryland, College Park where I am pursuing a major in computer science.

Tochi Oguguo
Undergraduate Student in Computer Science and Information Systems
University of Maryland, Baltimore County

PI: Sameer Antani, PhD, Computational Health Research Branch, Lister Hill National Center for Biomedical Communications at NLM
Research Area: Bias in Machine Learning

What interested you most about the DDSI program?
What interests me the most about this program is the amount of experience you gain during the summer. You leave understanding concepts at a higher level and applying lessons to your life outside of research.

What research project are you working on and why?
My research project is about bias in machine learning. By using fair active learning, we teach the machine how to give accurate responses when diagnosing or classifying a dataset or image. Bias is one of the biggest issues in machine learning, especially in health care where inaccurate judgment can be dangerous.

Why might someone want to apply to the DDSI program in the future?
DDSI is a great program to help students and interns learn more about career paths out there for them to explore and to help you become a more resilient person and scientist outside of research.

What is next for you after you complete your internship?
I plan to apply again next summer and keep working in research and machine learning! Also, I will take more classes in information science to help me become a better programmer.

Informing Success from the Outside In: Introducing the NLM Board of Regents CGR Working Group

Guest post by Valerie Schneider, PhD, staff scientist at the National Library of Medicine (NLM) National Center for Biotechnology Information (NCBI), National Institutes of Health (NIH), and Kristi Holmes, PhD, Director of Galter Health Sciences Library & Learning Center and Professor of Preventive Medicine at Northwestern University Feinberg School of Medicine.

Last year, we described in two blog posts how NLM is developing the NIH Comparative Genomics Resource (CGR), a project that offers content, tools, and interfaces for genomic data resources associated with eukaryotic research organisms.

Eukaryote refers to any single-celled or multicellular organism whose cells contain a distinct, membrane-bound nucleus. Since all eukaryotes likely evolved from a common ancestor, studying them can grant us insight into how other eukaryotes, including humans, work, which makes CGR and its resources that much more important to eukaryotic research.

CGR aims to:

  • Promote high-quality eukaryotic genomic data submission.
  • Enrich NLM’s genomic-related content with community-sourced content.
  • Facilitate comparative biological analyses.
  • Support the development of the next generation of scientists.

Since our last two posts, the team at NCBI has been hard at work socializing CGR’s suite of tools and making important technical and content updates to it. For instance, they published new webpages that organize genome-related data by taxonomy, making it available for browsing and immediate download. They also created the ClusteredNR Database, a new database for the Basic Local Alignment Search Tool (BLAST), to provide results with greater taxonomic context for sequence searches, and incorporated new gene information from the Alliance of Genome Resources, an organization that unites data and information for model organisms’ unique aspects, into Gene. NCBI is also engaging with genomics communities to understand their needs and requirements for comparative genomics through the NLM Board of Regents Comparative Genomics Working Group.

The working group is lending their perspective and extensive expertise to the project, activities that are essential to CGR’s success and development. We have charged working group members with guiding the development of a new approach to scientific discovery that relies on genomic-related data from research organisms, helping project teams keep pace with changes in the field, and understanding the scientific community’s needs and expectations for key functionalities. To do this, working group members help NLM set development priorities such as exploring CGR’s integration with existing infrastructures and related workforce development opportunities.

Projects like CGR highlight how critical interdisciplinary collaboration is to modern research and how success requires community perspectives and involvement. Working group members will be sharing more information about this project at upcoming conferences and in biomedical literature, and our team at NCBI will also share events and resources through our NIH Comparative Genomics Resource website.

If you are a member of a model organism community, are working on emerging eukaryotic research models, or support eukaryotic genomic data—whether you are a researcher, educator, student, scholarly society member, librarian, data scientist, database resource manager, developer, epidemiologist, or other stakeholder in our progress—we encourage you to reach out and get involved. Here are a few suggestions:

  • Invite us to join you at a conference, teach a workshop, partner on a webinar, or discuss other ideas you may have to foster information sharing and feedback.
  • Use and share CGR’s suite of tools and share your feedback.
  • Be on the lookout for project updates and events on the CGR website or follow @NCBI on Twitter.

We’re always excited to get feedback through CGR listening sessions and user testing for tool and resource updates. Email cgr@nlm.nih.gov to learn all the ways you can participate.

Thank you to the members of the NLM Board of Regents CGR Working Group!

Alejandro Sanchez Alvarado, PhD
Executive Director and Chief Scientific Officer
Priscilla Wood Neaves Chair in the Biomedical Sciences
Stowers Institute for Medical Research

Hannah Carey, PhD
Professor, Department of Comparative Biosciences, School of Veterinary Medicine
University of Wisconsin-Madison

Wayne Frankel, PhD
Professor, Department of Genetics & Development
Director of Preclinical Models, Institute of Genomic Medicine
Columbia University Medical Center

Kristi L. Holmes, PhD (Chair)
Director, Galter Health Sciences Library & Learning Center
Professor of Preventive Medicine (Health & Biomedical Informatics)
Northwestern University Feinberg School of Medicine

Ani W. Manichaikul, PhD
Associate Professor, Center for Public Health Genomics
University of Virginia School of Medicine

Len Pennacchio, PhD
Senior Scientist
Lawrence Berkeley National Laboratory

Valerie Schneider, PhD (Executive Secretary)
Program Head, Sequence Enhancements, Tools and Delivery (SeqPlus)
HHS/NIH/NLM/NCBI

Kenneth Stuart, PhD
Professor, Center of Global Infectious Disease Research
Seattle Children’s Research Institute

Tandy Warnow, PhD
Grainger Distinguished Chair in Engineering
Associate Head of Computer Science
University of Illinois Urbana-Champaign

Rick Woychik, PhD (NIH CGR Steering Committee Liaison)
Director, National Institute of Environmental Health Sciences (NIEHS) and the National Toxicology Program (NTP)

Cathy Wu, PhD
Unidel Edward G. Jefferson Chair in Engineering and Computer Science
Director, Center for Bioinformatics & Computational Biology
Director, Data Science Institute
University of Delaware

Dr. Schneider is the deputy director of Sequence Offerings and the head of the Sequence Plus program. In these roles, she coordinates efforts associated with the curation, enhancement, and organization of sequence data, as well as oversees tools and resources that enable the public to access, analyze, and visualize biomedical data. She also manages NCBI’s involvement in the Genome Reference Consortium, which is the international collaboration tasked with maintaining the value of the human reference genome assembly.

Dr. Holmes is dedicated to empowering discovery and equitable access to knowledge through the development of computational and social architectures to support these goals. She also serves on the leadership team of the Northwestern University Clinical and Translational Sciences Institute.
