Bringing NLM to You

Guest post by Andrew Wiley, Video Producer, NLM Office of Communications and Public Liaison.

Before the COVID-19 pandemic, visitors from all over the world came to NLM for free, in-person, guided tours to learn about the largest biomedical library in the world. Visitors ranged from members of the public to students, educators, scientists, and nurses. They were introduced to many of NLM’s exciting research and information resources, such as the Visible Human Project — a library of digital images representing the complete anatomy of a man and a woman, allowing visitors to discover a new perspective on the human body. Visitors could also explore the NLM Data Center, which houses the vital databases visitors know and love, such as PubMed, ClinicalTrials.gov, MedlinePlus, and GenBank.

NLM is not your typical library. During tours, visitors could interact with the investigators in our NLM Intramural Research Program who are using computational biology and computational health science approaches to solve biological and clinical problems. Visitors could also descend into the underground stacks to see medical librarians scanning the world’s largest collection of scientific and medical literature, and view some of the world’s oldest and rarest medical books in NLM’s extensive historical collections — discovering just a few of the features that make NLM so unique.

While the pandemic put a temporary stop to our ability to continue with physical tours of NLM, we know that visitors are eager for a virtual alternative. That’s why we created our new NLM Welcome Page.

This is where you can start your virtual tour of NLM’s offerings and resources. Here you can embark on a journey through webpages that guide you from the world’s richest collections of historical material to the most cutting-edge data of the 21st century.

We want you to be able to experience NLM’s past, present, and future, and continue to see how NLM’s research and information services directly support scientific discovery, health care, and public health.

NLM is committed to serving scientists and society. What would you like to explore at NLM?


Photo of Andrew Wiley

Andrew Wiley is a video producer and writer for NLM’s Office of Communications and Public Liaison. Before joining NLM in 2008, Andrew produced local television in Frederick, Maryland, and worked as a video journalist for The Frederick News-Post.


Video Transcript:

Hello, I’m Dr. Patti Brennan. I’m the Director of the National Library of Medicine.

As a nurse and an industrial engineer, I’ve spent my career making sure that information is available to help people make everyday health choices and to support biological and medical discoveries.

At the National Library of Medicine, we provide trusted information to scientists, to society, and to people living every day with healthcare challenges.

For over 200 years, the National Library of Medicine has been a partner in biological discovery, clinical care decision making, and health care choices in everyday living. We began humbly as a small collection of books in the 1800s and have now grown to massive genomic databanks accessed worldwide every day by millions of people.

As one of the 27 Institutes and Centers here at the National Institutes of Health, we have three primary missions:

  • First, we have researchers who develop the tools that translate health data into health information and health action.
  • Second, we serve society by collecting the world’s biological and biomedical literature, making it useful to scientists through our PubMed resource and to everyday people through MedlinePlus.
  • Finally, we have a mission for outreach to make the National Library of Medicine’s resources accessible to everyone through our 7,000 points of presence around the United States. We make sure that the resources of the National Library of Medicine are available through public libraries, through hospital libraries, and in schools and clinics.

Making all of the resources of the National Library of Medicine available to the public requires a very large workforce. We have over 1,700 women and men working here. We have librarians, computer scientists, researchers, and biological scientists. We have individuals who understand clinical care, and who understand how to educate the public. We work together to make sure we can deliver—24 hours a day, 7 days a week—trusted health information.

Thank you for visiting us today. We hope you will join with us as we begin our third century bringing health information to scientists and society, accelerating biomedical discovery, improving health care, and ensuring health for all globally.

Artificial Intelligence, Imaging, and the Promising Future of Image-Based Medicine

In mid-October I gave a keynote speech, NLM Research in Trustable, Transparent AI for Decision Support, at the 50th Institute of Electrical and Electronics Engineers (IEEE) Applied Imagery Pattern Recognition conference in Washington, D.C. (virtually, for me). The IEEE continues to advance new topics in applied image and visual understanding, and the focus this year was to explore artificial intelligence (AI) in medicine, health care, and neuroscience.

To prepare for my talk, I reviewed our extramural research portfolio so I could highlight current research on these topics. NLM’s brilliant investigators are using a range of machine learning and AI strategies to analyze diverse image types. Some of the work fosters biomedical discovery; other work is focused on creating novel decision support or quality improvement strategies for clinical care. As I did with the audience at IEEE, I’d like to introduce you to a few of these investigators and their projects.

Hagit Shatkay and her colleagues from the University of Delaware direct a project titled Incorporating Image-based Features into Biomedical Document Classification. This research aims to support and accelerate the search for biomedical literature by leveraging the images within articles, which are rich and essential indicators of relevance. The project will build robust tools that harvest images from PDF articles and segment compound figures into individual image panels; identify and investigate features for representing and categorizing biomedical images; and create an effective representation of documents that integrates text-based and image-based classifiers.
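
To make the fusion idea concrete, here is a minimal sketch of one common way to combine text-based and image-based signals for document classification: extract features from each modality, concatenate them, and train a single classifier. The tiny corpus, labels, and image feature vectors below are hypothetical stand-ins, not the project’s data or pipeline.

```python
# A minimal late-fusion sketch, assuming hypothetical precomputed image features.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["protein binding assay results", "street scene photograph dataset"]
labels = [1, 0]  # 1 = relevant to a biomedical query, 0 = not (toy labels)
image_feats = np.array([[0.9, 0.1, 0.3],   # hypothetical averaged embeddings of
                        [0.2, 0.8, 0.5]])  # the figures harvested from each PDF

# Text features from the article text
X_text = TfidfVectorizer().fit_transform(texts).toarray()

# Late fusion: concatenate text and image features, then train one classifier
X = np.hstack([X_text, image_feats])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```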

Hailing from the University of Michigan, Jenna Wiens leads a project called Leveraging Clinical Time Series to Learn Optimal Treatment of Acute Dyspnea. Managing patients with acute dyspnea is a challenge, sometimes requiring minute-to-minute changes in care approaches. This team will develop a novel clinician-in-the-loop reinforcement learning (RL) framework that analyzes electronic health record (EHR) clinical time-series data to support physician decision-making. RL differs from the more traditional classification-based supervised learning approach to prediction; RL “learns” by evaluating multiple pathways to many different solution states. Wiens’ team will create a shareable, de-identified EHR time-series dataset of 35,000 patients with acute dyspnea and develop techniques for exploiting invariances (different approaches to the same outcome) in tasks involving clinical time-series data. Finally, the team will develop and evaluate an RL-based framework for learning optimal treatment policies and will validate the learned treatment model prospectively.
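
For readers unfamiliar with RL, the toy tabular Q-learning loop below illustrates the core idea of learning from evaluated pathways over a hypothetical discretized state space; the project’s clinician-in-the-loop framework over EHR time series is, of course, far richer.

```python
# A toy Q-learning sketch over a hypothetical discrete state/action space.
import numpy as np

n_states, n_actions = 5, 3         # e.g., coarse severity levels; treatment options
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

def step(state, action):
    """Hypothetical environment: returns the next state and a reward."""
    next_state = int(np.clip(state + rng.integers(-1, 2), 0, n_states - 1))
    reward = 1.0 if next_state < state else -0.1  # reward symptom improvement
    return next_state, reward

state = n_states - 1
for _ in range(1000):
    # Epsilon-greedy action selection: explore sometimes, exploit otherwise
    action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
    next_state, reward = step(state, action)
    # Q-learning update: nudge Q toward reward plus discounted best future value
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print(Q.round(2))  # learned state-action values
```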

Quynh Nguyen from the University of Maryland leads a project called Neighborhood Looking Glass: 360 Degree Automated Characterization of the Built Environment for Neighborhood Effects Research. Using geographic information systems and images, this team is assembling a national collection of all road intersections and street segments in the United States and developing informatics algorithms that capture neighborhood characteristics in order to assess their potential impact on health.
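
To give a flavor of how an algorithm can distill one neighborhood characteristic from an image, this sketch computes a crude vegetation (“green view”) fraction for a street image; the file name and threshold are hypothetical, and the team’s actual algorithms are not described here.

```python
# A crude "green view" estimate for a street image; file name is hypothetical.
import numpy as np
from PIL import Image

img = np.asarray(Image.open("street_segment.jpg").convert("RGB"), dtype=float)
r, g, b = img[..., 0], img[..., 1], img[..., 2]
green = (g > r) & (g > b) & (g > 60)  # pixels where green clearly dominates
print(f"Green view fraction: {green.mean():.2%}")
```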

Corey Lester from the University of Michigan leads a multidisciplinary team using machine intelligence in a project titled Preventing Medication Dispensing Errors in Pharmacy Practice with Interpretable Machine Intelligence. Machine intelligence is a branch of AI distinguished by its reliance on deductive logic and by its capacity to make continuous modifications based in part on detecting patterns and trends in data. The team is designing interpretable machine intelligence to double-check dispensed medication images in real time, evaluate changes in pharmacy staff trust, and determine the effect of interpretable machine intelligence on long-term pharmacy staff performance. More than 50,000 images are captured and put through an automated check process that predicts the shape, color, and National Drug Code of the medication product. This use of interpretable machine intelligence methods in the context of medication dispensing is designed to provide pharmacists with confirmatory information about prescription accuracy in a way that reduces cognitive demand while promoting patient safety.
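
A minimal sketch of the double-check idea, under assumed inputs, appears below: compare a captured fill image against a reference appearance for the claimed National Drug Code (NDC) and flag mismatches. The reference images, NDCs, and distance threshold are hypothetical placeholders, not the team’s interpretable model.

```python
# A hypothetical appearance check against per-NDC reference images.
import numpy as np
from PIL import Image

def color_histogram(path, bins=8):
    """Normalized RGB color histogram as a simple appearance feature."""
    img = np.asarray(Image.open(path).convert("RGB"))
    hist, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

# Hypothetical NDC -> reference image feature
references = {"0000-0000-01": color_histogram("ref_pill_a.png"),
              "0000-0000-02": color_histogram("ref_pill_b.png")}

def check_dispense(image_path, claimed_ndc, threshold=0.1):
    """Return True if the fill's appearance matches the claimed NDC's reference."""
    dist = np.abs(color_histogram(image_path) - references[claimed_ndc]).sum()
    return dist <= threshold  # small L1 histogram distance = consistent

print(check_dispense("captured_fill.png", "0000-0000-01"))
```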

Alan McMillan from the University of Wisconsin-Madison and his team are examining how image interpretation can be made robust to noisy data in a project called Can Machines be Trusted? Robustification of Deep Learning for Medical Imaging. Noisy data is information that cannot be understood and interpreted correctly by machines (such as unstructured text). While deep learning approaches (methods that automatically extract high-level features from input data to discern relationships) to image interpretation are gaining acceptance, these algorithms can fail when the images themselves include small errors arising from problems with image capture or slight movements (e.g., chest excursion as the patient breathes). The project team will probe the limits of deep learning when presented with noisy data, with the ultimate goal of making deep learning algorithms more robust for clinical use.
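
One simple way to probe the limits the project describes is to perturb test images with increasing noise and watch accuracy degrade. The sketch below does this for a generic PyTorch classifier; the toy model and random data are stand-ins for a real medical imaging model and test set.

```python
# A noise-sweep robustness probe for a generic image classifier (toy stand-ins).
import torch

def accuracy(model, images, labels):
    with torch.no_grad():
        return (model(images).argmax(dim=1) == labels).float().mean().item()

def robustness_curve(model, images, labels, sigmas=(0.0, 0.05, 0.1, 0.2)):
    """Accuracy as a function of added Gaussian pixel noise."""
    return {s: accuracy(model, images + s * torch.randn_like(images), labels)
            for s in sigmas}

# Toy model and random data, standing in for a real imaging model and test set
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 2))
images, labels = torch.randn(16, 1, 64, 64), torch.randint(0, 2, (16,))
print(robustness_curve(model, images, labels))
```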

In the work of Joshua Campbell’s team at Boston University, the images emerge at the end of the process to allow for visualization of large-scale single-cell datasets. The project, titled Integrative Clustering of Cells and Samples Using Multi-Modal Single-Cell Data, uses a Bayesian hierarchical model developed by the team to perform bi-clustering of genes into modules and cells into subpopulations. The team is developing innovative models that cluster cells into subpopulations using multiple data types and cluster patients into subgroups using both single-cell data and patient-level characteristics. This approach offers improvements over discrete Bayesian hierarchical models for classification in that it will support multi-modal and multilevel clustering of data.
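
As a rough, non-Bayesian analogue of this bi-clustering, the sketch below co-clusters a toy gene-by-cell count matrix with scikit-learn’s SpectralCoclustering; the team’s Bayesian hierarchical model is considerably more expressive, so treat this only as an illustration of grouping genes into modules and cells into subpopulations simultaneously.

```python
# Co-clustering a toy gene-by-cell matrix (not the team's Bayesian model).
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(100, 60)).astype(float)  # genes x cells
counts[:50, :30] += rng.poisson(5.0, size=(50, 30))      # a planted module

model = SpectralCoclustering(n_clusters=2, random_state=0).fit(counts)
print("gene module sizes:", np.bincount(model.row_labels_))
print("cell subpopulation sizes:", np.bincount(model.column_labels_))
```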

Several things struck me as I reviewed these research projects. The first was a sense of excitement over the engagement of so many smart young people at the intersection of analytics, biomedicine, and technology. The second was the variety of image types across the projects: one study explores radiological images, while others work with figures from journal articles, pictures of the built environment, and images of workflows in a pharmacy. Two of these studies use AI techniques to analyze the physical environment to better understand its influence on patient health and safety, and one study uses images as a visualization tool to better support inference in large-scale biomedical research projects. Images appear at all points of the research process, and their effective use heralds an era of image-based medicine. Let’s see what lies ahead!

Leveraging the Value of Biomedical Informatics Across NIH

The American Medical Informatics Association (AMIA) 2021 Annual Symposium is coming to a close today, and I was honored to moderate NLM’s Annual Update Panel. This was an opportunity to talk about NLM’s contributions to NIH data science and tools, common data elements, and clinical informatics. Over the last 40 years I’ve proudly served in many roles in AMIA and its predecessor organizations, including president, member of the Board of Directors, associate editor of the Journal of the American Medical Informatics Association, and numerous committee assignments, all of which fostered the advancement of health at the intersection of informatics, clinical, and biomedical knowledge. I’ve made many friends, been mentored by some of the greatest minds in the field, mentored others, and am grateful for the intellectual leadership and personal support provided by attendees at this meeting.

This year I am leading a completely new effort; for the first time in its 134-year history, NIH has multiple leaders across its Institutes and Centers who also are leaders in biomedical informatics. Together with Michael F. Chiang, MD, Director of the National Eye Institute; Joshua Denny, MD, MS, CEO of NIH’s All of Us Program; Zhiyong Lu, PhD, FACMI, Senior Investigator in NLM’s National Center for Biotechnology Information; and Clement McDonald, MD, Chief Health Data Standards Officer at NLM, I had the pleasure of leading a panel discussion about how informatics is accelerating efforts at NIH in support of biomedical discovery and the public health response to the COVID-19 pandemic.

NIH recognizes that the future of biomedical discovery rests, in part, on being able to leverage the knowledge embodied in clinical health records. For example:

  • As we learned throughout the pandemic, the insights we glean from clinical health records and from understanding the natural history of COVID-19 inform best practices for addressing its spread. More importantly, the ability to quickly and efficiently access clinical information provides an opportunity to titrate clinical trials in response to the ‘in-the-moment’ understanding of the course of an illness and clinical care for patients.
  • An improved understanding of the long-term course of COVID-19 and its clinical sequelae rests on being able to follow patients across time. Clarifying the impact of novel vaccines or clinical therapeutics would be enhanced by the ability to integrate participant information across any and every study in which the participant is represented.
  • NIH is engaged in exploring the value of contemporary and emerging informatics innovations, such as cloud-based reusable platforms and datasets, common data models, and the effective development and use of artificial intelligence and machine learning approaches to biomedical research and clinical practice.

Each of these requires effective deployment of informatics innovations into the research process.

AMIA’s Clinical Research Informatics community was foundational to integrating biomedical informatics concepts into the research process. Much progress has been made in structuring information for clinical and translational research and in developing common data elements. Incremental progress should be praised, but engaging the operations of the world’s largest biomedical enterprise requires multiple touch points. Specifically, expanding the critical mass of leadership with expertise in biomedical informatics is essential for instituting enterprise-wide change.

Perhaps as evidence of the importance of biomedical informatics in the research enterprise, NIH now has directors of three institutes or major operations who stand as leaders in the biomedical informatics community. The number of American College of Medical Informatics fellows among the NIH staff is expanding. Not only does this allow a ‘community of conversation,’ but embedding informatics expertise across NIH has accelerated the acceptance of, and the valuing of, biomedical informatics in the biomedical research enterprise. Together we are stimulating new biomedical informatics methods and processes and the application of biomedical informatics innovation to science and health. I invite you to come join us!

Using Comparative Genomics to Advance Scientific Discoveries

Guest post by Valerie Schneider, PhD, staff scientist at the National Library of Medicine’s National Center for Biotechnology Information, National Institutes of Health.

In a post from earlier this year, A Journey to Spur Innovation and Discovery, I shared news of an exciting NIH-supported NLM initiative, now known as the NIH Comparative Genomics Resource (CGR). CGR, which supports eukaryotic organisms, is modernizing NIH resources and infrastructure to support research involving non-human organisms. This initiative will improve the data foundational to analyses that rely on comparisons of diverse genomes in NLM databases, increase the connectivity of those data to related content, and facilitate their discovery and retrieval. Just as researchers look to the data from these organisms to teach them about a wide range of fundamental biological processes underpinning human health, NLM relies on the research community to help inform the development and delivery of organism-agnostic core tools and interfaces for CGR so that it can best support these analyses.

Stakeholder feedback and engagement are central to the vision and ethos of the NLM Strategic Plan 2017-2027. Since the plan’s inception, NLM enterprises undertaken in support of our three primary goals have placed heavy emphasis on community connections in both their planning and execution. Likewise, understanding stakeholder needs is a fundamental element of CGR. With more than 19,000 genomes from over 8,500 species (excluding bacteria and viruses) found in our Assembly database, it’s clear that CGR’s user base will hail from a large and diverse collection of research organism communities. Within each community, the role CGR will play varies with the amount of genomic sequence available and with the existence of organism-specific data resources, such as community knowledge bases. Data consumers themselves are a heterogeneous population, representing different levels of research interest, education, bioinformatics expertise, and analysis needs.
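
For readers who want a feel for this content, the Assembly database can be queried programmatically through NCBI’s E-utilities; the sketch below uses Biopython to count assemblies for one organism (the email address and search term are placeholders).

```python
# Counting Assembly records for one organism via E-utilities (Biopython).
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI asks callers to identify themselves
handle = Entrez.esearch(db="assembly", term="Danio rerio[Organism]")
record = Entrez.read(handle)
handle.close()
print(f"Assemblies found: {record['Count']}")
```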

CGR is using a multi-tiered and multi-faceted approach to ensure stakeholder requirements are understood and appropriately prioritized throughout the project duration. CGR is working to identify community-supplied genome-related data that can be integrated to enhance content supplied by NLM. Two governance bodies are playing important roles in this effort. A trans-NIH CGR steering committee provides strategic oversight by guiding CGR with respect to the priorities of NIH institutional stakeholders, and an NLM Board of Regents CGR working group is charged with helping engage with the scientific community and enlist them as partners in the development effort. Working group members have expertise in topics relevant to the CGR initiative, such as comparative genomic analysis, emerging large-scale genomics approaches, organism-centered research into general biological or disease processes, biological education, and workforce development.

We are developing a presence for CGR at scientific conferences and workshops to encourage partnerships with members of research communities and connect with attendees. A CGR-related talk given at the BioDiversity Genomics 2021 conference in September introduced a new cloud-based tool for improving genomic quality, to be released in 2022, and identified researchers to serve as beta testers. Additional targeted outreach will be held independently of conferences to gather feedback and inform development.

The CGR project utilizes an iterative development process in which user testing is an integral element. Feedback gathered through these testing exercises is incorporated into the next development cycle. This approach ensures we remain engaged with the CGR target audience throughout the project by understanding their needs and providing a resource that is valuable to their research pursuits. For example, recent user testing of a prototype Basic Local Alignment Search Tool (BLAST) database engineered to support sequence queries seeking a broad distribution of organisms in the results taught us about other content that will need to be provided for proper interpretation of results.
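
For context, a standard remote BLAST query can be issued from Biopython as sketched below; because the prototype database described above is not publicly named here, this runs against the familiar nt database with a made-up query sequence.

```python
# A remote BLAST query via Biopython; the query sequence is a placeholder.
from Bio.Blast import NCBIWWW, NCBIXML

seq = "AGCTTAGCTAGGACGTTGCATCGATCGATCGTAGCTAGCTAGGCATCG"  # hypothetical query
result = NCBIWWW.qblast("blastn", "nt", seq, hitlist_size=5)
for alignment in NCBIXML.read(result).alignments:
    print(alignment.title)
```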

NLM is poised to learn great things from our users as part of the CGR project. You can learn more about engagement opportunities by contacting us at info@ncbi.nlm.nih.gov. We value your input as we continue this journey together.

Valerie Schneider, PhD, is the deputy director of Sequence Offerings and the head of the Sequence Plus program. In these roles, she coordinates efforts associated with the curation, enhancement, and organization of sequence data, as well as oversees tools and resources that enable the public to access, analyze, and visualize biomedical data. She also manages NCBI’s involvement in the Genome Reference Consortium, the international collaboration tasked with maintaining the value of the human reference genome assembly.

Pursuing Data-Driven Responses to Public Health Threats

In my 11th grade civics class, I learned how a bill becomes a law, and I’ll bet some of you can even remember the steps. Today, I want to introduce you to another way that the federal government takes action: executive orders. As head of the executive branch, the president can issue an executive order to manage operations of the federal government.

In light of the COVID-19 pandemic, President Biden has issued executive orders to accelerate the country’s ability to respond to public health threats.

This is where I come in. As Director of the National Library of Medicine (NLM) and a member of the leadership team of the National Institutes of Health, I’m part of a group developing the implementation plan for the Executive Order entitled Ensuring a Data-Driven Response to COVID-19 and Future High-Consequence Public Health Threats.

This order directs the heads of all executive departments and agencies to work on COVID-19 and pandemic-related data issues. This includes making data relevant to high-consequence public health threats accessible to everyone, reviewing existing public health data systems and recommending improvements, and reviewing the workforce capacity for advanced information technology and data management. And, like all good government work, a report summarizing findings and providing recommendations will be issued.

Since March 2021, I have been meeting 2 to 3 times a month with public health and health data experts across the U.S. Department of Health & Human Services (HHS). Our committee includes staff from the Office of the National Coordinator for Health Information Technology, Food and Drug Administration, Centers for Disease Control and Prevention, Centers for Medicare & Medicaid Services, and Office of the Assistant Secretary for Planning and Evaluation.

After creating a work plan, our group arranged briefings with many other groups, including public health officials from states and territories, representatives from major health care systems, and the public. We reviewed many initiatives to promote open data, data sharing, and data protection across the government sphere. We learned about the challenges of developing and adopting data standards, and about the ability of different groups to come together to make data more useful in preparing the country to anticipate and respond to high-consequence public health threats. We discussed future strategies for data management and data protection, new analytical models, and workforce development initiatives. Our working group provided a report to the Office of Science and Technology Policy (OSTP), handing the effort off to the next team, which will carry it forward to completion. In coordination with the National Science and Technology Council, OSTP will develop a plan for advancing innovation in public health data and analytics.

This was a beneficial experience for me, and I certainly learned a great deal. Implementing a public health response system requires engagement with many HHS divisions, each of which brings a unique perspective and experience. I also developed new relationships based on trust and collaboration with these colleagues. At NLM, we have experts in data standards and data collection, and we oversee vast data repositories, so we have substantial domain-specific knowledge to contribute. I drew frequently on the knowledge and expertise of NLM staff to inform the process through analyses of information and the preparation of reports. I am grateful for all who helped and supported me.

I believe our country is on track to have the data necessary to prevent, detect, and respond to future high-consequence public health threats. This is yet another way that NLM is helping shape data-powered health for the future. What else can we do for you?
