The Next Normal: Supporting Biomedical Discovery, Clinical Practice, and Self-Care

As we start year three of the COVID-19 pandemic, it’s time for NLM to take stock of the parts of our past that will support the next normal and what we might need to change as we continue to fulfill our mission to acquire, collect, preserve, and disseminate biomedical literature to the world.

Today, I invite you to join me in considering the assumptions and presumptions we made about how scientists, clinicians, librarians, and patients are using critical NLM resources and how we might need to update those assumptions to meet future needs. I will give you a hint: it’s not all bad. In fact, I find it quite exciting!

Let’s highlight some of our assumptions about how people are using our services, at least from my perspective. We anticipated the need for access to medical literature across the Network of the National Library of Medicine and created DOCLINE, an interlibrary loan request routing system that quickly and efficiently links participating libraries’ journal holdings. We also anticipated that we were preparing the literature and our genomic databases for humans to read and peruse. Now we’re finding that more than half of the accesses to NLM resources are generated and driven by computers through application programming interfaces. Even our MedlinePlus resource for patients now connects tailored electronic responses through MedlinePlus Connect to computer-generated queries originating in electronic health records.
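
As a rough illustration of what computer-driven access can look like, the sketch below builds a request URL for ESearch, one of NCBI's public E-utilities endpoints, and reads the PubMed IDs out of a JSON reply. The search term, the helper names, and the abbreviated sample reply are assumptions made for illustration; no network call is made here.

```python
import json
from urllib.parse import urlencode

# Base address of NCBI's ESearch E-utility.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, max_results=20):
    """Return an ESearch URL that asks PubMed for IDs matching `term`."""
    params = {
        "db": "pubmed",         # database to search
        "term": term,           # the query, as a person might type it
        "retmax": max_results,  # upper bound on the number of returned IDs
        "retmode": "json",      # ask for JSON instead of the default XML
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def extract_pmids(response_text):
    """Pull the list of PubMed IDs out of an ESearch JSON reply."""
    payload = json.loads(response_text)
    return payload["esearchresult"]["idlist"]


# Hypothetical, heavily abbreviated reply used only to exercise the parser.
sample_reply = '{"esearchresult": {"count": "2", "idlist": ["11111111", "22222222"]}}'

url = build_esearch_url("covid-19 vaccine", max_results=5)
pmids = extract_pmids(sample_reply)
print(url)
print(pmids)
```

A program like this, run on a schedule or inside another application, is the kind of "user" that now accounts for much of the traffic described above.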

Perhaps most importantly, we realize that while sometimes the information we present is actually read by a living person, other times the information we provide, such as clinical trial records (ClinicalTrials.gov) or genotype and phenotype data (dbGaP), is actually processed by computers! Increasingly, we provide direct access to the raw, machine-readable versions of our resources so those versions can be fed into specialized analysis programs, such as natural-language processing programs that find studies with similar findings or machine-learning models that determine the similarities between two gene sequences. For example, NLM makes it possible for advocacy groups to download study information from all ClinicalTrials.gov records so anyone can use their own programs to point out trials that may be of interest to their constituents or to compare summaries of research results for related studies.
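
To make the advocacy-group example concrete, here is a minimal sketch of the kind of program someone might run over downloaded study records. It assumes the records have already been saved locally and reduced to simple dictionaries; the field names (`nct_id`, `title`, `conditions`), the placeholder IDs, and the `find_trials` helper are hypothetical and do not reflect the actual ClinicalTrials.gov data format.

```python
def find_trials(records, keywords):
    """Return records whose title or conditions mention any keyword."""
    wanted = [k.lower() for k in keywords]
    matches = []
    for record in records:
        # Combine the searchable text fields into one lowercase string.
        text = " ".join([record["title"], *record["conditions"]]).lower()
        if any(k in text for k in wanted):
            matches.append(record)
    return matches


# Made-up records standing in for a locally downloaded data set.
records = [
    {"nct_id": "NCT-EXAMPLE-1", "title": "Exercise and Type 2 Diabetes",
     "conditions": ["Diabetes Mellitus, Type 2"]},
    {"nct_id": "NCT-EXAMPLE-2", "title": "Sleep Hygiene in Adolescents",
     "conditions": ["Insomnia"]},
    {"nct_id": "NCT-EXAMPLE-3", "title": "Diet Coaching for Prediabetes",
     "conditions": ["Prediabetic State"]},
]

of_interest = find_trials(records, ["diabetes"])
print([r["nct_id"] for r in of_interest])
```

Simple substring matching is used here only to keep the sketch short; it also matches "prediabetes", which may or may not be what a given group wants.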

Machine learning and artificial intelligence have progressed to the point that they perform reasonably well in connecting similar articles. To this end, our LitCovid open-resource literature hub has served as an electronic companion to the human curation of coronavirus literature. LitCovid is more efficient, and its sophisticated search function creates more relevant pathways that are more likely to surface articles that fulfill the needs of our users. Most importantly, innovations such as LitCovid help our users manage the vast and ever-growing collection of biomedical literature, now numbering more than 34 million citations in NLM’s PubMed, the most heavily used biomedical literature citation database.

Partnerships are a critical asset to bring biomedical knowledge into the hands (and eyes) of those who need it. Over the last decade, NLM moved toward a new model for managing citation data in PubMed. We released the PubMed Data Management system that allows publishers to quickly update or correct nearly all elements of their citations and that accelerates the delivery of correct and complete citation data to PubMed users.

As part of the MEDLINE 2022 Initiative, NLM transitioned to automated Medical Subject Headings (MeSH) indexing of MEDLINE citations in PubMed. Automated MeSH indexing significantly decreases the time for indexed citations to appear in PubMed without sacrificing the quality MEDLINE is known to provide. Our human indexers can focus their expertise on curation efforts to validate assigned MeSH terms, thereby continuously improving the automated indexing algorithm and enhancing discoverability of gene and chemical information in the future.

We’re already preparing for the next normal—what do you think it will be like?

I envision making our vast resources increasingly available to those who need them and forging stronger partnerships that improve users’ ability to acquire and understand knowledge. Imagine a service, designed and run by patients, that could pull and synthesize the latest information about a disease or recommendations for managing a clinical issue, or one that could help a young investigator better pinpoint areas ripe for new interrogation! The next normal will make the best use of human judgment and creativity by selecting and organizing relevant data to create a story that forms the foundation of new inquiry or the basis of new clinical care. Come along and help us co-create the next normal!

Meet the NLM Investigators: For Sameer Antani, PhD, Seeing is More Than Meets the Eye

It’s time for another round of introductions! Many of you may already know Sameer Antani, PhD—one of NLM’s most decorated and prestigious investigators—from his many awards and accolades. In March 2022, he was inducted into the American Institute for Medical and Biological Engineering’s College of Fellows, an impressive group that represents the top two percent of medical and biological engineers. This distinction is one of the highest honors that can be bestowed upon a medical and biological engineer. Can you tell we are proud of him?!  

We selected Dr. Antani to join our NLM family after a nationwide, competitive search, and his genius was readily apparent from the start. Dr. Antani’s career spans over two decades, during which he developed an innovative research portfolio focused on machine learning and artificial intelligence (AI). His lab at NLM focuses on using these tools to analyze enormous sets of biomedical data. Through this analysis, AI technology can “learn” to detect disease and assist health care professionals in providing more efficient diagnoses. Examples of Dr. Antani’s work can be found in mobile radiology vehicles, which allow professionals to take chest X-rays and screen for HIV and tuberculosis using software containing algorithms developed in his lab. Check out the infographic below to learn more about the exciting research happening in Dr. Antani’s lab.

Infographic titled: Seeing is more than meets the eye. Under the title the investigator's name, title and division are listed as: Sameer Antani, PhD, Investigator, Computational Health. 

The first column of the infographic is titled: Projects. Two bullets are listed in the first column. The first bullet reads: Discovering the impact of data on automated AI and machine learning (AI/ML) processes on diagnostics. The second bullet says: Improving AI/ML algorithm decisions to be consistent, reproducible, portable, explainable, unbiased, and representative of severity.

The second column is titled: Process. The first bullet in this column reads: Using images and videos alongside AI/ML technology to identify and diagnose:
  • Cancers: Cervical, Oral, Skin (Kaposi Sarcoma)
  • Cardiomyopathy
  • Cardiopulmonary diseases
The second bullet reads: Analyzing a variety of image types, including: Computerized Tomography (CT), Magnetic Resonance Imaging (MRI), X-ray, ultrasound, photos, videos, microscopy.

The third and final column in the infographic is titled: What It Looks Like. In this column there are four images of chest x-rays illustrating the detection of HIV and TB.

Now, in his own words, learn more about what makes Dr. Antani’s work so important!

What makes your team unique? Tell us more about the people working in your lab.   

The postdoctoral research fellows, long-term staff scientists, and research scientists on my team explore challenging computational health topics while simultaneously advancing topics in machine learning for medical imaging. Dr. Ghada Zamzmi, Dr. Peng Guo, and Dr. Feng Yang bring expertise and drive to our lab. The scientists on my team, Dr. Zhiyun (Jaylene) Xue and Dr. Sivarama Krishnan Rajaraman, add over two decades of combined research and mentoring experience.  

What do you enjoy about working at NLM?  

There are many positives about working at NLM. At the top of the list is the encouragement and support to explore cutting-edge problems in medical informatics, data science, and machine intelligence, among other initiatives. 

What is your advice for young scientists or people interested in pursuing a career in research?  

I urge young scientists to recognize the power of multidisciplinary teams. I would also urge them to develop skills to clearly communicate their goals and research interests with colleagues who might be from a different domain so they can effectively collaborate and arrive at mutually beneficial results. 

Where is your favorite place to travel?

I like to travel to places that exhibit the natural wonders of our planet. I hope to visit all our national parks someday. 

When you’re not in the lab, what do you enjoy doing?

I am studying and exploring different aspects of music structure.

You’ve read his words, and now you can hear him for yourself! Follow our NLM YouTube page for more exciting content from the NLM staff who make it all possible. If you’d like to learn more about our Intramural Research Program (IRP), view job opportunities, and explore research highlights, I invite you to visit our recently redesigned NLM IRP webpage.

YouTube: Sameer Antani and Artificial Intelligence

Transcript: [Antani]: I went to school for computer engineering in India. I’ve worked with image processing, computer vision, pattern recognition, machine learning. So my world was filled with developing algorithms that could extract interesting objects from images and videos. Pattern recognition is a family of techniques that looks for particular pixel characteristics or voxel characteristics inside an image and learns to recognize those objects. Deep learning is a way of capturing the knowledge inside an image and encapsulating it, and then researchers like me spend time advancing newer deep-learning networks that look more broadly into an image, recognizing these objects—recognizing organs, in my case, and diseases—and converting those visuals into numerical risk predictors that could be used by clinicians.

So my research is currently in three very different areas. One area looks at cervical cancer. A machine could look at the images and be a very solid predictor of the risk to the woman of developing cervical precancer, encouraging early treatment. Another area I work with [is] sickle cell disease. One of the risk factors in sickle cell disease is cardiac myopathy or cardiac muscle disease, which leads to stroke and perhaps even death. Looking at cardiac echo videos and using AI to be a solid predictor, along with other blood lab tests, improves the chances of survival.

A third area that I’m interested in is understanding the expression of tuberculosis [TB] in chest X-rays, particularly for children and those that are HIV-positive. The expression of disease in that subpopulation is very different from adults with TB who are not HIV positive. Every clinician has seen a certain number of patients in their clinical training. They perhaps have spent more time at hospitals or clinical centers, been exposed to a certain population, and they become very adept at that population. Machines, on the other hand, could be trained on data that is free of bias, from different parts of the world, different ethnicities, different age groups, so that there’s an improved caregiving and therefore, a better expectation on treatment and care.

Note: Transcript was modified for clarity.

Informing Success from the Outside In: Introducing the NLM Board of Regents CGR Working Group

Guest post by Valerie Schneider, PhD, staff scientist at the National Library of Medicine (NLM) National Center for Biotechnology Information (NCBI), National Institutes of Health (NIH), and Kristi Holmes, PhD, Director of Galter Health Sciences Library & Learning Center and Professor of Preventive Medicine at Northwestern University Feinberg School of Medicine.

Last year, we described in two blog posts how NLM is developing the NIH Comparative Genomics Resource (CGR), a project that offers content, tools, and interfaces for genomic data resources associated with eukaryotic research organisms.

Eukaryote refers to any single-celled or multicellular organism whose cells contain a distinct, membrane-bound nucleus. Since all eukaryotes likely evolved from a common ancestor, studying them can grant us insight into how other eukaryotes, including humans, work. That makes CGR and its resources all the more important to eukaryotic research.

CGR aims to:

  • Promote high-quality eukaryotic genomic data submission.
  • Enrich NLM’s genomic-related content with community-sourced content.
  • Facilitate comparative biological analyses.
  • Support the development of the next generation of scientists.

Since our last two posts, the team at NCBI has been hard at work making important technical and content updates to CGR’s suite of tools and socializing those tools. For instance, they published new webpages that organize genome-related data by taxonomy, making it available for browsing and immediate download. They also created the ClusteredNR Database, a new database for the Basic Local Alignment Search Tool (BLAST), to provide results with greater taxonomic context for sequence searches, and incorporated into Gene new gene information from the Alliance of Genome Resources, an organization that unites data and information for model organisms’ unique aspects. NCBI is also engaging with genomics communities to understand their needs and requirements for comparative genomics through the NLM Board of Regents Comparative Genomics Working Group.

The working group is lending their perspective and extensive expertise to the project, activities that are essential to CGR’s success and development. We have charged working group members with guiding the development of a new approach to scientific discovery that relies on genomic-related data from research organisms, helping project teams keep pace with changes in the field, and understanding the scientific community’s needs and expectations for key functionalities. To do this, working group members help NLM set development priorities such as exploring CGR’s integration with existing infrastructures and related workforce development opportunities.

Projects like CGR highlight how critical interdisciplinary collaboration is to modern research and how success requires community perspectives and involvement. Working group members will be sharing more information about this project at upcoming conferences and in biomedical literature, and our team at NCBI will also share events and resources through our NIH Comparative Genomics Resource website.

If you are a member of a model organism community, are working on emerging eukaryotic research models, or support eukaryotic genomic data—whether you are a researcher, educator, student, scholarly society member, librarian, data scientist, database resource manager, developer, epidemiologist, or other stakeholder in our progress—we encourage you to reach out and get involved. Here are a few suggestions:

  • Invite us to join you at a conference, teach a workshop, partner on a webinar, or discuss other ideas you may have to foster information sharing and feedback.
  • Use and share CGR’s suite of tools and share your feedback.
  • Be on the lookout for project updates and events on the CGR website or follow @NCBI on Twitter.

We’re always excited to get feedback through CGR listening sessions and user testing for tool and resource updates. Email cgr@nlm.nih.gov to learn all the ways you can participate.

Thank you to the members of the NLM Board of Regents CGR Working Group!

Alejandro Sanchez Alvarado, PhD
Executive Director and Chief Scientific Officer
Priscilla Wood Neaves Chair in the Biomedical Sciences
Stowers Institute for Medical Research

Hannah Carey, PhD
Professor, Department of Comparative Biosciences, School of Veterinary Medicine
University of Wisconsin-Madison

Wayne Frankel, PhD
Professor, Department of Genetics & Development
Director of Preclinical Models, Institute of Genomic Medicine
Columbia University Medical Center

Kristi L. Holmes, PhD (Chair)
Director, Galter Health Sciences Library & Learning Center
Professor of Preventive Medicine (Health & Biomedical Informatics)
Northwestern University Feinberg School of Medicine

Ani W. Manichaikul, PhD
Associate Professor, Center for Public Health Genomics
University of Virginia School of Medicine

Len Pennacchio, PhD
Senior Scientist
Lawrence Berkeley National Laboratory

Valerie Schneider, PhD (Executive Secretary)
Program Head, Sequence Enhancements, Tools and Delivery (SeqPlus)
HHS/NIH/NLM/NCBI

Kenneth Stuart, PhD
Professor, Center for Global Infectious Disease Research
Seattle Children’s Research Institute

Tandy Warnow, PhD
Grainger Distinguished Chair in Engineering
Associate Head of Computer Science
University of Illinois Urbana-Champaign

Rick Woychik, PhD (NIH CGR Steering Committee Liaison)
Director, National Institute of Environmental Health Sciences (NIEHS) and the National Toxicology Program (NTP)

Cathy Wu, PhD
Unidel Edward G. Jefferson Chair in Engineering and Computer Science
Director, Center for Bioinformatics & Computational Biology
Director, Data Science Institute
University of Delaware

Dr. Schneider is the deputy director of Sequence Offerings and the head of the Sequence Plus program. In these roles, she coordinates efforts associated with the curation, enhancement, and organization of sequence data and oversees tools and resources that enable the public to access, analyze, and visualize biomedical data. She also manages NCBI’s involvement in the Genome Reference Consortium, the international collaboration tasked with maintaining the value of the human reference genome assembly.

Dr. Holmes is dedicated to empowering discovery and equitable access to knowledge through the development of computational and social architectures to support these goals. She also serves on the leadership team of the Northwestern University Clinical and Translational Sciences Institute.

Watch All About It!

Guest post by Bart Trawick, PhD, director of the Customer Services Division at the National Library of Medicine’s National Center for Biotechnology Information, National Institutes of Health.

NLM’s PubMed is the most heavily used biomedical literature citation database in the world. PubMed provides free access to more than 30 million citations and is searched by more than 2.5 million users daily. It is a critical resource for helping researchers, health care professionals, students, and the public share information and learn more about the latest developments in life sciences.

Earlier this year, NLM launched an updated version of PubMed with an enhanced design that provides advanced technology to improve the user experience on mobile as well as desktop devices. This modern interface includes updated web elements for easier navigation and enhanced search results, including previews with highlighted text snippets that can help you scan your results.

Instead of telling you more about these new features and how they work, I invite you to check out a few of them in this video.

Click to watch and learn more about a few of PubMed’s exciting features.

Video Transcript (below):

PubMed is the most heavily used biomedical citation database in the world, guiding over two and a half million users per day to the latest advances in life sciences research. We’re constantly improving PubMed to meet the needs of its diverse user base and to take advantage of ever-evolving internet technologies and standards.

The latest version of PubMed, released in May 2020, is the product of hundreds of hours of stakeholder engagement and research undertaken to give you a better experience.

And it’s not the first time we’ve made big changes.

From its humble beginnings in 1997, PubMed now comprises more than 30 million biomedical literature citations from MEDLINE, life science journals, and online books. These citations may include links to full-text content in PubMed Central and publisher websites to take you directly to the information you need.

To be sustainable going forward, the latest release of PubMed required major changes including new databases, web architecture and cloud delivery. Combined, these changes resulted in a much more resilient version of PubMed with a modern design that looks and works great on your desktop, your laptop, and your mobile device!

We realize this feels like a big change, but we’ve been working hard to help everyone make the transition to the new site and have continued to make improvements along the way.

Here are a couple of new and revamped features designed to improve the user experience.

The new Cite button makes it easy to retrieve styled citations you can copy and paste into a document or download an .nbib file to use with your reference manager software.

Using the Cite button for an item will open a pop-up window where you can copy the citation formatted in four popular styles.

Automatic Term Mapping, also called “ATM”, was present in the legacy PubMed, but it’s been expanded to include additional British and American spellings, singular and plural word forms, and other synonyms to provide more consistent and comprehensive search retrieval.

We’re always looking for ways to improve PubMed. Just as we’ve done for the past 24 years, we’ll continue to add features and data to stay current as technology, publishing standards, and our users’ needs evolve.

Please think about other ways that NLM can help you, and share your ideas with us.


As director of the Customer Services Division, Dr. Trawick works to connect customers with the vast information resources available from NLM’s National Center for Biotechnology Information. He has also worked to support the National Institutes of Health Public Access Policy since its establishment in 2005. Dr. Trawick is a graduate of Texas A&M University and the University of Texas Health Science Center at Houston.

A New Era of Health Communications

I’ve been reflecting on how communication has transformed our lives, particularly since the COVID-19 pandemic radically changed our ability to interact with others.

Before NLM’s physical workspace shifted to maximum telework, I was walking to work when I passed a strange sight — the last vestiges of pay phones on the National Institutes of Health campus! Those decommissioned pay phones got me thinking about how technology changes over time, how essential communication technology has become, and how NLM’s approach to providing trustworthy biomedical data and health information must evolve as methods of delivery change. As technology advances, we have more choices and greater sophistication in the methods we use to meet our responsibility to deliver biomedical data and health information, as well as in the tools we use to interrogate that information.

Payphones sit outside of the National Library of Medicine, having been removed from use in the building.

The Lister Hill National Center for Biomedical Communications (LHNCBC), now more than 50 years old, provides a case study of how NLM’s efforts to communicate information have been transformed.

LHNCBC was established by a joint congressional resolution in 1968 to stimulate the application of modern communications technologies to the challenges of delivering health information worldwide to support health care services and enhance medical education.

In that same decade, push-button telephone pads were replacing rotary dials, and the Trimline telephone, with the earpiece, mouthpiece, and dial pad in the handset, was introduced. The ARPANET, the early version of the packet-switching internet, appeared soon after. Just as the Trimline phone presaged the design of mobile phones, the early design of LHNCBC laid the foundation for robust innovation in the use of telecommunications tools, computer networks, and high-performance visualization to deliver health information and ensure its use.   

An intramural division of NLM, LHNCBC develops advanced health information resources and software tools that are widely used in biomedical research and by health information technology professionals, health care providers, and consumers. As it seeks to improve access to biomedical information for individuals around the world, LHNCBC conducts and supports research and development on the dissemination of high-quality imagery, medical language processing, high-speed access to biomedical information, intelligent database systems development, multimedia visualization, knowledge management, data mining, and machine-assisted indexing.

In 1994, it launched the Visible Human Project, a landmark accomplishment that made complete, anatomically detailed, three-dimensional representations of a human male body and a human female body publicly available.

Current LHNCBC researchers come from a variety of disciplines, including medicine, computer science, library and information science, linguistics, engineering, and education. The Biomedical Informatics Training Program brings together talented individuals to learn from and collaborate with research staff.

Research and development conducted by the interdisciplinary teams across LHNCBC has led to many advances in biomedical communication and information dissemination, such as:

  • Consumer Health Question Answering — This project involves research on both the automatic classification of customer requests and the automatic answering of consumer health questions.
  • Discoveries from Mimic II/III and Other Sources — This effort examines and attempts to validate controversial findings from smaller-scale clinical studies through the interrogation of de-identified medical records and information from health information exchanges. Researchers also conduct retrospective epidemiological studies in areas that lack clinical trials.
  • Open-i — This experimental multimedia search engine retrieves and displays bibliographic citations and their related images by linking to images based on image features.
  • Unified Medical Language System® (UMLS) — This tool integrates key health care terminologies, classifications, and coding systems used by clinicians, billing systems, insurance companies, and researchers. Sources developed include the Metathesaurus®, Semantic Network, and SPECIALIST Lexicon. The UMLS supports health care communication through interoperability, specifically, the mapping of key terms from one vocabulary system to another.
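
The vocabulary-to-vocabulary mapping mentioned in the last item can be pictured with a toy crosswalk. In this sketch each concept has one shared identifier and a code in each source vocabulary; the vocabulary names, codes, and identifiers are placeholders invented for illustration, not values from the UMLS, and the real Metathesaurus is far richer than a pair of dictionaries.

```python
# Each shared concept ID maps to that concept's code in every vocabulary.
# All identifiers below are invented placeholders.
CONCEPTS = {
    "CONCEPT-001": {"VOCAB_A": "A-100", "VOCAB_B": "B-7"},
    "CONCEPT-002": {"VOCAB_A": "A-200", "VOCAB_B": "B-9"},
}


def build_index(concepts):
    """Index (vocabulary, code) pairs back to their shared concept ID."""
    index = {}
    for concept_id, codes in concepts.items():
        for vocab, code in codes.items():
            index[(vocab, code)] = concept_id
    return index


def translate(code, source, target, concepts=CONCEPTS):
    """Map a code from the source vocabulary to the target vocabulary.

    Returns None when the code is unknown or has no target equivalent.
    """
    concept_id = build_index(concepts).get((source, code))
    if concept_id is None:
        return None
    return concepts[concept_id].get(target)


print(translate("A-100", "VOCAB_A", "VOCAB_B"))
print(translate("B-9", "VOCAB_B", "VOCAB_A"))
print(translate("A-999", "VOCAB_A", "VOCAB_B"))
```

Routing every translation through a shared concept identifier, rather than keeping a separate table for each pair of vocabularies, is the design choice that lets one system speak to many others.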

The changes to LHNCBC since its creation in 1968 parallel changes in telecommunications over the past 50 years. Early work at LHNCBC demonstrated how technological advances such as fiber optic networks and semiconductors could be put to best use by the health care sector. Today, LHNCBC continues to improve health through methodological advances in clinical data science and health informatics. We recognize that contemporary communication relies on interoperable data, scalable methods and translation of discovery into operations.  

As health care becomes more highly distributed and NLM resources are increasingly used by individuals around the world and beyond, LHNCBC will continue to be a partner in accelerating health communication.

What trends in health communication do you see ahead? How do you think COVID-19 will shape health communications?
