Guest post by Dianne Babski, Associate Director for Library Operations at NLM; Zhiyong Lu, PhD, Senior Investigator for the Computational Biology Branch at NLM’s National Center for Biotechnology Information (NCBI); and Donald Comeau, PhD, Staff Scientist at NCBI.
In January 2021, the Department of Health and Human Services (HHS) released its Artificial Intelligence (AI) Strategy to help agencies best use AI to advance the health and wellbeing of all Americans. NIH has long collaborated on and invested in AI-based projects to discover health solutions across research and medical settings, including the analysis of biomedical imaging to diagnose diseases such as COVID-19.
For many years, NLM has been enthusiastic about the promise and possibilities of AI. You can learn more about NLM’s awareness of and use of AI through some of our recent Musings posts, including: Artificial Intelligence, Imaging, and the Promising Future of Medicine; How NIH is Using Artificial Intelligence to Improve Operations; and NIH Strategically, and Ethically Building a Bridge to AI.
In support of the HHS AI Strategy, we’d like to share a few examples of how NLM is using AI to revolutionize our products and services to enhance usability and discovery of biomedical information.
Best Match is a relevance search algorithm for NLM’s PubMed, a free search engine for biomedical literature accessed by millions of users around the world every day. This AI technique leverages the intelligence of our users and cutting-edge machine-learning technology as an alternative to the date sort order that appears in many traditional search engines. Trained on past user searches and dozens of relevance-ranking factors, the Best Match algorithm demonstrates state-of-the-art retrieval performance and an improved user experience. Best Match increases the effectiveness of PubMed searches across the rapidly growing collection of biomedical literature to help users efficiently find the most relevant and high-quality information they need.
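To make the contrast with date sorting concrete, here is a minimal sketch. It is not NLM's implementation: the factor names, the weights, and the document fields are all hypothetical, and in a learned ranker the weights would be fit from past user searches rather than written by hand.

```python
from datetime import date

# Hypothetical relevance factors and weights (assumptions for illustration only).
# A trained ranker would learn these from past user searches.
WEIGHTS = {"term_match": 2.0, "recency": 0.5, "past_clicks": 1.0}


def relevance_score(doc):
    """Combine a document's factor values into one weighted score."""
    return sum(WEIGHTS[name] * doc["factors"][name] for name in WEIGHTS)


def sort_by_date(docs):
    """Date sort: newest publication first."""
    return sorted(docs, key=lambda d: d["published"], reverse=True)


def sort_by_relevance(docs):
    """Relevance sort: highest combined score first."""
    return sorted(docs, key=relevance_score, reverse=True)
```

With these two functions, a recent but loosely related article leads the date-sorted list, while an older article that matches the query closely leads the relevance-sorted list.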
SingleCite is another automated search algorithm designed to improve single citation searches in the PubMed database. It predicts the probability of a retrieved document being the target of a query based on predefined variables. This helps increase the effectiveness of PubMed searches by making a user’s search for a specific document in PubMed more successful.
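The idea of predicting a probability from predefined variables can be sketched with a simple logistic model. This is an illustration under stated assumptions, not SingleCite itself: the three variables, their weights, and the bias below are invented for the example, since this post does not list the actual ones.

```python
import math

# Hypothetical variables and coefficients (assumptions for illustration only).
WEIGHTS = {"title_overlap": 4.0, "author_match": 1.5, "year_match": 1.0}
BIAS = -3.0


def features(query, doc):
    """Compute the predefined variables for one query/document pair."""
    q_tokens = set(query.lower().split())
    title_tokens = set(doc["title"].lower().split())
    overlap = len(q_tokens & title_tokens) / len(title_tokens) if title_tokens else 0.0
    author_match = 1.0 if any(a.lower() in q_tokens for a in doc["authors"]) else 0.0
    year_match = 1.0 if str(doc["year"]) in q_tokens else 0.0
    return {"title_overlap": overlap, "author_match": author_match, "year_match": year_match}


def target_probability(query, doc):
    """Logistic model: probability that doc is the citation the query is looking for."""
    f = features(query, doc)
    z = BIAS + sum(WEIGHTS[name] * f[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def best_candidate(query, docs):
    """Return the retrieved document with the highest target probability."""
    return max(docs, key=lambda d: target_probability(query, d))
```

A query that mixes title words, an author surname, and a year would then favor the one retrieved document that agrees on all three.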
Computed Author is a machine-learning method that addresses irrelevant retrieval results in PubMed caused by author name ambiguity (where different authors share the same name). When users search based on author names, Computed Author uses an algorithm to sort out papers by different authors who share the same name, and clusters at the top of the results the articles that are likely by the same author. Again, the result is increased effectiveness of PubMed searches, supporting NLM’s mission to advance health research and discovery.
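One simple way to picture grouping same-name papers by likely author is to cluster on shared coauthors. The sketch below is a stand-in technique chosen for illustration, not the Computed Author method: the `coauthors` field, the Jaccard similarity measure, and the 0.3 threshold are all assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of names."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_papers(papers, threshold=0.3):
    """Greedily group papers whose coauthor lists overlap enough.

    Each paper joins the first cluster that contains a sufficiently
    similar paper; otherwise it starts a new cluster.
    """
    clusters = []
    for paper in papers:
        for cluster in clusters:
            if any(jaccard(paper["coauthors"], other["coauthors"]) >= threshold
                   for other in cluster):
                cluster.append(paper)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([paper])
    return clusters
```

Papers that land in the same cluster could then be shown together at the top of an author search, with the remaining same-name papers listed after them.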
Fully automated Medical Subject Headings (MeSH) indexing in NLM’s flagship bibliographic database, MEDLINE, is one of our most recent AI advancements. Automated indexing has been under development at NLM for many years, and the most significant outcome is the NLM Medical Text Indexer (MTI). The MTI algorithm has been undergoing refinements as we move towards automation, including the incorporation of deep learning approaches to improve the application of MeSH subheadings, the incorporation of rules and triggers for the indexing of publication types, and the application of the Indexing Method designation. Automated indexing greatly shortens the time before MeSH indexing metadata can be accessed and allows NLM to scale MeSH indexing for MEDLINE to the volume of published biomedical literature.
Gene and chemical indexing are also part of automated MEDLINE indexing efforts to improve literature retrieval and information access. Currently, gene and chemical indexing is performed manually by expert indexers. To assist this process, we are using advanced natural language processing and deep learning methods to develop NLM-Gene and NLM-Chem, automatic tools for finding gene and chemical names in the biomedical literature.
We are very excited about our efforts to leverage AI and the advances we have made. Looking forward, we will strive to lead AI innovation in partnership with HHS, NIH, and the broader research community to ensure that we continue to meet our mission to accelerate biomedical discovery and data-powered health.
Do you have ideas for how we can harness AI to improve our projects? What are some of the ways you are using AI to improve products and services?
Dianne Babski is responsible for overall management of one of NLM’s largest divisions, with more than 450 staff who provide health information services to a global audience of health care professionals, researchers, administrators, students, historians, patients, and the public. She oversees budget, facilities, administration, and operations, including the operations of a national network of more than 8,000 academic health science libraries, hospital and public libraries, and community organizations working to improve access to health information.
Zhiyong Lu is a senior investigator (tenured) at the NLM Intramural Research Program, and he leads research in biomedical information retrieval, natural language processing, and machine learning. As NCBI’s Deputy Director for Literature Search, Dr. Lu directs the research and development efforts to improve PubMed search and information access. Over the years, Dr. Lu has co-authored around 300 scientific publications and mentored 40 trainees, many of whom have gone on to independent faculty/research positions. Dr. Lu is a fellow of the American College of Medical Informatics.
Donald Comeau is a Staff Scientist working in Dr. Lu’s Text Mining Research group at the NLM’s NCBI. His primary responsibilities include identifying key phrases in PMC articles and NCBI Bookshelf. His research projects focus on applying text mining, machine learning, and natural language processing techniques to improve access to NCBI’s biomedical literature collections. Dr. Comeau earned his PhD in Physical Chemistry at Ohio State University.