NIH Strategically, and Ethically, Building a Bridge to AI (Bridge2AI)

This piece was authored by staff across NIH who serve on the working group for the NIH Common Fund's Bridge2AI program, a new trans-NIH effort to harness the power of AI to propel biomedical and behavioral research forward.

The evolving field of Artificial Intelligence (AI) has the potential to revolutionize scientific discovery from bench to bedside. The understanding of human health and disease has vastly expanded as a result of research supported by the National Institutes of Health (NIH) and others. Every discovery and advance in contemporary medicine comes with a deluge of data. These large quantities of data, however, still offer only restricted, incomplete views into the natural processes underlying human health and disease. These complex processes play out across the “health-disease” spectrum, over temporal scales ranging from sub-seconds to years and biological scales ranging from atoms and molecules to cells, organ systems, individuals, and populations. AI provides computational and analytical tools with the potential to connect the dots across these scales and drive discovery and clinical utility from all of the available evidence.

A new NIH Common Fund program, Bridge to Artificial Intelligence (Bridge2AI), will tap into the power of AI to lead the way toward insights that can ultimately inform clinical decisions and individualize care. AI, which encompasses many methods, including modern machine learning (ML), offers potential solutions to many challenges in biomedical and behavioral research.

AI emerged in the 1960s and has evolved substantially in the past two decades in terms of its utility for biomedical research. The impact of AI on biomedical and behavioral research and clinical care derives from its ability to use computer algorithms to quickly find connections within large data sets and predict future outcomes. AI is already used to improve diagnostic accuracy, increase efficiency in workflow and clinical operations, and facilitate disease and therapeutic monitoring, to name a few applications. To date, the FDA has approved more than 100 AI-based medical products.

AI-assisted learning and discovery is only as good as the data used to train it. 

The use of AI/ML modeling in biomedical and behavioral research is limited by the availability of well-defined data to “train” AI algorithms to recognize patterns within the data. Existing biomedical and behavioral data sets rarely include all the necessary information, as they are typically collected from relatively small samples that lack the diversity of the U.S. population. Characterizing human health requires data from a variety of sources, such as -omics, imaging, behavioral measures, clinical indicators, electronic health records, wearable sensors, and population health summaries. The data generation process itself involves human assumptions, inferences, and biases that must be considered in developing ethical principles surrounding data collection and use. Standardizing collection processes is challenging and requires new approaches and methods. Comprehensive, systematically generated, and carefully collected data are critical to building AI models that provide actionable information and predictive power. Data generation remains among the greatest challenges that must be resolved for AI to have a real-world impact on medicine.

Bridge2AI is a bold new initiative at the National Institutes of Health designed to propel research forward by accelerating AI/ML solutions to complex biomedical and behavioral health challenges whose resolution lies far beyond human intuition. Bridge2AI will support the generation of new biomedically relevant data sets amenable to AI/ML analysis at scale; development of standards across multiple data sources and types; production of tools to accelerate the creation of FAIR (Findable, Accessible, Interoperable, Reusable) AI/ML-ready data; design of skills and workforce development materials and activities; and promotion of a culture of diversity and ethical inquiry throughout the data generation process.

Bridge2AI plans to support several Data Generation Projects and an Integration, Dissemination and Evaluation (BRIDGE) Center to develop best practices for the use of AI/ML in biomedical and behavioral research. For additional information, see NOT-OD-21-021 and NOT-OD-21-022. Keep up with the latest news by visiting the Bridge2AI website regularly and subscribing to the Bridge2AI listserv.

Top Row (left to right):
Patricia Flatley Brennan, RN, PhD, Director, National Library of Medicine
Michael F. Chiang, MD, Director, National Eye Institute
Eric Green, MD, PhD, Director, National Human Genome Research Institute
 
Bottom Row (left to right):
Helene Langevin, MD, Director, National Center for Complementary and Integrative Health
Bruce J. Tromberg, PhD, Director, National Institute of Biomedical Imaging and Bioengineering

AI is coming. Are the data ready?

The artificial intelligence (AI) revolution is upon us. You can barely read the paper, watch TV, or see a movie without encountering AI and how it promises to change society. In fact, last month, the President signed an executive order directing the US government to prioritize artificial intelligence in its research and development spending to help drive economic growth and benefit the American people.

Artificial intelligence refers to a suite of computer analysis methods—including machine learning, neural networks, deep learning models, and natural language processing—that can enable machines to function as if possessing human reasoning. With AI, computer systems ingest and analyze vast amounts of data and then “learn” through high-volume repetition how to do the task better and better, “reasoning” or “self-modifying” to improve the analytics that shape the outcome.
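
To make that “learn through high-volume repetition” idea concrete, here is a minimal sketch, in plain Python with made-up numbers, of a model repeatedly adjusting a single parameter to shrink its prediction error. It is a toy illustration of the loop, not any particular AI system.

    # Toy illustration of "learning through repetition":
    # fit y = w * x to a handful of observations by repeatedly
    # nudging w in the direction that reduces the prediction error.
    data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1)]  # (x, observed y), made up

    w = 0.0                 # the model's single adjustable parameter
    learning_rate = 0.01

    for step in range(1000):                 # high-volume repetition
        for x, y in data:
            error = w * x - y
            w -= learning_rate * error * x   # "self-modify" to shrink the error

    print(f"learned w is about {w:.2f}")     # close to 2, the trend in the toy data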

That learning process results in some pretty amazing stuff. In the health care field alone, AI can determine the presence or absence of abnormalities in clinical images, predict which patients are at risk for rare disorders, and detect irregular heartbeats.

To make all that happen requires data, massive amounts of data.

But like the computer-era quip, “garbage in, garbage out,” the data need to be good to yield valid analyses. What does “good” mean? Two things:

  • The data are accurate, truly representing the underlying phenomena.
  • The data are unbiased, i.e., the observations reflect the complete experience and no inherent errors were introduced anywhere along the chain from data capture to coding to processing.

As much as we’d like to think otherwise, we already know data are biased. Human genetic sequences drawn from studies of white males of Northern European descent do not adequately represent the genetic diversity of women or of people from other parts of the globe. Image data generated by different X-ray machines might show slight variations depending upon how the machines were calibrated. Recordings of electrical activity from neurological studies conducted as recently as a decade ago do not reflect the level of resolution possible today.
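
One simple way to surface the first kind of bias, unrepresentative sampling, is to compare who is in a data set against who is in the population it claims to describe. A minimal sketch, with made-up counts and reference proportions purely for illustration:

    from collections import Counter

    # Hypothetical ancestry labels for participants in a study cohort (made up).
    cohort = ["European"] * 820 + ["African"] * 60 + ["Asian"] * 70 + ["Other"] * 50

    # Hypothetical reference proportions for the population the study claims to represent.
    reference = {"European": 0.60, "African": 0.13, "Asian": 0.06, "Other": 0.21}

    counts = Counter(cohort)
    total = sum(counts.values())

    print(f"{'group':<10}{'cohort':>8}{'reference':>11}")
    for group, ref_share in reference.items():
        cohort_share = counts[group] / total
        flag = "  <- under-represented" if cohort_share < 0.5 * ref_share else ""
        print(f"{group:<10}{cohort_share:>8.1%}{ref_share:>11.1%}{flag}")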

So, what can we do?

It doesn’t make sense to throw out existing data and start anew, but it can be misleading to apply AI to data known to be biased. And it can be risky. Bias in underlying data can result in algorithms that propagate the same bias, leading to inaccurate findings.

That’s why NLM is working to develop computational approaches to account for bias in existing data sets and why we’re investing in this line of research. In fact, we’re actively encouraging grant applications focused on reducing or mitigating gaps and errors in health data research sets.

I have confidence that researchers will crack the puzzle, but until then, let’s look at how the business intelligence community is approaching the issue.

Concerned with reducing the effect of biases in management decision-making, business intelligence specialists have identified strategies to help uncover patterns and probabilities in data sets. They pair these patterns with AI algorithms to create calibration tools informed by human judgment while taking advantage of the algorithms’ power. That same approach might work with biomedical data.

In addition, our colleagues in business now approach data analysis in ways that help detect bias and limit its impact. They:

  • invest more human resources in interpreting the results of AI analytics, not relying exclusively on the algorithms;
  • challenge decision makers to consider plausible alternative explanations for the generated results; and
  • train decision makers to be skeptical and to anticipate aberrant findings.

There’s no reason we can’t adopt that approach in biomedical research.

So, as you read and think more about the potential of artificial intelligence, remember that AI applications are only as good as the data upon which they are trained and built. Remember, too, that the results of an AI-powered analysis should be only one factor in the final decision; they should not be the final arbiter of that decision. After all, the findings may sound good, but they may not be real, just an artifact of biased, imperfect data.

Models: The Third Leg in Data-Driven Discovery

Considering a library of models

George Box, a famous statistician, once remarked, “All models are wrong, but some are useful.”

As representations or approximations of real-world phenomena, models, when done well, can be very useful.  In fact, they serve as the third leg to the stool that is data-driven discovery, joining the published literature and its underlying data to give investigators the materials necessary to explore important dynamics in health and biomedicine.

By isolating and replicating key aspects within complex phenomena, models help us better understand what’s going on and how the pieces or processes fit together.

Because of the complexity within biomedicine, health care research must employ different kinds of models, depending on what’s being looked at.

Regardless of the type used, however, models take time to build, because the model builder must first understand the elements of the phenomena that must be represented. Only then can she select the appropriate modeling tools and build the model.

Tracking and storing models can help with that.

Not only would tracking models enable re-use—saving valuable time and money—but doing so would enhance the rigor and reproducibility of the research itself by giving scientists the ability to see and test the methodology behind the data.

Enter libraries.

As we’ve done for the literature, libraries can help document and preserve models and make them discoverable.

The first step in that is identifying and collecting useful models.

Second, we’d have to apply metadata to describe the models. Among the essential elements to include in such descriptions might be model type, purpose, key underlying assumptions, referent scale, and indicators of how and when the model was used.
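
As a rough sketch of what one such description might look like, here is a minimal metadata record in Python. The field names are illustrative only, not a proposed standard, and the example values are invented.

    from dataclasses import dataclass, field

    @dataclass
    class ModelRecord:
        """Illustrative metadata for one entry in a hypothetical library of models."""
        name: str
        model_type: str              # e.g., "statistical", "mechanistic", "agent-based"
        purpose: str                 # the question the model was built to answer
        key_assumptions: list[str]   # assumptions a re-user must accept or revisit
        referent_scale: str          # e.g., "molecular", "organ system", "population"
        used_in: list[str] = field(default_factory=list)  # studies that used the model

    example = ModelRecord(
        name="Toy outbreak model",
        model_type="compartmental / mechanistic",
        purpose="Explore how vaccination coverage changes epidemic size",
        key_assumptions=["homogeneous mixing", "fixed recovery rate"],
        referent_scale="population",
        used_in=["doi:10.xxxx/placeholder"],
    )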

[Screen capture: a current PubMed record with the DOI and RRIDs highlighted.]

We’d then need to apply one or more unique identifiers to help with curation. Currently, two different schemas provide principled ways to identify models: the Digital Object Identifier (DOI) and the Research Resource Identifier (RRID). The former provides a persistent, unique code to track an item or entity at an overarching level (e.g., an article or book). The latter documents the main resources used to produce the scientific findings in that article or book (e.g., antibodies, model organisms, computational models).

Just as clicking on an author’s name in PubMed can bring up all the articles he or she has written, these interoperable identifiers, once assigned to research models, make it possible to connect the studies employing those models.  Effectively, these identifiers can tie together the three components that underpin data-driven discovery—the literature, the supporting data, and the analytical tools—thus enhancing discoverability and streamlining scientific communication.
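
As one small illustration of what those connections could enable: because identifiers such as RRIDs appear in article records, a literature search service can be asked which papers mention a given resource. The sketch below uses NCBI's public E-utilities search endpoint and a placeholder identifier; it assumes the identifier string actually appears in the indexed records.

    import json
    from urllib.parse import urlencode
    from urllib.request import urlopen

    EUTILS_SEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

    def pubmed_ids_mentioning(identifier: str, retmax: int = 20) -> list[str]:
        """Return PubMed IDs for records whose text mentions the given identifier."""
        query = urlencode({"db": "pubmed", "term": identifier,
                           "retmode": "json", "retmax": retmax})
        with urlopen(f"{EUTILS_SEARCH}?{query}") as response:
            return json.load(response)["esearchresult"].get("idlist", [])

    # "RRID:SCR_000000" is a placeholder, not a real resource identifier.
    print(pubmed_ids_mentioning('"RRID:SCR_000000"'))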

NLM’s long-standing role in collecting, organizing, and making available the biomedical literature positions us well to take on the task of tracking research models, but is that something we should do?

If so, what might that library of models look like? What else should it include? And how useful would this library of models be to you?


The Rise of Computational Linguistics Geeks

Guest post by Dina Demner-Fushman, MD, PhD, staff scientist at NLM.

“So, what do you do for a living?”

It’s a natural dinner party question, but my answer can prompt that glazed-over look we all dread.

I am a computational linguist, also known (arguably) as a specialist in natural language processing (NLP), and I work at the National Library of Medicine.

If I strike the right tone of excitement and intrigue, I might buy myself a few minutes to explain.

My work combines computer science and linguistics, and since I focus on biomedical and clinical texts, it also requires adding some biological, medical, and clinical know-how to the mix.

I work specifically in biomedical natural language processing (BioNLP). The definition of BioNLP has varied over the years, with the spotlight shifting from one task to another—from text mining to literature-based discovery to pharmacovigilance, for example—but the core purpose has remained essentially unchanged: training computers to automatically understand natural language to speed discovery, whether in service of research, patient care, or public health.

The field has been around for a while. In 1969 NIH researchers Pratt and Pacak described the early hope for what we now call BioNLP in the paper, “Automated processing of medical English,” which they presented at a computational linguistics conference:

The development of a methodology for machine encoding of diagnostic statements into a file, and the capability to retrieve information meaningfully from [a] data file with a high degree of accuracy and completeness, is the first phase towards the objective of processing general medical text.

NLM became involved in the field shortly thereafter, first with the Unified Medical Language System (UMLS) and later with tools to support text processing, such as MetaMap and TextTool, all of which we’ve improved and refined over the years. The more recent Indexing Initiative combines these tools with other machine learning methods to automatically apply MeSH terms to PubMed journal articles. (A human checks the computer’s work, revising as needed.)
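
MetaMap itself involves far more linguistic machinery (and the full UMLS), but the core idea of mapping free text onto a controlled vocabulary can be sketched in a few lines. The tiny phrase dictionary below is an invented stand-in, not MeSH or the UMLS:

    # Toy vocabulary mapping: scan text for phrases that match entries in a
    # small controlled vocabulary. Real indexing adds linguistic analysis,
    # disambiguation, and ranking; this shows only the core idea.
    vocabulary = {
        "heart attack": "Myocardial Infarction (concept)",
        "myocardial infarction": "Myocardial Infarction (concept)",
        "aspirin": "Aspirin (concept)",
    }

    def map_concepts(text: str) -> set[str]:
        lowered = text.lower()
        return {concept for phrase, concept in vocabulary.items() if phrase in lowered}

    print(map_concepts("Patient was given aspirin after a suspected heart attack."))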

These and NLM’s other NLP developments help improve the Library’s services, but they are also freely shared with the world, broadening our impact and, more importantly, helping to handle the global proliferation of scientific and clinical text.

It’s that last piece that makes NLP so hot right now.

NLP, we’re finding, can take in large numbers of documents and locate relevant content, summarize text, apply appropriate descriptors, and even answer questions.

It’s every librarian’s—and every geek’s—dream.

But how can we use it?

Imagine, for example, the ever-expanding volume of health information around patients’ adverse reactions to medications. At least four different—and prolific—content streams feed into that pool of information:

  • the reactions reported in the literature, frequently in pre-market research (e.g., in the results of clinical trials);
  • the labeled reactions, i.e., the reactions described in the official drug labels provided by manufacturers;
  • the reactions noted in electronic health records and clinical progress notes; and
  • the reactions described by patients in social media.

NLM’s work in NLP—and its funding of extramural research in NLP—is helping develop approaches and resources for extracting and synthesizing adverse drug reactions from all four streams, giving a more complete picture of how people across the spectrum are responding to medications.

It’s a challenging task. Researchers must address different vocabularies and language structures to extract the information, but NLP, and my fellow computational linguists, will, I predict, prove up to it.
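
To give a flavor of what extraction from one of those streams involves, here is a deliberately simple sketch that pairs drug and reaction mentions co-occurring in a sentence. The tiny lexicons are made-up stand-ins for the real vocabularies, and real systems must also handle negation, context, and causality:

    import re

    # Tiny, made-up lexicons standing in for real drug and reaction vocabularies.
    DRUGS = {"metformin", "lisinopril"}
    REACTIONS = {"nausea", "dizziness", "rash"}

    def drug_reaction_pairs(text: str) -> list[tuple[str, str]]:
        """Return (drug, reaction) pairs that co-occur within a sentence."""
        pairs = []
        for sentence in re.split(r"(?<=[.!?])\s+", text.lower()):
            drugs = [d for d in DRUGS if d in sentence]
            reactions = [r for r in REACTIONS if r in sentence]
            pairs.extend((d, r) for d in drugs for r in reactions)
        return pairs

    note = "Patient reports nausea and mild dizziness since starting metformin."
    print(drug_reaction_pairs(note))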

Now imagine parents seeking health information regarding their sick child.

NLP can answer their question, first by understanding key elements in the incoming question and then by providing a response, either by drawing upon a database of known answers (e.g., FAQs maintained by the NIH institutes) or by summarizing relevant PubMed or MedlinePlus articles. Such quick access to accurate and trustworthy health information has the potential to save time and to save lives.
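
A bare-bones sketch of the first of those two routes, matching an incoming question against a small set of known answers, might look like the following. The FAQ entries and word-overlap scoring are invented placeholders; a real system would use much richer language understanding:

    # Toy retrieval-based question answering: pick the stored question that
    # shares the most (non-trivial) words with the incoming question.
    faq = {
        "What are the symptoms of strep throat in children?":
            "Common symptoms include sore throat, fever, and swollen lymph nodes ...",
        "How is a child's fever treated at home?":
            "Keep the child hydrated and ask a clinician about fever reducers ...",
    }

    STOPWORDS = {"what", "are", "the", "of", "in", "is", "a", "how", "my", "it", "and"}

    def tokens(text: str) -> set[str]:
        return {w.strip("?.,'") for w in text.lower().split()} - STOPWORDS

    def answer(question: str) -> str:
        best = max(faq, key=lambda q: len(tokens(question) & tokens(q)))
        return faq[best]

    print(answer("My child has a sore throat and fever. Is it strep?"))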

We’re not fully there yet, but as our research continues, we get closer.

Maybe it’s time I reconsider how I answer that perennial dinner party question: “I’m a computational linguist, and I help improve health.”

Dina Demner-Fushman, MD, PhD, is a staff scientist in NLM’s Lister Hill National Center for Biomedical Communications. She leads research in information retrieval and natural language processing focused on clinical decision-making, answering clinical and consumer health questions, and extracting information from clinical text.