Data Science Tools Will Speed Rare Disease Solutions

From the NLM Director: I’m happy to have my colleague, Joni Rutter, the Director of NCATS, bring forward these important issues to the Musings audience. NLM is a proud partner to NCATS in data science, effective use of genomic data, and providing accurate health information to the public.

Guest post by Joni L. Rutter, PhD, Director of the National Center for Advancing Translational Sciences (NCATS).

When it comes to rare diseases, the data are stark.

More than 10,000 rare diseases affect up to 400 million people worldwide, including over 30 million people in the United States. Those with rare diseases struggle for about six years on average before they receive an accurate diagnosis. Unfortunately, a diagnosis usually does not deliver a therapeutic answer: Only about 5% of rare diseases have treatments that are approved by the U.S. Food and Drug Administration. When you add up all the related direct and indirect costs, rare diseases carry a U.S. economic burden of nearly $1 trillion every year.

But these numbers also offer hope. Data-driven innovations are unlocking answers about rare diseases—as well as more common diseases—faster than ever before, and that’s why data science is so important to NCATS’ vision of more treatments for all people more quickly.

One of our key strategies is to leverage or connect existing data in new and meaningful ways. This year’s Rare Disease Day at NIH event highlighted several ways NCATS is applying this approach to help address the public health challenge of rare diseases.

Here’s a snapshot of key opportunities and the data-driven solutions we’re developing.

Raising Awareness and Educating 

The Genetic and Rare Diseases (GARD) Information Center uses data science to speed the translation of research findings into reliable, accurate, and understandable information that patients can use to learn about rare diseases and find other helpful resources. Each year, millions of people tap GARD’s online resources, and GARD Information Specialists answer thousands of individual questions on how to learn about a rare disease, find out more about specialists or clinical studies, and seek diagnostic help for themselves or their loved ones.

We’re in the process of modernizing the GARD website so it can pull information automatically from a range of trusted data sources, including NLM’s MedlinePlus, Unified Medical Language System, and NCBI MedGen. We’re also applying user experience and health literacy best practices to GARD’s website to better address patients’ and caregivers’ changing information needs.

Please check out our other user-friendly and helpful rare diseases community resources.

Shortening the Diagnostic Odyssey 

We are currently exploring the use of real-world data from electronic health records and other clinical data sources to shorten the journey to a correct rare disease diagnosis. This work is part of a broader research effort that focuses on using genetic analysis, machine learning, and clinical evaluation to make it easier for front-line health care providers to diagnose people with rare diseases correctly. Large data enclaves of integrated patient data, such as the NCATS-led National COVID Cohort Collaborative (N3C) model, have enormous promise to speed the identification of signs or signals of specific rare diseases.

Our study on rare disease medical costs also showed the potential of using machine-learning strategies to speed rare disease diagnoses from health care systems and insurance claims data. But we also need to make sure these artificial intelligence/machine learning (AI/ML) strategies are free from bias. To that end, we launched a new challenge to jump-start the development of AI/ML tools that detect and correct bias in health care algorithms, with the goal of improving clinician and patient trust in AI/ML-enabled clinical decision-making support tools.

Discovering and Developing New Drugs 

Many promising therapeutic candidates already exist; the challenge is finding them among vast and disparate data sets. To bridge data silos, NCATS has invested heavily in organizing, aggregating, and harmonizing high-quality data, and we make those data available openly and responsibly. We’re applying AI/ML to the task of therapeutic discovery and development through efforts like the Biomedical Data Translator and A Specialized Platform for Innovative Research Exploration (ASPIRE). I’m also excited to see how the winners of NCATS’ LitCoin Natural Language Processing (NLP) Challenge prize will use natural language processing to transform information across biomedical literature into new concepts and hypotheses to be tested.

What else can innovative data science tools deliver for people with rare diseases and others with unmet health needs? Please leave a comment with your thoughts!

Joni L. Rutter, PhD

Director, National Center for Advancing Translational Sciences

Dr. Rutter is responsible for the planning and execution of the complex and multifaceted NCATS preclinical and clinical programs. Her primary objective as Director of NCATS is to use translational science to provide all people with more treatments more quickly and efficiently. Before joining NCATS, she served as the Director of Scientific Programs within the NIH All of Us Research Program and, before that, as Director of the Division of Neuroscience and Behavior at the NIH National Institute on Drug Abuse.
