NLM . . . Bridging the Gap between COVID-19 Data and Resources

Guest post by Stephen Sherry, PhD, Acting Director of the National Library of Medicine’s National Center for Biotechnology Information (NCBI), and Bart Trawick, PhD, Director of the NCBI Customer Services Division.

A little over two years ago, America woke to the emerging SARS-CoV-2 pandemic that would alter everyone’s perception of ‘normal’ in the months and years to follow. From the start, NLM’s technological infrastructure quickly bridged the gap between resources and action to support efforts to study, understand, and develop a plan of action to deal with the deadly virus that has claimed the lives of more than 6 million people worldwide according to the World Health Organization COVID-19 dashboard. NLM has worked throughout the pandemic to provide timely data and develop digital resources to help combat this global crisis.

NLM’s COVID-19 Resources

To assist in understanding the most fundamental questions surrounding the virus, NLM provides coronavirus-related tools and services centered around genetic information, literature, and clinical research protocols.

SARS-CoV-2 Gene Sequences in GenBank

In January 2020, NLM released the first complete severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome to the public through our GenBank database, the world’s largest database of publicly available genetic sequences. Releasing a fully annotated viral genome within a month of its detection in a human population is a significant accomplishment for the scientific community, and NLM is proud to have played a role in it. This information was essential to scientists, because only with the genetic sequence could they begin to determine the specific properties of the virus and its evolutionary relationship to other viruses.

Creating a Lens to Information

An immediate challenge during the early months of the pandemic was the lack of a standardized vocabulary for talking about the virus and the disease it caused, COVID-19. This made searching the biomedical literature for coronavirus information difficult. To address this issue, NLM developed the LitCovid tool, which provided researchers with a current and curated literature hub for coronavirus information. Its handy classification system makes it easy for users to find the type of information they are interested in (e.g., “Mechanism” or “Treatment”). To further expand the application of this tool, NLM incorporated it into the PubMed Clinical Queries search interface.

Bringing COVID-19 Information to Your Fingertips

One of NLM’s biggest challenges amid the demand for COVID-19 information has been organizing all the data, tools, and resources related to SARS-CoV-2 and COVID-19. To date, there are over 3 million SARS-CoV-2–related sequencing experiments in NLM’s Sequence Read Archive (SRA), the world’s largest publicly available repository of high-throughput genetic sequencing data; 4 million SARS-CoV-2 sequences in GenBank; more than 7,500 registered coronavirus clinical trials in NLM’s ClinicalTrials.gov, the world’s largest clinical trial registry and results database; and 285,000 COVID-19 articles in PubMed Central (PMC), NLM’s digital archive of nearly 8 million freely accessible, full-text biomedical and life sciences journal articles. To organize this wealth of information and make it findable, we created the NCBI SARS-CoV-2 resources page and regularly update the NLM homepage with news and information to guide users to relevant information.

Information When You Need It Most

Patients and their families also need access to up-to-date, reliable information, and our MedlinePlus web resource added a COVID-19 page to address this. For people seeking information on clinical studies related to COVID-19, our ClinicalTrials.gov resource provides it, along with a dedicated page that breaks down all COVID-19 studies by funding source.

NLM Answers the Call for Access to COVID-19 Publications

NLM answered the call from the White House Office of Science and Technology Policy and science policy leaders of other nations by collaborating with publishers and scholarly societies to provide free and immediate public access to all coronavirus-related publications and associated data via PubMed Central (PMC) as part of its Public Health Emergency COVID-19 Initiative.

This initiative also enabled artificial intelligence researchers to contribute to the COVID-19 response effort by making more than 200,000 full-text articles available in formats and under license terms that enabled computational analysis as part of efforts such as the COVID-19 Open Research Dataset (CORD-19) Challenge and Text Retrieval Conference (TREC) COVID Challenge. These global challenges aimed to improve and apply natural language processing and other AI techniques to coronavirus literature in an effort to generate new insights into the disease.

Reaching More People in More Ways

As the Nation’s archive for biotechnology information, we rely upon scientists to freely share data with us so that others may benefit. To assist submitters in getting this data into our sequence archives, we quickly worked to automate and simplify submissions, expedite data analysis, and prioritize data release to within minutes of submission. NLM investments ensured that database management work continues to scale in tandem with increased submission rates.

None of this work would have been possible without the support of NIH leadership and the efforts of NLM staff who accomplished great feats during a time of transition to remote work. We will apply the lessons learned during this period to serve the continuing needs of our community as the biology continues to evolve around us.

As NCBI Acting Director, Dr. Sherry plans, directs, and manages the research, development, and technical operations of the National Center for Biotechnology Information. He has 25 years of experience performing research, education, and data resource management involving human variation, genetics, and genomics. Dr. Sherry is a graduate of the Pennsylvania State University.

As Director of the NCBI Customer Services Division, Dr. Trawick works to connect customers with the vast information resources available from NCBI. Dr. Trawick is a graduate of Texas A&M University and the University of Texas Health Science Center at Houston.
