The Next Normal: Supporting Biomedical Discovery, Clinical Practice, and Self-Care

As we start year three of the COVID-19 pandemic, it’s time for NLM to take stock of the parts of our past that will support the next normal and what we might need to change as we continue to fulfill our mission to acquire, collect, preserve, and disseminate biomedical literature to the world.

Today, I invite you to join me in considering the assumptions and presumptions we made about how scientists, clinicians, librarians, and patients are using critical NLM resources and how we might need to update those assumptions to meet future needs. I will give you a hint: it’s not all bad. In fact, I find it quite exciting!

Let’s highlight some of our assumptions about how people are using our services, at least from my perspective. We anticipated the need for access to medical literature across the Network of the National Library of Medicine and created DOCLINE, an interlibrary loan request routing system that quickly and efficiently links participating libraries’ journal holdings. We also anticipated that we were preparing the literature and our genomic databases for humans to read and peruse. Now we’re finding that more than half of the accesses to NLM resources are generated and driven by computers through application programming interfaces. Even our MedlinePlus resource for patients now connects tailored electronic responses through MedlinePlus Connect to computer-generated queries originating in electronic health records.

Perhaps most importantly, we realize that while the information we present is sometimes read by a living person, at other times the information we provide (for example, about clinical trials in ClinicalTrials.gov or genotype and phenotype data in dbGaP) is processed by computers! Increasingly, we provide direct access to the raw, machine-readable versions of our resources so those versions can be entered into specialized analysis programs, which allow natural-language processing programs to find studies with similar findings or machine-learning models to determine the similarities between two gene sequences. For example, NLM makes it possible for advocacy groups to download study information from all ClinicalTrials.gov records so anyone can use their own programs to point out trials that may be of interest to their constituents or to compare summaries of research results for related studies.
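As a rough illustration of that kind of do-it-yourself screening, the sketch below filters a handful of study records for topics of interest. The record layout (`nct_id`, `title`, `conditions`), the sample entries, and the `find_trials` helper are all hypothetical stand-ins chosen for illustration; they are not the actual ClinicalTrials.gov download format.

```python
from typing import Dict, List

# Hypothetical, simplified study records. A real ClinicalTrials.gov
# download is far richer; these field names are assumptions.
SAMPLE_STUDIES: List[Dict] = [
    {"nct_id": "NCT-EXAMPLE-1",
     "title": "Exercise program for adults with type 2 diabetes",
     "conditions": ["Type 2 Diabetes"]},
    {"nct_id": "NCT-EXAMPLE-2",
     "title": "Inhaler technique coaching in asthma",
     "conditions": ["Asthma"]},
    {"nct_id": "NCT-EXAMPLE-3",
     "title": "Diet counseling during pregnancy",
     "conditions": ["Gestational Diabetes", "Pregnancy"]},
]


def find_trials(studies: List[Dict], keywords: List[str]) -> List[Dict]:
    """Return studies whose title or conditions mention any keyword.

    Matching is a plain case-insensitive substring check.
    """
    wanted = [k.lower() for k in keywords]
    matches = []
    for study in studies:
        # Combine the searchable fields into one lowercase string.
        text = " ".join([study["title"], *study["conditions"]]).lower()
        if any(k in text for k in wanted):
            matches.append(study)
    return matches


if __name__ == "__main__":
    for study in find_trials(SAMPLE_STUDIES, ["diabetes"]):
        print(study["nct_id"], "-", study["title"])
```

An advocacy group could swap the sample list for records parsed from a real download and adjust the field names to match; substring matching is only a starting point and would miss synonyms or related conditions.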

Machine learning and artificial intelligence have progressed to the point that they perform reasonably well in connecting similar articles. To this end, our LitCovid open-resource literature hub has served as an electronic companion to the human curation of coronavirus literature. NLM’s LitCovid makes curation more efficient, and its sophisticated search function creates more relevant pathways to the articles that fulfill the needs of our users. Most importantly, innovations such as LitCovid help our users manage the vast and ever-growing collection of biomedical literature, now numbering more than 34 million citations in NLM’s PubMed, the most heavily used biomedical literature citation database.

Partnerships are a critical asset to bring biomedical knowledge into the hands (and eyes) of those who need it. Over the last decade, NLM moved toward a new model for managing citation data in PubMed. We released the PubMed Data Management system that allows publishers to quickly update or correct nearly all elements of their citations and that accelerates the delivery of correct and complete citation data to PubMed users.

As part of the MEDLINE 2022 Initiative, NLM transitioned to automated Medical Subject Headings (MeSH) indexing of MEDLINE citations in PubMed. Automated MeSH indexing significantly decreases the time for indexed citations to appear in PubMed without sacrificing the quality MEDLINE is known to provide. Our human indexers can focus their expertise on curation efforts to validate assigned MeSH terms, thereby continuously improving the automated indexing algorithm and enhancing discoverability of gene and chemical information in the future.

We’re already preparing for the next normal—what do you think it will be like?

I envision making our vast resources increasingly available to those who need them and forging stronger partnerships that improve users’ ability to acquire and understand knowledge. Imagine a service, designed and run by patients, that could pull and synthesize the latest information about a disease or recommendations for managing a clinical issue, or one that could help a young investigator better pinpoint areas ripe for new interrogation! The next normal will make the best use of human judgment and creativity by selecting and organizing relevant data to create a story that forms the foundation of new inquiry or the basis of new clinical care. Come along and help us co-create the next normal!

Informing Success from the Outside In: Introducing the NLM Board of Regents CGR Working Group

Guest post by Valerie Schneider, PhD, staff scientist at the National Library of Medicine (NLM) National Center for Biotechnology Information (NCBI), National Institutes of Health (NIH), and Kristi Holmes, PhD, Director of Galter Health Sciences Library & Learning Center and Professor of Preventive Medicine at Northwestern University Feinberg School of Medicine.

Last year, in two blog posts, we described how NLM is developing the NIH Comparative Genomics Resource (CGR), a project that offers content, tools, and interfaces for genomic data resources associated with eukaryotic research organisms.

Eukaryote refers to any single-celled or multicellular organism whose cells contain a distinct, membrane-bound nucleus. Since all eukaryotes likely evolved from a common ancestor, studying them can grant us insight into how other eukaryotes, including humans, work, which makes CGR and its resources that much more important to eukaryotic research.

CGR aims to:

  • Promote high-quality eukaryotic genomic data submission.
  • Enrich NLM’s genomic-related content with community-sourced content.
  • Facilitate comparative biological analyses.
  • Support the development of the next generation of scientists.

Since our last two posts, the team at NCBI has been hard at work making important technical and content updates to CGR’s suite of tools and socializing those tools. For instance, they published new webpages that organize genome-related data by taxonomy, making it available for browsing and immediate download. They also created the ClusteredNR Database, a new database for the Basic Local Alignment Search Tool (BLAST), to provide results with greater taxonomic context for sequence searches, and incorporated new gene information from the Alliance of Genome Resources, an organization that unites data and information for model organisms’ unique aspects, into Gene. NCBI is also engaging with genomics communities to understand their needs and requirements for comparative genomics through the NLM Board of Regents Comparative Genomics Working Group.

The working group is lending their perspective and extensive expertise to the project, activities that are essential to CGR’s success and development. We have charged working group members with guiding the development of a new approach to scientific discovery that relies on genomic-related data from research organisms, helping project teams keep pace with changes in the field, and understanding the scientific community’s needs and expectations for key functionalities. To do this, working group members help NLM set development priorities such as exploring CGR’s integration with existing infrastructures and related workforce development opportunities.

Projects like CGR highlight how critical interdisciplinary collaboration is to modern research and how success requires community perspectives and involvement. Working group members will be sharing more information about this project at upcoming conferences and in biomedical literature, and our team at NCBI will also share events and resources through our NIH Comparative Genomics Resource website.

If you are a member of a model organism community, are working on emerging eukaryotic research models, or support eukaryotic genomic data—whether you are a researcher, educator, student, scholarly society member, librarian, data scientist, database resource manager, developer, epidemiologist, or other stakeholder in our progress—we encourage you to reach out and get involved. Here are a few suggestions:

  • Invite us to join you at a conference, teach a workshop, partner on a webinar, or discuss other ideas you may have to foster information sharing and feedback.
  • Use and share CGR’s suite of tools and share your feedback.
  • Be on the lookout for project updates and events on the CGR website or follow @NCBI on Twitter.

We’re always excited to get feedback through CGR listening sessions and user testing for tool and resource updates. Email cgr@nlm.nih.gov to learn all the ways you can participate.

Thank you to the members of the NLM Board of Regents CGR Working Group!

Alejandro Sanchez Alvarado, PhD
Executive Director and Chief Scientific Officer
Priscilla Wood Neaves Chair in the Biomedical Sciences
Stowers Institute for Medical Research

Hannah Carey, PhD
Professor, Department of Comparative Biosciences, School of Veterinary Medicine
University of Wisconsin-Madison

Wayne Frankel, PhD
Professor, Department of Genetics & Development
Director of Preclinical Models, Institute of Genomic Medicine
Columbia University Medical Center

Kristi L. Holmes, PhD (Chair)
Director, Galter Health Sciences Library & Learning Center
Professor of Preventive Medicine (Health & Biomedical Informatics)
Northwestern University Feinberg School of Medicine

Ani W. Manichaikul, PhD
Associate Professor, Center for Public Health Genomics
University of Virginia School of Medicine

Len Pennacchio, PhD
Senior Scientist
Lawrence Berkeley National Laboratory

Valerie Schneider, PhD (Executive Secretary)
Program Head, Sequence Enhancements, Tools and Delivery (SeqPlus)
HHS/NIH/NLM/NCBI

Kenneth Stuart, PhD
Professor, Center for Global Infectious Disease Research
Seattle Children’s Research Institute

Tandy Warnow, PhD
Grainger Distinguished Chair in Engineering
Associate Head of Computer Science
University of Illinois Urbana-Champaign

Rick Woychik, PhD (NIH CGR Steering Committee Liaison)
Director, National Institute of Environmental Health Sciences (NIEHS) and the National Toxicology Program (NTP)

Cathy Wu, PhD
Unidel Edward G. Jefferson Chair in Engineering and Computer Science
Director, Center for Bioinformatics & Computational Biology
Director, Data Science Institute
University of Delaware

Dr. Schneider is the deputy director of Sequence Offerings and the head of the Sequence Plus program. In these roles, she coordinates efforts associated with the curation, enhancement, and organization of sequence data, as well as oversees tools and resources that enable the public to access, analyze, and visualize biomedical data. She also manages NCBI’s involvement in the Genome Reference Consortium, which is the international collaboration tasked with maintaining the value of the human reference genome assembly.

Dr. Holmes is dedicated to empowering discovery and equitable access to knowledge through the development of computational and social architectures to support these goals. She also serves on the leadership team of the Northwestern University Clinical and Translational Sciences Institute.

Bridging the Resource Divide for Artificial Intelligence Research

This blog post is by Lynne Parker, Director, National AI Initiative Office and was originally posted on the White House Office of Science and Technology Policy blog. The Office of Science and Technology Policy and the National Science Foundation are seeking comments on the initial findings and recommendations contained in the interim report of the National Artificial Intelligence Research Resource (NAIRR) Task Force (“Task Force”) and particularly on potential approaches to implement those recommendations. We encourage you to read the RFI and submit comments on Implementing Initial Findings and Recommendations of the National Artificial Intelligence Research Resource Task Force by June 30, 2022.

Artificial Intelligence (AI) is transforming our world. The field is an engine of innovation that is already driving scientific discovery, economic growth, and new jobs. AI is an integral component of solutions ranging from those that tackle routine daily tasks to societal-level challenges, while also giving rise to new challenges necessitating further study and action. Most Americans already interact with AI-based systems on a daily basis, such as those that help us find the best routes to work and school, select the items we buy, and ask our phones to remind us of upcoming appointments.

Once studied by few, AI courses are now among the most popular across America’s universities. AI-based companies are being founded and scaled at a rapid rate. Worldwide AI-related research publications and patent applications continue to climb. 

However, this growth in the importance of AI to our future and the size of the AI community obscures the reality that the pathways to participate in AI research and development (R&D) often remain limited to those with access to certain essential resources. Progress at the current frontiers of AI is often tied to the use of large volumes of advanced computational power and data, and access to those resources today is too often limited to large technology companies and well-resourced universities. Consequently, the breadth of ideas and perspectives incorporated into AI innovations can be limited and lead to the creation of systems that perpetuate biases and other systemic inequalities.

This growing resource divide has the potential to adversely skew our AI research ecosystem, and in the process, threaten our Nation’s ability to cultivate an AI research community and workforce that reflects America’s rich diversity – and harness AI in a manner that serves all Americans. To prevent unintended consequences or disparate impacts from the use of AI, it matters who is doing the AI research and development.

Established in June 2021 pursuant to the National AI Initiative Act of 2020, the National AI Research Resource (NAIRR) Task Force has been seeking to address this resource divide. As a Congressionally-chartered Federal advisory committee, the NAIRR Task Force has been developing a plan for the establishment of a National AI Research Resource that would democratize access to AI R&D for America’s researchers and students. The NAIRR is envisioned as a broadly available and federated collection of resources, including computational infrastructure, public- and private-sector data, and testbeds. These resources would be made easily accessible in a manner that protects privacy, with accompanying educational tools and user support to facilitate their use. An important element of the NAIRR will be the expertise to design, deploy, federate, and operate these resources.

Since its establishment, the Task Force has held 7 public meetings, engaged with 39 experts on a wide range of aspects related to the design of the NAIRR, and considered 84 responses from the public to a request for information (RFI). Materials from all public meetings and responses to the RFI can be found at www.AI.gov/nairrtf.

Today, as co-chair of the Task Force and as part of OSTP’s broader work to advance the responsible research, development, and use of AI, I am proud to announce the submission of the interim report of the NAIRR Task Force to the President and Congress. This report lays out a vision for how this national cyberinfrastructure could be structured, designed, operated, and governed to meet the needs of America’s research community. In the report, the Task Force presents an approach to establishing the NAIRR that builds on existing and future Federal investments; designs in protections for privacy, civil rights, and civil liberties; and promotes diversity and equitable access. It details how the NAIRR should support the full spectrum of AI research – from foundational to use-inspired to translational – by providing opportunities for students and researchers to access resources that would otherwise be out of their reach. The vision laid out in this interim report is the first step towards a more equitable future for AI R&D in America – a future where innovation can flourish and the promise of AI can be realized in a way that works for all Americans.

Going forward, the Task Force will develop a roadmap for achieving the vision defined in the interim report. This implementation roadmap is planned for release as the final report of the Task Force at the end of this year. To inform this work, we are asking for feedback from the public on the findings and recommendations presented in the interim report as well as how those recommendations could be effectively implemented. Public responses to this request for information will be accepted through June 30, 2022. In addition, OSTP and the National Science Foundation will host a public listening session on June 23 to provide additional means for public input. Please see here for more information on how to participate.

If successful, the NAIRR would transform the U.S. national AI research ecosystem by strengthening and democratizing foundational, use-inspired, and translational AI R&D in the United States. The interim report of the NAIRR Task Force being released today represents a first step towards this future, putting forward a vision for the NAIRR for public comment and feedback.

A New Frontier: The Impact of a 1959 Board Meeting

Guest blog by Ken Koyle, MA, Deputy Chief of the History of Medicine Division (HMD) at the NIH National Library of Medicine. This post celebrates the important work performed by our archival professionals and the archival collections held by the library, from which the source material was drawn, as NLM celebrates International Archives Week #IAW2022.

In November 1959, when construction of NLM’s current building at NIH was still underway and digital computing was in its infancy, the NLM Board of Regents convened on the third floor of the Old Red Brick building for a demonstration of the indexing process. When Board Chairman Michael E. DeBakey, MD, asked if computer technology could be used in indexing, NLM Director Col. Frank B. Rogers, MD, was ready with an answer. Dr. Rogers, clearly interested in the emerging technology of automated data processing (ADP), described an article by Robert S. Ledley, DDS, in that month’s issue of Science and noted that Dr. Ledley was already contracted with NLM to report on using computers in indexing.

Photo: Dr. Frank Rogers at NLM, 1962.

Dr. Rogers was instrumental in NLM’s first explorations of automated processes and had a clear vision of the potential of electronic computing, including how it could improve efficiency at NLM, but his optimism was tempered by prescient realism. Dr. Rogers recognized—and conveyed to the Board—that the potential benefits of ADP would require a commensurate investment of staff time and labor. “We should not forget that ‘automatically’ means ‘because we told it to do so beforehand,’ and this in itself may turn out to be quite a trick.” Dr. Rogers made it clear that the computer age would bring a change in work, but not necessarily a reduction in work. “Remarkable as the capacity of the computer may be for sustaining a long sequence of operations, it is nevertheless ultimately only the end-phase of that still longer sequence which must include as a first phase the human labor of input.”

Acknowledging the upfront labor investment in ADP was only part of Dr. Rogers’ insight. He also explained that the human work was not only substantial and necessary, but also incredibly complex: “The instructions [for a computer] are a thousand times more detailed, for the simplest task, than those required to be given to the . . . clerk.” Unleashing computers’ potential would require staff to think in new ways, conceive new methods of organizing data, and embark on a new journey of continuous learning and professional development.

Photo: Dr. Frank Rogers (far right) with the NLM Board of Regents meeting in the “Old Red Brick,” 1957.

Along with the challenges of training staff to work with ADP equipment came the interminable problem of cost. Much as today’s public institutions are grappling with the costs of cloud computing, digitization, and increasing storage requirements, Dr. Rogers had to balance the potential benefits with the considerable costs of computer equipment. The type of computer necessary to realize Dr. Rogers’ vision would cost about $1.5 million in 1960—98% of NLM’s total budget of $1,566,000.

Undeterred, Dr. Rogers found an answer to the funding problem by collaborating with another agency that would benefit from the increased processing speed of scientific literature that the envisioned system could provide: the National Heart Institute. They provided the initial funding, NLM did the legwork, and in 1963, the new MEDLARS computer went into service. Dr. Rogers had realized his vision of bringing automated indexing to NLM. As Surgeon General Luther Terry said at the Board meeting in April 1961, “If any institution ever stood on the borderland of a new frontier it is the National Library of Medicine.”

Photo: Computer operators working with the Honeywell 800 mainframe computer, originally acquired by NLM in the 1960s.

Dr. Rogers was very clear about the issues of cost, labor, and expectations in his 1960 presentation to the Board, including his overarching concern about balancing NLM’s core mission with these potential new directions:

[The] purpose of the Library is not to operate a particular machine system, however great an acrobatic achievement that might be in itself. It is not to publish and distribute a particular index in a particular way, however ingenious and successful that operation may be deemed to be. It is not even just to be a good library, however great and distinguished that library may be. It is rather, by virtue of being a library, to use every available bibliothecal means to promote awareness of and access to the subject content of recorded medical knowledge, to the end that the science of medicine will advance and prosper.

More than 60 years later, NLM still holds fast to that purpose. As stated in our statutory mission and reiterated in our current strategic plan, we are here “to assist the advancement of medical and related sciences and to aid in the dissemination and exchange of scientific and other information important to the progress of medicine and to the public health.” Our continued pioneering work in data science is just one way we accomplish that mission.

Mr. Koyle joined HMD in the NLM Division of Library Operations in 2012. Before joining NLM, Ken served as a medical evacuation helicopter pilot and a historian in the U.S. Army. He is the co-editor with Jeffrey Reznick of Images of America: U.S. National Library of Medicine, a collaborative work with HMD staff.
