Do You Play Word Games?

There is an astoundingly popular word game in which the player gets six tries to guess the word of the day, which has been pre-drawn from a list of five-letter words. The only skills one needs are the ability to recognize the alphabet and basic English-language spelling ability. My sisters and I play every day and compare how many tries it takes each of us to come up with the answer. It’s fun, challenging, and easy at the same time, and it gives us a quick way to share time together.

Today’s answer got me thinking (no spoiler alerts here!), what words describe NLM, its mission, and its impact? Let me share a few with you:

HEALTH

Health, that state of optimal well-being for all, is the North Star of all we do here at the National Institutes of Health. NIH’s motto is “Turning Discovery Into Health,” and NLM’s job is to turn information into discovery. The literature collected by NLM provides rich descriptions that help scientists and clinicians understand health and illness, discover new therapies, and relay patients’ experience. In fact, NLM has played an important role in almost every biomedical and clinical discovery of the past 50 years, each of which fosters the world’s understanding of health.

TRUST

The cornerstone of our great national library is the provision of high-quality, trusted resources to the scientific community and general public. We imbue trust in our resources by following important principles of libraries, including collecting widely from literature resources recognized for meeting standards of scientific communication. We provide documentation and publicly available standard practices and policies. Our work is overseen by an NLM Board of Regents as well as by NIH leadership. These checks and balances help us accommodate a body of scientific resources that is congruent with the scientific and clinical knowledge at the time the resources are collected and reflective of diverse viewpoints and knowledge maturation.

SERVE

NLM serves science and society by collecting, curating, and connecting all types of scientific communication artifacts and making these accessible to the public. Our biomedical literature resources are open to the world, presenting almost 35 million citations, close to 8.5 million machine- and human-readable full-text articles, and over 1,000 consumer-level health information topics. We provide specialized genomic data resources that help scientists discover the origins of life for many species. By linking the genomic data with the literature, NLM can help clinicians make decisions about how to treat complex illnesses that arise from genetic anomalies.

ALIGN

Biologists use laboratory procedures to distill the genetic material out of samples collected from humans, animals, and other types of matter like wastewater and then compare the sampled genetic material to other known records of genetic materials. Using this process, scientists align and compare one set of proteins gleaned from their experiments to others stored in our genomic repositories to detect genetic anomalies or determine if a discovered sequence is actually a new organism or a variant of a known species. Researchers “align” this newly acquired genetic structure with known structures. But we have millions of records of genetic samples, so this process can be time-consuming. However, NLM has built the tool to blast through this alignment challenge!

BLAST

The Basic Local Alignment Search Tool (BLAST) is an algorithm and program developed by NLM staff at our National Center for Biotechnology Information that finds regions of similarities between genetic sequences. The program compares nucleotide or protein sequences to reference sequence databases and calculates the statistical significance of any matches. BLAST helps scientists understand functional and evolutionary relationships between sequences, and it can also be used to identify members of gene families.
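For readers curious about what “finding regions of similarity” means in practice, here is a minimal sketch of local alignment scoring using the classic Smith-Waterman dynamic programming method. This is not BLAST itself: BLAST uses faster heuristics and adds statistical significance estimates, neither of which is shown here. The function name and the scoring values (match, mismatch, gap) are illustrative assumptions, not NCBI defaults.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score (illustrative, not BLAST).

    Returns (best_score, (end_i, end_j)), where end_i and end_j are the
    1-based positions in `a` and `b` at which the best local alignment ends.
    The scoring values are assumptions chosen for readability.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of any local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diag = score[i - 1][j - 1] + pair   # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap          # gap in b
            left = score[i][j - 1] + gap        # gap in a
            # Flooring at zero lets an alignment restart anywhere,
            # which is what makes the alignment "local".
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    return best, best_pos


if __name__ == "__main__":
    # The shared region "ACG" scores 3 matches x 2 = 6.
    print(local_alignment_score("TTTACGTTT", "GGACGGG"))  # (6, (6, 5))
```

Comparing a query against millions of stored sequences this way would be slow, which is why a heuristic tool such as BLAST is so valuable.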

It actually takes more than a few five-letter words to describe what NLM does and what it means to science and society. Nonetheless, it was quite fun to wordplay NLM!

Giving Thanks Where Thanks is Due

One of the great joys of being the Director of the National Library of Medicine is the many opportunities for me to express gratitude. In the past, I have given thanks to NLM staff who are veterans (2021), for progress during my tenure (2020), and to our amazing NLM staff members (2019). This year, I am pausing to give thanks for the outstanding products and services developed and stewarded by our NLM staff, made available every day of the year to anyone with an internet connection—and even to some without!

First, I am thankful for our information collections in their many forms. The NLM Board of Regents oversees our Collection and Preservation Policy, which guides NLM as it meets its mission to acquire, organize, preserve, and disseminate biomedical knowledge from around the world. Our collection spans ten centuries from the 11th to the 21st, and ranges from the third oldest Arabic medical manuscript in existence to the “Rosetta Stone” of modern science, Marshall Nirenberg’s genetic chart, from genomic sequences essential for current and future research to information for mothers taking care of sick children.

Organizing the collections and making them findable and accessible builds on the knowledge of library and information science. This foundational knowledge means we can tag objects—real or virtual—with codes and terms that help with organization and retrieval. It also means we use our knowledge of library and information science to guide efforts to annotate and curate molecular data, literature citations, and images so they are accessible to the public. So I am grateful not only for the 66 miles of shelving that hold our precious objects, books, and journals here in Bethesda, but for the ever-powerful computer clouds that preserve our high-value research databases and 34 million bibliographic citations in PubMed. Libraries do more than house books; they use sophisticated knowledge to organize materials and make them readily available.

I am thankful for the ways that staff at NLM’s National Center for Biotechnology Information (NCBI) manages the submission, curation, and dissemination of our enormous genomic and molecular databases. From ClinVar (our collection of genomic sequences linked to clinical annotation) to the Sequence Read Archive (the world’s largest scientific data repository), our staff makes sure that depositors can effectively deposit data, scientific curators can conduct quality checks, and web and interface designers allow access to the data. A few years ago, the NCBI team led a cloud migration process to make available data from the entire 15-petabyte SRA resource on two commercial cloud providers. This bold step democratized sequence-based scientific inquiry and harnessed the computational power of cloud platforms, which contributed to industrial innovations and shortened the pathway for scientific discovery from days and months to minutes and hours. I am thankful for the role NLM plays in accelerating scientific advances and leveraging research resources for public health benefit.

NLM offers more than 1,000 easy-to-read health topic articles through our online consumer health information resource known as MedlinePlus. MedlinePlus is available in both English and Spanish, thereby assuring information access to speakers of two of the world’s most common languages. Through MedlinePlus Connect, our technical team also provides direct, tailored access to MedlinePlus resources automatically through electronic health records, patient portals, and other health information technology systems to deliver information from MedlinePlus to patients and providers at the point of care. I am thankful for the efforts of the MedlinePlus teams that bring timely and trusted information to the lives of everyone, everywhere.
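For technically minded readers, here is a rough sketch of how a health IT system might build a MedlinePlus Connect request for a diagnosis code. The endpoint and parameter names reflect my understanding of the public MedlinePlus Connect web service and should be checked against its current documentation before use. The helper name `build_connect_url` is hypothetical, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Assumed values; verify against the MedlinePlus Connect documentation.
CONNECT_SERVICE_URL = "https://connect.medlineplus.gov/service"
ICD10CM_OID = "2.16.840.1.113883.6.90"  # code system identifier for ICD-10-CM


def build_connect_url(code, code_system=ICD10CM_OID, language="en"):
    """Build a MedlinePlus Connect request URL for one problem code.

    `build_connect_url` is a hypothetical helper for illustration only.
    `language` is "en" for English or "es" for Spanish.
    """
    params = {
        "mainSearchCriteria.v.cs": code_system,          # which coding system
        "mainSearchCriteria.v.c": code,                  # the code itself
        "informationRecipient.languageCode.c": language,
        "knowledgeResponseType": "application/json",     # ask for JSON
    }
    return f"{CONNECT_SERVICE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # E11.9 is the ICD-10-CM code for type 2 diabetes without complications.
    print(build_connect_url("E11.9", language="es"))
```

An electronic health record or patient portal could issue a request like this behind the scenes and show the matching MedlinePlus topic to the patient or provider.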

I hinted earlier that there are two main pathways to access NLM products and services. Electronic access, supporting both human- and machine-readable forms, is by far the most common pathway to NLM. We also support the Network of the National Library of Medicine (NNLM) and its more than 8,000 members around the country in public, hospital, and academic medical center libraries to bring the power of NLM and its resources to the public. I am grateful for everyone who works as part of NNLM for their ability to bring NLM’s products and services to communities everywhere as well as how the needs and practices of those communities bring awareness of NLM.

As you pause this year in thanksgiving for the many public services that support you in everyday life, please remember to give thanks for NLM’s products and services. We think they are world class, and we are grateful for our ability to serve you.

Informing Success from the Outside In: Introducing the NLM Board of Regents CGR Working Group

Guest post by Valerie Schneider, PhD, staff scientist at the National Library of Medicine (NLM) National Center for Biotechnology Information (NCBI), National Institutes of Health (NIH), and Kristi Holmes, PhD, Director of Galter Health Sciences Library & Learning Center and Professor of Preventive Medicine at Northwestern University Feinberg School of Medicine.

Last year, we described in two blog posts how NLM is developing the NIH Comparative Genomics Resource (CGR), a project that offers content, tools, and interfaces for genomic data resources associated with eukaryotic research organisms.

A eukaryote is any single-celled or multicellular organism whose cells contain a distinct, membrane-bound nucleus. Because all eukaryotes likely evolved from a common ancestor, studying one can give us insight into how other eukaryotes, including humans, work. That shared ancestry makes CGR and its resources all the more important to eukaryotic research.

CGR aims to:

  • Promote high-quality eukaryotic genomic data submission.
  • Enrich NLM’s genomic-related content with community-sourced content.
  • Facilitate comparative biological analyses.
  • Support the development of the next generation of scientists.

Since our last two posts, the team at NCBI has been hard at work making important technical and content updates to CGR’s suite of tools and introducing those tools to the research community. For instance, they published new webpages that organize genome-related data by taxonomy, making it available for browsing and immediate download. They also created the ClusteredNR Database, a new database for the Basic Local Alignment Search Tool (BLAST), to provide results with greater taxonomic context for sequence searches. In addition, they incorporated into Gene new gene information from the Alliance of Genome Resources, an organization that unites data and information on the unique aspects of model organisms. NCBI is also engaging with genomics communities to understand their needs and requirements for comparative genomics through the NLM Board of Regents CGR Working Group.

The working group is lending their perspective and extensive expertise to the project, activities that are essential to CGR’s success and development. We have charged working group members with guiding the development of a new approach to scientific discovery that relies on genomic-related data from research organisms, helping project teams keep pace with changes in the field, and understanding the scientific community’s needs and expectations for key functionalities. To do this, working group members help NLM set development priorities such as exploring CGR’s integration with existing infrastructures and related workforce development opportunities.

Projects like CGR highlight how critical interdisciplinary collaboration is to modern research and how success requires community perspectives and involvement. Working group members will be sharing more information about this project at upcoming conferences and in biomedical literature, and our team at NCBI will also share events and resources through our NIH Comparative Genomics Resource website.

If you are a member of a model organism community, are working on emerging eukaryotic research models, or support eukaryotic genomic data—whether you are a researcher, educator, student, scholarly society member, librarian, data scientist, database resource manager, developer, epidemiologist, or other stakeholder in our progress—we encourage you to reach out and get involved. Here are a few suggestions:

  • Invite us to join you at a conference, teach a workshop, partner on a webinar, or discuss other ideas you may have to foster information sharing and feedback.
  • Use and share CGR’s suite of tools and share your feedback.
  • Be on the lookout for project updates and events on the CGR website or follow @NCBI on Twitter.

We’re always excited to get feedback through CGR listening sessions and user testing for tool and resource updates. Email cgr@nlm.nih.gov to learn all the ways you can participate.

Thank you to the members of the NLM Board of Regents CGR Working Group!

Alejandro Sanchez Alvarado, PhD
Executive Director and Chief Scientific Officer
Priscilla Wood Neaves Chair in the Biomedical Sciences
Stowers Institute for Medical Research

Hannah Carey, PhD
Professor, Department of Comparative Biosciences, School of Veterinary Medicine
University of Wisconsin-Madison

Wayne Frankel, PhD
Professor, Department of Genetics & Development
Director of Preclinical Models, Institute of Genomic Medicine
Columbia University Medical Center

Kristi L. Holmes, PhD (Chair)
Director, Galter Health Sciences Library & Learning Center
Professor of Preventive Medicine (Health & Biomedical Informatics)
Northwestern University Feinberg School of Medicine

Ani W. Manichaikul, PhD
Associate Professor, Center for Public Health Genomics
University of Virginia School of Medicine

Len Pennacchio, PhD
Senior Scientist
Lawrence Berkeley National Laboratory

Valerie Schneider, PhD (Executive Secretary)
Program Head, Sequence Enhancements, Tools and Delivery (SeqPlus)
HHS/NIH/NLM/NCBI

Kenneth Stuart, PhD
Professor, Center for Global Infectious Disease Research
Seattle Children’s Research Institute

Tandy Warnow, PhD
Grainger Distinguished Chair in Engineering
Associate Head of Computer Science
University of Illinois Urbana-Champaign

Rick Woychik, PhD (NIH CGR Steering Committee Liaison)
Director, National Institute of Environmental Health Sciences (NIEHS) and the National Toxicology Program (NTP)

Cathy Wu, PhD
Unidel Edward G. Jefferson Chair in Engineering and Computer Science
Director, Center for Bioinformatics & Computational Biology
Director, Data Science Institute
University of Delaware

Dr. Schneider is the deputy director of Sequence Offerings and the head of the Sequence Plus program. In these roles, she coordinates efforts associated with the curation, enhancement, and organization of sequence data and oversees tools and resources that enable the public to access, analyze, and visualize biomedical data. She also manages NCBI’s involvement in the Genome Reference Consortium, the international collaboration tasked with maintaining the value of the human reference genome assembly.

Dr. Holmes is dedicated to empowering discovery and equitable access to knowledge through the development of computational and social architectures to support these goals. She also serves on the leadership team of the Northwestern University Clinical and Translational Sciences Institute.

A New Frontier: The Impact of a 1959 Board Meeting

Guest blog by Ken Koyle, MA, Deputy Chief of the History of Medicine Division (HMD) at the NIH National Library of Medicine. This post celebrates the important work performed by our archival professionals and the archival collections held by the library, from which the source material was drawn, as NLM celebrates International Archives Week #IAW2022.

In November 1959, when construction of NLM’s current building at NIH was still underway and digital computing was in its infancy, the NLM Board of Regents convened on the third floor of the Old Red Brick building for a demonstration of the indexing process. When Board Chairman Michael E. DeBakey, MD, asked if computer technology could be used in indexing, NLM Director Col. Frank B. Rogers, MD, was ready with an answer. Dr. Rogers, clearly interested in the emerging technology of automated data processing (ADP), described an article by Robert S. Ledley, DDS, in that month’s issue of Science and noted that Dr. Ledley was already contracted with NLM to report on using computers in indexing.

Black-and-white photo of Dr. Rogers leaning on a stack of books with bookshelves in background.
Dr. Frank Rogers at NLM, 1962.

Dr. Rogers was instrumental in NLM’s first explorations of automated processes and had a clear vision of the potential of electronic computing, including how it could improve efficiency at NLM, but his optimism was tempered by prescient realism. Dr. Rogers recognized—and conveyed to the Board—that the potential benefits of ADP would require a commensurate investment of staff time and labor. “We should not forget that ‘automatically’ means ‘because we told it to do so beforehand,’ and this in itself may turn out to be quite a trick.” Dr. Rogers made it clear that the computer age would bring a change in work, but not necessarily a reduction in work. “Remarkable as the capacity of the computer may be for sustaining a long sequence of operations, it is nevertheless ultimately only the end-phase of that still longer sequence which must include as a first phase the human labor of input.”

Acknowledging the upfront labor investment in ADP was only part of Dr. Rogers’ insight. He also explained that the human work was not only substantial and necessary, but also incredibly complex: “The instructions [for a computer] are a thousand times more detailed, for the simplest task, than those required to be given to the . . . clerk.” Unleashing computers’ potential would require staff to think in new ways, conceive new methods of organizing data, and embark on a new journey of continuous learning and professional development.

Black-and-white photo of members of the NLM Board of Regents posing for a photo. Four members sit behind a table stacked with papers. 13 members stand in the background. Dr. Rogers is featured on the far right.
Dr. Frank Rogers (far right) with the NLM Board of Regents meeting in the “Old Red Brick,” 1957.

Along with the challenges of training staff to work with ADP equipment came the interminable problem of cost. Much as today’s public institutions are grappling with the costs of cloud computing, digitization, and increasing storage requirements, Dr. Rogers had to balance the potential benefits with the considerable costs of computer equipment. The type of computer necessary to realize Dr. Rogers’ vision would cost about $1.5 million in 1960—98% of NLM’s total budget of $1,566,000.

Undeterred, Dr. Rogers found an answer to the funding problem by collaborating with another agency that would benefit from the increased processing speed of scientific literature that the envisioned system could provide: the National Heart Institute. They provided the initial funding, NLM did the legwork, and in 1963, the new MEDLARS computer went into service. Dr. Rogers had realized his vision of bringing automated indexing to NLM. As Surgeon General Luther Terry said at the Board meeting in April 1961, “If any institution ever stood on the borderland of a new frontier it is the National Library of Medicine.”

Computer operators working with the Honeywell 800 mainframe computer, originally acquired by NLM in the 1960s.

Dr. Rogers was very clear about the issues of cost, labor, and expectations in his 1960 presentation to the Board, including his overarching concern about balancing NLM’s core mission with these potential new directions:

[The] purpose of the Library is not to operate a particular machine system, however great an acrobatic achievement that might be in itself. It is not to publish and distribute a particular index in a particular way, however ingenious and successful that operation may be deemed to be. It is not even just to be a good library, however great and distinguished that library may be. It is rather, by virtue of being a library, to use every available bibliothecal means to promote awareness of and access to the subject content of recorded medical knowledge, to the end that the science of medicine will advance and prosper.

More than 60 years later, NLM still holds fast to that purpose. As stated in our statutory mission and reiterated in our current strategic plan, we are here “to assist the advancement of medical and related sciences and to aid in the dissemination and exchange of scientific and other information important to the progress of medicine and to the public health.” Our continued pioneering work in data science is just one way we accomplish that mission.

Mr. Koyle joined HMD in the NLM Division of Library Operations in 2012. Before joining NLM, Ken served as a medical evacuation helicopter pilot and a historian in the U.S. Army. He is the co-editor with Jeffrey Reznick of Images of America: U.S. National Library of Medicine, a collaborative work with HMD staff.
