When You Stand on the Shoulders of a Giant, What Do You See?

This post contains my remarks from the 2022 Lindberg-King Lecture and Scientific Symposium: Science, Society, and the Legacy of Donald A.B. Lindberg, M.D., which took place on September 1, 2022. Watch a recording of the event here.

I had the great fortune of becoming the director of the National Library of Medicine immediately following the 30-plus-year tenure of Donald A.B. Lindberg, MD. I am sure that each of you here today treasures your own recollection of Don, maybe from a conversation or a laugh you may have had with this great leader, teacher, visionary, and colleague (and husband to Mary, father, grandfather, and friend). I am both proud and humbled to stand on the shoulders of this giant as I lead this incredible organization.

I know more viscerally than most about Don’s legacy as NLM director. I sit in the office he occupied, I walk the halls he walked, I work with the people he hired, and I see and experience the fruits of his judgement, investments, and vision.

I now sit where Don once sat, representing NLM at the leadership table of NIH with the other Institute and Center directors. With Don paving the way, I have a platform to extend NLM’s thought leadership and technical knowledge to guide NIH’s continued efforts to advance data-driven discovery. The good will and collaborative spirit engendered by Don across NIH opened doors for me and helped me continue his legacy to deliver on the promise of science accelerated by broad access to literature and data.

Don and I share a deep commitment to ensuring that the public benefits from NLM’s efforts to assemble, organize, preserve, and disseminate biomedical knowledge for society. It was his early vision that made MedlinePlus a trusted resource for consumer health information and ensured that the PubMed citation database and the PubMed Central full-text literature repository were open and accessible at any time to anyone, anywhere, with an Internet connection.

Don’s commitment to the public was also evident in his efforts to educate the next generation of biomedical informatics scholars. Frankly, I believe that of all of the aspects of his job, engagement with trainees was his favorite!

When you stand on the shoulders of a giant, you have a great advantage. The foundation Don built and the relationships he established provided me, the fourth appointed director of NLM, with a playbook right out of the gate. Yet relying solely on his vision to guide our future is not enough, because Don also inspired innovation; in one of our last conversations, he said to me, “This is your game: make sure you play it well!” In order to do that, I cannot simply stand on the shoulders of a giant; I must also keep my head up and my eyes forward to envision new pathways and find new opportunities to bring forward the riches of NLM to the future benefit of science and society.

I close by inviting all of you to stand on the shoulders of this giant and meld your sights with his, for it is not by holding tight to that which he could see, but by using his vision as a stepping-off point for our own, that we will serve his legacy.

The Next Normal: Supporting Biomedical Discovery, Clinical Practice, and Self-Care

As we start year three of the COVID-19 pandemic, it’s time for NLM to take stock of the parts of our past that will support the next normal and what we might need to change as we continue to fulfill our mission to acquire, collect, preserve, and disseminate biomedical literature to the world.

Today, I invite you to join me in considering the assumptions and presumptions we made about how scientists, clinicians, librarians, and patients are using critical NLM resources and how we might need to update those assumptions to meet future needs. I will give you a hint: it’s not all bad. In fact, I find it quite exciting!

Let’s highlight some of our assumptions about how people are using our services, at least from my perspective. We anticipated the need for access to medical literature across the Network of the National Library of Medicine and created DOCLINE, an interlibrary loan request routing system that quickly and efficiently links participating libraries’ journal holdings. We also anticipated that we were preparing the literature and our genomic databases for humans to read and peruse. Now we’re finding that more than half of the accesses to NLM resources are generated and driven by computers through application programming interfaces. Even our MedlinePlus resource for patients now connects tailored electronic responses through MedlinePlus Connect to computer-generated queries originating in electronic health records.

Perhaps most importantly, we realize that while sometimes the information we present is actually read by a living person, other times the information we provide (for example, about clinical trials in ClinicalTrials.gov or genotype and phenotype data in dbGaP) is actually processed by computers! Increasingly, we provide direct access to the raw, machine-readable versions of our resources so those versions can be fed into specialized analysis programs, such as natural-language processing programs that find studies with similar findings or machine-learning models that determine the similarities between two gene sequences. For example, NLM makes it possible for advocacy groups to download study information from all ClinicalTrials.gov records so anyone can use their own programs to point out trials that may be of interest to their constituents or to compare summaries of research results for related studies.
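As a minimal sketch of what such a program might look like, the Python below filters a handful of study records for conditions of interest to a particular community. The records are made up, and the field names (`id`, `title`, `conditions`, `status`) are assumptions chosen for illustration; they are not the actual ClinicalTrials.gov download schema, which would first need to be mapped onto this simplified shape.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class StudyRecord:
    """Simplified, hypothetical view of one downloaded study record."""
    id: str
    title: str
    conditions: List[str]
    status: str


def trials_of_interest(records: Iterable[StudyRecord],
                       keywords: Iterable[str],
                       status: str = "Recruiting") -> List[StudyRecord]:
    """Return records with the given status whose conditions mention any keyword."""
    wanted = [k.lower() for k in keywords]
    matches = []
    for record in records:
        if record.status != status:
            continue
        # Case-insensitive substring match against all listed conditions.
        text = " ".join(record.conditions).lower()
        if any(k in text for k in wanted):
            matches.append(record)
    return matches


# Made-up example records; the identifiers are placeholders, not real trials.
SAMPLE = [
    StudyRecord("EX-001", "Exercise and Type 2 Diabetes", ["Type 2 Diabetes"], "Recruiting"),
    StudyRecord("EX-002", "Asthma Inhaler Comparison", ["Asthma"], "Recruiting"),
    StudyRecord("EX-003", "Diabetes Diet Study", ["Diabetes Mellitus"], "Completed"),
]

if __name__ == "__main__":
    for record in trials_of_interest(SAMPLE, ["diabetes"]):
        print(record.id, record.title)
```

A real tool would likely need richer matching than a substring test (synonyms, MeSH terms, location), but the overall pattern of download, filter, and present stays the same.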

Machine learning and artificial intelligence have progressed to the point that they perform reasonably well in connecting similar articles—to this end, our LitCovid open-resource literature hub has served as an electronic companion to the human curation of coronavirus literature. NLM’s LitCovid is more efficient and has a sophisticated search function to create pathways that are more relevant and are more likely to curate articles that fulfill the needs of our users. Most importantly, innovations such as LitCovid help our users manage the vast and ever-growing collection of biomedical literature, now numbering more than 34 million citations in NLM’s PubMed, the most heavily used biomedical literature citation database.

Partnerships are a critical asset to bring biomedical knowledge into the hands (and eyes) of those who need it. Over the last decade, NLM moved toward a new model for managing citation data in PubMed. We released the PubMed Data Management system that allows publishers to quickly update or correct nearly all elements of their citations and that accelerates the delivery of correct and complete citation data to PubMed users.

As part of the MEDLINE 2022 Initiative, NLM transitioned to automated Medical Subject Headings (MeSH) indexing of MEDLINE citations in PubMed. Automated MeSH indexing significantly decreases the time for indexed citations to appear in PubMed without sacrificing the quality MEDLINE is known to provide. Our human indexers can focus their expertise on curation efforts to validate assigned MeSH terms, thereby continuously improving the automated indexing algorithm and enhancing discoverability of gene and chemical information in the future.

We’re already preparing for the next normal—what do you think it will be like?

I envision making our vast resources increasingly available to those who need them and forging stronger partnerships that improve users’ ability to acquire and understand knowledge. Imagine a service, designed and run by patients, that could pull and synthesize the latest information about a disease, offer recommendations for managing a clinical issue, or help a young investigator better pinpoint areas ripe for new interrogation! The next normal will make the best use of human judgment and creativity by selecting and organizing relevant data to create a story that forms the foundation of new inquiry or the basis of new clinical care. Come along and help us co-create the next normal!

We Can’t Go It Alone!

In February, I received the Miles Conrad Award from the National Information Standards Organization (NISO). NISO espouses a wonderful vision: “. . . a world where all can benefit from the unfettered exchange of information.” As the Director of the National Library of Medicine (NLM), this is music to my ears.

Standards are essential to NLM’s mission! Standards bring structure to information, assure common understanding, and make the products of scientific efforts—including literature and data—easier to discover. NLM’s efforts are devoted to the creation, dissemination, and use of terminology and messaging standards. These efforts include attaching indexing terms to citations in PubMed, our biomedical literature database housing over 34 million citations; using reference models to describe genome sequences; and serving as the HHS repository for the clinical terminologies needed to support health care delivery. NLM improves health and accelerates biomedical discovery by advancing the availability and use of standards. Standards are dynamic tools that must capture the context of biomedicine and health care at a given moment yet reflect the scientific development and changes in community vernacular.

By their very nature, standards create consensus across two or more parties on how to properly name, structure, or label phenomena. No single entity can create a standard all by itself! Standards are effective because they shape the conversation between and among entities, achieving a common goal by drawing on a common representation.

NLM alone cannot create, promulgate, or enforce standards. We work in partnership with professional societies, standards development organizations, and other federal entities, including the Office of the National Coordinator for Health Information Technology, to foster interoperability of clinical data. We support the development and distribution of SNOMED CT (the Systematized Nomenclature of Medicine – Clinical Terms) and the specific extension of SNOMED in the United States. We developed the MeSH (Medical Subject Headings) thesaurus, a controlled vocabulary used to index articles in PubMed. We also support the development and distribution of LOINC (the Logical Observation Identifiers Names and Codes), a common language (that is, a set of identifiers, names, and codes) used to identify health measurements, observations, and documents. Finally, we maintain RxNorm, a normalized naming system for generic and branded drugs and their uses, to support message exchanges across pharmacy management and drug interaction software.

Partnerships help us create and deploy standard ways to make scientific literature discoverable and accessible. To this end, we were instrumental in the adoption of NISO’s JATS (Journal Article Tag Suite), an XML format for describing the content of published articles, which we encourage journals to use when submitting citations to PubMed so users can efficiently search the literature and articles as they are described. MeSH RDF (Resource Description Framework) is a linked data representation of the MeSH vocabulary on the web, and the BIBFRAME (Bibliographic Framework) Initiative—a data exchange format initiated by the Library of Congress—adds MeSH RDF URIs (Uniform Resource Identifiers) to link data that will support complete bibliographic descriptions and foster resource sharing across the web and through the networked world.
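To make the idea concrete, here is a small sketch of how a program might read descriptive fields out of a JATS-tagged article. The XML fragment is hand-written for illustration and is far from a complete or validated JATS document. The element names (`journal-title`, `article-title`, `pub-date`) come from the tag suite, while the helper name `describe_article` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hand-written, heavily abbreviated fragment; the values are placeholders.
JATS_FRAGMENT = """
<article>
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Example Journal of Informatics</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>A Placeholder Article Title</article-title>
      </title-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
    </article-meta>
  </front>
</article>
"""


def describe_article(xml_text: str) -> Dict[str, Optional[str]]:
    """Extract a few descriptive fields from a JATS-style XML string."""
    root = ET.fromstring(xml_text.strip())

    def text_of(path: str) -> Optional[str]:
        # Return the stripped text of the first matching element, if any.
        node = root.find(path)
        if node is None or node.text is None:
            return None
        return node.text.strip()

    return {
        "journal": text_of(".//journal-title"),
        "title": text_of(".//article-title"),
        "year": text_of(".//pub-date/year"),
    }


if __name__ == "__main__":
    print(describe_article(JATS_FRAGMENT))
```

Because every publisher uses the same tags, the same few lines can describe an article from any participating journal, which is the practical payoff of a shared standard.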

Standards provide the resources necessary to understand complex phenomena and share scientific insights. Leveraging partnerships in order to develop and deploy these standards both allows efficiencies and produces a more connected, interoperable, understandable world of knowledge. Given the speed at which biomedical knowledge is growing, leveraging these partnerships assures that the institutions charged with acquiring and disseminating all the knowledge relevant to biomedicine and health can successfully and effectively meet their missions.

Your Privacy is an NLM Priority

Patient privacy — you might be scratching your head right now. NLM is a research enterprise and a LIBRARY for heaven’s sake! What does a library have to do with patient privacy? NLM protects the privacy of all people who use our resources, which are free and accessible 24/7. NLM complies with requirements for privacy and security established by the Office of Management and Budget, Department of Health and Human Services, and NIH. I encourage you to visit our Privacy and Security Policy guidelines.

No personal identifying information is required to search and access our vast data repositories and library resources. Anyone, sick or well, who wants trusted information about a disease, illness, or health condition can search through our MedlinePlus online health information service. With data available in English and Spanish, MedlinePlus offers high quality, relevant health information for patients and their families on more than 1,000 topics such as children’s growth and development, gene therapies, and self-care after surgery.

We do not link search strategies to any specific patron without their permission. NLM only links information for those patrons who sign up for My NCBI, which is a service that allows patrons to save and return to previous search results. This information is held in a safe, secure part of our computer systems open only to the individual.

NLM also provides expert guidance to other federal agencies for the most effective approaches to preserving patient privacy. Clem McDonald, MD, our Chief Health Data Standards Officer, serves as a member of the Health Information Technology Advisory Committee, which is an advisory committee to the Office of the National Coordinator for Health Information Technology that oversees a range of issues from promoting health IT excellence in communities to collaborations among federal agencies. We recently participated in the federal response to Executive Order 13994, Ensuring a Data-Driven Response to COVID-19 and Future High-Consequence Public Health Threats, leveraging our expertise in protecting patient data and preventing inadvertent re-identification from genomic information.

Patient participation in clinical trials and other research efforts advances science and creates the pathways to discover new clinical therapies and interventions. Sometimes, data generated in one study becomes useful in future studies; for example, when trying to understand how different groups of patients respond to the same therapy. NLM provides technical assistance to the National Institutes of Health in creating ways to store participant-level study information safely and securely, making information usable for other researchers while making sure that personally identifiable information is not disclosed. We also help NIH create safe, secure data repositories of research data and implement mechanisms and oversight measures to ensure that data is available to researchers and managed in a way consistent with the original agreements for use of the data. We helped establish NIH’s Researcher Auth Service Initiative, a single sign-on for researchers that allows access to specific data sets in a controlled manner.

Our researchers also develop computational methods to protect patient privacy. This includes research investigating how to remove traces of identifying data from clinical records, while making those records useful for researchers to better understand the course of disease and determine the effectiveness of treatments. NLM’s Dr. Mehmet Kayaalp develops ways to let approved researchers use clinical records for clinical studies in a way that protects patient privacy. He describes his work this way:

Narrative clinical reports contain a rich set of clinical knowledge that could be invaluable for clinical research. However, they usually also contain personal identifiers that are considered protected health information and are associated with use restrictions and risks to privacy. Computational de-identification seeks to remove all of the identifiers in such narrative text in order to produce de-identified documents that can be used in research while protecting patient privacy. Computational de-identification uses natural language processing tools and techniques to recognize patient-related individually identifiable information (e.g., names, addresses, and telephone and social security numbers) in the text and redacts them. In this way, patient privacy is protected, and clinical knowledge is preserved.

Dr. Mehmet Kayaalp
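To give a feel for the redaction step described above, here is a deliberately simple sketch. It is not Dr. Kayaalp's method: instead of natural language processing, it uses plain pattern matching for U.S.-style social security and telephone numbers, plus a caller-supplied list of names. The function name `redact`, the placeholder tags, and the sample sentence are all assumptions made for illustration, and an approach this naive would miss many identifiers in real clinical text.

```python
import re
from typing import Iterable

# Simple patterns for two identifier types; real notes vary far more than this.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_PATTERN = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")


def redact(text: str, known_names: Iterable[str] = ()) -> str:
    """Replace matched identifiers with bracketed placeholder tags."""
    text = SSN_PATTERN.sub("[SSN]", text)
    text = PHONE_PATTERN.sub("[PHONE]", text)
    for name in known_names:
        if not name:
            continue
        # Escape the name so it is matched literally, ignoring case.
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    note = "Patient John Doe (SSN 123-45-6789) can be reached at 301-555-0100."
    print(redact(note, known_names=["John Doe"]))
```

The clinical content of a note (symptoms, diagnoses, treatments) passes through untouched, which is the point: the identifiers are removed while the knowledge is kept.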

So – we’re more than a library. We are a partner in preserving patient privacy while making sure that researchers and clinicians can discover the best new ways for taking care of patients.

How do you think NLM can better serve scientists and society?

Investing in a Sustained Partnership: A Data-Driven Human Approach to Social Justice and Equity

Guest post by Patricia Matthews-Juarez, PhD, Chair of the Environmental Health Information Partnership (EnHIP) and Rueben C. Warren, DDS, MPH, DrPH, MDIV, Scientific Advisor for EnHIP

In 1989, after many successful years of developing scientific and technical databases, the National Library of Medicine (NLM) started its first long-term outreach plan to train health professionals to use NLM’s suite of digital tools. While these efforts helped large medical schools and hospital centers, institutions with substantial minority populations struggled to maintain access to online databases and keep up with rapidly evolving technologies.

As a result, NLM sponsored a one-year pilot project to increase the capacity of historically black colleges and universities (HBCUs), Hispanic-serving institutions, minority-serving institutions, and tribal colleges to access NLM’s toxicological and chemical databases. This program was designed not only to benefit the institutions, but also to investigate environmental toxins commonly found in minority and socio-economically disadvantaged communities, particularly in the southern United States. In 1991, the pilot project grew into a partnership called the Environmental Health Information Partnership (EnHIP).

EnHIP unites heads of the various universities and colleges with NLM leadership and staff. In addition to examining environmental hazards, this program also calculates the impact of hazardous waste on the lives of African Americans using data, technology, and scientific resources.

This single investment made more than 30 years ago to strengthen the capacity of HBCUs resulted in a tremendous payback in terms of education and research. As NLM and EnHIP have evolved, so have the demands for access to complex technologies that capture and interpret data as a pathway to scientific explorations, interventions, research endeavors, and discoveries. The return on investment is the systemic organizational change at the member schools of EnHIP and listening channels at NLM. These opportunities create community-based projects and enhance the capacity of EnHIP member institutions to reduce health disparities in ways never imagined. These opportunities, driven by consistent investments from NLM, are linked to the practice and process of social justice and fairness, trustworthiness, and truth telling.

NLM continues to bring high standards and innovative ideas to the acquisition and management of biomedical data as scientists unravel the impact of the social determinants of health, health disparities, and health equity. The NIH UNITE initiative to end structural racism offers new opportunities to invest in equitable research and determine how data is collected, managed, and accessed with justice and equality in mind. Three decades of collaboration in data science, open access publications, and community/citizen science are paying off. Shared values and networks have been amplified at the international, national, regional, state, and local levels, and across populations. Years of consistently shared and common agendas have led to a strong and effective partnership with the 23 currently participating HBCUs, Hispanic-serving institutions, minority-serving institutions, and tribal colleges. These dividends of trust, open communication, and transparency are reflected in the success of our nation in its efforts to reach for equity in science, education, and service.

Dr. Matthews-Juarez is the Senior Vice President for Strategic Initiatives and Innovation and Professor in the Department of Family and Community Medicine at Meharry Medical College. Her work focuses on the social determinants of health, health disparities, and equity in primary care education and community engagement in both the United States and Africa.

Dr. Warren is Director of the National Center for Bioethics in Research and Health Care and Professor at Tuskegee University. He previously served as Associate Director for Minority Health and Associate Director for Environmental Justice at the Centers for Disease Control and Prevention and Director of Infrastructure Development at the NIH National Institute on Minority Health and Health Disparities.
