Defining the Path Forward for NLM’s New Office of Engagement and Training

Guest post by Amanda J. Wilson, Chief, Office of Engagement and Training, NLM.

During the NLM Board of Regents (BOR) meeting held last week, I had the distinct honor of introducing the new Office of Engagement and Training (OET). This office brings together many of the outreach, training, and capacity-building staff, programs, and services from across the Library.

Since OET was established in June 2019, our team has been occupied with moving into our new space, getting to know one another, exploring the depth and capacity of the resources we have to accomplish our goals, discussing what the future holds for our role in coordinating engagement activities, and reflecting as a team on the niche we fill for NLM. In the midst of this summer flurry of activity and, quite frankly, the more mundane tasks of figuring out the fastest way to answer the door to our offices and the mechanics of mail distribution, some themes surrounding what we can be, and hope to become, rose to the top.

Our vision for OET is a resource that will serve the NLM community as a strategic connector between NLM and our audiences, as well as across the Library, as a trusted authority on the NLM experience when engaging with Library resources. We are also an incubator for new approaches to engagement.

What, exactly, does that mean?

It means we understand the broad range of both new and existing NLM users, their needs, and the most effective pathways to reach them. And it also means we are closely connected to NLM researchers, developers, information professionals, program managers, and product owners, including knowing what information is most important to them and has the greatest impact on their work.

This vision also involves knowing how all segments of NLM’s audiences respond to different types of engagement activities. That knowledge will position OET to use our expertise, capabilities, and connections to bring NLM’s trusted resources to communities when and where those resources are needed most. And, considering our unique position, it means we can be a catalyst for exploring novel, effective ways to connect, build, and enhance opportunities for all audiences to engage with NLM.

But that’s not all.

As we started working toward these goals and aspirations, we asked the BOR for advice and thoughts to guide us. For some activities that we currently engage in, such as surveys, webinars, meetings, and exhibits, the BOR provided encouragement for us to continue. The BOR also challenged OET to explore new strategies for engagement, such as working with U.S. Public Health Service Commissioned Corps officers who are part of the Prevention through Active Community Engagement (PACE) program in the Office of the Surgeon General. Another suggestion was to engage in community theater productions to help convey our message.

The possibilities that BOR members provided, as well as input from our colleagues at NLM and other partners, have given OET much to consider as we chart our path forward.

What does this vision of OET mean to you?

I’ve been called corny by one of my colleagues (said with a smile) for my obvious enthusiasm about the future of OET. But I absolutely embrace that sentiment! I’m enthusiastic because I have an opportunity to lead a wonderful team of experienced, knowledgeable colleagues dedicated to our mission. I’m also enthusiastic because OET has the support of NLM leadership and the BOR to continue creating an office that supports NLM’s goals with evidence-based engagement and training, built on collaboration and inclusivity and with an eye to the future.

This is an exciting time, and I look forward to all that we can do together! I invite you to join us along the way.

Photo of Amanda Wilson, Chief of the Office of Engagement and Training.

Amanda J. Wilson is Chief of the NLM Office of Engagement and Training (OET), bringing together general engagement, training, and outreach staff from across NLM to focus on the Library’s presence across the U.S. and internationally. OET is also home to the Environmental Health Information Partnership for NLM and coordinates the National Network of Libraries of Medicine. Wilson first came to NLM when appointed Head, National Network Coordinating Office, in January 2017.

NLM Scientists Contribute to AI for Medical Image Interpretation

Guest post by Sameer Antani, PhD, Staff Scientist, Acting Branch Chief for the Communications Engineering Branch and Computer Science Branch at the National Library of Medicine’s Lister Hill National Center for Biomedical Communications, National Institutes of Health.

Artificial intelligence (AI) has become one of the hottest fields of the 21st century. But AI isn’t a new concept. It’s older than I am!

AI—or, more specifically, machine learning-based automated intelligent decision support—is making inroads in applications that we could only dream about just a few decades ago, such as automated check recognition, movie and video recommendations, and self-driving vehicles.

And in the near term, the role of AI may be as computer-based applications that use data-derived knowledge to support or advance human activities that are tedious, repetitive, and relatively deterministic, especially where expert resources are lacking. In other words, AI may not only help solve budget issues, it may also help reduce boredom.

The idea of an artificial brain was initially promoted by a handful of scientists from different fields, resulting in the founding of AI research as an academic discipline in 1956. After some initial discoveries, and a clearer understanding of the challenges involved, the field lost steam during the last decades of the 20th century. However, advances continued in the form of various statistical pattern recognition and machine learning techniques.

Then, in 2012, a breakthrough in deep learning was published. The image-classification error rate had been cut in half for the ImageNet dataset in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). By 2017 the best AI algorithms were detecting and recognizing objects in photographic images at an impressive accuracy rate of more than 97%, surpassing human performance.

Since then, AI in imaging has become a relatively mature field. But the use of AI in medical imaging continues to challenge us. We need to recognize that much of the field’s success in medical imaging has been within a narrow focus on specific tasks for which AI has been trained, and that this success depends on the data to which AI has been exposed.

Here at NLM, we’ve been working on image informatics research and advancing computational science techniques and information retrieval using traditional machine-learning methods for many years, even before the advent of deep learning. 

Some of AI’s most exciting applications are happening in underserved and under-resourced regions, and imaging-based AI can help fill the gaps where medical expertise may be limited. My fellow NLM scientists and I have applied and contributed to advancing AI techniques to predict tuberculosis and other pulmonary diseases in digital chest X-ray images, screen for malaria parasites in microscopic blood smear images, and detect age-related eye diseases. A recent landmark paper showed that an AI algorithm was superior to human experts in identifying cervical precancer in women.

These findings are consistent with other AI advances in medical imaging reported in the scientific literature, including reading CT scans for lung-cancer screening, detecting brain tumors, screening for diabetic retinopathy, digital pathology applications for precision oncology, and performing radiologist-level pneumonia detection on chest X-rays. While many of these exemplify amazing advances in medical imaging AI, some are built on or have humble beginnings in outcomes from ImageNet’s object localization and recognition challenge.

NLM’s strategic plan for building a platform for data-driven discovery and health guides our research efforts. We’re developing novel AI algorithms; gaining a deeper understanding of AI decision-making (also known as explainable AI); measuring the impact of data variety, volume, and quality; and identifying more ways to address gaps in translating technical advances to have a positive impact on biomedical research and clinical care.

Our research interests also include intelligent ensembles of deep learning networks where each type of network learns something different from the data, and the learned knowledge is then transferred and fused into other sets. This effort is particularly important for rare diseases, where the number of samples in the population tends to be smaller. Unlike humans, it’s a challenge for AI to learn key patterns from a few samples. But we’re trying to develop this capacity.

Breakthroughs in modern AI techniques in medical imaging are empowering, but these are still early days. Yet achieving AI’s potential as smart assistive technologies appears to be more imminent than replacing human expertise with AI.

I continue to dream of a future in which AI makes our lives healthier and our health care delivery more effective.

 

Photos of Sameer Antani, PhD

Dr. Antani is a versatile lead researcher advancing the role of computational sciences and automated decision-making in biomedical research, education, and clinical care. His research interests include topics in medical imaging and informatics, machine learning, data science, artificial intelligence, and global health. His primary areas of research and development include cervical cancer, HIV/TB, and visual information retrieval, among others.

Taking Flight: NLM’s Data Science Journey

Guest post by the Data Science @NLM Training Program team.

Data science at NLM is ready to soar!

In 2018, we embarked on a journey to build a workforce ready to take on the challenges of data-driven research and health, and earlier this year we shared our plans for accelerating data science expertise at NLM. Now, it’s time to reflect on our progress and recognize our accomplishments.

Our Data Science @NLM Training Program Open House, held last week, showcased some of the great data science work happening across the Library. We learned from each other and discovered new opportunities to strengthen the Library’s proficiencies in working with data and using analytic tools, furthering NLM’s research practices and services.

Data Science @NLM Poster Gallery

A poster gallery featuring 77 research posters and data visualizations provided a snapshot of the many ways that NLM staff apply data science to their work. It was great to see so many NLM staff sharing their work and engaging in stimulating conversations about innovation.

Three “lightning” presentations gave a glimpse of how NLM staff use data science. NLM Data Science and Open Science Librarian, Lisa Federer, PhD, MLIS, talked about building a librarian workforce to engage with researchers on open science and data science. NLM’s Rezarta Islamaj, PhD, and Donald Comeau, PhD, presented their perspectives on enriching gene and chemical links in PubMed and PubMed Central and evaluating Medical Subject Headings, or MeSH, in indexing for literature retrieval in PubMed.

The open house was also an opportunity for NLM staff who participated in an intensive 120-hour data science fundamentals course to share what they learned and how they’re applying their new skills.  

But this event was more than a celebration of accomplishments. It provided space to reflect on lessons learned, how to use what we’ve learned on a daily basis, and hopes for the future of data science at NLM. Dina Demner-Fushman, MD, PhD, of NLM dove into data science methodologies in her discussion of the Biomedical Citation Selector (BmCS), a high-recall machine-learning system that identifies articles that require indexing for MEDLINE selectively-indexed journals.

Data Science @NLM Ideas Booth

NLM staff brainstormed over 60 ideas to bring data science solutions to new and ongoing projects and talked with data science experts at the open house “ideas booth.” Staff also shared how they will learn, or continue to use, data science in support of their individual career goals.

We were delighted to see over 300 NLM staff participating in the open house, which is just one of the ways that NLM is working to achieve goal 3 of the NLM strategic plan to “build a workforce for data-driven research and health.”

The Data Science @NLM Training Program has helped increase NLM staff awareness of and expertise in data science. NLM staff are now better prepared than ever to demonstrate the Library’s commitment to accelerating biomedical discovery and data-powered health.   

Our data science journey continues, as does the growth of the data science community at NLM. For a recap of the day, follow the experience at #datareadynlm.

We’re taking off!


Photos of the Data Science at NLM Training Program Team; Dianne Babski, Peter Cooper, Lisa Federer and Anna Ripple
Data Science @NLM Training Program team (left to right):
Dianne Babski, Deputy Associate Director, Library Operations
Peter Cooper, Strategic Communications Team Lead, National Center for Biotechnology Information
Lisa Federer, Data Science and Open Science Librarian, Office of Strategic Initiatives
Anna Ripple, Information Research Specialist, Lister Hill National Center for Biomedical Communications

Engaging Users to Support the Modernization of ClinicalTrials.gov

Guest post by Rebecca Williams, PharmD, MPH, acting director of ClinicalTrials.gov at the National Library of Medicine, National Institutes of Health.

ClinicalTrials.gov is the largest public clinical research registry and results database in the world – providing patients, health care providers, and researchers with information on more than 300,000 clinical studies of a wide range of diseases and conditions. More than 145,000 unique visitors use the public website daily to find and learn about clinical studies, resulting in an average of 215 million pageviews each month.

Recognizing the value of ClinicalTrials.gov to millions of users, the Board of Regents of the National Library of Medicine (NLM) described in the 2017-2027 strategic plan the importance of ensuring the long-term sustainability of this resource. NLM is committed to this goal and aims to modernize ClinicalTrials.gov to deliver a modern user experience on a flexible, extensible, scalable, and sustainable platform that will accommodate growth and enhance efficiency.

We are undertaking this effort to make ClinicalTrials.gov an even more valuable resource with a renewed commitment to engage with and serve the people who rely on it.

These users include the sponsors and investigators who submit clinical trial information for inclusion on the site through the submission portal. They also include patients, health care providers, and researchers who access listed information on ClinicalTrials.gov, whether directly or indirectly through other sites and services that use the ClinicalTrials.gov application programming interface.

Over the past several years, we have conducted testing with users and have already made some improvements in response to this feedback. With modernization, we will continue to support key functions identified by users of ClinicalTrials.gov while also seeking ways to make it an even more valuable resource.

To continue the modernization process, we are now seeking broader engagement with users to further help us determine how to evolve ClinicalTrials.gov. We are spending this summer looking inward by engaging our fellow National Institutes of Health Institutes and Centers to understand how ClinicalTrials.gov could better help in fulfilling NIH’s goals of clinical trial stewardship and transparency.

This fall, we plan to expand our reach outward and are proposing to establish a working group of the NLM Board of Regents to focus on the modernization of ClinicalTrials.gov. This working group will provide a transparent forum for communicating and receiving input about efforts to enrich and modernize ClinicalTrials.gov. We want to ensure that we understand and consider changing needs while simultaneously maximizing the value of the growing amount of available information and preserving the integrity of ClinicalTrials.gov as a trusted resource.

We’ve already taken some steps to be more proactive in communicating with our users. We just launched “Hot Off the PRS!” (sign up to receive email announcements), a new informational bulletin for users of the ClinicalTrials.gov Protocol Registration and Results System (PRS). These updates provide timely announcements about new PRS features, relevant regulations (42 CFR Part 11) and policies, and information about other offerings such as the PRS Guided Tutorials (BETA), a new training resource with step-by-step instructions for submitting results information.

We’re excited about how greater user engagement will enrich and modernize ClinicalTrials.gov, improving its value for everyone throughout the clinical research lifecycle.

Please let us know what else we can do to make ClinicalTrials.gov the best it can be.

Photo of Rebecca Williams, PharmD, MPH

Rebecca Williams, PharmD, MPH, oversees the technical, scientific, policy, regulatory and outreach activities related to the operation of ClinicalTrials.gov. Her research interests relate to improving the quality of reporting of clinical research and evaluating the clinical research enterprise.

On the Ethics of Using Social Media Data for Health Research

Guest post by Dr. Graciela Gonzalez-Hernandez, associate professor of informatics at the Perelman School of Medicine, University of Pennsylvania.

Social media has grown in popularity for health-related research as it has become evident that it can be a good source of patient insights. Be it Twitter, Reddit, Instagram, Facebook, Amazon reviews or health forums, researchers have collected and processed user comments and published countless papers on different uses of social media data.

Using these data can be a perfectly acceptable research practice, provided they are used ethically and the research approach is solid. I will not discuss solid scientific principles and statistically sound methods for social media data use here, though. Instead, I will focus on the much-debated ethical principles that should guide observational studies done with social media data.

To help frame our discussion, let’s consider why the ethics of social media data use is called into question. Almost invariably when I present my work in this area or submit a proposal or paper, someone raises the question of ethics, often despite my efforts to address it upfront. I believe this reticence or discomfort comes from the idea that the data can be traced back to specific people and the fear that using the data could result in harm. Some research with social media data might seem innocuous enough. One might think no harm could possibly come from making available the collected data or specific tweets on topics like smoking cessation and the strategies people find effective or not. But consider data focusing on topics such as illegal substance use, addiction recovery, mental health, prescription medication abuse, or pregnancy. Black and white can quickly turn to gray.

Before going further, it is important to understand the fundamental rules for this type of research in an academic setting. In general, researchers who want to use social media data apply to their institutional review board (IRB) for review. Research activities involving human subjects and limited to one or more of the exempt categories defined by federal regulations receive an “exempt determination” rather than “IRB approval.” In the case of social media data, the exemption for existing data, documents, records, and specimens detailed in 45 CFR 46.101(b)(4) generally applies, as long as you don’t contact individual users as part of the research protocol and the data to be studied are openly and publicly available. If you will be contacting individual users, the study becomes more like a clinical trial, needing “informed consent” and full IRB review. (See the National Institutes of Health’s published guidelines for this case.)

Furthermore, exempt studies are so named because they are exempt from some of the federal regulations that apply to human-subjects research. They are not exempt from state laws, institutional policies, or the requirements for ethical research. Most of all, they are not exempt from plain old common sense.

But when it comes to the existing-data exemption, which data are “openly and publicly available” is open to question. To be safe, use only data available to all users of the platform without any extra permissions or approvals. No data from closed forums or groups that would require one to “join” within the platform should be considered “openly and publicly available.” After all, members of such groups generally expect their discussions are “private,” even if the group is large.

Beyond that, when deciding how to use the data or whether to publish the data directly, ask yourself whether revealing the information in a context other than where it was originally posted could result in harm to the people who posted it, either now or later. For example, you could include specific social media posts as examples in a scientific paper, but, if the topic was delicate, you might choose not to publish a post verbatim, instead changing the wording so a search of the host platform would not lead someone to the user. In the case of platforms like Reddit that are built around anonymity, this language modification would not be necessary. If possible, use aggregate data (e.g., counts or topics discussed) rather than individual social media posts.
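As one way to picture the aggregate-data suggestion above, here is a minimal Python sketch that reports only per-topic counts and never returns the text of any post. Everything in it is an assumption made for illustration: the function name, the simple keyword matching, the invented example posts, and the `min_count` threshold that hides topics mentioned by very few people.

```python
from collections import Counter
from typing import Dict, Iterable, List


def aggregate_topic_counts(
    posts: Iterable[str],
    topic_keywords: Dict[str, List[str]],
    min_count: int = 1,
) -> Dict[str, int]:
    """Count posts mentioning each topic; return counts only, never post text."""
    counts: Counter = Counter()
    for post in posts:
        text = post.lower()
        for topic, keywords in topic_keywords.items():
            # A post contributes at most once per topic.
            if any(keyword in text for keyword in keywords):
                counts[topic] += 1
    # Hypothetical safeguard: drop topics with fewer than min_count posts,
    # where a small count could point back to an individual.
    return {topic: n for topic, n in counts.items() if n >= min_count}


# Invented placeholder posts on smoking cessation (not real user data).
posts = [
    "Day 3 without cigarettes, the patch is helping",
    "Tried gum instead of the patch, no luck",
    "Quit smoking cold turkey last year",
]
keywords = {"patch": ["patch"], "gum": ["gum"], "cold turkey": ["cold turkey"]}

print(aggregate_topic_counts(posts, keywords, min_count=2))
```

Whether a threshold like this is needed, and what value is appropriate, depends on the topic and platform; that judgment still belongs to the researcher and the review board.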

However you approach your research, datasets used for automatic language processing experiments need to be shared for the results to be reproducible. Which format this takes depends on the data source, but reproducibility does not take a back seat just because these are social media data. To help you further consider the question of how to use or share these data, check out the guidelines published by the Association of Internet Researchers. These guidelines include a comprehensive set of practical questions to help you decide on an ethical approach, and I highly recommend them. In their study of the ethics of social media use, Moreno et al. also address some practical considerations and offer a good summary of the issues.

We are now ready to consider what constitutes ethical research. Ethics, or principles of right conduct, apply to institutions that conduct research, whether in academia or industry. Although ethics is sometimes used interchangeably with morals, what constitutes ethical behavior is less subjective and less personal, defining correct behavior within a relatively narrow area of activity. While there will likely never be a generally agreed upon code of ethics for every area of scientific activity, a number of groups have established principles relevant to social media-based research, including the American Public Health Association, the American Medical Informatics Association, and the previously mentioned Association of Internet Researchers. Principles of research ethics and ethical treatment of persons center on the policy of “do no harm,” but it falls to IRBs to determine if harm could result from your approach and whether your proposed research is ethical. Even so, review boards might have discrepant opinions, as recent work looking into attitudes toward the use of social media data for health research has shown.

So where does that leave those of us looking to conduct health research using social media data?

Take a “stop and think” and “when in doubt, ask” approach before finalizing a study and investing time. Help ensure the researcher’s interests are balanced against those of the people involved (i.e., the users who posted the data) by putting yourself in their shoes. Be cognizant of the needs and concerns of vulnerable communities who might require greater protection, but don’t assume that research involving social media data should not be done or that the data cannot be shared. If the research was ethically conducted, then social media data can and should be shared as part of the scientific process to ensure reproducibility, and there is a lot that can be gained from pursuing it.

Headshot of Dr. Graciela Gonzalez-Hernandez

Graciela Gonzalez-Hernandez, MS, PhD, is a recognized expert and leader in natural language processing applied to bioinformatics, medical/clinical informatics, and public health informatics. She is an associate professor with tenure at the Perelman School of Medicine, University of Pennsylvania, where she leads the Health Language Processing Lab within the Institute for Biomedical Informatics and the Department of Biostatistics, Epidemiology, and Informatics.

Socio-legal Barriers to Data Reuse

Envisioning a sustainable data trust

Guest post by Melissa Haendel, PhD, a leader of and advocate for open science initiatives.

The increasing volume and variety of biomedical data have created new opportunities to integrate data for novel analytics and discovery. Despite a number of clinical success stories that rely on data integration (e.g., rare disease diagnostics, cancer therapeutic discovery, drug repurposing), within the academic research community, data reuse is not typically promoted. In fact, data reuse is often considered “not innovative” in funding proposals and has even come under attack. (See the now infamous “research parasites” editorial in The New England Journal of Medicine.)

The FAIR data principles—Findable, Accessible, Interoperable, and Reusable—are a terrific set of goals for all of us to strive for in our data sharing, but they detail little about how to realize effective data reuse. If we are to grow innovation from our collective data resources, we must look to pioneers in data harmonization for insight into the specific advantages and challenges of data reuse at scale. Current data-licensing practices for most public data resources severely hamper data reuse, especially at scale. Integrative platforms such as the Monarch Initiative, the NCATS Biomedical Data Translator, the Gabriella Miller Kids First Data Resource Portal, and myriad other cloud data platforms will be able to accelerate scientific progress more effectively if licensing issues can be resolved. As a member of these various consortia, I want to facilitate the legal use and reuse of increasingly interconnected, derived, and reprocessed data. The community has previously raised this concern in a letter to NIH.

How reusable are most data resources? In our recently published manuscript, we created a rubric for evaluating the reusability of a data resource from the licensing standpoint. We applied this rubric to more than 50 biomedical data and knowledge resources. These assessments and the evaluation platform are openly available at the (Re)usable Data Project (RDP). Each resource was scored on a scale of zero to five stars on the following measures:

  • findability and type of licensing terms
  • scope and completeness of the licensing
  • ability to access the data in a reasonable way
  • restrictions on how the data may be reused, and
  • restrictions on who may reuse the data.

We found that 57% of the resources scored three stars or fewer, indicating that license terms may significantly impede the use, reuse, and redistribution of the data.
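To make the arithmetic behind a summary figure like this concrete, the short Python sketch below computes the share of resources at or below a star threshold. The resource names and star values are invented placeholders, not RDP data, and the function name is hypothetical.

```python
from typing import Dict


def share_at_or_below(scores: Dict[str, float], threshold: float = 3.0) -> float:
    """Return the fraction of resources whose star score is <= threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    low = sum(1 for stars in scores.values() if stars <= threshold)
    return low / len(scores)


# Placeholder scores on the zero-to-five star scale (illustrative only).
example_scores = {
    "resource_a": 5.0,
    "resource_b": 3.0,
    "resource_c": 1.5,
    "resource_d": 4.5,
}

print(f"{share_at_or_below(example_scores):.0%} scored three stars or fewer")
```

With real rubric scores in place of the placeholders, the same calculation would yield the kind of percentage quoted above.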

Custom licenses constituted the largest single class of licenses found in these data resources. This suggests the resource providers either did not know about standard licenses or believed the standard licenses did not meet their needs. Moreover, while the majority of custom licenses were restrictive, just over two-thirds of the standard licenses were permissive, leading us to wonder whether some needs and intentions are not being met by the existing set of standard permissive licenses. In addition, about 15% of resources had either missing or inconsistent licensing. This ambiguity and lack of clear intent requires clarification and possibly legal counsel.

A total of 61.8% of data resources use nonpermissive licenses.

Putting this all together, a majority of resources would not meet basic criteria for legal frictionless use for downstream data integration and redistribution, despite the fact that most of these resources are publicly funded, which should mean the content is freely available for reuse by the public.

If we in the United States have a hard time understanding how we may reuse data given these legal restrictions, we must consider the rest of the world—which presumably we aim to serve—and how hard it would be for anyone in another country to navigate this legalese. I hope the RDP’s findings will encourage the worldwide community to work together to improve licensing practices to facilitate reusable data resources for all.

Given what I have learned from the RDP and a wealth of experience in dealing with these issues, I recommend the following actions:

  • Funding agencies and publishers should ensure that all publicly funded databases and knowledge bases are evaluated against licensing criteria (whether the RDP’s or something similar).
  • Database providers should use these criteria to evaluate their resources from the perspective of a downstream data user and update their licensing terms, if appropriate.
  • Downstream data re-users should provide clear source attribution and should always confirm it is legal to redistribute the data. It is very often the case that it is legal to use the data but not to redistribute it. In addition, many uses are actually illegal.
  • Database providers should guide users on how to cite the resource as a whole, as individual records, or as portions of the content when mashed up in other contexts (which can include schemas, ontologies, and other non-data products). Where relevant, providers should follow best practices declared by a community, for example the Open Biological Ontologies citation policy, which supports using native object identifiers rather than creating new digital objects.
  • Data re-users should follow best practices in identifier provisioning and reference within the reused data so it is clear to downstream users what the license actually applies to.

I believe that, to be useful and sustainable, data repositories and curated knowledge bases need to clearly credit their sources and specify the terms of reuse and redistribution. Unfortunately, these resources are currently and independently making noncompatible choices about how to license their data. The reasons are multifold but often include the requirement for sustainable revenue that is counter to integrative and innovative data science.

Based on the productive discussions my collaborators and I have had with data resource providers, I propose the community work together to develop a “data trust.” In this model, database resource providers could join a collective bargaining organization (perhaps organized as a nonprofit), through which they could make their data available under compatible licensing terms. The aggregate data sources would be free and redistributable for research purposes, but they could also have commercial use terms to support research sustainability. Such a model could leverage value- or use-based revenue to incentivize resource evolution and innovation in support of emerging needs and new technologies, and would be governed by the constituent member organizations.

Casual headshot of Melissa Haendel, PhD

Melissa Haendel, PhD, leads numerous local, national, and global open science initiatives focused on semantic data integration and disease mechanism discovery and diagnosis, namely, the Monarch Initiative, the Global Alliance for Genomics and Health (GA4GH), the National Center for Data to Health (CD2H), and the NCATS Biomedical Data Translator.

Next Up for the NLM Biomedical Informatics Training Program

Guest post by Katherine Majewski, NLM Librarian.

How are librarians applying informatics?

This is the question we want to answer in re-envisioning the NLM Biomedical Informatics training program. The survey-style course, most recently hosted by Augusta University in Georgia, provided a sampling of the vast realms of informatics research and application in the health sciences. We want to build on the success of that course by targeting the specific skills and knowledge that librarians can use right now to tackle real-world challenges.

For example, Barbara Platts and her team provide clinical information services for Munson Healthcare in Traverse City, Michigan. Over the last several years, Barbara’s role at Munson has expanded into electronic health records (EHR). She now contributes to the policy and management of clinical information flow both within and outside the EHR system. As part of that effort, Barbara enhances the functionality of Munson’s EHR; increases the usable clinical content provided across multiple platforms; develops efficient knowledge management structures for hospital communities of practice; and trains hospital employees to use critical appraisal skills to find the best information services available.

How can NLM support this important work and help other librarians follow Barbara’s lead in using information tools to improve patient care?

In trying to answer that question, we’ve been exploring the connections between clinical librarians, informatics, and patient care to better understand NLM’s role. This past year we offered a webinar series entitled “Clinical Information, Librarians, and the NLM: From Health Data Standards to Better Health,” which focused on the roles and products of the National Library of Medicine related to applied clinical informatics, particularly within electronic health records systems and clinical research. We devoted one of the six sessions in the series to discussing emerging roles and training needs for aspiring informatics librarians. In conjunction with the series, we solicited interviews, visited clinical sites, and polled webinar participants to learn about the specific skills and knowledge clinical librarians are using now or will need in the future.

Along the way we heard from many librarians like Barbara who are part of the clinical information flow, though not always integrated into clinical systems as much as they would like.

We learned that librarians are:

  • working with clinical teams to improve patient care and safety by improving the efficiency and effectiveness of information delivery;
  • connecting systems to systems, bridging the divide between clinicians and information technology staff;
  • crafting information policies and practices within and between health institutions to reduce waste and redundancy and improve patient care;
  • supporting research by:
    • framing research questions,
    • informing research design methods, and
    • managing research data;
  • conducting research in text mining, artificial intelligence, and machine learning;
  • selecting and licensing content, including patient education content; and
  • educating users.

How can NLM support these current and future roles for librarians?

Underlying any work related to health information must be a strong facility with the information services NLM provides. This should not be understated or undervalued: Librarians make significant contributions to health using their knowledge of information sources and retrieval techniques, and NLM resources are at the center.

But those librarians who are managing data or making system-level connections between patients and health information need additional skills and knowledge from NLM. These fall into two general areas:

  • The ability to manage and direct access to NLM systems and data (e.g., through APIs), and
  • An understanding of the terminologies that can be used to connect systems.
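
As one illustration of what access “through APIs” can look like, here is a minimal Python sketch that searches PubMed through the NCBI E-utilities ESearch service. The post does not name a specific API, so the choice of E-utilities is an assumption made for this example, and the function names (build_esearch_url, extract_ids, search_pubmed) are hypothetical helpers rather than part of any NLM library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="pubmed", retmax=5):
    """Return an ESearch URL that requests JSON results for a search term."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


def extract_ids(response_text):
    """Pull the list of record IDs out of an ESearch JSON response."""
    payload = json.loads(response_text)
    return payload["esearchresult"]["idlist"]


def search_pubmed(term, retmax=5):
    """Run the search over the network and return matching PubMed IDs."""
    with urlopen(build_esearch_url(term, retmax=retmax)) as response:
        return extract_ids(response.read().decode("utf-8"))
```

Calling search_pubmed("clinical librarian") would return a short list of PubMed IDs as strings. The URL building and response parsing are kept separate from the network call so each piece can be checked on its own.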

What is the NLM plan for informatics training for librarians and other information professionals?

To support patient care, we are:

To support research, we will:

What about other realms of informatics?

We’re not done yet. Understanding additional areas where librarianship, informatics, and NLM intersect will require more communication with you. Look for opportunities to engage with us through the National Network of Libraries of Medicine and on our page Training on Biomedical Informatics, Data Science, and Data Management.

Katherine Majewski is a trainer, instructional designer, and technical writer for NLM products. Kate received her master’s degree in library and information science from the State University of New York at Buffalo and has worked in libraries since 1989.