Calling on Librarians to Help Ensure the Credibility of Published Research Results

Guest post by Jennifer Marill, Kathryn Funk, and Jerry Sheehan.

The National Institutes of Health (NIH) took a simple but significant step Friday to protect the credibility of published findings from its funded research.

NIH Guide Notice OD-18-011 calls upon NIH stakeholders to help authors of scientific journal articles adhere to the principles of research integrity and publication ethics; identify journals that follow best practices promoted by professional scholarly publishing organizations; and avoid publishing in journals that do not have a clearly stated and rigorous peer review process. The notice identifies several resources authors can consult when considering publishing venues, including Think. Check. Submit., a publishing industry resource, and consumer information on predatory journals from the Federal Trade Commission.

Librarians have an especially important role to play in guiding researcher-authors to high-quality journals. Librarians regularly develop and apply rigorous criteria when selecting journals for their collections and making them available to their constituents. They promote high-quality journals of relevance to their local communities. As a result, librarians are extremely familiar with journal publishers and with the journals their constituents use for research and publication.

The National Library of Medicine (NLM) is no exception. One of NLM’s important functions is to select journals for its collection. The journal guidelines in the NLM Collection Development Manual call for journals that demonstrate good editorial quality and elements that contribute to the objectivity, credibility, and scientific quality of their content. NLM expects journals and journal publishers to conform to guidelines and best practices promoted by professional scholarly publishing organizations, such as the recommendations of the International Committee of Medical Journal Editors and the joint statement of principles of the Committee on Publication Ethics, the Directory of Open Access Journals, the Open Access Scholarly Publishers Association, and the World Association of Medical Editors.

Criteria for accepting journals for MEDLINE or PubMed Central are even more selective, reflecting the considerable resources associated with indexing the literature and providing long-term preservation and public access to full-text literature. MEDLINE currently indexes some 5,600 journals; PubMed Central has about 2,000 journals that regularly submit their full content. PubMed Central is also the repository for the articles resulting from NIH-funded research.

For the most part, NIH-funded researchers do a good job of publishing in high-quality journals. More than 815,000 journal articles reporting on NIH-funded research have been made publicly accessible in PubMed Central since the NIH Public Access policy became mandatory in 2008. More than 90 percent of these articles are published in journals currently indexed in MEDLINE. The remainder are distributed across thousands of journals, some 3,000 of which have only a single article in PubMed Central. While many are quality journals with sound editorial practices, effective peer review, and scientific merit, it can often be difficult for a researcher-author to evaluate these factors.

That’s where local librarians can be of great assistance. And many already are—helping researchers at their local institutions select publishing venues.

If you have a good practice in your library, let us know about it so we can all learn how best to protect the credibility of published research results.

Jennifer Marill serves as chief of NLM’s Technical Services Division and the Library’s collection development officer. Kathryn Funk is a program manager and librarian for PubMed Central. And Jerry Sheehan is the Library’s deputy director.

Mining for Treasure, Discovering MEDLINE

Reusing a noteworthy dataset to great effect

Guest post by Joyce Backus and Kathel Dunn, both from NLM’s Division of Library Operations.

As shrinking budgets tighten belts at hospitals and academic institutions, medical libraries have come under scrutiny. In response, librarians have had to articulate the value they bring to the institution and to the customers—students, researchers, clinicians, or patients—they serve.

In 2011-2012, as such scrutiny swelled, Joanne Marshall and her team set out to study the very question these medical institutions faced: Do libraries add value? They collected 16,122 individual responses from health professionals at 118 hospitals served by 56 health libraries in the United States and Canada. The team sought to determine whether physicians, residents, and nurses perceived their libraries’ information resources as valuable and whether the information obtained impacted patient care.

The resulting article, “The Value of Library and Information Services in Patient Care,” published in 2013, gave medical librarians strong talking points, including the overall perceived value of libraries as time-savers that positively impact patient care.

Now the datasets from that study are being reused to great effect.

Over the last year we teamed up with Joanne Marshall and Amber Wells, both from the University of North Carolina-Chapel Hill, to dive into the data.

Our goal: to understand the value and impact of MEDLINE in medical libraries.

We re-discovered (as has been written about before) the value of MEDLINE in changing patient care. We also found that its preeminent role shines even more brightly in a dataset like this one, which includes other information sources. We saw the significance of MEDLINE as a single source of information but also as a source used in combination with full-text journals, books, drug databases, websites, and colleague consultations.

We were reminded, too, of the importance of the National Network of Libraries of Medicine (NNLM) to our work; the trust in the NNLM; each library’s connectedness to the others; and how the everyday web of relationships prompts cooperation and collaboration, including the successful implementation of the value of libraries study itself.

For us this re-discovery comes at a key time, when we’re examining NLM products and services as part of the strategic planning process. We are actively identifying methodologies and tools to elevate all our collections—from datasets to incunabula—and make them greater engines of discovery in service of health.

But what about your library’s resources?

The data mining challenge we gave ourselves is our guide for medical librarians everywhere: look at your data, what’s in front of you, and then others’ data. What can they tell you about what’s happening now, what will likely happen in the future, what’s being used, and how it’s being used?

If you don’t know where to start, check out the Medical Library Association’s Research Training Institute, recommended research skills, and mentoring program. In addition, the NNLM’s site on program evaluation includes tools for determining cost benefit and return on investment.

Librarians positively impact health care and health care research. Now it’s time to have that same impact on our own profession. The data are there. It’s time we see what they have to tell us.

More information

Value of Library and Information Services in Patient Care Study

References

Lindberg DA, Siegel ER, Rapp BA, Wallingford KT, Wilson SR. Use of MEDLINE by physicians for clinical problem solving. JAMA. 1993;269:3124-9.

Demner-Fushman D, Hauser SE, Humphrey SM, Ford GM, Jacobs JL, Thoma GR. MEDLINE as a source of just-in-time answers to clinical questions. AMIA Annual Symposium Proceedings. 2006:190-4.

Sneiderman CA, Demner-Fushman D, Fiszman M, Ide NC, Rindflesch TC. Knowledge-based methods to help clinicians find answers in MEDLINE. Journal of the American Medical Informatics Association. 2007 Nov-Dec;14(6):772-80.


Joyce Backus serves as the associate director for Library Operations at NLM. Kathel Dunn is the NLM Associate Fellowship coordinator.

Photo credit (ammonite, top): William Warby [Wikimedia Commons (CC BY 2.0)]

Addressing Health Disparities to the Benefit of All

Guest post by Lisa Lang, head of NLM’s National Information Center on Health Services Research and Health Care Technology

Singer-actress Selena Gomez shocked her fans this past September with the announcement that she had received a kidney transplant to combat organ damage caused by lupus.

Lupus, an autoimmune condition, strikes women far more often than men, with minority women especially vulnerable. Not only is lupus two to three times more common in African American women than in Caucasian women, but recent CDC-funded studies suggest that Hispanic and non-Hispanic Asian women are, like Ms. Gomez, more likely to develop lupus-related kidney disease (lupus nephritis)—a potentially fatal complication.

Documenting such health disparities is crucial to understanding and addressing them. Significantly, the studies mentioned above are the first US registries with enough Asian and Hispanic participants to measure how many people in these populations are diagnosed with lupus.

Investment in research examining potential solutions for health care disparities is essential.

In 2014, The Lancet featured a study that examined patterns, gaps, and directions of health disparity and equity research. Jointly conducted by the Association of American Medical Colleges and AcademyHealth, a non-profit dedicated to enhancing and promoting health services research and a long-time NLM partner, the study examined changes in US investments in health equity and disparities research over time. Using abstracts in the NLM database HSRProj (Health Services Research Projects in Progress), the researchers found an overall shift in disparities-focused projects: from 2007 to 2011, health services research studies seeking to document specific disparities gave way to studies examining how best to alleviate such disparities. In fact, over half of the disparities-focused health services research funded in 2011 “aimed to reduce or eliminate a documented inequity.” The researchers also found significant differences in the attention given to particular conditions, groups, and outcomes. An update by AcademyHealth (publication forthcoming) found these differences continue in more recently funded HSR projects.

A more nuanced appreciation of affected groups is also critical to addressing health disparities. For example, the designation “Hispanic” is an over-simplification, an umbrella construct that obscures potentially important cultural, environmental, and even genetic differences we must acknowledge and appreciate if we are to maximize the benefits promised by personalized medicine. Reviews such as “Hispanic health in the USA: a scoping review of the literature” and “Controversies and evidence for cardiovascular disease in the diverse Hispanic population” highlight questions and conditions that would be informed by richer, more granular data.

Lupus is one such condition. Research into the disease’s prevalence and impact among Hispanics is underway, but more attention may be warranted. Almost 100 active US clinical studies targeting lupus are currently listed in ClinicalTrials.gov; of these, 15 address lupus nephritis. And while about 5% of ongoing or recently completed projects in the HSRProj database explicitly focus on Hispanic populations, only one, funded by the Patient-Centered Outcomes Research Institute, specifically addresses lupus. (You can see this study’s baseline measures and results on ClinicalTrials.gov.)

Perhaps a celebrity like Ms. Gomez publicly discussing her experience with lupus will spark more attention from both researchers and the public seeking to contribute to knowledge and cures.

After all, we are all both fundamentally unique and alike. Reducing—or better yet, eliminating—health disparities benefits us all.

Guest blogger Lisa Lang is Assistant Director for Health Services Research Information and also Head of NLM’s National Information Center on Health Services Research and Health Care Technology (NICHSR).

Photo credit (The Scales of Justice, top): Darius Norvilas [Flickr (CC BY-NC 2.0)] | altered background

The Rise of Computational Linguistics Geeks

Guest post by Dina Demner-Fushman, MD, PhD, staff scientist at NLM.

“So, what do you do for a living?”

It’s a natural dinner party question, but my answer can prompt that glazed-over look we all dread.

I am a computational linguist, also known (arguably) as a specialist in natural language processing (NLP), and I work at the National Library of Medicine.

If I strike the right tone of excitement and intrigue, I might buy myself a few minutes to explain.

My work combines computer science and linguistics, and since I focus on biomedical and clinical texts, it also requires adding some biological, medical, and clinical know-how to the mix.

I work specifically in biomedical natural language processing (BioNLP). The definition of BioNLP has varied over the years, with the spotlight shifting from one task to another—from text mining to literature-based discovery to pharmacovigilance, for example—but the core purpose has remained essentially unchanged: training computers to automatically understand natural language to speed discovery, whether in service of research, patient care, or public health.

The field has been around for a while. In 1969 NIH researchers Pratt and Pacak described the early hope for what we now call BioNLP in the paper, “Automated processing of medical English,” which they presented at a computational linguistics conference:

The development of a methodology for machine encoding of diagnostic statements into a file, and the capability to retrieve information meaningfully from [a] data file with a high degree of accuracy and completeness, is the first phase towards the objective of processing general medical text.

NLM became involved in the field shortly thereafter, first with the Unified Medical Language System (UMLS) and later with tools to support text processing, such as MetaMap and TextTool, all of which we’ve improved and refined over the years. The more recent Indexing Initiative combines these tools with other machine learning methods to automatically apply MeSH terms to PubMed journal articles. (A human checks the computer’s work, revising as needed.)
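For the code-minded, here is a deliberately tiny Python sketch of the core idea behind concept-mapping tools like MetaMap: scanning free text for phrases that match a controlled vocabulary. The concept table and IDs are invented, and this is not MetaMap’s actual interface; the real UMLS, and MetaMap’s linguistic analysis, are far richer.

```python
# A toy sketch of dictionary-based concept mapping: scan text for the
# longest phrase that matches a controlled vocabulary. The CONCEPTS
# table and IDs below are invented for illustration; they are not real
# UMLS data, and this is not MetaMap's actual interface.

CONCEPTS = {
    "heart attack": "C-0001",
    "myocardial infarction": "C-0001",  # same concept, different wording
    "aspirin": "C-0002",
}

def map_concepts(text: str) -> list[tuple[str, str]]:
    """Return (phrase, concept_id) pairs found in text, preferring longer matches."""
    tokens = text.lower().split()
    found = []
    i = 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):  # try the longest span first
            phrase = " ".join(tokens[i:j])
            if phrase in CONCEPTS:
                found.append((phrase, CONCEPTS[phrase]))
                i = j - 1  # skip past the matched span
                break
        i += 1
    return found

print(map_concepts("aspirin after a heart attack reduces mortality"))
# [('aspirin', 'C-0002'), ('heart attack', 'C-0001')]
```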

These and NLM’s other NLP developments help improve the Library’s services, but they are also freely shared with the world, broadening our impact and, more importantly, helping to handle the global proliferation of scientific and clinical text.

It’s that last piece that makes NLP so hot right now.

NLP, we’re finding, can take in large numbers of documents and locate relevant content, summarize text, apply appropriate descriptors, and even answer questions.

It’s every librarian’s—and every geek’s—dream.

But how can we use it?

Imagine, for example, the ever-expanding volume of health information around patients’ adverse reactions to medications. At least four different—and prolific—content streams feed into that pool of information:

  • the reactions reported in the literature, frequently in pre-market research (e.g., in the results of clinical trials);
  • the labeled reactions, i.e., the reactions described in the official drug labels provided by manufacturers;
  • the reactions noted in electronic health records and clinical progress notes; and
  • the reactions described by patients in social media.

NLM’s work in NLP—and its funding of extramural research in NLP—is helping develop approaches and resources for extracting and synthesizing adverse drug reactions from all four streams, giving a more complete picture of how people across the spectrum are responding to medications.

It’s a challenging task. Researchers must address different vocabularies and language structures to extract the information, but NLP, and my fellow computational linguists, will, I predict, prove up to it.
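To make one sub-problem concrete, the hedged sketch below normalizes differently worded reaction mentions from the four streams to a shared term before combining them. The synonym table and stream data are invented; real systems map mentions to standard terminologies such as MedDRA.

```python
# An invented, simplified look at one sub-problem: the same adverse
# reaction appears under different wordings in different streams, so
# mentions must be normalized to a shared term before they can be
# combined. Real systems map mentions to standard terminologies
# (e.g., MedDRA); this table is illustrative only.

from collections import Counter

SYNONYMS = {
    "insomnia": "insomnia",                  # literature wording
    "sleeplessness": "insomnia",             # drug-label wording
    "couldn't sleep": "insomnia",            # patient wording, social media
    "nausea": "nausea",
    "felt sick to my stomach": "nausea",
}

def normalize(mentions: list[str]) -> Counter:
    """Map raw reaction mentions to canonical terms and count them."""
    return Counter(SYNONYMS[m.lower()] for m in mentions if m.lower() in SYNONYMS)

streams = {
    "literature":     ["insomnia", "nausea"],
    "drug_labels":    ["sleeplessness"],
    "health_records": ["insomnia"],
    "social_media":   ["couldn't sleep", "felt sick to my stomach"],
}

combined = Counter()
for source, mentions in streams.items():
    combined += normalize(mentions)

print(combined)  # Counter({'insomnia': 4, 'nausea': 2})
```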

Now imagine parents seeking health information regarding their sick child.

NLP can answer their question, first by understanding key elements in the incoming question and then by providing a response, either by drawing upon a database of known answers (e.g., FAQs maintained by the NIH institutes) or by summarizing relevant PubMed or MedlinePlus articles. Such quick access to accurate and trustworthy health information has the potential to save time and to save lives.
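A minimal sketch of the first, database-backed approach might look like this: match the incoming question against stored FAQ questions and return the best answer. The FAQ entries here are invented, and production systems rely on far stronger semantic matching than simple word overlap.

```python
# A bare-bones sketch of FAQ-style question answering: pick the stored
# question that shares the most words with the incoming one and return
# its answer. The FAQ entries are invented for illustration.

FAQS = {
    "what are the symptoms of strep throat":
        "Common symptoms include sore throat, fever, and swollen lymph nodes.",
    "how is asthma treated in children":
        "Treatment usually combines quick-relief inhalers with long-term controller medicines.",
}

def answer(question: str) -> str:
    """Return the answer whose stored question overlaps the query the most."""
    query_words = set(question.lower().split())
    best_match = max(FAQS, key=lambda stored: len(query_words & set(stored.split())))
    return FAQS[best_match]

print(answer("What symptoms does strep throat cause?"))
# Common symptoms include sore throat, fever, and swollen lymph nodes.
```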

We’re not fully there yet, but as our research continues, we get closer.

Maybe it’s time I reconsider how I answer that perennial dinner party question: “I’m a computational linguist, and I help improve health.”

Dina Demner-Fushman, MD, PhD, is a staff scientist in NLM’s Lister Hill National Center for Biomedical Communications. She leads research in information retrieval and natural language processing focused on clinical decision-making, answering clinical and consumer health questions, and extracting information from clinical text.

Larry Weed’s Legacy and the Next Generation of Clinical Decision Support

Guest post by Lincoln Weed, son of the late Dr. Lawrence L. Weed and co-author with him of the book Medicine in Denial and other publications. Dr. Weed, who died June 3, 2017, was the originator of “knowledge coupling” tools for clinical decision support and the problem-oriented medical record, including its problem list and SOAP note components.

“Patients are sitting on a treasure trove of data about their own medical conditions.”

My late father, Dr. Lawrence L. Weed (LLW), made this point the day before he died. He was talking about the lost wealth of neglected patient data—readily available, richly detailed data that too often go unidentified and unexamined. Why does that happen, and what can be done about it?

The risk of missed information

From the very outset of medical problem-solving, LLW argued, patients and practitioners face greater risk of loss and harm than they may realize. The risk arises as soon as a patient starts an internet search about a medical problem, or as soon as a practitioner starts questioning the patient about the problem (whether diagnostic or therapeutic).

Ideally, these initial inquiries would somehow take into account the entire universe of collectible patient data and vast medical knowledge about what the data mean. But such thoroughness is more than the human mind can deliver.

This gap creates high risk that information crucial to solving the patient’s problem will be missed. And whatever information the mind does deliver is not recorded and harvested in a manner that permits organized feedback and continuous improvement.

Guidance tools set standard of care

The only secure way to proceed, LLW concluded, is to begin investigation of medical problems (the “initial workup”) using guidance tools external to the mind. These tools must couple patient-specific data with general knowledge as follows:

  • Link the initial data point (i.e., the patient’s presenting problem) with (1) medical knowledge about potentially relevant options and (2) readily available data for identifying those options (see the outer circle in the diagram below);
  • Link the data in (2), once collected, with the knowledge in (1) to show how well the data match up with the combinations of data points defining each relevant option—this matching indicates which options are worth considering for the individual (see the middle circle in the diagram below); and
  • Organize this information (data coupled with knowledge) into options and evidence—that is, diagnostic possibilities or therapeutic alternatives, the combined findings (positive, negative, or uncertain) on each alternative, and additional knowledge useful for assessing the best option to pursue (see the inner circle in the diagram below).
[Diagram: three concentric circles showing (outer circle) potentially relevant options; (middle circle) options worth investigating; and (inner circle) the best options for this individual.]
For further explanation of the above diagram, see pp. 72-74 of the book Medicine in Denial.
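For readers who think in code, here is a speculative Python sketch of the matching logic the three bullets above describe. The option names and findings are invented placeholders; this illustrates only the structure, not real medical content, and LLW’s actual knowledge-coupling tools encoded vastly richer knowledge.

```python
# A speculative sketch of the matching step above, with invented medical
# content. Each option is defined by a combination of findings; the
# patient's data are scored against every option, yielding positive,
# negative, and uncertain evidence (the inner circle's "options and
# evidence" view).

OPTIONS = {
    "option A": {"fatigue", "pallor", "low ferritin"},
    "option B": {"fatigue", "weight gain", "cold intolerance"},
}

def couple(present: set[str], checked: set[str]) -> dict[str, dict[str, set[str]]]:
    """Split each option's defining findings into positive (present),
    negative (checked but absent), and uncertain (not yet checked)."""
    report = {}
    for option, expected in OPTIONS.items():
        report[option] = {
            "positive": expected & present,
            "negative": (expected & checked) - present,
            "uncertain": expected - checked,
        }
    return report

findings_present = {"fatigue", "pallor"}
findings_checked = {"fatigue", "pallor", "weight gain"}
for option, evidence in couple(findings_present, findings_checked).items():
    print(option, evidence)
```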

Tools to carry out these steps would define best practices and make them enforceable as high standards of care for the initial workup (i.e., patient history, physical exam, and basic lab tests). That threshold task is pivotal. It lays the informational foundation for follow-up thought and action by the patient and practitioner. That foundation is also needed for feedback activities to and from third parties. (See the diagram on p. 13 of Medicine in Denial.)

Patient-driven tools

In carrying out the initial workup, the patient’s role is always central. The tools should enable patients to enter history data, which is often the most detailed component of the initial workup. Moreover, the patient necessarily participates in the physical exam conducted by the practitioner, and reviews history, physical, and lab findings with the practitioner.

Tools for the initial workup must thus be used by patients and practitioners jointly. But patients must be able to initiate use of the tools unilaterally. They can’t rely on practitioners to recognize when serious medical investigation is needed. Patients are the ones who experience symptoms—who notice changes from what feels normal. To investigate whether these symptoms might be medically significant, patients need web-based tools for problem-specific inquiries. So do healthy persons who may simply require periodic screening checkups for unidentified problems (plus initial workup of any problems discovered).

Overcoming the medical Tower of Babel

Whether it is patients or practitioners seeking guidance for the initial workup, traditional medical practice leaves them both in a vacuum. Once that vacuum was filled solely by practitioners’ idiosyncratic judgments. Now the vacuum is also being filled with a plethora of practice guidelines and clinical decision support tools, not to mention internet search engine tools.

But the very multiplicity of all these resources defeats the purpose of defining generally accepted, enforceable best practices for initial workups. And the multiplicity is increasing with new patient-generated health data from sensors, wearables, and smartphone-connected physical exam devices. Moreover, the universe of needed guidance is expanding with vast new genomic and molecular data and knowledge.

The outcome of this multiplicity is not useful diversity but a Tower of Babel.

What we need instead are information tools with a unified design and trustworthy medical content, tools that guide users through the basic steps for inquiry into all medical problems, tools that take into account relevant information from all specialties without intellectual or financial biases. Users should not have to switch back and forth among different tools and interfaces for different medical problems, different specialties, different practice settings, different data types, different vendors, and different classes of users. The medical content captured in the tools must be problem-specific, but the tools’ basic design (see the three bullets above) should generalize to all problems in all contexts, as much as possible. This generality enables intuitive ease-of-use at the user level and powerful synergies at the software development level.

NLM’s role for the 21st century

LLW saw NLM as key to developing tools of this kind.

Drawing on its uniquely comprehensive electronic repository of medical content, NLM could create a new repository of distilled, structured knowledge. Drawing on its connections with the NIH research institutes and federal health agencies such as the CDC and FDA, NLM could rapidly incorporate new knowledge into that specialized repository. Outside parties and NLM itself could use that repository to build user-level tools with a unified design for conducting initial workups on specific medical problems.

By enabling creation of such a knowledge infrastructure for the public, NLM would seize an “opportunity to modernize the conceptualization of a ‘library.’” Beyond its current electronic repository, NLM could be “demonstrating how information and knowledge can best be developed, assimilated, organized, applied, and disseminated in the 21st century.” [NIH Advisory Committee to the Director, NLM Working Group, Final Report, p. 12 (June 11, 2015).]

This new infrastructure will encounter a barrier to its use—the medical practice status quo. Not all practitioners (or their overseers) will accept the data collection demands defined by the tools.

Patients at the center

Here we return to the central role of patients.

Patients who unilaterally use NLM tools to complete the history portion of the initial workup can then seek out practitioners who are willing (and permitted) to use the same tools for the physical exam and basic lab test portions. By creating demand for those innovative practitioners and using the tools jointly with them, patients can drive medical practice toward a foundational reform.

* * *

Readers who have questions about the above are referred to the fuller discussion of these ideas in the book Medicine in Denial (PDF | published work), especially parts IV.E, F, and G, pages 192-194, and the diagram on page 13. The author also invites comments below.


Lincoln Weed, JD, Dr. Lawrence Weed’s son, practiced employee benefits law in Washington, DC for 26 years. He then joined a consulting firm where he specialized in health information privacy. He is now retired.