Education, Health, and Basketball

Guest post by David L. Nash, NLM’s Education and Outreach Liaison.

A few weeks ago, in observance of African American History Month, five former Harlem Globetrotters spoke at a program in Silver Spring, Maryland, associated with a screening of the documentary “The Game Changers: How the Harlem Globetrotters Battled Racism.”

Following the short documentary and a brief ball-handling demonstration, we sat down to discuss our current careers and how we each got to where we are.

Those participating were:

  • David Naves of Bowie, Maryland, currently an engineer at NASA Goddard Space Flight Center;
  • Bobby Hunter from Harlem, New York, a businessman and fundraiser for charitable events, cancer awareness, and community basketball;
  • Larry Rivers from Atlanta, Georgia, who directs an organization that provides clothing, housing, career opportunities, and other services to temporarily disadvantaged people in the greater Atlanta area;
  • Charles Smith of Baltimore, Maryland, the president of a non-profit that provides a haven for urban youth to learn and enjoy sports; and
  • me, David L. Nash, NLM’s Education and Outreach Liaison.
David Nash slam dunks as a Harlem Globetrotter in the early 1970s.

As we each shared our journeys from basketball to the boardroom, we focused on messages of health and education, driving home the idea that education is the key that unlocks the door to whatever you want to be.

I spoke about my experiences as a colon cancer survivor, emphasizing the need for early screening and regular doctor’s visits. And I noted the importance of family history as a risk factor for colon cancer.

I also gave out copies of NIH MedlinePlus magazine, featuring such health topics as cancer, diabetes, and asthma.

The crowd numbered well over 600 people, about double what we expected, with many of the adults bringing along their children and grandchildren. They were receptive and attentive.

Those in attendance appreciated the focus on education and wellness, and I enjoyed working with people of color to improve their understanding of important health information.

David L. Nash serves as the Education and Outreach Liaison at the National Library of Medicine. After finishing his collegiate basketball career at the University of Kansas, he was drafted by the Chicago Bulls in the 1969 NBA Draft and played with the Harlem Globetrotters from 1970 to 1972. He has worked at NLM since 1990.

Connecting Computing Research with National Priorities

Guest post by Mark D. Hill from the Computing Community Consortium and the University of Wisconsin-Madison. The content originally appeared in The CCC Blog on January 23, 2018. It is reprinted with permission.

Jim Kurose talks to the audience about CS+X.

For weeks [The CCC Blog has] been recapping the Computing Community Consortium (CCC) Symposium from the perspective of the researchers and industry representatives who presented their work on each panel.

This week, we are getting a different perspective. The goal of the final panel, called Connecting Computing Research with National Priorities and moderated by CCC Vice Chair Mark D. Hill, was to hear from people who have served or are currently serving in government.

The panelists included:

  • Will Barkis, from Orange Silicon Valley, shared a Silicon Valley perspective and called for increased investment in basic research and development to benefit society as well as to support innovation in industry. He emphasized that collaboration among academia, the public sector, and the private sector is critical for long-term impact.
  • Patti Brennan, from the National Institutes of Health (NIH), talked about a number of healthcare issues in the country that we need to be aware of and start addressing, such as the accelerating mental health crisis. If we develop computational services and fine-grained access control, we might be able to address some of these issues sooner rather than later.
  • Jim Kurose, from the National Science Foundation (NSF), discussed smart and connected communities and how they serve the people who live in them. He also highlighted the importance of interdisciplinary work and gave the example of biologists and computer scientists coming together in the field of bioinformatics.
  • Bill Regli, from Defense Advanced Research Projects Agency (DARPA), explained the Heilmeier Catechism. George H. Heilmeier, a former DARPA director, crafted a set of questions to help Agency officials think through and evaluate proposed research programs.
Bill Regli explains the DARPA Heilmeier Catechism.

During the Q&A session, one audience member asked whether we should have computational specialists in all science fields, since many are becoming more interdisciplinary. Dr. Brennan said that if we put computation in all fields, we run the risk of losing its impact. She does think some of the training programs are a start, but it will take time for them to run smoothly. Dr. Kurose praised a number of CS+X programs around the country. These programs are trying to reach out to a different set of students: those interested in computing but currently in other disciplines. They understand that taking computational classes alongside their own discipline will open more doors.

To read all the recaps from each panel, see below:

Intelligent Infrastructure for our Cities and Communities

AI and Amplifying Human Abilities

Security and Privacy for Democracy

Data, Algorithms, and Fairness Panel

See the videos from all panels here.

Mark D. Hill is the Computing Community Consortium (CCC) Vice Chair and the John P. Morgridge Professor and Gene M. Amdahl Professor of Computer Sciences at the University of Wisconsin-Madison.

ClinicalTrials.gov Moves Toward Increased Transparency

Guest post by Kevin M. Fain, JD, MPH, DrPH, Senior Advisor for Policy and Research, ClinicalTrials.gov.

ClinicalTrials.gov is the largest public clinical research registry and results database in the world—and the most heavily used. As of today, it contains registration information for more than 260,000 studies in 202 countries and results information on more than 29,000 of those studies. Each week, the content grows by approximately 560 new registrations and 110 new results submissions. The system averages more than 162 million page views per month and 93,000 unique visitors daily.

ClinicalTrials.gov enables users to: (1) search for clinical trials of drugs, biologics, devices, and other interventions; (2) obtain summary information about these studies (e.g., purpose, design, and facility locations); (3) track the progress of a study from initiation to completion; and (4) obtain summary results, often before they are published elsewhere.

In addition, the unique ClinicalTrials.gov identifier assigned to each registered trial (commonly referred to as the “NCT Number”) has become the de facto standard for referencing trials and is widely and routinely used in medical journal articles, MEDLINE citations, and the mass media.
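Because the NCT Number is a stable, unique key, a trial’s registration record can also be retrieved programmatically. The sketch below assumes the ClinicalTrials.gov API in its current (v2) form; the endpoint and JSON field names are assumptions that should be checked against the site’s API documentation.

    import json
    import urllib.request

    # Hypothetical NCT Number used purely as a placeholder; substitute a real one.
    NCT_ID = "NCT01234567"

    # Assumed v2 endpoint: returns the full registration record as JSON.
    url = f"https://clinicaltrials.gov/api/v2/studies/{NCT_ID}"
    with urllib.request.urlopen(url) as response:
        study = json.load(response)

    # Pull a few summary fields (field names assumed from the v2 documentation).
    identification = study["protocolSection"]["identificationModule"]
    status = study["protocolSection"]["statusModule"]
    print(identification["briefTitle"])
    print(status["overallStatus"])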

Federal law underlies the ClinicalTrials.gov database requirements and content. NIH launched the database in 2000, as called for by the Food and Drug Administration Modernization Act of 1997. The FDA Amendments Act of 2007 then expanded the database’s scope and purpose by requiring registration and results reporting for certain clinical trials of FDA-regulated drugs, biological products, and medical devices. Importantly, the 2007 law included legal consequences for noncompliance, including civil monetary penalties.

More recently, in an effort to make information about clinical trials more widely available to the public, the US Department of Health and Human Services issued a final rule in September 2016 that specifies requirements for registering certain clinical trials and submitting summary results information to ClinicalTrials.gov. The rule’s final form was shaped by over 900 public comments.

The new rule, which became effective one year ago (January 18, 2017), clarifies and expands the reporting requirements for clinical trials, including trial results for drug, biologic, and device products not approved by FDA. At the same time, NIH issued a policy establishing the expectation that all investigators conducting clinical trials funded in whole or in part by NIH will ensure these trials are registered at ClinicalTrials.gov and that results information for these trials is submitted to ClinicalTrials.gov.

The expanded reporting requirements are expected to yield important scientific, medical, and public health benefits—from improving the clinical research enterprise itself to maintaining the public’s trust in clinical research. Having access to complete study results, including negative or inconclusive data, can help counteract publication bias, reduce duplication in research, improve the focus and design of future studies, and protect patients from undue risk or ineffective interventions. That additional information, in the context of other research, can also help inform health care providers and patients regarding medical decisions.

As a repository for study results, ClinicalTrials.gov helps deliver those benefits.

Recent research indicates that the results of many clinical trials—including those funded by NIH—are never published. And even when results are published, they can be limited, focusing on the findings of most interest rather than on all outcomes. In contrast, studies have found that results reported in ClinicalTrials.gov are more complete than those in the published literature. The new reporting requirements are expected to strengthen that characteristic and enhance the benefits ClinicalTrials.gov brings.

It is important to understand that listing a study on ClinicalTrials.gov does not mean it has been evaluated by the US Federal Government. The ClinicalTrials.gov website emphasizes this point for the public through prominent disclaimer statements, including one on the importance of discussing any clinical trial with a health care provider before participating.

ClinicalTrials.gov allows for the registration of any human biomedical study that conforms with prevailing laws and regulations, including an indication that recruiting studies were approved by an ethics review committee. As a result, the database is more comprehensive, which can better serve the public in critical ways. For example, potential participants can see the full range of studies being conducted, not just those funded or sponsored by NIH. Ethics committees, funders, and others can also view the wider scope of studies, which can help them more effectively oversee new research.

Aside from legislative and policy changes, ClinicalTrials.gov has also focused on enhancing the site’s usability, addressing design and layout issues and improving the ability to search, display, and review information about the studies registered on the site. The latest set of updates, released last month, included new search options (such as by recruitment status and distance from a geographic location), refinements to the display of search results, and additional information regarding study results and key record dates. These changes, plus those brought about by the final rule, will help maximize the value of clinical trials and, by extension, advance knowledge and improve health.
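For users who prefer to script such searches, the same kinds of options are exposed through the site’s public API. The following sketch is illustrative only; the parameter names (query.term, filter.overallStatus, filter.geo) are assumptions drawn from the current v2 API documentation and may change.

    import json
    import urllib.parse
    import urllib.request

    # Find recruiting diabetes trials within 50 miles of a point (lat, lon).
    # All parameter names are assumptions to check against the API docs.
    params = urllib.parse.urlencode({
        "query.term": "diabetes",
        "filter.overallStatus": "RECRUITING",
        "filter.geo": "distance(39.0,-77.1,50mi)",  # latitude, longitude, radius
        "pageSize": 5,
    })

    url = f"https://clinicaltrials.gov/api/v2/studies?{params}"
    with urllib.request.urlopen(url) as response:
        data = json.load(response)

    # Print the NCT Number of each matching study.
    for study in data["studies"]:
        print(study["protocolSection"]["identificationModule"]["nctId"])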

From finding trials actively recruiting participants to identifying new experimental drugs or device interventions to analyzing study design and results, ClinicalTrials.gov delivers key benefits to patients, clinicians, and researchers and puts into action NIH’s core mission: turning discovery into health. It also reflects one more way NLM makes medical and health information available for public use and patient health.

Kevin Fain, JD, MPH, DrPH, has served as senior advisor for policy and research at ClinicalTrials.gov since 2015. He was an attorney with the FDA from 1995 to 2010, specializing in clinical trial and drug regulatory matters. He earned his doctorate in epidemiology from Johns Hopkins University in 2015.

Exploring the Brave New World of Metagenomics

See last week’s guest post, “Adventures of a Computational Biologist in the Genome Space,” for Part 1 of Dr. Koonin’s musings on the importance of computational analysis in biomedical discovery.

While the genomic revolution rolls on, a new one has been quietly brewing over the last decade or so, only to take over the science of microbiology in the last couple of years.

The name of this new game is metagenomics.

Metagenomics is concerned with the complex communities of microbes.

Traditionally, microbes have been studied in isolation, but to do that, a microbe or virus has to be grown in a laboratory. While that might sound easy, only 0.1% of the world’s microbes will grow in artificial media, with the success rate for viruses even lower.

Furthermore, studying microbes in isolation can be somewhat misleading because they commonly thrive in nature as tightly knit communities.

Metagenomics addresses both problems by exhaustively sequencing all the microbial DNA or RNA from a given environment. This powerful, direct approach immensely expands the scope of biological diversity accessible to researchers.

But the impact of metagenomics is not just quantitative. Over and again, metagenomic studies—because they look at microbes in their natural communities and are not restricted by the necessity to grow them in culture—result in discoveries with major biological implications and open up fresh experimental directions.

In virology, metagenomics has already become the primary route to new virus discovery. In fact, in a dramatic break from tradition, such discoveries are now formally recognized by the International Committee on Taxonomy of Viruses. This decision all but officially ushers in a new era, I think.

Here is just one striking example that highlights the growing power of metagenomics.

In 2014, Rob Edwards and colleagues at San Diego State University achieved a remarkable metagenomic feat. By sequencing multiple human gut microbiomes, they managed to assemble the genome of a novel bacteriophage, named crAssphage (for cross-Assembly). They then went on to show that crAssphage is, by a wide margin, the most abundant virus associated with humans.

This discovery was both a sensation and a shock. We had been completely blind to one of the key inhabitants of our own bodies—apparently because the bacterial host of the crAssphage would not grow in culture. Thus, some of the most common microbes in our intestines, and their equally ubiquitous viruses, represent “dark matter” that presently can be studied only by metagenomics.

But the crAssphage genome was dark in more than one way.

Once sequenced, it looked like nothing in the world. For most of its genes, researchers found no homologs in sequence databases, and even the few homologs identified shed little light on the biology of the phage. Furthermore, no links to other phages could be established, nor could anyone tell which proteins formed the crAssphage particle.

Such results understandably frustrate experimenters, but computational biologists see opportunity.

A few days after the crAssphage genome was published, Mart Krupovic of Institut Pasteur visited my lab, where we attempted to decipher the genealogies and functions of the crAssphage proteins using all computational tools available to us at the time. The result was sheer disappointment. We detected some additional homologies but could not shed much light on the phage evolutionary relationships or reproduction strategy.
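For readers unfamiliar with this kind of analysis, here is a minimal sketch of its most basic form: a remote protein homology search against NCBI’s nr database using Biopython. It is an illustration only, with a made-up placeholder query; our actual analysis relied on more sensitive, profile-based methods than a single BLASTP search.

    from Bio.Blast import NCBIWWW, NCBIXML

    # Placeholder protein sequence standing in for a crAssphage gene product.
    query = (
        ">hypothetical_crassphage_protein\n"
        "MSKLDEIVNQALKRGEITREEALALLEAMEKGEITAEQAV"
    )

    # Submit a BLASTP search of the nr database via NCBI's web service.
    handle = NCBIWWW.qblast("blastp", "nr", query, expect=1e-3)
    record = NCBIXML.read(handle)

    # Report the best-scoring matches, if any homologs were detected.
    for alignment in record.alignments[:5]:
        best_hsp = alignment.hsps[0]
        print(f"{alignment.title[:60]}  E-value: {best_hsp.expect:.2e}")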

We moved on. With so many other genomes to analyze, crAssphage dropped from our radar.

Then, in April 2017, Anca Segall, a sabbatical visitor in my lab, invited Rob Edwards to give a seminar at NCBI about crAssphage. After listening to Rob’s stimulating talk—and realizing that the genome of this remarkable virus remained terra incognita—we could not resist going back to the crAssphage genome, armed with some new computational approaches and, more importantly, vastly expanded genomic and metagenomic sequence databases.

This time we got better results.

After about eight weeks of intensive computational analysis by Natalya Yutin, Kira Makarova, and me, we had fairly complete genomic maps for a vast new family of crAssphage-related bacteriophages. For all these phages, we predicted with good confidence the main structural proteins, along with those involved in genome replication and expression. Our work led to a paper we recently published in the journal Nature Microbiology. We hope and believe our findings provide a roadmap for experimental study of these undoubtedly important viruses.

Apart from the immediate importance of the crAss-like phages, this story delivers a broader lesson. Thanks to the explosive growth of metagenomic databases, the discovery of a new virus or microbe does not stop there. It brings with it an excellent chance to discover a new viral or microbial family. In addition, analyzing the gene sequences can yield interesting and tractable predictions of new biology. However, to take advantage of the metagenomic treasure trove, we must creatively apply the most powerful sequence analysis methods available, and novel ones may be required.

Put another way, if you know where and how to look, you have an excellent chance to see wonders.

As a result, I cannot help being unabashedly optimistic about the future of metagenomics. Fueled by the synergy between increasingly high-quality, low-cost sequencing, improved computational methods, and emerging high-throughput experimental approaches, the prospects appear boundless. There is a realistic chance we will know the true extent of the diversity of life on earth and get unprecedented insights into its ecology and evolution within our lifetimes. This is something to work for.

Eugene Koonin, PhD, has served as a senior investigator at NLM’s National Center for Biotechnology Information since 1996, after working for five years as a visiting scientist. He has focused on the fields of computational biology and evolutionary genomics since 1984.

Adventures of a Computational Biologist in the Genome Space

Guest post by Dr. Eugene Koonin, NLM National Center for Biotechnology Information.

More than 30 years ago, when I started my research in computational biology (yes, it has been a while), it was not at all clear that one could do biological research using computers alone. Indeed, the common perception was that real insights into how organisms function and evolve could only be gained in the lab or in the field.

As so often happens in the history of science, that all changed when a new type of data arrived on the scene. Genetic sequence information, the blueprint for building all organisms, gave computational biology a foothold it has never relinquished.

As early as the 1960s, some prescient researchers—first among them Margaret Dayhoff at Georgetown University—foresaw genetic sequences becoming a key source of biological information, but this was far from mainstream biology at the time. Through the 1980s, though, the trickle of sequences grew into a steady stream, and by the mid-1990s, the genomic revolution was upon us.

I still remember as if it were yesterday the excitement that overwhelmed me and my NCBI group in the waning days of 1995, when J. Craig Venter’s team released the first couple of complete bacterial genomes. Suddenly, the sequence analysis methods on which we and others had been working in relative obscurity had a role in trying to understand the genetic core of life. Soon after, my colleague, Arcady Mushegian, and I reconstructed a minimal cellular genome that attracted considerable attention, stimulating experiments that confirmed how accurate our purely computational effort had been.

Now, 22 years after the appearance of those first genomes, GenBank and related databases contain hundreds of thousands of genomic sequences encompassing millions of genes, and the utility and importance of computational biology are no longer a matter of debate. Indeed, biologists cannot possibly study even a sizable fraction of those genes experimentally, so, at the moment, computational analysis provides the only way to infer their biological functions.

Indeed, computational approaches have made possible many crucial biological discoveries. Two examples in which I and my NCBI colleagues have been actively involved are elucidating the architecture of the BRCA1 protein that, when impaired, can lead to breast cancer, and predicting the mode of action of CRISPR systems. Both findings sparked extensive experimentation in numerous laboratories all over the world. And, in the case of CRISPR, those experiments culminated in the development of a new generation of powerful genome-editing tools that have opened up unprecedented experimental opportunities and are likely to have major therapeutic potential.

But science does not stand still. Quite the contrary, it moves at an ever-accelerating pace and is prone to taking unexpected turns. Next week, I’ll explore one recent turn that has set us on a new path of discovery and understanding.

Eugene Koonin, PhD, has served as a senior investigator at NLM’s National Center for Biotechnology Information since 1996, after working for five years as a visiting scientist. He has focused on the fields of computational biology and evolutionary genomics since 1984.

Calling on Librarians to Help Ensure the Credibility of Published Research Results

Guest post by Jennifer Marill, Kathryn Funk, and Jerry Sheehan.

The National Institutes of Health (NIH) took a simple but significant step Friday to protect the credibility of published findings from its funded research.

NIH Guide Notice OD-18-011 calls upon NIH stakeholders to help authors of scientific journal articles adhere to the principles of research integrity and publication ethics; identify journals that follow best practices promoted by professional scholarly publishing organizations; and avoid publishing in journals that do not have a clearly stated and rigorous peer review process. The notice identifies several resources authors can consult when considering publishing venues, including Think Check Submit, a publishing industry resource, and consumer information on predatory journals from the Federal Trade Commission.

Librarians have an especially important role to play in guiding researcher-authors to high-quality journals. Librarians regularly develop and apply rigorous collection criteria when selecting journals to include in their collections and make available to their constituents. Librarians promote high-quality journals of relevance to their local communities. As a result, librarians are extremely familiar with journal publishers and the journals their constituents use for research and publication.

The National Library of Medicine (NLM) is no exception. One of NLM’s important functions is to select journals for its collection. The journal guidelines in the NLM Collection Development Manual call for journals that demonstrate good editorial quality and elements that contribute to the objectivity, credibility, and scientific quality of their content. NLM expects journals and journal publishers to conform with the guidelines and best practices promoted by professional scholarly publishing organizations, such as the recommendations of the International Committee of Medical Journal Editors and the joint statement of principles from the Committee on Publication Ethics, the Directory of Open Access Journals, the Open Access Scholarly Publishers Association, and the World Association of Medical Editors.

Criteria for accepting journals for MEDLINE or PubMed Central are even more selective, reflecting the considerable resources associated with indexing the literature and providing long-term preservation and public access to full-text literature. MEDLINE currently indexes some 5,600 journals; PubMed Central has about 2,000 journals that regularly submit their full content. PubMed Central is also the repository for the articles resulting from NIH-funded research.

For the most part, NIH-funded researchers do a good job of publishing in high-quality journals. More than 815,000 journal articles reporting on NIH-funded research have been made publicly accessible in PubMed Central since the NIH Public Access Policy became mandatory in 2008. More than 90 percent of these articles appear in journals currently indexed in MEDLINE. The remainder are distributed across thousands of journals, some 3,000 of which have only a single article in PubMed Central. While many of these are quality journals with sound editorial practices, effective peer review, and scientific merit, it can often be difficult for a researcher-author to evaluate these factors.

That’s where local librarians can be of great assistance. And many already are—helping researchers at their local institutions select publishing venues.

If you have a good practice in your library, let us know about it so we can all learn how best to protect the credibility of published research results.

Jennifer Marill serves as chief of NLM’s Technical Services Division and as the Library’s collection development officer. Kathryn Funk is a program manager and librarian for PubMed Central. And Jerry Sheehan is the Library’s deputy director.

Mining for Treasure, Discovering MEDLINE

Reusing a noteworthy dataset to great effect

Guest post by Joyce Backus and Kathel Dunn, both from NLM’s Division of Library Operations.

As shrinking budgets tighten belts at hospitals and academic institutions, medical libraries have come under scrutiny. In response, librarians have had to articulate the value they bring to the institution and to the customers—students, researchers, clinicians, or patients—they serve.

In 2011-2012, as such scrutiny swelled, Joanne Marshall and her team set out to study the very question these medical institutions faced: Do libraries add value? They collected 16,122 individual responses from health professionals at 118 hospitals served by 56 health libraries in the United States and Canada. The team sought to determine whether physicians, residents, and nurses perceived their libraries’ information resources as valuable and whether the information obtained impacted patient care.

The resulting article, “The Value of Library and Information Services in Patient Care,” published in 2013, gave medical librarians strong talking points, including the overall perceived value of libraries as time-savers that positively impact patient care.

Now the datasets from that study are being reused to great effect.

Over the last year we teamed up with Joanne Marshall and Amber Wells, both from the University of North Carolina-Chapel Hill, to dive into the data.

Our goal: to understand the value and impact of MEDLINE in medical libraries.

We rediscovered (as has been written about before) the value of MEDLINE in changing patient care. We also found that its preeminent role shines even more brightly in a dataset like this one, which includes other sources. We saw the significance of MEDLINE as a single source of information, but also as a source used in combination with full-text journals, books, drug databases, websites, and colleague consultations.

We were reminded, too, of the importance of the National Network of Libraries of Medicine (NNLM) to our work; the trust in the NNLM; each library’s connectedness to the other; and how the everyday web of relationships prompts cooperation and collaboration, including the successful implementation of the value of libraries study itself.

For us, this rediscovery comes at a key time, when we’re examining NLM products and services as part of the strategic planning process. We are actively identifying methodologies and tools to elevate all our collections—from datasets to incunabula—and make them greater engines of discovery in service of health.

But what about your library’s resources?

The data mining challenge we gave ourselves can serve as a guide for medical librarians everywhere: look at your own data, what’s in front of you, and then at others’ data. What can they tell you about what’s happening now, what will likely happen in the future, what’s being used, and how it’s being used?
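One modest way to start is with data that are already public. The sketch below, offered as an illustration rather than a prescription, uses NCBI’s E-utilities service to count MEDLINE-indexed articles per year on a topic; the search term is a made-up example to adapt to your own question.

    import json
    import urllib.parse
    import urllib.request

    # NCBI E-utilities search endpoint (documented public service).
    BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

    # Count MEDLINE-indexed articles per year matching an illustrative topic.
    for year in range(2010, 2014):
        term = f"library services[Title/Abstract] AND medline[sb] AND {year}[pdat]"
        params = urllib.parse.urlencode({"db": "pubmed", "term": term, "retmode": "json"})
        with urllib.request.urlopen(f"{BASE}?{params}") as response:
            result = json.load(response)
        print(year, result["esearchresult"]["count"])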

If you don’t know where to start, check out the Medical Library Association’s Research Training Institute, recommended research skills, and mentoring program. In addition, the NNLM’s site on program evaluation includes tools for determining cost benefit and return on investment.

Librarians positively impact health care and health care research. Now it’s time to have that same impact on our own profession. The data are there. It’s time we see what they have to tell us.

More information

Value of Library and Information Services in Patient Care Study

References

Lindberg DA, Siegel ER, Rapp BA, Wallingford KT, Wilson SR. Use of MEDLINE by physicians for clinical problem solving. JAMA. 1993;269:3124-9.

Demner-Fushman D, Hauser SE, Humphrey SM, Ford GM, Jacobs JL, Thoma GR. MEDLINE as a source of just-in-time answers to clinical questions. AMIA Annu Symp Proc. 2006:190-4.

Sneiderman CA, Demner-Fushman D, Fiszman M, Ide NC, Rindflesch TC. Knowledge-based methods to help clinicians find answers in MEDLINE. J Am Med Inform Assoc. 2007 Nov-Dec;14(6):772-80.


Joyce Backus serves as the associate director for Library Operations at NLM. Kathel Dunn is the NLM Associate Fellowship coordinator.

Photo credit (ammonite, top): William Warby [Wikimedia Commons (CC BY 2.0)]