Education, Health, and Basketball

Guest post by David L. Nash, NLM’s Education and Outreach Liaison.

A few weeks ago, in observance of African American History Month, five former Harlem Globetrotters spoke at a program in Silver Spring, Maryland associated with a screening of the documentary “The Game Changers: How the Harlem Globetrotters Battled Racism.”

Following the short documentary and a brief ball-handling demonstration, we sat down to discuss our current careers and how we each got to where we are.

Those participating were:

  • David Naves of Bowie, Maryland, currently an engineer at NASA Goddard Space Flight Center;
  • Bobby Hunter from Harlem, New York, a businessman and fundraiser for charitable events, cancer awareness, and community basketball;
  • Larry Rivers from Atlanta, Georgia, who directs an organization that provides clothing, housing, career opportunities, and other services to temporarily disadvantaged people in the greater Atlanta area;
  • Charles Smith of Baltimore, Maryland, the president of a non-profit that provides a haven for urban youth to learn and enjoy sports; and
  • me, David L. Nash, NLM’s Education and Outreach Liaison.
David Nash in action as a Harlem Globetrotter in the early 1970s.

As we each shared our journeys from basketball to the boardroom, we focused on messages of health and education, driving home the idea that education is the key that unlocks the door to whatever you want to be.

I spoke about my experiences as a colon cancer survivor, emphasizing the need for early screening and regular doctor’s visits. And I noted the importance of family history as a risk factor for colon cancer.

I also gave out copies of NIH MedlinePlus magazine, which features such health topics as cancer, diabetes, and asthma.

The crowd numbered well over 600 people, about double what we expected, with many of the adults bringing along their children and grandchildren. They were receptive and attentive.

Those in attendance appreciated the focus on education and wellness, and I enjoyed working with people of color to improve their understanding of important health information.

David L. Nash serves as the Education and Outreach Liaison at the National Library of Medicine. After finishing his collegiate basketball career at the University of Kansas, he was drafted by the Chicago Bulls in the 1969 NBA Draft and played with the Harlem Globetrotters from 1970-72. He has worked at NLM since 1990.

Connecting Computing Research with National Priorities

Guest post by Mark D. Hill from the Computing Community Consortium and the University of Wisconsin-Madison. The content originally appeared in The CCC Blog on January 23, 2018. It is reprinted with permission.

Jim Kurose talks to the audience about CS+X.

For weeks [The CCC Blog has] been recapping the Computing Community Consortium (CCC) Symposium from the perspective of the researchers and industry representatives who presented their work on each panel.

This week, we are getting a different perspective. The goal of the final panel, called "Connecting Computing Research with National Priorities" and moderated by CCC Vice Chair Mark D. Hill, was to hear from people who have served in or are currently serving in government.

The panelists included:

  • Will Barkis, from Orange Silicon Valley, shared a Silicon Valley perspective and called for increasing investment in basic research and development to benefit society as well as support innovation in industry. He emphasized that collaboration between academia, the public sector, and the private sector is critical for long-term impact.
  • Patti Brennan, from the National Institutes of Health (NIH), talked about a number of healthcare issues in the country that we need to be aware of and start addressing, such as the accelerating mental health crisis. If we develop computational services and fine-grained access control, we might be able to address some of these issues sooner rather than later.
  • Jim Kurose, from the National Science Foundation (NSF), discussed smart and connected communities and how they serve the people who live in them. He also highlighted the importance of interdisciplinary work and gave the example of biologists and computer scientists coming together in the field of bioinformatics.
  • Bill Regli, from the Defense Advanced Research Projects Agency (DARPA), explained the Heilmeier Catechism. George H. Heilmeier, a former DARPA director, crafted a set of questions to help agency officials think through and evaluate proposed research programs.
Bill Regli explains the DARPA Heilmeier Catechism.

During the Q&A session, one audience member asked whether we should have computational specialists in all science fields, since many are becoming more interdisciplinary. Dr. Brennan cautioned that if we put computation in all fields, we run the risk of diluting its impact. She does think some of the current training programs are a start, but it will take time for them to run smoothly. Dr. Kurose praised a number of CS+X programs around the country. These programs are reaching out to students who are interested in computing but are currently in other disciplines, students who understand that taking computational classes within their own fields will open more doors for them.

To read all the recaps from each panel, see below:

Intelligent Infrastructure for our Cities and Communities

AI and Amplifying Human Abilities

Security and Privacy for Democracy

Data, Algorithms, and Fairness Panel

See the videos from all panels here.

Mark D. Hill is the Computing Community Consortium (CCC) Vice Chair and the John P. Morgridge Professor and Gene M. Amdahl Professor of Computer Sciences at the University of Wisconsin-Madison.

ClinicalTrials.gov Moves Toward Increased Transparency

Guest post by Kevin M. Fain, JD, MPH, DrPH, Senior Advisor for Policy and Research, ClinicalTrials.gov.

ClinicalTrials.gov is the largest public clinical research registry and results database in the world—and the most heavily used. As of today, it contains registration information for more than 260,000 studies in 202 different countries and results information on more than 29,000 of those studies. Each week, the content grows by approximately 560 new registrations and 110 new results submissions. The system averages more than 162 million page views per month and 93,000 unique visitors daily.

ClinicalTrials.gov enables users to: (1) search for clinical trials of drugs, biologics, devices, and other interventions; (2) obtain summary information about these studies (e.g., purpose, design, and facility locations); (3) track the progress of a study from initiation to completion; and (4) obtain summary results, often before they are published elsewhere.

In addition, the unique identifier assigned to each registered trial (commonly referred to as the “NCT Number”) has become the de facto standard for referencing trials and is widely and routinely used in medical journal articles, MEDLINE citations, and the mass media.
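Because the NCT Number has a fixed shape (the letters "NCT" followed by eight digits), pulling trial identifiers out of citation text is straightforward. Here is a minimal Python sketch; the identifiers in the example are invented for illustration:

```python
import re

# An NCT Number is the string "NCT" followed by eight digits.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b")

def extract_nct_ids(text):
    """Return all NCT identifiers found in a block of text, in order."""
    return NCT_PATTERN.findall(text)

citation = ("Results were reported for trial NCT01234567; "
            "see also the companion study NCT07654321.")
print(extract_nct_ids(citation))  # ['NCT01234567', 'NCT07654321']
```

A pattern like this is how reference managers and text-mining pipelines can link journal articles and MEDLINE citations back to their registry records.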

Federal law underlies the database's requirements and content. NIH launched the database in 2000, following passage of the Food and Drug Administration Modernization Act of 1997. The FDA Amendments Act of 2007 then expanded the database's scope and purpose by requiring registration and results reporting for certain clinical trials of FDA-regulated drugs, biological products, and medical devices. Importantly, the 2007 law included legal consequences for noncompliance, including civil monetary penalties.

More recently, in an effort to make information about clinical trials more widely available to the public, the US Department of Health and Human Services issued a final rule in September 2016 that specifies requirements for registering certain clinical trials and submitting summary results information to ClinicalTrials.gov. The rule's final form was shaped by over 900 public comments.

The new rule, which became effective one year ago (January 18, 2017), clarifies and expands the reporting requirements for clinical trials, including trial results for drug, biologic, and device products not approved by the FDA. At the same time, NIH issued a policy establishing the expectation that all investigators conducting clinical trials funded in whole or in part by NIH will ensure these trials are registered at ClinicalTrials.gov and that results information for these trials is submitted to the database.

The expanded reporting requirements are expected to yield important scientific, medical, and public health benefits—from improving the clinical research enterprise itself to maintaining the public’s trust in clinical research. Having access to complete study results, including negative or inconclusive data, can help counteract publication bias, reduce duplication in research, improve the focus and design of future studies, and protect patients from undue risk or ineffective interventions. That additional information, in the context of other research, can also help inform health care providers and patients regarding medical decisions.

As a repository for study results, ClinicalTrials.gov helps deliver those benefits.

Recent research indicates that the results of many clinical trials—including those funded by NIH—are never published. And even when results are published, they can be limited, focusing on the findings of most interest rather than all outcomes. In contrast, studies have found that results reported in ClinicalTrials.gov are more complete than those in the published literature. The new reporting requirements are expected to strengthen that characteristic and enhance the benefits ClinicalTrials.gov brings.

It is important to understand that listing a study on ClinicalTrials.gov does not mean it has been evaluated by the US Federal Government. The website emphasizes this point for the public through prominent disclaimer statements, including one on the importance of discussing any clinical trial with a health care provider before participating. ClinicalTrials.gov allows for the registration of any human biomedical study that conforms with prevailing laws and regulations, including an indication that recruiting studies were approved by an ethics review committee. As a result, the database is more comprehensive, which can better serve the public in critical ways. For example, potential participants can see the full range of studies being conducted, not just those funded or sponsored by NIH. Ethics committees, funders, and others can also view the wider scope of studies, which can help them more effectively oversee new research.

Aside from legislative and policy changes, ClinicalTrials.gov has also focused on enhancing the site's usability, addressing design and layout issues and improving the ability to search, display, and review information about the studies registered on the site. The latest set of updates, released last month, included new search options (such as by recruitment status and distance from a geographic location), refinements to the display of search results, and additional information regarding study results and key record dates. These changes, plus those brought about by the final rule, will help maximize the value of clinical trials and, by extension, advance knowledge and improve health.

From finding trials actively recruiting participants to identifying new experimental drugs or device interventions to analyzing study design and results, ClinicalTrials.gov delivers key benefits to patients, clinicians, and researchers and puts into action NIH's core mission: turning discovery into health. It also reflects one more way NLM makes medical and health information available for public use and patient health.

Kevin Fain, JD, MPH, DrPH, has served as senior advisor for policy and research at ClinicalTrials.gov since 2015. He was an attorney with the FDA from 1995-2010, specializing in clinical trial and drug regulatory matters. He earned his doctorate in epidemiology from Johns Hopkins University in 2015.

Exploring the Brave New World of Metagenomics

See last week’s guest post, “Adventures of a Computational Biologist in the Genome Space,” for Part 1 of Dr. Koonin’s musings on the importance of computational analysis in biomedical discovery.

While the genomic revolution rolls on, a new one has been quietly fomenting over the last decade or so, only to take over the science of microbiology in the last couple of years.

The name of this new game is metagenomics.

Metagenomics is concerned with the complex communities of microbes.

Traditionally, microbes have been studied in isolation, but to do that, a microbe or virus has to be grown in a laboratory. While that might sound easy, only 0.1% of the world’s microbes will grow in artificial media, with the success rate for viruses even lower.

Furthermore, studying microbes in isolation can be somewhat misleading because they commonly thrive in nature as tightly knit communities.

Metagenomics addresses both problems by exhaustively sequencing all the microbial DNA or RNA from a given environment. This powerful, direct approach immensely expands the scope of biological diversity accessible to researchers.
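At the level of raw data, many metagenomic analyses begin with a simple computational step: tallying the short subsequences (k-mers) present in the reads, so that sequences from different organisms or samples can be compared and binned by composition. The toy Python sketch below illustrates the idea only; it is not the pipeline used in any study discussed here.

```python
from collections import Counter

def kmer_profile(reads, k=4):
    """Count every length-k subsequence across a set of sequencing reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

# Two tiny made-up "reads"; real metagenomic datasets hold millions.
reads = ["ATGCGATACG", "GCGATACGTT"]
profile = kmer_profile(reads, k=4)
print(profile["GCGA"])  # 2: this 4-mer occurs once in each read
```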

But the impact of metagenomics is not just quantitative. Over and again, metagenomic studies—because they look at microbes in their natural communities and are not restricted by the necessity to grow them in culture—result in discoveries with major biological implications and open up fresh experimental directions.

In virology, metagenomics has already become the primary route to new virus discovery. In fact, in a dramatic break from tradition, such discoveries are now formally recognized by the International Committee on Taxonomy of Viruses. This decision all but officially ushers in a new era, I think.

Here is just one striking example that highlights the growing power of metagenomics.

In 2014, Rob Edwards and colleagues at San Diego State University achieved a remarkable metagenomic feat. By sequencing multiple human gut microbiomes, they managed to assemble the genome of a novel bacteriophage, named crAssphage (for cross-Assembly). They then went on to show that crAssphage is, by a wide margin, the most abundant virus associated with humans.

This discovery was both a sensation and a shock. We had been completely blind to one of the key inhabitants of our own bodies—apparently because the bacterial host of the crAssphage would not grow in culture. Thus, some of the most common microbes in our intestines, and their equally ubiquitous viruses, represent “dark matter” that presently can be studied only by metagenomics.

But the crAssphage genome was dark in more than one way.

Once sequenced, it looked like nothing in the world. For most of its genes, researchers found no homologs in sequence databases, and even those few homologs identified shed little light on the biology of the phage. Furthermore, we had been unable to establish any links to other phages, nor could we tell which proteins formed the crAssphage particle.

Such results understandably frustrate experimenters, but computational biologists see opportunity.

A few days after the crAssphage genome was published, Mart Krupovic of Institut Pasteur visited my lab, where we attempted to decipher the genealogies and functions of the crAssphage proteins using all computational tools available to us at the time. The result was sheer disappointment. We detected some additional homologies but could not shed much light on the phage evolutionary relationships or reproduction strategy.

We moved on. With so many other genomes to analyze, crAssphage dropped from our radar.

Then, in April 2017, Anca Segall, a sabbatical visitor in my lab, invited Rob Edwards to give a seminar at NCBI about crAssphage. After listening to Rob’s stimulating talk—and realizing that the genome of this remarkable virus remains a terra incognita—we could not resist going back to the crAssphage genome armed with some new computational approaches and, more importantly, vastly expanded genomic and metagenomic sequence databases.

This time we got better results.

After about eight weeks of intensive computational analysis by Natalya Yutin, Kira Makarova, and myself, we had fairly complete genomic maps for a vast new family of crAssphage-related bacteriophages. For all these phages, we predicted with good confidence the main structural proteins, along with those involved in genome replication and expression. Our work led to a paper we recently published in the journal Nature Microbiology. We hope and believe our findings provide a roadmap for experimental study of these undoubtedly important viruses.

Apart from the immediate importance of the crAss-like phages, this story delivers a broader lesson. Thanks to the explosive growth of metagenomic databases, the discovery of a new virus or microbe does not stop there. It brings with it an excellent chance to discover a new viral or microbial family. In addition, analyzing the gene sequences can yield interesting and tractable predictions of new biology. However, to take advantage of the metagenomic treasure trove, we must creatively apply the most powerful sequence analysis methods available, and novel ones may be required.

Put another way, if you know where and how to look, you have an excellent chance to see wonders.

As a result, I cannot help being unabashedly optimistic about the future of metagenomics. Fueled by the synergy between increasingly high-quality, low-cost sequencing, improved computational methods, and emerging high-throughput experimental approaches, the prospects appear boundless. There is a realistic chance we will know the true extent of the diversity of life on earth and get unprecedented insights into its ecology and evolution within our lifetimes. This is something to work for.

Eugene Koonin, PhD, has served as a senior investigator at NLM's National Center for Biotechnology Information since 1996, after working for five years as a visiting scientist. He has focused on the fields of computational biology and evolutionary genomics since 1984.

Adventures of a Computational Biologist in the Genome Space

Guest post by Dr. Eugene Koonin, NLM National Center for Biotechnology Information.

More than 30 years ago, when I started my research in computational biology (yes, it has been a while), it was not at all clear that one could do biological research using computers alone. Indeed, the common perception was that real insights into how organisms function and evolve could only be gained in the lab or in the field.

As so often happens in the history of science, that all changed when a new type of data arrived on the scene. Genetic sequence information, the blueprint for building all organisms, gave computational biology a foothold it has never relinquished.

As early as the 1960s, some prescient researchers—first among them Margaret Dayhoff at Georgetown University—foresaw genetic sequences becoming a key source of biological information, but this was far from mainstream biology at the time. Through the 1980s, however, the trickle of sequences grew into a steady stream, and by the mid-1990s, the genomic revolution was upon us.

I still remember as if it were yesterday the excitement that overwhelmed me and my NCBI group in the waning days of 1995, when J. Craig Venter’s team released the first couple of complete bacterial genomes. Suddenly, the sequence analysis methods on which we and others had been working in relative obscurity had a role in trying to understand the genetic core of life. Soon after, my colleague, Arcady Mushegian, and I reconstructed a minimal cellular genome that attracted considerable attention, stimulating experiments that confirmed how accurate our purely computational effort had been.

Now, 22 years after the appearance of those first genomes, GenBank and related databases contain hundreds of thousands of genomic sequences encompassing millions of genes, and the utility and importance of computational biology are no longer a matter of debate. Indeed, biologists cannot possibly study even a sizable fraction of those genes experimentally, so, at the moment, computational analysis provides the only way to infer their biological functions.

Indeed, computational approaches have made possible many crucial biological discoveries. Two examples in which I and my NCBI colleagues have been actively involved are elucidating the architecture of the BRCA1 protein that, when impaired, can lead to breast cancer, and predicting the mode of action of CRISPR systems. Both findings sparked extensive experimentation in numerous laboratories all over the world. And, in the case of CRISPR, those experiments culminated in the development of a new generation of powerful genome-editing tools that have opened up unprecedented experimental opportunities and are likely to have major therapeutic potential.

But science does not stand still. Quite the contrary, it moves at an ever-accelerating pace and is prone to taking unexpected turns. Next week, I’ll explore one recent turn that has set us on a new path of discovery and understanding.

Eugene Koonin, PhD, has served as a senior investigator at NLM's National Center for Biotechnology Information since 1996, after working for five years as a visiting scientist. He has focused on the fields of computational biology and evolutionary genomics since 1984.
