What Will 2020 Bring?

I don’t have a crystal ball, but as director of NLM, I need to keep an eye to the future.

Last month, I highlighted a few of NLM’s many accomplishments in 2019. Today, I want to devote some time to musing about what might happen at NLM in 2020.

I know that I’ll be in a new office, but I don’t know where just yet! No, I’m not leaving NLM, but as we prepare for major renovations to our Building 38, most of the staff in the building, including me, will move to other office space on campus for about two years. That will be enough time to implement a major redesign of the first floor of our 60-year-old, architecturally dramatic but not really fit-for-purpose workspace to make more efficient use of the space, add modern office layouts and meeting spaces, and modernize our HVAC systems. I’ll keep musing throughout the renovations; I just won’t be sitting on the mezzanine while I do it.

I know that NLM will continue to grow our Intramural Research Program (IRP), which focuses on computational biomedical and health sciences. We hired two new tenure-track investigators this past year and expect to add one or two more in 2020. The IRP brings together two NLM divisions that emphasize discovery based on molecular phenomena and clinical information: the National Center for Biotechnology Information (specifically, its Computational Biology Branch) and the Lister Hill National Center for Biomedical Communications. I also expect to see greater alignment of our training efforts, including an expansion of the public-facing parts of our training.

I know that we’ll continue to make biomedical and health information literature available to the public, scientists, and clinicians. I anticipate a greater emphasis on public access and open science. Our entire PubMed Central (PMC) repository of full-text literature is already freely available to the world, and with the increasing interest in open access to government-supported research findings, I expect that this repository will grow. PMC will grow in new ways, too, such as enhancing the discoverability of data sets in support of published results made available with articles as supplementary material or in open repositories, and supporting greater transparency in scientific communication through the archiving of peer review documents.

I know that we’ll move many NLM resources to the cloud and continue to support efforts to make strides through the National Institutes of Health (NIH) Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative to accelerate discovery by harnessing the power of commercial cloud computing. This will not only offer some logistical savings but also increase the discoverability of our resources.

I know that NLM will play a bigger and more vital role in big science as it unfolds at NIH. Our intramural researchers are expanding the application of deep learning technologies to clinical, biological, and image data. In collaboration with the NIH Office of Data Science Strategy, we’ll build and release new tools to help researchers leverage the FHIR standard to make clinical data more accessible for research, and to improve phenotype characterization. These initiatives will accelerate data sharing by advancing standard approaches to research data representation.

I know that NLM will advance its impact on and outreach to professional and lay communities around the country. Our National Network of Libraries of Medicine has exciting plans to expand its training in research data management and to provide local health information education and support to help health care providers working with American Indian and Alaska Native populations address challenges such as mental health and HPV-related cancer.

I know that we’ll continue to improve health by improving access to data and information. Stay tuned to my Musings posts in 2020 to see what we accomplish!

Celebrating 20 Years of ClinicalTrials.gov and Looking to the Future

Guest post by Rebecca Williams, PharmD, MPH, acting director of ClinicalTrials.gov at the National Library of Medicine, National Institutes of Health.

As ClinicalTrials.gov celebrates its 20th anniversary on February 29, 2020, we’re asking for your input on how it can best continue to serve your needs for many more years to come.

ClinicalTrials.gov is the world’s largest public clinical research registry and results database, giving patients, families, health care providers, researchers, and others easy access to information on clinical studies relating to a wide range of diseases and conditions. This online resource, which has more than 145,000 unique visitors every day, is operated by NLM and makes available information provided directly by the sponsors and investigators conducting the research.

NLM has launched an effort to modernize ClinicalTrials.gov to deliver an improved user experience on an updated platform that will accommodate growth and enhance efficiency. Creating a roadmap for modernization requires feedback from a wide array of stakeholders on how to continue serving, balancing, and prioritizing their varied information needs. These stakeholders include sponsors and investigators who submit clinical trial information to the site, academic institutions, nonprofit and advocacy organizations, government agencies, and the public, all of whom can access and use the information that ClinicalTrials.gov contains free of charge.

To obtain timely, detailed, and actionable input, we have issued a Request for Information (RFI) to solicit comments on the following topics: website functionality, information submission processes, and use of data standards.

Recognizing that ClinicalTrials.gov supports a network of stakeholders who contribute to, and rely on, clinical research, our aim is to understand how the system can better support this network and to identify opportunities for improving its compatibility with existing clinical trial management tools and processes. It is important to note that this RFI focuses on the functionality of ClinicalTrials.gov and is not intended to modify existing legal and policy requirements for clinical trial registration and results submission.

Over its 20-year history, ClinicalTrials.gov has helped shape the way in which clinical trial information is made transparent and discoverable to the public (see figure 1). In 2000, sponsors and investigators began submitting structured summaries of clinical trial protocols for the public to view. Over time, new policies and laws reinforced this practice, and ClinicalTrials.gov now contains over 320,000 study listings, with 56,000 studies currently seeking participants.

In 2008, ClinicalTrials.gov added its results database for sponsors and investigators to share summary results after trial completion. There are now over 40,000 results summaries posted on ClinicalTrials.gov, providing the public with timely access to information that may not be available in the peer-reviewed literature.

Figure 1: Total number of posted study records per year on ClinicalTrials.gov and timeline of major events from 1997 to 2019

Sharing information throughout the life cycle of a clinical trial (see figure 2) supports conduct of a landscape analysis prior to conducting new research and advances important public health goals, including supporting people who are looking to participate in clinical research, tracking the progress of clinical trials, allowing for the evaluation of the integrity of reported research, and providing more complete clinical trial information to help inform patient care. The modernization effort currently underway builds on the solid foundation established during the last 20 years.

Figure 2: Role of ClinicalTrials.gov in supporting use and sharing of information throughout the study life cycle

Share Your Feedback!

Responses to the RFI must be received by March 14, 2020. We expect a wide range of comments and are taking steps to manage and share the feedback. We will summarize the responses during a public meeting on April 30 on the main campus of the National Institutes of Health in Bethesda, Maryland, that will also be accessible by webcast. Details on the meeting will be available soon. In addition, we are engaging the NLM Board of Regents to provide input as we develop a roadmap for modernization, including establishing priorities and identifying the roles that various stakeholders might play in modernizing ClinicalTrials.gov.

Want to Learn More?

To learn more about the RFI and how to share your feedback, please join us for a webinar on January 22. We look forward to working with you to learn more about — and consider how to meet — your needs as we embark on this multiyear modernization effort.

Photo of Rebecca Williams, PharmD, MPH

Rebecca Williams, PharmD, MPH, oversees the ClinicalTrials.gov program. Her research interests involve improving the quality of reporting of clinical research and evaluating the clinical research enterprise.

“What 2019 NLM Accomplishment Makes You Most Proud?”

I was asked this question during a recent “brown bag” conversation with NLM staff. While it was tempting to launch into my list of accomplishments, I turned the question back to those present. I was surprised, proud, and intrigued by what they had to say.

First, let me tell you a little bit about our lunchtime brown bag conversations. We have a large staff (almost 1,700 women and men) and use a variety of formal and informal approaches to foster discussion: Town Hall meetings held twice a year; regular email messages to share timely information; our NLM In Focus blog, which provides a look inside NLM; and supervisor-led meetings. I host brown bag conversations about once a month and am usually joined by two or three members of the NLM leadership team. Almost always, staff from various parts of the Library attend, bringing together our scientists, librarians, administrators, and communications staff. Conversations are lively, and I get to learn a lot about what is on the minds of our staff.

So, it was instructive, and enjoyable, to hear different views about NLM accomplishments. Some people talked about greater engagement with and accountability by NLM leadership, while others focused on specific scientific advances. Still others noted our many advances with data science, particularly in upskilling our workforce.

I want to point out a few of these accomplishments.

Teresa Przytycka, PhD, senior investigator in NLM’s National Center for Biotechnology Information, shared her team’s accomplishment with the creation of a new algorithm called scPopCorn (acronym for single-cell sub-Populations Comparison) to understand the differences between populations of cells from single-cell experiments. This approach helps researchers identify different cell types and helps to differentiate between sexes, disease status, animal type, and more.

Olivier Bodenreider, MD, PhD, senior scientist and chief of the Cognitive Science Branch of NLM’s Lister Hill National Center for Biomedical Communications (LHNCBC), described how he is leading the re-envisioning of the research and research-and-development efforts within his center. One LHNCBC staff scientist, Vojtech Huser, MD, PhD, described the success his team has had in generating new publications this year.

Several people talked about the journey to prepare NLM and its staff for data science. Our Data Science @NLM Training Program team set up a year-long process of preparing our workforce for the future. Over 750 people completed a data science skills assessment and developed individual learning plans. Over the summer, NLM staff participated in an intensive 120-hour data science fundamentals course, culminating in a wide variety of projects that were showcased during our Data Science Open House. Over 300 people attended this exciting and energizing showcase of our talents!

Several people talked about accomplishments that made our entire NLM operations work better, such as greater engagement with staff, better use of project management strategies to improve efficiencies, and smooth integration of staff into new work teams.

Taking the writer’s privilege of identifying more accomplishments, I am exceptionally proud of the efforts of staff across the NLM who designed or participated in the Data Science initiatives. I am honored to work with a great leadership team who are making bold and sometimes difficult decisions to prepare the NLM for its future. We made a huge advancement in open science by moving our entire Sequence Read Archive public data to the cloud, completing the first phase of an ongoing effort to better position these data for large-scale computing. This work represents both a technological feat and a major contribution to biological discovery.

Reflecting on our discussion about 2019 accomplishments, I learned that every person across the NLM has something that he or she is proud of. I also learned that some of us experience NLM as a tight-knit research team, while others take a more broad-brush view of activities and events. Most importantly, I learned that there are many things to celebrate in this wonderful institution we call the National Library of Medicine!

The Holiday Season: Whatever Way Works Best for You

As the Andy Williams song goes, “It’s the holiday season / So whoop-de-do and hickory dock / And don’t forget to hang up your sock.”

This song from my childhood matches my mood and warms my soul. It brings back memories of growing up in a house full of kids, making presents for parents and cards for grandparents, and enjoying the sounds and smells of the holiday season.

In high school, I learned that not every home had a Christmas like mine. My best friend’s grandfather died in the hospital on Christmas morning when we were freshmen. For her, holidays became a poignant reminder of loss. And I began to realize that some families had other celebrations, or even multiple celebrations.

I entered high school in 1967, the year after Dr. Maulana Karenga created the festival called Kwanzaa, a pan-African holiday that celebrates family, community, and culture. As time went on, I developed an appreciation of the many ways that different people mark holidays, from the winter solstice celebrations of the Wiccans in central Wisconsin to the celebration of Diwali around the world.

Here at NLM, our resources offer interesting and helpful information related to holiday seasons.

If you enter the word “Christmas” or “solstice” in our PubMed search box, you’ll retrieve over 3,000 citations. One of these is Dr. Jori Bogetz’s article in the Journal of Palliative Medicine reflecting on why she works on Christmas. An article from the British Veterinary Association describes how to choose a holiday meal that supports animal health and welfare. A third, in the Medical Journal of Australia, warns about the risks inherent in Christmas celebrations, and the journal Nature provides an unusual description of a winter solstice celebration. Some investigators sought to uncover evidence of a Christmas spirit through functional magnetic resonance imaging, while others examined the surge in myocardial infarction during certain holiday periods.

Indeed, this time of year can be complicated.

Another of NLM’s resources, MedlinePlus, provides guidance on a range of health topics — everything from managing seasonal affective disorder to encouraging healthy holiday eating to coping with sadness and grief — both for the people affected and for those around them who are wondering how to help.

In many ways, holidays allow us time to pause amid our everyday lives. Ideally, we can use the moment to be more observant and more mindful, of both ourselves and others.

I hope you find the joy and peace that the season holds, and that you extend some of that joy and peace to those around you, throughout the holidays and beyond.

Meet Our Newest Investigator: Xiaofang Jiang, PhD, Seeks a Greater Understanding of the Human Microbiome To Improve Health

In this week’s installment of Musings, I’d like to introduce you to Xiaofang Jiang, PhD, who recently joined NLM’s Intramural Research Program as a tenure-track investigator.

Dr. Jiang’s research focuses on the development of computational methods to advance our understanding of the human microbiome, which plays a very important role in our health. Her lab is using bioinformatic methods to predict what the trillions of microbes living in and on the human body do, how they spread between people, and which kinds of genes the microbiome community shares.

Turning data on the human microbiome into usable insights is a challenge that demands both knowledge of the biological literature and skill in bioinformatics. Dr. Jiang’s lab is developing approaches intended to do just that — bridge the gap between information and action.

We are fortunate to have added another strong and curious investigator to our team. I know Dr. Jiang will play an important role in accelerating data-driven discovery here at NLM!


Video Transcript (below)

I’ve had a long interest in physics and math, ever since I was in middle school. But I was discouraged from choosing math or physics as a major when I went to college. That’s because my family and friends thought that I would have a hard time finding a good job as a female based on what they saw, at that time, in China.

In the end, I chose Biology as my major, which opened a new door for me. It provides the foundation for my current research and led me to a beautiful world of evolution and life science.

For my Ph.D., I chose computational biology as my major because it is a major that combines my passion for computer science as well as biology.

For a long time, I observed that, for computer scientists, if they wanted to understand biomedical data they needed to have a good understanding of biology. For biologists, if they wanted to speed discovery, they required the help of computer scientists. And my background sort of bridges this gap.

I think we’re at a great stage where we can actually have the ability to turn data into actionable items that can be directly applied to medical decision-making. Data science and the microbiome combined to improve our health.

NLM is one of the few places where I can start my research program in data science. There is a critical mass of truly exceptional and top-notch scientists here. And I also find people at NLM are approachable. From the Director to the top scientist, you can just knock on their door and talk with them, and they are always willing to help.

NLM is the place where I can do the research that I love and enjoy, and also make a difference at the same time.

Everyone’s Voice Matters: Making Science Open and Accessible to the Public

Last month, the National Institutes of Health (NIH) released its Draft NIH Policy for Data Management and Sharing and Supplemental Draft Guidance (Draft NIH Policy), making it available for public comment. Comments are due by January 10, 2020. Because everyone’s voice matters, I’m calling on the Musings audience to review the draft and offer your perspectives on this policy now! 

The Draft NIH Policy arises from NIH’s deep commitment to fostering a culture of scientific data stewardship.

Data stewardship is a research responsibility that includes systematically acquiring data, carefully documenting data, securely storing data, and, where possible, making data available for use by other scientists and society as a whole. This last activity, often referred to as “data sharing,” is essential for accelerating the translation of science into knowledge and ensuring that the full value of the data collected becomes the substrate for future discoveries.

NIH’s Long-Standing Commitment to Make Research Results Available

In 2003, NIH released its original data sharing policy, which established the expectation that research data from large NIH-supported awards will be shared to the extent allowed by scientific protocol and human subjects considerations. Since 2008, the NIH Public Access Policy has ensured that the public has free access to the published results of NIH-funded research. NLM’s PubMed Central, a free, full-text archive of peer-reviewed biomedical and life sciences journal literature, serves as the repository for these articles.

In 2014, NIH updated its Genome-Wide Association Studies Policy with an expanded NIH Genomic Data Sharing Policy to ensure the broad and responsible sharing of genomic research data. And in 2016, the NIH published the NIH Policy on the Dissemination of NIH-Funded Clinical Trial Information, which established expectations for registering and submitting the results of all NIH-funded clinical trials on ClinicalTrials.gov. Individual Institutes, Centers, and programs have also established expectations for managing and sharing data resulting from their funded research.

Data Sharing Principles

NIH recognizes that all scientific data need to be managed according to sound principles. The Draft NIH Policy would require researchers to develop explicit data management and sharing plans that describe their approaches for preserving and sharing data. Reasonable, allowable costs for data curation and preservation would be permitted as direct expenses for the project. Proposed guidance about the allowable costs of data management and sharing, and about the elements of a good data management and sharing plan, was released along with the draft policy and can be found on the NIH Data Management and Sharing Activities Related to Public Access and Open Science web page.

While promoting broad sharing of data, the Draft NIH Policy is deliberately designed to be flexible and allow researchers to propose approaches that address legal, ethical, and other practical considerations that may limit data sharing. The policy proposes that data management and sharing plans be submitted “just in time” and evaluated by NIH program staff. Agreed plans will be incorporated into Terms and Conditions of the Award, and NIH staff will monitor compliance with the plans at regular reporting intervals.

Data Sharing Benefits the Scientific Community and the Public

For the scientific community, data sharing enables the validation of scientific results by both the originator of the data and other scientists, increasing transparency and accountability. Data sharing also strengthens collaborations, which allows for richer analyses. Strong data-sharing practices facilitate the reuse of hard-to-generate data, such as those acquired during complex experiments or once-in-a-lifetime events like natural disasters. And, finally, data sharing promotes scientific progress and accelerates future research.

For the public, sound data-sharing practices demonstrate good stewardship of taxpayer funds. Clear, well-written data-sharing and management plans promote transparency and accountability to society. And for research involving human subjects, data sharing honors participants’ efforts by maximizing the contribution of the data acquired through their participation.

Tell Us What You Think!

NIH acknowledges that this draft policy offers new opportunities for advancing science while also creating new expectations and responsibilities for librarians, scientists, trainees and graduate students, and institutional research management offices. And I’ve highlighted some of the benefits of data sharing to the scientific community and the public.

As I emphasized earlier in this post, everyone’s voice matters, so we’d like to hear from all of you about the approach NIH is proposing. Until Friday, January 10, 2020, you can share your comments on the purpose of the policy, its key definitions, the scope and requirements for the plans, and the effective dates.

Want to Learn More?

NIH is hosting an informational webinar on the Draft NIH Policy for Data Management and Sharing and Supplemental Draft Guidance on Monday, December 16, 2019, from 12:30 p.m. to 2:00 p.m. EST. The purpose of the webinar is to provide information on the draft policy and answer any questions about the public comment process.

Please note that public comments will not be accepted during the webinar; they must be submitted here.

Accessing the Webinar

If you would like to attend the December 16 webinar, please see the instructions below:

  • To view the webinar presentation, click here.
  • To join the webinar by phone:
    • U.S. and Canadian participants can dial 866-844-9416 and enter passcode 4009108.

Please note that while you will be able to view the webinar through Webex, you must use one of the specified phone lines to connect to the audio. You will not be able to dial in to the webinar via your computer.

You may also send questions in advance of the webinar to SciencePolicy@od.nih.gov.

The Pursuit and Power of Alignment

Guest post by Valerie Schneider, PhD, staff scientist at the National Library of Medicine’s National Center for Biotechnology Information, National Institutes of Health.

As a staff scientist at NLM, I’ve found that our strategic plan has become a valuable framework for organizing our mission and providing direction and focus—especially when we’re talking about data science.

A recent project at NLM’s National Center for Biotechnology Information (NCBI) highlights why it’s important to ensure alignment between projects and strategy.

As host to the world’s largest repository of biological sequence data, NCBI provides access to data that are critical to understanding and advancing human health. Users were searching NCBI’s sequence databases long before the strategic plan was developed, so it might be easy to overlook what the strategic plan has to do with that work. When you look, though, it’s easy to see the relationship.

Providing a Common Search Experience

Connecting the resources of a digital research enterprise and advancing research and development in biomedical informatics and data science are just a few of the important objectives in NLM’s strategic plan. We’ve improved the experience of users searching for several types of common sequence-associated data by providing a more comprehensive interpretation of their queries and a new results interface that provides easy access to NCBI’s best results, regardless of the database in which they search.

Our team tackled this effort by conducting extensive user interviews, iteratively developing solutions, and monitoring the usage of those solutions.

We improved searches for the reference set of genes and genomes in all species across multiple NCBI databases by supporting common language queries and using features like auto-suggest. We enhanced the ability to search and access clinically important datasets, such as human variations housed in ClinVar and dbSNP, NCBI’s variation databases, as well as resources with information about antimicrobial resistance genes and viral pathogens.

We also created displays that aggregate the results from different databases and enable easy downloads of data and access to analysis tools. Our new interactive graphics and web page displays allow for the visualization of sequences and the analysis of homologous gene sets. Knowing that NLM users rely on different technologies to access data, we ensured that the displays work on both traditional computers and mobile devices.

Since their first release in late 2018, these search enhancements have come to be triggered in a quarter of all searches in the scoped databases. We’ve seen a 300% increase in their use, with more than 300,000 users clicking on the content they offer in just the month of October 2019. These products have provided results for over 500,000 searches that previously would have returned no content. Regular monitoring of their use helps us make sure that we continue to facilitate search and deliver high-value data.

NLM’s strategic plan gave us the user-centered framework in which to execute the goals of this project. So much of the work we do at NLM is consistent with the goals and specific objectives of the plan — it provides a structure for evaluating our work and making sure that we continue to be forward-looking.

And the strategic plan helps me, as a staff scientist, to identify new areas for work that will best enable NLM to continue delivering a platform for biomedical discovery and data-powered health.

To stay up to date on NCBI projects and research, follow us on Twitter.

Photo (headshot) of Valerie Schneider, PhD

Valerie Schneider, PhD, is the deputy director of Sequence Offerings and the interim head of the Sequence Delivery Program. In these roles, she coordinates efforts associated with the curation, enhancement, and organization of sequence data and oversees tools and resources that enable the public to access, analyze, and visualize biomedical data. She also manages NCBI’s involvement in the Genome Reference Consortium, the international collaboration tasked with maintaining the value of the human reference genome assembly.