Meet the Next Generation of Leaders Advancing Data Science and Informatics at NLM

Guest post by Virginia Meyer, PhD, Training Director for the Intramural Research Program, National Library of Medicine, National Institutes of Health.

Working at NLM means being at the forefront of innovation in the rapidly evolving fields of data science and informatics. Within that environment, the NLM Intramural Research Program (IRP) is dedicated to supporting individuals looking to develop and apply computational approaches to a broad range of problems in biomedicine, molecular biology, and health.

NLM understands that contributions from people of diverse backgrounds, cultures, and histories enable research that has the greatest impact and reaches the widest possible audience. Such a workforce is necessary to drive innovation and scientific advancement and is imperative to ensuring that computational tools and data sets are free from bias. To that end, the Diversity in Data Science and Informatics (DDSI) Summer Internship, a program of the NLM IRP now in its inaugural year, was developed to support and engage young scientists who are dedicated to careers in computational biology and biomedical informatics. It is our hope that time spent in the DDSI program and with Principal Investigators (PIs) will encourage trainees to continue along the path toward becoming leaders in their chosen fields.

Meet four of this year’s DDSI interns and learn about the work they are doing in the NLM IRP!

Will Hibbard
Graduate Student in Biomedical Informatics
University at Buffalo

PI: Olivier Bodenreider, MD, PhD, Computational Health Research Branch, Lister Hill National Center for Biomedical Communications at NLM
Research Area: Natural Language Processing

What interested you most about the DDSI program?
I found out about the program when a teacher recommended it to me out of the blue, and after looking into it, I found a lot of fun research projects I could join. The program offered an opportunity to join research projects in familiar and unfamiliar fields. Ultimately, it was pleasantly outside of my comfort zone and presented the kind of challenge that makes me love research.

What research project are you working on and why?
I ended up working with Dr. Olivier Bodenreider using neural networks to better develop natural language processing in medical databases. I applied to this project because it involved two areas in which I had less experience: ontology and data structures. I pursued this research area because it allowed me the chance to improve in fields that I did not understand well at the time.

Why might someone want to apply to the DDSI program in the future?
This is the kind of experience with challenges that allow you to grow as a person and as a professional. Whether you know the area of research well or have trouble understanding it, this program will give you an opportunity to learn through a practical research project.

What is next for you after you complete your internship?
I will be taking a gap year while I apply to medical school. I am hoping to work in my local oncology institute and medical corridor.

MG Hirsch
PhD Student in Computer Science
University of Maryland, College Park

PI: Teresa Przytycka, PhD, Computational Biology Branch, National Center for Biotechnology Information at NLM
Research Area: Evolutionary Genomics

What interested you most about the DDSI program?
Evolution of gene expression and modeling different modes of evolution is something that I had yet to explore in my PhD research. I thought a summer program would be perfect to learn about it. It also gives me the opportunity to get a feel for working at the NIH and to see whether I would want to consider the NIH Graduate Partnerships Program.

What research project are you working on and why?
I am evaluating the possibility of different modes of gene expression evolution within a tumor. Previous work in the lab considers different models of gene expression evolution between animal species. Many models of evolution assume neutral evolution, that mutations occur and persist randomly; however, we know that mutations that change phenotypes undergo various selective pressures from the environment. Considering this, previous work, resulting in the software EvoGeneX, has fit computational models using Ornstein-Uhlenbeck processes to evaluate potential divergence of gene expression within fly species. My research project is applying this same concept to cancer tumors. After tumorigenesis, cancer cells rapidly accumulate further mutations and diversify into subclones within the same tumor. Owing to the different sets of mutations, these subclones evolve differently. We can hypothesize then that the evolution of the gene expression of subclones can be modeled using the same computational models.
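For readers unfamiliar with Ornstein-Uhlenbeck (OU) processes, the sketch below simulates one: a trait value (here standing in for an expression level) drifts randomly while being pulled back toward an optimum. This is only an illustration of the general idea, not the EvoGeneX implementation. The parameter names (`theta` for the optimum, `alpha` for the strength of the pull, `sigma` for the noise scale) and the Euler-Maruyama discretization are assumptions made for the example.

```python
import math
import random


def simulate_ou(x0, theta, alpha, sigma, dt, n_steps, seed=0):
    """Simulate one OU path, dX = alpha * (theta - X) dt + sigma dW,
    using the Euler-Maruyama scheme. Returns a list of n_steps + 1 values."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Deterministic pull toward theta plus a random (Brownian) increment.
        x += alpha * (theta - x) * dt + sigma * math.sqrt(dt) * noise
        path.append(x)
    return path


def ou_expected_value(x0, theta, alpha, t):
    """Analytic mean of the OU process at time t, starting from x0."""
    return theta + (x0 - theta) * math.exp(-alpha * t)


if __name__ == "__main__":
    path = simulate_ou(x0=0.0, theta=1.0, alpha=1.0, sigma=0.3,
                       dt=0.01, n_steps=500)
    print("simulated value at t=5:", round(path[-1], 3))
    print("expected value at t=5: ", round(ou_expected_value(0.0, 1.0, 1.0, 5.0), 3))
```

With `alpha` set to 0 the pull disappears and the process reduces to plain Brownian motion, which is the usual stand-in for neutral evolution in this kind of comparison.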

Why might someone want to apply to the DDSI program in the future?
The DDSI program offers extra speaker talks and networking opportunities.

What is next for you after you complete your internship?
I will be finishing my PhD in computer science at UMD.

Sirisha Koirala
Undergraduate Student in Computer Science
University of Maryland, College Park

PI: Zhiyong Lu, PhD, Computational Biology Branch, National Center for Biotechnology Information at NLM
Research Area: Natural Language Processing and Computational Biology

What interested you most about the DDSI program?
I was most interested in the unique ongoing research projects that students had the opportunity to participate in, which I would not have been able to find at other programs. It was very interesting to learn about the ways that artificial intelligence (AI) could be applied to medical practices, and this stood out to me as medicine and AI are two of my main interests.

What research project are you working on and why?
I am working on AI in the prediction of progression in age-related macular degeneration. In my first year of college, I was on the pre-medicine track; however, while gaining greater exposure, I realized that I have a stronger passion for computer science. Within the field of computer science, I have a particular interest in AI, and this project specifically allowed me to combine both of my interests and backgrounds.

Why might someone want to apply to the DDSI program in the future?
The DDSI program provides students who come from underrepresented backgrounds a chance to gain real hands-on experience. As a student who came from a small, all-women’s university where I did not have access to such opportunities, this program has helped me significantly. I have been able to get the real-world experience I need to help me excel further in my career preparations, and students who are in similar positions should consider applying for this reason.

What is next for you after you complete your internship?
After I complete my internship, I will be starting my second year of college at University of Maryland, College Park where I am pursuing a major in computer science.

Tochi Oguguo
Undergraduate Student in Computer Science and Information Systems
University of Maryland, Baltimore County

PI: Sameer Antani, PhD, Computational Health Research Branch, Lister Hill National Center for Biomedical Communications at NLM
Research Area: Bias in Machine Learning

What interested you most about the DDSI program?
What interests me the most about this program is the amount of experience you gain during the summer. You leave understanding concepts at a higher level and applying lessons to your life outside of research.

What research project are you working on and why?
My research project is about bias in machine learning. By using fair active learning, we teach the machine how to give accurate responses when diagnosing or classifying a dataset or image. Bias is one of the biggest issues in machine learning, especially in health care where inaccurate judgment can be dangerous.

Why might someone want to apply to the DDSI program in the future?
DDSI is a great program that helps students and interns learn more about the career paths open to them and become more resilient people and scientists outside of research.

What is next for you after you complete your internship?
I plan to apply again next summer and keep working in research and machine learning! Also, I will take more classes in information science to help me become a better programmer.

Help Us Modernize NIH’s Genomic Data Sharing Policy

Guest post by Taunton Paine, MA, Director of the Division of Scientific Data Sharing Policy, NIH Office of Science Policy

Behind the NIH Genomic Data Sharing Policy

In November 2021, NIH published a request for information seeking public input on the future of the NIH Genomic Data Sharing (GDS) Policy. Originally published in 2014, the NIH GDS Policy expanded and refined an existing framework for the broad and responsible sharing of genomic research data originally created for genome-wide association studies. Since this policy framework was first implemented, NIH has accepted data from more than 1,200 studies in the NIH database of Genotypes and Phenotypes (dbGaP) hosted by NLM and facilitated more than 64,000 additional research uses. Many more studies involving non-human data and human data with study participant consent for full public access have been shared as a result of the GDS Policy through a variety of additional NIH repositories, such as GenBank and the Sequence Read Archive, which are also hosted by NLM.

While the GDS Policy has been remarkably successful at spurring the timely, productive, and secure sharing of genomic data, NIH has devoted substantial effort to maintaining the relevance of this framework by issuing updates as needed. NIH has provided substantial guidance to account for trends in science, technology, and society. For example, the policy and related guidance evolved to accommodate a growing shift toward cloud computing in genomic research.

Evolving Priorities: Help Us Shape the Future of Genomic Data Sharing

In October 2020, NIH issued the Final NIH Policy for Data Management and Sharing. The final policy will be effective on January 25, 2023. To better align the GDS Policy with the NIH Policy for Data Management and Sharing, NIH is soliciting input about proposed changes to the GDS Policy. Described below are some of the key issues on which NIH is seeking comment in the request for information.

The use of genomic data in research continues to evolve. Specifically, there is growing interest in the use of human data elements that might be considered identifiable, which cannot currently be submitted to NIH genomic data repositories, and in the ability to match participants’ data across repositories or with data from other sources. The request for information seeks comment on whether NIH should permit these activities, and if so, what additional protections may be necessary.

To reduce the technical burden of analyzing genomic data, NIH has developed resources beyond dbGaP for storing, sharing, and analyzing human genomic data, resulting in an increasingly federated landscape of platforms and repositories hosted by NIH and awardee institutions. To ensure consistency of operations and protections, NIH is proposing core principles for NIH-supported genomic data repositories and platforms.

NIH frequently receives questions about other types of high-dimensional “omics” data, such as microbiomic or proteomic data, which describes new and comprehensive approaches for analyzing molecular profiles of humans and other organisms. In some cases, non-genomic data types may pose similar risks of re-identification as large-scale genomic data but may not be subject to the GDS Policy in all scenarios. Furthermore, the GDS Policy may not apply even when genomic data are generated in some scenarios, such as for very small studies. As a longer-term consideration, NIH is soliciting views on whether the more specific sharing expectations of the GDS Policy or the protective framework it offers should be adjusted to account for these other data types or scenarios.

We are Listening!

We are working to ensure that the framework established by the GDS Policy keeps pace with the needs of the research enterprise, research participants, and the patients it is ultimately intended to benefit. This RFI may result in updates to the GDS Policy, related guidance, or implementation. That’s why we’re asking you, the community, for your input. Please visit the request for information page today; comments are due by February 28. We look forward to hearing your input and appreciate your efforts!

Taunton Paine, MA, is the Director of the Scientific Data Sharing Policy Division in the Office of Science Policy at the NIH. Taunton has been with the Office of Science Policy since 2011. His division is responsible for issues relating to data sharing policy, including issuance of the recent NIH Data Management and Sharing Policy, oversight of the NIH Genomic Data Sharing Policy, and management of the Data Science Policy Council. He holds a dual master’s degree from Columbia University and the London School of Economics and Political Science, where he studied the history of international relations.

The Science of SARS-CoV-2 Testing: What Tests Are Available and What This May Mean for You

Guest post by Clem McDonald, MD, Chief Health Data Standards Officer at the National Library of Medicine

COVID-19 testing equips individuals with the information they need to protect themselves and others, and arms public health professionals with data that can inform response efforts.

Recently, leadership across NIH articulated why widespread testing is necessary, important, and achievable. Equally important is understanding the different types of testing available. As a leader and pioneer in the development of clinical data standards, NLM supports the electronic exchange of clinical health information data, including those related to COVID-19 testing, for approved purposes and with appropriate privacy protections.

Three types of testing are available to identify COVID-19 (the disease caused by the SARS-CoV-2 virus).

1) Nucleic acid amplification tests (NAAT), also called molecular tests, detect the virus’s genetic material;

2) Antigen tests detect parts of specific proteins produced by the virus; and

3) Antibody tests detect COVID-19 antibodies in the blood (serum) that infected people develop to fight off the virus.


NAAT tests depend on a method that multiplies the relatively few copies of viral nucleic acid that might be present in a specimen into a very large number of copies, making it much easier to detect the virus. At present, most NAAT tests use an amplification method called polymerase chain reaction (PCR).

PCR uses small segments of DNA, called primers, to pick out the DNA that it needs to multiply. The PCR instruments process the sample in repeated cycles of heating and cooling. During each cycle, the number of copies of the targeted nucleic acid doubles. From a few original copies, it can generate up to a billion new copies to make the virus easier to see in the final detection step.
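The doubling arithmetic described above can be sketched in a few lines. This is an idealized illustration: it assumes every cycle multiplies the copy count by the same factor, and the `efficiency` parameter (1.0 meaning perfect doubling) is a hypothetical knob added for the example, not a property of any particular test.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Copy count after `cycles` PCR cycles.

    efficiency=1.0 means the copy number exactly doubles every cycle.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(initial_copies, target_copies, efficiency=1.0):
    """Smallest whole number of cycles after which copies >= target_copies."""
    if initial_copies <= 0 or efficiency <= 0:
        raise ValueError("initial_copies and efficiency must be positive")
    copies = float(initial_copies)
    cycles = 0
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(copies_after_cycles(1, 30))   # 1073741824.0, i.e. about a billion
    print(cycles_to_reach(10, 1e9))     # 27
```

Under perfect doubling, a single starting copy passes one billion after 30 cycles, which matches the "up to a billion" figure in the text.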

The FDA recently authorized a different NAAT test method called loop-mediated isothermal amplification (LAMP). This test method warms the sample to a constant temperature and uses six different primers to drive the replication of different segments of the novel coronavirus’s genome. It does not require multiple cycles of heating and cooling. By many accounts, this method is faster and easier to use than real-time PCR. Other methods of COVID-19 detection are under development.  

Different SARS-CoV-2 NAAT testing products target different parts of the virus, use different primers to start the PCR reaction, apply to different specimens, and differ in the ability to detect the virus.

The primary sample types are nasal swabs, throat swabs, and saliva (spit). Nasopharyngeal (NP) samples are believed to be the most sensitive for detecting the virus, but pushing the swab through the nostril into the nasopharynx at the base of the skull can be uncomfortable. Collecting other samples, such as nasal swabs and saliva, can be easier on the person being tested, and these methods are becoming increasingly accessible.

The spread of SARS-CoV-2 is particularly challenging to manage because people can be contagious and spread the infection to others, even before they begin to show symptoms. NAAT tests can sometimes detect the virus in early stages before symptoms appear, but not always, and do not necessarily turn positive immediately with the onset of symptoms.

One strategy with NAAT tests involves the use of pooled samples. Pooled sampling involves mixing several samples together in a batch, or pooled sample, then analyzing the pooled sample with a diagnostic test. If the test on the pooled specimen is negative, then all the individuals who contributed to the pool are considered negative for COVID-19. If the pooled sample is positive, the lab must run separate tests on each of the samples to determine who is positive and who is negative. When the prevalence of COVID-19 in a population is low (in the 1-2% range), the total number of tests needed is reduced, and an organization’s testing capacity increases.
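To see why pooling pays off only at low prevalence, here is a minimal sketch of the expected number of tests per person under the two-stage scheme described above (test the pool once, then retest every member if the pool is positive). It assumes infections are independent across people and that the test is perfectly accurate even on a diluted pooled sample; both are simplifying assumptions, and the function names are made up for this example.

```python
def expected_tests_per_person(prevalence, pool_size):
    """Expected tests per person under two-stage pooling.

    One test is run per pool; if the pool is positive, each of its
    `pool_size` members is retested individually.
    """
    if pool_size == 1:
        return 1.0  # no pooling: exactly one test each
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    return 1.0 / pool_size + p_pool_positive


def best_pool_size(prevalence, max_pool_size=32):
    """Pool size (1..max_pool_size) that minimizes expected tests per person."""
    return min(range(1, max_pool_size + 1),
               key=lambda k: expected_tests_per_person(prevalence, k))


if __name__ == "__main__":
    for p in (0.01, 0.02, 0.20):
        k = best_pool_size(p)
        print(f"prevalence {p:.0%}: pool size {k}, "
              f"{expected_tests_per_person(p, k):.3f} tests per person")
```

At 1% prevalence this model favors pools of 11 and needs roughly 0.2 tests per person, about a fivefold gain in capacity. As prevalence rises, more pools come back positive and the advantage shrinks.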


Antigen tests for COVID-19 detect the presence of a protein that is part of the SARS-CoV-2 virus. Today, the NP and mid-nasal samples are the primary sampling methods used for antigen testing, but the development of antigen tests for saliva is underway.

Antigen tests are relatively inexpensive and provide results almost immediately. These tests perform best in the early days after an infection begins. While they are not as sensitive as NAAT tests, some have suggested that repeated testing with a fast, although less sensitive, test may do more to end the epidemic quickly than perfect tests done infrequently.


Antibody SARS-CoV-2 tests detect the antibodies, or the “virus fighting proteins”, that a person’s immune system produces to fight infection. Antibody testing is generally done on the serum component of a blood sample. Antibodies may appear just a week or so after symptoms of SARS-CoV-2 infection appear. Antibody tests are not used to diagnose an active COVID-19 infection; however, they are useful for detecting whether someone has had a past infection.

Two different kinds of antibodies can be measured: IgM (immunoglobulin M) and IgG (immunoglobulin G). IgM antibodies appear early after infection (usually after the first week or so). Somewhat later, IgG antibodies, a more durable type of antibody, are produced. Today, there is no clear advantage of IgM or IgG antibody testing, and not everyone will develop antibodies after a known COVID-19 infection. Importantly, scientists do not know how well or for how long antibody levels might protect someone against a future infection.

All three types of tests can be evaluated locally with a point-of-care (POC) machine or sent to a laboratory for processing (in-lab testing). POC tests are carried out in close proximity to a patient and typically take 5-15 minutes, but only one or a handful of samples can be processed at a time. Not all POC machines have the capability to communicate electronically to public health and other reporting systems. In-lab testing machines can process hundreds of samples at a time and, with the right safeguards, can deliver results electronically to patients, providers, and public health reporting systems. However, in-lab testing has built-in delays due to its batch testing nature and the time it can take to deliver samples to laboratories.

There are many opportunities for innovation in testing methods to improve upon the efficiency, specificity, and scalability of currently available tests. Having a good set of well performing tests for SARS-CoV-2 is very important, but we also need to be able to deliver the results of such tests accurately and quickly (electronically) to the responsible care providers and to public health authorities.

To facilitate electronic delivery of such content, NLM has long supported the development of formal health care terminologies, including LOINC (Logical Observation Identifiers Names and Codes), RxNorm, and SNOMED CT, and more recently, communication structures such as HL7 FHIR®. These capabilities are especially important during this time of COVID-19. In the last six months, the FDA has authorized more than 80 SARS-CoV-2 test products for emergency use, the CDC has defined a COVID-19 Case Report Form, and the Centers for Medicare & Medicaid Services has specified content that should accompany every SARS-CoV-2 test. NLM-supported LOINC codes have been defined for all of this content, as well as SNOMED CT codes for coded test values. The FDA, CDC, and industry have produced a compendium of all SARS-CoV-2 tests and their standard codes. The use of standardized test codes for test results is essential to the smooth delivery of test results into electronic health records and to the aggregation of test results for research and public health purposes.
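As a rough illustration of how standard codes make results easy to exchange and aggregate, the sketch below builds a simplified FHIR-style Observation as a plain Python dictionary, with a LOINC code identifying the test and a SNOMED CT code carrying the result, and then tallies results by code. The code values shown are placeholders, not real LOINC or SNOMED CT codes, and the helper names are invented for this example. A production message would carry more fields (patient, specimen, dates) and should take its codes from the published compendium.

```python
from collections import Counter

LOINC_SYSTEM = "http://loinc.org"
SNOMED_SYSTEM = "http://snomed.info/sct"


def build_observation(test_code, test_display, result_code, result_display):
    """Return a minimal FHIR-style Observation for one coded test result."""
    return {
        "resourceType": "Observation",
        "status": "final",
        # What was measured: the test, identified by a LOINC code.
        "code": {"coding": [{"system": LOINC_SYSTEM,
                             "code": test_code,
                             "display": test_display}]},
        # What was found: the result, identified by a SNOMED CT code.
        "valueCodeableConcept": {"coding": [{"system": SNOMED_SYSTEM,
                                             "code": result_code,
                                             "display": result_display}]},
    }


def tally_results(observations):
    """Count observations by (test code, result code)."""
    counts = Counter()
    for obs in observations:
        test = obs["code"]["coding"][0]["code"]
        result = obs["valueCodeableConcept"]["coding"][0]["code"]
        counts[(test, result)] += 1
    return counts


if __name__ == "__main__":
    # Placeholder codes for illustration only.
    observations = [
        build_observation("LOINC-PLACEHOLDER", "SARS-CoV-2 RNA test",
                          "SCT-DETECTED", "Detected"),
        build_observation("LOINC-PLACEHOLDER", "SARS-CoV-2 RNA test",
                          "SCT-NOT-DETECTED", "Not detected"),
        build_observation("LOINC-PLACEHOLDER", "SARS-CoV-2 RNA test",
                          "SCT-NOT-DETECTED", "Not detected"),
    ]
    print(tally_results(observations))
```

Because every sender uses the same code for the same test and result, the tally works no matter which laboratory or device produced the observation.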

Testing for COVID-19 is important, safe, and easy. Getting tested early and often and following best practices, such as wearing a mask, washing hands often, and limiting social contact will help get us back to normal.

Did you learn something new about testing methods? How else can NLM help support testing activities?

Clem McDonald, MD, is the Chief Health Data Standards Officer at NLM. In this role, he coordinates standards efforts across NLM and NIH, including the FHIR interoperability standard and vocabularies specific to clinical care (LOINC, SNOMED CT, and RxNorm). Dr. McDonald developed one of the nation’s first electronic medical record systems and the first community-wide clinical data repository, the Indiana Network for Patient Care. Dr. McDonald previously served 12 years as Director of the Lister Hill National Center for Biomedical Communications and as scientific director of its intramural research program.

Women in Tech at NIH: Togetherness Enables Transformation

Guest post by Susan Gregurick, PhD, associate director for data science and director of the Office of Data Science Strategy, National Institutes of Health.

There is an African proverb that says, “If you want to go fast, go alone. If you want to go far, go together.”

As I approach my first anniversary as the associate director for data science at NIH, this statement could not ring truer for me. By going together, NIH has made astonishing progress during this past year to enable more advanced data science, impressive data and computational infrastructure advances, and better FAIR data sharing.

Togetherness means collaboration that harnesses the power and strength of a diverse team. At NIH, women are using their expertise in data science and their teamwork skills to rapidly enable transformative programs.

Andrea Norris, director of the Center for Information Technology, said it well last year:

“This is such an exciting time for innovation at the intersection of biomedical, medical, and technology domains. It’s dynamic and fast moving. Whether you have scientific skills, business expertise or know technology, there’s a role — an important role — for you in this space, especially here at NIH.”

I spoke with 11 women who are significantly impacting data science activities at NIH about how they enable data science; their advice for young, aspiring women data scientists; and the data science accomplishments that make them proud.

Collaboration and the role that NIH has played in responding to the COVID-19 pandemic were common themes in our discussions. These women also spoke about the importance of having a mentor, the four antidotes to challenging times, and the necessity of diverse perspectives.

To get to know these women even better, read their full responses on the Women in Data Science page.

Jessica Mazerik, PhD, Data Science Workforce Director, Office of Data Science Strategy (ODSS)

Leads the Coding it Forward Civic Digital Fellows at NIH and NIH DATA Scholars programs

Bringing diverse talent to NIH.

I lead central fellowship programs to bring talented computer and data scientists to NIH. Our external outreach efforts encourage women and other minorities to apply for the programs we support. And, internally, we support engagement across NIH to place students in diverse positions.

Breaking down silos to advance data science.

Talented and driven staff across NIH have mobilized to lead implementation tactics under the strategic plan for data science, and we’ve built a forum for discussion in monthly town hall meetings. Most importantly, teams across NIH are working together and communicating widely to break down silos to continue advancing data science. 

Teresa Zayas Cabán, PhD, Coordinator, Fast Healthcare Interoperability Resources (FHIR) Acceleration, National Library of Medicine (NLM)

Co-leads the NIH FHIR Working Group

Advancing data standards within and beyond NIH. 

I’m leading efforts to enable the use of standardized clinical and research data sharing to advance discovery. We’re not only working collaboratively within NIH to advance data science, but also across departments, government offices, and the field itself. Together, we are leading the field in a new direction with the use in research, as appropriate, of the same standards used in health care. 

Be confident in what you know.

Don’t sell yourself short — speak up about what you know. Find good mentors who can advise you and be in your corner throughout your career. Find a good cohort of colleagues to collaborate and commiserate with. 

Belinda Seto, PhD, Deputy Director, ODSS

Co-leads the NIH FHIR Working Group

Women leading data science communities.

We all have varying perspectives and visions for data science. Nonetheless, we have become nuclei of the NIH data science community. Through our collaborations, we are emissaries for data science to extramural grantee communities. I see this as a concentric circle of expanding national and even global communities of data science.

Technical and sociocultural accomplishments in data science.

A sociocultural accomplishment is that many silos have been dismantled, and the willingness and readiness to collaborate are demonstrably strong. On the technical front, there are successful examples of progress toward an NIH data ecosystem, both at the foundational level and at the leading edge.

Lisa Federer, PhD, Data Science and Open Science Librarian, Office of Strategic Initiatives, NLM

Leads the NIH Data Science Training Committee

Be a lifelong learner.

Embrace lifelong learning — there’s always something new to learn! I’ve made it a priority to learn new things that I can bring to my work, including going back to school to get a PhD in information science with a focus on data science.

Open science practices advancing our understanding of COVID-19.

NIH has been doing impressive work in advancing our understanding of COVID-19 and has been a leader in making data related to SARS-CoV-2 widely available so that researchers around the world can help tackle this important issue. In the face of this global problem, open science practices will help us make progress toward therapies and vaccines more quickly.

Jennie Larkin, PhD, Deputy Director, Division of Neuroscience, National Institute on Aging

Co-leads the FAIR Data Repositories Team, which ran the one-year NIH Figshare instance pilot

Engage and embed data science in different programs.

Ask questions, learn, and engage. We need more bright people who can bring new perspectives, expertise, and energy to data science and help embed data science in different research programs.

Working with the community to address the COVID-19 pandemic.

The increasing breadth and depth of data science expertise across NIH and the larger biomedical enterprise has allowed us to rapidly accomplish much more than was possible just a few years ago. We have seen the best of our community, in the willingness to come together to meet the challenge of the COVID-19 pandemic.

Rebecca Rosen, PhD, Program Lead, NIMH Data Archive and Senior Advisor, Office of Technology Development and Coordination, National Institute of Mental Health

Leads the Researcher Auth Service Initiative

Learn from traditional and nontraditional resources.

I encourage young women in all biomedical science fields to incorporate data science into their career development plans. Look for data science educational resources from both traditional and nontraditional sources and network within those sources.

Collaboration to realize a data ecosystem.

The NIH data ecosystem has an increasingly tangible presence. We have growing numbers of researchers analyzing data across NIH cloud-based platforms, thanks in part to the new Office of Data Science Strategy, the NIH STRIDES Initiative, and a greater level of collaboration across NIH Institutes and Centers.

Heidi Sofia, PhD, Program Director, National Human Genome Research Institute (NHGRI)

Co-leads the Biomedical Information Science and Technology Initiative consortium and organized supplements to enhance software tools for open science (NOT-OD-20-073)

Beauty, awe, love, and humor.

I am never happier than when some brilliant young or established scientist in the community brings forward innovative, transformative science which I can endeavor to foster. In these instances, I find the first two of the four antidotes to our challenging times (beauty, awe, love, and humor). And my colleagues often provide the last one.

Use your power for good.

Among the first “computers” were women who performed the mathematical calculations needed to advance science, starting in 1757 in the search for Halley’s comet. Today, data science is a superpower for women in fields ranging from medicine to the natural sciences to business. So empower yourself, and use your power for good!

Maryam Zaringhalam, PhD, Data Science and Open Science Officer, Office of Strategic Initiatives, NLM

Organized the Webinar on Sharing, Discovering, and Citing COVID-19 Data and Code in Generalist Repositories

Women make data science better.

The lived experiences and perspectives of women — particularly women who are Black, Indigenous and People of Color (BIPOC); members of the LGBTQIA+ community; or members of the disability community — are critically important in ensuring that the products of data science have the greatest benefit for us all. Every chance I get, I tell women that they not only belong in data science, but that data science is better because of them.

Enabling researchers to make COVID-19 data available.

I was proud to be involved in quickly planning and organizing a joint NLM-ODSS webinar on sharing, discovering, and citing COVID-19 data and code using generalist repositories. It’s been inspiring to see the research community so eager to share the data and tools they’ve been generating, so this workshop felt like a timely and impactful contribution in support of researchers.

Valentina Di Francesco, MS, Lead Program Director, Computational Genomics and Data Science Program, NHGRI

Co-lead for the NIH Cloud Platform Interoperability Effort

Realizing a trans-NIH federated data ecosystem.

Among the variety of projects I am involved in, I am particularly enthusiastic about the NIH Cloud Platform Interoperability Effort, which aims to establish and implement guidelines and technical standards to empower end-user analyses across participating cloud-based platforms established across NIH in order to facilitate the realization of a trans-NIH federated data ecosystem.

Data science is a science at NIH.

After many years at NIH, only recently have I noticed a solid appreciation of the essential contributions of the statistical, mathematical, and computer science approaches to better understand biological systems. Finally, data science is respected as a field at NIH! I can’t think of a better time to join the ranks of women data scientists in biomedical research.

Kim Pruitt, PhD, Chief, Information Engineering Branch, National Center for Biotechnology Information, NLM

Co-leads the Lifecycle Metrics Working Group, which hosted the NIH Virtual Workshop on Data Metrics

Persevere, find a mentor, understand expectations, persevere.

My advice to someone entering this field is to persevere, to find an excellent mentor, to go into collaborations with a clear understanding of each member’s role and publication expectations, and to continually look for lessons learned when an analysis strategy fails (that is, cycle back to persevere).

Providing data access in the cloud.

Providing access to data on the NIH STRIDES Initiative cloud-based platform is a prerequisite to supporting and growing the biomedical data science field. Most notable to me is the significant achievement of providing the complete Sequence Read Archive data (roughly 40 PB and growing) in two formats and ahead of the planned schedule.

Jennifer Couch, PhD, Chief, Structural Biology and Molecular Applications Branch, National Cancer Institute

NIH Citizen Science Coordinator

Bringing new approaches to biomedical research.

My focus is on bringing new, diverse, and often outsider perspectives, tools, approaches, and methods into the biomedical research space. Together with many talented colleagues and collaborators, I look for ways to bring new approaches to biomedical research. Sometimes that involves creating opportunities for different research communities to come together and find ways to collaborate.

On finding the right collaborators.

Hone your skills, don’t be afraid to try out new methods, and find collaborators with interesting questions who will know the answer when they see it. Find those collaborators who appreciate that your skills and insights are critical to your joint project’s success.

Dr. Gregurick leads the implementation of the NIH Strategic Plan for Data Science through scientific, technical, and operational collaborations with the Institutes, Centers, and offices that make up NIH. She has substantial expertise in computational biology, high-performance computing, and bioinformatics.

A 21st-Century Approach to Health Services Research: NLM Moves Forward with You in Mind

Guest post by Doug Joubert, head of User Services and the National Information Center on Health Services Research and Health Care Technology, National Library of Medicine.

NLM has a strong record of involving its stakeholders in the strategic decisions that drive the products we develop and the services we offer. As the world’s largest biomedical library, NLM is committed to thinking strategically about how we can promote discovery while supporting the 21st-century data, data science, and information needs of our diverse user community.

Looking forward

As we consider how to better address the needs of everyone who produces or uses health services research, we invite you to be part of the process by responding to this Request for Information (RFI).

Through this RFI, NLM is seeking input on future resource and program directions in support of information related to health services research, practice guidelines, and health technology, including technology assessment. Specifically, feedback is requested on the following:

  • Products that NLM currently offers in the areas of health services delivery or health services research
  • Information types necessary for organizations to successfully support health services research or public health
  • Tools, resources, or health services literature that are the most critical for NLM to collect or support
  • Any other comments that would enable NLM to support future work related to health services delivery or health services research

Taking stock

The health services research community is supported by NLM’s many databases, tools, and services, including PubMed and PubMed Central, Bookshelf, MedlinePlus, and ClinicalTrials.gov. Our Unified Medical Language System and clinical vocabulary and data standards resources are used by individuals in clinical research and health practice in the United States and globally. Through our intramural and extramural research and training investments in biomedical informatics, computational biology, and genomics, we are advancing projects that address real-world challenges in public health surveillance, opioid intervention, social determinants of health, and other domains. NLM also promotes the use and reuse of data for research and discovery from both research studies and clinical data sources through publicly available national health surveys, diagnostic images, administrative claims, and electronic health records. 

Since the early 1990s, with the establishment of the National Information Center on Health Services Research and Health Care Technology (NICHSR), NLM has developed a number of specialized information resources targeting producers and users of health services research. These specialized resources were designed to address some of the challenges of finding and accessing credible and authoritative health services research information.

At the core of NLM’s service model is meeting the information needs of all those who seek current and trusted biomedical information. To this end, NLM has continued to increase, refine, and evaluate the health services research resources of NICHSR. These efforts reflect the changing needs of users and the ways in which health services delivery is evaluated. Through our products, services, and programs, we continue to strive to support the information needs of researchers, clinicians, health care professionals, policymakers, librarians, and the public.

We hope you’ll take the time to share your expertise and vision for health services research information at NLM so that NICHSR can continue evolving to meet your needs. We can’t wait to hear from you!

Doug Joubert is the head of User Services and the product owner for the NLM Health Services Research product portfolio. He supports a team that provides research and information services to the public. He also supports the NLM Strategic Plan by leveraging NLM tools and services to facilitate the management of data throughout the entire lifecycle. Doug works collaboratively to develop and support data science training for NLM Reference and Web Services staff.
