Guest post by Susan Gregurick, PhD, associate director for data science and director of the Office of Data Science Strategy, National Institutes of Health.
There is an African proverb that says, “If you want to go fast, go alone. If you want to go far, go together.”
As I approach my first anniversary as the associate director for data science at NIH, this statement could not ring truer for me. By going together, NIH has made astonishing progress during this past year toward more advanced data science, impressive advances in data and computational infrastructure, and better FAIR data sharing.
Togetherness means collaboration that harnesses the power and strength of a diverse team. At NIH, women are using their expertise in data science and their teamwork skills to rapidly enable transformative programs.
Andrea Norris, director of the Center for Information Technology, said it well last year:
“This is such an exciting time for innovation at the intersection of biomedical, medical, and technology domains. It’s dynamic and fast moving. Whether you have scientific skills, business expertise or know technology, there’s a role — an important role — for you in this space, especially here at NIH.”
I spoke with 11 women who are significantly impacting data science activities at NIH about how they enable data science; their advice for young, aspiring women data scientists; and the data science accomplishments that make them proud.
Collaboration and the role that NIH has played in responding to the COVID-19 pandemic were common themes in our discussions. These women also spoke about the importance of having a mentor, the four antidotes to challenging times, and the necessity of diverse perspectives.
To get to know these women even better, read their full responses on the Women in Data Science page.
Jessica Mazerik, PhD, Data Science Workforce Director, Office of Data Science Strategy (ODSS)
Bringing diverse talent to NIH.
I lead central fellowship programs to bring talented computer and data scientists to NIH. Our external outreach efforts encourage women and other minorities to apply for the programs we support. And, internally, we support engagement across NIH to place students in diverse positions.
Breaking down silos to advance data science.
Talented and driven staff across NIH have mobilized to lead implementation tactics under the strategic plan for data science, and we’ve built a forum for discussion in monthly town hall meetings. Most importantly, teams across NIH are working together and communicating widely to break down silos to continue advancing data science.
Teresa Zayas Cabán, PhD, Coordinator, Fast Healthcare Interoperability Resources (FHIR) Acceleration, National Library of Medicine (NLM)
Co-leads the NIH FHIR Working Group
Advancing data standards within and beyond NIH.
I’m leading efforts to enable the use of standardized clinical and research data sharing to advance discovery. We’re working collaboratively not only within NIH to advance data science, but also across departments, government offices, and the field itself. Together, we are leading the field in a new direction: using in research, where appropriate, the same standards used in health care.
Be confident in what you know.
Don’t sell yourself short — speak up about what you know. Find good mentors who can advise you and be in your corner throughout your career. Find a good cohort of colleagues to collaborate and commiserate with.
Belinda Seto, PhD, Deputy Director, ODSS
Women leading data science communities.
We all have varying perspectives and visions for data science. Nonetheless, we have become nuclei of the NIH data science community. Through our collaborations, we are emissaries for data science to extramural grantee communities. I see this as a concentric circle of expanding national and even global communities of data science.
Technical and sociocultural accomplishments in data science.
A sociocultural accomplishment is that many silos have been dismantled, and the willingness and readiness to collaborate are demonstrably strong. On the technical front, there are successful examples of progress toward an NIH data ecosystem, both at the foundational level and at the leading edge.
Lisa Federer, PhD, Data Science and Open Science Librarian, Office of Strategic Initiatives, NLM
Leads the NIH Data Science Training Committee
Be a lifelong learner.
Embrace lifelong learning — there’s always something new to learn! I’ve made it a priority to learn new things that I can bring to my work, including going back to school to get a PhD in information science with a focus on data science.
Open science practices advancing our understanding of COVID-19.
NIH has been doing impressive work in advancing our understanding of COVID-19 and has been a leader in making data related to SARS-CoV-2 widely available so that researchers around the world can help tackle this important issue. In the face of this global problem, open science practices will help us make progress toward therapies and vaccines more quickly.
Jennie Larkin, PhD, Deputy Director, Division of Neuroscience, National Institute on Aging
Co-leads the FAIR Data Repositories Team, which ran the one-year NIH Figshare instance pilot
Engage and embed data science in different programs.
Ask questions, learn, and engage. We need more bright people who can bring new perspectives, expertise, and energy to data science and help embed data science in different research programs.
Working with the community to address the COVID-19 pandemic.
The increasing breadth and depth of data science expertise across NIH and the larger biomedical enterprise have allowed us to rapidly accomplish much more than was possible just a few years ago. We have seen the best of our community in its willingness to come together to meet the challenge of the COVID-19 pandemic.
Leads the Researcher Auth Service Initiative
Learn from traditional and nontraditional resources.
I encourage young women in all biomedical science fields to incorporate data science into their career development plans. Look for data science educational resources from both traditional and nontraditional sources and network within those sources.
Collaboration to realize a data ecosystem.
The NIH data ecosystem has an increasingly tangible presence. We have growing numbers of researchers analyzing data across NIH cloud-based platforms, thanks in part to the new Office of Data Science Strategy, the NIH STRIDES Initiative, and a greater level of collaboration across NIH Institutes and Centers.
Heidi Sofia, PhD, Program Director, National Human Genome Research Institute (NHGRI)
Co-leads the Biomedical Information Science and Technology Initiative consortium and organized supplements to enhance software tools for open science (NOT-OD-20-073)
Beauty, awe, love, and humor.
I am never happier than when some brilliant young or established scientist in the community brings forward innovative, transformative science which I can endeavor to foster. In these instances, I find the first two of the four antidotes to our challenging times (beauty, awe, love, and humor). And my colleagues often provide the last one.
Use your power for good.
Among the first “computers” were women who performed the mathematical calculations needed to advance science, starting in 1757 in the search for Halley’s comet. Today, data science is a superpower for women in fields ranging from medicine to the natural sciences to business. So empower yourself, and use your power for good!
Maryam Zaringhalam, PhD, Data Science and Open Science Officer, Office of Strategic Initiatives, NLM
Women make data science better.
The lived experiences and perspectives of women — particularly women who are Black, Indigenous and People of Color (BIPOC); members of the LGBTQIA+ community; or members of the disability community — are critically important in ensuring that the products of data science have the greatest benefit for us all. Every chance I get, I tell women that they not only belong in data science, but that data science is better because of them.
Enabling researchers to make COVID-19 data available.
I was proud to be involved in quickly planning and organizing a joint NLM-ODSS webinar on sharing, discovering, and citing COVID-19 data and code using generalist repositories. It’s been inspiring to see the research community so eager to share the data and tools they’ve been generating, so this webinar felt like a timely and impactful contribution in support of researchers.
Valentina Di Francesco, MS, Lead Program Director, Computational Genomics and Data Science Program, NHGRI
Co-leads the NIH Cloud Platform Interoperability Effort
Realizing a trans-NIH federated data ecosystem.
Of the projects I am involved in, I am particularly enthusiastic about the NIH Cloud Platform Interoperability Effort. It aims to establish and implement guidelines and technical standards that empower end-user analyses across participating cloud-based platforms at NIH, facilitating the realization of a trans-NIH federated data ecosystem.
Data science is a science at NIH.
After many years at NIH, only recently have I noticed a solid appreciation of the essential contributions that statistical, mathematical, and computer science approaches make to a better understanding of biological systems. Finally, data science is respected as a field at NIH! I can’t think of a better time to join the ranks of women data scientists in biomedical research.
Kim Pruitt, PhD, Chief, Information Engineering Branch, National Center for Biotechnology Information, NLM
Co-leads the Lifecycle Metrics Working Group, which hosted the NIH Virtual Workshop on Data Metrics
Persevere, find a mentor, understand expectations, persevere.
My advice to someone entering this field is to persevere, to find an excellent mentor, to go into collaborations with a clear understanding of each member’s role and publication expectations, and to continually look for lessons learned when an analysis strategy fails (that is, cycle back to persevere).
Providing data access in the cloud.
Providing access to data on the NIH STRIDES Initiative cloud-based platform is a prerequisite to supporting and growing the biomedical data science field. Most notable to me is the significant achievement of providing the complete Sequence Read Archive data (roughly 40 PB and growing) in two formats and ahead of the planned schedule.
Jennifer Couch, PhD, Chief, Structural Biology and Molecular Applications Branch, National Cancer Institute
NIH Citizen Science Coordinator
Bringing new approaches to biomedical research.
My focus is on bringing new, diverse, and often outsider perspectives, tools, approaches, and methods into the biomedical research space. Together with many talented colleagues and collaborators, I look for ways to bring new approaches to biomedical research. Sometimes that involves creating opportunities for different research communities to come together and find ways to collaborate.
On finding the right collaborators.
Hone your skills, don’t be afraid to try out new methods, and find collaborators with interesting questions who will know the answer when they see it. Find those collaborators who appreciate that your skills and insights are critical to your joint project’s success.
Dr. Gregurick leads the implementation of the NIH Strategic Plan for Data Science through scientific, technical, and operational collaborations with the Institutes, Centers, and offices that make up NIH. She has substantial expertise in computational biology, high-performance computing, and bioinformatics.