Are we there yet?

On the road to data science at NLM

I can’t believe eight months have passed since I promised to “outline NLM’s plan to become what the ACD report recommended—the ‘epicenter of data science for the NIH.’” In June I sketched out a bit of that plan in our companion blog, DataScience@NIH, and recapped a handful of NLM efforts and accomplishments in data science.

I am immensely proud of these accomplishments, but I cannot take credit for them. As is often said, it takes a village…and we could achieve all that we have only through the combined efforts of staff across the entire Library.

How have we done it?

First, we try to be clear about the issue we’re addressing, clarifying and refining what we mean when we say “data science.” For me, data science comprises the principles and practices that underlie the effective use of data to glean insights and make new discoveries. To others it’s applying machine learning or biostatistics to investigate massively large data sets, as in our Lister Hill Center scientists’ exploration of the MIMIC data set or the Medicare claims data. And through Extramural Programs, we’re helping to fund yet another view of data science—the development of new analytical and methodological tools that can make personal medical data useful to patients and family members.

Second, we listen to the wisdom of our advisors. We’ve engaged over 150 experts and colleagues from across the country in NLM’s strategic planning process overseen by Drs. Dan Masys and Jill Taylor from our Board of Regents. Their report is still forthcoming, but in essence, they’ve advised us to build on our strengths, to remain true to our core mission, and to prepare for a future where data serve as a substrate to discovery.

To me their guidance translates to generating new methods at the intersection of library science, data science, and computer science to acquire, catalog, preserve, and make available data in the same way we’ve done for the scientific literature. If done well, that work will help accelerate the NIH “big science” initiatives (e.g., the BRAIN Initiative, All of Us Research Program, the Cancer Moonshot) while simultaneously ensuring the data can be applied to the broad range of environmental, behavioral, and social determinants of health.

Third, we interact with colleagues across the NIH. Every month or so I convene a panel of directors of NIH institutes and centers to get their perspectives on the data science issues NIH faces and the ways NLM might respond to these challenges. Together we’re working to pair the institutes’ and centers’ domain-specific needs, which call for a certain degree of independence and flexibility, with the wisdom, benefits, and power of a collaborative response. It’s a balancing act, one made easier by a shared dedication to a data-powered future for health and biomedical research.

Fourth, we look within. Members of NLM’s leadership team are working with the women and men in each of their divisions to critically appraise what resources, processes, or practices we already have that can help resolve existing challenges in data science (e.g., the principles of curation or our investments in common data elements), as well as to figure out what new skills and resources we must develop. This whole process requires that we provide a safe space for staff to explore their future options while feeling confident and secure in their present positions—not a simple task.

And finally, we strive for efficiency. We’re sorting out the roles and responsibilities of a range of committees, work groups, and task forces, trying to avoid redundancies and respect boundaries while making effective use of the time and talents we have to apply to this exciting opportunity.

In summary, we are making progress in data science by talking (and listening) to each other, a lot, and by keeping in mind that achieving a data-driven future will be, and in fact must be, a trans-NIH accomplishment.

So, are we there yet?

We know the future of NLM as the epicenter of data science will draw on our past and be shaped by the interactions of our present. We have over 180 years of tradition that serve as a solid platform from which to launch that future. We’re building the communication pathways to support visioning, accountability, and engagement. Most importantly, we’ve got the right people in the right discussions at the right time.

So, we may not be there yet, but we are well on our way!

But what more do we need to do to fuel our journey? What road hazards lie ahead? Let me know what you would do if you were at the wheel.

Author: Patti Brennan

Director, US National Library of Medicine
