Guest post by Valerie Schneider, PhD, staff scientist at the National Library of Medicine’s National Center for Biotechnology Information, National Institutes of Health.
As a staff scientist at NLM, I’ve found that our strategic plan has become a valuable framework for organizing our mission and providing direction and focus—especially when we’re talking about data science.
A recent project at NLM’s National Center for Biotechnology Information (NCBI) highlights why it’s important to ensure alignment between projects and strategy.
As host to the world’s largest repository of biological sequence data, NCBI provides access to data that are critical to understanding and advancing human health. Because users were searching NCBI’s sequence databases long before the strategic plan was developed, it might be easy to overlook what the strategic plan has to do with the larger picture. When you look, though, it’s easy to see the relationship.
Providing a Common Search Experience
Connecting the resources of a digital research enterprise and advancing research and development in biomedical informatics and data science are among the important objectives in NLM’s strategic plan. In support of these objectives, we’ve improved the experience of users searching for several types of common sequence-associated data by providing a more comprehensive interpretation of their queries and a new results interface that provides easy access to NCBI’s best results, regardless of the database in which they search.
Our team tackled this effort by conducting extensive user interviews, iteratively developing solutions, and monitoring the usage of those solutions.
We improved searches for the reference set of genes and genomes in all species across multiple NCBI databases by supporting common language queries and using features like auto-suggest. We enhanced the ability to search and access clinically important datasets, such as human variations housed in ClinVar and dbSNP, NCBI’s variation databases, as well as resources with information about antimicrobial resistance genes and viral pathogens.
We also created displays that aggregate the results from different databases and enable easy downloads of data and access to analysis tools. Our new interactive graphics and web page displays allow for the visualization of sequences and the analysis of homologous gene sets. Knowing that NLM users rely on different technologies to access data, we ensured that the displays work on both traditional computers and mobile devices.
These search enhancements were first released in late 2018 and are now triggered in a quarter of all searches in the scoped databases. We’ve seen a 300% increase in their use, with more than 300,000 users clicking on the content they offer in just the month of October 2019. These products have provided results for over 500,000 searches that previously would have returned no content. Regular monitoring of their use helps us make sure that we continue to facilitate search and deliver high-value data.
NLM’s strategic plan gave us the user-centered framework in which to execute the goals of this project. So much of the work we do at NLM is consistent with the goals and specific objectives of the plan — it provides a structure for evaluating our work and making sure that we continue to be forward-looking.
And the strategic plan helps me, as a staff scientist, to identify new areas for work that will best enable NLM to continue delivering a platform for biomedical discovery and data-powered health.
To stay up to date on NCBI projects and research, follow us on Twitter.
Valerie Schneider, PhD, is the deputy director of Sequence Offerings and the interim head of the Sequence Delivery Program. In these roles, she coordinates efforts associated with the curation, enhancement, and organization of sequence data, and oversees tools and resources that enable the public to access, analyze, and visualize biomedical data. She also manages NCBI’s involvement in the Genome Reference Consortium, the international collaboration tasked with maintaining the value of the human reference genome assembly.