Guest post by Dianne Babski, Director of the NLM User Services and Collections Division (USCD), and Peter Seibert, USCD Librarian.
In a world of rapidly changing digital expectations, new formats to access and store information, and a dynamic biomedical landscape, users want to connect to data across an abundant, widely available, and growing ecosystem of biomedical research with one click. That is the future we are working to create by leveling up our dataset discovery technology to better understand user expectations and enhance the user experience.
To bring you closer to that one-click reality, NLM is excited to announce the launch of a beta version of a new online tool, the Dataset Catalog.
Search, Find, and Connect Biomedical Datasets
The Dataset Catalog is a catalog of biomedical datasets from publicly available repositories. The tool is designed to help improve the discoverability and reuse of research data by making it easier for users to find and connect biomedical datasets. This functionality aligns with NIH’s efforts to make available to the public the results of research it supports and conducts. Bringing disparate metadata into a standardized format empowers researchers to share and discover data in a broader environment and create relationships that might otherwise not be apparent.
Adhering to FAIR (Findable, Accessible, Interoperable, and Reusable) principles, the Dataset Catalog is an online, “all-in-one” tool that allows users to navigate among biomedical datasets by linking descriptive data. The system is modeled on the ease of use of PubMed, and like PubMed, it provides links out to datasets. So you could think of the Dataset Catalog as the “PubMed of datasets”!
How It Works
The Dataset Catalog is powered by an innovative NLM data model called the DATaset Metadata Model, or DATMM. Describing data in datasets and repositories, DATMM allows data to be interpreted and connected by computers across the biomedical ecosystem. DATMM, together with the Dataset Catalog, enhances access to and discovery of biomedical datasets through federated web search, thereby accelerating scientific research. This supports NIH’s responsible data management and sharing policies and practices by enabling more efficient validation of research results, providing access to high-value datasets, and promoting data reuse for future research studies. In this beta phase, functionality is limited, and users can search datasets from four repositories.
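To make the idea of a shared metadata model more concrete, here is a minimal Python sketch of how records from different repositories might be mapped into one common shape and then searched together. This is an illustration only and not the actual DATMM schema or Dataset Catalog code: the `CommonRecord` fields, the repository names `repo_a` and `repo_b`, the field mappings, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical shared record shape; NOT the actual DATMM schema."""
    title: str
    repository: str
    url: str


# Hypothetical per-repository mappings: source field name -> common field name.
FIELD_MAPS = {
    "repo_a": {"name": "title", "landing_page": "url"},
    "repo_b": {"dataset_title": "title", "link": "url"},
}


def normalize(repository: str, raw: dict) -> CommonRecord:
    """Translate one repository-specific record into the common shape."""
    mapping = FIELD_MAPS[repository]
    fields = {common: raw[source] for source, common in mapping.items()}
    return CommonRecord(repository=repository, **fields)


def search(records: list, term: str) -> list:
    """Case-insensitive title search across all normalized records."""
    term = term.lower()
    return [r for r in records if term in r.title.lower()]


# Two made-up records that describe the same kind of thing with different field names.
records = [
    normalize("repo_a", {"name": "Asthma cohort survey",
                         "landing_page": "https://example.org/a/1"}),
    normalize("repo_b", {"dataset_title": "Pediatric asthma imaging",
                         "link": "https://example.org/b/7"}),
]

for hit in search(records, "asthma"):
    print(hit.repository, "|", hit.title, "|", hit.url)
```

Once both records share the same fields, a single search can cover both repositories and link out to each dataset, which is the general pattern the Dataset Catalog describes.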
Your Feedback Matters!
We encourage you to check out the beta version of the Dataset Catalog and DATMM and to share feedback by clicking the vertical blue “Give Feedback” button on the right-hand side of the Dataset Catalog web pages. NLM will evaluate feedback obtained during this six-month beta phase to inform future product development and expansion, such as adding more repositories and functionality.
Be sure to let us know what would be most helpful for you to find the data you need to make new discoveries!
Learn More
If you are interested in learning more, NLM will host virtual office hours on Thursday, April 11, at 2:00 p.m. Eastern Time. Our team will demonstrate the tool’s functionality and features. It’s also another way to share your feedback. Learn more and register at https://www.nnlm.gov/training/class/nlm-office-hours-dataset-catalog.
Dianne Babski
Director, User Services and Collections Division, NLM
Ms. Babski is the Director of the User Services and Collections Division at the National Library of Medicine. She is responsible for overall management of one of NLM’s largest divisions and oversees budget, facilities, administration, and operations, including a national network of more than 8,000 academic, health science, and public libraries and community organizations to improve access to quality health information. Ms. Babski serves as the Scientific Review Administrator on the Literature Selection Technical Review Committee, a federal advisory committee responsible for recommending journals for inclusion in MEDLINE. Ms. Babski is currently leading a Generative AI (Gen AI) pilot at NLM to leverage the latest in Gen AI to unlock new pathways of biomedical discovery, increase operational efficiencies, and achieve better user experiences across NLM resources. She holds a BS in biology and a Master of Information Management.
Peter Seibert
Librarian, User Services and Collections Division, NLM
Mr. Seibert has served as a librarian at NLM for over four years. He currently leads development of the Dataset Catalog and DATMM products within the Controlled Vocabulary Services Program of the Health Data Standards Branch. Prior to NLM, Peter served as a Signal Officer in the US Army until 2004 and worked as a program manager specializing in network support operations at the Army Research Laboratory. Peter earned a BA from Alfred University and an MLIS from Catholic University.