Guest post by Kathryn Funk, program manager for NLM’s PubMed Central.
Last week, National Library Week celebrated how libraries and library workers make our communities stronger. In the spirit of building strong communities, NLM has committed to “democratiz[ing] access to the products and processes of scientific research.”
NLM delivers on that commitment by supporting the NIH Public Access Policy. This policy, passed by Congress in 2008, requires authors funded by NIH to make publicly accessible in PubMed Central (PMC) any peer-reviewed paper accepted for publication. Now, over a decade after the NIH Public Access Policy went into effect, PMC makes more than 1 million NIH-funded papers available to the research community and the public. This volume of publicly accessible, NIH-funded papers represents a clear return on investment for the public, but numbers alone don’t provide the full story.
A quick dive into NIH Research Matters, a weekly update of NIH research highlights, offers a much richer and more personal picture of how the NIH Public Access Policy and NLM’s support of it can strengthen and empower communities. Making NIH-funded papers publicly accessible in PMC means that the public has free and direct access to research that touches on some of the most critical public health concerns facing our community, including studies that:
Suggest a method for detecting breast tumors earlier and more often, creating a higher chance of survival for patients (NIH Research Matters | PMC);
Identify treatment options for reducing the risk of death for people who’d previously had a non-fatal opioid overdose (NIH Research Matters | PMC);
Explore how maternal nutrition supplements can increase infant birth size and potentially improve children’s life-long health (NIH Research Matters | PMC);
Identify young people with suicidal thoughts by using machine learning to analyze brain images (NIH Research Matters | PMC); and
Gauge exercise’s impact on the growth of new nerve cells in the brains of mice, which could potentially reduce memory problems in people with Alzheimer’s disease (NIH Research Matters | PMC).
These examples illustrate that access, while essential, is not the Library’s end goal. Improved health is.
NLM supports public access to research outputs to accelerate scientific discovery and advance the health of individuals and our communities. It is the best way we can honor the investment made by the American people in scientific research and the surest way to make our communities stronger.
Kathryn Funk is the program manager for PubMed Central. She is responsible for PMC policy as well as PMC’s role in supporting the public access policies of numerous funding agencies, including NIH. Katie received her master’s degree in library and information science from The Catholic University of America.
Guest post by the Data Science @NLM Training Program team.
Regular readers of this blog probably know that NLM staff are expanding their expertise beyond library science and computer science to embrace data science. As a result, NLM—in alignment with strategic plan Goal 3 to “build a workforce for data-driven research and health”—is taking steps to improve the entire staff’s facility and fluency with this field so critical to our future.
The Library is rolling out a new Data Science @NLM Training Program that will provide targeted training to all of NLM’s 1,700 staff members. We are also inviting staff from the National Network of Libraries of Medicine (NNLM) to participate so that everyone in the expanded NLM workforce has the opportunity to become more aware of data science and how it is woven into so many NLM products and services.
For some of our staff, data science is already a part of their day-to-day activities; for others, data science may be only a concept, a phrase in the strategic plan—and that’s okay. Not everyone needs to be a data scientist, but we can all become more data savvy, learning from one another along the way and preparing to play our part in NLM’s data-driven future. (See NLM in Focus for a glimpse into how seven staff members already see themselves supporting data science.)
Over the course of this year, the data science training program will help strengthen and empower our diverse and data-centric workforce. The program will provide opportunities for all staff to participate in a variety of data science training events targeted to their specific interests and needs. These events range from the all-hands session we had in late January that helped establish a common data science vocabulary among staff to an intensive, 120-hour data science fundamentals course designed to give select NLM staff the skills and tools needed to use data to answer critical research questions. We’re also assessing staff members’ data science skill levels and creating skill development profiles that will guide staff in taking the steps necessary to build their capacity and readiness for working with data.
At the end of this process, we’ll better understand the range of data science expertise across the Library. We’ll also have a much clearer idea of what more we can do to develop staff’s facility and fluency with data science and how to better recruit new employees with the knowledge and skills needed to advance our mission.
In August, the training program will culminate with a data science open house where staff can share their data science journey, highlight group projects from the fundamentals course, and find partners with whom they can collaborate on emerging projects throughout the Library.
But that final phase of the training initiative doesn’t mean NLM’s commitment to data science is over. In fact, it will be just the beginning.
In the coming years, staff will apply their new and evolving skills and knowledge to help NLM achieve its vision of serving as a platform for biomedical discovery and data-powered health.
How are you supporting the data science development of your staff? Let’s share ideas to keep the momentum going!
Co-authored by the Data Science @NLM Training Program team.
Guest post by Derek Johnson, MLIS, Health Professionals Outreach Specialist for the National Network of Libraries of Medicine Greater Midwest Region
Recent articles in Preventing Chronic Disease and The Nation’s Health chronicle how public libraries can complement the efforts of public health workers in community outreach and engagement. Data tell us that more Americans visit public libraries in a year (1.39 billion) than they do health care providers (990 million). Moreover, over 40% of computer-using patrons report using libraries to search for health information. However, we also know many individuals struggle with accessing and understanding the health information they encounter every day.
This challenge raises the question, “How does the National Library of Medicine (NLM) increase access to trustworthy health information to improve the health of communities across the United States?”
Take, for example, Richland County Public Health in Ohio. Richland County is approximately 33% rural. Many rural areas have been identified as “internet deserts.” In addition, adults in the county have lower rates of high school and college-level education compared to state averages. Seeking to address these disparities, Richland County Public Health applied for a funding award from NNLM’s Greater Midwest Region to develop an Interactive Health Information Kiosk in partnership with the county public library system.
With funding in hand, Richland County Public Health loaded select NLM resources onto specially configured iPads and installed them in the nine branches of the Richland County Libraries. A health educator trained library staff, local healthcare providers, and the public on how to use those resources to access trustworthy health information. Moving forward, librarians will be able to help patrons use the health kiosks. As a result, Richland County Public Health is helping improve health literacy among adult residents and, ultimately, enabling them to make more informed decisions about their health.
Another example of a public health and public library collaboration comes from NNLM’s Middle Atlantic Region (MAR). The Philadelphia Department of Public Health recognized the need to engage individuals in neighborhoods most vulnerable to severe weather events to increase their knowledge of disaster and emergency preparedness.
With funding from MAR, the Philadelphia Department of Public Health partnered with four branches of the Free Library of Philadelphia to train both librarians and local residents on emergency preparedness. Participants learned how to make use of the NLM Disaster Information Management Research Center and where to find local resources during weather-related emergencies.
These are just two of the many projects that NNLM helps facilitate across the country through its network of more than 7,500 library, public health, community-based, and other organizational members.
And, while NNLM continues to identify partnerships for funding public health and library projects, it also engages health educators by offering continuing education credit for Certified Health Education Specialists (CHES). CHES-certified professionals work in a variety of health care and public health settings where they help community members adopt and maintain healthy lifestyles. Health educators can earn continuing education credits by attending specially designated NNLM webinars on topics such as health statistics and evidence-based public health, with more courses in the works.
As communities continue to rely on the public health workforce to sustain and build healthy environments, know that the National Library of Medicine and its National Network of Libraries of Medicine are here to support the work they do!
Derek Johnson, MLIS is the Health Professionals Outreach Specialist for the National Network of Libraries of Medicine Greater Midwest Region. In this capacity, he conducts training and outreach to public health professionals on a variety of topics, including evidence-based public health, health disparities, and community outreach.
Guest post by Kurt W. Rodarmer, a software security architect in NLM’s National Center for Biotechnology Information.
NLM is working to unleash the potential of data and information to accelerate and transform biomedical discovery. Foundational to that goal lie the data themselves. We assess their value, collect and curate them, and then make them accessible.
But access has its risks. Big risks. Especially when it comes to personal medical data or hard-earned, grant-funded proprietary data. We need to find a way to deliver access while simultaneously controlling and protecting the data.
That’s where security comes in.
We’re all familiar with “identity-based security,” evolution’s primitive mechanism that predates our species. It starts by using our eyes, ears, and nose to identify someone or something and ends with an immediate risk-assessment. Not surprisingly, this mechanism was modeled in modern cybersecurity and is virtually ubiquitous across consumer and industrial-grade systems.
For all their efforts though, these systems sure seem to fail—a lot. Common wisdom suggests breaches are inevitable, but that’s not entirely true. There are other approaches.
Authority-based security is one. With that, authority, permissions, and trust are explicitly modeled, and policy decisions are made up front. We create objects that embody these ethereal concepts and make them tangible. These objects can then be stored, transmitted, accessed, sub-divided, transferred, etc. The discipline of modeling and managing authority is called Authority Management.
Identity- and authority-based approaches achieve several common goals. They each have strengths and weaknesses. Where they differ, the stronger, more effective, and more elegant of the two is nearly always authority-based.
Both approaches grant permissions based upon security policy. Authority-based security captures the result of policy evaluation as permissions in unforgeable and unmodifiable tokens. Since these tokens come from a known source of authority and are tamper-evident, the permissions they contain require no further scrutiny. They are as trustworthy as the authority that issued them. A permission token typically contains only a small subset of the overall permissions available to an individual, ideally never more than are needed within the current dynamic context.
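To make the idea of a tamper-evident permission token concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any NLM system: the key, the permission names, and the functions `issue_token` and `verify_token` are all hypothetical. It uses an HMAC, which means the verifier shares the authority's secret key; a real deployment might prefer public-key signatures.

```python
import hashlib
import hmac
import json

# Hypothetical secret held by the issuing authority (and, in this sketch, the verifier).
AUTHORITY_KEY = b"demo-key-known-only-to-the-authority"


def issue_token(permissions):
    """Serialize the permissions and attach an HMAC tag computed by the authority."""
    payload = json.dumps(sorted(permissions)).encode("utf-8")
    tag = hmac.new(AUTHORITY_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_token(token):
    """Return the set of permissions if the tag matches, otherwise None."""
    expected = hmac.new(AUTHORITY_KEY, token["payload"], hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(expected, token["tag"]):
        return None
    return set(json.loads(token["payload"]))


# Hypothetical permission name, for illustration only.
token = issue_token({"read:study-123"})

# Someone edits the payload to add a permission but cannot recompute the tag.
forged_payload = json.dumps(["read:study-123", "write:study-123"]).encode("utf-8")
tampered = dict(token, payload=forged_payload)
```

Verifying `token` returns the original permission set, while verifying `tampered` returns `None`, because the altered payload no longer matches the tag.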
By contrast, identity-based techniques make permission decisions based upon global attributes or provide crude static mechanisms. In most cases, they reflect zero context sensitivity. That means, for example, that if I run a program on a stock Linux system, that program executes using 100% of my permissions, even though it may need only read access to one file and write access to one directory. For all I know the program could be surreptitiously stealing my most sensitive data in the background, and I’d have no awareness or protection against it. Without my permission? That’s the point—I just gave it ALL my permissions!
In an authority-managed system, I would have given that same program permissions to access only the file and directory needed, leaving it powerless to read other sensitive files, much less phone home and exfiltrate them.
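The shape of that arrangement can be sketched in Python by handing a program only the handles it needs instead of letting it open paths itself. This is a loose analogy with hypothetical names (`word_count`, `input.txt`, `report.txt`): Python does not stop the function from calling `open()` on other files, whereas an authority-managed system would enforce the restriction. The sketch also passes a single write handle rather than a whole directory.

```python
import tempfile
from pathlib import Path


def word_count(readable, writable):
    """Hypothetical program: it works only with the two handles it is given."""
    count = len(readable.read().split())
    writable.write(f"{count}\n")
    return count


with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "input.txt"
    source.write_text("alpha beta gamma")
    report = Path(workdir) / "report.txt"

    # The caller decides exactly which resources the program may touch.
    with source.open("r") as readable, report.open("w") as writable:
        word_count(readable, writable)

    result = report.read_text()
```

The caller, not the program, chooses the file to read and the place to write, so the program never needs the caller's full set of permissions.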
So, if identity-based security is so far behind the curve, what accounts for its continued use? It has one highly prized strength: its ability to revoke permissions on the spot. Since permissions are granted at the moment they are going to be exercised, any permission can be immediately denied as the result of updating policy. Since this policy update is often reactive, coming about once damage has already occurred and possibly delayed by weeks or months, the value of its immediacy is questionable. Tokens have a built-in timeout making them self-revoking, and in practice perform similarly.
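A built-in timeout can be sketched in a few lines of Python. The functions `make_token` and `is_allowed` and the field names are hypothetical, and signing is left out to keep the focus on expiry. The `now` parameter exists only so the behavior can be checked deterministically.

```python
import time


def make_token(permissions, ttl_seconds, now=None):
    """Create a token that stops being valid ttl_seconds after it is issued."""
    issued = time.time() if now is None else now
    return {
        "permissions": frozenset(permissions),
        "expires_at": issued + ttl_seconds,
    }


def is_allowed(token, permission, now=None):
    """A permission holds only while the token is unexpired and lists it."""
    current = time.time() if now is None else now
    return current < token["expires_at"] and permission in token["permissions"]


# Issue a 60-second token at a fixed, made-up clock reading of 1000.0.
token = make_token({"read"}, ttl_seconds=60, now=1000.0)
```

Thirty seconds after issue, `is_allowed(token, "read", now=1030.0)` is true. At sixty seconds the same check is false, with no policy update required.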
Here’s how it works. To do anything of substance in a system, you need permissions. You may have those permissions already stored on some device, such as your phone. Or, you may need to go through the process of identifying yourself to some part of the system that is storing permissions on your behalf, accessible once your identity has been authenticated. In either case, the first step is to get ahold of a token containing your set of pre-approved permissions.
The permission set you now hold represents the complete permissions you have within the system you have just entered, e.g., dbGaP, a grant administration system, etc. It is unlikely to represent all the permissions you have within every system you can access. Even so, it’s probably too permissive for what you have in mind. Your next step would typically be to subset your permissions to only those needed to limit the potential damage should the token fall into the wrong hands.
Sometimes you need to share your permissions, such as when a grant-funded investigator delegates most of the research documentation to lab assistants. She can take her permission tokens received with the grant, subset, and delegate them to her lab as appropriate, so everyone can work.
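Subsetting and delegation can be sketched as operations that only ever narrow a permission set. The `Token` class, its methods, and the permission names below are hypothetical and chosen for illustration. A real system would also bind the token to its issuing authority, which is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    holder: str
    permissions: frozenset

    def subset(self, wanted):
        """Return a narrower token; asking for more than is held is an error."""
        wanted = frozenset(wanted)
        if not wanted <= self.permissions:
            extra = sorted(wanted - self.permissions)
            raise PermissionError(f"cannot grant permissions not held: {extra}")
        return Token(self.holder, wanted)

    def delegate(self, new_holder, wanted):
        """Hand a subset of these permissions to someone else."""
        return Token(new_holder, self.subset(wanted).permissions)


# Hypothetical permissions an investigator might receive with a grant.
investigator = Token(
    "investigator",
    frozenset({"read:data", "write:docs", "approve:budget"}),
)

# The lab assistant gets only what documentation work requires.
assistant = investigator.delegate("lab-assistant", {"write:docs"})
```

Because `subset` rejects any permission the original token lacks, delegation can reduce authority but never expand it.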
What else can you do with them? Literally anything that can be done in an information system! Beyond implementing the traditional security processes of Identity and Access Management (IAM, a proper subset of Authority Management), tokens are also used to protect resources in other ways. They can be used to model spending accounts and quotas, control access to consumable or metered resources, mitigate DoS attacks, provide audit trails, and eliminate the use of passwords and multiple logins.
Because tokens carry permissions whose source of authority is irrefutable, they are the mechanism for implementing the fundamental principles of security. We can bring some of their benefits to bear right now and help lay the groundwork for secure, accessible biomedical data.
Kurt Rodarmer started work on military-grade secure operating systems over 20 years ago in Silicon Valley, working with the architect of KeyKOS, Norman Hardy. He is an expert in secure software and language design and has formalized the field of Authority Management. Kurt previously worked for Apple and Oracle and was a consultant to IBM and Sun, among others.
Guest post by David Hale, Information Technology Specialist at NLM.
Did you know that each day more than four million people use NLM resources and that every hour a petabyte of data moves in or out of our computing systems?
Those mammoth numbers indicate to me how essential NLM’s array of information products and services are to scientific progress. But as we gain more experience with providing information, particularly clinical, biologic, and genetic datasets, we’re finding that how we share data is as critical as the data itself.
To fuel the insights and solutions needed to improve public health, we must ensure data flow freely to the researchers, industry innovators, patient communities, and citizen scientists who can bring new lenses to these rich repositories of knowledge.
One way we’re opening doors to our data is through an open data portal called Data Discovery. While agencies like the Centers for Disease Control and the Centers for Medicare and Medicaid Services are already utilizing the same platform with success, NLM is the first of NIH’s Institutes and Centers to adopt the platform. Our first datasets are already available, including content from such diverse resources as the Dietary Supplement Label Database, Pillbox, ToxMap, Disaster Lit, and HealthReach.
Why did NLM take this step? While many of our data resources have long been publicly available online, housing them within Data Discovery offers unconstrained access and delivers key benefits:
Powerful data exploration tools—By showing the dataset as a spreadsheet, the Data Discovery platform offers freedom to filter and interact with the data in novel ways.
Intuitive data visualizations—A picture is worth a thousand words, and nowhere is that truer than in leveraging data visualizations to bring new perspectives on scientific questions.
Open data APIs—Open data alone isn’t enough to fuel a new generation of insights. Open APIs are critical to making the data understandable, accessible, and actionable, based on the unique needs of the user or audience.
What does this mean in practice?
Let’s look at the Office of Dietary Supplements’ (ODS) Dietary Supplement Label Database (DSLD) to illustrate the potential of leveraging Data Discovery.
More than half of all Americans take at least one dietary supplement a day. Reliable information about those supplements is critical to their appropriate use, making DSLD a timely and important dataset to make available in an open data platform. Through Data Discovery, researchers, academics, health care providers, and the public will be able to explore and derive insights from the labels of more than 85,000 dietary supplement products currently or formerly sold in the US.
Developers and technologists who support research, health, and medical organizations require APIs that are modern, interoperable, and standards-compliant. Data Discovery provides a powerful solution to these needs, supporting NLM’s role as a platform for biomedical discovery and data-powered health.
Beyond fueling scientific discovery, open access to data holds another benefit for advancing public health: contributing to the professional development of data and informatics specialists. An increasingly important part of the health care workforce, informaticists help researchers extract the most meaningful insights from data, driving new developments in the lab and better management of patients and populations.
I invite you to explore the new Data Discovery portal. It’s an exciting step forward in achieving key aspects of the NLM Strategic Plan—to advocate for open science, further democratize access to data, and support the training and development of the data science workforce.
David Hale is an Information Technology Specialist at the National Library of Medicine. In addition to leading Data Discovery, David is also project lead for NLM’s Pillbox, a drug identification, reference, and image resource. He received his Bachelor of Science in Physical Science from the University of Maryland.