Socio-legal Barriers to Data Reuse

Envisioning a sustainable data trust

Guest post by Melissa Haendel, PhD, a leader of and advocate for open science initiatives.

The increasing volume and variety of biomedical data have created new opportunities to integrate data for novel analytics and discovery. Despite a number of clinical success stories that rely on data integration (e.g., rare disease diagnostics, cancer therapeutic discovery, drug repurposing), within the academic research community, data reuse is not typically promoted. In fact, data reuse is often considered “not innovative” in funding proposals and has even come under attack. (See the now infamous “research parasites” editorial in The New England Journal of Medicine.)

The FAIR data principles—Findable, Accessible, Interoperable, and Reusable—are a terrific set of goals for all of us to strive for in our data sharing, but they detail little about how to realize effective data reuse. If we are to grow innovation from our collective data resources, we must look to pioneers in data harmonization for insight into the specific advantages and challenges of data reuse at scale. Current data-licensing practices for most public data resources severely hamper data reuse, especially at scale. Integrative platforms such as the Monarch Initiative, the NCATS Biomedical Data Translator, the Gabriella Miller Kids First Data Resource Portal, and myriad other cloud data platforms will be able to accelerate scientific progress more effectively if licensing issues can be resolved. As a member of these various consortia, I want to facilitate the legal use and reuse of increasingly interconnected, derived, and reprocessed data. The community has previously raised this concern in a letter to NIH.

How reusable are most data resources? In our recently published manuscript, we created a rubric for evaluating the reusability of a data resource from the licensing standpoint. We applied this rubric to more than 50 biomedical data and knowledge resources. These assessments and the evaluation platform are openly available at the (Re)usable Data Project (RDP). Each resource was scored on a scale of zero to five stars on the following measures:

  • findability and type of licensing terms
  • scope and completeness of the licensing
  • ability to access the data in a reasonable way
  • restrictions on how the data may be reused, and
  • restrictions on who may reuse the data.

We found that 57% of the resources scored three stars or fewer, indicating that license terms may significantly impede the use, reuse, and redistribution of the data.

Custom licenses constituted the largest single class of licenses found in these data resources. This suggests the resource providers either did not know about standard licenses or believed the standard licenses did not meet their needs. Moreover, while the majority of custom licenses were restrictive, just over two-thirds of the standard licenses were permissive, leading us to wonder whether some needs and intentions are not being met by the existing set of standard permissive licenses. In addition, about 15% of resources had either missing or inconsistent licensing. This ambiguity and lack of clear intent requires clarification and possibly legal counsel.

A total of 61.8% of data resources use nonpermissive licenses.

Putting this all together, a majority of resources would not meet basic criteria for legal frictionless use for downstream data integration and redistribution, despite the fact that most of these resources are publicly funded, which should mean the content is freely available for reuse by the public.

If we in the United States have a hard time understanding how we may reuse data given these legal restrictions, we must consider the rest of the world—which presumably we aim to serve—and how hard it would be for anyone in another country to navigate this legalese. I hope the RDP’s findings will encourage the worldwide community to work together to improve licensing practices to facilitate reusable data resources for all.

Given what I have learned from the RDP and a wealth of experience in dealing with these issues, I recommend the following actions:

  • Funding agencies and publishers should ensure that all publicly funded databases and knowledge bases are evaluated against licensing criteria (whether the RDP’s or something similar).
  • Database providers should use these criteria to evaluate their resources from the perspective of a downstream data user and update their licensing terms, if appropriate.
  • Downstream data re-users should provide clear source attribution and should always confirm it is legal to redistribute the data. It is very often the case that it is legal to use the data but not to redistribute it. In addition, many uses are actually illegal.
  • Database providers should guide users on how to cite the resource as a whole, as individual records, or as portions of the content when mashed up in other contexts (which can include schemas, ontologies, and other non-data products). Where relevant, providers should follow best practices declared by a community, for example the Open Biological Ontologies citation policy, which supports using native object identifiers rather than creating new digital objects.
  • Data re-users should follow best practices in identifier provisioning and reference within the reused data so it is clear to downstream users what the license actually applies to.

I believe that, to be useful and sustainable, data repositories and curated knowledge bases need to clearly credit their sources and specify the terms of reuse and redistribution. Unfortunately, these resources are currently and independently making noncompatible choices about how to license their data. The reasons are multifold but often include the requirement for sustainable revenue that is counter to integrative and innovative data science.

Based on the productive discussions my collaborators and I have had with data resource providers, I propose the community work together to develop a “data trust.” In this model, database resource providers could join a collective bargaining organization (perhaps organized as a nonprofit), through which they could make their data available under compatible licensing terms. The aggregate data sources would be free and redistributable for research purposes, but they could also have commercial use terms to support research sustainability. Such a model could leverage value- or use-based revenue to incentivize resource evolution and innovation in support of emerging needs and new technologies, and would be governed by the constituent member organizations.

Melissa Haendel, PhD, leads numerous local, national, and global open science initiatives focused on semantic data integration and disease mechanism discovery and diagnosis, namely, the Monarch Initiative, the Global Alliance for Genomics and Health (GA4GH), the National Center for Data to Health (CD2H), and the NCATS Biomedical Data Translator.

Next Up for the NLM Biomedical Informatics Training Program

Guest post by Katherine Majewski, NLM Librarian.

How are librarians applying informatics?

This is the question we want to answer in re-envisioning the NLM Biomedical Informatics training program. The survey-style course, most recently hosted by Augusta University in Georgia, provided a sampling of the vast realms of informatics research and application in the health sciences. We want to build on the success of that course by targeting the specific skills and knowledge that librarians can use right now to tackle real-world challenges.

For example, Barbara Platts and her team provide clinical information services for Munson Healthcare in Traverse City, Michigan. Over the last several years, Barbara’s role at Munson has expanded into electronic health records (EHR). She now contributes to the policy and management of clinical information flow both within and outside the EHR system. As part of that effort, Barbara enhances the functionality of Munson’s EHR; increases the usable clinical content provided across multiple platforms; develops efficient knowledge management structures for hospital communities of practice; and trains hospital employees to use critical appraisal skills to find the best information services available.

How can NLM support this important work and help other librarians follow Barbara’s lead in using information tools to improve patient care?

In trying to answer that question, we’ve been exploring the connections between clinical librarians, informatics, and patient care to better understand NLM’s role. This past year we offered a webinar series entitled “Clinical Information, Librarians, and the NLM: From Health Data Standards to Better Health,” which focused on the roles and products of the National Library of Medicine related to applied clinical informatics, particularly within electronic health records systems and clinical research.  We devoted one of the six sessions in the series to discussing emerging roles and training needs for aspiring informatics librarians. In conjunction with the series, we solicited interviews, visited clinical sites, and polled webinar participants to learn about the specific skills and knowledge clinical librarians are using now or will need in the future.

Along the way we heard from many librarians like Barbara who are part of the clinical information flow, though not always integrated into clinical systems as much as they would like.

We learned that librarians are:

  • working with clinical teams to improve patient care and safety by improving the efficiency and effectiveness of information delivery;
  • connecting systems to systems, bridging the divide between clinicians and information technology staff;
  • crafting information policies and practices within and between health institutions to reduce waste and redundancy and improve patient care;
  • supporting research by:
    • framing research questions,
    • informing research design methods, and
    • managing research data;
  • conducting research in text mining, artificial intelligence and machine learning;
  • selecting and licensing content, including patient education content; and
  • educating users.

How can NLM support these current and future roles for librarians?

Underlying any work related to health information must be a strong facility with the information services NLM provides. This should not be understated or undervalued: Librarians make significant contributions to health using their knowledge of information sources and retrieval techniques, and NLM resources are at the center.

But those librarians who are managing data or making system-level connections between patients and health information need additional skills and knowledge from NLM. These fall into two general areas:

  • The ability to manage and direct access to NLM systems and data (e.g., through APIs), and
  • An understanding of the terminologies that can be used to connect systems.

What is the NLM plan for informatics training for librarians and other information professionals?

To support patient care, we are:

To support research, we will:

What about other realms of informatics?

We’re not done yet. Understanding additional areas where librarianship, informatics, and NLM intersect will require more communication with you. Look for opportunities to engage with us through the National Network of Libraries of Medicine and on our page Training on Biomedical Informatics, Data Science, and Data Management.

Katherine Majewski is a trainer, instructional designer, and technical writer for NLM products. Kate received her master’s degree in library and information science from the State University of New York at Buffalo and has worked in libraries since 1989.

Can “Nudging” Help?: Improving Clinical Trial Access Using Artificial Intelligence for Standardization

Guest post by Presidential Innovation Fellows Justin Koufopoulos and Gil Alterovitz, PhD.

Getting into a clinical trial is challenging for patients. Researchers estimate that only 5% of patients eligible to participate in a cancer clinical trial actually take part in one. Many factors affect this statistic, including how findable and accessible information about clinical trials is.

Patients often learn about clinical trials from their doctors or through patient advocacy groups like the American Cancer Society. They then typically search for trials on the internet, often ending up on websites like the NIH-run ClinicalTrials.gov or trials.cancer.gov.

Once on these websites, patients still face challenges to access. Prime among them: what search terms to use to find relevant trials.

The terms a patient or doctor uses may not match how researchers running a trial describe the focus of their study, for example “breast cancer” vs. “ductal carcinoma.” While the NIH clinical trials databases track synonyms and work to make the proper matches, users cannot escape this recurring mismatch in language that challenges access.

This challenge becomes even more pronounced with clinical trial eligibility criteria. These criteria describe who can and cannot participate in a study. For example, an eligibility criterion might be “age 18 years or older” or “confirmed breast lesions that can proceed to removal without chemotherapy.” While a computer can easily match a patient to the first criterion, the second involves many more concepts that are harder to separate, understand, and match.
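
To illustrate why the first criterion is easy for a computer, here is a minimal sketch. It assumes a hypothetical structured form of a criterion (a field, an operator, and a value); the names `Criterion`, `matches`, and `age_years` are invented for illustration and do not come from ClinicalTrials.gov or any NIH system.

```python
import operator
from dataclasses import dataclass

# Hypothetical structured form of a criterion such as "age 18 years or older".
@dataclass(frozen=True)
class Criterion:
    field: str    # patient attribute the criterion tests
    op: str       # comparison operator, e.g. ">="
    value: float  # threshold to compare against

# Map operator symbols to the corresponding comparison functions.
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}

def matches(patient: dict, criterion: Criterion) -> bool:
    """Return True if the patient record satisfies the structured criterion."""
    if criterion.field not in patient:
        return False  # attribute missing: treated as "not matched" in this sketch
    return OPS[criterion.op](patient[criterion.field], criterion.value)

adult = Criterion(field="age_years", op=">=", value=18)
print(matches({"age_years": 42}, adult))  # True
print(matches({"age_years": 16}, adult))  # False
```

The second criterion has no such single-field form: it bundles a diagnosis, a planned procedure, and a treatment history into one sentence, which is why it is so much harder to match automatically.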

Artificial intelligence can be part of the solution, particularly “machine learning,” which leverages data to teach a program how to make predictions on chosen topics.

Various technology companies have already used machine learning to address language translation problems. As a result, computers can now translate English to Japanese with few errors, and speech-to-text applications can translate human speech to computer inputs and can even reply.

We adopted a similar, albeit scaled back, approach to translate diverse clinical trials eligibility criteria into standardized and structured language. We also drew inspiration from writing tools that help writers improve their text’s readability and grammar.

Instead of highlighting repeated words or sentences in the passive voice, our prototype nudges researchers toward writing eligibility criteria in a way that is more easily translated by machine. It offers feedback and suggestions, almost like an English language tutor, and proposes alternative ways to write the criteria that would make them more standard and, eventually, more translatable.

We drew inspiration from revision tracking and grammar-type tools to design our standardization tool for researchers.
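
The nudging idea can be sketched in a few lines. This is only an illustration: the prototype described here relies on machine learning, whereas the sketch below swaps in a plain lookup table, and the table contents and the names `STANDARD_PHRASINGS`, `normalize`, and `nudge` are all hypothetical.

```python
from typing import Optional

# Hypothetical table mapping free-text phrasings to one standardized phrasing.
STANDARD_PHRASINGS = {
    "18 and over": "age 18 years or older",
    "adults only": "age 18 years or older",
    "at least 18 years old": "age 18 years or older",
}

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore trivial differences."""
    return " ".join(text.lower().split())

def nudge(criterion_text: str) -> Optional[str]:
    """Return a suggested standardized phrasing, or None if there is no suggestion.

    The suggestion is advisory: the caller is free to ignore it.
    """
    return STANDARD_PHRASINGS.get(normalize(criterion_text))

print(nudge("Adults  ONLY"))              # age 18 years or older
print(nudge("confirmed breast lesions"))  # None
```

Returning `None` rather than raising an error reflects the spirit of a nudge: when the tool has nothing to offer, the researcher's own wording simply stands.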

This shift toward more standardized language can make it easier to match content across databases, such as matching a list of patients with a set of conditions and prior treatments.

The prototype also helps researchers understand the consequences of their word choices. It looks at previous studies with similar eligibility criteria and notes how many participants they recruited. Input from consensus-based standards may also be presented. While not a perfect metric for inclusiveness, this feedback shows someone running a study how their word choices compare to others’ and the potential impact of those choices on their study’s overall success.

Research by academic psychologists has shown that nudging works in a wide variety of settings. To the best of our knowledge, this is the first time a nudge has been used to coach researchers, but these nudges are not requirements. Researchers can still write their eligibility criteria in the way they think makes the most sense. However, by moving researchers toward standardized phrasings, our prototype can help computers match patient characteristics with eligibility criteria and potentially get more eligible patients into clinical trials.

More work is needed before we can fully implement our tool and test at scale, but we are making progress. We recently completed a pilot study with non-federal groups to determine whether the structured data (so, not the nudging agent but the data our tool learns from) could be used to create tools to help with clinical trials access. Our findings were positive, confirming that private industry and academia need more data like ours for building artificial intelligence tools. The work was featured by the White House on AI.gov as an example for “Industries of the Future.”

The Health Sprint piloting effort included physicians and patient advocates as well as data stewards and experts in the relevant domain areas from within government. For example, Rick Bangs, MBA, PMP, a patient advocate, has worked with various organizations including the National Cancer Institute and the ClinicalTrials.gov development team. Regarding clinical trial matching, Bangs noted, “The solution here will require vision, and that vision will cross capabilities that no one supplier will individually have.”

Next up, we need to evaluate whether this tool helps researchers write eligibility criteria in the “real world,” where all innovations must live.

Justin Koufopoulos is a Presidential Innovation Fellow and product manager working to make clinical research more patient-centered. He has worked with the White House, CIO Council, National Library of Medicine, General Services Administration, Department of Commerce, and Veterans Administration on issues ranging from internet access to artificial intelligence.

Gil Alterovitz, PhD, FACMI, is a Presidential Innovation Fellow who has worked on bridging data ecosystems and artificial intelligence at the interface of several federal organizations, including the White House, National Cancer Institute, General Services Administration, CIO Council, and Veterans Administration.


The Presidential Innovation Fellowship brings together top innovators and their insights from outside of government, including the private sector, non-profits, and academia. Their insights are brought to bear on some of the most challenging problems within government and its agencies. The goal is to challenge existing paradigms by rethinking problems and leveraging novel, agile approaches. PIF was congressionally mandated under HR 39, the Tested Ability to Leverage Exceptional National Talent (TALENT) Act. The program is administered as a partnership between the White House Office of Science and Technology Policy, the White House Office of Management and Budget, and the General Services Administration.

Expanding Access, Improving Health

Guest post by Kathryn Funk, program manager for NLM’s PubMed Central.

Last week, National Library Week celebrated how libraries and library workers make our communities stronger. In the spirit of building strong communities, NLM has committed to “democratiz[ing] access to the products and processes of scientific research.”

NLM delivers on that commitment by supporting the NIH Public Access Policy. This policy, passed by Congress in 2008, requires authors funded by NIH to make publicly accessible in PubMed Central (PMC) any peer-reviewed paper accepted for publication. Now, over a decade after the NIH Public Access Policy went into effect, PMC makes more than 1 million NIH-funded papers available to the research community and the public. This volume of publicly accessible, NIH-funded papers represents a clear return on investment for the public, but numbers alone don’t provide the full story.

A quick dive into NIH Research Matters, a weekly update of NIH research highlights, offers a much richer and more personal picture of how the NIH Public Access Policy and NLM’s support of it can strengthen and empower communities. Making NIH-funded papers publicly accessible in PMC means that the public has free and direct access to research that touches on some of the most critical public health concerns facing our community, including studies that:

  • Suggest a method for detecting breast tumors earlier and more often, creating a higher chance of survival for patients (NIH Research Matters | PMC);
  • Identify treatment options for reducing the risk of death for people who’d previously had a non-fatal opioid overdose (NIH Research Matters | PMC);
  • Explore how maternal nutrition supplements can increase infant birth size and potentially improve children’s life-long health (NIH Research Matters | PMC);
  • Identify young people with suicidal thoughts by using machine learning to analyze brain images (NIH Research Matters | PMC);
  • Gauge exercise’s impact on the growth of new nerve cells in the brains of mice, which could potentially reduce memory problems in people with Alzheimer’s disease (NIH Research Matters | PMC); and
  • Develop blood tests to detect signs of eight common types of cancer (NIH Research Matters | PMC).

These examples illustrate that access, while essential, is not the Library’s end goal. Improved health is.

NLM supports public access to research outputs to accelerate scientific discovery and advance the health of individuals and our communities. It is the best way we can honor the investment made by the American people in scientific research and the surest way to make our communities stronger.

Kathryn Funk is the program manager for PubMed Central. She is responsible for PMC policy as well as PMC’s role in supporting the public access policies of numerous funding agencies, including NIH. Katie received her master’s degree in library and information science from The Catholic University of America.

Building Data Science Expertise at NLM

Guest post by the Data Science @NLM Training Program team.

Regular readers of this blog probably know that NLM staff are expanding their expertise beyond library science and computer science to embrace data science. As a result, NLM—in alignment with strategic plan Goal 3 to “build a workforce for data-driven research and health”—is taking steps to improve the entire staff’s facility and fluency with this field so critical to our future.

The Library is rolling out a new Data Science @NLM Training Program that will provide targeted training to all of NLM’s 1,700 staff members. We are also inviting staff from the National Network of Libraries of Medicine (NNLM) to participate so that everyone in the expanded NLM workforce has the opportunity to become more aware of data science and how it is woven into so many NLM products and services.

For some of our staff, data science is already a part of their day-to-day activities; for others, data science may be only a concept, a phrase in the strategic plan—and that’s okay. Not everyone needs to be a data scientist, but we can all become more data savvy, learning from one another along the way and preparing to play our part in NLM’s data-driven future. (See NLM in Focus for a glimpse into how seven staff members already see themselves supporting data science.)

Over the course of this year, the data science training program will help strengthen and empower our diverse and data-centric workforce. The program will provide opportunities for all staff to participate in a variety of data science training events targeted to their specific interests and needs. These events range from the all-hands session we had in late January that helped establish a common data science vocabulary among staff to an intensive, 120-hour data science fundamentals course designed to give select NLM staff the skills and tools needed to use data to answer critical research questions. We’re also assessing staff members’ data science skill levels and creating skill development profiles that will guide staff in taking the steps necessary to build their capacity and readiness for working with data.

At the end of this process, we’ll better understand the range of data science expertise across the Library. We’ll also have a much clearer idea of what more we can do to develop staff’s facility and fluency with data science and how to better recruit new employees with the knowledge and skills needed to advance our mission.

In August, the training program will culminate with a data science open house where staff can share their data science journey, highlight group projects from the fundamentals course, and find partners with whom they can collaborate on emerging projects throughout the Library.

But that final phase of the training initiative doesn’t mean NLM’s commitment to data science is over. In fact, it will be just the beginning.

In the coming years, staff will apply their new and evolving skills and knowledge to help NLM achieve its vision of serving as a platform for biomedical discovery and data-powered health.

How are you supporting the data science development of your staff? Let’s share ideas to keep the momentum going!


Co-authored by the Data Science @NLM Training Program team:

    • Dianne Babski, Deputy Associate Director, Library Operations
    • Peter Cooper, Strategic Communications Team Lead, National Center for Biotechnology Information
    • Lisa Federer, Data Science and Open Science Librarian, Office of Strategic Initiatives
    • Anna Ripple, Information Research Specialist, Lister Hill National Center for Biomedical Communications

National Public Health Week 2019: How NLM Brings Together Libraries and Public Health

Guest post by Derek Johnson, MLIS, Health Professionals Outreach Specialist for the National Network of Libraries of Medicine Greater Midwest Region.

Recent articles in Preventing Chronic Disease and The Nation’s Health chronicle how public libraries can complement the efforts of public health workers in community outreach and engagement. Data tell us that more Americans visit public libraries in a year (1.39 billion) than they do health care providers (990 million). Moreover, over 40% of computer-using patrons report using libraries to search for health information. However, we also know many individuals struggle with accessing and understanding the health information they encounter every day.

This challenge raises the question, “How does the National Library of Medicine (NLM) increase access to trustworthy health information to improve the health of communities across the United States?”

It’s an important question, and, as we celebrate National Public Health Week, it gives us an opportunity to reflect on the incredible work NLM is doing through its National Network of Libraries of Medicine (NNLM) to bring libraries and public health together.

Take, for example, Richland County Public Health in Ohio. Richland County is approximately 33% rural. Many rural areas have been identified as “internet deserts.” In addition, adults in the county have lower rates of high school and college-level education compared to state averages. Seeking to address these disparities, Richland County Public Health applied for a funding award from NNLM’s Greater Midwest Region to develop an Interactive Health Information Kiosk in partnership with the county public library system.

With funding in hand, Richland County Public Health loaded select NLM resources onto specially configured iPads and installed them in the nine branches of the Richland County Libraries. A health educator trained library staff, local healthcare providers, and the public on how to use those resources to access trustworthy health information. Moving forward, librarians will be able to help patrons use the health kiosks. As a result, Richland County Public Health is helping improve health literacy among adult residents and, ultimately, enabling them to make more informed decisions about their health.

Another example of a public health and public library collaboration comes from NNLM’s Middle Atlantic Region (MAR). The Philadelphia Department of Public Health recognized the need to engage individuals in neighborhoods most vulnerable to severe weather events to increase their knowledge of disaster and emergency preparedness.

With funding from MAR, the Philadelphia Department of Public Health partnered with four branches of the Free Library of Philadelphia to train both librarians and local residents on emergency preparedness. Participants learned how to make use of the NLM Disaster Information Management Research Center and where to find local resources during weather-related emergencies.

These are just two of the many projects that NNLM helps facilitate across the country through its network of more than 7,500 library, public health, community-based, and other organizational members.

And, while NNLM continues to identify partnerships for funding public health and library projects, it also engages health educators by offering continuing education credit for Certified Health Education Specialists (CHES). CHES-certified professionals work in a variety of health care and public health settings where they help community members adopt and maintain healthy lifestyles. Health educators can earn continuing education credits by attending specially designated NNLM webinars on topics such as health statistics and evidence-based public health, with more courses in the works.

As communities continue to rely on the public health workforce to sustain and build healthy environments, know that the National Library of Medicine and its National Network of Libraries of Medicine are here to support the work they do!

Derek Johnson, MLIS, is the Health Professionals Outreach Specialist for the National Network of Libraries of Medicine Greater Midwest Region. In this capacity, he conducts training and outreach to public health professionals on a variety of topics, including evidence-based public health, health disparities, and community outreach.

 

An Introduction to Authority-based Security

Guest post by Kurt W. Rodarmer, a software security architect in NLM’s National Center for Biotechnology Information.

NLM is working to unleash the potential of data and information to accelerate and transform biomedical discovery. Foundational to that goal lie the data themselves. We assess their value, collect and curate them, and then make them accessible.

But access has its risks. Big risks. Especially when it comes to personal medical data or hard-earned, grant-funded proprietary data. We need to find a way to deliver access while simultaneously controlling and protecting the data.

That’s where security comes in.

We’re all familiar with “identity-based security,” evolution’s primitive mechanism that predates our species. It starts by using our eyes, ears, and nose to identify someone or something and ends with an immediate risk-assessment. Not surprisingly, this mechanism was modeled in modern cybersecurity and is virtually ubiquitous across consumer and industrial-grade systems.

For all their efforts though, these systems sure seem to fail—a lot. Common wisdom suggests breaches are inevitable, but that’s not entirely true. There are other approaches.

Authority-based security is one. In this approach, authority, permissions, and trust are explicitly modeled, and policy decisions are made up front. We create objects that embody these ethereal concepts and make them tangible. These objects can then be stored, transmitted, accessed, sub-divided, transferred, etc. The discipline of modeling and managing authority is called Authority Management.

Identity- and authority-based approaches achieve several common goals. They each have strengths and weaknesses. Where they differ, the stronger, more effective, and more elegant of the two is nearly always authority-based.

Both approaches grant permissions based upon security policy. Authority-based security captures the result of policy evaluation as permissions in unforgeable and unmodifiable tokens. Since these tokens come from a known source of authority and are tamper-evident, the permissions they contain require no further scrutiny. They are as trustworthy as the authority that issued them. A permission token typically contains only a small subset of the overall permissions available to an individual, ideally never more than are needed within the current dynamic context.
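To make "tamper-evident" concrete, here is a minimal sketch of one way a permission token could be sealed, assuming the authority signs the permission list with an HMAC. The key, the function names, and the `read:study-123` permission string are all hypothetical, and a real system might well use public-key signatures so that verifiers never hold the signing secret.

```python
import hashlib
import hmac
import json

# Hypothetical secret held only by the issuing authority (illustration only).
AUTHORITY_KEY = b"demo-key-known-only-to-the-authority"

def issue_token(permissions):
    """Capture the result of policy evaluation as a signed permission token."""
    payload = json.dumps(sorted(permissions)).encode()
    tag = hmac.new(AUTHORITY_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode(), "tag": tag}

def verify_token(token):
    """Return the permission set if the token is untampered, otherwise None."""
    payload = token["payload"].encode()
    expected = hmac.new(AUTHORITY_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["tag"]):
        return None
    return set(json.loads(payload))

# A genuine token, and a copy whose permissions were edited after issuance.
token = issue_token({"read:study-123"})
forged = dict(token, payload=json.dumps(["read:study-123", "write:study-123"]))
```

Verifying `token` yields the original permission set, while `forged` is rejected because its signature no longer matches its contents.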

By contrast, identity-based techniques make permission decisions based upon global attributes or provide crude static mechanisms. In most cases, they reflect zero context sensitivity. That means, for example, that if I run a program on a stock Linux system, that program executes using 100% of my permissions, even though it may need only read access to one file and write access to one directory. For all I know the program could be surreptitiously stealing my most sensitive data in the background, and I’d have no awareness or protection against it. Without my permission? That’s the point—I just gave it ALL my permissions!

In an authority-managed system, I would have given that same program permissions to access only the file and directory needed, leaving it powerless to read other sensitive files, much less phone home and exfiltrate them.
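As a toy illustration of that narrower grant, the sketch below models a program that may read exactly one file and write only inside one directory. The `Grant` class and the paths are hypothetical, and this only models the policy check; a real authority-managed system would enforce it in the operating system or runtime rather than in the program itself.

```python
import posixpath

class Grant:
    """Hypothetical grant: read one file, write directly inside one directory."""

    def __init__(self, read_file, write_dir):
        self.read_file = posixpath.normpath(read_file)
        self.write_dir = posixpath.normpath(write_dir)

    def may_read(self, path):
        # Only the single named file is readable.
        return posixpath.normpath(path) == self.read_file

    def may_write(self, path):
        # Normalize first so "../" tricks cannot escape the directory.
        parent = posixpath.dirname(posixpath.normpath(path))
        return parent == self.write_dir

grant = Grant("/data/input.csv", "/data/out")
```

With this grant, a request to read `/home/me/.ssh/id_rsa` or to write `/data/out/../secrets.txt` is simply refused, whatever permissions the person who launched the program may hold.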

So, if identity-based security is so far behind the curve, what accounts for its continued use? It has one highly prized strength: its ability to revoke permissions on the spot. Since permissions are granted at the moment they are going to be exercised, any permission can be immediately denied as the result of updating policy. Since this policy update is often reactive, coming about once damage has already occurred and possibly delayed by weeks or months, the value of its immediacy is questionable. Tokens, by contrast, carry a built-in timeout that makes them self-revoking, so in practice they perform similarly.
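The self-revoking behavior can be sketched in a few lines, assuming each token simply records an expiry time. The function names and the 60-second lifetime are hypothetical; the optional `now` argument exists only so the example can be checked without waiting.

```python
import time

def make_token(permissions, ttl_seconds, now=None):
    """Issue a token that stops working ttl_seconds after it is issued."""
    issued = time.time() if now is None else now
    return {
        "permissions": frozenset(permissions),
        "expires_at": issued + ttl_seconds,
    }

def is_allowed(token, permission, now=None):
    """A permission counts only while the token has not yet expired."""
    current = time.time() if now is None else now
    return current < token["expires_at"] and permission in token["permissions"]

# Issued at t=1000 with a 60-second lifetime.
token = make_token({"read"}, ttl_seconds=60, now=1000.0)
```

Thirty seconds after issuance the token still grants `read`; at the 60-second mark it grants nothing, with no policy update required.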

Here’s how it works. To do anything of substance in a system, you need permissions. You may have those permissions already stored on some device, such as your phone. Or, you may need to go through the process of identifying yourself to some part of the system that is storing permissions on your behalf, accessible once your identity has been authenticated. In either case, the first step is to get ahold of a token containing your set of pre-approved permissions.

The permission set you now hold represents the complete permissions you have within the system you have just entered, e.g., dbGaP, a grant administration system, etc. It is unlikely to represent all the permissions you have within every system you can access. Even so, it’s probably too permissive for what you have in mind. Your next step would typically be to subset your permissions to only those you need, limiting the potential damage should the token fall into the wrong hands.

Sometimes you need to share your permissions, such as when a grant-funded investigator delegates most of the research documentation to lab assistants. She can take the permission tokens she received with the grant, subset them, and delegate them to her lab as appropriate, so everyone can work.
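Subsetting and delegation can be sketched together, assuming a token is just a holder plus a set of permissions and that a derived token may only ever narrow its parent. The `Token` class, the holder names, and the permission strings are hypothetical.

```python
class Token:
    """Hypothetical permission token that can be narrowed but never widened."""

    def __init__(self, holder, permissions, parent=None):
        self.holder = holder
        self.permissions = frozenset(permissions)
        self.parent = parent  # records where the authority came from

    def subset(self, wanted, holder=None):
        """Derive a token with fewer permissions, optionally for someone else."""
        wanted = frozenset(wanted)
        extra = wanted - self.permissions
        if extra:
            raise PermissionError(f"cannot grant permissions not held: {sorted(extra)}")
        return Token(holder or self.holder, wanted, parent=self)

# The investigator holds the full set that came with the grant.
investigator = Token("investigator", {"read:data", "write:docs", "submit:report"})

# She delegates only documentation work to a lab assistant.
assistant = investigator.subset({"write:docs"}, holder="assistant")
```

The assistant's token carries `write:docs` and nothing else, and any attempt to derive `submit:report` from it raises `PermissionError`.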

What else can you do with them? Literally anything that can be done in an information system! Beyond implementing the traditional security processes of Identity and Access Management (IAM, a proper subset of Authority Management), tokens are also used to protect resources in other ways. They can be used to model spending accounts and quotas, control access to consumable or metered resources, mitigate denial-of-service (DoS) attacks, provide audit trails, and eliminate the use of passwords and multiple logins.
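As one example of those other uses, a quota can be modeled as a token that carries a balance, assuming each use of a metered resource spends some of it. The `QuotaToken` class and the unit counts are hypothetical.

```python
class QuotaToken:
    """Hypothetical token granting a fixed number of units of a metered resource."""

    def __init__(self, units):
        self.remaining = units

    def spend(self, units):
        """Deduct units if the balance allows it; report whether it succeeded."""
        if units <= 0:
            raise ValueError("units must be positive")
        if units > self.remaining:
            return False
        self.remaining -= units
        return True

quota = QuotaToken(units=10)
first = quota.spend(7)    # succeeds, leaving 3 units
second = quota.spend(5)   # refused, the balance is untouched
```

Because the limit travels with the token, a service can refuse excess requests without consulting a central account database.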

Because tokens carry permissions whose source of authority is irrefutable, they are the mechanism for implementing the fundamental principles of security. We can bring some of their benefits to bear right now and help lay the groundwork for secure, accessible biomedical data.

Kurt Rodarmer started work on military-grade secure operating systems over 20 years ago in Silicon Valley, working with the architect of KeyKOS, Norman Hardy. He is an expert in secure software and language design and has formalized the field of Authority Management. Kurt previously worked for Apple and Oracle and was a consultant to IBM and Sun, among others.