Building a 21st Century Archival Collection at the National Library of Medicine

Across NLM, strategic planning is underway and many of our colleagues are thinking about what the next century will bring for the Library. Among the many questions in mind is “what is a 21st-century collection?”

In the Images and Archives Section in NLM’s History of Medicine Division we have seen our archival collections, and our profession, grow and evolve over the past few decades as the archival records of individuals, organizations, and other communities in health and medicine are increasingly created and communicated electronically and online.

But there is more to building a 21st-century collection than being digital.

Before we dive into what NLM’s archival collections look like in the 21st century, we realize some readers of this blog may be thinking, “NLM has archives?” Indeed we do. Our archival collections consist of over 12,000 audio-visual titles, over 150,000 prints and photographs, 18,000 linear feet of archival and manuscript collections, 6.8 terabytes of born-digital content, and 5.1 terabytes of web archives that document, among other things, biotechnology; drugs; health policy; public health; and the research of leading biomedical scientists such as Marshall Nirenberg, Joshua Lederberg, and Michael E. DeBakey.

We consider three broad areas relevant to building 21st-century archival collections:

  1.  Acquiring materials that document both the past and current history of medicine and the health sciences;
  2.  Preserving those materials; and
  3.  Providing access to those materials.

These are not new areas of consideration for archivists, but what we acquire, and how we preserve it and provide access to it, have been changing rapidly, which means our thinking has to change as well.

So what makes a 21st-century collection different from one from just 50 years ago?

Expanded formats

While we continue to collect analog materials, we have expanded the formats and types of records we collect to include born-digital files—everything from email to word processing documents to digital photographs and videos.  NLM, through its web archiving program, also collects online materials such as blogs, government web sites, online news, and others related to topics such as Ebola, Zika, and bioethics.

Long-term access

The addition of born-digital materials brings preservation and access challenges. To understand the scope of these challenges, consider how you would access today a file you saved on a floppy disk in 1992. You’d need both the right hardware (a computer with a floppy disk reader) and the right software to read the file. Neither is easy to find. To overcome these types of challenges, archivists are collaborating with IT professionals and others on ways to preserve born-digital content so it will be accessible decades or even centuries from now.

Enhanced utility

Until recently, access to archival collections has primarily meant being able to read or view content, but we envision a 21st-century collection that offers new forms of access that allow for running queries across many items and collections. Researchers may be looking for the initial occurrence of something, for patterns in how it was applied, or for the response to it or impact of it. By providing tools and systems that allow this kind of analysis, we will not only accelerate discovery and glean insights; we will also deepen the collection’s usefulness.

How can we make all this happen?

Building 21st-century archival collections ideally means working with the creators of content to acquire and preserve materials before they disappear. Web and social media content in particular is in a constant state of change and at high risk for loss. We will also need tools and systems that support collecting and managing this content on a large scale, and policies and processes for making this content available to researchers who want not only to dive into individual documents, but also to run queries across collections. Interestingly, these issues and others parallel those faced by data scientists, ranging from provenance to stewardship, intellectual control, privacy, and long-term access for both anticipated and unanticipated research needs.

As we collect and preserve these archival materials, we aim to make them broadly accessible to researchers, medical professionals, educators, students, and the general public.

We invite you to learn more about the NLM’s archival collections and explore some of our online resources from the History of Medicine Division, including the following:

Guest bloggers Rebecca Warlow and Christie Moffatt work in the Images and Archives Section, History of Medicine Division, Library Operations.

Further readings

Embracing the Future as Stewards of the Past, A View from NLM’s History of Medicine Division, Jeffrey S. Reznick, PhD, Chief of the NLM History of Medicine Division

Responding to a Call to Action: Preserving Blogs and Discussion Forums in Science, Medicine, Mathematics, and Technology, post on the Library of Congress’ The Signal by Christie Moffatt

National Digital Stewardship Alliance 2015 National Digital Agenda