Development of a Digital Repository for NLM Digitized Collections and Born-Digital Resources
Development of Functional Requirements
In April 2006, the Acting Associate Director for Library Operations appointed the NLM Digital Repository Working Group (DRWG) and charged it with developing functional requirements for an NLM digital repository to preserve and provide access to digital content not covered by PubMed Central and the NIH CIT Videocast project. Creating functional requirements and identifying key policy issues for an NLM digital repository were essential first steps in building the NLM collection in the digital environment.
A number of Library Operations program areas need such a digital repository to support their existing digital collections and to expand their ability to collect a growing volume of born-digital resources. The History of Medicine Division has created dozens of digital collections that require long-term management and preservation. Collection development and acquisitions staff are seeing an increasing availability of born-digital materials that NLM needs to add to its collection. The NLM preservation program has embraced digitization as a preservation format to replace microfilming.
The DRWG completed a functional specifications document by March 2007. In addition, the group identified policy and management issues related to the creation, design, and maintenance of the repository. By identifying high-level functional requirements and policy considerations, the DRWG endeavored to outline an infrastructure and bring a standards-based approach to managing, preserving, and providing access to existing and future NLM digital resources.
- DRWG project charter (PDF)
- NLM Policies and Functional Requirements Specification (PDF, MS Word)
- Requirements for an NLM Digital Repository: Report and Recommendations (PDF)
Selection of Digital Repository Software
In June 2007, the Digital Repository Evaluation and Selection Working Group (DRESWG) was established to evaluate commercial systems and open source software and select one (or a combination of systems and software) for use as an NLM digital repository. Building on the work of the DRWG, the DRESWG scanned the literature and conducted investigations to construct a list of ten systems and software packages for initial evaluation. The DRESWG then developed a set of “Master Evaluation Criteria” as a decision method for narrowing the ten candidates to three for detailed consideration. All ten were ranked, and three were identified for in-depth testing: DigiTool, DSpace, and Fedora. Because Fedora has a limited user interface, the DRESWG selected Fez, a Web interface to Fedora, to enable more effective testing.
DSpace 1.4.2, DigiTool 3.0, and Fedora 2.2/Fez 2 Release Candidate 1 were installed on NLM servers for extensive hands-on testing. The DRESWG established a ground rule that the latest production version of each system would be installed and tested. A Consolidated Digital Repository Test Plan was created based on the requirements enumerated in the NLM Digital Repository Policies and Functional Requirements Specification. The Test Plan contained 129 specific tests. A variety of digital objects from the Library's collections, including digitized pamphlets, video files, images, and integrated resources, were loaded into each system for testing.
After completion of all testing, DRESWG recommended that NLM select Fedora as the core system for the NLM digital repository. DRESWG was highly impressed with a number of Fedora capabilities, including the strong technology roadmap, the excellent underlying data model that can handle NLM’s diverse materials, the active development community, Fedora’s adherence to standards, and Fedora’s use by leading institutions and libraries with similar digital project goals.
DRESWG also recommended that work begin immediately on a Fedora pilot project using four identified collections of materials from NLM and the NIH Library. Most of these collections already have content files and metadata for loading into a repository. After an initial pilot phase of approximately 6-8 months, the effort will be evaluated. NLM senior staff concurred with this recommendation, and work has already begun on the pilot implementation.
- DRESWG project charter (PDF)
- NLM Consolidated Digital Repository Test Plan (MS Excel)
- Recommendations on NLM Digital Repository Software (PDF, MS Word)
- Evaluation of Digital Repository Software at the National Library of Medicine (article from May/June 2009 D-Lib Magazine)
- Digital Collections, the public website for NLM's digital repository
- About the Digital Collections repository