
Developing Permanence Levels and the Archives for NLM’s Permanent Web Documents

Permanence Levels

The National Library of Medicine has developed a system to indicate to users which of its Web documents will be kept permanently available and whether the contents and identifiers of those documents could change over time. In 1999, the Working Group on Permanence of NLM's Electronic Information was established and given the following charge:

To examine the range of electronic information produced by NLM and develop recommendations in the following areas:

a) levels of permanence suitable for different categories of NLM information, e.g., permanent retention in unaltered form, permanent retention in continually updated form, retention for a limited time in unaltered form, etc. In developing recommended levels of permanence, the Working Group should consider NLM's role as the holder of its own archives, as well as the needs of current and future users.

b) methods of recording and communicating the level of permanence of NLM electronic information, e.g., in metadata or by some other means

c) procedures for ensuring that the levels of permanence are implemented in practice

d) approaches to labeling, organizing, retrieving and displaying NLM's electronic information so that the retention of older materials (e.g., for NLM Archives) does not have a negative impact on those seeking current information.

The Working Group's discussions focused on three important characteristics of Web documents: identifier validity, resource availability, and content invariance. The Group developed a rating system based on these three concepts. The ratings later were distilled into the following four permanence levels:

Permanent: Unchanging Content

The National Library of Medicine has made a commitment to keep this resource permanently available. Its identifier will always provide access to the resource. Its content will not change. Example: Minutes of the NLM Board of Regents meetings

Permanent: Stable Content

The National Library of Medicine has made a commitment to keep this resource permanently available. Its identifier will always provide access to the resource. Its content is subject only to minor corrections or additions. Example: NLM Annual Report

Permanent: Dynamic Content

The National Library of Medicine has made a commitment to keep this resource permanently available. Its identifier will always provide access to the resource. Its content could be revised or replaced. Example: NLM Home Page

Permanence Not Guaranteed

The National Library of Medicine has made no commitment to keep this resource available. It could become unavailable at any time. Its identifier could be changed. Example: Frequently Asked Questions
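The four levels can be read as combinations of the three characteristics the Working Group discussed (identifier validity, resource availability, and content invariance). The Python sketch below is one illustrative encoding of the definitions above. The names `PermanenceLevel`, `ContentChange`, `Commitments`, and `COMMITMENTS` are hypothetical and are not taken from NLM's system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ContentChange(Enum):
    """How much the content of a resource may change (hypothetical names)."""
    NONE = "content will not change"
    MINOR = "minor corrections or additions only"
    ANY = "content could be revised or replaced"


class PermanenceLevel(Enum):
    UNCHANGING = "Permanent: Unchanging Content"
    STABLE = "Permanent: Stable Content"
    DYNAMIC = "Permanent: Dynamic Content"
    NOT_GUARANTEED = "Permanence Not Guaranteed"


@dataclass(frozen=True)
class Commitments:
    available: bool                    # resource stays permanently available
    identifier_valid: bool             # identifier always provides access
    content: Optional[ContentChange]   # None: the definition makes no statement


# Each entry mirrors the wording of the corresponding level definition.
COMMITMENTS = {
    PermanenceLevel.UNCHANGING: Commitments(True, True, ContentChange.NONE),
    PermanenceLevel.STABLE: Commitments(True, True, ContentChange.MINOR),
    PermanenceLevel.DYNAMIC: Commitments(True, True, ContentChange.ANY),
    PermanenceLevel.NOT_GUARANTEED: Commitments(False, False, None),
}
```

In this encoding the three permanent levels differ only in how much the content may change, while Permanence Not Guaranteed drops the availability and identifier commitments altogether.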

The Working Group analyzed the documents that were available on the NLM Web site and developed a list of document categories. To simplify the assignment of permanence levels by Library staff, document categories were assigned default ratings. The default rating for press releases, for example, is Permanent: Stable Content. If a default rating does not seem appropriate for a particular document, it can be changed by the person responsible for assigning the metadata or by a system administrator.

Document Category | Default Permanence Level
Announcements, News | Permanence Not Guaranteed
Applications, Forms, Registrations | Permanence Not Guaranteed
Bibliographies | Permanent: Dynamic Content
Calendars, Schedules | Permanence Not Guaranteed
Clinical Alerts | Permanent: Unchanging Content
Contracts and Related Resources | Permanence Not Guaranteed
Database | Permanent: Dynamic Content
Digital Library Collections | Permanent: Dynamic Content
Exhibitions | Permanent: Stable Content
Fact Sheets | Permanent: Stable Content
FAQs, Help Files, Pocket Cards | Permanence Not Guaranteed
Finding Aids | Permanent: Dynamic Content
Grants, Awards | Permanence Not Guaranteed
Lists of Links | Permanence Not Guaranteed
Minutes (Official) | Permanent: Unchanging Content
Newsletters | Permanent: Stable Content
Organizational Charts and Directories | Permanence Not Guaranteed
Other | Blank (No Default Rating)
Photos of Staff, Programs, Activities, Buildings and Grounds | Permanence Not Guaranteed
Policies (Official) | Permanent: Stable Content
Press Releases | Permanent: Stable Content
Procedures | Permanence Not Guaranteed
Product, Program, and Project Descriptions | Permanent: Dynamic Content
Reports (Official) | Permanent: Stable Content
Software | Permanence Not Guaranteed
Staff Biographical Sketches | Permanence Not Guaranteed
Staff Papers | Permanence Not Guaranteed
Staff Presentations | Permanence Not Guaranteed
Statistics and Reports | Permanence Not Guaranteed
Technical Documentation | Permanent: Dynamic Content
Training Material and Manuals | Permanent: Dynamic Content
Visitor Information | Permanence Not Guaranteed
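The default-with-override rule described above amounts to a simple lookup. The following Python sketch shows one way it might work, using a handful of rows from the table. The names `DEFAULT_LEVELS` and `permanence_for` are hypothetical, and the real system's behavior for unknown categories is not described in the source.

```python
from typing import Optional

# A few rows from the table; None stands for the blank default of "Other".
DEFAULT_LEVELS = {
    "Clinical Alerts": "Permanent: Unchanging Content",
    "Press Releases": "Permanent: Stable Content",
    "Database": "Permanent: Dynamic Content",
    "Software": "Permanence Not Guaranteed",
    "Other": None,
}


def permanence_for(category: str, override: Optional[str] = None) -> Optional[str]:
    """Return the override if one was assigned, else the category default.

    Raises KeyError for a category that is not in the table (an assumption;
    the source does not say how unknown categories are handled).
    """
    if override is not None:
        return override
    return DEFAULT_LEVELS[category]
```

A press release would receive Permanent: Stable Content unless the person assigning the metadata, or a system administrator, passes a different level as `override`.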

NLM's Metadata Schema

During the deliberations of the Working Group on Permanence, NLM's Task Group on Metadata and Methods of Recording Permanence Levels was appointed and charged with developing an expanded set of metadata to increase the retrievability of NLM's Web documents. It was also asked to decide how permanence metadata would be recorded and displayed. The Task Group recommended that metadata be created for all publicly available electronic resources created by NLM and that permanence data be a required element of the metadata set. The NLM set is based on the Dublin Core Metadata Element Set with some local adaptations, most notably the addition of permanence ratings. (See the NLM Metadata Schema.)

Implementing the System

A third committee, known as the Electronic Archive Group (EAG), was then charged with developing a pilot project for assigning metadata, including permanence levels, and building an archive for outdated Web documents of permanent value to NLM. The EAG evaluated several systems under development elsewhere and concluded that TeamSite, the content management system developed by Interwoven, Inc. that was being purchased for NLM's main Web site, could be used for assigning metadata and managing the archiving workflow. A template was created in TeamSite, and Web contributors were trained to use it to assign basic metadata for all documents that would be submitted for promotion to the Web. The template was designed to minimize the burden on document creators.

Default values or drop-down menus are provided wherever possible. The minimal metadata set includes:

Title

Heading

Date Published

Date Last Modified

Next Review Date

Contact email address

Publisher

Rights

Permanence Level

Permanence Guarantor

Language
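The fields listed above could be modeled as a single record type. The Python sketch below is illustrative only: the class name `MinimalMetadata`, the attribute names, and every default value are assumptions, since the source says only that defaults are provided wherever possible and that permanence data is required.

```python
from dataclasses import dataclass
from datetime import date

PERMANENCE_LEVELS = (
    "Permanent: Unchanging Content",
    "Permanent: Stable Content",
    "Permanent: Dynamic Content",
    "Permanence Not Guaranteed",
)


@dataclass
class MinimalMetadata:
    title: str
    heading: str
    date_published: date
    date_last_modified: date
    next_review_date: date
    contact_email: str
    rights: str
    permanence_level: str
    # The defaults below are assumed for illustration, not NLM's actual values.
    permanence_guarantor: str = "National Library of Medicine"
    publisher: str = "National Library of Medicine"
    language: str = "en"

    def __post_init__(self) -> None:
        # Permanence is a required element, so reject unknown values early.
        if self.permanence_level not in PERMANENCE_LEVELS:
            raise ValueError(f"unknown permanence level: {self.permanence_level!r}")
```

Restricting `permanence_level` to the four defined values plays the same role here as a drop-down menu does in a template.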

When a contributor assigns a document a rating of Permanent (Unchanging, Stable, or Dynamic Content), the system notifies the NLM Archives Team. The Archives Team reviews the document category and permanence metadata and forwards the document for promotion to the Web. The Cataloging Section then creates a complete MARC bibliographic record with standardized access points, including MeSH and an NLM classification number. The record appears in NLM's online catalog and is distributed to the bibliographic utilities and other NLM licensees. Enhanced metadata created by the Cataloging Section is then added to the header information of the online resource. The metadata can be viewed by clicking on a link that appears at the end of each resource.

The Archives

The Archives contain permanent resources with outdated content. This includes older material that was once on the current site but is no longer of current interest, as well as earlier versions of current documents that have undergone major revisions. After investigating archive models developed elsewhere, the EAG determined that the best way to ensure proper migration of all permanent resources and allow searching and retrieval of archived items was to keep the archive as a separate but integral part of NLM's main Web site. The system was designed to query the current site and the Archives at the same time, but search results for current and outdated documents are clearly differentiated. Search results for archived documents are listed separately and may be accessed by clicking on a folder labeled "Archives".

The system prompts Web contributors at regular intervals to review and revise their current documents as needed. If contributors create a major revision of a permanent document or decide that a permanent document should be removed from the current site without being replaced, the archiving function is triggered.

When a document is moved to the Archives, the date archived is added to its URL. If a user enters the original URL for an archived document that has no current version on the main site, a redirect page informs the user that the document has been moved to the Archives and provides a link to it. The only links in an archived document that continue to function are those to other parts of the same archived document. All other existing links remain but are not maintained. Users can find the earlier and later versions of Permanent: Stable documents and trace changes in these NLM programs and services over time by clicking on Previous Version and Replaced by links.
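The URL handling described here can be sketched in Python as follows. The source does not say how the archive date is incorporated into the URL, so the `/archive/YYYYMMDD/` path prefix is purely an assumed format, and the functions `archived_url` and `resolve` are hypothetical.

```python
from datetime import date
from typing import Dict, Optional, Set, Tuple
from urllib.parse import urlsplit, urlunsplit


def archived_url(original_url: str, archived_on: date) -> str:
    """Build an archive URL by adding the archive date to the path.

    The '/archive/YYYYMMDD' prefix is an assumed layout, not NLM's scheme.
    """
    parts = urlsplit(original_url)
    path = f"/archive/{archived_on:%Y%m%d}{parts.path}"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def resolve(
    url: str, current: Set[str], archive: Dict[str, str]
) -> Tuple[str, Optional[str]]:
    """Decide what to do with a requested URL.

    A current version is served directly. An archived document with no
    current version leads to a redirect page that links to the archive copy.
    """
    if url in current:
        return ("serve", url)
    if url in archive:
        return ("redirect", archive[url])
    return ("not_found", None)
```

Checking the current site first reflects the rule that the redirect page appears only when an archived document has no current version.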

NLM developed a sidecar approach to providing metadata for non-HTML documents such as PDFs and MS Word documents, using a separate XML file to contain the NLM-modified Dublin Core metadata. Web documents created by NLM divisions that do not use the TeamSite content management system are not included in the Archives. In the future, the workflow will be modified so that all of NLM's outdated Web publications of permanent value can be added to the Archives. In the near term, work will continue on automating more of the archiving process and streamlining the addition of enhanced metadata to permanent Web documents once their MARC records have been created.
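A sidecar file of this kind might be generated as in the Python sketch below. Only the Dublin Core element namespace is real; the root element name, the `nlm` namespace URI, the `permanenceLevel` element, and the `.xml` naming rule are assumptions made for illustration and do not reproduce NLM's actual schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

DC_NS = "http://purl.org/dc/elements/1.1/"    # Dublin Core element namespace
NLM_NS = "http://example.org/nlm-permanence"  # hypothetical local namespace


def sidecar_path(document_path: str) -> Path:
    """Place the sidecar next to the document (assumed naming: report.pdf.xml)."""
    p = Path(document_path)
    return p.with_name(p.name + ".xml")


def build_sidecar(title: str, language: str, permanence_level: str) -> bytes:
    """Serialize a small metadata record as XML for a non-HTML document."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("nlm", NLM_NS)
    root = ET.Element("metadata")  # assumed root element name
    ET.SubElement(root, f"{{{DC_NS}}}title").text = title
    ET.SubElement(root, f"{{{DC_NS}}}language").text = language
    # Local adaptation: the permanence rating sits outside Dublin Core.
    ET.SubElement(root, f"{{{NLM_NS}}}permanenceLevel").text = permanence_level
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

Keeping the metadata in a separate file means the PDF or Word document itself never has to be modified when its metadata is enhanced.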

November 2007