MLA 2008, NLM Update, Diane Boehr
On the Record
Report of the Library of Congress
Working Group on the Future of Bibliographic Control
Head of Cataloging, NLM
For MLA Annual Meeting
May 20, 2008
The Working Group was charged to:
- Present findings on how bibliographic control and other descriptive practices can effectively support management of and access to library materials in the evolving information and technology environment;
- Recommend ways in which the library community can collectively move toward achieving this vision;
- Advise the Library of Congress on its role and priorities.
The Working Group was formed in Nov. 2006, with three representatives from ALA and ARL; one representative each from the American Association of Law Libraries, the Program for Cooperative Cataloging, the Special Libraries Association, the Medical Library Association, NFAIS, Google, and Microsoft; and at-large members from the Coalition for Networked Information and OCLC.
Three public hearings, March–July 2007:
- Users and uses of bibliographic data (held at Google headquarters, San Jose)
- Structures and standards for bibliographic control (held at ALA headquarters, Chicago)
- Economics and organization of bibliographic control (held at LC, Washington, DC)
Draft report issued Nov. 30, 2007
- Two weeks for public comments
Final report issued Jan. 9, 2008
Each hearing featured invited presentations and time for audience comments, and people were also encouraged to submit written testimony. Seventy-five written testimonies were received, of which more than 15 were submitted on behalf of institutions. The draft report sparked over 135 pages of comments, again from both individuals and institutions.
The Audience for the report
While the report was presented to LC, its recommendations are situated in the broader context in which LC functions.
Recommendations are labeled with the primary groups that should take ownership/action
The Working Group’s Vision of the Future
The future of bibliographic control will be collaborative, decentralized, international in scope, and Web-based
This sentence summarizes the essence of the report. Libraries will work with each other and with other players in the bibliographic environment. Responsibility will not fall solely on LC but will be spread among the community. The effort will include players from all over the world, and both our data and our standards will be accessible on the Web. We will expose our data to users where they are (Google, Yahoo) rather than expecting them to come directly to us.
Working Group’s Guiding Principles
- Redefine bibliographic control
- Redefine the bibliographic universe
- Redefine the role of the Library of Congress
Bib control is the organization of materials to facilitate discovery, management, identification, and access. Bib control needs to be thought of as more than just cataloging. The catalog is just one access route to material that a library manages for its users. Bib control must embrace all library materials, a diverse community of users, and a multiplicity of venues where information is sought. The metadata created by the library must also be usable outside the library environment. This means the tightly controlled consistency designed into library standards to date is unlikely to be realized or sustained in the future, even within the local environment.
The bib universe is wider than even libraries and publishers and must include database producers, creators of content, vendors, user communities, and commercial outlets. Library holdings serve as one node in the web of connectivity.
LC has served as a de facto national library, but in fact has no legal mandate or funding lines to do so. Therefore, in times of constrained budgets, it is necessary for LC to re-examine what it does and whom it benefits, and to determine whether other players can assume some of those roles.
High-Level Recommendations
- Increase the Efficiency of Bibliographic Record Production and Maintenance
- Enhance Access to Rare, Unique, and Other Special Hidden Materials
- Position our Technology for the Future
- Position our Community for the Future
- Strengthen the Library and Information Science Profession
Recommendations divided into five main categories
1. Increase efficiencies
- Eliminate Redundancies
- Make use of bibliographic data available earlier in the supply chain
- Re-purpose existing metadata for greater efficiency
- Fully automate the CIP process
The larger publishers already have rich bibliographic data that they are sharing with online sellers like Amazon. We should not have to recreate or rekey that information to put it in our catalogs. However, this may mean that we need to be more flexible in accepting data that may not conform precisely to US library standards.
Make use of the rich data already available from A&I services, IMDb, etc.
NLM has already learned many of these lessons in the indexing environment, where we have been able to get publishers to follow a standard format for submitting article data. However, libraries don’t have the same clout with book publishers, so it’s harder to enforce standards.
CIP (Cataloging in Publication): publishers supply data to LC before a book is published; bibliographic records are created and made available at that time, and the data is printed in the book itself. Since publishers want to participate in CIP, this is a case where LC has the clout to encourage or require publishers to submit their data in a standard format. NLM has been a CIP participant for over 20 years, so we would welcome a more automated process.
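As a toy illustration (not from the report), re-purposing publisher-supplied data might look like the sketch below: a simplified XML feed, loosely modeled on ONIX for Books, is mapped into a flat catalog record instead of being rekeyed by hand. The element names and sample data are hypothetical, not the real ONIX schema.

```python
import xml.etree.ElementTree as ET

# Simplified publisher feed; element names are illustrative,
# loosely modeled on ONIX for Books, not the full ONIX schema.
PUBLISHER_FEED = """
<Product>
  <Title>Clinical Epidemiology</Title>
  <Author>Doe, Jane</Author>
  <Publisher>Example Press</Publisher>
  <PubDate>2008</PubDate>
  <ISBN>9780000000002</ISBN>
</Product>
"""

def publisher_xml_to_record(xml_text):
    """Map publisher-supplied XML into a flat catalog record,
    rather than re-creating or rekeying the data by hand."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("Title"),
        "author": root.findtext("Author"),
        "publisher": root.findtext("Publisher"),
        "year": root.findtext("PubDate"),
        "isbn": root.findtext("ISBN"),
    }

record = publisher_xml_to_record(PUBLISHER_FEED)
print(record["title"])  # Clinical Epidemiology
```

The point of the sketch is the flexibility the report calls for: the library accepts the data as the publisher structures it and maps it, rather than requiring strict conformance to US library standards up front.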
1. Increase efficiencies (cont.)
- Distribute responsibility
- Share responsibility for creating and maintaining bibliographic records
- Collaborate on authority record creation and maintenance
- Increase re-use of assigned authoritative headings among various communities
- Internationalize authority files
There is already a Program for Cooperative Cataloging (PCC), in which libraries share the responsibility for creating high-quality bibliographic and authority records, but the number of participants is relatively small. The recommendation is to expand the number of participants in the PCC component programs.
We need to work with other communities, such as the A&I services, publishers and research institutions to develop more effective ways of identifying and distinguishing authors, and collocating forms of names in different languages and scripts to allow true international sharing of data.
1. Increase efficiencies (cont.)
- Re-examine current economic model for data sharing in the networked environment
- Increase incentives for sharing bibliographic records
OCLC, the Online Computer Library Center, has become the de facto world union catalog. Its procedures and pricing models have created some of the barriers to record sharing, both for original records and for record maintenance. Any member library can create new records, but permission to update records in the database is very restricted. After a one-time credit to the originating library for a new record, OCLC makes the money each time the record is later used by others. There are vendors who will create cataloging records and sell them to libraries, but who contractually forbid those libraries from sharing the records. Is there a way libraries and vendors who create the data could get some of the money that currently goes only to OCLC each time a record is used by another library?
2. Enhance Access to Hidden Collections
- Make the discovery of rare & unique materials a high priority
- Provide some level of access to all material, rather than comprehensive access to some material and no access at all to other material
- Encourage digitization to allow broad access
- Share access to unique materials
Rather than everyone devoting a great deal of time and effort to the same commercially produced titles, reallocate resources to describe the rare and unique materials. Where copyright allows, digitize material so people can access it wherever they are. Share the metadata created for unique collections so they are not hidden.
3. Position Technology for the Future
- Web as Infrastructure
- Develop a more flexible, extensible metadata carrier
- Express library standards as well as library data in machine-readable and machine-actionable formats
- Extend use of standard identifiers
In fact, the future is already here; we must catch up with it. MARC is over 40 years old. It has served us very well, but its use is limited solely to the library community. If we are to work in a broader information environment, we need to develop a carrier that is compatible with today's Web technology.
We need to recognize that in today's environment, machines as well as humans are the users of our data, and structure it accordingly.
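To make the "machine-actionable" idea concrete, here is a small sketch (my own, not from the report) that re-expresses MARC-style data as plain JSON a Web application can consume. The tags 100, 245, and 260 are real MARC 21 bibliographic tags (personal name, title, publication info); the JSON key names and the tag-to-key mapping are illustrative assumptions.

```python
import json

# MARC-style (tag, subfield, value) triples; 100 (personal name),
# 245 (title), and 260 (publication) are standard MARC 21 tags.
marc_fields = [
    ("100", "a", "Doe, Jane"),
    ("245", "a", "Clinical Epidemiology"),
    ("260", "b", "Example Press"),
    ("260", "c", "2008"),
]

# Hypothetical mapping from MARC tag/subfield to plain JSON keys.
TAG_MAP = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}

def marc_to_json(fields):
    """Re-express MARC data as a simple JSON object that
    non-library Web applications can consume directly."""
    record = {}
    for tag, subfield, value in fields:
        key = TAG_MAP.get((tag, subfield))
        if key:
            record[key] = value
    return json.dumps(record, sort_keys=True)

print(marc_to_json(marc_fields))
```

The design point is that the carrier, not the content, changes: the same bibliographic facts travel in a structure that today's Web tools already understand.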
3. Position Technology for the Future (cont.)
- Standards Development
- Improve the standards development process
- Develop standards with a focus on return on investment
- Incorporate testing and implementation plans as integral parts of the development process
Standards are vital when data must support a growing number and variety of applications
Current standards development is often done haphazardly, with related standards developed independently (e.g., NISO, DLF, ONIX, and MARC all independently developed holdings standards that are not compatible with each other; those developing MARC have not always worked together with those writing the cataloging rules). No one is currently overseeing the process.
Regarding the third bullet: programmers and those familiar with systems design need to be involved from the beginning, not after the standard is developed.
3. Position Technology for the Future (cont.)
- Suspend further new work on RDA
- The promised benefits of RDA are not discernable in the drafts seen to date
- Business case for moving to RDA has not been made satisfactorily, particularly given the potential costs of adoption
- More real-world testing of the FRBR model, on which RDA is based, is needed
A standard currently in development is RDA: Resource Description and Access. It is a revised set of cataloging rules designed to replace the current Anglo-American cataloging rules. RDA’s stated goals were admirable, but the actual product that has been distributed in draft form is very disappointing. It does not seem to take us into the future we envision. Rather than being simple and streamlined and attractive/understandable to other communities, it is actually more complex. The language is opaque, and there is some concern that the theoretical underpinnings of FRBR (Functional Requirements for Bibliographic Records—a model published in 1998 by IFLA) are not yet truly proven. Therefore the WG recommended that we suspend work on this standard so the cost/benefits and underpinnings of the new standard could be re-evaluated.
4. Position our Community for the Future
- Design for the future
- Integrate user-contributed data, while maintaining the integrity of the library-created data
- Provide links to appropriate external data
- More research into use of computationally derived data
- Clarify and further explore the use of the FRBR model in the Web environment
Web 2.0 applications. Our users now expect this functionality based on their other online experiences.
Develop ways to categorize or identify creators of added data so other users can make informed decisions on the value/relevancy of the data, without violating privacy concerns.
Links to reviews, book jackets, data sets, etc.
NLM is already doing lots of research in the area of computationally derived data.
Test the FRBR model in the Web environment.
4. Position our Community for the Future (cont.)
- Evolve & transform LCSH
- Pursue de-coupling of subject strings
- Encourage application of & cross-referencing with other controlled subject vocabularies
- Recognize the potential of computational indexing in the practice of subject analysis
We recognize that controlled vocabularies are very valuable, but Library of Congress Subject Headings (LCSH) as currently constructed are difficult to use in the Web environment.
NLM is way ahead of the game here. MeSH is a true thesaurus, and Cataloging decoupled our strings back in 1999. NLM researchers have made excellent progress in using computational indexing for subject analysis. MTI, the Medical Text Indexer software, is already used to suggest subjects in the indexing environment. Cataloging has worked with Lister Hill to modify the program to meet its needs and is very close to using MTI to suggest subject headings for catalogers.
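To give a flavor of computational subject suggestion, here is a deliberately crude sketch of my own: it matches controlled-vocabulary entry terms against free text and returns the corresponding headings. The vocabulary entries are made up for illustration, and this is nothing like MTI's actual algorithm, which uses far richer linguistic and statistical methods.

```python
import re

# Toy controlled vocabulary mapping entry terms to headings;
# the sample entries are illustrative, not actual MeSH data.
VOCABULARY = {
    "heart attack": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
    "high blood pressure": "Hypertension",
    "hypertension": "Hypertension",
}

def suggest_headings(text):
    """Suggest controlled headings by matching vocabulary entry
    terms in the text -- a crude stand-in for tools like MTI."""
    lowered = text.lower()
    found = set()
    for term, heading in VOCABULARY.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.add(heading)
    return sorted(found)

print(suggest_headings(
    "Outcomes after myocardial infarction in patients with "
    "high blood pressure"))
# ['Hypertension', 'Myocardial Infarction']
```

Even a matcher this simple shows the workflow the report envisions: software proposes candidate headings, and the cataloger reviews and confirms them.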
5. Strengthen the Profession
- Build an evidence base
- Encourage ongoing qualitative and quantitative research in bibliographic control
- Design LIS education to meet present and future needs
During our work, we were continually struck by the difficulty of finding hard data about many of these issues. What data elements are really needed and used? Is there a way to determine the cost benefits of controlled vocabulary? We have lots of anecdotes, but few facts. As we’ve done in medicine, it’s time to implement evidence-based librarianship.
- Report presents a vision and broad directions for the future
- It is not a specific implementation plan
- A call to action
We recognize that some recommendations are very broad, others more precise. The report is meant to inform the discussion and debate that needs to happen among a wide range of participants, and catalyze thoughtful and deliberate action. It should convey the sense of urgency we feel in getting this dialog and activity underway.
- Three separate groups at LC reviewed the document
- LC has committed to responding in writing to each of the separate recommendations by ALA Annual, June 2008
The three groups were the ABA (Acquisitions and Bibliographic Access) managers, an internal working group (part of the strategic planning process), and reference specialists (Thomas Mann et al.); their reports were due to Deanna on May 1.
The RDA recommendation was considered first because it is very time-sensitive; the three national libraries issued a joint statement on it on May 1.
Impact on NLM?
- Cataloging descriptive process could be streamlined
- Catalogers could focus on the intellectual tasks of subject assignment, classification, and linkages between items
- More of NLM’s cataloging resources could be devoted to providing access to our hidden collections
Since NLM is already participating in the CIP program for medical titles and is a full PCC member, I don’t think we will end up taking on many more responsibilities or tasks in the national arena.
However, if catalogers' time were freed up through the availability of more descriptive data earlier in the supply chain and more automated authority work, catalogers could spend time on the more intellectual tasks of subject assignment, classification, and linkages. They could also take on new tasks associated with digitization and exposing our hidden collections.
Other things NLM could do
- Work with Lister Hill to develop automated means of disambiguating authors
- Work with publishers to assist in developing author identifiers
- Use authorized name headings in indexing citations as well as in bibliographic records
If we could automate more of the intellectual tasks associated with authority work, by having authors self-identify at the publication stage or by using algorithmic methods to disambiguate names, then we could apply authority control to indexing citations as well as to bibliographic records, providing better collocation of articles for our users.
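As one toy illustration of algorithmic name disambiguation (my own sketch, with made-up data, not an NLM or Lister Hill method), two author records can be compared by normalizing the names and requiring a shared affiliation:

```python
import unicodedata

def normalize_name(name):
    """Normalize a name for comparison: strip accents, lowercase,
    drop punctuation and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in stripped.lower())
    return " ".join(cleaned.split())

def same_author(record_a, record_b):
    """Crude disambiguation: same normalized name AND at least one
    shared affiliation. Real systems weigh many more signals
    (co-authors, subjects, dates, identifiers)."""
    if normalize_name(record_a["name"]) != normalize_name(record_b["name"]):
        return False
    return bool(set(record_a["affiliations"]) & set(record_b["affiliations"]))

a = {"name": "Müller, J.", "affiliations": {"Univ. of Example"}}
b = {"name": "Muller, J",  "affiliations": {"Univ. of Example", "Example Hospital"}}
c = {"name": "Muller, J",  "affiliations": {"Other Institute"}}

print(same_author(a, b))  # True
print(same_author(a, c))  # False
```

Author self-identification at publication time would make this far more reliable than any after-the-fact matching, which is why the report pushes for standard author identifiers.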
Other things NLM could do
- Work cooperatively with LC to develop crosswalks between MeSH and LCSH
- Investigate the possibility of user tagging for bibliographic citations. Review the tags to enhance the MeSH vocabulary and/or PubMed mappings
Having good crosswalks between the vocabularies would be a service to the entire library community, since NLM could supply our headings for CIP titles, for example, and the corresponding LCSH could be added programmatically and not have to go back to LC for additional work. It would also enhance the UMLS.
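Mechanically, a crosswalk can be as simple as a lookup table applied at record-creation time. The sketch below is my own illustration; the sample MeSH-to-LCSH mappings are hypothetical, not authoritative vocabulary data.

```python
# Illustrative MeSH -> LCSH crosswalk; these sample mappings are
# hypothetical, not authoritative vocabulary data.
MESH_TO_LCSH = {
    "Myocardial Infarction": ["Myocardial infarction"],
    "Neoplasms": ["Cancer", "Tumors"],
    "Hypertension": ["Hypertension"],
}

def add_lcsh(record):
    """Given a record with MeSH headings, programmatically add
    corresponding LCSH so the title need not be re-cataloged."""
    lcsh = []
    for heading in record.get("mesh", []):
        lcsh.extend(MESH_TO_LCSH.get(heading, []))
    record["lcsh"] = sorted(set(lcsh))
    return record

record = {"title": "Oncology Basics", "mesh": ["Neoplasms", "Hypertension"]}
print(add_lcsh(record)["lcsh"])  # ['Cancer', 'Hypertension', 'Tumors']
```

The hard work, of course, is building and maintaining the mapping table itself, which is where the cooperative effort with LC would come in.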
Access the Working Group’s Report
For those of you interested in seeing the full report and all the related documents, here is the URL.
I want to thank the Medical Library Association for giving me the opportunity to represent you on this Working Group. It was a very educational and enjoyable experience.