Legacy to Heritage: Subject Access to Manuscript Collections in the Electronic Age

Based on a paper presented at the 1995 meeting of the Northwest Archivists, Seattle, Washington, May 6, 1995

Terry Abraham
Head, Special Collections and Archives
University of Idaho Library
Moscow, ID 83844-3125

The inadequacies of the Library of Congress Subject Headings (LCSH) for manuscript and archival cataloging have a long and distinguished history. The National Union Catalog of Manuscript Collections (NUCMC), an organizational unit of the Library of Congress, initially used LCSH in its first attempt at a nationwide catalog of manuscript collections. In the 1962 volume, its editors declared LCSH inadequate for their uses. Terms, they said, "have been compiled without regard to any established list of subject headings and without any comprehensive precedent for indexing manuscript material." (NUCMC 1959-1962 Index, p. iii) Here we have the surprising vision of Library of Congress catalogers rejecting the Library of Congress list of subject headings. Even so, Walter Rundell reported, in his In Pursuit of American History, that researchers complained about the "inadequacy of [NUCMC's] library headings or tracings" as late as 1970. (p. 239)

Other complaints surfaced as well. Recently, on the Archives Listserv, a Canadian archivist noted a particular frustration: the lack of adequate LCSH terms for the literature of archival science. "For example," he noted, "there is no subject heading in LCSH for the archival function of appraisal." (2 May 1995) This certainly overlooks the decided lack of books devoted to archival appraisal, all of which could be counted on the tines of a fork.

And Marcus Robbins, of the City of Portland Archives, asked in the same forum whether it was time to scrap subject access in a system that provides full-text searching. Specifically, "should we scrap subject access in favor of local keyword, personal name, geographic location, and organization? Should we also shift emphasis to form, genre, and activity of creation? Our current authority file is way too broad and imprecise. In the past, catalogers have misused subject terms, making such searches inaccurate and unusable." (20 April 1995)

Even among librarians, the primary users and consumers of LCSH, critics of LCSH are numerous. A browse through Library Literature over the past three decades will give quite a feel for the depth of the criticism and, as well, its lack of effect.

In response to changing technology, more than to user complaints, NUCMC eventually returned to using LCSH. "Beginning with 1985, in preparation for the National Union Catalog of Manuscript Collections becoming part of an automated national database, the decision was made to conform to the Library of Congress Subject Headings whenever possible." (NUCMC 1985 Index, p. vii)

Yet this did not resolve the questions of subject access, either to books or to manuscripts and archives. A recent analysis of subject searching on on-line public access catalogs (known as OPACS, many of which keep track of the kinds and numbers of searches) demonstrates that while subject searches predominate, LCSH searches are quite a small minority. Out of 10 million searches on one system in 1994, fewer than six percent were LCSH searches. "One important thing we have learned from tracking OPAC use over the last 10-15 years is that subject phrase searching (i.e., searching on Library of Congress Headings) is not frequent, popular, or often successful. In OPACS where this is the only kind of subject searching supported, it is not used as frequently as title searching." (Charles Hildreth, PACS-L, 3 April 1995)

These criticisms, complaints, and lists of inadequacies do not mean that archives (or libraries) should abandon LCSH. Even modifications of LCSH terms to meet perceived needs on local systems merely exchange one set of problems for another. The Art and Architecture Thesaurus is an example of a local term list that, through massive grant funding, has become a recognized part of the cataloging process. This is not an avenue to be taken lightly. In addition, in a world-wide environment, local variations tend to get lost in the larger scheme of things. At the least, they serve to confuse users who must learn new terms at every research stop.

In recent months the news media have become conscious, in their overexcited way, of something they refer to as "legacy software." This is software, now decades old but still in use, which is now causing problems. The best example is the millions of dollars planned to be spent over the next five years to ensure that large national databases and control programs are aware that the year after 1999 is 2000 and not 1000 or 1900. This is the result of hard coding dates as the last two digits of the year, e.g. '94 or '95. The programs will add one to '99 and be quite happy with '00 but we might find our social security benefits, for example, off by a thousand years.
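
The arithmetic behind the problem is simple enough to sketch. The following Python fragment is only an illustration, not code from any actual system; the function names and the fixed 1900 century are assumptions made for the example.

```python
def next_year_two_digit(yy: int) -> int:
    """Advance a year stored as its last two digits, as legacy programs did."""
    return (yy + 1) % 100  # '99 rolls over to '00


def expand_with_fixed_century(yy: int, century: int = 1900) -> int:
    """Expand a two-digit year using a hard-coded century (assumed 1900)."""
    return century + yy


def years_between(start_yy: int, end_yy: int) -> int:
    """Naive elapsed-years calculation on two-digit years."""
    return end_yy - start_yy


# The year after '99, as the legacy program understands it.
after_99 = next_year_two_digit(99)
print(after_99)                             # two-digit year 0, i.e. '00
print(expand_with_fixed_century(after_99))  # read back as 1900, not 2000

# Someone born in '30 has an age computed in '00.
print(years_between(30, after_99))          # -30 instead of 70
```

With four-digit years the same subtraction, 2000 - 1930, gives the expected 70; the error comes entirely from the discarded century.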

There is a lot of legacy code in the world, from software, to miles and gallons instead of meters and liters, to the odd jog in the road at the boundary of a 1913 residential subdivision. The Library of Congress list of subject headings (even though greatly revised in the last several years) is just that kind of legacy code.

Legacy code is what we are stuck with and, often, what we choose to live with because it is too much trouble or too expensive to change. Critics of LCSH have offered alternatives for decades, but we have only revisions, not a wholesale overhaul.

One solution to the legacy code problem is to leave it as it is and bypass it. And today we see glimmers of what will bypass, but not replace, LCSH access to manuscript and archival collections.

Many repositories, including the University of Idaho, have succeeded in getting manuscript and archival material cataloged and added to the library's (and the regional or national) database. These MARC-AMC records make full use of the capabilities of LCSH for subject access and also provide various other forms of keyword searching.

But the new development, less than six months old in our shop, is to load entire inventories to the institutional web-server. This adds the document to possible WAIS search results (or will in the future) and permits full-text searching of the inventory once it is opened in the browser.
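
What full-text searching offers over assigned headings can be shown in a few lines. This is a minimal sketch in Python, assuming the inventories are available as plain text; the file names, sample text, and function names are hypothetical, and a real web server or WAIS index would do considerably more.

```python
import re

# Hypothetical inventories: file name -> text of the finding aid.
INVENTORIES = {
    "mg001.htm": "Box 1. Correspondence regarding mining claims, 1902-1910.",
    "mg002.htm": "Box 3. Photographs of logging camps and sawmills.",
}


def tokenize(text: str) -> set[str]:
    """Lower-case the text and split it into a set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_inventories(inventories: dict[str, str], query: str) -> list[str]:
    """Return the names of inventories that contain every word of the query."""
    wanted = tokenize(query)
    if not wanted:
        return []
    return sorted(
        name for name, text in inventories.items()
        if wanted <= tokenize(text)  # all query words appear in the inventory
    )


print(search_inventories(INVENTORIES, "mining correspondence"))
print(search_inventories(INVENTORIES, "Logging"))
```

The researcher's own words are matched against the archivist's own description, with no controlled vocabulary standing between them. The cost is equally plain: a search for "lumber" would miss the inventory that says "logging."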

An effort is underway to standardize the coding for inventories to facilitate these kinds of searches. This is one of the goals of the Berkeley Finding Aid Project, which is attempting to identify an encoding standard for electronic finding aids. The standard will take the form of a Standard Generalized Markup Language Document Type Definition (SGML DTD). "Researchers at the University of California, Berkeley will develop the encoding standard in collaboration with leading experts in collection processing, collection cataloging, text encoding, system design, network communications, authority control, text retrieval, text navigation, and computer imaging. Project participants will analyze the structure and function of representative finding aids. The basic elements occurring in finding aids will be isolated and their logical interrelationships defined. The DTD will then be developed based on the results of this analysis." (Pitti, Daniel. Sharing the Wealth: Toward a Finding Aid Encoding Standard. 1995.)

The result of this project, if widely adopted, will be to turn our legacies into our heritage. Legacies are a form of inheritance in that they are personal and singular. LCSH, and the MARC format, will continue to have a use in providing brief abstracts of groups of records for researchers. They serve, in this way, as a guide to the inventories. Recent developments in adding hyper-text links to MARC fields have demonstrated that it is possible to provide immediate jumps from the descriptive abstract to the more comprehensive inventory.

Access to the inventory is the key to making use of the heritage information archivists have compiled about documents. A heritage is a community-based foundation for the future. Heritage provides the continuity of the past for the present to build upon. Access to inventories on World-Wide Web servers increases the ability of researchers to identify what they need, who has it, and how big it is. Loading existing and future word-processed inventories into HTML and posting them on the Web is just the first step of this process. The next step is to link them to the MARC records. This heritage of descriptive inventories will become the building blocks of a descriptive program that will increase the availability of our collections world-wide.


legacy.htm / June 1995 / tabraham@uidaho.edu