Achieving the Promise of Oral History in a Digital Age
Abstract and Keywords
The phrase “digital revolution” is frequently used in both popular and academic discourse to describe the multiple contexts of our increasingly electronically enriched and computer-dependent society. This article considers how the promise of oral history can be achieved in a digital age. In oral history and other academic areas utilizing the interview as a central methodological element, the “digital revolution” specifically refers to the mainstream integration of digital technologies into all facets of the oral history process: in the field, in the archive, and in the distribution of the interview content. This article explores how digital technologies have significantly impacted and have become integral to the recording of oral history, as well as to the dual archival imperatives of access and preservation. Digital video recording started playing a pivotal role in practices of oral history by the twentieth century. Oral history has always been bound to technology, and technologies will forever change.
In a revolution, as in a novel, the most difficult part to invent is the end.
—Alexis de Tocqueville
The phrase “digital revolution” is frequently used in both popular and academic discourse to describe the multiple contexts of our increasingly electronically enriched and computer-dependent society. In oral history and other academic areas utilizing the interview as a central methodological element, the “digital revolution” specifically refers to the mainstream integration of digital technologies into all facets of the oral history process—in the field, in the archive, and in the distribution of the interview content. This chapter will explore how digital technologies have significantly impacted and have become integral to the recording of oral history, as well as to the dual archival imperatives of access and preservation.
First and foremost, digital technology does not make us better interviewers, better project directors, or better oral history archivists. Digital technology does not change the multitudes of stories that can be documented, studied, archived, and preserved. Dramatic changes in technology often frustrate practitioners who must learn new methods; however, when one considers the core practice of oral history, digital recording alone changes very little. There is still a microphone connected to a recorder that converts sound into data on some form of media, which eventually needs to be archived and preserved properly. The recorders are smaller and can record a much higher quality signal. The media is no longer tape, and the data is no longer preserved in the form of magnetic particles. Although the essential addition of the computer in the workflow has changed the process dramatically, merely utilizing digital technologies to conduct oral history interviews does not constitute a revolutionary change.
The digital revolution will fulfill its promise to oral history more completely when digital technologies are utilized to change the ways in which oral history is organized, accessed, utilized, and preserved as an information package. Oral history collections have always been a complex resource to navigate. For the field of oral history, the true digital revolution has only just begun, and it entails more than the economic or technological changes that make recorders cheaper, smaller, and faster with better resultant sound. The real revolution will be a change in consciousness about how oral history, as a historical resource, can be engaged and discovered more easily, more widely and effectively distributed, and ultimately, more responsibly preserved.
Since the late nineteenth and early twentieth centuries, interviewers in the professional practices of oral history, folklore, anthropology, ethnology, ethnomusicology, linguistics, sociology, and other related disciplines have relied on technology, specifically audio recording technologies, to record their data in the field. In 1890, the folklorist and anthropologist J. Walter Fewkes proclaimed that the study of folklore “cannot reach its highest scientific value until some method is adopted by means of which an accurate record of the stories can be obtained and preserved.”1 Over the course of the twentieth century, analog recording technologies improved greatly in regard to portability and fidelity. In a 1937 article for the Southern Folklore Quarterly, the folklorist John Lomax described his first “electrically driven” recording machine used in 1933 to record folk songs:
Improvements have come slowly. … The amplifier weighed more than one hundred pounds; the turntable case weighed another one hundred; two Edison batteries weighed seventy-five pounds each. The microphone, cable, the tools, etc., accounted for sufficient weight to make the total five hundred pounds.
Lomax later described tearing out the back seat of his car to install the recorder, which, on one field trip, overturned. He recalled, “As a result my son got acid burns and lost a suit of clothes.”2
In 1954, the anthropologist Alan Merriam wrote, “At present … the ethnographer is not so much interested in whether a recorder should be used, as in what type of recorder will give the best results.” By then the acoustic cylinder recorder was obsolete. However, according to Merriam, “The choice among disc, wire and tape machines … still faces the ethnographer.”3 Cylinders, discs, and wire were soon replaced with magnetic tape. Open reel recorders maintained a dominant position for decades, eventually giving way to the portable and commercially popular, and at one time ubiquitous, audiocassette.
The reliance of professional and amateur interviewers on increasingly portable audio recorders to gather field data represented a revolutionary shift in both the purpose and practice of ethnographically oriented disciplines, heightening the centrality of the text as it was originally performed. Portable recorders created a radical change in methodology, in theory, and in the consciousness of the practitioners. As the oral historian Dale Treleven noted in 1984, “the mechanical recorder freed the new breed of historian-interviewers from scrawling interviewee responses on paper.”4
I have been recording with digital audio and video field recorders since 1996. In recent years, I have interacted with many folklorists and oral historians in the context of facilitating their transition to digital recording technologies. I have observed that analog users display a distinct sense of comfort and security in pressing the mechanical play and record buttons together on the professional analog cassette recorders and feeling the recording process begin. The transition to digital audio and video technologies is no longer a blind nod to technological determinism or early adoption. Both digital audio and video provide numerous advantages over their analog counterparts, including, most importantly, quality.
In addition to capturing a wider dynamic range, digital audio is a much lower-noise recording. When comparing professional quality recorders, the sonic quality of the signal being recorded digitally can be far superior to the analog counterpart. The advantages digital audio provides, with regard to data migration, are significant as well. High quality copies of a field recording no longer need to be compromises in quality or in time. This listing of technological advantages does not mean that digital technology does not bring with it numerous challenges. The impending obsolescence of analog recording does, however, foster in many practitioners a sense of urgency, and hence frustration.
All technologies change. As improvements are made and then marketed to consumers, these technologies become commercially mainstream. Innovations are discovered, the technologies adapt, they are re-marketed as improvements the general consumer cannot live without, and the mainstream changes again. Each instance of change has had an impact on oral history methodology, but it did not revolutionize the profession. The methodological shift from interviewers “scrawling” on paper to utilizing portable recording technologies in the field was, indeed, a revolutionary shift in our field.
Magnetic tape recording technologies dominated our practice for decades. For almost twenty years, the major equipment decision an interviewer had to make was whether to purchase the Marantz or Sony cassette recorder. The Philips Company invented the audiocassette in 1963. Later that decade, 4-track and 8-track cartridge technologies competed for market share in the commercial music marketplace. The 8-track tape player gained a major foothold with its presence in automobiles, but from a commercial recording perspective, the cassette emerged as the dominant format. In the early 1980s the compact disc appeared as an alternative distribution medium for commercial music. Despite the rise of the CD, digital recording technologies did not immediately filter down to mainstream, affordable portable field recording solutions until the early 1990s. In fact, audiocassette technologies continued to dominate fieldwork recording solutions at this time, while maintaining a strong foothold in the commercial music market. According to the Recording Industry Association of America (RIAA), commercial music cassettes continued to outsell compact discs until 1992. Surprisingly, the audiocassette held its ground through the 1990s. In 1997, the full-length audiocassette still made up 18.2 percent of musical sales. But by 2007, that number was down to 0.3 percent. Presently we face rapidly declining CD sales as rates of legal and illegal downloadable music increase dramatically.
The commercial music transition from cassette tape to CD changed very little about the music industry. Yes, the consumer had to purchase new equipment, but the basic paradigms and business models remained the same. It is the transition from CD to digital downloadable music that has completely revolutionized the music industry. Portable recording and preservation options for archivists, folklorists, and oral historians will always be subject to the dynamic trends of the consumer music marketplace whose interests are not always aligned with the needs of field recording and archival preservation. Oral historians, folklorists, ethnographers, and archivists do not, nor will they ever, drive consumer market trends for recording technologies and media.
Although the use of digital recording technologies in portable field recorders is a relatively new phenomenon, digital recording of audio had been conducted since the 1980s. Recorded sound has always been preserved as data in one form or another. Analog recording devices reproduce sound using transducers, in this case, microphones, which convert analog sound vibrations into an electric signal. In the case of magnetic tape, this electrical signal is represented by the arrangement of magnetic particles on the tape. To hear the recorded sound on the magnetic tape, this process is reversed. Computers do not read magnetic particles or electrical signals. Digital recording of audio adds, through the use of an analog to digital (A/D) converter, the conversion of electrical signals into discrete code that is readable by a computer. In order to play back digital audio, the digital data is converted, through the digital to analog (D/A) converters, from binary code to electrical signals resulting in sound vibrations projected from a speaker. Now, a digital audio file, from a data perspective, more closely resembles a Microsoft Word document than an analog recording on magnetic tape.
Digital audio recording uses time sampling and quantization in order to represent analog signals as data. Discrete time sampling, “the essence of all digital audio systems,” involves sampling the sound wave at various intervals. An increased sampling rate yields a higher quality representation of the sound wave. If sampling represents time in the measurement of digital audio, “quantization represents the value of the measurement, or … the amplitude of the waveform at sample time.”5
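The two steps just described can be illustrated with a few lines of Python. This is a minimal sketch rather than a description of how any particular recorder works: a pure sine tone stands in for the analog signal, and the function name, the 440 Hz tone, and the 8-bit comparison are all assumptions chosen for illustration.

```python
import math


def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Sample a sine tone at fixed intervals and round each sample to an integer code.

    The tone stands in for the analog signal; all parameter names are illustrative.
    """
    # Largest positive code a signed integer of this bit depth can hold
    # (32767 for 16-bit audio).
    max_code = 2 ** (bit_depth - 1) - 1
    codes = []
    for n in range(num_samples):
        t = n / sample_rate_hz                           # time sampling: when we measure
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # "analog" value in [-1.0, 1.0]
        codes.append(round(amplitude * max_code))        # quantization: what we store
    return codes


# 441 samples cover one hundredth of a second at 44,100 samples per second.
cd_codes = sample_and_quantize(440, 44_100, 16, 441)
coarse_codes = sample_and_quantize(440, 44_100, 8, 441)

print(cd_codes[:5])
print(len(set(cd_codes)), "distinct 16-bit codes;",
      len(set(coarse_codes)), "distinct 8-bit codes")
```

Running the sketch at two bit depths shows the trade-off in miniature: the sampling rate fixes how often the wave is measured, while the bit depth fixes how finely each measurement can be expressed.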
Because of the dominance of the compact disc, an early standard for digital audio quality was established. “CD quality” recording meant 16-bit/44.1 kHz. A standard audio CD contained seventy-four minutes, or 650 megabytes, of uncompressed audio information. This standard is already beginning to change as storage media becomes dramatically less expensive and recording capabilities improve. As the capabilities of recording increased, data footprints of audio recording technologies have also increased. To counter the increasing size of audio data files and an increasingly limited amount of bandwidth available, compression algorithms (lossy and lossless) were employed for certain digital audio technologies enabling the network mobility of these files. Lossy compression algorithms utilize psycho-acoustics and noise shaping technologies in order to decrease the file size, also resulting in a reduction in recording sonic quality; lossless compression algorithms yield decreases in file size but retain the originally recorded sonic quality.
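The relationship between recording resolution and data footprint is simple arithmetic, and a short Python sketch may make it concrete. The function name and the 24-bit/96 kHz comparison setting are assumptions introduced here for illustration; only the 16-bit/44.1 kHz standard and the seventy-four-minute duration come from the discussion above.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Bytes needed for uncompressed PCM audio, ignoring any file header."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


# "CD quality": 16-bit samples, 44,100 samples per second, two channels (stereo).
cd_74_minutes = pcm_bytes(44_100, 16, 2, 74 * 60)
cd_one_hour = pcm_bytes(44_100, 16, 2, 60 * 60)

# A higher-resolution setting chosen only for comparison (an assumption,
# not a recommendation from the text).
hires_one_hour = pcm_bytes(96_000, 24, 2, 60 * 60)

for label, size in [
    ("74 minutes, 16-bit/44.1 kHz stereo", cd_74_minutes),
    ("1 hour, 16-bit/44.1 kHz stereo", cd_one_hour),
    ("1 hour, 24-bit/96 kHz stereo", hires_one_hour),
]:
    print(f"{label}: {size:,} bytes ({size / 1_000_000:.0f} MB)")
```

The seventy-four-minute stereo result comes out somewhat larger than the 650-megabyte figure usually quoted for a compact disc; that figure describes the capacity of the same disc when it is formatted for computer data rather than for audio. The one-hour comparison suggests how quickly the footprint of a single interview grows as resolution increases.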
As with computer technologies, the digital audio marketplace has been a rapidly changing and dynamic environment. Formats and technologies that once held great promise have, in a very short time, become obsolete. Digital Audio Tape (DAT), one of the early portable digital recording formats, recorded on only one side of the tape and utilized a technically elaborate rotary recording head, which employed a helical scan method of encoding data on to the tape. Over time, DAT proved to be technically unstable, disastrously fragile from a preservation perspective, and was abruptly abandoned by recording studios and fieldworkers alike.
The Minidisc proved to be another fleeting format briefly popular among oral historians and folklorists. The primary market for this technology was the lucrative portable music market previously dominated by the lingering audiocassette. However, Minidisc failed in the commercial music marketplace losing market share to the less proprietary and more flexible MP3 compression technologies.
Despite the fact that the CD finally overcame the audiocassette in the commercial music market, early recordable CD technology was not affordable and thus not easily accessible. The cassette maintained its place in the marketplace in the face of the compact disc mainly because of the cassette's familiarity and its onetime ubiquity. Early into the twenty-first century, recordable CD technology (CD-R/RW) grew more common. Almost every computer sold came with a CD-R drive, yet the CD-R portable field recorder still did not catch on as a mainstream field recording option. Given the dominance of the audio CD in the commercial marketplace, I expected, at the time, that this format would take off. It had two important advantages: one could purchase recordable CDs almost anywhere, and one could be completely independent from the computer. However, one could record only seventy-four minutes on a CD-R, the format was limited to 16-bit/44.1 kHz, and, most importantly, users in the commercial marketplace were getting away from the CD-R as their primary means of storing music.
By the time that portable CD-R recorders emerged on the scene, computers were fast becoming the primary means of managing individual music collections. Flash memory recording, a process that mirrored digital photography, emerged as the dominant portable, digital field-recording format. Flash memory recording involves very few moveable parts and is inexpensive, and one can record as long as one can afford additional memory. Once again, the dominant portable field-recording format of choice mirrors the commercial marketplace, a convergence resulting in greater market and format stability. Napster and Apple's iPod have revolutionized the digital music industry and the technology has matured. In June 2008, Apple announced “music fans have purchased and downloaded over five billion songs from the iTunes store.” The paradigm of recorded sound as data files managed by a computer now dominates both the commercial music and the archival communities.
Digital video recording by consumers is wildly popular. Again, emerging technologies have made it possible to capture high quality digital signals on affordable cameras, edit video on home computers, create and distribute DVDs, or upload compressed files to YouTube, again from the home computer. Affordable digital video cameras can capture extremely high quality digital audio in addition to a video signal that exceeds previous “broadcast quality” standards. Despite these capabilities, the usage of digital video field-recording equipment has not yet become standard practice. Archivists and oral historians have debated for years the pros and cons of video oral history. The high cost for the preservation of high-resolution, broadcast-quality video has deterred mainstream integration of video into standard fieldwork practice. From an access perspective, digital video will indeed challenge the dominant paradigms of oral history, especially as video transitions from standard definition to high definition. The prevalence of digital video on the Internet, the ability to record video with Web cameras and on low-cost cell phones, and the widespread use of oral history methodology in documentary films are just a few factors that have encouraged a societal expectation for video. Oral history will not be immune to this expectation, so oral history archives should prepare for a major influx of digital video. User expectations for video are increasing; however, the archival challenges to preserve and provide access to digital video involve a much greater investment in resources. One high definition videotaped interview (recorded with a certain high-resolution codec) can equal the size of a standard computer hard drive. Ten of these interviews can equal the data footprint of many file servers currently deployed in archival settings. The current state of preservation technologies falls short compared with the current state of our ability to record high-resolution interviews.
Once again, high-resolution video has become less expensive to record and simultaneously, more expensive to maintain and preserve. However, with the popularization of video sharing Websites such as YouTube, users' expectations for quality have actually diminished. The tolerance for highly compressed video has greatly increased and is even acceptable when broadcast on network and cable news television channels. The Internet delivers a dramatically increasing percentage of video content, enabling public access to videotaped oral histories from all over the world, sometimes within just a few moments of completion of the recorded interviews.
Access: Accommodating the Oral History User
Oral history professionals appreciate the historical and cultural richness embedded in oral history interviews. Oral history is gaining in popularity as a historical resource, but remains an underutilized resource for historical research. This may be due partly to the difficulty researchers have had in gaining access to oral history collections. The oral history information package may contain multiple recorded copies of an interview in a variety of formats: reel-to-reel tape, cassette tape, audio CD, multiple forms of digital data files. If the interview is transcribed, an archive could possess a typescript and an electronic transcript that may or may not exist in a variety of stages of completion (first draft, audited, edited etc.). Additionally, the information package should contain relevant descriptive and technical metadata at either the collection or the interview level that intellectually tie these multiple components together.
Digital technologies, specifically the Internet, have dramatically changed the archival profession and the practice of information seeking and retrieval. Between the 1970s and the 1990s, many oral history collections published printed collection guides. The conversion of these guides into an online environment meant repositories could integrate the processes of efficiently and remotely searching and browsing bibliographic records contained in large or small oral history collections, all in an online context. Simply using the Internet to distribute metadata records, however, meant that the user still had to physically come to the archive to interact with the interview materials, unless, of course, the repository's collections circulated.
Archival finding aids have traditionally been important tools for discovery and detailed description of archival materials. Yet, the “digital revolution” has dramatically altered users' expectations regarding what the Internet can provide. The physicality of the archive is no longer the preferred public access point for the typical user. What archivist has not heard in the past few years, “When will your collections be accessible online?” When users ask about collections being mounted online, they are not limiting their expectations to remote access to bibliographic records about the interview. Typically, users want access to digital surrogates of the interview itself. They want, even demand, access to archival materials of all kinds from their computer, while at home, in an airport, or sitting in Starbucks. Archives and digital programs have been rapidly initiating and dramatically increasing their digitization efforts during the last decade to address the demand. However, great strides made in providing digital content in the commercial marketplace decrease the general public's tolerance for the slower pace at which archival institutions place their materials online.
Consider the older model of research methodology. Typically, researchers would discover the presence of an oral history collection or interview they needed for their research. This initial discovery was often made through a catalog, a printed guide, or word of mouth. The researchers would physically travel to the archive where they would be confronted with a large stack of tapes and transcripts (if the collection had even been transcribed). Information seeking and retrieval in a text-based, paper environment involves scanning text for keywords and concepts congruent to the information seekers' intentions. Browsing audio in an analog, mechanical environment is a comparatively inefficient way to seek out specific information in a textually oriented society. Compared to visually scanning text, cognitively scanning the analog audio is a much less efficient and slower process. If researchers were short on time or impatient, they would likely scan through a collection containing over ten thousand pages of transcript and choose to listen only to the audio that coincides with the relevant text, rather than making the commitment to listen to the entire audio collection, which could amount to more than two hundred hours of audio recording. For now, a transcript, if accurately generated, is still the most efficient tool for locating specific information in a collection. Navigating an analog audio collection, in the absence of a transcript, is greatly facilitated by the existence of a sequential, subject-based index. Less efficient is a finding aid that gives a brief item-level description and a listing of subject headings representing the interviews in the collection, but a collection level description with an inventory list would have been the more likely archival option. Ultimately, collection level description does very little for seekers of specific information embedded within interviews contained in large oral history collections.
Interview-level description has traditionally been the metadata ideal. Brief descriptive metadata enables the user of the physical collection to efficiently target specific interview content, browse, or discover serendipitously individual interviews not previously considered. The problem with generating item-level description has always been time and money. Archivists of large oral history collections often have difficulty committing to anything other than collection-level description. The typical user would be given a collection-level description, transcripts, and tapes, and would then begin the daunting task of manually seeking information.
As the Internet matured, repositories began to place full text transcripts and audio excerpts or interviews on static Web pages. Although some archivists resisted posting interviews online because of control issues, the practice dramatically changed the mode of distribution of oral history interviews making the interview accessible to a wider audience. This in no way changed the way in which the user interacted with the interview. The past decade has witnessed decreasing costs of both bandwidth and digital storage yielding numerous initiatives involving more innovative and effective ways of publishing oral history collections on the Internet.
In my own career, I moved from an oral history archival position to directing the Kentucky Oral History Commission. With an increasing focus on digitization and digital projects, I left the commission to direct the digital program for the University of Alabama Libraries. Having gained perspective working in the world of academic digital programs, I became director of the University of Kentucky's Louie B. Nunn Center for Oral History. During this same period, oral history was undergoing a transformation of presentation in a digital, online environment.
In 1998, the Kentucky Oral History Commission, a program of the Kentucky Historical Society, launched an initiative to document the struggle to end legal segregation in Kentucky. A project director conducted more than one hundred analog interviews. These interviews were accessioned and processed the same way other new collections had been processed, and the oral histories were described at the collection level. As the Civil Rights Project gained momentum, in 2000, the program launched production of a video documentary Living the Story: The Civil Rights Movement in Kentucky. Well received, the documentary continues to run on Kentucky Educational Television. As the archivist for the oral history collection at the time, I faced a demand for usage of the Civil Rights collection that we had not experienced with other collections. In the face of demands for specific and precise information, I found the collection-level system for archival description inadequate for this particular project in the physical archival context.
To better meet users' needs, I sought to develop an online interface to effectively deliver oral history materials as a total information package. Our collection management system ran on a customized Microsoft SQL database, which simultaneously controlled the Statewide Guide to Oral Histories. I felt the logical solution would be to integrate a content delivery system into the architecture of our preexisting collection management database. The integration of these databases ensured that the records would not have to be imported or exported into another interface for online distribution. Records were keyed just once, and with a single command and the addition of a link to the corresponding filename for the digital surrogates (i.e., digital audio file and transcript text file on the SQL server), the interview would be made public.
My overarching goal for the new interface was to allow users to quickly and easily browse and search both records and interviews, and listen to and view interview segments from the collection. The entire audio and video interviews were not delivered by the Civil Rights in Kentucky Oral History Project Digital Media Database, as the Kentucky Historical Society did not have ready access to streaming technologies. The goal for this digital initiative was to create multiple access points for this collection based on usage patterns of the collection I had observed thus far. Most users were seeking to browse the collection based on subject—subjects primarily based on the categories that structured the documentary. I observed that users seemed to want to interact with interviews on these particular topics: desegregation of education, life under segregation, public accommodations, open housing, and protests and demonstrations.
It became evident that users preferred to interact with “local” materials, so we constructed a drop-down menu that instantly isolated items from pre-selected Kentucky counties. In addition to providing the user the traditional, searchable metadata record, which contained descriptive metadata such as an interview synopsis and keywords, we enabled users to browse by subject, by county, and by decade.
Merely browsing metadata records containing descriptive metadata linking to audio and video excerpts and nearly ten thousand pages of text would be of little use to the serious researcher without powerful search capabilities. The Civil Rights in Kentucky Oral History Project Digital Media Database enabled users to search metadata records as well as to perform global and item-level keyword searches of full, oral history transcripts. In order to make the audio and video excerpts searchable, the transcript of each excerpt was dropped into an additional text field in the database and linked to the digital surrogate. In order to make the audio and video excerpts discoverable, each excerpt contained its own level of descriptive metadata. For this project, metadata needed to be generated at the collection-level, the interview-level, as well as the individual excerpt-level—a labor-intensive and ultimately expensive process.
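The three levels of description, and the practice of storing each excerpt's transcript in a text field linked to its digital surrogate, can be sketched with a small relational schema. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than the Microsoft SQL database described above, and every table name, column name, file name, and row of sample data is a hypothetical placeholder, not a reconstruction of the project's actual design.

```python
import sqlite3

# In-memory database with one table per level of description (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE collection (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE interview (
    id INTEGER PRIMARY KEY,
    collection_id INTEGER REFERENCES collection(id),
    interviewee TEXT, county TEXT, synopsis TEXT);
CREATE TABLE excerpt (
    id INTEGER PRIMARY KEY,
    interview_id INTEGER REFERENCES interview(id),
    subject TEXT,
    media_file TEXT,   -- link to the digital surrogate
    transcript TEXT);  -- excerpt transcript, stored so the excerpt is searchable
""")

# Placeholder rows only; none of this is real interview content.
conn.execute("INSERT INTO collection VALUES (1, 'Sample collection')")
conn.executemany("INSERT INTO interview VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "Interviewee A", "County A", "Placeholder synopsis."),
    (2, 1, "Interviewee B", "County B", "Placeholder synopsis."),
])
conn.executemany("INSERT INTO excerpt VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "desegregation of education", "excerpt_001.mp3",
     "Placeholder text about the first day at the newly integrated school."),
    (2, 2, "open housing", "excerpt_002.mp3",
     "Placeholder text about looking for a house to rent."),
    (3, 2, "public accommodations", "excerpt_003.mp3",
     "Placeholder text about a lunch counter downtown."),
])


def search_excerpts(conn, keyword):
    """Keyword search over excerpt transcripts; returns (file, interviewee, subject)."""
    return conn.execute(
        """SELECT e.media_file, i.interviewee, e.subject
           FROM excerpt AS e JOIN interview AS i ON i.id = e.interview_id
           WHERE e.transcript LIKE ? ORDER BY e.id""",
        (f"%{keyword}%",),
    ).fetchall()


def browse_by_subject(conn, subject):
    """Browse access point: list the media files filed under one subject."""
    rows = conn.execute(
        "SELECT media_file FROM excerpt WHERE subject = ? ORDER BY id", (subject,))
    return [row[0] for row in rows]


print(search_excerpts(conn, "school"))
print(browse_by_subject(conn, "open housing"))
```

Even in this toy form, the cost noted above is visible: every excerpt needs its own subject term, its own transcript text, and its own link to a media file before it can be found.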
The Digital Media Database for the Civil Rights in Kentucky Oral History Project was designed to enhance the user's experience when using oral history. This solution attempted to integrate the various modules: search, browse, and metadata. A reviewer, Mary Larson, stated:
This site has just about everything that modern Internet users have come to expect in terms of searchability, ease of use, and good design. At the same time, it presents information in a number of different formats and for a number of varied audiences—from the general public to serious researchers to high school students—while making it relevant to all of them … this is a model oral history Web site and perhaps it will inspire other programs to look to this type of approach to projects in the future.6
The online database developed for the Kentucky Oral History Commission's Civil Rights Movement in Kentucky Oral History Project was a customized solution that required a tremendous amount of resources, effort, and maintenance, a model not easily replicated or realistically afforded for other collections. A more scalable model was needed for delivering the majority of our digital assets in a less labor-intensive and less expensive fashion. One positive aspect of designing the database was that it was customized to work only with oral history materials. The text of the transcript worked well with the oral and visual dimensions of oral history. The downside of a customized solution built from the bottom up is providing ongoing support and maintenance. The programmer on the project, although extremely talented, utilized techniques not easily discernible by others. Additionally, Kentucky's state government Information Technology (IT) department consolidated and removed IT managers from the Kentucky Historical Society. When I left the position at the Kentucky Historical Society in 2006, the customized solution had virtually no support. A database that was originally designed for dynamism and rapid updates ultimately became a static exhibition.
Relatively inexpensive commercial platforms such as OCLC's CONTENTdm have enabled smaller institutions, public libraries, state historical societies, colleges, and universities to construct a sophisticated and relatively easy-to-manage digital archive. When I accepted the position as the director of University of Alabama's digital program, the library had already invested in CONTENTdm. I was comfortable with this interface in that I had been a member of the team that initially implemented the system at the Kentucky Historical Society. Like many other digital archive platforms, CONTENTdm originally was designed for delivering digital surrogates for photographic and manuscript materials. For these formats, the interfaces have greatly matured. The complexities of the oral history information package add a degree of difficulty for more generalized online interfaces. If the digital oral history collection is singularly dimensional and contains only audio files, or only transcripts, the complexity is minimal, and the collection and the interface are usually quite easily made compatible. However, if the object contains multiple dimensions (metadata, transcript, and audio or video), interacting with oral history materials can prove awkward. CONTENTdm, out of the box, excels at presenting digital photograph collections online. The interface was designed for simplicity and efficiency. It is also fairly customizable if your institution has ready access to a good PHP programmer. Few institutions do. In the case of delivering oral history content, CONTENTdm, as originally designed, struggled.
Administrators of the local system can create “compound objects,” associating multiple objects for the users to experience together. Theoretically, in the case of oral history, this enables the user to access the transcript and the audio together. In practice, this is not completely the case. Although the audio and the transcript are merged into a single record, the user must still use each component of the oral history package separately. At the University of Alabama we committed to putting online the Working Lives Oral History Project, an oral history project focusing on African American industrial workers in the central Alabama region. Fortunately, this collection had already been transcribed, and we possessed the transcripts as typescript. After scanning the transcripts and digitizing the audio, we uploaded the materials into our CONTENTdm interface. The University of Alabama Libraries uploaded the interviews as compound objects, placing the scans of the transcripts under the same record as the digital audio files. Without a PHP programmer on staff, I struggled to make the presentation of the oral history interview work in a more integrated fashion. Clicking “access this item” opened the audio file, but it opened the audio file in a separate browser tab, taking the user away from the interview record. The user had to open the audio file, then return to the interview record on the original browser tab, open the transcript, move the audio tab out of the way, and only then could the user follow along in the transcript.
The major challenge facing oral history collections in a digital archival environment is making the individual components of the collections all work together in a user-friendly, efficient, useful, and intuitive manner. There are countless examples (p. 296) of repositories mounting digitized oral history collections onto platforms that do not complement the potentially multidimensional, complex nature of oral history materials. Researchers can read and search the transcript, and they can listen to, or watch, an oral history interview. Usually, though, researchers cannot do both in a user-friendly and intuitive manner. If they search on a single word in a transcript, they should be simultaneously linked directly to that moment in the audio or video file as well. If a digital collection is placed online and the interface for accessing the interviews is not usable, the responsible repository may have increased the potential audience for those archival materials, but functionally, access will more closely resemble the access models represented by boxes of tapes and stacks of printed transcripts. The only difference between the two approaches is a change in venue. Technologies exist to make textual and audio materials work together online, and it is our responsibility as curators of oral history materials to make them easy and efficient to use.
Since becoming the director of the Louie B. Nunn Center for Oral History at the University of Kentucky Libraries, I have worked closely with the Kentuckiana Digital Library (KDL), the principal distributor of digital collections for the University of Kentucky Libraries Special Collections and Digital Programs and the Nunn Center, in order to refine their online oral history interface. The digital program had developed an innovative search and retrieval interface, but the interface was constructed in a way that made it very difficult to use. Very little collection-level metadata and no item-level metadata were presented to the user, so users had to guess, or already know, the contents of the collections in order to choose effective search terms. The search interface was indeed incredibly powerful and innovative, targeting the whole interview rather than pre-selected interview excerpts. Eric Weig, director of the Kentuckiana Digital Library and co-designer of the original interface, articulated his intentions:
Rather than trying to identify key moments in the audio or breaking it up into logical segments, we store landmarks in its metadata: the line numbers in the transcript associated with five-minute intervals in the audio.7
By embedding time-code into the transcript, we enabled time correlation between the transcript and the audio or video, yielding an integrated final product where the components work together. The search results were highlighted and contextualized by the inclusion of surrounding text in the transcript. Weig notes that the “line number that each search hit appears on is indicated. … The line number range for each five-minute segment of audio is also displayed, so the user can select the segment they want, and even estimate where in the segment they can hear their search term.”8 The connection between the transcript and the audio was rooted in the five-minute audio chunks that were automatically created and the corresponding time-code markers that were manually embedded in the transcript. The connection between transcript and audio was what other oral history online interfaces lacked at the time. As represented in the first incarnation of the KDL oral history interface, this connection was awkward, but we have worked together, at the University of Kentucky, to refine it. While closely scrutinizing the initial interface (p. 297) we determined that a major area of weakness was in the line numbering system. Line numbering functioned to inform the user of the location of search hits in the text and to indicate which corresponding audio segment to click on in order to listen to their search result. The user would click on the corresponding link and be delivered five minutes of audio to navigate. Neither line numbers nor time-code appeared in the text of the transcript, though, and the numbers therefore had little meaning once the user navigated away from the search results and into the transcript. The user linked to the audio segment containing the specific information they sought, but they had to browse the entire transcript in order to locate the corresponding text.
Navigating away from the search results page and into the PDF transcript/audio interface removed you from the “keywords in context” that were so useful in discerning the appropriate search result.
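The landmark scheme Weig describes, in which each five-minute audio segment is associated with a range of transcript line numbers, can be illustrated with a short lookup. This is only a sketch under stated assumptions: the function name and the sample landmark values are hypothetical, and nothing here reproduces the actual KDL code.

```python
from bisect import bisect_right

SEGMENT_SECONDS = 5 * 60  # five-minute audio chunks, as in the original interface


def segment_for_line(landmarks, line_number):
    """Return (segment_index, start_seconds) for a transcript line.

    `landmarks` is a sorted list holding the first transcript line of each
    five-minute segment, e.g. [1, 58, 121] (illustrative values only).
    """
    if not landmarks or line_number < landmarks[0]:
        raise ValueError("line number precedes the first landmark")
    # The rightmost landmark that is <= line_number identifies the segment.
    index = bisect_right(landmarks, line_number) - 1
    return index, index * SEGMENT_SECONDS
```

A search hit on line 58 of this hypothetical transcript would resolve to the second segment, which begins 300 seconds into the recording. The user would still have up to five minutes of audio to scan, which is the weakness discussed above.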
We have worked to embed the search engine and the resulting keywords in context into the same interface users navigated in order to view the transcript (see figure below). Additionally, we created a customized software solution to more easily (albeit still manually) embed time-code markers into the transcript. The decision was made to embed these markers at one-minute intervals throughout the transcript. The five-minute interval proved to be, still, too much text to scan while trying to determine the specific location of the information being sought in the audio file.
The new KDL oral history interface allows users to search text, contextualize search results, link search results to their corresponding place in the text, and finally, move to within one minute of the search result in the audio file. As soon as the user decides to refine their initial search, they are able to quickly move around the audio and the transcript, and both discover and pinpoint the textual or conceptual information they sought. The layout is a simple four-quadrant design: the interview metadata and media player fill the upper quadrants, and the transcript text and the search and retrieval interface fill the lower quadrants.
The time codes just to the left of the transcript text appear as links, which correspond to the time code in the audio file. Clicking on a time-code link activates that particular audio chunk. Additionally, users are able to use the interface to download custom audio segments, or to print a formatted version of the transcript.
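The relationship between embedded one-minute markers, keywords in context, and audio position can be sketched in a few lines. The bracketed `[HH:MM:SS]` marker format, the function name, and the sample transcript text in the usage note are all assumptions made for illustration; the chapter does not specify how the KDL markers are encoded.

```python
import re

# Hypothetical marker format: [HH:MM:SS] embedded inline in the transcript.
MARKER = re.compile(r"\[(\d{2}):(\d{2}):(\d{2})\]")


def search_transcript(transcript, term, context=30):
    """Return a list of (seconds, snippet) for each case-insensitive hit."""
    # Record the character offset and audio offset of every embedded marker.
    markers = [
        (m.start(), int(m.group(1)) * 3600 + int(m.group(2)) * 60 + int(m.group(3)))
        for m in MARKER.finditer(transcript)
    ]
    hits = []
    for hit in re.finditer(re.escape(term), transcript, flags=re.IGNORECASE):
        # The last marker at or before the hit gives the audio starting point.
        preceding = [seconds for pos, seconds in markers if pos <= hit.start()]
        seconds = preceding[-1] if preceding else 0
        # Surrounding text supplies the "keyword in context" snippet.
        start = max(0, hit.start() - context)
        end = min(len(transcript), hit.end() + context)
        hits.append((seconds, transcript[start:end]))
    return hits
```

With markers at one-minute intervals, each hit resolves to a point no more than a minute before the spoken passage. For an invented transcript such as `"[00:00:00] We bought the farm. [00:01:00] The first foal arrived."`, searching for "foal" would return the 60-second mark together with the surrounding text.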
At the University of Kentucky, we are exploring ways that we can implement user-generated tags as a part of the metadata structure of the oral history interviews and collections. Following the tagging methods used by Flickr and other online digital photograph-sharing sites, users could tag the content of oral history interviews, and the system would combine user-generated tags with the subject headings created by the archivist, empowering users to mark up subjects important to them. The resultant record represents both user- and archivist-generated metadata, enhancing the overall user experience of both search and discovery. Additionally, we are exploring ways that the oral history collection could more efficiently and automatically integrate with the rest of the digital collection. Of course, the oral history interviews are included (p. 298) in global searches of the KDL interface, yet I envision concepts and terms within the interview itself automatically linking to other related materials in the digital environment. If the name of a racehorse is mentioned in the oral history transcript, and the digital library contains digital photographs from the farm that bred the racehorse, we should offer the user the option to access these related materials while interacting with the oral history interview.
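A combined record of this kind might be modeled, very roughly, as a mapping from each subject term to the parties who supplied it. The sketch below is hypothetical throughout: the function name, the normalization rule (trimming and lowercasing), and the sample terms are illustrative assumptions, not a description of any system in use at the University of Kentucky.

```python
def merged_subjects(archivist_headings, user_tags):
    """Combine both vocabularies, recording who supplied each term."""
    record = {}
    for term in archivist_headings:
        # Normalize so that "Horse racing" and "horse racing" are one subject.
        record.setdefault(term.strip().lower(), set()).add("archivist")
    for term in user_tags:
        record.setdefault(term.strip().lower(), set()).add("user")
    return record
```

Keeping the source of each term visible allows an interface to search across both vocabularies while still distinguishing the archivist's controlled headings from user contributions.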
The digital revolution has produced sophisticated search, data mining, and query techniques that enable users to interact and engage with large quantities of oral history material in a comprehensive and integrated fashion. As voice recognition technologies mature, nontextual searching is, no doubt, on the horizon, again enhancing the discovery capabilities of the information seeker. The digital environment can bring together the different digital surrogates in the oral history information package for the user or researcher to experience in user-customizable ways, a digital environment where the user decides how to navigate the oral history collections. In many cases, online archival delivery systems seem to be designed by archivists and librarians for other archivists and librarians. Designers of digital (p. 299) archival systems need to balance the needs of users and the content, and create a user-centered system of information architecture. These same designers need to study the ways that users of oral history collections actually use them, then design the discovery, retrieval, and delivery tools that serve up our material accordingly. Users want search and retrieval systems that are as simple to use and as effective as Google. Users want intuitive browsing interfaces such as those found at Amazon.com, or Apple's iTunes. Social networking provides great opportunities for users to instantaneously share oral history resources with much wider audiences outside of the traditional archival context. More creative, user-friendly, and intuitive ways of presenting oral history materials can transform the ways in which oral history materials are used today, and it is our job to continuously refine the process of enhancing access to our oral history collections.
A tangible result of the “digital revolution” has been the economic accessibility and affordability of professional-level audio digitization and recording equipment. I offer assistance to numerous institutions and individuals in making the digital transition with regard to recording and both analog and digital preservation. Most major oral history programs with an archival focus have been digitizing their recorded materials for preservation purposes. Audiocassettes and reel-to-reel tapes that were recorded in the 1970s are reaching a critical time in their lifecycles. Because of rapid analog obsolescence, digitization has become the de facto means of audio preservation. However, oral history archivists face an impending preservation crisis that is not related to the preservation of degrading analog tape. As digital recording technologies have improved over the years, very little attention has been paid, until recently, to the long-term preservation of this digital content.
In the early stages of digitization efforts, oral history repositories were storing their digital content on recordable CDs, as we were told that this technology would last more than one hundred years. Although testing has shown gold CD-Rs manufactured with phthalocyanine dye to be more stable, CD-R methods of preservation are no longer considered long-term, with some studies estimating end-of-life to be just ten years, much shorter than that of the analog tape-based medium housing the original analog recording. In addition to problems with life expectancy, digital audio preservation efforts have the added problem of media and format obsolescence. In the face of analog obsolescence we are given little choice for preservation measures; yet, as we generate massive amounts of digital content, too little attention has been paid in the archive, until just recently, to adopting precise, scalable strategies for responsibly preserving the digital content we generate.
Since hard drives are relatively inexpensive, we often hear recommendations to use them to back up materials. Those with institutional support are encouraged (p. 300) to place their materials onto a server-based, networked environment, which is also regularly backed up. Merely creating a backup copy of the interview is not responsible digital preservation. Although making backup copies of interviews is critical for short-term protection of the interview, long-term preservation of recorded audio demands protections to ensure that the digital files will remain uncorrupted and be accessible in the future, from both the hardware and software perspectives. Technological innovation is exciting, and improvements to applications and formats often make our lives easier. Change, however, is not without consequence.
Format obsolescence is often difficult to imagine in the context of the present. The current standard formats such as .wav, .MP3 or .pdf are so entrenched in the commercial marketplace that it seems difficult to imagine a scenario wherein these formats would become unusable. Inevitably they will. It is vitally important that archivists and practitioners continue the dialog to refine strategies and standards. Nonproprietary formats often do not equate to popular formats, yet popular formats are not immune to dramatic change or eventual abandonment. Nevertheless, it is critical for keepers of digital formats to continue to monitor current “best practices” in the archival and preservation communities, for these communities have a great stake in the matter.
It is also important to store digital files in an environment where future migration can be an efficient and automated practice. Placing large amounts of data on hard drives or in a networked environment enables future users to “batch convert” formats. The user simply sets up and runs a simple conversion script that migrates each file to another format according to the parameters set up in the script. Individually uploading hundreds or thousands of interviews that were recorded onto individual CD-DA discs or stored as data files on individual CD-ROM discs will prove to be time-consuming and expensive. Automated format migration is much more efficient with large quantities of data files and relies on technology that is now quite affordable.
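A “batch convert” script of the kind described here can be quite short. The following sketch assumes that the freely available ffmpeg command-line tool is installed, and it uses WAV-to-FLAC purely as an illustrative pairing; neither the tool nor the target format is a recommendation drawn from this chapter, and the function names are hypothetical. The script first builds a plan that can be reviewed, then runs it.

```python
import subprocess
from pathlib import Path


def plan_migration(root, source_ext=".wav", target_ext=".flac"):
    """Return one ffmpeg command (as an argument list) per file to migrate."""
    commands = []
    for source in sorted(Path(root).rglob(f"*{source_ext}")):
        target = source.with_suffix(target_ext)
        if target.exists():
            continue  # already migrated; leave the existing copy alone
        # -n tells ffmpeg never to overwrite an existing output file.
        commands.append(["ffmpeg", "-n", "-i", str(source), str(target)])
    return commands


def run_migration(commands):
    """Execute the planned commands, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Separating the plan from its execution lets an archivist inspect exactly which files will be converted before anything is written, and the originals are never modified or deleted.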
Another very real and difficult variable in the digital preservation equation is maintaining data integrity. Digital files corrupt, and there is little use in inadvertently making a backup copy of a corrupt file. Still, as our hard drives begin to fill with greater and greater amounts of data, workflows and procedures where archival staff must, each year, critically listen to each recording in the collection in order to ensure continued data integrity are unrealistic. Automated methods are available for conducting such a daunting manual task, but again, these methods are usually employed in large, server-based digital archival environments. I cannot underscore enough the importance, now more than ever, of working closely with well-funded archival institutions that have implemented a responsible digital preservation plan.
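One common automated approach is a checksum-based “fixity” check: a digest is recorded for every file when it enters the archive, and the digests are periodically recomputed and compared. The sketch below uses SHA-256 from Python's standard library; the function names and the in-memory manifest are assumptions for illustration, and a production archive would store the manifest separately from the files it describes.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large audio files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def failed_fixity(root, manifest):
    """Return relative paths that are missing or no longer match."""
    root = Path(root)
    failures = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            failures.append(name)
    return failures
```

A check like this detects that a file has changed but cannot repair it. Repair requires a known-good copy held elsewhere, which is why fixity checking and replication are usually deployed together.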
Although digital video technologies have matured from a recording/capture standpoint, digital video still poses a seemingly insurmountable challenge from a preservation perspective. Like its analog predecessor, tape-based digital video is a fragile medium. Hard drive storage of large quantities of high-resolution digital video continues to be prohibitively expensive for most individuals and (p. 301) institutions. But this is rapidly changing. As compression algorithms improve for video, and data storage options become increasingly inexpensive, we will see a surge in the usage of digital video in the field and in the archive. Digital video cameras increasingly record straight to onboard hard drives or removable flash drives rather than to optical or tape-based media. Because of the combination of the large data footprint of high-resolution digital video and limited bandwidth availability, digital video is only now emerging as a potential factor in the access-oriented component of the digital revolution. Technologies on the horizon will enable more affordable and automated approaches to long-term storage of high quality digital video.
Following Hurricane Katrina, several universities in Alabama received a grant from the Institute of Museum and Library Services (IMLS) to set up a statewide digital preservation network. This network was server based and implemented a simple software package known as LOCKSS, a platform that was developed by a group out of Stanford University. LOCKSS literally stands for “Lots of Copies Keep Stuff Safe.” In this context the server application harvests digital items from one location and places exact copies of each item on five additional nodes. In our case, the “dark archive,” the highly restricted, preservation-oriented digital repository, involved the University of Alabama, University of Alabama–Birmingham, Troy University, Spring Hill College, and the University of North Alabama, and the project was directed by Auburn University. Copies of the University of Alabama's digital objects would be harvested by all of the other nodes in the network, just as the digital objects of other institutions would be harvested by the University of Alabama's node. LOCKSS then employs a voting system that continuously monitors data integrity. If one of the files begins to demonstrate corruption, the other five files replace the corrupt file with a fresh copy. Distributed digital preservation is an excellent model for the future with regard to large archival institutions with adequate financial investment. Unfortunately, it is not as scalable for the smaller institution. More than ever, it is important to partner with a capable oral history archive, thus placing the burden of digital preservation on those institutions that demonstrate a level of commitment to both analog and digital preservation. In the case of smaller institutions, collaborative partnerships should be set up to address preservation needs.
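The voting idea can be illustrated with a deliberately simplified sketch. The actual LOCKSS polling protocol is considerably more elaborate, and nothing below reproduces it; the function name and node labels are hypothetical. Each node's copy is hashed, the digest held by a strict majority is treated as authoritative, and any dissenting copy is replaced.

```python
import hashlib
from collections import Counter


def repair_by_majority(copies):
    """Given {node_name: bytes}, return (repaired_copies, repaired_nodes)."""
    if not copies:
        raise ValueError("no copies to compare")
    digests = {node: hashlib.sha256(data).hexdigest() for node, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) / 2:
        # Without a strict majority there is no trustworthy copy to restore from.
        raise ValueError("no majority; manual review needed")
    good = next(data for node, data in copies.items() if digests[node] == winner)
    repaired_nodes = [node for node in copies if digests[node] != winner]
    return {node: good for node in copies}, repaired_nodes
```

With six copies, as in the network described above, a single corrupted file is outvoted five to one and overwritten with the majority version. The sketch also shows why more copies make the scheme more robust: a majority must remain intact for any repair to be trusted.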
The digital revolution has only just begun. Digital technologies have opened new and exciting opportunities for recording, preserving, and disseminating oral history interviews, potentially changing, in dramatic fashion, the entire field of oral history. At the same time, migration to digital technologies poses challenges that oral historians and archivists must overcome. Oral history has always been bound to technology, and technologies will forever change. That puts us in the position of depending on technology to save us from technology. Too many historical resources have disappeared because of format degradation or technological obsolescence. The preservation of oral history materials demands our utmost attention in order to honor the stories that we have been so privileged to record.
(1.) J. Walter Fewkes, “A Contribution to Passamaquoddy Folk-Lore,” Journal of American Folklore 3, no. 11 (1890): 257–80.
(2.) John A. Lomax, “Field Experiences with Recording Machines,” Southern Folklore Quarterly 1, no. 2 (1937): 58–59.
(3.) Alan Merriam, “The Selection of Recording Equipment for Field Use,” Kroeber Anthropological Society Papers 10 (1954): 5–9.
(4.) Dale E. Treleven, “Oral Historians: Masters of or Slaves to Technology?” Oral History Review 12 (1984): 101–4.
(5.) Bit depth is a measurement of the number of bits used to encode the dynamic range of each sample: the greater the bit depth, the higher the resolution of the sample. Ken C. Pohlmann, Principles of Digital Audio (New York: McGraw-Hill, 2005).
(6.) Mary Larson, “Review of the Civil Rights Movement in Kentucky Oral History Project Digital Media Database,” Oral History Review 34, no. 1 (2007): 145–46.
(7.) Eric Weig, “Large Scale Digitization of Oral History: A Case Study,” D-Lib Magazine 13, no. 5/6 (2007): 1–9.