Becoming Digital: Preparing Historical Materials for the Web

Audio and Moving Images

Digitizing audio and moving image sources involves even more specialized technical issues, and our treatment of these questions must necessarily be more truncated. Even after twenty pages of detail, the NINCH guide concludes that “practice and technology are still evolving” and “it is difficult to draw general conclusions or to make firm recommendations on ideal storage, delivery and management of digitized audio and moving image materials.”47 As these are the most complex historical artifacts to digitize (film, for instance, is a rapid series of photographs with sound attached), this uncertainty should come as no surprise.

The hesitation of the NINCH guide highlights that audio/visual (A/V) digitizing presents some particular challenges that go well beyond those faced in most text and image projects. First, you need a device that can play back the original recordings, which are sometimes in obsolete formats, from wax cylinders to 8-track tapes.48 Second, you need a lot of digital storage space. One hour of uncompressed video can take up 146 gigabytes, as much space as 146,000 digitized books, or almost twice the size of the average public library. Third, you may have to do serious editing to make the digital objects usable for your audience. That could range from editing out background noises in an audio clip to making a two-hour oral history more manageable for a hurried listener. Fourth, you are likely to spend more time creating metadata for audio and moving images than for text or images. For the foreseeable future, we are unlikely to be able to search automatically through sound or moving images as we can with text. (New video searches from Google and Yahoo do not actually scan the video itself, but instead search the closed captioning text and other transcripts.) Unless you provide some descriptive metadata, your users will have no way of knowing what they will find in your online A/V collection except by spending hours listening or watching. Of course, a full transcript provides an excellent finding aid, but it can be very expensive to prepare.
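
To make the point about descriptive metadata concrete, here is a minimal sketch in Python of what a searchable record for a clip might look like. Everything in it is an assumption made for illustration: the field names are hypothetical rather than drawn from any metadata standard, and the two sample records are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ClipRecord:
    """Descriptive metadata for one audio or video clip (hypothetical fields)."""
    title: str
    speaker: str
    recorded: str   # recording date kept as plain text
    minutes: int    # running time, so users know the commitment before playing
    summary: str
    keywords: list = field(default_factory=list)

    def matches(self, term):
        """True if the term appears in any descriptive text field, ignoring case."""
        text = " ".join([self.title, self.speaker, self.summary, *self.keywords])
        return term.lower() in text.lower()


def search(records, term):
    """Return the records whose descriptive text mentions the term."""
    return [record for record in records if record.matches(term)]


# Invented sample records, for illustration only.
collection = [
    ClipRecord("Interview, part 1", "Jane Doe", "1974-03-12", 62,
               "Childhood on a farm; move to the city.",
               ["migration", "farming"]),
    ClipRecord("Interview, part 2", "Jane Doe", "1974-03-12", 58,
               "Factory work and union organizing.",
               ["labor", "strikes"]),
]

for hit in search(collection, "union"):
    print(hit.title, f"({hit.minutes} min)")
```

The search never touches the recording itself; it sees only the words you have typed into the record, which is why even a short summary and a handful of keywords can spare your users hours of listening.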

How do you get audio or moving images into digital form? The simplest procedure is to connect your playback device (e.g., the turntable or VCR) through its “A/V Out” port (or ports) to a digital audio or video capture device (e.g., a stand-alone CD or DVD burner). If you want more control over sound quality and more flexibility, however, you should place a computer workstation with a digital audio or video input card between the playback mechanism and the capture device. Sound editing programs improve and condense the sound file (e.g., by deleting unneeded parts of the audio spectrum) and help you to break the file into manageable pieces. The Library of Congress and MATRIX at Michigan State University, both of which have done extensive work with historical sound, strongly advise that you use professional quality equipment and avoid consumer multimedia sound cards, which are notorious for their poor reproduction of audio.49 A single weak link in the digitizing chain can have a deleterious effect on sound or video quality.

As in other areas of digitizing and digital preservation, experts lament the absence of agreed-upon standards, but audio digitizers seem to be settling on a sampling rate (the number of times per second that the unbroken analog sound wave is digitally captured, usually expressed in kHz) of 96 kHz, a bit depth (the measurement of how much data you have in each sample) of 24 bits, and the WAV (waveform audio) file format. Most commonly, as noted in Chapter 2, audio is compressed (using a program like Autodesk Cleaner) and then streamed (in Windows Media, RealMedia, or QuickTime formats) to make it possible for web surfers to begin listening while the clip is downloaded. But some sites provide direct downloads of MP3s (a powerful and widely accepted format for compressed audio) or WAV files.50
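
What these settings mean for storage can be worked out with simple arithmetic. The sketch below is an illustration, not a recommendation: it assumes uncompressed two-channel (stereo) audio, which the specifications above do not stipulate, and it compares the 96 kHz, 24-bit settings with the older compact disc settings of 44.1 kHz and 16 bits. The function name is our own.

```python
def pcm_bytes(sample_rate_hz, bit_depth, seconds, channels=2):
    """Size in bytes of uncompressed audio data (file header not counted)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


ONE_HOUR = 3600  # seconds

archival = pcm_bytes(96_000, 24, ONE_HOUR)     # 96 kHz, 24-bit, stereo
cd_quality = pcm_bytes(44_100, 16, ONE_HOUR)   # 44.1 kHz, 16-bit, stereo

print(f"96 kHz / 24-bit:   {archival / 1e9:.2f} GB per hour")
print(f"44.1 kHz / 16-bit: {cd_quality / 1e9:.2f} GB per hour")
print(f"ratio: {archival / cd_quality:.2f}x")
```

Under these assumptions an hour of sound at the higher settings occupies a little over 2 gigabytes, roughly three times what the same hour needs at the compact disc settings, which is one reason archival masters are kept separate from the compressed copies delivered over the web.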

Digitizing moving images is an even less settled practice than audio. The EVIA (Ethnomusicological Video for Instruction and Analysis) Digital Archive at Indiana University and the University of Michigan, which is undertaking a major digitization project, explains that “digital video is an arena in which no ready solutions or models exist.” They are first archiving existing videos on DigiBeta tapes, an approach also being followed by the Library of Congress, the SHOAH Visual History Foundation, and New York University’s Hemispheric Institute of Performance and Politics. Then, EVIA uses a high-end digital video encoder to compress the tapes in an MPEG-2 format that still consumes 27 gigabytes per hour, or 4 terabytes for the entire project. They acknowledge that, from a strictly archival point of view, no compression would be better, but that would require 146 gigabytes of storage space for an hour of video, an expensive proposition even in these days of cheap storage. Even Internet2 (an experimental higher speed version of the Internet) cannot effectively deliver the compressed files to users, and so EVIA will create additional versions: for example, a version at one-seventh the size for classroom projection or research and a dramatically smaller streaming version in RealMedia, QuickTime, or Windows Media.
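
The figures quoted above can be turned into a rough storage and bandwidth budget. The sketch below rests on two assumptions of ours, not EVIA's: that a gigabyte is one billion bytes (and a terabyte 1,000 gigabytes), and that the 4-terabyte total consists entirely of the 27-gigabyte-per-hour MPEG-2 files.

```python
UNCOMPRESSED_GB_PER_HOUR = 146   # uncompressed video, as quoted in the text
MPEG2_GB_PER_HOUR = 27           # EVIA's compressed archival copy
PROJECT_TB = 4                   # total quoted for the whole project


def mbit_per_second(gb_per_hour):
    """Convert gigabytes per hour into megabits per second (decimal units)."""
    bits_per_hour = gb_per_hour * 1e9 * 8
    return bits_per_hour / 3600 / 1e6


# Hours of video implied by the project total (assumes MPEG-2 files only).
hours = PROJECT_TB * 1000 / MPEG2_GB_PER_HOUR

# What the same footage would need with no compression at all.
uncompressed_tb = hours * UNCOMPRESSED_GB_PER_HOUR / 1000

# The "one-seventh the size" version for classroom projection or research.
classroom_gb_per_hour = MPEG2_GB_PER_HOUR / 7

print(f"hours of video in {PROJECT_TB} TB: about {hours:.0f}")
print(f"same footage uncompressed: about {uncompressed_tb:.1f} TB")
print(f"archival MPEG-2 data rate: {mbit_per_second(MPEG2_GB_PER_HOUR):.0f} Mbit/s")
print(f"uncompressed data rate: {mbit_per_second(UNCOMPRESSED_GB_PER_HOUR):.0f} Mbit/s")
print(f"classroom version: about {classroom_gb_per_hour:.1f} GB per hour")
```

Read this way, the archival copy streams at about 60 megabits per second, which helps explain why much smaller versions are needed for delivery over a network.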

As these complex specifications indicate, a large-scale video digitization project is not for faint hearts or shallow pockets. But that does not mean that you cannot relatively easily digitize some old video-taped interviews, using a video capture card on your computer and standard video software like Apple’s easy-to-learn iMovie, and then stream the results in one of the major streaming formats. It is even easier to incorporate interviews done with your digital camcorder into your website because no conversion from analog to digital is required. You won’t be in compliance with the most rigorous archival standards, but it can be a nice addition to your site.

47 NINCH Guide, 140. See also CDP Digital Audio Working Group, Digital Audio Best Practices, Version 1.2 (Denver: Colorado Digitization Project, 2003), ↪link 3.47a; MATRIX, “Audio Technology / A/D Conversion and Digital Audio Signal Transfer,” Oral History Tutorial, ↪link 3.47b.

48 NINCH Guide, 122–24.

49 CDP, Digital Audio Best Practices, 19–22; MATRIX, “Audio Technology”; Carl Fleischhauer, “The Library of Congress Digital Audio Preservation Prototyping Project” (paper presented at Sound Savings: Preserving Audio Collections, Austin, Texas, 24–26 July 2003), ↪link 3.49ab. For sound editing, Sound Forge and Cool Edit Pro are PC only and Bias Peak is Mac only; Pro Tools has versions for both platforms.

50 See, for example, Virginia Danielson, “Stating the Obvious: Lessons Learned Attempting Access to Archival Audio Collections,” in Folk Heritage Collections in Crisis, ed. Council on Library and Information Resources (Washington, D.C.: CLIR, 2001), ↪link 3.50a. Initially, sound digitizers were more likely to use 44 kHz and 16 bits, and some still see that as adequate. For 96 kHz, see Fleischhauer, “The Library of Congress Digital Audio Preservation Prototyping Project”; Michael Taft, “The Save Our Sounds Project” (paper presented at Sound Savings: Preserving Audio Collections, Austin, Texas, 24–26 July 2003), ↪link 3.50b; Bartek Plichta and Mark Kornbluh, “Digitizing Speech Recordings for Archival Purposes”; NINCH Guide, 126, 133; MATRIX, “Audio Technology.” Macintosh computers tend to use AIFF (Audio Interchange File Format) for audio, but WAV files (a format developed for Windows) generally also play on the Macintosh.