Becoming Digital: Preparing Historical Materials for the Web


In this chapter you will learn about:

  • The pros and cons of digitization
  • The different ways that text can be digitized
  • The benefits and costs of marking up text
  • How to make text machine readable
  • The ways to make images digital
  • How to digitize sound and moving images
  • Whether you should do all this work yourself

The past was analog. The future is digital. Tomorrow’s historians will glory in a largely digital historical record, which will transform the way they research, present, and even preserve the past. But what can today’s historians do with the massive analog historical record of the past millennia? This chapter provides some answers, offering advice on how to make the past (in the form of written records, photographs, oral history tapes, films, material culture, or other analog documents and artifacts) into digital files. In doing so, we can allow these resources to gain the significant advantages we outlined in the introduction, such as accessibility, flexibility, and manipulability.

An astonishing amount of the analog historical record has already become digital in the past decade. As we mentioned in Chapter 1, the Library of Congress’s American Memory project presents more than 8 million historical documents. ProQuest’s Historical Newspapers offers the full text of five major newspapers, including full runs of the New York Times and the Los Angeles Times. The Thomson Corporation’s 33-million-page Eighteenth Century Collections Online contains every significant English-language and foreign-language title printed in Great Britain in that period. Most dramatically, the search engine behemoth Google has announced plans to digitize at least 15 million books.1 Hundreds of millions in federal, foundation, and corporate dollars have gone into digitizing a startlingly large proportion of our cultural heritage.

This treasure of digital history presents an incredible boon to historians, offering possibilities for online research and teaching that would have been unimaginable just a few years ago. For the aspiring digital historian, it also imparts another benefit: the experience of the first generation of digitizers provides a set of benchmarks and approaches that will help you convert historical documents for your own site with greater efficiency, better reproductions, and, we would hope, a lower cost than the pioneers. But, as historians know all too well, the lessons of the past are never simple and straightforward. “There are no absolute rules,” observes a report from the Digital Library Forum. “Every project is unique.”2 This caution applies particularly to historians with small projects because the advice literature focuses on large-scale library- or archive-based projects. Your own goals and budget for organizing an exhibit for your local historical society or posting some documents for your students may be much more modest.

This chapter won’t turn you into an expert on scanning resolutions for photographs or sampling procedures for audio; rather, we want to give you enough information to plan intelligently and to oversee a digital history project. To do that, we offer you an overview of the basic whys, whats, hows, and whos in digitization: why it is useful (and when it isn’t), what digital formats you should consider producing, how it is actually done, and who should do it (you, your organization if you have one, or another person or company).

1 Barbara Quint, “Gale Group to Digitize Most 18th-Century English-Language Books, Doubles InfoTrac Holdings,” Information Today, Inc. (17 June 2002), ↪link 3.1a; Kinley Levack, “Digital ECCOs of the Eighteenth Century,” (November 2003), ↪link 3.1b; “Google’s Gigantic Library Project,” SPARC Open Access Newsletter, 81 (2 January 2005), ↪link 3.1c. An even more ambitious, multibillion-dollar digitization effort is proposed by the Digital Promise Project: Creating the Digital Opportunity Investment Trust (DO IT), A Proposal to Transform Learning and Training for the 21st Century (Washington, D.C., 2003).

2 Digital Library Forum, A Framework of Guidance for Building Good Digital Collections (Washington, D.C.: Institute of Museum and Library Services, 2001), ↪link 3.2a. This chapter is indebted to the many excellent digitization reports and handbooks, including especially Humanities Advanced Technology and Information Institute and National Initiative for a Networked Cultural Heritage, The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials (Washington, D.C.: National Initiative for a Networked Cultural Heritage, 2002), ↪link 3.2b (hereafter cited as NINCH Guide); and Maxine K. Sitts, ed., Handbook for Digital Projects: A Management Tool for Preservation and Access, 1st ed. (Andover, Mass.: Northeast Document Conservation Center, 2000), ↪link 3.2c. See also Western States Digital Standards Group Digital Imaging Working Group, Western States Digital Imaging Best Practices, Version 1.0 (University of Denver and the Colorado Digitization Program; Denver, 2003), ↪link 3.2d (hereafter Western States Digital Imaging); Alan Morrison, Michael Popham, and Karen Wikander, Creating and Documenting Electronic Texts: A Guide to Good Practice (London: Arts and Humanities Data Service, 2000), ↪link 3.2e (hereafter Creating and Documenting Electronic Texts); and the various documents listed at “Digital Library Standards and Practices,” Digital Library Federation, ↪link 3.2f.