Collecting History Online

Qualitative Concerns

Once you convince contributors to participate in your site, what do you actually want from them, and how can you ensure that what they submit is useful and authentic? Oral historians, ethnographers, and sociologists have carefully thought out sound and effective “instruments” (controlled and rigorous methods for collecting information).26 Although consulting this literature is worthwhile, the limitations of web forms and other collection mechanisms such as email work against the strict replication of these standards online. For example, the common method of repeating an important question several times, using a different phrasing each time, to ensure an accurate answer from a respondent sounds good in the abstract. As we discussed in Chapter 4, however, the size and resolution of most computer screens permit only a fairly limited amount of text and response space, thus requiring a great deal of scrolling (and frustration) as the number of questions proliferates. Historical surveys on the Internet–in the interest of attracting a large group of contributors–probably should not match the density and complexity of offline versions. And, in any case, social scientists would not consider them true “surveys” in the sense of a scientifically valid form for collecting information.

In your web form, discussion board prompt, or email exchange, be wary of asking for too much. As Don Dillman, a sociologist who has studied the effectiveness of web surveys, notes, “Survey designers try to get too much detail from respondents. The result is survey abandonment, which the Internet makes relatively easy.” Try to keep at least the initial entreaty short and as open-ended as possible–certainly fewer than ten questions, and probably better under five. We have found that some of the most effective online collection projects involve not much more than a call to “Share your story.” (Yes, most people are narcissistic and like to talk primarily about themselves and their experiences.) In the narrative that results from such open-ended questions, you will often find the answers to more specific questions that would have been far down a long and off-putting survey form. One of the online collecting efforts related to September 11, 2001 (see below), simply asked “Where were you?” and yet was able to collect a vast archive of rich first-hand narratives–not just of where, but of when, with whom, and how their diverse set of contributors experienced that day. Historians new to survey design tend to be too specific. Although you may think a detailed survey will get you exactly what you want, it may in fact be confusing to visitors who do not know as much about the broad sweep of your subject as you. In addition, asking questions in the authorial voice of a scholarly essay or book repels most potential contributors. Write questions in a more casual prose style.27

Finally, be flexible. Remember that one of the advantages of the web is its ease of revision: you can change wordings and collection formats at any time if things are not working out the way you envisioned. (Do be sure to save earlier versions so that future researchers can properly understand older sets of responses.) You should also be prepared to accept things you had not intended to collect. Sometimes contributors will want to give you materials you did not ask for or tales that seem unrelated to your focus. Always consider accepting these donations. The public’s generosity may surprise you, and it may enrich your project in ways you could not have anticipated.

Of course, along with great generosity sometimes comes undesired mischief. How can you be sure your contributors are who they say they are? How can you be sure their contributions aren’t faked, or taken from other sources? Concern about the falsification of digital historical documents and materials, we believe, has mostly turned out to be a phantom problem. We are not alone in this assessment. Newspaper websites, which rely on the registration information given by surfers to make money off of targeted advertisements, have found (much to their surprise) that relatively few people enter fake information, even though there are sometimes no checks against such subterfuge. In one study, the Philadelphia Inquirer discovered that only about 10 to 15 percent of their 300,000 registered users had entered bad email addresses (and some of those were merely by accident or due to technical difficulties), even though a person’s email address is among the most guarded possessions of the online world because a vast majority of people are worried about spam. Zip codes and other less problematic bits of personal information are falsified at an even lower rate.28

We think the nonprofit mission of online historical archives generally produces even higher rates of honesty. Most people who take the time to submit something to your project will share your goals and your interest in creating an accurate historical record. Rogues and hackers have more interesting things to do on the Internet than corrupt historical archives. But our best defenses against fraud are our traditional historical skills. Historians have always had to assess the reliability of their sources from internal and external clues. Not only have there been famous forgeries on paper, but written memoirs and traditional oral histories are filled with exaggerations and distortions. In the past as in the present, historians have had to look for evidence of internal consistency and weigh it against other sources. In any medium, sound research is the basis of sound scholarship.

Nevertheless, some technical methods can help double-check online contributions. As we explained in Chapter 2, every computer connected to the web has an Internet Protocol (IP) address. A small bit of programming code can capture this address and attach it to the other metadata associated with a contribution. If you are skeptical that a contribution has come from a specific person or location, a WHOIS search, which translates the numbers of an IP address into a semi-readable format that often includes a contributor’s Internet service provider and broad area of service, occasionally results in helpful information.29 Less cloak-and-dagger is a simple email or telephone follow-up with the person to thank them for their contribution; if the email bounces back or the phone number is incorrect, you should be more skeptical of the submission. Following up in this way also presents an opportunity to ask contributors if they might have any other documents or recollections, and whether they might know of others who can supplement your archive.
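The “small bit of programming code” mentioned above might look something like the following Python sketch. It is only an illustration under stated assumptions: the function name, the field names, and the idea that the web server hands your script the visitor's address as a string are all hypothetical, and a real site would take that value from whatever its own server software provides. The sketch simply checks that the address is well formed, notes whether it is a public address (private or local addresses say nothing about a contributor's location), and stores it alongside the rest of the submission.

```python
import ipaddress
from datetime import datetime, timezone


def build_contribution_record(story_text, contributor_email, remote_addr):
    """Bundle a submission with metadata captured at submission time.

    remote_addr is assumed to be the address string reported by the web
    server for this request (a hypothetical input for this sketch).
    """
    try:
        ip = ipaddress.ip_address(remote_addr)
        ip_text = str(ip)
        # Private, loopback, and reserved addresses are not useful for a
        # later WHOIS lookup, so record whether the address is public.
        ip_is_public = ip.is_global
    except ValueError:
        # The server gave us something that is not an IP address at all.
        ip_text = None
        ip_is_public = False

    return {
        "story": story_text.strip(),
        "email": contributor_email.strip().lower(),
        "ip_address": ip_text,
        "ip_is_public": ip_is_public,
        "received_at": datetime.now(timezone.utc).isoformat(),
    }
```

A record built this way keeps the address next to the story it arrived with, so that a later WHOIS search or email follow-up can be tied to a specific submission.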

A less obvious but perhaps more important measure of the “quality” of a historical collection created online becomes apparent when the collection is assessed as a whole rather than on the level of individual submissions. As in any collection, online or offline, a minority of striking contributions will stand out in a sea of dull or seemingly irrelevant entries. Historians who have browsed box after box in a paper archive trying to find key pieces of evidence for their research will know this principle well, and it should not come as a surprise that these grim percentages follow us into the digital realm. Yet as we also know, even a few well-written perspectives or telling archival images may form the basis of a new interpretation, or help to buttress an existing but partial understanding of a certain historical moment. At the same time, the greater size and diversity of online collections allow you more opportunities to look for common patterns. Why do certain types of stories recur? What does that tell you about both popular experience and the ways in which that experience gets transformed into memory?

Moreover, because of a digital collection’s superior manipulability compared to a physical collection, historians can search electronic documents in revealing and novel ways. On the web, the speed with which one can do this sort of analysis can enable both quick assessments of historical collections and more substantive investigations. For instance, when historian Michael Kazin used search tools to scan our September 11 Digital Archive for the frequency of words such as “patriotic” and “freedom,” he came to some important, if preliminary, conclusions about the American reaction to the terrorist attacks. Kazin discovered that fewer Americans than we might imagine saw September 11 in terms of nationalism, radical Islam versus the values of the West, or any other abstract framework. Instead, most saw the events in far more personal and local terms: the loss of a friend, the effect on a town or community, the impact on their family or job.30 The ultimate quality of a digital collection may have more to do with the forest than the trees, so to speak.
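For readers curious about what this kind of word-frequency scan involves, here is a minimal Python sketch. It is not the search tool Kazin used, whose details we do not describe here; the function name and the whole-word matching rule are our own assumptions for illustration. Given a list of submission texts, it counts how many submissions mention each term at least once, which is often more telling than a raw tally of occurrences because one enthusiastic contributor cannot inflate the result.

```python
import re
from collections import Counter


def term_document_counts(submissions, terms):
    """Count how many submissions mention each term at least once."""
    counts = Counter({term: 0 for term in terms})
    for text in submissions:
        # Reduce each submission to its set of lowercase words so that a
        # term repeated within one story is still counted only once.
        words = set(re.findall(r"[a-z']+", text.lower()))
        for term in terms:
            if term.lower() in words:
                counts[term] += 1
    return counts
```

Calling `term_document_counts(stories, ["patriotic", "freedom", "friend"])` on a list of story texts returns a tally per term. Because the match is on whole words, “freedoms” or “patriotism” would not be counted unless you add them to the list of terms.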

26 See R. Tourangeau, L. J. Rips, and K. Rasinski, The Psychology of Survey Response (New York: Cambridge University Press, 2000) for an overview of the subject. For more on web surveys from the social science perspective, see M. P. Couper, M. Traugott, and M. Lamias, “Web Survey Design and Administration,” Public Opinion Quarterly, 65, 2 (2001), 230–53, and M. P. Couper, “Web Surveys: A Review of Issues and Approaches,” Public Opinion Quarterly, 64, 4 (2000), 464–94. A full bibliography of survey design is available from the Laboratory for Automation Psychology and Decision Processing at the Human/Computer Interaction Laboratory at the University of Maryland, ↪link 6.26.

27 Quotation from Don A. Dillman, “Internet Surveys: Back to the Future,” The Evaluation Exchange, 10, 3 (2004), 6. See also Don A. Dillman, Mail and Internet Surveys: The Tailored Design Method (New York: Wiley and Sons, 2000), and related papers at ↪link 6.27.

28 “Web Newspaper Registration Stirs Debate,” 14 June 2004, ↪link 6.28a. Online collecting projects that focus on sensitive topics obviously may encounter more resistance to revealing accurate personal information. See R. Coomber, “Using the Internet for Survey Research,” Sociological Research Online, 2, 2 (1997), ↪link 6.28b.

29 The American Registry for Internet Numbers has a free IP lookup service at ↪link 6.29a. Non-U.S. domains (those with two-letter country codes at the end) can be located through ↪link 6.29b. Domains that end in .aero, .arpa, .biz, .com, .coop, .edu, .info, .int, .museum, .net, and .org can be located through the governing body for the web, the Internet Corporation for Assigned Names and Numbers (ICANN), at ↪link 6.29c. Several commercial services scan worldwide IP addresses, e.g., ↪link 6.29d, and Network Solutions, ↪link 6.29e.

30 Michael Kazin, “12/12 and 9/11: Tales of Power and Tales of Experience in Contemporary History,” History News Network, 11 September 2003, ↪link 6.30.