Readings for Feb. 7, 2006 — digitizing the past
The Digital History chapter, “Becoming Digital,” offers a concise and (mostly) user-friendly discussion of the issues of digitization. I must admit I skimmed the more technical sections, but it’s good to know where I can find a basic overview if I should need it in the future.
The discussions of potential methodological shifts resulting from metadata, tagging, and most simply computer-readable text were particularly interesting to me. I’m fascinated by language and linguistic patterns, and keyword searches could simplify my research. Since my previous work quantitatively analyzed rhetorical and voting patterns, I had to physically read and track keywords in the Congressional Record, the Public Papers of the President, and major newspapers. Midway through an early version of this project, I tried using digital search engines, but found that they rarely turned up articles and quotes that I knew existed. Consequently, I still don’t trust keyword searches for my research.
However, I will be the first to admit that such searches are excellent exploratory tools. If nothing results from a particular keyword search, it is unlikely that the search engine missed a treasure trove of information. Still, the chapter’s discussion of metadata pros and cons, while intriguing, gave me even more reasons not to trust keyword searches for my work. But a better understanding of the process has helped me see why keyword searches did not work for my earlier project and how they may work in future ones.
So, I’m curious. How much do you trust keyword searches? When do you use them in your research? If you’re interested in creating a digital archive, which metadata option(s) do you think will best fit your project? As a linguistically inclined historian, I’d love to have both fully searchable typed texts AND page scans with exhaustive keywords. But I know that’s too much to expect of small projects with limited resources. I’m still not sure what trade-off I’d make in their shoes.
I usually trust keyword searches, but I have had experiences similar to the one you describe: knowing that a phrase exists in the material but not being able to bring up the passage in a search. I think you sometimes have to jump through some hoops to make a search work, but there are times that you simply cannot read the entire book again to find the quote you’re looking for. So I may not fully trust keyword searches, but most of the time they are a helpful way to save time.
Comment by Rod — February 7, 2006 @ 12:14 am
To me, the *really* tricky question comes up when you start wanting to do powerful things beyond raw pattern matching (finding a given set of characters in the broader mass of texts); for example, if you wanted to be able to search an individual’s correspondence for keywords that appear in letters to a specific person, you’d have to have tagged each letter with a bit of metadata that details its recipient…to do the really powerful stuff, raw digitization is just the beginning…
Comment by Josh — February 7, 2006 @ 1:24 pm
Keyword searches generally work for me, as long as I enter the correct search term. This is where knowing the historical terms would come in handy. I keep running into this problem in my own research, and have since started a list of words that I need to try searching with later.
Comment by Linda — February 7, 2006 @ 3:37 pm