CHNM to undertake major text-mining study

We are delighted to announce that beginning this summer the Center for History and New Media will undertake a major two-year study of the potential of text-mining tools for historical (and by extension, humanities) scholarship. The project, entitled “Scholarship in the Age of Abundance: Enhancing Historical Research With Text-Mining and Analysis Tools,” has just received generous funding from the National Endowment for the Humanities.

In the last decade the library community and other providers of digital collections have created an incredibly rich digital archive of historical and cultural materials. Yet most scholars have not figured out how to take full advantage of the digitized riches suddenly available on their computers. Indeed, the abundance of digital documents has actually exacerbated the problems of some researchers, who now find themselves overwhelmed by the sheer quantity of available material. Meanwhile, some of the most profound insights lurking in these digital corpora remain locked up.

For some time computer scientists have been pursuing text mining as a solution to the problem of abundance, and there have even been a few attempts at bringing text-mining tools to the humanities (such as the MONK project). Yet there is less research than one might hope on what scholars without a technical background (especially historians) might actually want and use in their research, and on how we might integrate sophisticated text analysis into the workflow of these scholars.

The project will first conduct a survey of historians to examine closely their use of digital resources and to look for particularly helpful uses of digital technology. It will then explore three main areas where text mining might help in the research process: locating documents of interest in the sea of texts online; extracting and synthesizing information from these texts; and analyzing large-scale patterns across these texts. A focus group of historians will assess the efficacy of different methods of text mining and analysis in real-world research situations, so that the project can offer recommendations, and even some tools, for the most promising approaches.
