Appendix: Database Software, Scripting Languages, and XML
Most historians encounter simple databases when they use common software applications such as Access or FileMaker. Web databases differ from these “client,” or desktop, programs. They run silently in the “background” of a web server and respond to specialized requests through small pieces of programming code rather than through the point-and-click actions that occur on a personal computer screen. These invisible database operators permit quick and complicated “queries,” or instructions, from a web page.
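To make the idea of a “query” sent by a small piece of code more concrete, here is a minimal sketch in Python. It is an illustration only: it uses SQLite, a small database bundled with Python, as a stand-in for a server database, and the table name `documents`, the function `find_documents`, and the sample titles are all hypothetical.

```python
import sqlite3

def find_documents(conn, keyword):
    """Return (id, title) rows whose title contains the keyword."""
    cursor = conn.execute(
        "SELECT id, title FROM documents WHERE title LIKE ? ORDER BY id",
        (f"%{keyword}%",),  # the keyword is passed as a parameter, not pasted into the SQL
    )
    return cursor.fetchall()

# Hypothetical sample data held in memory; a real site would connect to its server database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO documents (title) VALUES (?)",
    [
        ("Letter from a Union soldier",),
        ("Ship manifest, 1892",),
        ("Letter to the editor",),
    ],
)

results = find_documents(conn, "Letter")
print(results)
```

A web page's search box would typically hand the visitor's keyword to a function of this kind, and the rows that come back would be formatted for display.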
Your budget and the other technologies you plan to use on your site will help you choose an appropriate database program. If you anticipate having more than 100,000 entries in the database, cannot tolerate any corruption or loss of information, and want a service-oriented company to turn to with problems (in other words, if you are in charge of a large project like Ellis Island’s website, which stores millions of names as well as associated documents), you may want to use software from Oracle, the high-end market leader, or similar products like DB2 from IBM. These expensive database products are often sold with technical support and cost thousands of dollars per year (or more) to buy and run.1 Even then, many large historical sites with some in-house technical capabilities may be able to use a free high-end alternative called PostgreSQL, which has most of the same features as the commercial software with none of the eye-popping cost.2 For example, the Eastern Illinois University students and professors behind the Coles County Legal History Project, which is cataloguing legal documents from the era of Abraham Lincoln, Esq., and making them searchable on the web, decided to move from Access to PostgreSQL as their collection and website grew rapidly over the years.3 PostgreSQL, like Oracle’s products, is “ACID” compliant. The acronym stands for atomicity, consistency, isolation, and durability, a checklist of guarantees that ensures data integrity during fast-paced, high-volume usage.
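The “all or nothing” guarantee at the heart of the ACID checklist can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a demonstration of PostgreSQL or Oracle themselves: it uses SQLite (bundled with Python) as a stand-in, and the table `names` and the helper `add_batch` are hypothetical. The point is that when one entry in a batch is invalid, the whole batch is rolled back rather than left half-entered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, surname TEXT NOT NULL)")
conn.commit()

def add_batch(conn, surnames):
    """Insert every surname in the batch, or none of them."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception is raised inside it.
        with conn:
            conn.executemany(
                "INSERT INTO names (surname) VALUES (?)",
                [(s,) for s in surnames],
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = add_batch(conn, ["Kowalski", "Rossi"])   # valid batch, stored
failed = add_batch(conn, ["Schmidt", None])   # None violates NOT NULL, so nothing is stored
count = conn.execute("SELECT COUNT(*) FROM names").fetchone()[0]
print(ok, failed, count)
```

After both calls the table holds only the two names from the first batch; “Schmidt” is not left behind from the failed one.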
For all but the most extensive historical archives, however, PostgreSQL, Oracle, DB2, and other robust databases are overkill, and many good alternatives for small- to medium-sized historical websites exist. Microsoft SQL Server sometimes comes bundled with Internet Information Services (IIS), Microsoft’s web server software (beginning at around $1,500), and in certain ($20,000 and up) versions it can handle as much information as any database (educational versions of SQL Server cost considerably less). In addition, SQL Server includes administrative software that maintains the look and feel of Windows and thus may feel more comfortable than other database packages (though you may not see this administrative software if you do not own or run the server it is on).
Microsoft’s low-end personal database, Access ($229 alone, and also available as part of the Office suite of programs), meant to run more on client computers than web servers, is inexpensive and can be pressed into service as a web database, but not without some software linkages that will require some technical knowledge. The University of Minnesota’s Immigration History Research Center uses Access to store information about its thousands of documents and images, although they convert the Access data into another format to make it searchable via the web.4
More recently, with the widespread adoption of open source software, a good alternative to all of the preceding options has emerged: MySQL, the leading free database.5 MySQL runs on virtually any type of server and ably handles tens or even hundreds of thousands of documents, as it does for our September 11 Digital Archive.6 Although perhaps not as robust as PostgreSQL or Oracle, MySQL is extremely capable for most of the tasks historians will ask of it (e.g., finding a specific document quickly) and is slowly gaining many of the high-end features and stability of its rivals. MySQL will likely continue to proliferate, given its undeniably attractive price and large base of users ready to help out others. Most commercial web hosts (see below) provide MySQL for customers who want to attach a database to their site.
But how do you access the database software lurking on the server, either to put materials in or to get them out to display on your website? In general, putting things in is easier because it can be done through various interfaces without programming. Many database programs come with, or allow for, web-based interfaces to enter data (for instance, the popular phpMyAdmin for MySQL), though you may find such interfaces lacking many of the features you are used to with programs running on your personal computer (like Access, Excel, and FileMaker). But such web-based interfaces allow entry from any computer with an Internet connection and a browser and hence make it easy to distribute data entry among several people. Another possibility is using special linking software to open a web server database on your personal computer within Access, Excel, or FileMaker. The Open Database Connectivity protocol (ODBC) enables different client and server database software to function seamlessly together in this way. If you are planning to enter your data only once, however, it may be easier just to record it as you would in one of the easy-to-use client programs (like Access) and then hand off a tab-delimited file (a text file with tabs separating each piece of information, which most database programs, such as Access, can easily create) to your web server administrator for ingest into a web database.
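What a server administrator might do with such a tab-delimited file can be sketched as follows. This is a minimal illustration, not a prescribed procedure: it again uses SQLite (bundled with Python) in place of a server database, and the column names `title` and `year`, the table `documents`, the function `load_tab_delimited`, and the two sample records are all hypothetical.

```python
import csv
import io
import sqlite3

def load_tab_delimited(conn, handle):
    """Read tab-delimited rows (first line = column names) and insert them."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = [(record["title"], record["year"]) for record in reader]
    with conn:  # commit all rows together, or roll back if any insert fails
        conn.executemany(
            "INSERT INTO documents (title, year) VALUES (?, ?)", rows
        )
    return len(rows)

# Stand-in for a file exported from a desktop program; with a real export
# you would pass open("export.txt", newline="", encoding="utf-8") instead.
sample = io.StringIO("title\tyear\nDeed of sale\t1854\nCourt summons\t1857\n")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
)
loaded = load_tab_delimited(conn, sample)
stored = conn.execute("SELECT title, year FROM documents ORDER BY id").fetchall()
print(loaded, stored)
```

Because the first line of the file names the columns, the person entering the data and the person importing it only need to agree on those names, not on the order of the columns.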