Blues Lyrics: A Post-Mortem
May 8th, 2006
My final project consists of a database of blues lyrics and a separate site that provides search hints as well as general information about the database and me. At present, the database contains the lyrics of 66 songs by 21 artists from Race recordings made in the 1920s, 1930s, and early 1940s. As I indicated on the About Blues Lyrics site, I am painfully aware that my project is still very much under construction. I found it difficult to leave my project alone this last week, but at least doing so has put me in a better position to assess the strengths and weaknesses of my work so far.
When I first set out to create a searchable collection of blues lyrics, my decision seemed relatively unproblematic. I have had a longstanding interest in blues, especially blues recordings from the 1920s, 1930s, and 1940s. Over the years, I have collected some 2000 blues songs from this era, read extensively on the topic, and written several papers that analyzed blues lyrics or employed them as historical evidence. For all these reasons, I did not have to conduct much additional research on my topic; instead, my research focused on whether there was a need for a searchable collection of blues lyrics and how I could go about creating one.
I soon became convinced that there was, in fact, a need for my project. Oddly, just as I began to consider creating a searchable collection of blues lyrics, two of my fellow graduate students told me that they had wanted an easy way to find blues lyrics on a certain topic. After searching the internet, I determined that a large, searchable collection of blues lyrics did not currently exist. There were, however, several large collections of blues lyrics (without a search function). These sites, most notably Harry’s Blues Lyrics, seemed to make my job easier, as I could simply copy the lyrics from other collections and add them to my own collection. By copying the lyrics in other collections, I did not have to transcribe lyrics from the recordings themselves.
Despite the ease with which I could gather lyrics, the process of collecting them posed some challenges right at the outset. I soon realized that even existing collections as large as Harry’s Blues Lyrics contain only a fraction of the blues songs from the 1920s, 1930s, and 1940s. (In fact, a few notable artists, such as Furry Lewis, Papa Charlie Jackson, and Leroy Carr, are missing from these collections.) Because none of these collections is exhaustive, I will eventually have to begin transcribing lyrics from the CDs I own. I also noticed that these collections lack metadata regarding the date and the location of the original recordings. I would have to supply this information myself, but often doing so involved more than simply looking at a CD’s liner notes. As I looked more closely at existing collections, I discovered that they contained a surprising number of errors. Whenever the lyric in an existing collection did not jibe with my memory of the lyric, I listened to the song and made the necessary corrections; nevertheless, by making use of other collections, I expect that I inadvertently introduced errors into my own collection.
An additional challenge arose from my desire to create a searchable collection, as the attempt to accurately reflect a singer’s pronunciation often runs counter to the kind of standardization needed to facilitate searches. Even though all the lyrics in my database came from other collections, I have edited most of these lyrics in order to better facilitate searches. Given my time constraints, I had to invent ways to resolve the tension between accuracy and standardization as I went along. I am not sure that I would make the same choices if I had had more time to deliberate, but at least my About Blues Lyrics site indicates the kinds of decisions I made. If I decide to stick with these decisions, this site will serve as a reminder of how to continue negotiating the conflict between accuracy and standardization. On the other hand, if I decide to invent new rules, the information on this site will help me undo the edits I have already made.
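To make the tension between accuracy and standardization concrete, here is a minimal Python sketch of one possible approach: keep the transcription as sung and search against a standardized copy. The spelling map, the function names, and the sample line are all invented for illustration; they are not the editing rules described on the About Blues Lyrics site.

```python
import re

# Hypothetical spelling map (an assumption, not the project's actual rules).
STANDARD_FORMS = {
    "goin'": "going",
    "mornin'": "morning",
}


def normalize_line(line: str) -> str:
    """Return a lowercase, standardized copy of a lyric line for searching."""
    words = re.findall(r"[a-z']+", line.lower())
    return " ".join(STANDARD_FORMS.get(word, word) for word in words)


def matches(query: str, line: str) -> bool:
    """Compare the standardized query against the standardized line."""
    return normalize_line(query) in normalize_line(line)


# Invented sample line: the stored text keeps the singer's pronunciation,
# while the search runs against the standardized form.
sung = "I'm goin' down to Georgia"
print(matches("going down", sung))
```

Under this arrangement the original transcription is never overwritten, so a later change to the rules would only require rebuilding the standardized copies.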
Aside from the challenges involved in collecting and editing lyrics, the choice of which search tool to use proved more difficult than I had anticipated. Early on, I regarded ProQuest, with its ability to perform Boolean searches across a defined number of categories, as a model for my collection. At first, I hoped that using XML or SQL would allow me to enable the type of searching possible on ProQuest; however, once I realized that learning either would take a considerable amount of time, I ruled both of them out. Greenstone Digital Library, which is one of the first open source search tools I encountered in this class, did not really enable the type of searching I wanted. Another tool, PhiloLogic, is closer to what I wanted, but installing it requires more direct access to a server (and more technical knowledge) than I currently have. Finally, Josh recommended using either Lazybase or Google Base.
After playing around with both, I ruled out using Google Base for a number of reasons. First, the people at Google review each record uploaded to Google Base before it is published, which means I would not have complete control over the content of my collection. Second, even though Google Base would allow visitors to search only the lyrics I uploaded, most visitors would probably search all the content on Google Base and only encounter links to the lyrics that I uploaded somewhere near the bottom of a large list of search results. Third, unlike Google itself, Google Base does not have any advanced search functions. Finally, any lyrics that I uploaded on Google Base would attain a relatively high profile, which means that getting angry messages from copyright holders would be all the more likely.
Given all of these considerations, I ultimately settled on using Lazybase. Fortunately, this decision resulted in several unintended benefits. I found that I could give my site a memorable public URL, which is lazybase.com/blueslyrics. I realized that I could create a number of interconnected databases on Lazybase. This aspect allowed me to define certain data types that link to artists (i.e., Gender, Genre, Region, and Song) and other data types that link to songs (i.e., Artist, Year, and Recording location). I also discovered that, even though Lazybase provides several advanced search functions, its quick search function is quite useful in that it displays search results by data type. For example, when I perform a quick search for the word “Georgia,” Lazybase provides a results page that, instead of listing seven undifferentiated results, shows one match in Region, five matches in Song, and one match in Artist. Because Lazybase groups search results by data type and links the various data types together, visitors should be able to find what they are looking for (and locate related entries) faster. These two aspects also make Lazybase similar to the Internet Movie Database.
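The idea of showing search results by data type can be sketched in a few lines of Python. This is only an illustration of the grouping behavior described above, not Lazybase's actual implementation; the record list, the song title, and the function name are made up.

```python
from collections import defaultdict

# Hypothetical records as (data type, entry name) pairs.
# The song title is a placeholder, not an entry from the real database.
RECORDS = [
    ("Region", "Georgia"),
    ("Region", "Mississippi"),
    ("Artist", "Blind Willie McTell"),
    ("Song", "Placeholder Georgia Song"),
]


def quick_search(term, records):
    """Group matching entries by data type instead of one flat list."""
    grouped = defaultdict(list)
    needle = term.lower()
    for data_type, name in records:
        if needle in name.lower():
            grouped[data_type].append(name)
    return dict(grouped)


for data_type, names in quick_search("Georgia", RECORDS).items():
    print(f"{data_type}: {len(names)} match(es)")
```

Grouping the matches this way is what lets a visitor tell at a glance whether "Georgia" turned up as a region, a song, or an artist.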
Lazybase, however, proved to be a bit inappropriate for my purposes. Clearly, it was not intended for containing lyrics or any other large piece of text. In Lazybase, text fields cannot contain more than a line of text, so, in order to display the lyrics of a song properly, I had to create text fields called Line 1, Line 2, Line 3, and so on. Then, instead of simply uploading or copying an entire song lyric to Lazybase, I had to copy and paste each line separately. Entering lyrics into Lazybase is thus a relatively time-consuming process, one that increases the number of opportunities for errors in the transfer of data. Additionally, when a visitor views All Songs, the display is unusually large and messy because there is no way to hide all the Line fields.
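The workaround of one field per lyric line could, in principle, be prepared mechanically before any copying and pasting begins. The following Python sketch assumes a lyric stored as plain text with one sung line per text line; the function name is hypothetical, and it says nothing about whether Lazybase offers any way to import such fields in bulk.

```python
def lyric_to_line_fields(lyric: str) -> dict[str, str]:
    """Map each non-empty lyric line to a field named Line 1, Line 2, ..."""
    lines = [line.strip() for line in lyric.splitlines() if line.strip()]
    return {f"Line {number}": text for number, text in enumerate(lines, start=1)}


# Placeholder text standing in for a real lyric.
fields = lyric_to_line_fields("first line\n\nsecond line\nthird line")
for name, text in fields.items():
    print(f"{name}: {text}")
```

Even if every field still has to be pasted by hand, generating the numbered list first would make it easier to check that no line was skipped or duplicated in the transfer.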
I also discovered that Lazybase was not really suited for any large, constantly growing collection. By default, it displays entries in the order they have been created. Although visitors can elect to view the full list of entries in each data type alphabetically or chronologically, the list of links related to one data type (e.g., all the Songs by the Artist “Blind Willie McTell”) will always appear in the order they were created. Unless I want to start my database all over and add each entry in a carefully planned fashion, each list of links will appear quite disorganized once the database is large enough to contain dozens of links related to each data type.
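The ordering problem is easy to state in code. In this Python sketch, the Link class, its fields, and the placeholder titles are assumptions made for illustration; the point is only that a tool storing the creation order could still present related links alphabetically at display time.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    created: int  # position in the order the entries were added


def by_creation(links):
    """The order described above: entries appear as they were created."""
    return sorted(links, key=lambda link: link.created)


def alphabetical(links):
    """The order a growing collection needs: sorted by title, ignoring case."""
    return sorted(links, key=lambda link: link.title.casefold())


# Placeholder titles, not entries from the real database.
links = [Link("Song C", 1), Link("song a", 2), Link("Song B", 3)]
print([link.title for link in by_creation(links)])
print([link.title for link in alphabetical(links)])
```

Sorting at display time would remove the need to re-enter every record in a carefully planned sequence.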
A final problem with Lazybase, one that I touched on when I mentioned that the Line fields cannot be hidden, is that there is little I can do to modify the appearance of Lazybase. Although appearance encompasses the aesthetic look of my site, I am more concerned about my inability to frame the site with explanatory information and search hints. While I can embed data from Lazybase in a table in another website, I cannot transfer Lazybase’s search functions to the same site. As a result, I had to create a separate site that provides explanatory information and links to Blues Lyrics. Unfortunately, it is very likely that a visitor might encounter my Blues Lyrics database without ever knowing there is a separate About Blues Lyrics site.
Now that I have thought more about this issue, I realize that I can add another data type to my Blues Lyrics database called “About Blues Lyrics.” This data type would consist of a single entry: a link to my About Blues Lyrics site. Even so, I expect that Lazybase may simply be ill-suited to my purposes. It was fun to play around with Lazybase, especially with its ability to interconnect databases, but I think I will have to look for another search tool as I continue my project. Hopefully, the next tool that I use will resemble Lazybase in terms of providing a personalized URL, interconnecting databases, and differentiating search results. At the same time, I want the tool to be better suited for displaying lyrics and supporting a large, constantly growing collection. An ideal tool would also allow me to house the search functions and the explanatory information within a single website.
With a little more time and money, I could perhaps learn SQL, rent some server space, and establish a domain name. I know there are reference books that contain the lyrics and recording information for the same recordings with which I am dealing; with more time, I would consult these books in order to help me transcribe lyrics, to obtain metadata, and to supplement my reliance on other web-based lyrics collections. Ideally, I would like my website to contain lyrics for more than 40 artists and 750 songs. I would also like to include a lengthier discussion (perhaps as long as this post-mortem) of the choices I made in transcribing lyrics, as well as my rationale for making these choices. (I had also considered providing links to other online lyrics collections, but, given the number of errors on these sites, I am not sure that I want to endorse using them.)
If I ever accomplish all of these goals, then I would consider supplementing the lyrics with explanatory footnotes, including biographies (or links to biographies) for each artist, and providing streaming audio for every song. These considerations, of course, are a long way off.