My rotation through the Research Division

I am not sure what I was expecting when the first year fellows were assigned to the Research division. I came with a preconceived notion of what Digital History research is and what historians do with it. It turned out that the scope of my understanding was actually quite limited. My time in the division has taught me a lot about the vast applications and possibilities of Digital History. We (the first year fellows) were given chances to get our hands dirty and it proved very rewarding. Sadly, this blog post marks the end of our rotation through the Research division.

Our first assignment was to PressForward. We started from the ground up by familiarizing ourselves with the project. We installed the plugin on the sandbox server and got to bang around on it. We explored the PressForward.org site as well as the digitalhumanitiesnow.org site. I must admit that my initial reaction was that PressForward was a glorified RSS reader with some added features for promoting articles. I use Feedly (an RSS reader) on my phone to follow various history blogs and, at first, I did not see a big difference between the two. It wasn’t until someone explained “gray literature” that the full purpose of PressForward came into view. Until that point, I had been ignorant of the issue of online scholarship. The PressForward site explains “gray literature” to be “conference papers, white papers, reports, scholarly blogs, and digital projects.” Online scholarship is being under-appreciated and forgotten in a discipline that has focused so heavily for so long on printed material. My assignment as an Editor-at-Large and then as an Editor-in-Chief brought this issue into focus for me.

Working as an Editor-at-Large and Editor-in-Chief really solidified the importance of PressForward. As Editors-at-Large, we worked through the live feed of articles and websites coming into Digital Humanities Now. I learned that it can be labor intensive to sift through the various websites and articles to find important, relevant material. It is not always easy to find the scholarship and pertinent information. I also learned firsthand about the limitations of the software. On a couple of occasions I fell victim to the browser’s back button, clicking it instead of closing a window. I then found myself back at the beginning of the feed instead of where I had been before I clicked on the article. After shadowing Amanda and Mandy when they were Editors-in-Chief, the first year fellows were able to make decisions on what would be published to DH Now. It was a very fun experience that helped me begin to grasp the extent of online scholarship and publishing. In addition, reading through the articles helped us stay informed about the various projects in the field. I even found articles that did not qualify for DH Now but were of interest to me. I bookmarked more than a handful that I wanted to return to later.

The second assignment was Programming Historian. While our time in PressForward gave us an overview of one of the projects, Programming Historian introduced us to the “nuts and bolts” of the division. It was here that my experience differed from Alyssa’s and Stephanie’s experiences. I came into this program with a background in computer programming. While I am not a computer science “person,” I did take classes as an undergraduate on C#, HTML, CSS, and JavaScript. I struggled at first with the syntax of Python, but my background in programming proved very helpful in picking up the language and quickly moving through the lessons. However, I found the lessons to be more focused on the task of the program (manipulating strings, working with web pages, etc.) than on learning the language itself. I think it would be beneficial for those without programming experience to work through the Python lessons at Codecademy before starting the Programming Historian lessons. I found the lessons to be very interesting and fun to do. I am excited to use these programs, such as frequency counts and n-grams, in my own research.
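For readers who have not met these two terms, here is a minimal sketch of what a frequency count and an n-gram look like in Python. It is my own illustration rather than the code from the Programming Historian lessons; the function names `word_frequencies` and `ngrams` and the sample sentence are assumptions made up for this example.

```python
from collections import Counter


def word_frequencies(words):
    """Count how many times each word appears in a list of words."""
    return Counter(words)


def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


# Hypothetical sample text, already lowercased and free of punctuation.
words = "the quick brown fox jumps over the lazy dog".split()

freqs = word_frequencies(words)
bigrams = ngrams(words, 2)

print(freqs.most_common(3))
print(bigrams[:3])
```

On a real source the word list would come from a cleaned document rather than a hard-coded sentence, and the counts only become interesting once common stopwords have been removed.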

The final part of Programming Historian was the lessons on APIs, more specifically the Zotero API. I had never used Zotero, so these lessons introduced me to both Zotero and the Zotero API. Before I began the lessons I played around on Zotero, starting my own library and learning to love the program. From the beginning, I wanted to use my personal library in the lessons and not the sample one provided. By doing this, Spencer and I found a problem in the lessons when my program couldn’t access my library. Alyssa has since reported it on GitHub, along with a problem she had. After finishing the API lessons, I wanted to do things that the lessons did not delve into. With help from Spencer I was able to bang around on the API in an attempt to add/edit the author field of an item. While we did not find a solution, we did make headway, and it really piqued my interest in working on the Zotero API.

I am leaving a much improved Digital Historian. The Research division had to help the first year fellows through a learning curve that, in some ways, Education and Public Projects don’t. We now know the Center and feel comfortable in it. We got our feet wet and our hands dirty. The Research division was a great place to do that.

Research Division Reflection

It’s hard to believe that the first year fellows have already completed our first rotation within a division. I was nervous to begin the fellowship in the Research Division, since I’m not super-technical (I was rightly told that I can no longer claim to not be a “technology person”), but I have had quite a learning experience. I learned new skills – I can now effectively explain to someone what a plugin actually does and how it works – and went out of my comfort zone in learning Python.

In our first week, we began with PressForward. After playing around with the sandbox site, I installed the PressForward plugin onto my dev site to get a better handle on how it worked. Once I was more comfortable with the logistics of the plugin, I moved on to working as an editor-at-large of Digital Humanities Now. It was incredibly interesting to see how the plugin can be used for academic purposes and how it aggregates and organizes content. I was astounded by the quantity of content in the “all content” feed, especially since a disproportionate number of the posts were not related to digital humanities.

In our second week, we shadowed Tuesday’s editors-in-chief, Amanda and Mandy, and watched them go through the process of examining the articles under review and deciding which pieces should be published. Prior to Thursday, I familiarized myself with the editors-at-large corner and read several editors’ choice articles. I especially enjoyed reading “Thoughts on feminism, digital humanities and women’s history,” since my area of research is women and gender. On Thursday we were editors-in-chief, which was such a fun experience.

It was beneficial to begin work with PressForward from the ground up. We started with the sandbox, moved on to seeing how the plugin worked for DH Now, and then used the plugin to publish an issue of DH Now. It is a fantastic tool for disseminating often overlooked material to a wide audience and for collecting and curating information. Overall, I had a positive experience with PressForward and DH Now.

After PressForward, we started learning Python through the Programming Historian lessons. I had minimal experience using HTML, CSS, and XML to create a website from scratch when I was in library school, but programming is not something I am comfortable with. At first Programming Historian was fairly easy and the first few lessons seemed straightforward, but once I got past the “Manipulating Strings in Python” lesson I started to feel lost. After completing those lessons I moved on to the Zotero API lessons. These were more difficult for me to comprehend, especially since, as Stephanie pointed out, they are not in layman’s terms. With help from Jordan and Spencer, I was able to get through the lessons using the sample Zotero library.

I cultivated my own Zotero library and then went back through the API lessons using it instead of the sample in order to see how much of the lessons I could understand on my own. I was able to get through the first two lessons, which was very exciting. I ran into some problems with the third lesson when TextWrangler was not reading the URLs from the first two items in my library. It worked when I used the sample library because those URLs link to simple HTML pages, but the links in my library point to more complicated sites, such as the source’s record in EBSCO. Jordan had discovered another problem earlier with the user and group tags, and I went into GitHub and reported both of our problems. I am excited to see how I will use Python in the future with other digital humanities projects.

It was an illuminating contrast to work with both PressForward and Python and to see how the latter influences the former. I can understand why we began in the Research Division since the technical skills we learned are necessary in order to have a solid foundation and understanding of digital history.

Digital Campus Podcast: Back to the Future of Digital Humanities

This past Friday, I co-produced a Digital Campus Podcast with help from my digital history mentor and second-year Digital History Fellow at the Center, Anne Ladyem McDivitt. Stephen Robertson hosted, and Dan Cohen, Amanda French, Mills Kelly, and Tom Scheinfeldt joined the discussion. Of particular interest to me was their debate about the use of Twitter as an academic outlet, how it has evolved, and the possibilities for the future. Will the academic community return to blogging or will Tumblr be the go-to platform?

Digital Campus Episode #106 – Back to the Future of Digital Humanities

Reflections on My First Year as a DH Fellow

This has been a very eventful and exciting first year for me working within the Center for History and New Media. As I mentioned in my introduction post, I came in with a vague familiarity with CHNM from my previous institution. However, being immersed within it helped me to form a much greater understanding of what the Center does, what the possibilities of Digital History are, and where I fit into that picture for my career and studies.

First, the inclusion of the DH Fellows into the different divisions throughout the year was extremely helpful for me. Through this process, I learned how the Center works, as well as the different projects that were available. I came out way more knowledgeable about Omeka, PressForward, and how projects such as Sea of Liberty come to life.

Beyond the actual projects, one of the primary benefits of being a DH Fellow was the establishment of communication and networks that I feel will be incredibly helpful for me going forward. With the various projects, such as our project that scraped THATCamp data (described in a five-part series of posts), we were able to turn to people within CHNM for assistance. That process was immensely helpful, as were the Digital Campus podcast sessions, which let me think through the implications of current events for digital history and gave me experience working with major players in the field.

In addition to working with people who are currently doing Digital History, I was also able to work with my peers in the Clio 1 and 2 classes. The DH Fellows, at the suggestion of Spencer Roberts, created the DH Support Space, which met every week. This support space was a great addition to the DH Fellowship, as it allowed us both to use the tools we had learned in class and at CHNM and to assist other students with issues they had. This let me grow my skills and learn in the process of trying to help other students with their assignments and projects.

Lastly, I believe one of the most useful aspects of my first year was the seminar. First, we researched what made a digital humanities center, and which ones are still around today. This then led into researching the history of the Center for History and New Media. Each of the first year DH Fellows was given a project to research, and I picked teachinghistory.org. This project let me dive into grants, documents, and how our projects played into the history of the Center. It was very helpful in learning where the Center fits into the larger context of Digital History, which is significant (and helpful!) for my minor field of Digital History.

I feel that overall, this entire year has been incredibly useful and helpful for me, and I have learned so much from working here. I look forward to assisting others this year, mentoring, and continuing to learn the process of doing Digital History through my assignment as a DH Fellow in my second year.

Reflections on the Spring Semester and Year 1 as a Digital History Fellow

It seems like just yesterday we walked into the Center for History and New Media a bit unsure about what our first year as DH fellows would entail. Looking back it has been an extremely rewarding and valuable experience. Last fall we blogged about our rotations in both the Education and Public Projects divisions. In the Spring we moved to Research for seven weeks where we worked on a programming project for THATCamp and on the PressForward project before moving onto a seminar about the history of CHNM. I want to use this blog post to reflect on the spring semester and look back at the year as a whole.

Our first stop during the spring semester was the Research division. We began our seven weeks by taking on a topic modeling project which aimed to mine all the posts from the individual THATCamp websites and blog about the process. As we used the Programming Historian to learn Python (or at least attempt to), we thought a lot about tools and the scholarly research process. We discussed Zotero as a tool and the values and community behind THATCamp as a training network and community for the Digital Humanities. Although we struggled with the programming aspect of this assignment and managed to miss important concepts behind topic modeling, the assignment gave us some insight into what kinds of challenges and opportunities topic modeling holds. From this project I learned firsthand the importance of understanding the black box behind Digital Humanities tools. After finishing our topic modeling project we moved on to the PressForward project. We spent a week working as Editors-at-Large and helped second year fellow Amanda Morton with her Editor-in-Chief duties. Thinking about scholarly gray literature and measuring the reception of scholarly works on the internet, we also spent time researching AltMetrics.

At the end of the three rotations we were left with a very clear understanding of each division, its current and past projects, the audiences it creates for, and the overlap between the divisions. We then began a seminar with Stephen Robertson that explored the history of RRCHNM. In this seminar we tried to understand how RRCHNM developed over the years into its current state and how RRCHNM fits into the larger history of the digital humanities. Beginning with an overview of what a Digital Humanities Center is and how it’s defined, we collaboratively looked at all 150 centers in the United States and tried to get a sense of the different models that exist and just how many actually fit the definition of a digital humanities “center” as defined by Zorich. What we realized is that the Center for History and New Media stands out from other Digital Humanities centers because of its unique attachment to the History Department, but also because of the origins of the center and Roy Rosenzweig’s vision.

After we defined just what a center was and looked at the different models, we started to look at the origins of RRCHNM and try to create a genealogy of the different projects and trace the development of the center. Each of the first year fellows took a different major project and traced its history through grant documents and reports. I read up on Zotero in its different iterations and learned a lot about how Zotero was originally conceived as well as how it has grown, expanded, and changed since 2004.

I think one of the things that has been immensely useful for the first year fellows is the way much of our work at the center was paralleled by our coursework. In the PhD program at GMU we’re required to take a two-course sequence in digital history. The first course focuses on the theory of Digital History and the second is largely a web design course that introduces us to the basics of HTML and CSS. Oftentimes the topics in Clio I related directly to what we were doing at the center, and the dual exposure allowed us to see the application of things we had discussed in Clio firsthand.

At the suggestion of Spencer Roberts, the fellows decided to begin a Digital History Support Space in the Fall. The support space offers “advice, guidance, and assistance for students doing digital history projects.” Every Monday from noon to 5pm (and sometimes even on weekends) we met with students taking the Clio courses, offered advice about and brainstormed potential projects, helped to debug code, and offered a space to work where help was available if needed. We were able to draw on experience from the center and offer advice about what kinds of tools are available and where resources might be found. We weren’t experts, but working with the other students in our Clio classes was equally beneficial for us. It left me with a better understanding of the issues, topics, and tools discussed in our classes. As many of the PhD students move on to Clio III: Programming for Historians with Lincoln Mullen this fall, I’m looking forward to continuing the Support Space.

The fellowship has been structured in such a way that each element has built on itself to provide us with experience and an understanding of digital history, digital humanities, and the debates, methodologies, and histories of the discipline. This fall I’ll be working in the Research Division on the PressForward project and helping to manage both Digital Humanities Now and the Journal of Digital Humanities. Our first year as Fellows has gone by extremely fast but I’m looking forward to beginning a new year and moving into the role of mentor to the new group of DH Fellows.

THATCamp Mallet Results

We have spent the last few weeks working to build a Python script that would allow us to download and prep all of the THATCamp blog posts for topic modeling in MALLET (for those catching up, we detailed this process in a series of previous posts). As our last post detailed, we encountered a few more complications than expected due to foreign languages in the corpus. After some discussion, we worked through these issues and were able to add stoplists to the script for German, French, and Spanish. Although this didn’t solve all of our issues and some terms do still show up (we didn’t realize there was Dutch too), it led to some interesting discussion about the methodology behind topic modeling. Finally, we were able to rerun the Python script with the new stopwords and then feed this new data into MALLET.


Unexpected Challenges Result in Important and Informative Discussions: a transparent discussion about stripping content and stopwords

As described in previous posts, the first year Digital Fellows at CHNM have been working on a project under the Research division that involves collecting, cleaning, and analyzing data from a corpus of THATCamp content. Having overcome the hurdles of writing a Python script and using MySQL to grab content from tables in the backend of a WordPress install, we moved on to the relatively straightforward process of running our stripped text files through MALLET.

As we opened the MALLET output files, excited to see the topic models it produced, we were confronted with a problem we had not anticipated, and this turned into a rather important discussion about data and meaning.


Pre-processing Text for MALLET

In our previous post, we described the process of writing a Python script that pulled from the THATCamp MySQL database. In this post, we will continue with this project and work to clean up the data we’ve collected and prepare it for some analysis. This process is known as “pre-processing.” After we ran our script against the THATCamp database, all of the posts were collected and saved as text files. At this stage, the files are filled with extraneous information relating to the structure of the posts. Most of this is tags and metadata that would disrupt any attempt to look across the dataset. Our task here was to clean the files up so they could be fed into MALLET. In order to do this, we needed to strip the HTML tags, remove punctuation, and remove common stopwords. We used chunks of code from the Programming Historian’s lesson on text analysis with Python and modified the code to work with the files we had already downloaded.
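To give a sense of what those three cleaning steps involve, here is a minimal sketch. It is not the script we actually ran, and it is not the Programming Historian code: it swaps in a crude regular expression for tag stripping, and the names `strip_tags`, `preprocess`, and `STOPWORDS`, along with the tiny stopword list and sample post, are assumptions made up for illustration.

```python
import re
import string

# Hypothetical, heavily abbreviated stopword list (illustration only).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for"}


def strip_tags(html):
    """Replace anything that looks like an HTML tag with a space.

    A regular expression is a rough stand-in here; it will not cope
    with every kind of markup found in real WordPress posts.
    """
    return re.sub(r"<[^>]+>", " ", html)


def preprocess(raw_html, stopwords=STOPWORDS):
    """Strip tags, lowercase, drop punctuation, and remove stopwords."""
    text = strip_tags(raw_html).lower()
    # Delete every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [word for word in text.split() if word not in stopwords]


# Hypothetical sample post body.
sample = "<p>The <b>future</b> of digital history, in a sentence.</p>"
cleaned = preprocess(sample)
print(" ".join(cleaned))
```

In practice the function would be applied to each downloaded text file in turn and the cleaned words written back out, one file per post, so that MALLET can import the whole directory.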


Extracting Data from the THATCamp Database Using Python and MySQL

This week we’ve continued to work on building a Python script that will extract all of the blog posts from the various THATCamp websites. As Jannelle described last week, our goal was to write a script that downloads the blog posts in plain text form and strips all of the HTML tags, stopwords, and punctuation so that we can feed the text into MALLET for topic modeling and text analysis. After several long days and a lot of help from second year fellow Spencer Roberts, we’ve successfully gotten the code to work.


Spring Semester in Research and a THATCamp Challenge

The spring semester is here and the first year DH fellows have begun our rotation into the Research division of CHNM.

To get the ball rolling, we spent a week working through the helpful tutorials at the Programming Historian. As someone new to DH, with admittedly limited technical skill and knowledge, these were immeasurably useful. Each tutorial breaks content into smaller, less intimidating units. These can be completed in succession or selected for a particular topic or skill. While there is useful content for anyone, we focused our attention on Python and Topic Modeling with the aim of solving our own programming dilemma.

Our central challenge was to extract content across the THATCamp WordPress site to enable us to do some text analysis.
