Reflections on My First Year as a DH Fellow

This has been a very eventful and exciting first year for me working within the Center for History and New Media. As I mentioned in my introduction post, I came in with a vague familiarity with CHNM from my previous institution. However, being immersed within it helped me to form a much greater understanding of what the Center does, what the possibilities of Digital History are, and where I fit into that picture for my career and studies.

First, the inclusion of the DH Fellows into the different divisions throughout the year was extremely helpful for me. Through this process, I learned how the Center works, as well as the different projects that were available. I came out way more knowledgeable about Omeka, PressForward, and how projects such as Sea of Liberty come to life.

Beyond the actual projects, one of the primary benefits of being a DH Fellow was the establishment of communication and networks that I feel will be incredibly helpful for me going forward. With the various projects, such as our project that scraped THATCamp data (those posts are here in five parts: one, two, three, four, five), we were able to communicate with people within CHNM for assistance. That process was immensely helpful, as were the Digital Campus podcast sessions, which let me think through the implications of current events for digital history and gave me the experience of working with major players in the field.

In addition to working with people who are currently doing Digital History, I was also able to work with my peers in the Clio 1 and 2 classes. The DH Fellows, at the suggestion of Spencer Roberts, created the DH Support Space, which met every week. This support space was a great addition to the DH Fellowship, as it allowed us both to use the tools we had learned in class and at CHNM and to assist other students with issues they had. This let me grow my skills and learn in the process of trying to help other students with their assignments and projects.

Lastly, I believe one of the most useful aspects of my first year was the seminar. First, we researched what made a digital humanities center, and which ones are still around today. This then led into researching the history of the Center for History and New Media. Each of the first year DH Fellows was given a project to research, and I picked teachinghistory.org. This project let me dive into grants, documents, and how our projects played into the history of the Center. It was very helpful in learning where the Center fits into the larger context of Digital History, which is significant (and helpful!) for my minor field of Digital History.

I feel that overall, this entire year has been incredibly useful and helpful for me, and I have learned so much from working here. I look forward to assisting others this year, mentoring, and continuing to learn the process of doing Digital History through my assignment as a DH Fellow in my second year.

Reflections on the Spring Semester and Year 1 as a Digital History Fellow

It seems like just yesterday we walked into the Center for History and New Media a bit unsure about what our first year as DH fellows would entail. Looking back, it has been an extremely rewarding and valuable experience. Last fall we blogged about our rotations in both the Education and Public Projects divisions. In the spring we moved to Research for seven weeks, where we worked on a programming project for THATCamp and on the PressForward project before moving on to a seminar about the history of CHNM. I want to use this blog post to reflect on the spring semester and look back at the year as a whole.

Our first stop during the spring semester was the Research division. We began our seven weeks by taking on a topic modeling project which aimed to mine all the posts from the individual THATCamp websites and blog about the process. As we used the Programming Historian to learn Python (or at least attempt to), we thought a lot about tools and the scholarly research process. We discussed Zotero as a tool and the values and community behind THATCamp as a training network for the Digital Humanities. Although we struggled with the programming aspect of this assignment and managed to miss important concepts behind topic modeling, the assignment gave us some insight into what kinds of challenges and opportunities topic modeling holds. From this project I learned firsthand the importance of understanding the black box behind Digital Humanities tools. After finishing our topic modeling project we moved on to the PressForward project. We spent a week working as Editors-at-Large and helped second year fellow Amanda Morton with her Editor-in-Chief duties. Thinking about scholarly gray literature and measuring the reception of scholarly works on the internet, we also spent time researching AltMetrics.

At the end of the three rotations we were left with a very clear understanding of each division, its current and past projects, the audiences it creates for, and the overlap between the divisions. We then began a seminar with Stephen Robertson that explored the history of RRCHNM. In this seminar we tried to understand how RRCHNM developed over the years into its current state and how RRCHNM fits into the larger history of the digital humanities. Beginning with an overview of what a Digital Humanities Center is and how it's defined, we collaboratively looked at all 150 centers in the United States and tried to get a sense of the different models that exist and just how many actually fit the definition of a digital humanities “center” as defined by Zorich. What we realized is that the Center for History and New Media stands out from other Digital Humanities centers due to its unique attachment to the History Department but also because of the origins of the center and because of Roy Rosenzweig’s vision.

After we defined just what a center was and looked at the different models, we started to look at the origins of RRCHNM, tried to create a genealogy of the different projects, and traced the development of the center. Each of the first year fellows took a different major project and traced its history through grant documents and reports. I read up on Zotero in its different iterations and learned a lot about how Zotero was originally conceived as well as how it has grown, expanded, and changed since 2004.

I think one of the things that has been immensely useful for the first year fellows is the way much of our work at the center was paralleled by our coursework. In the PhD program at GMU we’re required to take a two course sequence in digital history. The first course focuses on the theory of Digital History and the second is largely a web design course that introduces us to the basics of HTML and CSS. Oftentimes the topics in Clio I related directly to what we were doing at the center, and the dual exposure allowed us to see the application of things we had discussed in Clio firsthand.

At the suggestion of Spencer Roberts, the fellows decided to begin a Digital History Support Space in the fall. The support space offers “advice, guidance, and assistance for students doing digital history projects.” Every Monday from noon to 5pm (and sometimes even on weekends) we met with students taking the Clio courses, offered advice about and brainstormed potential projects, helped to debug code, and offered a space to work where help was available if needed. We were able to draw on experience from the center and offer advice about what kinds of tools are available and where resources might be found. We weren’t experts, but working with the other students in our Clio classes was equally beneficial for us. It left me with a better understanding of the issues, topics, and tools discussed in our classes. As many of the PhD students move on to Clio III: Programming for Historians with Lincoln Mullen this fall, I’m looking forward to continuing the Support Space.

The fellowship has been structured in such a way that each element has built on itself to provide us with experience and an understanding of digital history, digital humanities, and the debates, methodologies, and histories of the discipline. This fall I’ll be working in the Research Division on the PressForward project and helping to manage both Digital Humanities Now and the Journal of Digital Humanities. Our first year as Fellows has gone by extremely fast but I’m looking forward to beginning a new year and moving into the role of mentor to the new group of DH Fellows.

First-Year Review

Our spring semester as Fellows at the Center passed remarkably quickly (not solely a result of the frequent snow days but cancellations definitely contributed to the rapid approach of summer). We were kept very busy with projects for the Research division and an intensive DH Seminar this semester. Below I’ll briefly describe some of the activities we undertook throughout this period and reflect on my first year fellowship at CHNM.

The semester started with six weeks in the Research division – by far the most intimidating to someone that is new to DH. Quickly, however, we were put to work on several engaging projects and I found that I acclimated without feeling overwhelmed. We learned about PressForward by doing some user testing and improving the documentation for the plugin. We also were able to learn about the grant-writing process by doing some research for an upcoming project and we got a clearer idea of how plugins and tools are developed at the center. The majority of our time in this division was spent on the challenging task of using digital tools to uncover information about THATCamp. We blogged about the process of being set loose on the contents of THATCamp and the scraping and topic modeling we performed (those posts are available here). We shared these results in a center-wide presentation and received a lot of support and feedback for the project.

Across the semester the Fellows also focused time on providing support and assistance to other students. As many of us were also enrolled in Clio 2, we were visited by many of our classmates and our table was often filled with students collaborating on skills and resources. With assignments that required significant use of digital tools, we handled questions regarding Photoshop and Dreamweaver, sought new resources and tools, and helped find errors in HTML or CSS. I saw a huge benefit in working through problems and took a lot of inspiration from the advice and suggestions of everyone at the table.

Finally, our semester came to a close as we spent the last six weeks in a seminar with Dr. Stephen Robertson. The seminar built on the experiences within each department at the Center and, with this base of knowledge, asked us to turn our gaze outward at the digital humanities as a field and DH centers as centers of production. This discussion was also a timely one, as this fall CHNM will celebrate its 20th anniversary and the Center has begun to reflect on this period. We used Diane Zorich’s work on DH centers with readings by Mark Sample, Stephen Ramsay, Bethany Nowviskie, Neil Fraistat, Elijah Meeks and Trevor Owens, to frame our discussions and answer questions about where, when and how DH work has been done.

Using centerNet as a starting place, we tried to unpack a larger history of digital humanities labs and centers. This process raised interesting questions for us about the differences between a resource center, library service desk, institutional organization and brick-and-mortar DH center. Projects, staff, infrastructure, institutional support and audience were among the issues we considered, but we were also curious about how these locations are linked through shared resources, staff and projects.

Next we dug into the history of CHNM. Oral histories have been collected from participants at the center, but we soon realized that the overview these interviews provided would be only part of the picture of CHNM. In order to further unpack this history, we would need to dive into the projects themselves. Each of us examined a pivotal project. For me this was ECHO, a web portal for the history of science and technology. Working through grant materials enabled me to make connections between this early project and current/recent projects like Hurricane Digital Memory Bank, Zotero, and Omeka. Using ECHO as a vantage point, I gained greater insight into the transitions the Center has seen – from an emphasis on CD-ROMs and single-subject websites to building tools that enable us to organize, analyze, present, and use content in new ways. Understanding and unpacking this trajectory was very useful for me and a meaningful part of my semester.

Looking across my year at CHNM, I’m very happy with the time we spent in each division. Walking into the center can be an intimidating process. One has the immediate sense of entering a place where things happen, where goals are made, met, and exceeded. It was very hard to imagine my place in the midst of such an accomplished group of people. With my limited digital background, this was a year of learning, asking questions and digging up online tutorials. The Center has been a remarkable resource toward that goal. Cycling through each division exposed us to a variety of projects and workflows and I’ve learned a great deal through this process. Though each division responds to its own set of concerns and audiences, there is a definite cohesion to the work that is done. It has been remarkably informative to have played a small part in that process.

Planning the Wrap-Up

It’s been a long while since one of the DH Fellows wrote a post here, but I assure our readers that we’re not being lazy. Rather, we are busy with the daily goings-on of the center. Unfortunately, some of those activities are just not as blog-worthy as some of our previous activities. But as the term winds to an end, we’ll have some reflections on our winter term (or is it spring?).

Additionally, Cohort 1 (Amanda M., Ben, and Spencer) will be preparing a final report for their time in the fellowship. All of us will continue at the university and probably in the center, but our classification as DH Fellows officially ends at the close of this term. Our final reflections on the fellowship will help to identify its actual value from the perspective of its participants. Our views are important because the fellowship was proposed and implemented with certain values in mind, but those almost always change when the rubber hits the road. And who better to identify the worth of a training program than those who have been trained?

There is some difficulty, however, in our immediate future. The fellowship was proposed and implemented for three cohorts. In the fall, three new students will take the three final positions. The last cohort of the fellowship is also limited to one year of funding, after which they will depend on the department rather than the provost. It’s an awkward situation because those of us who were here at the beginning won’t officially be present at the end, and those at the end receive only a half portion of the fellowship’s peak output. (See note below)

The difficulty, then, is writing up a report from the perspective of guinea pigs that captures the success of the fellowship before it’s officially over. We all believe it was extremely valuable to the university, to the center, and to us. But how do we make that apparent to others? And how do we convince the new provost that another series of cohorts is a valuable investment? We’ll be tackling those questions in the next few weeks, and posting some of our conclusions here.

Note: I’m not suggesting that the third cohort is being short-changed. One year of funding under this program is better than no years of funding, and that’s just how it was designed. Furthermore, all PhD students in History at Mason receive at least three years of funding from the department. The fellowship is added to those years, so even one year extra is great.

THATCamp Mallet Results

We have spent the last few weeks working to build a Python script that would allow us to download and prep all of the THATCamp blog posts for topic modeling in MALLET (for those catching up, we detailed this process in a series of previous posts). As our last post detailed, we encountered a few more complications than expected due to non-English text in the corpus. After some discussion, we worked through these issues and were able to add stoplists to the script for German, French, and Spanish. Although this didn’t solve all of our issues and some terms do still show up (we didn’t realize there was Dutch too), it led to some interesting discussion about the methodology behind topic modeling. Finally we were able to rerun the Python script with the new stopwords and then feed this new data into MALLET.
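For readers who want to picture the stoplist step, here is a minimal sketch of how per-language stoplists might be merged and applied. It is not our actual script: the tiny word lists and the function names `combined_stoplist` and `remove_stopwords` are hypothetical and exist only for illustration, and a real run would load much longer lists from files.

```python
# Hypothetical sample stoplists; real lists are far longer and usually
# loaded from one text file per language.
STOPLISTS = {
    "english": {"the", "and", "of"},
    "german": {"der", "die", "und"},
    "french": {"le", "la", "et"},
    "spanish": {"el", "los", "y"},
}


def combined_stoplist(languages, stoplists=STOPLISTS):
    """Merge the stoplists for the requested languages into one set."""
    combined = set()
    for language in languages:
        combined |= stoplists[language]
    return combined


def remove_stopwords(words, stopwords):
    """Drop every word whose lowercase form appears in the stoplist."""
    return [word for word in words if word.lower() not in stopwords]


if __name__ == "__main__":
    stopwords = combined_stoplist(["english", "german", "french", "spanish"])
    # "het" is a Dutch article; no Dutch list was loaded, so it survives.
    words = ["het", "archief", "und", "the", "session"]
    print(remove_stopwords(words, stopwords))
```

One design consequence worth noticing: once the lists are merged, a word such as "die" (a German article) is also removed from English posts, which is part of why stopword choices are a methodological decision and not just a cleanup step.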

Unexpected Challenges Result in Important and Informative Discussions: a transparent discussion about stripping content and stopwords

As described in previous posts, the first year Digital Fellows at CHNM have been working on a project under the Research division that involves collecting, cleaning, and analyzing data from a corpus of THATCamp content. Having overcome the hurdles of writing a Python script and using MySQL to grab content from tables in the backend of a WordPress install, we moved on to the relatively straightforward process of running our stripped text files through MALLET.

As we opened the MALLET output files, excited to see the topic models it produced, we were confronted with a problem we didn’t reasonably anticipate and this turned into a rather important discussion about data and meaning.

Pre-processing Text for MALLET

In our previous post, we described the process of writing a Python script that pulled from the THATCamp MySQL database. In this post, we will continue with this project and work to clean up the data we’ve collected and prepare it for some analysis. This process is known as “pre-processing”. After we ran our script against the THATCamp database, all of the posts were collected and saved as text files. At this stage, the files are filled with extraneous information relating to the structure of the posts. Most of this is tags and metadata that would disrupt any attempt to look across the dataset. Our task here was to clean the files up so they could be fed into MALLET. In order to do this, we needed to strip the HTML tags, remove punctuation, and remove common stopwords. To do this, we used chunks of code from the Programming Historian’s lesson on text analysis with Python and modified the code to work with the files we had already downloaded.
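The three cleaning steps named above (strip tags, remove punctuation, remove stopwords) can be sketched roughly as follows. This is a simplified illustration and not the Programming Historian code we adapted: the regular expression, the short `STOPWORDS` set, and the function names are assumptions made for the example.

```python
import html
import re
import string

# Hypothetical, very short stopword list for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "we"}

# Matches anything that looks like an HTML tag, e.g. <p> or </em>.
TAG_RE = re.compile(r"<[^>]+>")

# Maps every ASCII punctuation character to a space so words stay separated.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def strip_tags(raw):
    """Replace HTML tags with spaces and decode entities such as &amp;."""
    return html.unescape(TAG_RE.sub(" ", raw))


def preprocess(raw, stopwords=STOPWORDS):
    """Return lowercase words with tags, punctuation, and stopwords removed."""
    text = strip_tags(raw).lower()
    text = text.translate(PUNCT_TABLE)
    return [word for word in text.split() if word not in stopwords]


if __name__ == "__main__":
    sample = "<p>We proposed a session on <em>topic modeling</em> &amp; the archive.</p>"
    print(" ".join(preprocess(sample)))
```

Note that `string.punctuation` covers only ASCII characters, so curly quotes and similar marks copied from blog posts would need extra handling in a real run.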

Extracting Data from the THATCamp Database Using Python and MySQL

This week we’ve continued to work on building a Python script that will extract all of the blog posts from the various THATCamp websites. As Jannelle described last week, our goal was to write a script that downloads the blog posts in plain text form and strips all of the HTML tags, stopwords, and punctuation so that we can feed it into MALLET for topic modeling and text analysis. After several long days and a lot of help from second year fellow Spencer Roberts, we’ve successfully gotten the code to work.
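To give a sense of the extraction step, here is a rough sketch of pulling published posts out of a WordPress-style posts table and saving each one as a text file. It is an illustration under stated assumptions, not our script: it uses an in-memory SQLite database as a stand-in for the MySQL server so that it runs anywhere, it assumes a single table named `wp_posts` (a multisite install would have one posts table per site), and the function names and sample rows are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path

# Standard WordPress column names; only published blog posts are selected.
QUERY = """
    SELECT ID, post_content
    FROM wp_posts
    WHERE post_status = 'publish' AND post_type = 'post'
    ORDER BY ID
"""


def export_posts(connection, out_dir):
    """Write each published post to out_dir/post_<ID>.txt and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    cursor = connection.cursor()
    cursor.execute(QUERY)
    for post_id, content in cursor.fetchall():
        target = out_path / f"post_{post_id}.txt"
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written


def demo_connection():
    """Build a small in-memory stand-in for the real database (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wp_posts ("
        "ID INTEGER PRIMARY KEY, post_content TEXT, "
        "post_status TEXT, post_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO wp_posts VALUES (?, ?, ?, ?)",
        [
            (1, "<p>Session proposal</p>", "publish", "post"),
            (2, "Draft notes", "draft", "post"),
            (3, "Schedule update", "publish", "post"),
            (4, "About", "publish", "page"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in export_posts(demo_connection(), tmp):
            print(path.name)
```

Against a real MySQL server the connection object would come from a MySQL driver instead of `sqlite3`, but the query-then-write loop would look much the same.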

Spring Semester in Research and a THATCamp Challenge

The spring semester is here and the first year DH fellows have begun our rotation into the Research division of CHNM.

To get the ball rolling, we spent a week working through the helpful tutorials at the Programming Historian. As someone new to DH, with admittedly limited technical skill and knowledge, these were immeasurably useful. Each tutorial breaks content into smaller, less intimidating units. These can be completed in succession or selected for a particular topic or skill. While there is useful content for anyone, we focused our attention on Python and Topic Modeling with the aim of solving our own programming dilemma.

Our central challenge was to extract content across the THATCamp WordPress site to enable us to do some text analysis.

Public Projects: Reflection

Our first semester at the Center for History and New Media has flown by. We spent the second half of the semester in the Public Projects Division which was a diverse and rewarding experience.

During this rotation we were able to tour the entire division and spend some time working with many of the division’s projects. We spent a large chunk of time working with Omeka, testing plugins, themes, and other items that are in development. One thing I took away from working with the Omeka team and attending the Sprint Planning meetings is how collaborative this division, and the center as a whole, is. Between programmers, designers, testers, and content developers, Omeka really is a team project that seeks to make collecting easier for museums and archives. Through working with the software we also got some hands-on experience with the amount of work it takes to build an archive and what kinds of issues come up when doing so. We discussed and experienced firsthand issues such as the naming of pages and areas on a site, creating a strict vocabulary to make searching consistent, and developing content.

We also spent time developing content for projects such as The Histories of the National Mall and Papers of the War Department. The National Mall project allowed us to think about how the public utilizes mobile history sites when at a museum or a national park such as the Mall. We spent a wonderful afternoon down on the Mall testing the mobile-first site (and enjoyed some excellent tacos from the local food truck!).

Papers of the War Department was a different experience and we spent time both transcribing documents and tagging metadata for documents. Using the Scripto plugin for Omeka, we first tagged documents with key words, names, places, and topics. This element of the project required some background knowledge and a deeper engagement with the documents than transcribing did. Transcribing the documents was challenging (eighteenth-century handwriting is interesting) but we could all see the immense benefit of having the documents both transcribed and tagged on the site.

I think we are really beginning to understand the inner workings of the center and the projects and goals of each division. Public Projects does several different things, from software development to content-based projects, and I think we all benefited greatly from our tour around the division. Coincidentally, the first year fellows were also taking Clio Wired I this semester, and often what we did at the center overlapped with what we did in class, making the experience even more valuable for us. I think we all came away from this semester having learned a great deal, and I feel much more aware of many of the issues facing scholars in Digital History centers as well as in academia in general.