Digital History as a Research Methodology

In the process of researching and writing my thesis I turned to the emerging methodology of digital history, and I would like to document here the process of doing digital work and the future possibilities for digital history.  Framing Red Power sought to use digital technology to investigate and analyze the interaction between media and politics, using the Trail of Broken Treaties as a lens for understanding the complex connections between political movements and mass media.  The American Indian Movement was well known for its propensity for grabbing headlines.  The occupation of Wounded Knee in 1973 is highly significant and symbolic, but it has already received substantial attention from scholars.  The Trail of Broken Treaties, on the other hand, brought AIM its first sustained media coverage of the sort it wanted.  The language and rhetoric deployed by the media and by AIM, the narrative frameworks, and public discourse both shaped and were shaped by the Trail of Broken Treaties and the context of the early 1970s.  At its core, my thesis attempted to shed light on the broader interaction between media and politics, and how this complex relationship shapes political culture and ideology.

The advent of digital technologies is changing and challenging the ways historians practice their craft.  The way we collect, present, and store information has changed rapidly in the last twenty years.  Digital history is several things: a methodology meant to aid the traditional art and practice of historians; the use of digital tools to gain insights that cannot be reached with a legal pad and pen; a means of disseminating and presenting information in new ways; and a way to reach wide audiences through nearly ubiquitous digital exposure.  The digital medium is a versatile environment that allows historical data to be presented in several formats (images, sound, moving pictures) and manipulated in dynamic ways (textual analysis, text mining, GIS maps).  The power of digital tools allows historians to pose new questions about historical problems.  The goal is not simply an archive of material that drives no argument; historians must interpret and analyze material, not simply digitize sources and place them online.  Nor is the goal cliometrics 2.0 or to augment the theory-driven social sciences, but to abide by the historian’s commitment to complexity and nuance while using digital technologies to aid that task.

Digital history is both a new practice and an old art in two ways.  First, the process of becoming an historian has shifted.  The skepticism of digital history is understandable because the profession has developed its own hierarchy–a clear path for how we “do history.”  Dan Cohen, a key scholar in digital humanities, warns that digital historians

need to recognize that the digital humanities represent a scary, rule-breaking, swashbuckling movement for many historians and other scholars.  We must remember that these scholars have had–for generations and still in today’s graduate schools–a very clear path for how they do their work, publish, and get rewarded.  Visit archive; do careful reading; find examples in documents; conceptualize and analyze; write monograph; get tenure.  We threaten all of this.  For every time we focus on text mining and pattern recognition, traditionalists can point to the successes of close reading–on the power of a single word.  We propose new methods of research when the old ones don’t seem broken.

Second, how historians do their work has changed.  Rather than spend hours in an archive, a historian can run a Google search or visit the Library of Congress’s American Memory website and have a wealth of information seconds away.

Because the nearly unlimited digital space of the Internet can hold an incomprehensible amount of information, any foray into digital history necessarily begins with a clear outline to guide the research, or it risks becoming an unending project that merely seeks to digitize primary sources.  The research question, like that of any piece of scholarship, requires boundaries and a clear purpose.  The design process entails several important considerations for digital historians: the purpose of the research project, the tools that will be offered, the argument that drives the research, the technologies that are available, and the limitations placed on available sources.  I made the choice to limit my project to national news sources for two reasons.  First, editorial and time constraints forced me to make choices about what would be included in my analysis.  Second, national media had wide audiences that responded in different ways.  My original plan was to analyze major newspapers only and focus on the national (New York Times and Washington Post), regional (Minneapolis Tribune and Chicago Tribune), and local (Sioux Falls Argus Leader and the Rapid City Journal) levels, but there was a striking lack of news coverage at the regional and local levels.  The Trail of Broken Treaties grabbed headlines in the national press to a much greater extent than in local news outlets.  I compiled sources by searching online repositories like ProQuest and hunting down microfilm for anything not digitized.

The next task once materials were located was the longest process in the digital project: the transcription and mark-up of the media reports and editorials.  The process used encoding called eXtensible Markup Language (XML), an encoding standard designed for sharing and structuring data on the web that allows users to define mark-up elements.  XML allowed me to define elements behind newspaper articles and preserve the original text.  For example, a section of one of the transcribed newspapers might be encoded:

<p><name type="person" key="Adams, Hank">Adams</name> does not dismiss as unimportant the $2 million damage to the BIA building, but he does not like to dwell on it.  He largely regrets it because the publicity &quot;has diverted attention&quot; from the original purpose of the protest, which was to urge changes in the handling of Indian affairs.</p>

<p>He takes the same stance regarding criminal records of some of the movement's leaders.</p>
<p>(Newspaper clippings in Minneapolis indicate that <name type="person" key="Bellecourt, Vernon">Vernon Bellecourt</name> was convicted of a tavern holdup in <name type="place" key="Terre Haute, Indiana" reg="Terre Haute, Indiana">Terre Haute, Ind.</name>, in 1951, and of a holdup in <name type="place" reg="St. Paul, Minnesota" key="St. Paul, Minnesota">St. Paul</name> as a juvenile.  <name type="person" key="Bellecourt, Clyde">Clyde Bellecourt</name> and <name type="person" key="Banks, Dennis">Banks</name> have been arrested several times in connection with protests.)</p>

In the above example, elements in the primary text received tags that are invisible to the reader of a document on the project site but make the text machine-readable.  Editorial decisions about tagging elements have little impact on the text itself.  In this case, <p> tags define paragraphs, <name> tags identify specific people and places, and so forth.
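
To illustrate what "machine-readable" can mean in practice, here is a minimal Python sketch. It is not the code used for the project. It assumes the encoded paragraphs sit inside a single wrapper element (called <text> here, a hypothetical name chosen only so the fragment has one root) and uses the standard library's xml.etree.ElementTree module to list the tagged people and places and to recover the plain text a reader would see.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A shortened version of the encoded excerpt above. The <text> wrapper is
# an assumption added so the fragment has a single root element.
SAMPLE = """<text>
<p><name type="person" key="Adams, Hank">Adams</name> does not dismiss as unimportant the $2 million damage to the BIA building.</p>
<p><name type="person" key="Bellecourt, Vernon">Vernon Bellecourt</name> was convicted of a tavern holdup in <name type="place" key="Terre Haute, Indiana" reg="Terre Haute, Indiana">Terre Haute, Ind.</name>, in 1951.</p>
</text>"""


def names_by_type(xml_string):
    """Group the key attribute of every <name> element by its type attribute."""
    root = ET.fromstring(xml_string)
    grouped = defaultdict(list)
    for element in root.iter("name"):
        kind = element.get("type", "unknown")
        # Fall back to the visible text when no key attribute was encoded.
        key = element.get("key") or (element.text or "").strip()
        if key not in grouped[kind]:
            grouped[kind].append(key)
    return dict(grouped)


def plain_text(xml_string):
    """Return the reader-facing text of each paragraph with the tags removed."""
    root = ET.fromstring(xml_string)
    return ["".join(p.itertext()).strip() for p in root.iter("p")]


if __name__ == "__main__":
    print(names_by_type(SAMPLE))
    print(plain_text(SAMPLE)[0])
```

Because the names are normalized through the key attribute, "Adams" and "Vernon Bellecourt" can be indexed as "Adams, Hank" and "Bellecourt, Vernon" without altering the transcribed text.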

XML also allows digital historians to develop sustainable projects for the future.  Since the tags within XML documents are “intelligent,” both machines and humans can easily identify markup text.  Furthermore, any changes in the design of the site displaying XML documents can easily maintain the integrity of encoded material since XML is displayed through stylesheets.  Digital resources, suggests Abby Smith, are best “facilitating access to information and weakest when assigned the traditional library responsibility of preservation.”  “The real challenge,” she continues, “is how to make those analog materials more accessible using the powerful tool of digital technology.”  She raises important concerns about digital collections, arguing that digitizing information should not displace analog.  Technology can become obsolete and render older versions of digital resources inaccessible or, at the least, cumbersome to access.  However, this should not discourage historians from entering the digital realm.  Newer technologies and languages have made storing data easier, but digital historians should think about the available technology when developing their projects to ensure a structured and sustainable project.

With a corpus of digitized newspaper articles in hand, I next integrated digital tools that can assist historians in analyzing material.  Digital technologies are not an end in and of themselves but rather a method for querying and analyzing material in new ways.  Since I was analyzing language I turned to textual analysis tools, specifically TokenX, a powerful textual analysis tool developed by Brian Pytlik-Zillig at the Center for Digital Research at the University of Nebraska-Lincoln.  I also used a free, web-based service called Wordle that allowed me to generate word clouds from my digitized newspapers.  Developing visual representations of the newspaper articles allows historians to spot themes in text that might otherwise be hidden.

The word clouds appeared to reveal a focus on Indians, the BIA building, and the federal government far more than on what the American Indian activists had to say or why they were leading a demonstration.  The issues that AIM wanted to call attention to during the demonstration, such as treaty rights or tribal sovereignty, are lost in a narrative more interested in the federal response.  The word cloud itself reveals little without interpretation.  The question of why the press focused on the building and government rather than the purpose of the Trail led to my argument that the press shaped its narrative around the issues of law and order and a discussion of American Indian politics.  Tools like word clouds help to highlight the frequency of words in a text, a process impossible (or nearly so) to achieve in print, and reveal ways we can visualize narratives and analyze their significance.
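
A word cloud is, at bottom, a picture of word counts. The following Python sketch shows that counting step on one paragraph from the encoded excerpt above. It does not reproduce how Wordle or TokenX work internally; the tokenizing rule and the short stopword list are assumptions made for illustration, and a real analysis would run over the whole corpus with a fuller list.

```python
import re
from collections import Counter

# A small illustrative stopword list (an assumption; real tools ship
# much longer lists of common words to ignore).
STOPWORDS = {
    "a", "an", "and", "as", "because", "but", "does", "from", "has", "he",
    "in", "it", "not", "of", "on", "the", "to", "was", "which",
}

# One paragraph from the transcribed article quoted earlier.
EXCERPT = (
    "Adams does not dismiss as unimportant the $2 million damage to the BIA "
    "building, but he does not like to dwell on it. He largely regrets it "
    'because the publicity "has diverted attention" from the original purpose '
    "of the protest, which was to urge changes in the handling of Indian affairs."
)


def word_frequencies(text, stopwords=STOPWORDS):
    """Count lowercase words in text, skipping anything listed in stopwords."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(word for word in words if word not in stopwords)


if __name__ == "__main__":
    # The most frequent remaining words are the ones a cloud would draw largest.
    for word, count in word_frequencies(EXCERPT).most_common(5):
        print(word, count)
```

The counts alone settle nothing. As with the clouds, the interpretive work begins once the historian asks why certain words dominate.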

Using digital tools comes with a word of warning.  The integration of tools and visualizations must convey insights into the research.  Though experimenting with visualizations and tools can be useful, integrating these into digital scholarship simply for the sake of having visualizations without purpose contributes little to the scholarship.  No matter how much energy is poured into the design of a digital project, it means nothing if quality content and argument are not foregrounded.

Moving scholarship to the networked environment of the web opens our work to wider audiences–perhaps much wider audiences than historical monographs.  More and more people turn to the web for information, and thus far the profession has been left behind in producing the history web.  The web sends our information everywhere to engage many and wide audiences.  Furthermore, the historical record is open to all and the work of the historian becomes transparent.  Readers can probe the sources for themselves or even generate their own projects from prior work, building a larger network of historical data and interpretation on specific people, themes, or events.  “The goal for historians working in the new digital medium needs to be to make the computer technology transparent,” writes William G. Thomas, “and to allow the reader to focus his or her whole attention on the ‘world’ that the historian has opened up for investigation, interpretation, inquiry, and analysis.”

An issue historians must confront in the digital realm is what being an author in this medium will look like.  Readers of online material do not approach websites the same way they approach books.  Sustained reading is not quite possible with the current technology of the computer screen.  Chunking text or formatting text for “info-snacking” may be the models best suited for online reading.  Furthermore, in one of the large themes of digital history, readers will want to interact with the text.  Hypertext can weave together several aspects of a site, allowing readers to freely roam through projects.  The nonlinear approach to history allows readers to move through narratives in unrestrained ways.  “History,” notes Orville Vernon Burton, “similar to all disciplines, is badly in need of models beyond the monograph for the demonstration of excellence, and where the scholarship itself is in need of new genres and new strategies for reaching new audiences.”

Despite its advantages, digital history has issues that historians must consider.  Because the profession has not embraced the medium as serious scholarship, no quality controls, peer review, or promotion and tenure incentives exist for those doing digital history.  This reluctance by the profession translates to historians who will not consider digital technologies as part of their work because they remain unsure how it will affect their careers.  By failing to engage the history web, however, historians leave it to others outside the profession to define what constitutes “good” history.  The fear and misgivings are understandable (especially given the large failures of combining computing technologies and history under cliometrics) but the goal of digital history is not to upend the profession.  We can still produce what the profession respects the most (the book) while also producing digital scholarship.  The future is where the two co-exist.

The early forms of digital history were software packages distributed to libraries and universities, digitized historical material sitting in online archives, or material used in courses. Roy Rosenzweig’s 1993 CD-ROM Who Built America? combined images, text, and audio files to produce an interactive narrative. Even the early conception for the Valley of the Shadow Project was meant for CD-ROM distribution until the release of the Mosaic web browser in 1993. The Library of Congress started offering online exhibits in 1992 with Select Civil War Photographs. Other forms of the early history web included digitized course material and syllabi.

These projects were interactive, though not “participatory” in the way William Thomas identified.  Web 2.0 is changing this. The dynamic, networked, participatory technologies available on the web today have transformed how scholars can disseminate scholarship digitally. We have gone beyond the static representations of the past found in collections of images or the pasting of text onto a screen to the more dynamic and active capabilities of the web. Rather than being guided through the framework of a print narrative, users can freely explore projects to experience a historical argument. They can manipulate and experiment with the material in digital projects to build their own connections between sources and the argument.

Web 2.0 technologies make this easier. Rather than being led through a narrative, users can be presented with a host of interpretive elements that get at the historical question under investigation. Text can be mined for information, hypertext makes associations among elements, and material can be annotated and queried (such as tagging elements) in an effort to understand how everything fits together. Historians might define the parameters of their projects but users have the ability to contribute to knowledge production. The success of Wikipedia is a great example. People have a thirst for free and accessible information and relish the opportunity to contribute what they know or think about topics, a process that Steven Mintz called “active learning.”  Digital history is important not only for the audience but also for the scholar. Over time projects will change due to the impermanence of the medium. Digital projects become open access platforms constantly undergoing changes and modifications as research is built upon and interpretations shift and transform.

Thomas writes that digital history is “a process, an active, spatial, virtual-reality encounter with the past.”  I see my project embracing certain threads of that definition but still not fully addressing narrative or allowing users the full ability to query or manipulate material in the digital medium. My project moves beyond the archival approach but does not fully embrace certain Web 2.0 technologies that allow users full interaction with material. TokenX would serve a greater purpose than Wordle in allowing users to query primary sources. I believe my experiments with hypertext and popup dialogues get at an interactive narrative that allows users to draw associations or retrieve contextual information about topics I bring up. However, I’m still expending mental energy on what narrative and scholarship should look like in this environment. My interpretive elements such as Wordle and Timeline allow users to visualize text and the dissemination of information, but I would be more pleased if users could add or subtract elements at their whim; for instance, having the ability to add or remove newspaper articles, words, or phrases from my word clouds. However, the XML markup of the newspaper articles is fully available and open for use. I’m also trying to add a search engine that allows users to search documents for specific keywords or phrases.
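
A keyword or phrase search of the kind described here could be sketched, in its simplest form, as below. This is an illustrative assumption, not the search engine under development for the project: the search function and the document identifiers ("excerpt-1", "excerpt-2") are hypothetical, and the sample texts are sentences taken from the encoded excerpt earlier in this post.

```python
import re

# Two sentences from the transcribed article quoted earlier, keyed by
# hypothetical identifiers invented for this example.
DOCUMENTS = {
    "excerpt-1": (
        'He largely regrets it because the publicity "has diverted attention" '
        "from the original purpose of the protest."
    ),
    "excerpt-2": (
        "Clyde Bellecourt and Banks have been arrested several times "
        "in connection with protests."
    ),
}


def search(documents, phrase, context=30):
    """Return (doc_id, snippet) pairs for each case-insensitive match of phrase.

    Each snippet includes up to `context` characters on either side of the
    match so the reader can see the word in its original setting.
    """
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    hits = []
    for doc_id, text in documents.items():
        for match in pattern.finditer(text):
            start = max(match.start() - context, 0)
            end = min(match.end() + context, len(text))
            hits.append((doc_id, text[start:end].strip()))
    return hits


if __name__ == "__main__":
    for doc_id, snippet in search(DOCUMENTS, "protest"):
        print(doc_id, "...", snippet)
```

Returning a snippet rather than a bare hit keeps the word in its context, which matters for the kind of close reading this project still depends on.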

Historians must take an active role in developing and defining digital history. Open access and open source scholarship allow historians to connect with new and wider audiences. Given the problems facing the publishing industry, historical scholarship must find something beyond the monograph in order to present knowledge. People increasingly turn to the web for information and often go no further. If historians are concerned about the low quality of historical information online, then they need to take an active role in improving the quality or it will be left to others outside the historical profession. Technological barriers, wariness about using technology, and fears of a Cliometrics 2.0 have restricted the number of historians engaging digital technology. The result, as Ed Ayers points out, is that no large-scale project like the Valley of the Shadow has emerged in the last ten years.

There are also problems of acceptance of digital scholarship as serious scholarship within the profession. For young scholars, the message from the profession seems to be that it is best to ignore digital scholarship because it will not factor into a professional career. Our task should be not only to help define the history web, but to convince others in the profession that scholarship and interpretation stand at the center of digital projects; they’re not merely vehicles for entertainment, but an efficient medium for presenting knowledge, sources, and interpretation and for making associations. History seems especially suited for digital technologies, and historians should embrace the medium to pioneer new methods of conveying knowledge, arguments, and interpretations.

June 03, 2009 @jaheppler