American Panorama: Part I

I recently wrote about the wave of digital history reviews currently washing over print journals like the American Historical Review, The Western Historical Quarterly, and The Journal of American History. This wave brings into focus the odd reticence of digital historians to substantively review digital history projects in open, online venues. I ended the post with a call for the field to more actively engage with the work of our peers and, in particular, to evaluate the historical contributions of these digital projects if and when they fall within our areas of subject expertise. The following is my attempt to do just that.

[Figure: American Panorama landing page]

American Panorama: An Atlas of United States History was released in December 2015 by the University of Richmond’s Digital Scholarship Lab. It is a collection of four map-based visualizations focusing on different topics in American history: slave migration, immigration to the U.S., canal construction, and the Overland Trails. Each of these visualizations revolves around an interactive map, with surrounding panes of charts, timelines, contextual data, and primary sources related to the topic. If I could summarize the project’s historical contributions in a single sentence, it would be this one: American Panorama incorporates movement into the history of the United States. To be even more specific, the project shines a new light on the historical movement of people. Its three most compelling visualizations (foreign immigration, slave migration, and the Overland Trails) illustrate some of the most monumental shifts of people in American history. There are certainly other episodes of travel and migration worth studying – Indian Removal or the Great Migration immediately jump to mind – but those selected by American Panorama are among the most consequential.

Like most digital history projects, American Panorama is a collaboration. Unlike most digital history projects, it’s a collaboration between academic historians and a private company. The Digital Scholarship Lab’s Robert Nelson, Ed Ayers, Scott Nesbit (now at the University of Georgia), Justin Madron, and Nathaniel Ayers make up the academic half of the project. The private half of the partnership is Stamen Design, a renowned data visualization and design studio that has worked with clients ranging from Toyota and Airbnb to the U.S. Agency for International Development. Stamen is also, in the words of tech journalist Alexis Madrigal, “perhaps the leading creator of cool-looking maps.” Stamen’s fingerprints are all over American Panorama. The visualizations are beautifully structured, deeply immersive, and packed with information. In fact, data depth and data density are the hallmarks of these visualizations – I don’t think I’ve ever seen this much historical content visualized in this many different ways, all within a single browser window. Furthermore, the project’s visual interface presents a new and valuable framework for understanding the scale of these movements of people in a way that written narratives can struggle to convey. Writing about thousands or even millions of people moving around over the course of years and decades can often devolve into an abstract swirl of numbers, states, regions, and dates. American Panorama makes that swirl intelligible.

The project encapsulates many of the current hallmarks of digital history. It is aimed at a broad public audience and was “designed for anyone with an interest in American history or a love of maps.” Relatedly, the project is exploratory and descriptive rather than explicitly interpretive, and offers only hints at how the reader should understand and interpret patterns. Outside of brief and rather modest textual asides, readers are largely left to make their own discoveries, construct their own narratives, and draw their own conclusions. The common justification for creating exploratory visualizations rather than argumentative or narrative-driven ones is that they encourage participatory engagement. Empowering readers to control how they interact with a visualization nudges them to delve deeper into the project and emerge with a richer understanding of the topic. But an exploratory framework hinges on a reader’s ability and willingness to discover, narrate, and interpret the project for themselves.

To take one example, American Panorama’s Foreign-Born Population, 1850-2010 offers by far the strongest interpretive stance out of the project’s four visualizations: “American history can never be understood by just looking within its borders.” Even so, the creators consign their interpretation to a short, solitary paragraph in the About This Map section, leaving readers to draw their own conclusions about the meaning and implications of this message. The tech blog Gizmodo, for instance, covered the project’s release under the headline: “See The US Welcome Millions Of Immigrants Over 150 Years In This Interactive Map.” Internet headlines have never exactly been a bastion of nuance, but to say that the U.S. “welcomed” immigrants is, well, not very accurate. It’s also an example of the kind of historical mischaracterization that can arise when projects push authorial interpretation into the background.

Full disclosure: I know and deeply admire the work of Rob Nelson, Scott Nesbit, and Ed Ayers. They are very, very smart historians, which is why I found myself wanting to hear more of their voices. What new patterns have they discovered? What stories and interpretations have they drawn from these patterns? How has the project changed their understanding of these topics? The creators of American Panorama do not answer these questions explicitly. Instead, they allow patterns, stories, and interpretations to swim just beneath the surface. This was likely a deliberate choice, and I don’t want to critique the project for failing to accomplish something that it never set out to do in the first place. American Panorama is not an academic monograph and it shouldn’t be treated as one. Nevertheless, the project left me hungry for a more explicit discussion of how it engages with interpretation and the historical literature.

I’d like to offer my own take on American Panorama using equal parts review and riff, one that combines an evaluation of the project’s strengths and weaknesses with a discussion of how it fits into themes and topics in U.S. history. To do so, I’ve focused on two visualizations: The Forced Migration of Enslaved People, 1810-1860 and The Overland Trails. Fair warning: in true academic fashion, I had far too much to say about the two visualizations, so I split the piece into two separate posts. The first is below, and the second will follow soon. (Update: you can read Part II here.)

Part I. The Forced Migration of Enslaved People, 1810-1860

In some ways, Americans remember slavery through the lens of movement. This begins with The Middle Passage, the horrifying transportation of millions of human beings from Africa to the Americas. The focus on movement then shifts to escape, perhaps best embodied in the Underground Railroad and its stirring biblical exodus from bondage to freedom. But there was a much darker, and less familiar, counterweight to the Underground Railroad: being “sold down the river” to new planting frontiers in the Deep South. The sheer volume of this movement dwarfed the far smaller trickle of runaways: between 1810 and 1860 southern planters and slave traders forced nearly one million enslaved people to move southward and westward. The Forced Migration of Enslaved People, 1810-1860 helps us understand the scale and trajectory of this mass movement of human beings.

The visualization uses a map and timeline to illustrate a clear decade-by-decade pattern: enslaved people streaming out of the Upper South and the eastern seaboard and into the cotton-growing regions of the Black Belt (western Georgia, Alabama, and Mississippi), the Mississippi River Valley, and eastern Texas and Arkansas. It shows that this shift was not uninterrupted, but came in fits and starts. The reverberations of the 1837 financial panic, for instance, dampened and diffused this movement during the 1840s. An accompanying data pane charts the in-migration and out-migration on a state and county level: during the 1830s more than 120,000 slaves left Virginia, even as 108,000 slaves streamed into Alabama. None of these findings are especially new for historians of the period, but The Forced Migration of Enslaved People brings them into sharp focus.

[Figure: in-migration and out-migration data pane]

On an interpretive level, The Forced Migration of Enslaved People helps reorient the locus of American slavery away from The Plantation and towards The Slave Market. This is part of a larger historiographical pivot, one that can be seen in Walter Johnson’s book Soul by Soul (1999). Johnson reminds us that American slavery depended not just on the coerced labor of black bodies, but on the commodification of those same bodies. It wasn’t enough to force people to work; the system depended first and foremost on the ability to buy and sell human beings. Because of this, Johnson argues that the primary sites of American slavery were slave markets in places like Charleston, Natchez, and New Orleans. Soul by Soul was an early landmark in the now flourishing body of literature exploring the relationship between slavery and capitalism. The book’s argument rested in large part on the underlying mass movement of black men, women, and children, both through slave markets and into the expanding planter frontier of the Southwest. American Panorama lays bare the full geography of this movement in all of its spatial and temporal detail.

There is a certain irony in using Walter Johnson’s Soul by Soul to discuss The Forced Migration of Enslaved People. After all, Johnson’s book includes a critique that might as well have been addressed directly to the project’s creators. He bluntly asserts that the use of maps and charts to illustrate the slave trade hides the lives and experience of the individuals that made up these aggregated patterns. Instead, Johnson calls for the kind of history “where broad trends and abstract totalities thickened into human shape.” (8) His critique echoes the debates that swirled around Robert Fogel and Stanley Engerman’s Time on the Cross (1974) and continue to swirl around the digital project Voyages: The Trans-Atlantic Slave Trade Database.

The creators of The Forced Migration of Enslaved People gesture towards the larger historiographical divide between quantification and dehumanization in an accompanying text: “Enslaved people’s accounts of the slave trade powerfully testify to experiences that cannot be represented on a map or in a chart.” Instead, they attempt to bring these two modes of history together by incorporating excerpted slave narratives alongside the visualization’s maps and charts. Clicking on icons embedded in the map or the timeline reveals quotes from individual accounts that mention some dimension of the slave trade. This interface allows the reader to shift back and forth between the visual language of bars, dots, and hexbins, and the written words of formerly enslaved people themselves. The Forced Migration of Enslaved People uses a digital medium to present both the “broad trends and abstract totalities” and the “human shape” of individual lives. One of the analytical and narrative payoffs of an interactive interface is the ability to seamlessly move between vastly different scales of reading. The Forced Migration of Enslaved People breaks important new ground in this regard by blending the macro scale of demographics with the micro scale of individuals.

[Figure: expanded excerpt from an enslaved person’s narrative]

Ultimately, however, the project’s attempt to combine narrative accounts and quantitative data falls short of its potential. On the whole, the scale of the individuals recedes under the scale of the data. The problem lies in the way in which the project presents its excerpted quotes. Flurries of names, places, events, and emotions appear divorced from the broader context of a particular narrative. Reading these text fragments can often feel like driving past a crash on the side of a highway. You might glimpse the faces of some passengers or the severity of the wreck, but you don’t know how they got there or what happens to them next. Then you pass another crash. And another. And another. The cumulative weight of all these dozens of wrecks is undeniable, and part of what makes the visualization effective. But it’s also numbing. Human stories begin to resemble data points, presented in chronological, bulleted lists and physically collapsed into two-line previews. The very features that make narratives by enslaved people such powerful historical sources – detail, depth, emotional connection – fade away within this interface. Narratives give voice to the millions of individuals whose stories we’ll never hear; The Forced Migration of Enslaved People helps us to hear some of those voices, but only briefly, and only in passing.

[Figure: narrative excerpts collapsed into two-line previews]

Historians characterize the years leading up to the Civil War as a period defined by sectional conflict between North and South. The abolition of slavery was not the major flashpoint for this conflict; rather, the expansion of slavery into western states and territories was the primary wedge between the two sides. The issue would come to define national politics by pitting two competing visions of the nation against one another. The Forced Migration of Enslaved People reminds us that this was not just an ideological or political issue, but a spatial issue rooted in the physical movement of hundreds of thousands of people into areas like the Black Belt and the Mississippi River Valley. By the 1850s, many northerners feared that this great heave of slaveholders and enslaved people would continue onwards into the Far West. The Forced Migration of Enslaved People forces us to take those fears seriously. What if the visualization’s red hexbins didn’t stop in the cotton fields of eastern Texas? What if its timeline didn’t end in 1860? Southern slavery did not stand still during the antebellum era and its demise was far from inevitable. This visualization gives us a framework with which to understand that trajectory.

I doubt that most Americans would put slave traders and shackled black bodies within the historical pantheon of great national migrations, but American Panorama injects this vast movement of people into the history of the antebellum United States. In the second part of my discussion, I’ll turn my attention to a much more familiar historical migration unfolding at the same time: The Overland Trails.

The Perpetual Sunrise of Methodology

[The following is the text of a talk I prepared for a panel discussion about authoring digital scholarship for history with Adeline Koh, Lauren Tilton, Yoni Appelbaum, and Ed Ayers at the 2015 American Historical Association Conference.]

 
I’d like to start with a blog post that was written almost seven years ago now, titled “Sunset for Ideology, Sunrise for Methodology?” In it, Tom Scheinfeldt argued that the rise of digital history represented a disciplinary shift away from big ideas about ideology or theory and towards a focus on “forging new tools, methods, materials, techniques, and modes of work.” Tom’s post was a big reason why I applied to graduate school. I found this methodological turn thrilling – the idea that tools like GIS, text mining, and network analysis could revolutionize how we study history. Seven years later the digital turn has, in fact, revolutionized how we study history. Public history has unequivocally led the charge, using innovative approaches to archiving, exhibiting, and presenting the past in order to engage a wider public. Other historians have built powerful digital tools, explored alternative publication models, and generated online resources to use in the classroom.
 
But there is one area in which digital history has lagged behind: academic scholarship. To be clear: I’m intentionally using “academic scholarship” in its traditional, hidebound sense of marshaling evidence to make original, explicit arguments. This is an artificial distinction in obvious ways. One of digital history’s major contributions has, in fact, been to expand the disciplinary definition of scholarship to include things like databases, tools, and archival projects. The scholarship tent has gotten bigger, and that’s a good thing. Nevertheless, there is still an important place inside that tent for using digital methods specifically to advance scholarly claims and arguments about the past.
 
In terms of argument-driven scholarship, digital history has over-promised and under-delivered. It’s not that historians aren’t using digital tools to make new arguments about the past. It’s that there is a fundamental imbalance between the proliferation of digital history workshops, courses, grants, institutes, centers, and labs over the past decade, and the impact this has had in terms of generating scholarly claims and interpretations. The digital wave has crashed headlong into many corners of the discipline. Argument-driven scholarship has largely not been one of them.
 
There are many reasons for this imbalance, including the desire to reach a wider audience beyond the academy, the investment in collection and curation needed for electronic sources, or the open-ended nature of big digital projects. All of these are laudable. But there is another, more problematic, reason for the comparative inattention to scholarly arguments: digital historians have a love affair with methodology. We are infatuated with the power of digital tools and techniques to do things that humans cannot, such as dynamically mapping thousands of geo-historical data points. The argumentative payoffs of these methodologies are always just over the horizon, floating in the tantalizing ether of potential and possibility. At times we exhibit more interest in developing new methods than in applying them, and in touting the promise of digital history scholarship rather than its results. 
 
What I’m going to do in the remaining time is to use two examples from my own work to try and concretize this imbalance between methods and results. The first example is a blog post I wrote in 2010. At the time I was analyzing the diary of an eighteenth-century Maine midwife named Martha Ballard, made famous by Laurel Ulrich’s prize-winning A Midwife’s Tale. The blog post described how I used a process called topic modeling to analyze about 10,000 diary entries written by Martha Ballard between 1785 and 1812. To grossly oversimplify, topic modeling is a technique that automatically generates groups of words more likely to appear with each other in the same documents (in this case, diary entries). So, for instance, the technique grouped the following words together:
 
gardin sett worked clear beens corn warm planted matters cucumbers gatherd potatoes plants ou sowd door squash wed seeds
 
As a human reader it’s pretty clear that these are words about gardening. Once I generated this topic, I could track it across all 10,000 entries. When I mashed twenty-seven years together, it produced this beautiful thumbprint of a New England growing season.
 
[Figure: Seasonal presence of the GARDENING topic in Martha Ballard’s diary]
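
For readers curious how a topic model like this gets wired up in code, here is a minimal sketch using the gensim Python library. It is not necessarily the toolkit I used at the time, and the tokenized entries below are toy stand-ins for the roughly 10,000 transcribed diary entries:

```python
# A minimal LDA sketch with gensim; the entries are toy stand-ins.
from gensim import corpora, models

entries = [
    ["gardin", "sett", "beens", "corn", "planted", "sowd"],
    ["birth", "deliverd", "son", "safe", "left", "home"],
    ["gardin", "squash", "cucumbers", "seeds", "planted"],
    ["birth", "born", "daughter", "deliverd", "calld"],
]

dictionary = corpora.Dictionary(entries)
corpus = [dictionary.doc2bow(entry) for entry in entries]

# Two topics for this toy corpus; the real analysis used many more.
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=20)
for topic_id, words in lda.print_topics(num_words=6):
    print(topic_id, words)
```

The real analysis involved far more entries and topics, but the point is the same: the model groups co-occurring words without ever knowing what any of them mean.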
 
Interest in topic modeling took off right around the time that I wrote this post, and pretty soon it started getting referenced again and again in digital humanities circles. Four and a half years later, it has been viewed more than ten thousand times and has been assigned on the syllabi of at least twenty different courses. It’s been cited in books, journal articles, conference presentations, grant applications, government reports, white papers, and, of course, other blogs. It is, without a doubt, the single most widely read piece of historical writing I have ever produced. But guess what? Outside of the method, there isn’t anything new or revelatory in it. The post doesn’t make an original argument and it doesn’t further our understanding of women’s history, colonial New England, or the history of medicine. It largely shows us things we already know about the past – like the fact that people in Maine didn’t plant beans in January.
 
People seized on this blog post not because of its historical contributions, but because of its methodological contributions. It was like a magic trick, showing how topic modeling could ingest ten thousand diary entries and, in a matter of seconds, tell you what the major themes were in those entries and track them over time, all without knowing the meaning of a single word. The post made people excited for what topic modeling could do, not necessarily what it did do; the methodology’s potential, not its results.
 
About four years after I published my blog post on Martha Ballard, I published a very different piece of writing. This was an article that appeared in last June’s issue of the Journal of American History, the first digital history research article published by the journal. In many ways it was a traditional research article, one that followed the journal’s standard peer review process and advanced an original argument about American history. But the key distinction was that I made my argument using computational techniques. 
 
The starting premise for my argument was that the late nineteenth-century United States has typically been portrayed as a period of integration and incorporation. Think of the growth of railroad and telegraph networks, or the rise of massive corporations like Standard Oil. In nineteenth-century parlance: “the annihilation of time and space.” This existing interpretation of the period hinges on geography – the idea that the scale of locality and region were getting subsumed under the scale of nation and system. I was interested in how these integrative forces actually played out in the way people may have envisioned the geography of the nation. 
 
So I looked at a newspaper printed in Houston, Texas, during the 1890s and wrote a computer script that counted the number of times the paper mentioned different cities or states – in effect, how one newspaper crafted an imagined geography of the nation. What I found was that instead of creating the standardized, nationalized view of the world we might expect, the newspaper produced space in ways that centered on the scale of region far more than nation. It remained overwhelmingly focused on the immediate sphere of Texas and, even more surprisingly, on the American Midwest. Places like Kansas City, Chicago, and St. Louis were far more prevalent than I was expecting, and from this newspaper’s perspective Houston was more of a midwestern city than a southern one.
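
Stripped of the messy parts (OCR noise, multi-word place names, disambiguation), the core of that counting step can be sketched in a few lines. The place list and sample text here are illustrative stand-ins, not my actual data:

```python
# A stripped-down sketch of counting place-name mentions in issue text.
from collections import Counter

PLACES = ["Houston", "Galveston", "Chicago", "Kansas City",
          "St. Louis", "New Orleans", "New York"]

def count_places(issue_texts):
    counts = Counter()
    for text in issue_texts:
        for place in PLACES:
            counts[place] += text.count(place)
    return counts

sample = ["Chicago wheat closed higher... the Houston cotton market..."]
print(count_places(sample).most_common())
```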
 
[Figure: Cameron Blevins, “Space, Nation, and the Triumph of Region: A View of the World from Houston,” Journal of American History, 101, no. 1 (June 2014), 127.]
 
I would have never seen these patterns without a computer. And in trying to account for this pattern I realized that, while historians might enjoy reading stuff like this…
 
[Figure: newspaper article detail]
 
…newspapers often look a lot more like this:
 
[Figure: railroad timetable detail]
 
All of this really boring stuff – commodity prices, freight rates, railroad timetables, classified ads – made up a shockingly large percentage of content. Once you include the boring stuff, you get a much different view of the world from Houston in the 1890s. I ended up arguing that it was precisely this fragmentary, mundane, and overlooked content that explained the dominance of regional geography over national geography. I never would have been able to make this argument without a computer.
 
The article offers a new interpretation about the production of space and the relationship between region and nation. It issues a challenge to a long-standing historical narrative about integration and incorporation in the nineteenth-century United States. By publishing it in the Journal of American History, with all of the limitations of a traditional print journal, I was trying to reach a different audience from the one who read my blog post on topic modeling and Martha Ballard. I wanted to show a broader swath of historians that digital history was more than simply using technology for the sake of technology. Digital tools didn’t just have the potential to advance our understanding of American history – they actually did advance our understanding of American history.
 
To that end, I published an online component that charted the article’s digital approach and presented a series of interactive maps. But in emphasizing the methodology of my project I ended up shifting the focus away from its historical contributions. In the feedback and conversations I’ve had about the article since its publication, the vast majority of attention has focused on the method rather than the result: How did you select place-names? Why didn’t you differentiate between articles and advertisements? Can it be replicated for other sources? These are all important questions, but they skip right past the arguments that I’m making about the production of space in the late nineteenth century. In short: the method, not the result. 
 
I ended my article with a familiar clarion call:
Technology opens potentially transformative avenues for historical discovery, but without a stronger appetite for experimentation those opportunities will go unrealized. The future of the discipline rests in large part on integrating new methods with conventional ones to redefine the limits and possibilities of how we understand the past.
This is the rhetorical style of digital history. While reading through the conference program I was struck by just how many abstracts about digital history used the words “potential,” “promise,” “possibilities,” or, in the case of our own panel, “opportunities.” In some ways 2015 doesn’t feel that different from 2008, when Tom Scheinfeldt wrote about the sunrise of methodology and the Journal of American History published a roundtable titled “The Promise of Digital History.” I think this is telling. Academic scholarship’s engagement with digital history seems to operate in a perpetual future tense. I’ve spent a lot of my career talking about what digital methodology can do to advance scholarly arguments. It’s time to start talking in the present tense.

Still Playing Catch-Up

As I was flipping through the February 2014 issue of the American Historical Review, I was encouraged to see that the American historical profession’s flagship journal seems to be doing a pretty decent job of publishing the impressive work of female historians. Three out of its four main articles were written by women, and four out of the five books in its “Featured Reviews” section were also by women. That’s encouraging. But what about the rest of the February issue? Figuring out how many women are among the 176 contributors to this single issue is a lot harder. And what about not just this issue, but all five issues the journal publishes annually? And what about not just this year, but every year since its inception in 1895?

Looking at gender representation in the American Historical Review is exactly the kind of historical project that lends itself well to digital analysis. Collecting individual author information from 120 years of publication history would take an enormous amount of tedious labor. Fortunately the information is already online. I wrote a Python script to scrape the tables of contents from every AHR issue and then, with the help of Bridget Baird, began to process all of this text to try and extract the books that were reviewed in the AHR, their authors, and the names of the people reviewing them. The data was something of a nightmare, but we were eventually able to get everything we wanted: around 60,000 books, authors, and reviewers. The challenge then became: was there a way to automatically identify the gender of all of these different people? Especially for a dataset spanning more than a hundred years, we needed a way to take into account potential changes in naming conventions. A historian named Leslie who was born before 1950 was likely to be a man, but if that same Leslie was born after 1950 the person was likely to be a woman. Bridget’s solution was for us to write a program that relies on a database of names from the Social Security Administration dating back to 1880 to account for these changes. This approach is not without problems: it only includes American names, and it subtly reinforces an insidious gender binary framework. Nevertheless, it does contribute a useful new digital humanities methodology, one that we are planning to explore in more depth with Lincoln Mullen.
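
A minimal sketch of that year-aware inference might look like the following. The file layout matches the SSA’s public baby-names data, but the thirty-year birth-cohort offset and the 0.9 confidence threshold are illustrative assumptions, not necessarily the parameters we used:

```python
# Year-aware gender inference from the SSA baby-names files
# ("names/yobYYYY.txt", rows of name,sex,count). A simplified sketch.
import csv
from collections import defaultdict

def load_ssa_year(year):
    counts = defaultdict(lambda: {"F": 0, "M": 0})
    with open(f"names/yob{year}.txt", newline="") as f:
        for name, sex, count in csv.reader(f):
            counts[name][sex] += int(count)
    return counts

def infer_gender(first_name, pub_year, threshold=0.9):
    # Guess a birth cohort roughly thirty years before publication.
    birth_year = max(1880, pub_year - 30)
    counts = load_ssa_year(birth_year)[first_name]
    total = counts["F"] + counts["M"]
    if total == 0:
        return "Unknown"
    if counts["F"] / total >= threshold:
        return "Female"
    if counts["M"] / total >= threshold:
        return "Male"
    return "Unknown"

print(infer_gender("Leslie", 1930), infer_gender("Leslie", 1990))
```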

This might come as a real shock, but the American Historical Review didn’t feature very many women for much of its publication history. Over the first eighty years of the AHR‘s existence there were rarely more than a handful of books written by female authors in any given issue – as a percentage of all reviewed authors, women made up less than 10% through the 1970s. But things began to change in the late 1970s, when female authors began a steady ascent in the AHR‘s reviews. By the end of the 1980s the number of women’s books reviewed in the journal had nearly doubled. By the twenty-first century there were three times as many women as there had been in the 1970s.

[Figure]
Gender of book authors (as a percent of all authors) in the American Historical Review between 1895 and 2013. The number of authors categorized as “Unknown” in the early years stems from the widespread use of initials (ex. K. T. Drew). Most of these authors were likely men, but we’ve erred on the safe side in categorizing them as Unknown. In the later years, many of the “Unknowns” stem from non-U.S. names.

But other numbers paint a less rosy picture. Lincoln Mullen’s recent work on history dissertations showed a similarly steady upward trajectory in the number of female-authored history dissertations since 1950. Although it has plateaued in recent years, women have very nearly closed the gap in newly completed history dissertations. But the glass ceiling remains stubbornly low from that point onwards. In book reviews published in the AHR, male authors continue to outnumber female authors by a factor of nearly 2 to 1. Whereas a gap of around 3-5% now separates the proportion of male and female dissertation authors, that gap jumps to 25-35% for the proportion of male and female book authors being reviewed in the American Historical Review.

[Figure]
Gender of dissertation authors and of book authors in the American Historical Review. Note: The above chart only looks at authors whose gender was successfully identified by the program. It is also something of an apples-to-oranges comparison given that Lincoln and I were using slightly different methods, but it gives a rough sense for the gap between dissertations and the AHR.

On the reviewer side of the equation, things aren’t much better. There are still more than twice as many male reviewers as female reviewers in the AHR. But gender inflects this relationship in less direct ways. In particular, we can look at the gender dynamics of who reviews whom. About three times as many men write reviews of male-authored books as do women. In the case of female-authored books, there are slightly more male reviewers than female reviewers, but the ratio is much closer to 50/50. In short, women are much more likely to write reviews of other women. And while men still write reviews of the majority of female-authored books, they tend to gravitate towards male authors – who are, of course, already over-represented in the AHR.
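
Mechanically, this kind of breakdown is just a cross-tabulation. A minimal sketch with pandas, assuming a hypothetical table of reviews with one row per author/reviewer pair (the file and column names are mine, not the project’s):

```python
# Cross-tabulate author gender against reviewer gender.
# "reviews.csv" and its column names are hypothetical stand-ins.
import pandas as pd

reviews = pd.read_csv("reviews.csv")  # columns: author_gender, reviewer_gender
table = pd.crosstab(reviews["author_gender"], reviews["reviewer_gender"],
                    normalize="index")  # each row sums to 1
print(table)
```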

[Figure]
Gender of reviewers for male-authored books. Note: The above chart only looks at authors and reviewers whose gender was successfully identified by the program.
[Figure]
Gender of reviewers for female-authored books. Note: The above chart only looks at authors and reviewers whose gender was successfully identified by the program.

Bridget and I were also able to extract the subjects used by the AHR to categorize their reviews. Although these conventions changed quite a bit over time, I took a stab at aggregating them into some broad categories for the past forty years. Essentially, I wanted to find out the gender representation within different historical fields. As you can see in the chart below, the proportion of men and women is not the same for all fields. Caribbean/Latin American history has had something approaching equal representation for the past decade-and-a-half. In both African history and Ancient/Medieval history female historians made some quite dramatic gains during the late-nineties and aughts. The guiltiest parties, however, are also the two subject categories that publish the most book reviews: Modern/Early Modern Europe and the United States/Canada. Both of them have made steady progress but still hover at around two-thirds male.

[Figure]
The different subjects are sorted left-to-right by the number of reviews in the AHR. Again, please note that the above chart only looks at authors whose gender was successfully identified by the program.

Women are now producing history dissertations at nearly the same rate as men, but the flagship journal of the American historical profession has yet to catch up. There are, of course, a lot of factors at play. This gap might reflect a substantial time-lag as a younger, more evenly-balanced generation gradually moves its way through the ranks even as an older, male-skewed generation continues to publish monographs. It might reflect biases in the wider publishing industry, or the fact that female historians continue to bear a disproportionate amount of the time-burden of caring for families. That the AHR continues to publish far more reviews of male authors than female authors is depressing, but unfortunately not surprising given the systemic inequalities that continue to exist across the profession.

Text Analysis of Martha Ballard’s Diary (Part 2)

Given Martha Ballard’s profession as a midwife, it is no surprise that she carefully recorded the 814 births she attended between 1785 and 1812. She gave these events precedence over more mundane occurrences by noting them in a separate column from the main entry, which allowed her to keep track not only of the births themselves but also of payments and restitution for her work. These hundreds of births constituted one of the bedrocks of Ballard’s experience as a skilled and prolific midwife, and this is reflected in her diary.

As births were such a consistent and methodically recorded theme in Ballard’s life, I decided to begin my programming with a basic examination of the deliveries she attended. This examination would take the form of counting the number of deliveries throughout the course of the diary and grouping them by various time-related characteristics, namely: year, month, and day of the week.

Process and Results

The first basic step for performing a more detailed text analysis of Martha Ballard’s diary was to begin cleaning up the data. One step was to take all the words and (temporarily) turn every uppercase letter into a lowercase letter. This kept Python from seeing “Birth” and “birth” as two separate words. For the purposes of this particular program, it was more important to distill words into a basic unit rather than maintain the complexity of capitalized characters.

Once the data was scrubbed, we could turn to writing a program that would count the number of deliveries recorded in the diary. The program we wrote does the following:

  1. Check whether Ballard wrote anything in the “birth” column (the first column of the entries, which she also used to keep track of deliveries).
  2. If she did, check whether that column contains any of the words “birth”, “brt”, or “born”.
  3. Print the remaining entries that contain text in the “birth” column but none of those words. From this short list I manually added an additional seven entries to the program, in which she appeared to have attended a delivery but did not record it using the words above.

Using these parameters, the program could iterate through the text and recognize the occurrence of a delivery. Now we could begin to organize these births.
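
Translated into simplified Python, the heart of that program looks something like the following. The entry structure, field names, and the sample ID are stand-ins for our actual data:

```python
# A simplified sketch of the delivery counter. Each entry is assumed to
# be a dict holding the (lowercased) text of Ballard's "birth" column;
# the ID below stands in for the seven entries we flagged by hand.
BIRTH_WORDS = ("birth", "brt", "born")
MANUAL_ADDITIONS = {"1789-03-14"}  # hypothetical entry IDs

def is_delivery(entry):
    column = entry.get("birth_column", "").lower()
    if not column:                                   # step 1: anything in the column?
        return False
    if any(word in column for word in BIRTH_WORDS):  # step 2: trigger words
        return True
    return entry["id"] in MANUAL_ADDITIONS           # step 3: hand-flagged entries

entries = [{"id": "1789-03-14", "birth_column": "Birth. Mrs Howards son."}]
print(sum(is_delivery(e) for e in entries))
```

Because the trigger words live in a single tuple, extending the criteria later is trivial, a point I return to under “Revision” below.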

First, we returned the birth counts for each year of the diary, which were then inserted into a table and charted in Excel:

[Figure: deliveries per year]

At the risk of turning my analysis into a John Henry-esque woman vs. machine contest, I compared my figures to the chart in A Midwife’s Tale in which Laurel Ulrich tallied the births Ballard attended (on page 232 of the soft-cover edition). The two charts follow the same broad pattern:

[Figure: deliveries per year, my count vs. Ulrich’s]

Note: I reverse-built her chart by creating a table from the printed chart, then making my own bar graph. Somewhere in the translation I seem to have misplaced one of the deliveries (Ulrich lists 814 total, whereas I keep counting 813 on her graph). Sorry!

However, a closer look reveals small discrepancies in the numbers for each individual year. I calculated each year’s discrepancy using Ulrich’s numbers as the “true” figures (she is the acting President of the AHA, after all) from which my own figures deviated, and found that the average deviation for a given year was 4.86%:

Year  Manual (Ulrich)  Computer Program  Difference  Deviation (from Ulrich)
1785 28 24 4 14.29%
1786 33 35 2 6.06%
1787 33 33 0 0.00%
1788 27 28 1 3.70%
1789 40 43 3 7.50%
1790 34 35 1 2.94%
1791 39 39 0 0.00%
1792 41 43 2 4.88%
1793 53 50 3 5.66%
1794 48 48 0 0.00%
1795 50 55 5 10.00%
1796 59 56 3 5.08%
1797 54 55 1 1.85%
1798 38 38 0 0.00%
1799 50 51 1 2.00%
1800 27 23 4 14.81%
1801 18 14 4 22.22%
1802 11 12 1 9.09%
1803 19 18 1 5.26%
1804 11 11 0 0.00%
1805 8 8 0 0.00%
1806 10 11 1 10.00%
1807 13 13 0 0.00%
1808 3 3 0 0.00%
1809 21 22 1 4.76%
1810 17 18 1 5.88%
1811 14 14 0 0.00%
1812 14 14 0 0.00%
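
For the record, each deviation is simply the absolute difference divided by Ulrich’s count. A few lines reproduce the calculation; only the first five years are shown here, so the printed average won’t match the full 4.86%:

```python
# Reproduce the Difference and Deviation columns from the table above.
# (Ulrich, program) pairs for the first five years only.
counts = [(28, 24), (33, 35), (33, 33), (27, 28), (40, 43)]

deviations = [abs(u - p) / u for u, p in counts]
for (u, p), d in zip(counts, deviations):
    print(f"{u:>3} {p:>3} {abs(u - p):>2} {d:6.2%}")
print(f"average deviation: {sum(deviations) / len(deviations):.2%}")
```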

Keeping in the back of my mind that my birth analysis differed slightly from Ulrich’s, I went on to break the deliveries down by other factors, starting with their frequency by month over the course of the diary:

[Figure: deliveries by month]

If we extend the results of this chart and assume a standard nine-month pregnancy, we can also determine roughly which months Ballard’s neighbors were most likely to be having sex. Unsurprisingly, the warmer period between May and August appears to have been a particularly fertile time:

[Figure: estimated conceptions by month]
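
The back-calculation behind this chart is simple modular arithmetic: step each birth month back nine months, wrapping around the calendar. A sketch, assuming a uniform nine-month term:

```python
# Shift a birth month back nine months, wrapping around the year.
def conception_month(birth_month):
    return (birth_month - 9 - 1) % 12 + 1

# A March birth points back to a June conception the previous year.
print(conception_month(3))  # -> 6
```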

Finally, I looked at how often births occurred on different days of the week. There wasn’t a strong pattern, beyond the fact that Sunday and Thursday seemed to be abnormally common days for deliveries. I’m not sure why that was the case, but would love to hear speculation from any readers.

[Figure: deliveries by day of the week]

Analysis

The discrepancies between the program’s tally of deliveries and Ulrich’s delivery count speak to broader issues in “digital” text mining versus “manual” text mining:

Data Quality

Ulrich’s analysis is the result of countless hours spent eye-to-page with the original text. And as every history teacher drills into their students, looking directly at the primary documents minimizes the layers of interpretation that can distort them. In comparison, my analysis is the result of the original text going through several levels of transformation, like a game of telephone:

Original text -> Typed transcription -> HTML tables -> Python list -> Text file -> Excel table/chart

Each level increases the chance of a mistake. For instance, a quick manual examination of the online version of the diary for 1785 turns up a delivery (marked by ‘Birth’) that shows up in the online HTML but does not appear in the “raw” HTML files our program is processing and analyzing.

On the other hand, a machine doesn’t get tired and miscount a word tally or accidentally skip an entry.

Context

Ulrich brings to bear on her textual analysis years of historical training and experience, along with a deeply intimate understanding of Ballard’s diary. This allows her to take into account one of the most important aspects of reading a document: context. Meanwhile, our program’s ability to understand context is limited quite specifically to the criteria we use to build it. If Ballard attended a delivery but did not mark it in the standard “birth” column like the others, she might mention it more subtly in the main body of the entry. Whereas Ulrich could recognize this and count it as a delivery, our program cannot (at least with the current criteria).

Where the “traditional” skills of a historian come into play with data mining is in defining these criteria. Using her understanding of the text on a traditional level, Ulrich could craft far better criteria than I could for counting the number of deliveries Martha Ballard attended. The trick comes in translating a historian’s instinctual eye into a carefully spelled-out list of criteria for the program.

Revision

One area where digital text mining holds a clear advantage is revision. Hypothetically, if I realized at a later point that Ballard was also tallying births using another method (maybe a different abbreviated word), it’s fairly simple to add this to the program’s criteria, hit the “Run” button, and immediately see the updated figures for the number of deliveries. In contrast, it would be much, much more difficult to do so manually, especially if the realization came at, say, entry number 7,819. The prospect of re-skimming thousands of entries to update your totals would be fairly daunting.

Geeking Out with History

I’ve been meaning to blog about this for a while, but last month the Digital Youth Project released the results of their three-year study: “Hanging Out, Messing Around, and Geeking Out.” The project, funded by the John D. and Catherine T. MacArthur Foundation and built on collaborative scholarship between researchers in the UC system, set out to examine young people’s use of and interaction with new media in three realms: communication, learning, and play. The overall results are both fascinating and encouraging, and I’d recommend at least reading the two-page summary of their findings.

The title stems from the three modes of use the researchers identify. Hanging out is primarily social interaction between friends and peers, exemplified by social networking sites, instant messaging, or text messaging. The second mode, messing around, is a form of digital exploration and expression, exemplified by uploading videos or photos, trying out different online applications, or passing along discoveries (think Elf Yourself or LOLCats, for two admittedly trivial examples). The final mode, geeking out, is diving into a specific topic, finding a community of like-minded enthusiasts, and working towards a degree of expertise in the area.

For anyone interested in youth participation in new media, reading the white paper is a must. K-12 educators with even a passing interest in what their students are doing should take a glance at it. Upon first skimming it, I thought it did a great job of refuting several commonly-held perceptions about young people’s activity online. First, under the hanging out topic, of particular note is the refutation of a still-pervasive myth that kids  go online and end up primarily interacting with strangers. Instead, the researchers write, “With these ‘friendship-driven’ practices, youth are almost always associating with people they already know in their offline lives.” For the majority of young people, the idea of going into online chatrooms and striking up friendships with complete strangers is largely a relic of the past. With messing around, the researchers stress the fact that young people are not passive recipients of media, but they are increasingly participatory members of a community. There is a critical element of trial-and-error, as kids explore and incorporate (or reject) new activities. Finally, my favorite mode: geeking out. The authors highlight the important point that “one can geek out on topics that are not culturally marked as ‘geeky’.”

For some reason, this relatively innocuous assertion provoked a lot of thought on my end. “Geeking out” still carries strong cultural connotations, bringing to mind images of traditional nerd culture – see Timothy Burke’s recent post on Batman comics, in which he offers a disclaimer: “This entry is going to be the maximally geeky one.” But “geeking out” as a verb can increasingly apply to non-“geeky” subjects: sports-obsessed fantasy football participants, any and every kind of music enthusiast, political gossip and speculation, etc. This has been one of the true hallmarks of the internet: breaking down barriers to entry into extremely specialized fields of interest.

Which leads me to the title of this post: historians need to take advantage of the digital landscape to geek out with history. Without any exaggeration, I can confidently say that my own geeking out with history has contributed just as much to my identity as a “historian” as my semesters of traditional scholarly training. Subscribing to blogs or listening to podcasts will not replace formal instruction. But it can certainly enhance the learning process, and in my mind, it offers a higher ceiling for immediate participation and access. If a student writes an essay for a college course, most of the time the only reader will be the professor, and possibly some fellow students if it is a seminar. Meanwhile, if a student brings that same energy and enthusiasm to their subject online, they can read related thoughts from scholars around the world, exchange comments and dialogue with some of those scholars, or post that same essay as a blog post and receive feedback from a much greater number of readers.

On the research side, I think many academics are awakening to the vast potential for vertical exploration of historical source material. Fifteen years ago, doing research on a historical subject meant countless trips to archives and libraries, excursions that were largely hindered by geographical and financial considerations. Today, digitization projects have greatly streamlined the process of finding and accessing this material. In doing so, they are opening the door for anyone to geek out with history. Genealogists and armchair historians have always contributed greatly to the field, whether or not academics like to admit it. But in the years to come, the ability of non-professionals to do professional work will grow and grow. This is a double-edged sword: greater participation enhances the possibilities for collective intelligence and collaboration, while also running the risk of a “Barnes and Noble Syndrome,” an environment dominated by cream-puff analysis and a lack of rigorous interpretive context.

Academic historians need to get their hands dirty online. Read (and write) blogs, mine some data, listen to podcasts, enter a virtual world, upload media, explore databases, leave comments, and share your research. Take some chances and make mistakes. In short, geek out.

Towards a “History This” Command Line

Mozilla Labs recently released the 0.1 version of Ubiquity, a Firefox extension that allows the user to interact with and direct their browser through intuitive, written commands. Ubiquity has met with largely positive and excited reviews from the tech community, from folks at Lifehacker to Hackaday to Tools for Thought. The extension currently allows for a variety of commands. The common example that everyone likes to point to is the “map these” command: you select text, hit the keystroke to bring up Ubiquity, and type “map these,” which pops up an inline map of the selected locations.

From there, you can do a variety of things with the map itself, including navigating and moving around it, or inserting it into a separate page. You can also highlight text and, in Ubiquity, type “email this to _____,” which then searches through your Gmail contacts and sends the highlighted text to them. The most common example I’ve read involves looking for a restaurant at which to eat with a friend. You can highlight or type in the restaurant name, map it, look for reviews on Yelp, check your calendar for conflicts, and email an invitation to your friend with all of this information included.

Ubiquity interacts with a wide variety of sites through APIs, including YouTube, Weather, Yelp, Twitter, and Flickr. In addition, you can translate and define words, run calculations, export events to your calendar, count words in an article, or convert units. In many ways, it seems to blend the earlier function of Hyperwords (which I covered in a previous post) with the intuitive command line structure of Quicksilver (for Mac users) or Launchy (for Windows).

I immediately thought of interesting commands someone could write for engaging in historical research. Developing a Ubiquity command set for historians would go a long way towards encouraging traditionalists to finally break into digital history. Instead of reading scary words like Python or machine learning, a researcher with little technological background could hit a couple of keystrokes and be off and running with relatively in-depth analysis of digitized archival material. In many ways, Ubiquity could potentially act as a “gateway drug” for digital history. Of course, this all hinges on at least two things:

1) Quality, standardized digitization of source materials combined with quality, standardized open APIs. Dan Cohen has great arguments for the importance of a digitized collection like Google Books not only having an API, but having a good one.

2) Someone in the digital humanities would have to develop these tailored commands for different archives (Bill Turkel, you know you’re interested…). There’s already a Mozilla Labs wiki for creating new commands that looks relatively straightforward, but it would probably be beyond most members of the history community. I’m intrigued by the idea, but unfortunately my own forays into digital history programming have presently taken a backseat to applying to grad schools. Please let me know if anyone in the digital humanities is interested in this…

I feel that Ubiquity takes a substantial next step in the evolution of online interactivity. It’s admittedly buggy (although given its 0.1 version status, this will certainly get better), but it embodies so much of what is positive in today’s digital environment: namely open-source collaboration. Mozilla Labs actively encourages anyone and everyone to develop their own commands and to share them with others. This openness combines with an intuitive simplicity that makes it truly remarkable. As of right now, Ubiquity is a fantastic timesaver and cool trick, but it lacks depth. Almost anything you do in Ubiquity could be done before – just slower and with much less efficiency or ease of use. I have absolutely no doubt that as the open-source developer community jumps on board, this will change.

But for right now, what Ubiquity does best is begin to break down the barriers between computer geeks and laypeople. Some people have written about the irony of returning to the infant state of the computer interface: the command line. While interesting, the two instances are fundamentally different: not many people would know how to write even a simple program when faced with earlier command lines, but just about anyone I know can type “map this” into Ubiquity and get far more complex results. Even as programmers find new ways to write more and more advanced commands, ordinary Firefox users will adopt the basics of Ubiquity in greater and greater numbers. What I foresee in Ubiquity is part of a broader movement that shifts common computing further down the Web 2.0-blazed path of heightened and evolving user participation, control, and access. Instead of having the website developer determine how and where you can go, suddenly you are at the controls of an increasingly powerful and easy-to-use command center for accessing and manipulating data. And I can only dream of the day a grad student will be able to highlight some archival text, type “history this” into their command line, and have a fully-compiled dissertation written before their eyes.

Review: Placing History (III)

(This is the third installment of my review of Placing History. See the first and the second parts.)

I’ve finally finished Placing History: How Maps, Spatial Data, and GIS are Changing Historical Scholarship. As my previous posts have made clear, I’m quite impressed with the breadth and depth of the compilation. As before, I’ll briefly recount the remaining chapters, and wrap up my thoughts at the end.

“Mapping Husbandry in Concord: GIS as a Tool for Environmental History,” by Brian Donahue. I liked this chapter for a multitude of reasons. On a personal note, his research is quite similar (though wider in scale) to the work I did in mapping the property holdings and transactions of Venture Smith. So, in a self-congratulatory mood, I found myself nodding with satisfied agreement at his various points about the benefits and drawbacks of mapping land deeds and parcels. On a less personal level, I liked the various angles he took in pursuing his study of Concord – especially examining the seemingly disparate holdings of a variety of original families and noting patterns of land use.

“Combining Space and Time: New Potential for Temporal GIS,” by Michael Goodchild. For starters, the cover illustration for this chapter was a piece of Charles Minard’s famous “Carte Figurative,” which depicts a staggering array of geographic, temporal, and statistical information regarding Napoleon’s ill-fated Russian campaign:

[Figure: Charles Minard’s “Carte Figurative”]

Information graphic guru Edward Tufte described it as “the best statistical graphic ever drawn,” which effectively canonized it for any map and information graphic nerd such as myself. This is a roundabout way of saying I was excited to start reading Goodchild’s chapter. Goodchild doesn’t disappoint, as he draws on decades of experience in geography to explore ways in which the field is gradually shifting to incorporate temporal data. Although it’s heavy on technical geography, it’s a rewarding chapter that covers one of the fundamental challenges of historical GIS: how do you visually display the relationship between space and time? Goodchild predicts that this challenge will rapidly diminish as tools and systems for displaying dynamic data, or even a history-specific data model, become more and more accessible and widespread.

“New Windows on the Peutinger Map of the Roman World,” by Richard J.A. Talbert and Tom Elliott. Talbert and Elliott present an analysis of the Peutinger Map, a nearly seven-meter-long Roman map depicting the Mediterranean world and beyond, constructed around 300 CE:

[Figure: detail of the Peutinger Map]

I liked this chapter a lot, despite my complete unfamiliarity with the subject matter. The authors make compelling arguments backed by GIS analysis, such as: “the basis of the map’s design was not its network of land routes (as has always been assumed) but rather the shorelines and principal rivers and mountain ranges, together with the major settlements marked by pictorial symbols.” They present a quantitative analysis of routes, and utilize a histogram to further examine the segments and their distances.

“History and GIS: Implications for the Discipline,” by David J. Bodenhamer. This chapter, along with the first chapter and the conclusion, gives the best “big-picture” perspective on historical GIS. Bodenhamer describes the field of history as a whole, in particular the elements of it that relate to spatial analysis. He believes that in order for GIS to become a valuable historical tool, “it must do so within the norms embraced by historians…” GIS is well-situated to do so, because it uses a format for presenting information (the map) that historians are already familiar with, and its visualization and integration of information makes it easier to display the complexity of historical interpretation. He also discusses the challenges facing historical GIS. One point I really liked was that technology as a whole, and GIS in particular, often demands a level of precision that historical documents cannot supply, forcing fuzzy historical evidence into “a technology that requires polygons to be closed and points to be fixed by geographical coordinates.” Other challenges range from the theoretical (ex. temporal analysis) to the practical (ex. learning a completely new discipline). Finally, he succinctly sums up one of the greatest challenges: “GIS does not strike many historians as a useful technology because we are not asking questions that allow us to use it profitably.” I could not have said it better myself – until historians begin to ask the type of questions that can be addressed through spatial analysis, GIS will likely remain a technological oddity within the discipline.

“What Could Lee See at Gettysburg?” by Anne Kelly Knowles. This is probably one of the most accessible chapters in the book for a layperson. It combines engaging narrative prose with rich, stylish maps and a “popular” subject matter (the Battle of Gettysburg). More importantly, it clearly presents an answer to a historical question, while contextualizing the issue and suggesting directions for future studies. Viewshed (line-of-sight) analysis is of obvious and particular interest to military historians, but it has other implications as well. In particular, this chapter illustrates the phenomenal power of GIS to transport the reader to the past and convey a micro-level sense of “being” there.
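For anyone curious what line-of-sight analysis involves computationally, here is a toy Python sketch: it walks the straight line between an observer and a target across a small elevation grid and checks whether intervening terrain rises above the sightline. The grid and positions are my own inventions – Knowles’s actual analysis works from a detailed digital elevation model of the battlefield, not anything this crude:

def can_see(grid, obs, tgt, eye_height=1.7):
    """Toy line-of-sight test over a grid of elevations."""
    (r0, c0), (r1, c1) = obs, tgt
    z0 = grid[r0][c0] + eye_height   # observer's eye elevation
    z1 = grid[r1][c1]                # target elevation
    steps = max(abs(r1 - r0), abs(c1 - c0))
    for i in range(1, steps):
        t = i / steps
        sight_z = z0 + t * (z1 - z0)  # sightline elevation at this point
        # Terrain elevation at the nearest grid cell along the line.
        terrain_z = grid[round(r0 + t * (r1 - r0))][round(c0 + t * (c1 - c0))]
        if terrain_z > sight_z:
            return False  # a ridge blocks the view
    return True

# Invented terrain: a small ridge sits between observer and target.
elevations = [
    [100, 100, 100, 100],
    [100, 130, 100, 100],
    [100, 100, 100, 100],
]
print(can_see(elevations, (0, 0), (2, 3)))  # prints False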

Beyond thoroughly enjoying Placing History, I believe it’s an important contribution to historical methodology in general and (of course) historical GIS in particular. The compilation strikes a wonderful balance while thoroughly exploring the topic: the field’s current state and background, case studies ranging from micro to macro and “hard” to “soft,” discussions of theory and approach, and an outline for the future. I recommend the book to educators, historians, digital humanists, and anyone with even a passing interest in a growing and valuable area of scholarship.

Scattered Links – 7/20/2008

Bill Turkel wrote a thought-provoking post titled “Towards a Computational History.” I agree completely with his section on collective intelligence. A lot of digital history writing spells out the methodology of tools and technology, but the more theoretical shift in how information is produced and disseminated is just as important to the future of the field.

Eric Rauchway wrote a great article for The New Republic explaining why parallels between John McCain and Teddy Roosevelt fall flat.

One of the pleasant benefits of taking a break from school and having a 9-5 job (along with a peaceful 45-minute metro commute each way) is that I can read a ton of books that aren’t assigned to me by a course syllabus. A Pulitzer Prize, along with some interesting blog reviews, has placed Daniel Walker Howe’s What Hath God Wrought on my short list. For similar reasons, Kate Summerscale’s The Suspicions of Mr. Whicher has been added as well.

Finally, Matthias Schulz of Spiegel Online has an interesting article on how the myth of Cyrus II as a pioneer of human rights developed. Schulz attacks this particularly insidious piece of propaganda, and isn’t afraid to take issue with heavyweights such as the United Nations and Nobel Peace Prize recipient Shirin Ebadi. The historian in me appreciates his revisionism, but would just like to see his sources.

Review: Placing History (II)

(This is the second installment of my review of Placing History. See the first and the third parts.)

I’ve just finished reading about half of Placing History: How Maps, Spatial Data, and GIS are Changing Historical Scholarship, edited by Anne Kelly Knowles. I’ll briefly go through each of the chapters I’ve read so far, focusing on the ones that particularly interested me.

“Creating a GIS for the History of China,” by Peter K. Bol. Bol, chair of Harvard’s Department of East Asian Languages and Civilizations, discusses his China Historical GIS project. The project attempts to create a basic framework and data source (both spatial and temporal) for geospatial analysis of Chinese history. On a theoretical note, Bol argues that in the case of China, historical GIS should rely more heavily on point data than on polygons for marking boundaries and territory, in order to better replicate the top-down administrative system of traditional Chinese cartography.
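As a rough illustration of what a point-based, time-aware approach might look like in practice, here is a short Python sketch. The records and field names are my own inventions for illustration, not the actual China Historical GIS schema:

from dataclasses import dataclass

@dataclass
class AdminSeat:
    """An administrative seat as a point with a span of validity."""
    name: str
    lat: float
    lon: float
    begin_year: int   # first year the seat is attested
    end_year: int     # last year the seat is attested

# Invented example records.
seats = [
    AdminSeat("Hypothetical Prefecture A", 34.3, 108.9, 618, 907),
    AdminSeat("Hypothetical Prefecture B", 30.6, 114.3, 960, 1279),
]

def seats_in_year(records, year):
    """Return the administrative points valid in a given year."""
    return [s for s in records if s.begin_year <= year <= s.end_year]

print([s.name for s in seats_in_year(seats, 700)])  # ['Hypothetical Prefecture A']

The appeal of this model is that a point plus a date range sidesteps the problem of drawing precise boundary polygons for periods when no precise boundaries were ever recorded.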

“Teaching With GIS,” by the late Robert Churchill and Amy Hillier. Churchill gives a good overview of the value of GIS in a liberal arts education. I especially liked his point that any in-depth use of historical GIS requires an equally in-depth understanding of the problem you’re looking to address. Because so much of GIS is front-loaded – you spend a huge amount of effort obtaining and managing the data – it requires you to really get your hands dirty in the sources themselves. Hillier gives a lot of great examples of students’ work using historical GIS, mostly with Philadelphia-based data, including a great 1896 map by W.E.B. Du Bois detailing social class in the city. She also gives some useful tips for educators who want to incorporate GIS.

“Scaling the Dust Bowl,” by Geoff Cunfer. I loved this chapter. Cunfer follows up his research in Knowles’s first book, Past Time, Past Place, with additional analysis of the Dust Bowl. In this chapter, he takes on the common perception of the Dust Bowl championed by Donald Worster’s Dust Bowl: The Southern Plains in the 1930s. While some of Cunfer’s analysis supports Worster, he takes issue with Worster’s widely accepted assertion that the capitalistic over-development of land for farming was the major factor behind the fabled 1930s dust storms. Cunfer first demonstrates through spatial analysis that, although plow-up during the 1920s did contribute to the Dust Bowl, instances of drought correlated far more directly with the storms.

He goes on to further his critique of the notion that the Dust Bowl was an extraordinary phenomenon caused by human activity. By examining and mapping newspaper accounts of dust storms from the nineteenth century, along with storms after the 1930s, he finds that “dust storms are a normal part of southern plains ecology, occurring whenever there are extended dry periods.” Although extensive plowing can worsen the problem, it was not “the sole and simple cause of the Dust Bowl.” Cunfer’s analysis succeeds on several levels. First, I like its accessibility. There’s always a temptation to include too much in the final product, to show off the fruits of your hours and hours of labor. Instead, his maps are clear, uncluttered, and persuasive. Second, I like the way he blended a traditionally quantitative analysis tool (GIS) with qualitative historical research (newspaper accounts). He does a good job of highlighting this tension, aptly warns of its dangers, and explains simply how he managed it. Third, his work is a great example of the “right way” to use new technology to both challenge and supplement traditional historical arguments, and in doing so to present an original and different narrative.
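The underlying logic of that comparison is simple enough to sketch in a few lines of Python – correlate annual storm counts against a drought measure and against plow-up, and see which relationship is stronger. Every number below is invented for illustration; none of it is Cunfer’s data:

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Invented yearly values for a hypothetical county.
storm_counts  = [2, 5, 14, 22, 9, 3]   # dust storms per year
drought_index = [1, 3, 8, 9, 4, 2]     # dryness measure
plowed_share  = [5, 6, 6, 7, 7, 8]     # share of land plowed

print("storms vs drought:", round(pearson(storm_counts, drought_index), 2))
print("storms vs plow-up:", round(pearson(storm_counts, plowed_share), 2))

In this toy example the storm counts track the drought index far more closely than the steadily rising plow-up figures – the same shape of argument Cunfer makes with real county-level data and maps.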

“‘A Map is Just a Bad Graph’: Why Spatial Statistics are Important in Historical GIS,” by Ian Gregory. This chapter is much more technical, and includes scary words like “regression coefficients” and “heteroscedasticity.” Although statistics in particular, and math in general, are low on my list of skills, I got a fair amount out of the chapter. I liked his critique of the traditional thematic map, which usually displays a single type of data with only one variable involved. Statistical analysis can go beyond simple thematic maps and really open up the powerful underbelly of GIS.
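For the statistically timid (myself included), the basic move can be sketched simply in Python: fit a regression across spatial units, then look at what the raw map doesn’t show, such as the residuals. The “county” values below are invented, and this toy fit ignores the spatial autocorrelation and heteroscedasticity issues Gregory’s chapter actually grapples with:

import numpy as np

# Invented per-county observations.
pct_farmland = np.array([10, 25, 40, 55, 70, 85])
pop_density  = np.array([950, 700, 520, 300, 180, 90])

# Ordinary least-squares fit of population density against farmland share.
slope, intercept = np.polyfit(pct_farmland, pop_density, 1)
residuals = pop_density - (slope * pct_farmland + intercept)

print(f"slope={slope:.1f}, intercept={intercept:.1f}")
# Mapping the residuals, rather than the raw values, is one way a map
# can say more than a single-variable choropleth.
print("residuals:", np.round(residuals, 1))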

There are several more chapters that I am looking forward to reading and reviewing in a later post.

Review: Placing History (I)

(This is the first installment of my review of Placing History. See the second and the third parts.)

I finally got around to sitting down with Placing History: How Maps, Spatial Data, and GIS are Changing Historical Scholarship, edited by Anne Kelly Knowles. The book addresses the growing field of combining Geographic Information System (GIS) software with historical scholarship. ((Technical aside: GIS is a broad term for digital analysis of geographic information – most commonly used for making maps – that allows users to input, store, and analyze a huge range of spatial data in a mind-boggling number of ways. Personal aside: I have been using GIS software for about two years – I employed it extensively in my undergrad research and took a geology class constructed almost entirely around ArcGIS analysis.)) Composed of individual studies by a variety of scholars, it’s the successor to Knowles’s 2002 volume, Past Time, Past Place: GIS for History. The chapters of Placing History range from the quantitatively analytical “Scaling the Dust Bowl” and “Mapping Husbandry in Concord: GIS as a Tool for Environmental History,” to the more big-picture, theory-based “Combining Space and Time: New Potential for Temporal GIS.”

Originally I was planning on finishing all of the chapters before posting a review of the volume as a whole, but I was too blown away by the book’s introductory chapter, “GIS and History,” written by Knowles, to wait. She gives a wide-ranging yet deep analysis of the field. Knowles begins the chapter with the optimistic assertion that “scholars’ use of geographic information systems (GIS) is changing the practice of history.” From there she gives a brief history of the field, then delves into its current state. I think her greatest accomplishment in this chapter is to balance the obstacles facing historical GIS against its huge potential for innovation and scholarship.

On the obstacle end, she writes that one major impediment to historical GIS is the fundamental divide between time and space – history is largely the study of subjects within a temporal framework, whereas GIS works largely within a spatial one. And she admits that, “For all practical purposes, historical GIS remains an ad hoc subfield that scholars discover serendipitously.” One reason may be a common complaint historians level at geography: that maps are too often treated as stand-alone, objective vessels of information. Instead, Knowles makes the great point that any serious use of historical GIS requires examining and interrogating spatial source material just as rigorously as a historian would any diary, letter, or tax record.

Nevertheless, Knowles does a great job of clearly outlining both the advances that have been made and the possibilities for the future. I agree with her basic outline of the three types of historical GIS currently practiced:

1. History of land use and spatial economy, ex. outlining agricultural shifts in response to economic or environmental changes.

2. Reconstructing past landscapes, ex. analyzing Robert E. Lee’s line-of-sight (what he could see) during the Battle of Gettysburg.

3. Infrastructure projects, ex. scientists compiling historical land-use datasets in order to track global warming.

In actuality, though, it is nearly impossible to generalize the range of possibilities for historical GIS. The major constraint is really one of imagination and resources – are people aware of all its possibilities, and do they have access to the software and expertise? Finally, she struck a real personal chord in me with her observation that “The most exciting thing about historical GIS is often the ‘eureka’ moment when someone sees data mapped for the first time.” Much like discovering a long-sought-after name or date or reference within a manuscript or microfilm, suddenly witnessing your hard work take a physical, visible shape on a computer screen is truly special.

I’m looking forward to reading the rest of the case studies and writing up a brief review, but for a superb introduction to the field of historical GIS, I couldn’t ask for anything better than what Knowles has produced in the opening chapter. At some point I would like to write a post solely dedicated to brainstorming ideas about the ways GIS could be utilized for history in particular and the humanities as a whole, in the vein of PhDinHistory’s blog post, “What I Would Like To See in Text Mining For Historians.”