The County Problem in the West

Happy GIS Day! Below is a version of a lightning talk I’m giving today at Stanford’s GIS Day.

Historians of the American West have a county problem. It’s primarily one of geographic size: counties in the West are really, really big. A “List of the Largest Counties in the United States” might as well be titled “Counties in the Western United States (and a few others)” – you have to go all the way to #30 before you find one that falls east of the 100th meridian. The problem this poses to historians is that a lot of historical data was captured at a county level, including the U.S. Census.

San Bernardino County

San Bernardino County is famous for this – the nation’s largest county by geographic area, it includes the densely populated urban sprawl of the greater Los Angeles metropolis along with vast swathes of the uninhabited Mojave Desert. To assign a single count of anything to San Bernardino County is to teeter on geographic absurdity. But, for nineteenth-century population counts in the national census, that’s all we’ve got.

TheWest_1871_Population-01-01

Here’s a basic map of population figures from the 1870 census. You can see some general patterns: central California is by far the most heavily populated area, with some moderate settlement around Los Angeles, Portland, Salt Lake City, and Santa Fe. But for anything more detailed, it’s not terribly useful. What if there were a way to get a more fine-grained look at settlement patterns in these gigantic western counties? This is where my work on the postal system comes in. There was a post office in (almost) every nineteenth-century American town. And because the department kept records for all of these offices – the name of the office, its county and state, and the date it was established or discontinued – a post office becomes a useful proxy to study patterns over time and space. I assembled this data for a single year (1871) and then wrote a program to geocode each office, that is, to identify its location by looking it up in a large database of known place-names. I then supplemented it with the salaries of postmasters at each office for 1871. From there, I could finally put it all onto a map:

TheWest_1871_PostOffices

The result is a much more detailed regional geography than that of the U.S. Census. Look at Wyoming in both maps. In 1870, the territory was divided into five giant rectangular counties, all of them containing fewer than 5,000 people. But its distribution of post offices paints a different picture: rather than vertical units, it consisted largely of a single horizontal stripe along its southern border.

Wyoming_census-02   Wyoming_postoffices-02

Similarly, our view of Utah changes from a population core of Salt Lake City to a line of settlement running down the center of the territory, with a cluster in the southwestern corner completely obscured in the census map.

Utah_census-01   Utah_postoffices-01

Post offices can also reveal transportation patterns: witness the clear skeletal arc of a stage-line that ran from the Oregon/Washington border southeast to Boise, Idaho.

Dalles_Boise

Connections that didn’t mirror the geographic unit of a state or county tended to get lost in the census. One instance of this was the major cross-border corridor running from central Colorado into New Mexico. A map of post offices illustrates its size and shape; the 1870 census map can only gesture vaguely at both.

ColoradoNewMexico_census-02   ColoradoNewMexico_postoffices-02

The following question, of course, should be asked of my (and any) map: what’s missing? Well, for one, a few dozen post offices. This speaks to the challenges of geocoding more than 1,300 historical post offices, many of which might have only been in existence for a single year or two. I used a database of more than 2 million U.S. place-names and wrote a program that tried to account for messy data (spelling variations, altered state or county boundaries, etc.). The program found locations for about 90% of post offices, while the remaining offices I had to locate by hand. Not surprisingly, they were missing from the database for a reason: these post offices were extremely obscure. Finding them entailed searching through county histories, genealogy message boards, and ghost town websites – a process that is simply not scalable beyond a single year. By 1880, the number of post offices in the West had doubled. By 1890, it had doubled again. I could conceivably spend years trying to locate all of these offices. So, what are the implications of incomplete data? Is automated, 90% accuracy “good enough”?
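To make the lookup step concrete, here is a minimal sketch in Python of that kind of matching. It is an illustration rather than the original program: the tiny `GAZETTEER` dictionary, its rounded coordinates, the `normalize` and `geocode` helpers, and the 0.8 similarity cutoff are all assumptions made for the example. It tries an exact match first, then falls back to fuzzy name matching within the same state (ignoring county, since county boundaries shifted over time), and returns nothing when an office would have to be located by hand.

```python
from difflib import get_close_matches

# Hypothetical gazetteer: (name, county, state) -> (latitude, longitude).
# The entries and coordinates are illustrative placeholders, not real data.
GAZETTEER = {
    ("cheyenne", "laramie", "WY"): (41.14, -104.82),
    ("laramie city", "albany", "WY"): (41.31, -105.59),
    ("saint george", "washington", "UT"): (37.10, -113.58),
}


def normalize(name):
    """Lowercase, collapse whitespace, and expand one common abbreviation."""
    name = " ".join(name.lower().split())
    if name.startswith("st. "):
        name = "saint " + name[4:]
    return name


def geocode(name, county, state, cutoff=0.8):
    """Return (lat, lon, method), or None if the office must be found by hand."""
    key = (normalize(name), normalize(county), state)
    if key in GAZETTEER:
        return GAZETTEER[key] + ("exact",)
    # Fuzzy fallback: compare names within the same state only, ignoring
    # county because county boundaries changed over time.
    candidates = {k[0]: k for k in GAZETTEER if k[2] == state}
    close = get_close_matches(key[0], list(candidates), n=1, cutoff=cutoff)
    if close:
        return GAZETTEER[candidates[close[0]]] + ("fuzzy",)
    return None
```

Recording the match method alongside the coordinates makes it possible to review the fuzzy matches separately, which matters when deciding whether automated results are "good enough."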

What else is missing? Differentiation. The salary of a postmaster partially addresses this problem, as the department used a formula to determine compensation based partially on the amount of business an office conducted. But it was not perfectly proportional. If it were, the map would be one giant circle covering everything: San Francisco conducted more business than any other office by several orders of magnitude. As it is, the map downplays urban centers while highlighting tiny rural offices. A post office operates in a kind of binary schema: no office, no people (well, at least very few). If there was an office, there were people there. We just don’t know how many. The map isn’t perfect, but it does start to tackle the county problem in the West.

*Note: You can download a CSV file containing post offices, postmaster salaries, and latitude/longitude coordinates here.*

Story of a Thesis, Part 4: Maturation

(This is the fourth installment of a multi-part post detailing my undergraduate thesis. See part one, part two, and part three.)

As I continued my research into Venture Smith’s life as a free man, GIS allowed me to construct a visual narrative of his forty years in Haddam, CT. After reconstructing each of his real-estate transactions, I was left with a surprisingly nuanced and revealing portrait of a free black man riding the tumultuous waves of Revolutionary and post-Revolutionary New England. In this post I will present a selected sample of some of these recreated transactions, and briefly discuss what they reveal about Smith’s life and the broader world in which he lived.

By 1778, Smith had gone from the owner of a meager ten-acre parcel of low-quality land to the proprietor of a sprawling 128 acres:

Venture Smith Property - 1778

The fifty-year-old Smith faced the enviable prospect of having simply too much land to effectively use. Consequently, later that year he sold a twelve-acre tract to two free black men named Whacket and Peter:

Sale to Whacket and Peter - 1778

In effect, the modest real-estate transaction provided Smith with four additional laborers for his land (Whacket, Peter, and their two wives), while allowing him to recreate a semblance of black communal life in an overwhelmingly white town and region. On several occasions in his narrative, Smith mentions buying the freedom of slaves. In exchange, the men would work under Smith for a period of time. His sale of land to Whacket and Peter marks yet another instance of utilizing black labor, possibly in the tradition of “pawnship,” a West African practice described by Paul Lovejoy and David Richardson. The 1778 real-estate transaction offers a glimpse into both the economic and social motivations of a black man deftly maneuvering within a white world.

Another revealing transaction occurred in 1787, when Smith embarked on a joint business venture with a local man named William Ackley. Smith leased a small island in the nearby Salmon River to Ackley, and in the deed, spelled out with precise detail a contract for the two men to construct a fishing seine on the island. The enterprise was divided equally, with each man supplying half the labor and equipment, including lead, hair for ropes, twine for nets, a boat, and general repairs. In attempting to geographically locate this deed, I turned yet again to GIS. The deed spelled out its location as “off of Beaver Point.” After finding a nineteenth-century map that labeled Beaver Point, I knew roughly where the island was. Unfortunately, the GIS datasets I had been using didn’t adequately portray the island. This time, I employed aerial photographs of the region in order to locate the island:

Lease to William Ackley - 1787

As is often the case, GIS offered up as many questions as answers: The island wasn’t entirely contiguous to his property, so how did Smith come to own its leasing rights? Was it a clause within a previous land deed, or was it an entirely separate transaction? I never found answers to these questions, and this investigative process provided me with the valuable (and frustrating) realization of the limits of historical inquiry. Instead, what the transaction did reveal was the phenomenally diverse activities of an independent property owner in the eighteenth century. Beyond fishing in the river, Smith engaged in prolific woodcutting, tended an orchard, raised livestock, and engaged in trade throughout southern Connecticut and Long Island Sound.

Of course, as with almost any rural inhabitant with a large tract of land, Smith was a farmer. In order to investigate his agricultural pursuits (which both deeds and court files allude to), I looked for farming data I could use in GIS. Fortunately, the US Department of Agriculture created an extensive dataset of soil quality data for the state of Connecticut. I imported this data into GIS and overlaid it onto Smith’s property, creating a precise summary of Smith’s agricultural activity:

Soil Quality Data of Venture Smith's Property - 1790

I filtered the data to isolate only high-quality soil well-suited for farming. Two factors go into this characterization: the slope of the land, and content of the soil. Armed with this information, I found that Smith enjoyed a particularly rich area of farmland in the upper region of his property. I could personally attest to the suitability of this area, as I had walked through it several times:

The land lent Smith several advantages. It was a short walk to both his homestead and the Salmon River, allowing for easy access, storage, and transportation of goods. In lieu of employing an official currency (which was famously wracked with inflation at the time), Smith, along with much of the rural populace, often utilized goods and produce as a means of exchange. His lucrative pasture provided him and his family with not only sustenance, but a means of obtaining goods with which to participate in the regional economy. As a black man and former slave, his land granted him a critical foothold within the dominant economic framework of rural New England.

Without GIS, I never would have been able to effectively analyze the relationship between Venture Smith’s freedom and his property. Instead of making abstract conjectures based solely on written primary documents, I was able to add a visual and quantitative element to my investigation. Suddenly I could answer with precise and revealing detail the questions of where, what, how much, and to what degree. I could now recognize patterns, rebuild processes, and craft a visual construction of Smith’s land. With GIS at my fingertips, “ten acres of land” was no longer a set of words on a yellowed property deed, but became a deeply nuanced story of where and what the land consisted of, how it reflected and revealed Venture Smith’s motivations and decisions as a free man, and finally his place within the wider world of late eighteenth-century rural New England.

Story of a Thesis, Part 3: Growth

(This is the third installment of a multi-part post detailing my undergraduate thesis. See part one, part two, and part four.)

When faced with the challenge of exploring the real-estate transactions and land holdings of Venture Smith, I ran up against the methodological barrier of analog technology. Reading the boundary descriptions was not enough. Neither was drawing them out by hand on a piece of paper. I needed the accuracy, fluidity, and versatility of a digital environment. This challenge led me down the path towards GIS, and I introduced myself to Beverly Chomiak, a geology professor at Connecticut College who kindly let me into her computer lab and showed me the basics of the software.

It was overwhelming at first, and the simple polygons I created in the beginning felt a lot like a student driver inching their way around an empty parking lot in a Porsche. I could literally feel the power of the software, as the computer’s hard drive frantically whirred and spun just to boot up the program. But what I was doing with it was almost comically simple. As I grew more comfortable with the interface, I began to explore, and soon hit that eureka moment of placing a series of puzzle pieces together: by creating polygons of neighboring parcels and overlaying them onto a basic map of the general area where I knew his property was located, I could place his first purchase in Haddam:

Once I had those pieces in place, I quickly learned it was a matter of finding data to add to the system. Next up was a hydrography layer, which gave phenomenally detailed information about various bodies of water across the state:

Specifically, this layer revealed something important: Venture Smith’s first purchase in the town, besides being small and narrow, had its eastern portion in a swampy marsh called Dibble’s Creek. Back at Pomona, I enlisted the generous help of Warren Roberts, GIS specialist at the Claremont Colleges’ library. He suggested I look into topography layers, and showed me how to obtain a Digital Elevation Model (DEM) for the region online. DEMs allow the user to minutely examine the elevation and slope of the land, and the regional DEM for Haddam, CT turned out to be exceedingly well-detailed:

Like the hydrography layer, this additional information provided another insight into the quality of his land for that first 1775 purchase: it was incredibly hilly, especially in the eastern portion near the river. Warren also showed me how to create an elevation profile, as if one were walking from west to east across the narrow parcel:

The end conclusion was that this piece of land was not particularly valuable, especially for agriculture: marshy on one end, hilly in the middle, and with a steep bank on the other side. This evidence, supported by a clause within the deed itself, pointed towards Smith using this first parcel of real estate for two purposes: as a spatial placeholder within the town, and as a base of operations for his more lucrative pursuit: cutting timber.
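For readers curious about what an elevation profile involves, here is a minimal sketch in Python. It is not the GIS tool used for the thesis: the `dem` grid, its elevation values, and the 10-meter cell size are made-up assumptions, and the function names are hypothetical. The sketch treats one row of a DEM as a west-to-east walk and reports the slope between neighboring cells.

```python
import math


def elevation_profile(dem, row, cell_size):
    """Return (distance, elevation) pairs walking west to east along one DEM row."""
    return [(col * cell_size, elev) for col, elev in enumerate(dem[row])]


def slopes_in_degrees(profile):
    """Slope between consecutive profile points, in degrees (positive = uphill)."""
    out = []
    for (d0, z0), (d1, z1) in zip(profile, profile[1:]):
        out.append(math.degrees(math.atan2(z1 - z0, d1 - d0)))
    return out


# Hypothetical 3x5 grid of elevations in meters with 10 m cells; values are made up.
dem = [
    [12.0, 14.0, 20.0, 31.0, 18.0],
    [10.0, 10.0, 20.0, 30.0, 10.0],
    [11.0, 13.0, 19.0, 28.0, 12.0],
]
profile = elevation_profile(dem, row=1, cell_size=10.0)
slopes = slopes_in_degrees(profile)
```

In this invented example the middle row is flat at its western end, climbs steadily through the middle, and drops sharply at its eastern end, the same kind of shape that marks a parcel as poor farmland.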

Warren then taught me how to use the DEM data to render a beautiful, shaded effect. Using the elevation data, GIS can create an artificial light source and “raise up” the land to create shadows and highlights. After getting in touch with my artistic background and playing around with transparency, topo lines, and color schemes, I managed to create something that I thought looked pretty good:
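The shading idea itself can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm any particular GIS package uses: it assumes a basic Lambertian model in which brightness depends only on the angle between the ground surface and the light, and the function name, the default light position (northwest, 45 degrees above the horizon), and the gradient inputs are all assumptions for the example.

```python
import math


def hillshade(dzdx, dzdy, azimuth_deg=315.0, altitude_deg=45.0):
    """Lambertian shade in [0, 1] for a surface with the given gradient.

    dzdx is the rise per unit distance toward the east, dzdy toward the north.
    azimuth_deg is the compass direction of the light source (clockwise from
    north); altitude_deg is its angle above the horizon.
    """
    az = math.radians(azimuth_deg)
    alt = math.radians(altitude_deg)
    # Unit vector pointing from the ground toward the light (x east, y north, z up).
    light = (
        math.sin(az) * math.cos(alt),
        math.cos(az) * math.cos(alt),
        math.sin(alt),
    )
    # Unit normal of the surface z = f(x, y).
    length = math.sqrt(dzdx ** 2 + dzdy ** 2 + 1.0)
    normal = (-dzdx / length, -dzdy / length, 1.0 / length)
    dot = sum(n * l for n, l in zip(normal, light))
    # Surfaces facing away from the light are fully shadowed.
    return max(0.0, dot)
```

Slopes that face the light come out bright and slopes that face away come out dark, which is what produces the raised, three-dimensional look when the values are drawn as a grayscale layer beneath the map.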

While I had spent countless hours examining Smith’s land, both on a computer screen and through on-site exploration, I realized that anyone reading my thesis would have only their imaginations and my flat, two-dimensional maps with which to recreate his property holdings. Fortunately, the seemingly limitless toolkit of GIS allowed me to build a 3-D tour of the land through the ever-handy DEM data:

[youtube=http://www.youtube.com/watch?v=iL28FO5WWcQ]

Beyond creating pretty pictures of Smith’s first two property transactions that I could later use as a visual supplement, GIS allowed for in-depth historical analysis of Venture Smith’s real estate. Without this tool, I would have no idea what his first ten-acre purchase actually consisted of. Instead, I knew that it was poor land, and with this knowledge, the technology gave me a glimpse into the motivations and perspective of the middle-aged Smith during those first two years in Haddam. It allowed me to recreate his experience: cutting wood on the side of a hill, moving his timber down the steep embankment and onto the cart path mentioned in the deed, and stockpiling it for transport downriver to a town market center. Using GIS, I knew that these first years were more precarious than Smith let on in his narrative. He faced the daunting prospect of providing for a wife and children, one of whom was a newborn, moving to a new town as black outsiders, and settling onto a narrow strip of land with little to no value. All of this occurred against the backdrop of a quickly-erupting war between the colonies and Great Britain. Armed with the toolkit of GIS and peering through the lens of property and land, I was ready to construct my own narrative of Venture Smith’s life as a free man.

Review: Placing History (III)

(This is the third installment of my review of Placing History. See the first and the second parts.)

I’ve finally finished Placing History: How Maps, Spatial Data, and GIS are Changing Historical Scholarship. As my previous posts have made clear, I’m quite impressed with the breadth and depth of the compilation. As before, I’ll briefly recount the remaining chapters, and wrap up my thoughts at the end.

“Mapping Husbandry in Concord: GIS as a Tool for Environmental History,” by Brian Donahue. I liked this chapter for a multitude of reasons. On a personal note, his research is quite similar (though wider in scale) to the work I did in mapping property holdings and transactions of Venture Smith. So in a self-congratulatory mood, I found myself nodding with satisfied agreement at his various points about the benefits and drawbacks to mapping land deeds and parcels. On a less personal level, I liked the various angles he took in pursuing his study of Concord – especially examining seemingly disparate holdings of a variety of original families and noting patterns of land use.

“Combining Space and Time: New Potential for Temporal GIS,” by Michael Goodchild. For starters, the cover illustration for this chapter was a piece of Charles Minard’s famous “Carte Figurative,” which depicts a staggering array of geographic, temporal, and statistical information regarding Napoleon’s ill-fated Russian campaign:

Charles Minard's "Carte Figurative"

Information graphic guru Edward Tufte described it as “the best statistical graphic ever drawn,” which effectively canonized it for any map and information graphic nerd such as myself. This is a roundabout way of saying I was excited to start reading Goodchild’s chapter. Goodchild doesn’t disappoint, as he uses decades of geography experience to explore ways in which the field is gradually shifting to incorporate temporal data. Although it’s heavy on technical geography, it’s a rewarding chapter that covers one of the fundamental challenges of historical GIS: how do you visually display the relationship between space and time? Goodchild predicts that this challenge will rapidly diminish, as tools and systems to display things such as dynamic data, or even a history-specific model, will become more and more accessible and widespread.

“New Windows on the Peutinger Map of the Roman World,” by Richard J.A. Talbert and Tom Elliott. Talbert and Elliott present an analysis of the Peutinger Map, a nearly 7-meter-long Roman map depicting the Mediterranean world and beyond, constructed around 300 CE:

Detail of Peutinger's Map

I liked this chapter a lot, despite my complete unfamiliarity with the subject matter. The authors make compelling arguments backed by GIS analysis, such as: “the basis of the map’s design was not its network of land routes (as has always been assumed) but rather the shorelines and principal rivers and mountain ranges, together with the major settlements marked by pictorial symbols.” They present a quantitative analysis of routes, and utilize a histogram to further examine the segments and their distances.

“History and GIS: Implications for the Discipline,” by David J. Bodenhamer. This chapter, along with the first chapter and conclusion, gives the best “big-picture” perspective on historical GIS. Bodenhamer describes the field of history as a whole, in particular elements of it that relate to spatial analysis. He believes that in order for GIS to become a valuable historical tool, “it must do so within the norms embraced by historians…” GIS is well-situated to do so, because it uses a format of presenting information (the map) that historians are already familiar with, and its visualization and integration of information makes it easier to display the complexity of historical interpretation. He also discusses the challenges to historical GIS. One point I really liked was that technology as a whole, and GIS in particular, often requires a level of precision that historical documents cannot display within “a technology that requires polygons to be closed and points to be fixed by geographical coordinates.” Other challenges range from the theoretical (ex. temporal analysis) to the practical (ex. learning a completely new discipline). Finally, he succinctly sums up one of the greatest challenges: “GIS does not strike many historians as a useful technology because we are not asking questions that allow us to use it profitably.” I could not have said it better myself – until historians begin to ask the type of questions that can be addressed through spatial analysis, GIS will likely remain a technological oddity within the discipline.

“What Could Lee See At Gettysburg?” by Anne Kelly Knowles. This is probably one of the most accessible chapters in the book for a layperson. It combines engaging narrative prose with rich, stylistic maps, and a “popular” subject matter (the Battle of Gettysburg). But more importantly, it clearly presents an answer to a historical question, while contextualizing the issue and presenting possible ideas for future studies. Viewshed (line-of-sight) analysis is of obvious and particular interest to military historians, but it has other implications as well. In particular, this chapter illustrates the phenomenal power of GIS to transport the reader to the past, and get a micro sense of “being” there.
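The core idea behind line-of-sight analysis can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the method used in the chapter: it works along a single one-dimensional terrain profile rather than a full elevation surface, ignores vegetation and the curvature of the earth, and the function name, parameters, and sample elevations are all hypothetical.

```python
def visible_points(profile, observer_height=0.0):
    """Return a list of booleans: is each profile point visible from the first?

    profile is a list of (distance, elevation) pairs sorted by distance,
    starting at the observer's position.
    """
    d0, z0 = profile[0]
    eye = z0 + observer_height
    visible = [True]  # the observer's own position
    max_slope = float("-inf")
    for d, z in profile[1:]:
        slope = (z - eye) / (d - d0)
        # A point is visible if the sight line to it is at least as steep as
        # the steepest sight line to any terrain in between.
        visible.append(slope >= max_slope)
        max_slope = max(max_slope, slope)
    return visible


# Made-up profile in meters: a dip, a ridge, lower ground behind it, a far hill.
terrain = [(0, 100), (100, 90), (200, 110), (300, 105), (400, 130)]
result = visible_points(terrain, observer_height=2.0)
```

Here the ground just behind the ridge at 200 meters is hidden, while the taller hill at 400 meters rises back into view. A full viewshed repeats this test along sight lines radiating in every direction from the observer.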

Beyond thoroughly enjoying Placing History, I believe it’s an important contribution to the field of historical methodology in general, and (of course) historical GIS in particular. The compilation gives a wonderful balance while thoroughly exploring the topic: its current state and background, case studies ranging from micro to macro and “hard” to “soft”, discussions on theory and approach, and an outline for the future. I recommend the book to educators, historians, digital humanists, or anyone with even a passing interest in a growing and valuable area of scholarship.

Review: Placing History (II)

(This is the second installment of my review of Placing History. See the first and the third parts.)

I’ve just finished reading about half of Placing History: How Maps, Spatial Data, and GIS are Changing Historical Scholarship, edited by Anne Kelly Knowles. I’ll briefly go through each chapter I’ve read so far, and focus on the ones that particularly interested me.

“Creating a GIS for the History of China,” by Peter K. Bol. Bol, chair of Harvard’s Department of East Asian Languages and Civilizations, discusses his China Historical GIS project. The project attempts to create a basic framework and data source (both spatial and temporal) for geospatial analysis of Chinese history. On a theoretical note, Bol argues that in the case of China, historical GIS should utilize a greater reliance on point data in place of polygons for marking boundaries and territory, in order to better replicate the top-down administrative system of traditional Chinese cartography.

“Teaching With GIS,” by the late Robert Churchill and Amy Hillier. Churchill gives a good overview of the value of GIS in a liberal arts education. I liked his point that one of the benefits of using historical GIS is that any in-depth use of the technology requires an equally in-depth understanding of the problem you’re looking to address. Great point. Because so much of GIS is front-loaded, in that you spend a huge amount of effort in obtaining and managing the data, it requires you to really get your hands dirty in the sources themselves. Hillier gives a lot of great examples of students’ work using historical GIS, mostly Philadelphia-based data. Some of them also included a great 1896 map by W.E.B. Du Bois detailing social class in the city. She also gives some useful tips for educators who want to incorporate GIS.

“Scaling the Dust Bowl,” by Geoff Cunfer. I loved this chapter. Cunfer follows up his previous research in Knowles’s first book, Past Time, Past Place, with additional analysis of the Dust Bowl. In this chapter, he takes on the common perception of the Dust Bowl as championed by Donald Worster’s Dust Bowl: The Southern Plains in the 1930s. While some of Cunfer’s analysis supports Worster, he takes issue with Worster’s commonly held assertion that the capitalistic over-development of lands for farming was the major factor in the fabled 1930s dust storms. Cunfer first demonstrates through spatial analysis that, although plow-up during the 1920s did contribute to the Dust Bowl, it was in fact instances of drought that had a much more direct correlation.

He goes on to further his critique of the notion that the Dust Bowl was an extraordinary phenomenon caused by human activity. By examining and mapping newspaper accounts of dust storms from the 19th century, along with storms after the 1930s, he finds that “dust storms are a normal part of southern plains ecology, occurring whenever there are extended dry periods.” Although extensive plowing can enhance the problem, it was not “the sole and simple cause of the Dust Bowl.” Cunfer’s analysis succeeds on many different levels. First, I like the accessibility of it. There’s always a temptation to include too much in the final products, to show off the fruits of your hours and hours of labor. Instead, his maps are clear, uncluttered, and persuasive. Second, I like the way he blended traditionally quantitative analysis tools (GIS) with qualitative historical research (newspaper accounts). He does a good job of highlighting this tension, and aptly warns of its danger, while explaining simply how he accomplished it. Third, his work is a great example of the “right way” to use new technology to both challenge and supplement traditional historical arguments, and in doing so, present an original and different narrative.

“‘A Map is Just a Bad Graph’: Why Spatial Statistics are Important in Historical GIS,” by Ian Gregory. This chapter was much more technical, and included scary words like “regression coefficients” and “heteroscedasticity.” Although statistics in particular, and math in general, is low down on my list of skills, I got a fair amount out of the chapter. I liked his critique of the traditional thematic map, which usually displays one type of data, and with usually one variable involved. Statistical analysis can go beyond simple thematic maps and really open up the powerful underbelly of GIS.

There are several more chapters that I am looking forward to reading and reviewing in a later post.

Review: Placing History (I)

(This is the first installment of my review of Placing History. See the second and the third parts.)

I finally got around to sitting down with Placing History: How Maps, Spatial Data, and GIS are Changing Historical Scholarship, edited by Anne Kelly Knowles. The book addresses the growing field of combining Geographic Information System (GIS) software with historical scholarship. ((Technical aside: GIS is a broad term for digital analysis of geographic information – most commonly used for making maps – that allows users to input, store, and analyze a huge range of spatial data in a mind-boggling number of ways. Personal aside: I have been using GIS software for about two years – I employed it extensively in my undergrad research and took a geology class constructed almost entirely around ArcGIS analysis.)) Broken into chapters comprised of individual studies conducted by a variety of scholars, it’s the more modern version of Knowles’s 2002 volume, Past Time, Past Place: GIS for History. The chapters of Placing History range from the quantitatively analytical “Scaling the Dust Bowl” and “Mapping Husbandry in Concord: GIS as a Tool for Environmental History,” to the more big-picture, theory-based “Combining Space and Time: New Potential for Temporal GIS.”

Originally I was planning on finishing all of the chapters before I posted a review of the volume as a whole, but I was too blown away by the introductory chapter of the book, “GIS and History,” written by Knowles. She gives both a wide-ranging and deep analysis of the field. Knowles begins the chapter with the optimistic assertion that, “scholars’ use of geographic information systems (GIS) is changing the practice of history.” From there she gives a brief history of the field, then delves into its current state. I think her greatest accomplishment in this chapter is to balance the obstacles to historical GIS with its huge potential for innovation and scholarship.

On the obstacle end, she writes that one major impediment to historical GIS is the fundamental divide of time vs. space – history is largely a study of subjects within a temporal framework, whereas GIS works largely within a spatial one. And she admits that, “For all practical purposes, historical GIS remains an ad hoc subfield that scholars discover serendipitously.” One reason may be a common complaint of historians concerning geography: that maps are too often seen as stand-alone, objective vessels of information. Instead, Knowles brings up the great point that any serious use of historical GIS requires rigorous examination and discovery of spatial source material, as much as any historian would need to employ in utilizing any diary, letter, or tax record for their research.

Nevertheless, Knowles does a great job of clearly outlining both the advances that have been made and the possibilities for the future. I agree with her basic outlines of the three types of historical GIS currently used:

1. History of land use and spatial economy, ex. outlining agricultural shifts in response to economic or environmental changes.

2. Reconstructing past landscapes, ex. analyzing Robert E. Lee’s line-of-sight (what he could see) during the Battle of Gettysburg.

3. Infrastructure projects, ex. scientists compiling historical land-use datasets in order to track global warming.

In actuality, though, it is nearly impossible to generalize the range of possibilities for historical GIS. The major constraint is really one of imagination and resources – are people aware of all its possibilities, and do they have access to the software and expertise? Finally, she struck a real personal chord in me with her observation that “The most exciting thing about historical GIS is often the ‘eureka’ moment when someone sees data mapped for the first time.” Much like discovering a long-sought-after name or date or reference within a manuscript or microfilm, suddenly witnessing your hard work take a physical, visible shape on a computer screen is truly special.

I’m looking forward to reading the rest of the case studies and writing up a brief review, but for a superb introduction to the field of historical GIS, I couldn’t ask for anything better than what Knowles has produced in the opening chapter. At some point I would like to write a post solely dedicated to brainstorming ideas about the ways GIS could be utilized for history in particular and the humanities as a whole, in the vein of PhDinHistory’s blog post, “What I Would Like To See in Text Mining For Historians.”