This is part 2 of a 3-part blog series introducing the new historical business directories that you can explore in the Keweenaw Time Traveler.
Last week, we discussed general information on historic city and business directories. This week, we want to shift our focus to specific historic directories and how our team mapped residents in the Keweenaw Time Traveler Explore App.
What Copper Country directories exist, and where can you read them?
Our directories were published by two different printers. The earliest set of records is A. H. Holland’s Handbook and Guide to Hancock & Houghton, Michigan, which was published for 1887-1888.
By 1895-1896, R. L. Polk and Company began publishing their Houghton County Directory, replacing Holland, and continued publishing through 1939.
The University Archives and Historical Collections, housed in the Van Pelt and Opie Library’s Garden Atrium at Michigan Technological University, maintains the following collection of city directories:
Who was R. L. Polk?
Ralph Lane Polk was born on September 12, 1849, in Ohio and was later educated in New Jersey. He became a successful publisher in Detroit, Michigan, specializing in city business directories and state gazettes (Herringshaw 1914).
Herringshaw, Thomas William. Herringshaw’s National Library of American Biography: Contains Thirty-Five Thousand Biographies of the Acknowledged Leaders of Life and Thought of the United States; Illustrated with Three Thousand Vignette Portraits ... American Publishers’ Association, 1914.
The records within the Keweenaw Time Traveler are not static collections. We constantly discover new records for inclusion in the databases, and we constantly revise our understanding of the past as we incorporate these sources. We maintain a list of archival collections included in the Time Traveler databases.
How did we map the directory entries?
A single directory could easily exceed 1,200 pages, and preparing each for inclusion in the Time Traveler was a multi-step process requiring a great deal of time, money, and expertise.
1. Digitize the pages - Scan each page into digital form and run optical character recognition (OCR) to extract the text
2. Clean the data - Adjust the text files before parsing, for example by joining addresses that wrap across multiple lines into a single line
3. Parse the directory data - Parsing means breaking a string of text into its parts according to a set of formal rules, which lets a program recognize the address format
4. Perform automated geocoding - Automated geocoding is the computerized process of converting addresses into x, y coordinates for mapping
5. Perform manual geocoding - Manual geocoding is the human review of our matches for accuracy
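To make steps 2 and 3 more concrete, here is a minimal Python sketch of cleaning and parsing. It is an illustration only, not the code our team ran. It assumes that a wrapped line begins with whitespace and that an entry follows a simple "Surname Given, occupation, address" pattern; the sample names, the `join_wrapped_lines` and `parse_entry` helpers, and the pattern itself are all hypothetical, and real directory entries vary far more than this.

```python
import re

def join_wrapped_lines(lines):
    """Step 2 (cleaning): merge continuation lines into the entry above.

    Assumption: a continuation line starts with whitespace.
    """
    entries = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and entries:
            entries[-1] += " " + line.strip()
        else:
            entries.append(line.strip())
    return entries

# Step 3 (parsing): a hypothetical, simplified entry grammar:
#   Surname Given, occupation, address
ENTRY = re.compile(
    r"^(?P<surname>[^,\s]+)\s+(?P<given>[^,]+),\s*"
    r"(?P<occupation>[^,]+),\s*"
    r"(?P<address>.+)$"
)

def parse_entry(entry):
    """Return the entry's fields as a dict, or None if it does not fit."""
    match = ENTRY.match(entry)
    return match.groupdict() if match else None

# Made-up sample lines; the second line is a wrapped address.
raw = [
    "Example John, miner, res 123",
    "    Fifth St",
    "Sample Mary, teacher, bds 45 Pine St",
]
entries = join_wrapped_lines(raw)
records = [parse_entry(e) for e in entries]
```

Entries that do not fit the pattern come back as `None`, which is one simple way to set them aside for a person to look at.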
Team members including Gary Spikberg, Ankitha Pille, Elijah Pass, Dr. Robert Pastel, and Dr. Don Lafreniere presented a poster at the 2017 Society for Industrial Archeology conference explaining the methodology used to include directories within the Time Traveler. The poster is available for download as a high-quality PDF here.
As useful as OCR is, it is not perfect. We reviewed each entry, but the directories contain millions of characters, which makes obscure OCR errors difficult to locate. This is why you will occasionally find a strange character in an entry, such as the number “1” in place of the letter “I.” Our Citizen Historians help make the Time Traveler better by finding these errors during their research. Please email us any records you find that contain strange characters. Detailed information, or even a screen capture of the overall entry, helps us correct these rare OCR issues.
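For readers curious how a program might help hunt for errors like a “1” standing in for an “I,” here is a small Python sketch. It is not the review process our team used; it simply flags any word that mixes letters and digits, while skipping ordinals such as "5th" that legitimately appear in addresses. The `suspicious_tokens` function and the sample entry are hypothetical.

```python
import re

TOKEN = re.compile(r"[A-Za-z0-9]+")
# Ordinals such as "5th" or "22nd" mix digits and letters on purpose.
ORDINAL = re.compile(r"^\d+(st|nd|rd|th)$", re.IGNORECASE)

def suspicious_tokens(entry):
    """Return words that mix letters and digits, a common sign of an OCR slip."""
    flagged = []
    for token in TOKEN.findall(entry):
        has_digit = any(c.isdigit() for c in token)
        has_alpha = any(c.isalpha() for c in token)
        if has_digit and has_alpha and not ORDINAL.match(token):
            flagged.append(token)
    return flagged

# Made-up entry in which OCR has read the "i" in "miner" as "1".
flags = suspicious_tokens("Example John, m1ner, res 123 5th St")
```

A check like this only catches one family of errors, and it can also flag genuine mixed tokens such as a house number like "12A", so it would supplement human review rather than replace it.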
Matching the years of the Sanborn maps and the historic directories ensured that both used the same address sequences. Visitors will encounter different addresses for the same house depending on the map year, so it was important to avoid false-positive address matches as we developed the Time Traveler databases. Below is an example of how we match a 1917 business directory record to the historic Sanborn Fire Insurance Plan for Calumet in 1917.
The process is full of challenges. Addresses listed in a directory were sometimes incomplete or omitted, which meant there were instances when we could not confidently associate a person with a building. When that happened, we used the closest intersection or street based on the information listed in the directory. A record mapped in the middle of a street means we lacked a complete enough address to connect someone with a specific building.
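The year matching and the fallback described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production geocoder: the lookup tables, coordinates, street names, and the `geocode` function are all made up. Buildings are keyed by map year and address so that a renumbered house is matched only against the map from its own year. For brevity, the intersection and street tables are not keyed by year.

```python
from typing import NamedTuple, Optional

class Match(NamedTuple):
    x: float
    y: float
    precision: str  # "building", "intersection", or "street"

# Hypothetical lookup tables with made-up coordinates.
BUILDINGS = {
    (1908, "210 FIFTH ST"): (10.0, 20.0),
    (1917, "312 FIFTH ST"): (10.0, 20.0),  # same house, renumbered by 1917
    (1917, "210 FIFTH ST"): (4.0, 20.0),   # a different house in 1917
}
INTERSECTIONS = {frozenset({"FIFTH ST", "ELM ST"}): (12.0, 20.0)}
STREETS = {"FIFTH ST": (15.0, 20.0)}  # a point on the street centerline

def geocode(year, address=None, street=None, cross_street=None) -> Optional[Match]:
    """Try a building match for the same year, then fall back to coarser matches."""
    if (year, address) in BUILDINGS:
        return Match(*BUILDINGS[(year, address)], "building")
    if street and cross_street:
        key = frozenset({street, cross_street})
        if key in INTERSECTIONS:
            return Match(*INTERSECTIONS[key], "intersection")
    if street in STREETS:
        return Match(*STREETS[street], "street")
    return None  # nothing usable; leave the record unmapped

same_year = geocode(1917, address="312 FIFTH ST")
wrong_year = geocode(1908, address="312 FIFTH ST", street="FIFTH ST")
```

Here `same_year` lands on a building, while `wrong_year` falls back to the street because no 1908 building carries that number. Recording the precision alongside the coordinates lets a map show how confident each placement is.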
Preparing the directories for the Time Traveler represented one of our biggest expenses. It took more than 10 months to prepare these archival records for your use. The scanning, OCR, geolocating, quality inspection, and documentation all required extensive work. The process was a labor of love with plenty of challenges and lessons learned as we developed workflows and best practices. Through it all, we continually strove to build a Time Traveler that our communities would be proud of. We want to thank the National Endowment for the Humanities and Michigan Technological University's Department of Social Sciences for their financial support in building the Keweenaw Time Traveler.
What is next?
In post 3, we will give more details on the most recent records we added to the Time Traveler, business directories! We will also include examples of questions we might ask to critique our archival records.