What was advertised in a colonial American newspaper 250 years ago today?
“MRS. STEDMAN takes this method of returning her sincere thanks.”
Mrs. Stedman tells an interesting story (or, at least, parts of it) in her advertisement in the Georgia Gazette, but that story has been obscured by at least two factors. First, Stedman did not provide all of the details herself. Second, the quality of the original printing and subsequent photography and digitization makes the advertisement (and much of the rest of the issue) difficult to read. This second factor has ramifications when using digital surrogates to conduct research.
With some effort, the human eye can identify all the words in Stedman’s advertisement. Modern technologies, especially OCR (optical character recognition), often do not do as well, which hampers keyword searches of digitized documents. I located Stedman’s advertisement because I read through the entire June 3, 1767, edition of the Georgia Gazette. I wondered, however, whether keyword searches would yield this advertisement. I’ve been frustrated in the past when I had copies of advertisements from eighteenth-century newspapers that came from a particular database, yet keyword searches meant to determine how many times those advertisements appeared did not produce even the originals, although I knew very well that they existed.
I conducted an experiment with Stedman’s advertisement and the database in which I first encountered it, Readex’s America’s Historical Newspapers. What would happen if I did a keyword search for “Stedman,” limiting the date range to 1767 and considering only advertisements? This yielded only two results, both for “ALEX. STEDMAN” in the Pennsylvania Chronicle. The keyword search did not turn up an item that I knew very well had been printed in the Georgia Gazette and was part of the database!
I also recognized, however, that the problem may not have been the OCR but instead could have been a metadata error. What would happen in a keyword search for “Stedman” with the date range limited to 1767 but not specifying that advertisements were the only article types under consideration? That search yielded three results, two for “ALEX. STEDMAN” as above and one for “MRS. STEDMAN.” In this case, the problem was not the OCR but rather a human error made in coding the data. The advertisement had been mistakenly marked as a news item.
Still, I questioned the effectiveness of keyword searches for this particular advertisement because I have learned from experience that OCR often has difficulty with eighteenth-century print. I decided to try a search with an alternate keyword, one that a researcher might have been much more likely to use, depending on that researcher’s interests: “tuition.” With the date range limited to 1767 and no specifications for particular article types, a keyword search for “tuition” yielded forty-seven results. It did not, however, uncover Stedman’s advertisement. A keyword search for “Stewart,” the woman Stedman recommended as her replacement, likewise did not pick up this particular advertisement among its 203 results.
This experiment demonstrated two types of problems to keep in mind when using digitized sources. OCR is not infallible. Human coding of data can also be flawed. Despite these shortcomings, digitized sources have revolutionized the way historians conduct research, the types of questions they ask, and the answers they find. As with any research tool or methodology, historians and others need to be aware of both advantages and potential deficiencies.