News & How-To
Formerly branded as GlobalGazette.ca

Articles, press releases, and how-to information for everyone interested in genealogy and history




What's distant reading?
Feb 02, 2017
By John D. Reid, Canada's Anglo-Celtic Connections


What's distant reading? Only done by those with hypermetropia?

Distant reading is understanding not by studying particular texts, but by aggregating and analyzing massive amounts of textual data from a corpus.

Distant reading has evident limitations for genealogy where you need to pick through to find something particular, say, an obit of your great-grandmother. That's called, unsurprisingly, close reading.

What distant reading can do for the family historian is provide context. Say your great-grandmother died of influenza and you'd like to know if it was a year when the disease was prevalent.



You may be familiar with Google Ngram, where you can explore how frequently a word or phrase has been used in a corpus of books over time. This example shows the profile for cholera in red and influenza in blue. There's a huge spike for cholera in 1884 and upticks for both in 1942. While there is an uptick for influenza in 1918/19, for the pandemic that killed more than 20 million, perhaps as many as 50 million, the Ngram peak is not as significant as in 1905. The problem for genealogy is that the book corpus is tied to publication date, which may not bear any relationship to current events. It is good for long-term trends: try cigarette, aircraft, newspaper, radio, television.
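For readers curious what sits behind such a profile, here is a minimal sketch of the idea: count how often a term appears each year and divide by the total number of words for that year. The function name and the three-line toy corpus are invented for illustration. This is not how Google Ngram is actually implemented, and the sample text is not real historical data.

```python
from collections import Counter

def yearly_relative_frequency(docs, term):
    """Return {year: occurrences of term / total words in that year}.

    docs is an iterable of (year, text) pairs. Words are split on
    whitespace and compared in lower case, which is a simplification.
    """
    hits = Counter()
    totals = Counter()
    for year, text in docs:
        words = text.lower().split()
        totals[year] += len(words)
        hits[year] += words.count(term.lower())
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}

# Toy corpus, invented for this example only.
toy_docs = [
    (1917, "the harvest was good this year"),
    (1918, "influenza closed the schools and influenza filled the wards"),
    (1919, "the town recovered slowly after influenza"),
]

profile = yearly_relative_frequency(toy_docs, "influenza")
peak_year = max(profile, key=profile.get)  # year with the highest share
print(profile, peak_year)
```

Dividing by the yearly total matters because the amount of text published varies from year to year, so raw counts alone would mostly track the size of the corpus.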

Recent months have seen several articles published on distant reading using newspaper databases as the corpus. The British Newspaper Archive, Chronicling America and a Dutch newspaper database have all been explored. While newspapers cover current events, there are still issues of representativeness, as discussed in the article Bridging the gap between quantitative and qualitative research in digital newspaper archives.





The studies using the British Newspaper Archive have been conducted by a group from the University of Bristol led by Professor Nello Cristianini. A recent article, Content analysis of 150 years of British periodicals, includes a diagram reproduced here, with the bottom panels showing the difference between the frequencies through the years for cholera, influenza, smallpox and plague from newspapers (left) and books (right). Note the more prominent peak for influenza in the newspaper corpus in 1918.

That same article also gives a link where you can download a huge file giving the year-by-year frequency of occurrence of different Ngrams from the newspapers. It's a computing challenge, not for those unprepared to wrangle large data files. I've been experimenting with it. A graph produced for cholera, influenza and cancer is appended.
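One way to tame a file too large to open in a spreadsheet is to read it one line at a time and keep only the handful of Ngrams of interest. The sketch below assumes a tab-separated layout of Ngram, year, value on each line. That layout is a guess made for illustration, not the documented format of the Bristol file, and the function name, the file name in the comment and the sample numbers are all invented.

```python
import csv
import io
from collections import defaultdict

def extract_series(lines, wanted):
    """Stream rows of (ngram, year, value) and keep only wanted n-grams.

    Assumes tab-separated text with exactly three fields per line;
    rows with any other shape are skipped. Returns {ngram: {year: value}}.
    """
    wanted = set(wanted)
    series = defaultdict(dict)
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue
        ngram, year, value = row
        if ngram in wanted:
            series[ngram][int(year)] = float(value)
    return dict(series)

# Invented sample standing in for the real download. With a real file
# you would pass an open file handle instead, for example:
#   with open("ngrams.tsv", encoding="utf-8", newline="") as fh:  # hypothetical name
#       series = extract_series(fh, ["cholera", "influenza", "cancer"])
sample = io.StringIO(
    "cholera\t1849\t120\n"
    "influenza\t1918\t310\n"
    "cancer\t1918\t95\n"
    "plague\t1918\t12\n"
)
series = extract_series(sample, ["cholera", "influenza", "cancer"])
print(series)
```

Because only the matching rows are held in memory, the size of the source file matters far less than it would if the whole thing were loaded at once.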

More later if and when time permits.

It should not go unremarked that there is no possibility of performing a similar analysis on Canadian newspapers, as Canada lacks any national newspaper digitization program.





GlobalGenealogy.com Inc. 1992-2017