Know Your Database

Over the last week I finally got a chance to try out the tools that Wragge (aka Tim Sherratt) has devised to mine the digitised historic Australian newspapers accessible through Trove. This post is about the results of applying his tools. If you want to do this yourself, check out Wragge’s posts, Mining the Treasures of Trove (Part 1) and (Part 2). First, let’s look at Wragge’s graph of a topic that I have been writing about this year: floods.

Wragge's graph of the occurrence of the word "flood" in Australian newspapers, from the early 19th century to the late 1950s.

Wragge produced the graph above, which shows the occurrence of the word “floods” in the Australian newspapers digitised and accessible on the Trove website. As we would expect, the word is mentioned more often in years when there was severe flooding, such as 1893.

Go ahead and click on the graph. You will be taken to the original that Wragge has produced. This is not an ordinary graph. Click on the blue line at 1893. When you do this, a list of some of the articles that contain this word appears on the right-hand side. You can click on one of these articles and read it in Trove, or you can scroll to the bottom of the list and click on the link to view the search results for that word in Trove. Take a moment to explore the other graphs that Wragge has created.

I couldn’t wait to try this technique out. So for the last week or so I have been carefully following Wragge’s instructions and generating data in order to create my own graphs. Wragge used jqPlot to draw his graphs. Learning it is on my “to do” list. For now I have to be satisfied with a simple Excel graph. I decided to graph another topic that I have been researching: secular education.

Graph of ratio of articles mentioning "secular education" to total articles published.

Originally I had graphed the total number of articles published containing the phrase “secular education”. This obscured discussion of secular education in states with a smaller number of publications, such as Tasmania. Fortunately, Wragge’s tool also produces the ratio of articles containing the phrase to the total number of articles published, which is what the graph above shows.
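To see why the ratio is fairer to the smaller states, here is a minimal Python sketch of the calculation. It is not Wragge's code: the function name `article_ratios`, the data layout and the numbers are all my own assumptions, and the counts are placeholders rather than real Trove figures.

```python
def article_ratios(matches, totals):
    """Return matching articles as a fraction of all articles, per (state, year).

    Both arguments are dicts keyed by (state, year). This layout is an
    assumption made for the example, not the format Wragge's tool produces.
    """
    ratios = {}
    for key, total in totals.items():
        if total > 0:  # skip years with no digitised articles at all
            ratios[key] = matches.get(key, 0) / total
    return ratios


# Placeholder numbers for illustration only, not real Trove counts.
matches = {("NSW", 1870): 200, ("Tas", 1870): 30}
totals = {("NSW", 1870): 100000, ("Tas", 1870): 10000}

ratios = article_ratios(matches, totals)
for (state, year), ratio in sorted(ratios.items()):
    print(f"{state} {year}: {ratio:.2%}")
```

With these made-up numbers the larger state has far more matching articles in total, yet the smaller state has the higher ratio. A graph of raw totals hides exactly this kind of difference.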

What does this graph tell us? Originally I produced a graph to 1939, but it showed that articles mentioning “secular education” make up a fairly insignificant share of the total number of articles published after World War I. The graph above indicates that discussion of “secular education” was most vigorous in the nineteenth century. I am most familiar with the debates about secular education in the latter half of that century, and the peaks in the graph for this period make sense to me. There were no articles in the Trove collection containing the words “secular education” prior to 1835. Again, this is not surprising, as the discussion about education in early colonial times was different in nature to that of later years.

While I knew that South Australia, like most colonies, had a significant debate about secular education in the late 1860s and the 1870s, this graph indicates that the debate there occupied a much greater proportion of newspaper articles than in other colonies. The graph also highlights press comment about secular education from 1844 to 1859, which I am curious to explore.

“But…” I hear you say. Yes, pretty graphs do not tell the full story. There are numerous shortcomings that we need to keep in mind. A recent post on Goose Commerce gave us a timely reminder of how important it is for historians to understand what is not included in a primary source database. And there is much that is absent from the collection of digitised Australian newspapers at the moment. Here are a few of the newspapers that are currently missing:

  • Major metropolitan newspapers such as Melbourne’s The Age;
  • Newspapers produced by Australia’s early labour movement such as The Worker;
  • Religious newspapers such as The Australian Christian World; and
  • Numerous rural newspapers.

The list could go on. Even for the titles that are listed in the Australian newspaper database, not all issues have been digitised. This graph, then, does not represent all newspapers that have been published in Australia. This is a reminder that we must not let ease of use dictate which sources we review. We still need to spend time using microfilm readers!

Neither does the graph take into account the relative influence of the various newspapers.  A mention of secular education in a newspaper with a small circulation carries the same weight as one in a newspaper that was very influential.  Hence this graph does not measure how pervasive discussion about secular education was in society.  It points us in a certain direction, but we still need to use more traditional research methods before we can reach any conclusions about the extent and significance of the issue.

I see this type of analysis as a starting point for research, not an end in itself. It can also help us if we are in the midst of research. Presenting data in a different way leads us to think about it differently and to ask questions we may not otherwise have asked. Good research starts when the historian asks innovative questions. Wragge’s tools can assist historians to do this. For this reason I am grateful for his contribution.

Wragge’s Tools for Trove
