{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualise Trove newspaper searches over time\n", "\n", "You know the feeling. You enter a query into [Trove's digitised newspapers](https://trove.nla.gov.au/newspaper/) search box and...\n", "\n", "![Trove search results screen capture](images/trove-newspaper-results.png)\n", "\n", "Hmmm, **3 million results**, how do you make sense of that..?\n", "\n", "Trove tries to be as helpful as possible by ordering your results by relevance. This is great if your aim is to find a few interesting articles. But how can you get a sense of the complete results set? How can you *see* everything? Trove's web interface only shows you the first 2,000 articles matching your search. But by getting data directly from the [Trove API](https://help.nla.gov.au/trove/building-with-trove/api) we can go bigger. \n", "\n", "This notebook helps you zoom out and explore how the number of newspaper articles in your results varies over time by using the `decade` and `year` facets. We'll then combine this approach with other search facets to see how we can slice a set of results up in different ways to investigate historical changes.\n", "\n", "1. [Setting things up](#1.-Setting-things-up)\n", "2. [Find the number of articles per year using facets](#2.-Find-the-number-of-articles-per-year-using-facets)\n", "3. [How many articles in total were published each year?](#3.-How-many-articles-in-total-were-published-each-year?)\n", "4. [Charting our search results as a proportion of total articles](#4.-Charting-our-search-results-as-a-proportion-of-total-articles)\n", "5. [Comparing multiple search terms over time](#5.-Comparing-multiple-search-terms-over-time)\n", "6. [Comparing a search term across different states](#6.-Comparing-a-search-term-across-different-states)\n", "7. [Comparing a search term across different newspapers](#7.-Comparing-a-search-term-across-different-newspapers)\n", "8. 
[Chart changes in illustration types over time](#8.-Chart-changes-in-illustration-types-over-time)\n", "9. [But what are we searching?](#9.-But-what-are-we-searching?)\n", "10. [Next steps](#10.-Next-steps)\n", "11. [Related resources](#11.-Related-resources)\n", "12. [Further reading](#12.-Further-reading)\n", "\n", "If you're interested in exploring the possibilities examined in this notebook, but are feeling a bit intimidated by the code, skip to the [Related resources](#11.-Related-resources) section for some alternative starting points. But once you've got a bit of confidence, please come back here to learn more about how it all works!\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
If you haven't used one of these notebooks before, they're basically web pages in which you can write, edit, and run live code. They're meant to encourage experimentation, so don't feel nervous. Just try running a few cells and see what happens!

Some tips:

Is this thing on? If you can't edit or run any of the code cells, you might be viewing a static (read only) version of this notebook. Click here to load a live version running on Binder.

| | year | total_results |
|---|---|---|
| 0 | 1828 | 2 |
| 1 | 1830 | 1 |
| 2 | 1831 | 1 |
| 3 | 1832 | 1 |
| 4 | 1833 | 2 |
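
Counts like these come from asking for the `year` facet one decade at a time. The sketch below only builds the request parameters and makes no network call. The parameter names (`q`, `zone`, `facet`, `l-decade`, `n`, `encoding`, `key`) are assumptions based on version 2 of the Trove API and may differ in the version you are using. The function names, the query string, and the API key are placeholders invented for illustration.

```python
def decades_between(start_year, end_year):
    """Return decade values (1828 -> 182) covering the given span of years."""
    return list(range(start_year // 10, end_year // 10 + 1))


def year_facet_params(query, decade, api_key):
    """Build parameters requesting per-year counts within a single decade.

    Parameter names are assumed from Trove API v2; check the current docs.
    """
    return {
        "q": query,
        "zone": "newspaper",   # assumed name of the newspapers zone
        "facet": "year",       # ask for counts grouped by year
        "l-decade": decade,    # limit results to one decade, e.g. 182
        "n": 0,                # no article records, only the facet counts
        "encoding": "json",
        "key": api_key,
    }


# Hypothetical query and key, covering the years in the table above.
all_params = [
    year_facet_params("example query", decade, "YOUR_API_KEY")
    for decade in decades_between(1828, 1833)
]

for params in all_params:
    print(params["l-decade"], params["facet"])
```

Each dictionary could then be passed to an HTTP client as the query string of a search request, with the yearly counts from every decade combined into one table.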

| | year | total_results | total_articles | proportion |
|---|---|---|---|---|
| 0 | 1828 | 2 | 7335 | 0.000273 |
| 1 | 1830 | 1 | 8977 | 0.000111 |
| 2 | 1831 | 1 | 10989 | 0.000091 |
| 3 | 1832 | 1 | 14814 | 0.000068 |
| 4 | 1833 | 2 | 15622 | 0.000128 |
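
The `proportion` values above are consistent with dividing `total_results` by `total_articles` for each year. Here is a minimal sketch of that arithmetic in plain Python, using the five rows shown in the table. It assumes simple division with no other adjustment, and the `rows` variable is just an illustrative name.

```python
# Yearly counts copied from the table above.
rows = [
    {"year": 1828, "total_results": 2, "total_articles": 7335},
    {"year": 1830, "total_results": 1, "total_articles": 8977},
    {"year": 1831, "total_results": 1, "total_articles": 10989},
    {"year": 1832, "total_results": 1, "total_articles": 14814},
    {"year": 1833, "total_results": 2, "total_articles": 15622},
]

# Share of all articles published that year which match the search.
for row in rows:
    row["proportion"] = row["total_results"] / row["total_articles"]

for row in rows:
    print(f'{row["year"]}  {row["proportion"]:.6f}')
```

Charting the proportion rather than the raw count helps separate changes in how often a term appears from changes in how many articles were digitised for each year.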

| | ill_type | total_results |
|---|---|---|
| 0 | Photo | 6209770 |
| 1 | Illustration | 2724048 |
| 2 | Cartoon | 843536 |
| 3 | Map | 300771 |
| 4 | Cartoons | 57561 |
| 5 | Graph | 31842 |
| 6 | Chart | 20 |
| 7 | Unknown | 9 |
| 8 | Diagram | 5 |