{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Harvest data from Papers Past\n", "\n", "This notebook lets you harvest large amounts of data from Papers Past (via DigitalNZ) for further analysis. It saves the results as a CSV file that you can open in any spreadsheet program. It currently includes the OCRd text of all the newspaper articles, but I might make this optional in the future. Thoughts?\n", "\n", "You can edit this notebook to harvest other collections in DigitalNZ; see the notes below for pointers. However, it currently saves only a small subset of the available metadata, so you'd probably want to adjust the fields as well. Add an [issue on GitHub](https://github.com/GLAM-Workbench/digitalnz/issues) if you need help creating a custom harvester.\n", "\n", "There are only two things you **have** to change: you need to enter your API key and supply a search term. There are additional options for limiting your search results." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
If you haven't used one of these notebooks before, they're basically web pages in which you can write, edit, and run live code. They're meant to encourage experimentation, so don't feel nervous. Just try running a few cells and see what happens!
\n", "\n", "\n", " Some tips:\n", "