{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "119ccc36",
   "metadata": {},
   "source": [
    "# RecordSearch\n",
    "\n",
    "Current version: [v1.1.1](https://github.com/GLAM-Workbench/recordsearch/releases/tag/v1.1.1)\n",
    "\n",
    "This repository contains Jupyter notebooks to work with data from the National Archives of Australia's RecordSearch database.\n",
    "\n",
    "[RecordSearch](https://recordsearch.naa.gov.au/) is the online collection database of the National Archives of Australia. Based on the [series system](https://www.naa.gov.au/help-your-research/getting-started/commonwealth-record-series-crs-system), RecordSearch provides rich, contextual information about series, items, agencies, and functions.\n",
    "\n",
    "Unfortunately, RecordSearch doesn't provide access to machine-readable data through an API, so we have to resort to screen scraping. The notebooks here make use of the [RecordSearch Data Scraper](https://wragge.github.io/recordsearch_data_scraper/).\n",
    "\n",
    "See the [RecordSearch section](https://glam-workbench.net/recordsearch/) of the GLAM Workbench for more details.\n",
    "\n",
    "## Notebook topics\n",
    "\n",
    "### Harvesting data\n",
    "\n",
    "* [**Harvest items from a search in RecordSearch**](harvesting_items_from_a_search.ipynb) – save the results of an item search in RecordSearch as a downloadable dataset; you can also save images and PDFs from digitised files\n",
    "* [**Harvest files with the access status of 'closed'**](harvest_closed_files.ipynb) – find out what we're not allowed to see by harvesting details of 'closed' files\n",
    "* [**Harvest recently digitised files from RecordSearch**](harvest_recently_digitised_files.ipynb) – save details of files digitised in the past month\n",
    "* [**Harvest details of all series in RecordSearch**](harvest_series_data.ipynb) – get details of all series registered in RecordSearch and generate a summary dataset with the total number of items described, digitised, and in each access category\n",
    "* [**Harvesting functions from the RecordSearch interface**](harvesting_functions_from_recordsearch.ipynb) – extract information from the RecordSearch interface about the hierarchy of functions it uses to describe the work of government agencies\n",
    "* [**Harvest agencies associated with *all* functions**](get_all_agencies_by_function.ipynb) – loops through the list of functions, saving details of the agencies associated with each\n",
    "\n",
    "### Analysing data\n",
    "\n",
    "* [**Exploring harvested series data, 2021**](series_harvest_basic_stats.ipynb) – generates some basic statistics from the harvest of series data\n",
    "* [**Exploring harvested series data, 2022**](series_harvest_basic_stats_2022.ipynb) – generates some basic statistics from the harvest of series data in 2022 and compares the results to the previous year\n",
    "* [**Summary of records digitised in the previous week**](recently_digitised_update.ipynb) – run this notebook to analyse the most recent dataset of recently digitised files, summarising the results by series\n",
    "* [**How many of the functions are actually used?**](how_many_functions_are_used.ipynb) – looks at the harvest of functions to see how many are actually in use\n",
    "* [**Who's responsible?**](display_agencies_by_function.ipynb) – pick a function to see which agencies have been responsible for it over time\n",
    "\n",
    "### Useful tools\n",
    "\n",
    "* [**DIY Redaction Art Collages**](diy_redaction_collage.ipynb) – generates a random sample of ASIO redactions and packs them into one big image\n",
    "* [**Download the contents of a digitised file**](get_images_from_a_digitised_file.ipynb) – get a digitised file as a folder full of images\n",
    "* [**Get a list of agencies associated with a function**](get_agencies_associated_with_function.ipynb) – pick a function and create a downloadable list of agencies responsible for it\n",
    "* [**DFAT Cable Finder**](Find_cables.ipynb) – helps you find numbered cables created by DFAT\n",
    "\n",
    "## Data downloads\n",
    "\n",
    "* [Summary data about all series in RecordSearch, May 2021](https://github.com/GLAM-Workbench/recordsearch/blob/master/series_totals_May_2021.csv) (15MB CSV) – contains basic descriptive information about all the series registered on RecordSearch as of May 2021, as well as the total number of items described, digitised, and in each access category.\n",
    "* [Summary data about all series in RecordSearch, April 2022](https://github.com/GLAM-Workbench/recordsearch/blob/master/series_totals_April_2022.csv) (15MB CSV) – contains basic descriptive information about all the series registered on RecordSearch as of April 2022, as well as the total number of items described, digitised, and in each access category.\n",
    "* [Recently digitised files](https://github.com/GLAM-Workbench/recordsearch/blob/master/data/recently-digitised-20210327) (CSV) – contains details of files digitised between 25 February and 26 March 2021. For an ongoing record of digitised files, see [this repository](https://github.com/wragge/naa-recently-digitised), which creates weekly snapshots."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b6ae58b",
   "metadata": {},
   "source": [
    "## Cite as\n",
    "\n",
    "See the GLAM Workbench or [Zenodo](https://doi.org/10.5281/zenodo.3544753) for up-to-date citation details.\n",
    "\n",
    "----\n",
    "\n",
    "This repository is part of the [GLAM Workbench](https://glam-workbench.github.io/).  \n",
    "If you think this project is worthwhile, you might like [to sponsor me on GitHub](https://github.com/sponsors/wragge?o=esb)."
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "cell_metadata_filter": "-all"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}