{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Your first API request" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section we're going to learn how to send a request for information to the Trove API.\n", "\n", "API requests are just like normal urls. However, instead of sending us back a web page, they deliver data in a form that computers can understand. We can then use that data in our own programs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We're going to use the Python [Requests](http://docs.python-requests.org/en/master/) module to handle our API queries, so let's import it now." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make the Requests module available\n", "import requests" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting an API key\n", "Any requests you make to the Trove API need to be authenticated with a 'key'. For non-commercial projects, you just fill out a simple form and your API key is generated instantly. Follow the instructions in the Trove Help to [obtain your own Trove API Key](http://help.nla.gov.au/trove/building-with-trove/api).\n", "\n", "Once you've created a key, you can access it at any time on the 'For developers' tab of your Trove user profile.\n", "\n", "Copy your API key now, and paste it in the cell below, between the quotes." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This creates a variable called 'api_key', paste your key between the quotes\n", "api_key = ''\n", "\n", "# This displays a message with your key\n", "print('Your API key is: {}'.format(api_key))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparing parameters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All search queries to the Trove API start with the same base url. We'll save it as a variable here." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a variable called 'api_search_url' and give it a value\n", "api_search_url = 'https://api.trove.nla.gov.au/result'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Trove API queries are constructed by adding parameters to the base url. Most of the parameters are optional, but a few are mandatory:\n", "\n", "* `q` – 'q' for query, this is where search terms go\n", "* `zone` – which Trove zone (or zones) do you want to search, use 'all' for everything\n", "* `key` – your Trove API key\n", "\n", "If you don't want to specify a search term, you can just use a space or a plus sign – ' ' or '+' – as the value for `q`. Of course, this means that you're asking for *everything*, so Trove might take a bit longer to respond.\n", "\n", "We'll meet some other parameters later, but for now let's create a Python dictionary to store our basic parameters. The `requests` library will take this dictionary, turn it into a string, and add it to the base url.\n", "\n", "For our first API request we're going to search Trove's digitised newspapers, so we'll assign the value 'newspaper' to the `zone` parameter. Feel free to edit the `q` value to search for something that interests you.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This creates a dictionary called 'params' and sets values for the API's mandatory parameters\n", "params = {\n", " 'q': 'cyclone', # Search for this keyword -- feel free to change!\n", " 'zone': 'newspaper', # Search in the newspaper zone\n", " 'key': api_key\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The default output of the API is XML. For most applications it's easier to work with JSON. You set this using the `encoding` parameter. Let's add this into `params` and view the result." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This adds a value for 'encoding' to our dictionary\n", "params['encoding'] = 'json'\n", "\n", "# Let's view the updated dictionary\n", "params" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making a request" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, we're now now ready to make our first query!" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# This sends our request to the Trove API and stores the result in a variable called 'response'\n", "response = requests.get(api_search_url, params=params)\n", "\n", "# This shows us the url that's sent to the API\n", "print('Here\\'s the formatted url that gets sent to the Trove API:\\n{}\\n'.format(response.url)) \n", "\n", "# This checks the status code of the response to make sure there were no errors\n", "if response.status_code == requests.codes.ok:\n", " print('All ok')\n", "elif response.status_code == 403:\n", " print('There was an authentication error. Did you paste your API above?')\n", "else:\n", " print('There was a problem. Error code: {}'.format(response.status_code))\n", " print('Try running this cell again.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See how `requests` has taken our parameters and turned them into a string with '&' between each one? \n", "\n", "The url above is live – try clicking on it to see the raw results from Trove." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Look at the results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `response` variable contains all the data returned to us by the Trove API. Let's get it out in a usable form." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the Trove API's JSON results and make them available as a Python variable called 'data'\n", "data = response.json()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Let's prettify the raw JSON data and then display it.\n", "\n", "# We're using the Pygments library to add some colour to the output, so we need to import it\n", "import json\n", "from pygments import highlight, lexers, formatters\n", "\n", "# This uses Python's JSON module to output the results as nicely indented text\n", "formatted_data = json.dumps(data, indent=2)\n", "\n", "# This colours the text\n", "highlighted_data = highlight(formatted_data, lexers.JsonLexer(), formatters.TerminalFormatter())\n", "\n", "# And now display the results\n", "print(highlighted_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Do something with the data\n", "\n", "As you can see, the API results are fairly complex. Individual item records are quite deeply nested. In a future section we'll explore this structure in more detail. But for now, let's run a simple script to display the basic details of each of our matching articles." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Loop through all the newspaper articles\n", "# The articles themselves are quite deeply nested, so we have to go down several levels to get them\n", "for article in data['response']['zone'][0]['records']['article']: \n", " \n", " # Display a string containing the date, title, newspaper, and page for each article\n", " print('{}, \"{}\", {}, page {}'.format(article['date'], article['heading'], article['title']['value'], article['page']))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Congratulations!\n", "\n", "You've made your first Trove API request. Now let's move on to learn a bit [about Trove's zones](3-Working-with-zones.ipynb)." ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }