{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Mit Binder oder Colab kann das Jupyter-Notebook interaktiv im Browser gestartet werden:\n", "\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/opendatazurich/opendatazurich.github.io/master?filepath=zt-api/ZuerichTourismusAPI-Beispiele.ipynb)\n", "\n", "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/opendatazurich/opendatazurich.github.io/blob/master/zt-api/ZuerichTourismusAPI-Beispiele.ipynb)\n", "\n", "\n", "\n", "# Python-Beispiele für das Zürich Tourismus API\n", "\n", "## Inhaltsverzeichnis\n", "\n", "1. [Restaurants vom Zürich Tourismus API auf einer Karte darstellen](#Restaurants-vom-Zürich-Tourismus-API-auf-einer-Karte-darstellen)\n", "1. [CSV Download](#CSV-Download)\n", "1. [Bilder zu einem Thema](#Bilder-zu-einem-Thema)\n", "1. [Kategorien der Elemente im API](#Kategorien-der-Elemente-im-API)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install requests pandas folium branca anytree" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import requests\n", "import pandas as pd\n", "import folium\n", "import branca\n", "import anytree" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SSL_VERIFY = False\n", "# evtl. SSL_VERIFY auf False setzen wenn die Verbindung zu https://www.zuerich.com nicht klappt (z.B. wegen Proxy)\n", "# Um die SSL Verifikation auszustellen, bitte die nächste Zeile einkommentieren (\"#\" entfernen)\n", "# SSL_VERIFY = False\n", "if not SSL_VERIFY:\n", " import urllib3\n", " urllib3.disable_warnings()\n", " \n", "def get_de(field):\n", " try:\n", " return field['de']\n", " except (KeyError, TypeError):\n", " try:\n", " return field['en']\n", " except (KeyError, TypeError):\n", " return field" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Restaurants vom Zürich Tourismus API auf einer Karte darstellen" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Daten von der API laden\n", "\n", "**Alle Elemente mit dem Tag \"gastronomy\" vom API abrufen**:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "headers = {'Accept': 'application/json'}\n", "r = requests.get('https://www.zuerich.com/en/api/v2/data?id=101', headers=headers, verify=SSL_VERIFY)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Die JSON Daten vom API in ein Python dictionary umwandeln:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = r.json()\n", "data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Die Daten haben viele mehrsprachige Felder, der folgende Code holt sich jeweils die deutschen Inhalte:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "de_data = [{k: get_de(v) for (k,v) in f.items()} for f in data]\n", "de_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Die Daten in einem pandas DataFrame ablegen:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = pd.DataFrame(de_data)\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Daten zur Karte hinzufügen\n", "\n", "`folium` ist ein Python Wrapper für OpenLayers. Der nachfolgende Code erstellt eine neue Karte und verwendet den Übersichtsplan als Hintergrund (eingebunden als WMS)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m = folium.Map(location=[47.36, 8.53], zoom_start=13, tiles=None)\n", "folium.raster_layers.WmsTileLayer(\n", " url='https://www.ogd.stadt-zuerich.ch/wms/geoportal/Basiskarte_Zuerich_Raster',\n", " layers='Basiskarte Zürich Raster',\n", " name='Zürich - Basiskarte',\n", " fmt='image/png',\n", " overlay=False,\n", " control=False,\n", " autoZindex=False,\n", ").add_to(m)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nun iterieren wir über das pandas DataFrame und erstellen einen Marker für jedes Restaurant. Falls vorhanden wird ein Photo in den Beschreibungstext eingefügt." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gastro = folium.FeatureGroup(\"Restaurants\")\n", "isna = df.isna()\n", "for i, row in df.iterrows():\n", " print(row['geoCoordinates'])\n", " print(row['name'])\n", " geo = row['geoCoordinates']\n", " if not isna.geoCoordinates[i]:\n", " print(\"%s, %s, %s\" % (float(geo['latitude']), float(geo['longitude']), row['name']))\n", " \n", " try:\n", " photo = row['photo'][0]['url']\n", " photo_html = f''\n", " except (IndexError, KeyError, TypeError):\n", " photo_html = ''\n", " html = (\n", " f'

{row[\"name\"]}

'\n", " f'{photo_html}'\n", " f'

{row[\"disambiguatingDescription\"]}

'\n", " )\n", " #popup = folium.Popup(branca.element.IFrame(html=html, width=420))\n", " gastro.add_child(folium.Marker(location=[float(geo['latitude']), float(geo['longitude'])], popup=html)) \n", "m.add_child(gastro)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Hier ist die fertige Karte:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "folium.LayerControl().add_to(m)\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CSV Download\n", "\n", "Für die weitere Verarbeitung kann es nützlich sein, die Daten aus dem API in tabellarischer Form zu haben.\n", "Der nachfolgende Code wandelt das JSON vom API in ein CSV um (ohne jedoch alle Attribute zu _flatten_)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# download CSV\n", "import base64\n", "from IPython.display import HTML\n", "\n", "def create_download_link( df, title = \"Download CSV file\", filename = \"data.csv\"): \n", " csv = df.to_csv(None, index=False)\n", " b64 = base64.b64encode(csv.encode())\n", " payload = b64.decode()\n", " html = '

{title}

'\n", " html = html.format(payload=payload,title=title,filename=filename)\n", " return HTML(html)\n", "\n", "def generate_csv_download(endpoint, name):\n", " headers = {'Accept': 'application/json'}\n", " r = requests.get(endpoint, headers=headers, verify=SSL_VERIFY)\n", " data = r.json()\n", " df = pd.DataFrame(data)\n", " return create_download_link(df, f'Download {name}_data.csv', f'data_{name}.csv')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "generate_csv_download('https://www.zuerich.com/en/api/v2/data?id=72', 'attractions')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bilder zu einem Thema\n", "\n", "Das Zürich Tourismus API bietet sehr hochwertige Bilder an, die sich zur Illustration eigenen.\n", "\n", "### Lade alle Einträge zum Thema \"Sehenswürdigkeiten\" (engl. _attractions_):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "headers = {'Accept': 'application/json'}\n", "r = requests.get('https://www.zuerich.com/en/api/v2/data?id=72', headers=headers, verify=SSL_VERIFY)\n", "data = r.json()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alle Bilder zum Thema \"einsammeln\", diese sind in den Attributen `image` und `photo` hinterlegt:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "images = []\n", "images.extend([d['image']['url'] for d in data])\n", "images.extend([p['url'] or '' for y in [d['photo'] or '' for d in data] for p in y])\n", "images" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Anzeige von zufälligen Bildern zum Thema:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.display import HTML, display\n", "import random\n", "\n", "# wähle zufällig 8 Einträge aus der Liste aus\n", "sample = random.sample(images, k=8)\n", "\n", "def img_html(url):\n", " return ''.format(url)\n", "\n", "display(HTML(''.join([img_html(url) for url in sample])))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Kategorien der Elemente im API\n", "\n", "Dieses Beispiel zeigt, wie man sich durch API \"hangeln\" kann, d.h. den Links zu folgen und zu sehen, welche Elemente es gibt." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from anytree import Node, RenderTree\n", "from urllib.parse import urljoin" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "headers = {'Accept': 'application/json'}\n", "base_url = 'https://www.zuerich.com'\n", "data_url = urljoin(base_url, '/en/api/v2/data')\n", "r = requests.get(data_url, headers=headers, verify=SSL_VERIFY)\n", "data = r.json()\n", "data" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Baum der Kategorien\n", "\n", "Mit dem Aufruf von `https://www.zuerich.com/en/api/v2/data` bekommt man das oberste Level des Baums (d.h. eine Liste aller Kategorien inkl. ihrer Hierarchie). Damit können wir uns einen Python-Baum basteln:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Finde die direkten Kind-Nodes des angegebenem Eltern-Nodes (rekursiv)\n", "def find_children(data, parent):\n", " children = [e for e in data if e['parent'] == parent.id]\n", " for c in children:\n", " node = Node(id=c['id'], name=c['name'].get('de', c['name']), urlpath=c['path'], parent=parent)\n", " find_children(data, node)\n", "\n", "root = Node(id='0', name=\"Root\", urlpath=\"/data\")\n", "find_children(data, root)\n", "\n", "# Zeige den Baum an\n", "for pre, _, node in RenderTree(root):\n", " print(\"%s%s\" % (pre, node.name))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Anzahl Elemente pro Kategorie" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Jetzt holen wir via API die Anzahl Elemente für jede Kategorie\n", "# ACHTUNG: es wird für jede Kategorie ein Request gemacht, das dauert einige Minuten!\n", "de_data = [{k: get_de(v) for (k,v) in f.items()} for f in data]\n", "categories = pd.DataFrame(de_data)\n", "\n", "def get_category_count(path):\n", " headers = {'Accept': 'application/json'}\n", " base_url = 'https://www.zuerich.com'\n", " data_url = urljoin(base_url, path)\n", " print(f\"Request data from {data_url}\")\n", " r = requests.get(data_url, headers=headers, verify=SSL_VERIFY)\n", " return len(r.json())\n", "\n", "categories['count'] = categories['path'].apply(get_category_count)\n", "categories" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Liste aller Kategorien mit keinen Elementen (count == 0)\n", "categories[categories['count'] == 0].sort_values(by=['id'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Aggregation auf \"Eltern-Elemente\" und der zugehörigen Anzahl\n", "categories_with_parents = categories.merge(categories, how='right', left_on='id', right_on='parent', suffixes=('_parent', ''))\n", "categories_with_parents.name_parent.fillna(categories_with_parents.name, inplace=True)\n", "categories_with_parents = categories_with_parents[['name_parent', 'count']]\n", "aggregated_categories = categories_with_parents.groupby('name_parent').sum().sort_values(by=['count'], ascending=False)\n", "aggregated_categories" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ax = aggregated_categories.plot.bar(figsize=(20,15))\n", "for p in ax.patches:\n", " ax.annotate(str(p.get_height()), (p.get_x() + 0.01, p.get_height() + 5))\n", "ax" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" }, "vscode": { "interpreter": { "hash": "b82ccba0675d9b8d03fc74b9125e06fe4dd4c077445b5c038ae2e61c788a0619" } } }, "nbformat": 4, "nbformat_minor": 2 }