{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"import json\n",
"from pandas.io.json import json_normalize\n",
"\n",
"from IPython.display import IFrame"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Datengui.de Praxis @SciCar"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Disclaimer: \n",
"\n",
"Datengui.de befindet sich in der Entwicklung. \n",
"Es gibt eine API und erste Prototypen, um die API zu benutzen. \n",
"Die eigentliche Anwendung ist aber noch nicht da. \n",
"Deswegen ist der hier beschriebene Weg etwas umständlich und beeinhaltet technische Hürden. \n",
"Trotzdem wollten wir Euch die API schon einmal ausprobieren lassen, denn jeder Hinweis von Euch hilft, um die API in der späteren Anwendung besser nutzbar zu machen. Und falls ihr die Hürden nehmt, könnt ihr die Daten schon nutzen :)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Schritt 1: Datensatz suchen"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Simon Wörpel hat dazu folgende Flask App gebaut: \n",
"**[Regionalstatistik: docs.genesapi.org](https://docs.genesapi.org)**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sucht doch mal zum Beispiel nach 'Kühe' "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ergebnis:\n",
" \n",
"**https://docs.genesapi.org/AI0902**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Runterscrollen bis zum [**Beispiel**](https://api-next.datengui.de/graphql?query=%7B%0A%20%20allRegions%20%7B%0A%20%20%20%20page%0A%20%20%20%20total%0A%20%20%20%20itemsPerPage%0A%20%20%20%20regions%20%7B%0A%20%20%20%20%20%20id%0A%20%20%20%20%20%20name%0A%20%20%20%20%20%20AI0902%20%7B%0A%20%20%20%20%20%20%20%20year%0A%20%20%20%20%20%20%20%20value%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%0A%20%20%7D%0A%7D%0A)"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"{\n",
" allRegions {\n",
" page\n",
" total\n",
" itemsPerPage\n",
" regions {\n",
" id\n",
" name\n",
" AI0902 {\n",
" year\n",
" value\n",
" }\n",
" }\n",
" }\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dieser Link führt zu: \n",
" [**Datenguide API Playground - alpha**](https://api-next.datengui.de/graphql)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Auf ▶️ Play-Drücken:\n",
" \n",
"Daten werden abgerufen und rechts angezeigt."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Was könnt und solltet ihr anpassen:** "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- `itemsPerPage`: Wie viele Elemente enthält der Datensatz\n",
" \n",
"- `nuts`: Das ist die lustige, aber offizielle EU-weite Bezeichnung für Gebiets-Ebenen. Nicht alle Daten sind auf allen Gebiets-Ebenen verfügbar.\n",
" - `nuts: 1` steht für die Ebene Bundesländer\n",
" - `nuts: 3` steht für die Ebene Kreise und kreisfreie Städte\n",
" - Statt `nuts` könnt ihr auch `lau: 1` benutzen, um Daten auf Gemeinde-Ebene abzufragen.\n",
" \n",
" \n",
"Einschränken der Datenabfrage mittels Filter:\n",
" \n",
"Ihr könnt in Klammern hinter dem Merkmal (`AI0902`) noch Filter setzen:\n",
" \n",
"- `year`: Zeitpunkt der Daten einschränken, zum Beispiel `year: 2016`, um nur Daten für 2016 zu zeigen. Aktuell sind in unserer (experimentellen) API Daten bis 2017 enthalten.\n",
"- *Ausprägungen des Merkmals:*\n",
" Viele Merkmale haben noch verschiedene Ausprägungen. Zum Beispiel [Bevölkerungsstand](https://docs.genesapi.org/BEVSTD) – hier könnt ihr die Abfrage durch Attribute eingränzen, zum Beispiel nur Daten für das weibliche Geschlecht abfragen ([Beispiel](https://docs.genesapi.org/BEVSTD/GES))"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"{\n",
" allRegions (itemsPerPage: 50) {\n",
" page\n",
" total\n",
" itemsPerPage\n",
" regions (nuts: 1) {\n",
" id\n",
" name\n",
" AI0902 (year: 2016) {\n",
" year\n",
" value\n",
" }\n",
" }\n",
" }\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Schritt 2: Daten speichern"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Rechte Seite komplett markieren und als `kuehe.json` speichern"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Alternative für Experten: Daten direkt in dieses Notebook laden\n",
"\n",
"Ihr könnt mit der `requests`-Library ganz einfach auch die Antwort der API-Abfrage direkt laden:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"url = 'https://api-next.datengui.de/graphql'\n",
"\n",
"query = \"\"\"\n",
"{\n",
" allRegions (itemsPerPage: 50) {\n",
" page\n",
" total\n",
" itemsPerPage\n",
" regions (nuts: 1) {\n",
" id\n",
" name\n",
" AI0902 (year: 2016) {\n",
" year\n",
" value\n",
" }\n",
" }\n",
" }\n",
"}\n",
"\"\"\"\n",
"\n",
"res = requests.post(url, data={'query': query})\n",
"d = res.json() # das weiter unten nutzen anstatt es aus der Datei (nächster Schritt) zu laden"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Schritt 3: .json einlesen"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In Zukunft soll das Ganze natürlich in verschiedenen Formaten (zum Beispiel .csv) herunterladbar sein.\n",
"Gregor Aisch hat da mal einen Prototypen gebaut: https://gka.me/datenguide/. Rückmeldungen, Wünsche dazu gerne. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hier kommt der Python-Weg:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"#load json object\n",
"with open('kuehe.json') as f:\n",
" d = json.load(f)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'data': {'allRegions': {'page': 0,\n",
" 'total': 16,\n",
" 'itemsPerPage': 50,\n",
" 'regions': [{'id': '10',\n",
" 'name': 'Saarland',\n",
" 'AI0902': [{'year': 2016, 'value': 61.7}]},\n",
" {'id': '11', 'name': 'Berlin', 'AI0902': [{'year': 2016, 'value': 23.6}]},\n",
" {'id': '12',\n",
" 'name': 'Brandenburg',\n",
" 'AI0902': [{'year': 2016, 'value': 40.4}]},\n",
" {'id': '13',\n",
" 'name': 'Mecklenburg-Vorpommern',\n",
" 'AI0902': [{'year': 2016, 'value': 40.6}]},\n",
" {'id': '14', 'name': 'Sachsen', 'AI0902': [{'year': 2016, 'value': 53.7}]},\n",
" {'id': '15',\n",
" 'name': 'Sachsen-Anhalt',\n",
" 'AI0902': [{'year': 2016, 'value': 28.6}]},\n",
" {'id': '16',\n",
" 'name': 'Thüringen',\n",
" 'AI0902': [{'year': 2016, 'value': 41.5}]},\n",
" {'id': '01',\n",
" 'name': 'Schleswig-Holstein',\n",
" 'AI0902': [{'year': 2016, 'value': 110.7}]},\n",
" {'id': '02', 'name': 'Hamburg', 'AI0902': [{'year': 2016, 'value': 42.1}]},\n",
" {'id': '03',\n",
" 'name': 'Niedersachsen',\n",
" 'AI0902': [{'year': 2016, 'value': 100.3}]},\n",
" {'id': '04', 'name': 'Bremen', 'AI0902': [{'year': 2016, 'value': 124.1}]},\n",
" {'id': '05',\n",
" 'name': 'Nordrhein-Westfalen',\n",
" 'AI0902': [{'year': 2016, 'value': 98.1}]},\n",
" {'id': '06', 'name': 'Hessen', 'AI0902': [{'year': 2016, 'value': 57.2}]},\n",
" {'id': '07',\n",
" 'name': 'Rheinland-Pfalz',\n",
" 'AI0902': [{'year': 2016, 'value': 49.1}]},\n",
" {'id': '08',\n",
" 'name': 'Baden-Württemberg',\n",
" 'AI0902': [{'year': 2016, 'value': 69.5}]},\n",
" {'id': '09',\n",
" 'name': 'Bayern',\n",
" 'AI0902': [{'year': 2016, 'value': 101.9}]}]}}}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"d = d['data']['allRegions']['regions']"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'id': '10', 'name': 'Saarland', 'AI0902': [{'year': 2016, 'value': 61.7}]},\n",
" {'id': '11', 'name': 'Berlin', 'AI0902': [{'year': 2016, 'value': 23.6}]},\n",
" {'id': '12',\n",
" 'name': 'Brandenburg',\n",
" 'AI0902': [{'year': 2016, 'value': 40.4}]},\n",
" {'id': '13',\n",
" 'name': 'Mecklenburg-Vorpommern',\n",
" 'AI0902': [{'year': 2016, 'value': 40.6}]},\n",
" {'id': '14', 'name': 'Sachsen', 'AI0902': [{'year': 2016, 'value': 53.7}]},\n",
" {'id': '15',\n",
" 'name': 'Sachsen-Anhalt',\n",
" 'AI0902': [{'year': 2016, 'value': 28.6}]},\n",
" {'id': '16', 'name': 'Thüringen', 'AI0902': [{'year': 2016, 'value': 41.5}]},\n",
" {'id': '01',\n",
" 'name': 'Schleswig-Holstein',\n",
" 'AI0902': [{'year': 2016, 'value': 110.7}]},\n",
" {'id': '02', 'name': 'Hamburg', 'AI0902': [{'year': 2016, 'value': 42.1}]},\n",
" {'id': '03',\n",
" 'name': 'Niedersachsen',\n",
" 'AI0902': [{'year': 2016, 'value': 100.3}]},\n",
" {'id': '04', 'name': 'Bremen', 'AI0902': [{'year': 2016, 'value': 124.1}]},\n",
" {'id': '05',\n",
" 'name': 'Nordrhein-Westfalen',\n",
" 'AI0902': [{'year': 2016, 'value': 98.1}]},\n",
" {'id': '06', 'name': 'Hessen', 'AI0902': [{'year': 2016, 'value': 57.2}]},\n",
" {'id': '07',\n",
" 'name': 'Rheinland-Pfalz',\n",
" 'AI0902': [{'year': 2016, 'value': 49.1}]},\n",
" {'id': '08',\n",
" 'name': 'Baden-Württemberg',\n",
" 'AI0902': [{'year': 2016, 'value': 69.5}]},\n",
" {'id': '09', 'name': 'Bayern', 'AI0902': [{'year': 2016, 'value': 101.9}]}]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"d = json_normalize(d, 'AI0902', ['id', 'name', ['year', 'value']], errors='ignore')"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" year | \n",
" value | \n",
" id | \n",
" name | \n",
" year.value | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2016 | \n",
" 61.7 | \n",
" 10 | \n",
" Saarland | \n",
" NaN | \n",
"
\n",
" \n",
" 1 | \n",
" 2016 | \n",
" 23.6 | \n",
" 11 | \n",
" Berlin | \n",
" NaN | \n",
"
\n",
" \n",
" 2 | \n",
" 2016 | \n",
" 40.4 | \n",
" 12 | \n",
" Brandenburg | \n",
" NaN | \n",
"
\n",
" \n",
" 3 | \n",
" 2016 | \n",
" 40.6 | \n",
" 13 | \n",
" Mecklenburg-Vorpommern | \n",
" NaN | \n",
"
\n",
" \n",
" 4 | \n",
" 2016 | \n",
" 53.7 | \n",
" 14 | \n",
" Sachsen | \n",
" NaN | \n",
"
\n",
" \n",
" 5 | \n",
" 2016 | \n",
" 28.6 | \n",
" 15 | \n",
" Sachsen-Anhalt | \n",
" NaN | \n",
"
\n",
" \n",
" 6 | \n",
" 2016 | \n",
" 41.5 | \n",
" 16 | \n",
" Thüringen | \n",
" NaN | \n",
"
\n",
" \n",
" 7 | \n",
" 2016 | \n",
" 110.7 | \n",
" 01 | \n",
" Schleswig-Holstein | \n",
" NaN | \n",
"
\n",
" \n",
" 8 | \n",
" 2016 | \n",
" 42.1 | \n",
" 02 | \n",
" Hamburg | \n",
" NaN | \n",
"
\n",
" \n",
" 9 | \n",
" 2016 | \n",
" 100.3 | \n",
" 03 | \n",
" Niedersachsen | \n",
" NaN | \n",
"
\n",
" \n",
" 10 | \n",
" 2016 | \n",
" 124.1 | \n",
" 04 | \n",
" Bremen | \n",
" NaN | \n",
"
\n",
" \n",
" 11 | \n",
" 2016 | \n",
" 98.1 | \n",
" 05 | \n",
" Nordrhein-Westfalen | \n",
" NaN | \n",
"
\n",
" \n",
" 12 | \n",
" 2016 | \n",
" 57.2 | \n",
" 06 | \n",
" Hessen | \n",
" NaN | \n",
"
\n",
" \n",
" 13 | \n",
" 2016 | \n",
" 49.1 | \n",
" 07 | \n",
" Rheinland-Pfalz | \n",
" NaN | \n",
"
\n",
" \n",
" 14 | \n",
" 2016 | \n",
" 69.5 | \n",
" 08 | \n",
" Baden-Württemberg | \n",
" NaN | \n",
"
\n",
" \n",
" 15 | \n",
" 2016 | \n",
" 101.9 | \n",
" 09 | \n",
" Bayern | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" year value id name year.value\n",
"0 2016 61.7 10 Saarland NaN\n",
"1 2016 23.6 11 Berlin NaN\n",
"2 2016 40.4 12 Brandenburg NaN\n",
"3 2016 40.6 13 Mecklenburg-Vorpommern NaN\n",
"4 2016 53.7 14 Sachsen NaN\n",
"5 2016 28.6 15 Sachsen-Anhalt NaN\n",
"6 2016 41.5 16 Thüringen NaN\n",
"7 2016 110.7 01 Schleswig-Holstein NaN\n",
"8 2016 42.1 02 Hamburg NaN\n",
"9 2016 100.3 03 Niedersachsen NaN\n",
"10 2016 124.1 04 Bremen NaN\n",
"11 2016 98.1 05 Nordrhein-Westfalen NaN\n",
"12 2016 57.2 06 Hessen NaN\n",
"13 2016 49.1 07 Rheinland-Pfalz NaN\n",
"14 2016 69.5 08 Baden-Württemberg NaN\n",
"15 2016 101.9 09 Bayern NaN"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"d = d[['value', 'name']]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" value | \n",
" name | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 61.7 | \n",
" Saarland | \n",
"
\n",
" \n",
" 1 | \n",
" 23.6 | \n",
" Berlin | \n",
"
\n",
" \n",
" 2 | \n",
" 40.4 | \n",
" Brandenburg | \n",
"
\n",
" \n",
" 3 | \n",
" 40.6 | \n",
" Mecklenburg-Vorpommern | \n",
"
\n",
" \n",
" 4 | \n",
" 53.7 | \n",
" Sachsen | \n",
"
\n",
" \n",
" 5 | \n",
" 28.6 | \n",
" Sachsen-Anhalt | \n",
"
\n",
" \n",
" 6 | \n",
" 41.5 | \n",
" Thüringen | \n",
"
\n",
" \n",
" 7 | \n",
" 110.7 | \n",
" Schleswig-Holstein | \n",
"
\n",
" \n",
" 8 | \n",
" 42.1 | \n",
" Hamburg | \n",
"
\n",
" \n",
" 9 | \n",
" 100.3 | \n",
" Niedersachsen | \n",
"
\n",
" \n",
" 10 | \n",
" 124.1 | \n",
" Bremen | \n",
"
\n",
" \n",
" 11 | \n",
" 98.1 | \n",
" Nordrhein-Westfalen | \n",
"
\n",
" \n",
" 12 | \n",
" 57.2 | \n",
" Hessen | \n",
"
\n",
" \n",
" 13 | \n",
" 49.1 | \n",
" Rheinland-Pfalz | \n",
"
\n",
" \n",
" 14 | \n",
" 69.5 | \n",
" Baden-Württemberg | \n",
"
\n",
" \n",
" 15 | \n",
" 101.9 | \n",
" Bayern | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" value name\n",
"0 61.7 Saarland\n",
"1 23.6 Berlin\n",
"2 40.4 Brandenburg\n",
"3 40.6 Mecklenburg-Vorpommern\n",
"4 53.7 Sachsen\n",
"5 28.6 Sachsen-Anhalt\n",
"6 41.5 Thüringen\n",
"7 110.7 Schleswig-Holstein\n",
"8 42.1 Hamburg\n",
"9 100.3 Niedersachsen\n",
"10 124.1 Bremen\n",
"11 98.1 Nordrhein-Westfalen\n",
"12 57.2 Hessen\n",
"13 49.1 Rheinland-Pfalz\n",
"14 69.5 Baden-Württemberg\n",
"15 101.9 Bayern"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"d = d.set_index('name')"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" value | \n",
"
\n",
" \n",
" name | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Saarland | \n",
" 61.7 | \n",
"
\n",
" \n",
" Berlin | \n",
" 23.6 | \n",
"
\n",
" \n",
" Brandenburg | \n",
" 40.4 | \n",
"
\n",
" \n",
" Mecklenburg-Vorpommern | \n",
" 40.6 | \n",
"
\n",
" \n",
" Sachsen | \n",
" 53.7 | \n",
"
\n",
" \n",
" Sachsen-Anhalt | \n",
" 28.6 | \n",
"
\n",
" \n",
" Thüringen | \n",
" 41.5 | \n",
"
\n",
" \n",
" Schleswig-Holstein | \n",
" 110.7 | \n",
"
\n",
" \n",
" Hamburg | \n",
" 42.1 | \n",
"
\n",
" \n",
" Niedersachsen | \n",
" 100.3 | \n",
"
\n",
" \n",
" Bremen | \n",
" 124.1 | \n",
"
\n",
" \n",
" Nordrhein-Westfalen | \n",
" 98.1 | \n",
"
\n",
" \n",
" Hessen | \n",
" 57.2 | \n",
"
\n",
" \n",
" Rheinland-Pfalz | \n",
" 49.1 | \n",
"
\n",
" \n",
" Baden-Württemberg | \n",
" 69.5 | \n",
"
\n",
" \n",
" Bayern | \n",
" 101.9 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" value\n",
"name \n",
"Saarland 61.7\n",
"Berlin 23.6\n",
"Brandenburg 40.4\n",
"Mecklenburg-Vorpommern 40.6\n",
"Sachsen 53.7\n",
"Sachsen-Anhalt 28.6\n",
"Thüringen 41.5\n",
"Schleswig-Holstein 110.7\n",
"Hamburg 42.1\n",
"Niedersachsen 100.3\n",
"Bremen 124.1\n",
"Nordrhein-Westfalen 98.1\n",
"Hessen 57.2\n",
"Rheinland-Pfalz 49.1\n",
"Baden-Württemberg 69.5\n",
"Bayern 101.9"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"# d.to_clipboard()\n",
"d.to_csv('kuehe.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Schritt 4: Datawrapper"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Neue Karte\n",
"- Deutschland: Bundesländer \n",
"- Daten einfügen\n",
"- Anpassen"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"IFrame('//datawrapper.dwcdn.net/AWDSM/2/', width='600', height='400')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}