{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", "
\n", " \n", "
Download Notebook
\n", "
\n", "
\n", " \n", "
\n", " \n", "
View on GitHub
\n", "
\n", "
\n", "
\n", "\n", "# Data model for Zürich datasets \n", "\n", "The Zürich Statistical Office collects data on the city and its residents. This data is published as [Linked Data](https://en.wikipedia.org/wiki/Linked_data). \n", "\n", "In this tutorial, we will **explain the linked data model** behind it. Mainly, we will guide you through the data structure. The tutorial will show you available datasets, and the shape they take. We will look into queries explaining the data structure as well as available dimensions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[1. Graphs](#Graphs) \n", "[2. Graph by Zürich Statistical Office](#Graph-by-Zürich-Statistical-Office) \n", "[3. Data cubes](#Data-cubes) \n", "[4. Observation set](#Observation-set) \n", "[5. Observations](#Observations) \n", "[6. Data slices](#Data-slices) \n", "[7. SPEX](#SPEX) \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### SPARQL endpoint\n", "\n", "Data by the Zürich Statistical Office is published as Linked Data. It can be accessed with [SPARQL queries](https://www.w3.org/TR/rdf-sparql-query/). You can send queries using HTTP requests. The API endpoint is **[https://ld.stadt-zuerich.ch/query](https://ld.stadt-zuerich.ch/query).** \n", " \n", " \n", "We use the `SPARQL` kernel in a jupyter notebook to communicate with the database (you may need to configure this via menu Kernel -> Change kernel -> SPARQL). First, let's configure the endpoint." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Endpoint set to: https://ld.stadt-zuerich.ch/query
HTTP authentication: method=basic, user=public, passwd set
Display: table
Result maximum size: 50
" ], "text/plain": [ "Endpoint set to: https://ld.stadt-zuerich.ch/query\n", "HTTP authentication: method=basic, user=public, passwd set\n", "Display: table\n", "Result maximum size: 50\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#Params\n", "%endpoint https://ld.stadt-zuerich.ch/query\n", "%auth basic public public\n", "%display table\n", "\n", "%show 50" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graphs\n", "\n", "One endpoint can contain multiple [named graphs](https://en.wikipedia.org/wiki/Named_graph). \n", " \n", " \n", "You can see what graphs are available using `WHERE` and `GRAPH` statements. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
graphs
https://linked.opendata.swiss/graph/eCH-0071
https://linked.opendata.swiss/graph/BK/FederalDirectory
https://linked.opendata.swiss/graph/sfa/meta
https://lindas.admin.ch/nl/isil
https://lindas.admin.ch/sbb/didok
https://lindas.admin.ch/sbb/nova
https://lindas.admin.ch/sbb/setactual
https://lindas.admin.ch/sbb/constructionsite
https://linked.opendata.swiss/fsvo/animalpest
https://lindas.admin.ch/meta
https://lindas.admin.ch/sfa/opendataswiss
https://lindas.admin.ch/sfa/stapfer_enquete
https://lindas.admin.ch/ontologies
https://lindas.admin.ch/fsvo/themes
https://lindas.admin.ch/elcom/electricityprice
https://lindas.admin.ch/fsvo/plazi
https://lindas.admin.ch/stadtzuerich/stat
https://lindas.admin.ch/stadtzuerich/stat/views
https://lindas.admin.ch/cube/dimension
https://lindas.admin.ch/fso/agvch
https://lindas.admin.ch/egad/ric-o
https://lindas.admin.ch/foj/zefix
https://lindas.admin.ch/territorial
https://lindas.admin.ch/fch/staatskalender
Total: 24, Shown: 24
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# List graphs in endpoint\n", "SELECT DISTINCT ?graphs \n", "WHERE {\n", " GRAPH ?graphs {\n", " ?s ?p ?o;\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These graphs have potentilly nothing in common, they are simply stored in the same database. \n", "\n", "Here, we have one endpoint storing information on subject as diverse as the train timetable, electricity prices, or animal diseases:\n", "\n", "* trains timetable (https://lindas.admin.ch/sbb/setactual)\n", "* electricity prices (https://lindas.admin.ch/elcom/electricityprice)\n", "* animal diseases (https://linked.opendata.swiss/fsvo/animalpest)\n", "\n", "All those graphs can be accessed from the same SPARQL endpoint." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graph by Zürich Statistical Office" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Datasets from the Zürich Statistical Office are published in the following named graph: ``\n", "\n", "You can limit your queries to one named graph using `WHERE` and `FROM` statements. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
spo
https://ld.stadt-zuerich.ch/statistics/code/HEO0663http://schema.org/nameFrauenkappelen
https://ld.stadt-zuerich.ch/statistics/code/GEO0663http://schema.org/nameFrauenkappelen
https://ld.stadt-zuerich.ch/statistics/code/ORT0880http://schema.org/nameRüeggisberg
https://ld.stadt-zuerich.ch/statistics/code/HEO0880http://schema.org/nameRüeggisberg
https://ld.stadt-zuerich.ch/statistics/code/GEO0880http://schema.org/nameRüeggisberg
https://ld.stadt-zuerich.ch/statistics/code/ORT1024http://schema.org/nameEmmen
https://ld.stadt-zuerich.ch/statistics/code/HEO1024http://schema.org/nameEmmen
https://ld.stadt-zuerich.ch/statistics/code/GEO1024http://schema.org/nameEmmen
https://ld.stadt-zuerich.ch/statistics/code/ORT1701http://schema.org/nameBaar
https://ld.stadt-zuerich.ch/statistics/code/HEO1701http://schema.org/nameBaar
Total: 10, Shown: 10
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# List random triples in the graph\n", "\n", "SELECT *\n", "FROM \n", "WHERE {?s ?p ?o}\n", "LIMIT 10\n", "OFFSET 100" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data cubes\n", "Datasets published by Zürich Statistial Office are stored as [data cubes](https://en.wikipedia.org/wiki/OLAP_cube). Let's check the cube definition in its [documentation](https://cube.link/):\n", "\n", "**Cube** \n", "*Represents the entry point for a collection of observations, conforming to some common dimensional structure.*\n", "\n", "\n", "We can find all data cubes by defining `` as object." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
cubes
https://ld.stadt-zuerich.ch/statistics/WHG
https://ld.stadt-zuerich.ch/statistics/WHA-ZIM
https://ld.stadt-zuerich.ch/statistics/ZIM
https://ld.stadt-zuerich.ch/statistics/ZIM-WHA
https://ld.stadt-zuerich.ch/statistics/WHA
https://ld.stadt-zuerich.ch/statistics/APZ
https://ld.stadt-zuerich.ch/statistics/WHG-ZIM
https://ld.stadt-zuerich.ch/statistics/BEW
https://ld.stadt-zuerich.ch/statistics/BEW-SEX
https://ld.stadt-zuerich.ch/statistics/BEW-HEL-SEX
https://ld.stadt-zuerich.ch/statistics/ANT-GGH-HEL
https://ld.stadt-zuerich.ch/statistics/BEW-HEL
https://ld.stadt-zuerich.ch/statistics/BEW-ALT
https://ld.stadt-zuerich.ch/statistics/GES-ALT-SEX-TOU
https://ld.stadt-zuerich.ch/statistics/GES-SEX-TOU
https://ld.stadt-zuerich.ch/statistics/GES-ALT-SEX
https://ld.stadt-zuerich.ch/statistics/GES-SEX
https://ld.stadt-zuerich.ch/statistics/BEW-ALT-HEL-SEX
https://ld.stadt-zuerich.ch/statistics/BEW-ALT-HEL
https://ld.stadt-zuerich.ch/statistics/BEW-ALT-SEX
https://ld.stadt-zuerich.ch/statistics/AST-BTA
https://ld.stadt-zuerich.ch/statistics/TIA-BTA-TIG
https://ld.stadt-zuerich.ch/statistics/WRT-BTA-EAP
https://ld.stadt-zuerich.ch/statistics/TII-BTA-TIG
https://ld.stadt-zuerich.ch/statistics/TII-BTA
https://ld.stadt-zuerich.ch/statistics/TIA-BTA
https://ld.stadt-zuerich.ch/statistics/AST-BEW-BTA
https://ld.stadt-zuerich.ch/statistics/ZUS-BTA-SEX
https://ld.stadt-zuerich.ch/statistics/ZUS-BTA
https://ld.stadt-zuerich.ch/statistics/ZUS-BTA-HEL
https://ld.stadt-zuerich.ch/statistics/BES-BTA-SEX
https://ld.stadt-zuerich.ch/statistics/BES-BTA
https://ld.stadt-zuerich.ch/statistics/ZUS-BTA-ZSA
https://ld.stadt-zuerich.ch/statistics/QMP-EIG-HAA-OBJ-ZIM
Total: 34, Shown: 34
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# List all data cubes\n", "\n", "SELECT ?cubes\n", "FROM \n", "WHERE {\n", " ?cubes a .\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All cubes have the same data structure. This means we can write very similar queries to get data from different cubes.\n", "\n", "Data cubes have the following properties:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Cube properties\n", "\n", "SELECT DISTINCT ?properties\n", "FROM \n", "WHERE {\n", " ?s a ;\n", " ?properties ?o.\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's learn more about available data cubes. We can access cube description using `` property. We will also get its id using ``. By defining schema and cube as `PREFIX`, we can save some typing." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
cubedescriptionid
https://ld.stadt-zuerich.ch/statistics/WHGStatistische WohnungenWHG
https://ld.stadt-zuerich.ch/statistics/WHGStatistische Wohnungen nach ZeitWHG
https://ld.stadt-zuerich.ch/statistics/WHA-ZIMWohnungen (ohne Appartements) nach Zeit, Zimmerzahl einer WohnungWHA-ZIM
https://ld.stadt-zuerich.ch/statistics/ZIMZimmerZIM
https://ld.stadt-zuerich.ch/statistics/ZIMZimmer nach ZeitZIM
https://ld.stadt-zuerich.ch/statistics/ZIM-WHAZimmer nach Zeit, WohnungsartZIM-WHA
https://ld.stadt-zuerich.ch/statistics/WHAWohnungen (ohne Appartements)WHA
https://ld.stadt-zuerich.ch/statistics/WHAWohnungen (ohne Appartements) nach ZeitWHA
https://ld.stadt-zuerich.ch/statistics/APZAppartementzimmerAPZ
https://ld.stadt-zuerich.ch/statistics/APZAppartementzimmer nach ZeitAPZ
https://ld.stadt-zuerich.ch/statistics/WHG-ZIMStatistische Wohnungen nach Zeit, Zimmerzahl einer WohnungWHG-ZIM
https://ld.stadt-zuerich.ch/statistics/BEWWirtschaftliche WohnbevölkerungBEW
https://ld.stadt-zuerich.ch/statistics/BEWWirtschaftliche Wohnbevölkerung nach ZeitBEW
https://ld.stadt-zuerich.ch/statistics/BEW-SEXWirtschaftliche Wohnbevölkerung nach Geschlecht, ZeitBEW-SEX
https://ld.stadt-zuerich.ch/statistics/BEW-HEL-SEXWirtschaftliche Wohnbevölkerung nach Heimatland, Geschlecht, ZeitBEW-HEL-SEX
https://ld.stadt-zuerich.ch/statistics/ANT-GGH-HELAnteil nach Grundgesamtheit, Heimatland, ZeitANT-GGH-HEL
https://ld.stadt-zuerich.ch/statistics/BEW-HELWirtschaftliche Wohnbevölkerung nach Heimatland, ZeitBEW-HEL
https://ld.stadt-zuerich.ch/statistics/BEW-ALTWirtschaftliche Wohnbevölkerung nach Alter, ZeitBEW-ALT
https://ld.stadt-zuerich.ch/statistics/GES-ALT-SEX-TOUSterbefälle (wirtschaftlich) nach Alter, Geschlecht, Zeit, TodesursachenGES-ALT-SEX-TOU
https://ld.stadt-zuerich.ch/statistics/GES-SEX-TOUSterbefälle (wirtschaftlich) nach Geschlecht, Zeit, TodesursachenGES-SEX-TOU
https://ld.stadt-zuerich.ch/statistics/GES-ALT-SEXSterbefälle (wirtschaftlich) nach Alter, Geschlecht, ZeitGES-ALT-SEX
https://ld.stadt-zuerich.ch/statistics/GES-SEXSterbefälle (wirtschaftlich) nach Geschlecht, ZeitGES-SEX
https://ld.stadt-zuerich.ch/statistics/BEW-ALT-HEL-SEXWirtschaftliche Wohnbevölkerung nach Alter, Heimatland, Geschlecht, ZeitBEW-ALT-HEL-SEX
https://ld.stadt-zuerich.ch/statistics/BEW-ALT-HELWirtschaftliche Wohnbevölkerung nach Alter, Heimatland, ZeitBEW-ALT-HEL
https://ld.stadt-zuerich.ch/statistics/BEW-ALT-SEXWirtschaftliche Wohnbevölkerung nach Alter, Geschlecht, ZeitBEW-ALT-SEX
https://ld.stadt-zuerich.ch/statistics/AST-BTAArbeitsstätten nach Betriebsart, ZeitAST-BTA
https://ld.stadt-zuerich.ch/statistics/TIA-BTA-TIGTierarten nach Betriebsart, Tiergattung, ZeitTIA-BTA-TIG
https://ld.stadt-zuerich.ch/statistics/WRT-BTA-EAPWert nach Betriebsart, Erfolgsrechnung, ZeitWRT-BTA-EAP
https://ld.stadt-zuerich.ch/statistics/TII-BTA-TIGTierindividuen nach Betriebsart, Tiergattung, ZeitTII-BTA-TIG
https://ld.stadt-zuerich.ch/statistics/TII-BTATierindividuen nach Betriebsart, ZeitTII-BTA
https://ld.stadt-zuerich.ch/statistics/TIA-BTATierarten nach Betriebsart, ZeitTIA-BTA
https://ld.stadt-zuerich.ch/statistics/AST-BEW-BTAArbeitsstätten nach Bewilligung, Betriebsart, ZeitAST-BEW-BTA
https://ld.stadt-zuerich.ch/statistics/ZUS-BTA-SEXZuschauer/innen, Besucher/innen nach Betriebsart, Geschlecht, ZeitZUS-BTA-SEX
https://ld.stadt-zuerich.ch/statistics/ZUS-BTAZuschauer/innen, Besucher/innen nach Betriebsart, ZeitZUS-BTA
https://ld.stadt-zuerich.ch/statistics/ZUS-BTA-HELZuschauer/innen, Besucher/innen nach Betriebsart, Heimatland, ZeitZUS-BTA-HEL
https://ld.stadt-zuerich.ch/statistics/BES-BTA-SEXBeschäftigte nach Betriebsart, Geschlecht, ZeitBES-BTA-SEX
https://ld.stadt-zuerich.ch/statistics/BES-BTABeschäftigte nach Betriebsart, ZeitBES-BTA
https://ld.stadt-zuerich.ch/statistics/ZUS-BTA-ZSAZuschauer/innen, Besucher/innen nach Betriebsart, Zeit, Zuschauer- bzw. BesucherartZUS-BTA-ZSA
https://ld.stadt-zuerich.ch/statistics/QMP-EIG-HAA-OBJ-ZIMQuadratmeterpreis nach Eigentümerart, Handänderungsart, Objektart, Zeit, Zimmerzahl einer WohnungQMP-EIG-HAA-OBJ-ZIM
Total: 39, Shown: 39
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "\n", "SELECT *\n", "FROM \n", "WHERE {\n", " ?cube a cube:Cube ;\n", " schema:name ?description ;\n", " schema:identifier ?id .\n", "} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see that some cubes have more than one description." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Observation set\n", "\n", "Observations in data cubes are stored in observation set (``). Let's check the observationSet definition in its [documentation](https://cube.link/):\n", "\n", "**observationSet** \n", "*Connects a cube with a set of observations.*\n", "\n", "\n", "We can find all observations in a cube using `cube:observationSet/cube:observation` as predicate. \n", "Let's query all observations in `` cube." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# List all observations in a cube\n", "\n", "PREFIX schema: \n", "PREFIX cube: \n", "\n", "SELECT ?obs\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs.\n", "}\n", "LIMIT 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can count the number of observations using `COUNT` statement" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
count
9512
Total: 1, Shown: 1
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Count observations in a data cube\n", "\n", "PREFIX schema: \n", "PREFIX cube: \n", "\n", "SELECT (COUNT(?obs) AS ?count)\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs.\n", " # Equivalent to:\n", " # cube:observationSet ?observationSet.\n", " #?observationSet cube:observation ?obs.\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So in `BEW-HEL-SEX` cube we have almost 10k observations about the population of Zürich! " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Observations\n", "\n", "Let's take a closer look at the observations. We can get all observation properties using this query:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
obsProperty
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
https://ld.stadt-zuerich.ch/statistics/property/ZEIT
https://ld.stadt-zuerich.ch/statistics/property/RAUM
https://ld.stadt-zuerich.ch/statistics/attribute/KORREKTUR
https://ld.stadt-zuerich.ch/statistics/measure/WHG
http://schema.org/identifier
https://cube.link/observedBy
https://ld.stadt-zuerich.ch/statistics/property/TIME
https://ld.stadt-zuerich.ch/statistics/property/ZIM
https://ld.stadt-zuerich.ch/statistics/measure/WHA
https://ld.stadt-zuerich.ch/statistics/measure/ZIM
https://ld.stadt-zuerich.ch/statistics/property/WHA
https://ld.stadt-zuerich.ch/statistics/measure/APZ
https://ld.stadt-zuerich.ch/statistics/measure/BEW
https://ld.stadt-zuerich.ch/statistics/property/SEX
https://ld.stadt-zuerich.ch/statistics/property/HEL
https://ld.stadt-zuerich.ch/statistics/property/GGH
https://ld.stadt-zuerich.ch/statistics/measure/ANT
https://ld.stadt-zuerich.ch/statistics/property/ALT
https://ld.stadt-zuerich.ch/statistics/measure/GES
https://ld.stadt-zuerich.ch/statistics/property/TOU
https://ld.stadt-zuerich.ch/statistics/property/BTA
https://ld.stadt-zuerich.ch/statistics/measure/AST
https://ld.stadt-zuerich.ch/statistics/property/TIG
https://ld.stadt-zuerich.ch/statistics/measure/TIA
https://ld.stadt-zuerich.ch/statistics/measure/WRT
https://ld.stadt-zuerich.ch/statistics/property/EAP
https://ld.stadt-zuerich.ch/statistics/measure/TII
https://ld.stadt-zuerich.ch/statistics/property/BEW
https://ld.stadt-zuerich.ch/statistics/measure/ZUS
https://ld.stadt-zuerich.ch/statistics/measure/BES
https://ld.stadt-zuerich.ch/statistics/property/ZSA
https://ld.stadt-zuerich.ch/statistics/property/OBJ
https://ld.stadt-zuerich.ch/statistics/property/EIG
https://ld.stadt-zuerich.ch/statistics/property/HAA
https://ld.stadt-zuerich.ch/statistics/measure/QMP
Total: 36, Shown: 36
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Observation properties\n", "\n", "PREFIX schema: \n", "PREFIX cube: \n", "\n", "SELECT DISTINCT ?obsProperty\n", "FROM \n", "WHERE {\n", " ?dataset a cube:Cube;\n", " cube:observationSet/cube:observation ?observation. \n", " ?observation ?obsProperty ?obsVal.\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Quite a bit! Now, not every dataset will have all those properties. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's get a list of properties for the `` data cube. We will also get its labels using `schema:name`, and descrption using `schema:description`.\n", "\n", "Since not all properties have a description, we will use the `OPTIONAL` statement to query it." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
obsPropertylabeldescription
https://ld.stadt-zuerich.ch/statistics/property/ZEITZeitRepräsentation der Zeit als Konzept, repräsentiert die vollständige Komplexität.
https://ld.stadt-zuerich.ch/statistics/property/RAUMRaum
https://ld.stadt-zuerich.ch/statistics/measure/BEWWirtschaftliche WohnbevölkerungWirtschaftliche Wohnbevölkerung
https://ld.stadt-zuerich.ch/statistics/property/HELHeimatland
https://ld.stadt-zuerich.ch/statistics/property/SEXGeschlecht
https://ld.stadt-zuerich.ch/statistics/property/TIMEZeitRepräsentation der Zeit als ISO 8601 Datum (vereinfacht).
Total: 6, Shown: 6
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "\n", "SELECT DISTINCT ?obsProperty ?label ?description \n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?observation. \n", " ?observation ?obsProperty ?obsVal.\n", " ?obsProperty schema:name ?label .\n", " OPTIONAL {\n", " ?obsProperty schema:description ?description .\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that `BEW-HEL-SEX` data cube contains data on the population across:\n", "* time (german: *Zeit*)\n", "* place (german: *Raum*)\n", "* gender (german: *Geschlecht*)\n", "* origin (german: *Heimatland*)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Knowing properties, we can directlty query the measurements. Now, we are able to find the actual data: number of inhabitants across time, place, gender, and origin.\n", "\n", "We will find the measurements using `` as the observation's property. We will also the cube's dimensions using:\n", "* `` for place \n", "* `` for time \n", "* `` for origin\n", "* `` for gender\n", "\n", "And again, making use of `PREFIX` makes the query more readable." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
timeplaceorigingendercount
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00012https://ld.stadt-zuerich.ch/statistics/code/HEL2000https://ld.stadt-zuerich.ch/statistics/code/SEX000294.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00013https://ld.stadt-zuerich.ch/statistics/code/HEL2000https://ld.stadt-zuerich.ch/statistics/code/SEX0001157.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00014https://ld.stadt-zuerich.ch/statistics/code/HEL1000https://ld.stadt-zuerich.ch/statistics/code/SEX0002267.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00024https://ld.stadt-zuerich.ch/statistics/code/HEL2000https://ld.stadt-zuerich.ch/statistics/code/SEX00021686.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00033https://ld.stadt-zuerich.ch/statistics/code/HEL1000https://ld.stadt-zuerich.ch/statistics/code/SEX00024694.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00042https://ld.stadt-zuerich.ch/statistics/code/HEL2000https://ld.stadt-zuerich.ch/statistics/code/SEX00021709.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00044https://ld.stadt-zuerich.ch/statistics/code/HEL2000https://ld.stadt-zuerich.ch/statistics/code/SEX00012687.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00052https://ld.stadt-zuerich.ch/statistics/code/HEL1000https://ld.stadt-zuerich.ch/statistics/code/SEX00012256.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00072https://ld.stadt-zuerich.ch/statistics/code/HEL2000https://ld.stadt-zuerich.ch/statistics/code/SEX00021613.0
2017-12-31https://ld.stadt-zuerich.ch/statistics/code/R00074https://ld.stadt-zuerich.ch/statistics/code/HEL1000https://ld.stadt-zuerich.ch/statistics/code/SEX00024453.0
Total: 10, Shown: 10
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "PREFIX property: \n", "\n", "SELECT ?time ?place ?origin ?gender ?count\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs. \n", " \n", " ?obs ?count ;\n", " property:RAUM ?place ;\n", " property:TIME ?time ;\n", " property:HEL ?origin ;\n", " property:SEX ?gender .\n", " \n", "}\n", "LIMIT 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make the output more readable, let's replace the IRIs with the corresponding labels. This can be achieved by using `schema:name` for each property. " ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
timeplaceorigingendercount
2017-12-31HochschulenAuslandweiblich94.0
2017-12-31LindenhofAuslandmännlich157.0
2017-12-31CitySchweizweiblich267.0
2017-12-31EngeAuslandweiblich1686.0
2017-12-31FriesenbergSchweizweiblich4694.0
2017-12-31LangstrasseAuslandweiblich1709.0
2017-12-31HardAuslandmännlich2687.0
2017-12-31Escher WyssSchweizmännlich2256.0
2017-12-31HottingenAuslandweiblich1613.0
2017-12-31WitikonSchweizweiblich4453.0
Total: 10, Shown: 10
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "PREFIX property: \n", "\n", "SELECT ?time ?place ?origin ?gender ?count\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs. \n", " \n", " ?obs ?count ;\n", " property:RAUM/schema:name ?place ;\n", " property:TIME ?time ;\n", " property:HEL/schema:name ?origin ;\n", " property:SEX/schema:name ?gender .\n", " \n", "}\n", "LIMIT 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Volià! We now got all measurements across all dimensions in the `BEW-ALT-SEX` data cube. The data could be further refined, for example by filtering on `gender` or `place`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's do the same exercise for the `QMP-EIG-HAA-OBJ-ZIM` data cube. \n", "\n", "First, let's get observations' properties." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
obsPropertylabeldescription
https://ld.stadt-zuerich.ch/statistics/property/ZEITZeitRepräsentation der Zeit als Konzept, repräsentiert die vollständige Komplexität.
https://ld.stadt-zuerich.ch/statistics/property/RAUMRaum
https://ld.stadt-zuerich.ch/statistics/property/ZIMZimmerzahl einer Wohnung
https://ld.stadt-zuerich.ch/statistics/property/OBJObjektart
https://ld.stadt-zuerich.ch/statistics/property/EIGEigentümerart
https://ld.stadt-zuerich.ch/statistics/property/HAAHandänderungsart
https://ld.stadt-zuerich.ch/statistics/measure/QMPQuadratmeterpreisPreis in Franken pro Quadratmeter
https://ld.stadt-zuerich.ch/statistics/property/TIMEZeitRepräsentation der Zeit als ISO 8601 Datum (vereinfacht).
Total: 8, Shown: 8
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "\n", "SELECT DISTINCT ?obsProperty ?label ?description \n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?observation. \n", " ?observation ?obsProperty ?obsVal.\n", " ?obsProperty schema:name ?label .\n", " OPTIONAL {\n", " ?obsProperty schema:description ?description .\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's use these properties to query the data." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Error: HTTP error: 504 Gateway Time-out: \r", "
504 Gateway Time-out\r", "
\r", "
504 Gateway Time-out\r", "
nginx\r", "
\r", "
\r", "
" ], "text/plain": [ "Error: HTTP error: 504 Gateway Time-out: \r\n", "504 Gateway Time-out\r\n", "\r\n", "504 Gateway Time-out\r\n", "nginx\r\n", "\r\n", "\r\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "PREFIX property: \n", "\n", "SELECT ?time ?place ?rooms ?ownership ?haa ?type ?price\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs . \n", " \n", " ?obs ?price ;\n", " property:TIME ?time ;\n", " property:RAUM/schema:name ?place ;\n", " property:ZIM/schema:name ?rooms ;\n", " property:EIG/schema:name ?ownership ;\n", " property:HAA/schema:name ?market ;\n", " property:OBJ/schema:name ?type .\n", " \n", "}\n", "LIMIT 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see that some prices come back as zero. Getting an appartment for free is extremely rare in Switzerland. That's presumably a data issue. \n", " \n", "Let's remove those invalid observations using a `FILTER` statement." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "PREFIX property: \n", "\n", "SELECT ?time ?place ?rooms ?ownership ?haa ?type ?price\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs . \n", " \n", " ?obs ?price ;\n", " property:TIME ?time ;\n", " property:RAUM/schema:name ?place ;\n", " property:ZIM/schema:name ?rooms ;\n", " property:EIG/schema:name ?ownership ;\n", " property:HAA/schema:name ?market ;\n", " property:OBJ/schema:name ?type .\n", " FILTER (?price > 0) \n", "}\n", "LIMIT 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's better!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data slices\n", "\n", "Thus far, we queried all observations from the data cube. Sometimes, you are interested in just some of the dimensions. You can aggregate the results using `GROUP BY` and `SUM` statements.\n", "\n", "Let's find the total number of people in Zurich accross `property:TIME`. To do this, we sum the remaining dimensions:\n", "* `property:RAUM`\n", "* `property:HEL`\n", "* `property:SEX` \n", "\n", "This can be achieved by leaving them out of the `GROUP BY` statement. All unmentioned variables will be aggregated." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "PREFIX property: \n", "\n", "SELECT ?time (SUM(?count) AS ?aggCount)\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs. \n", " \n", " ?obs ?count ;\n", " property:TIME ?time .\n", " \n", "}\n", "GROUP BY ?time\n", "LIMIT 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also filter the results to include only one dimension. This can be done by setting the object to the value you are interested in.\n", "\n", "Let's get the number of female inhabitants over time. First, we will find IRIs for gender." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "PREFIX property: \n", "\n", "SELECT DISTINCT ?genderIRI ?gender\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs. \n", " \n", " ?obs property:SEX ?genderIRI.\n", " ?obs property:SEX/schema:name ?gender .\n", " \n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`` stands for female (german *weiblich*). We will use it to filter observations for female only. \n", "\n", "In SPARQL, this can be done by setting observation predicate to `property:SEX`, and its object to ``." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "PREFIX schema: \n", "PREFIX cube: \n", "PREFIX property: \n", "\n", "SELECT ?time ?place ?origin ?count\n", "FROM \n", "WHERE {\n", " a cube:Cube;\n", " cube:observationSet/cube:observation ?obs. \n", " \n", " ?obs ?count ;\n", " property:RAUM/schema:name ?place ;\n", " property:TIME ?time ;\n", " property:HEL/schema:name ?origin ;\n", " property:SEX .\n", " \n", "} \n", "LIMIT 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Volià, the number of female inhabitants across time, origin, and city district." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SPEX\n", "\n", "There are many ways to learn about the structure of linked data. SPARQL queries may seem dense at first and may take some time to get you to the right place. However, as you get better with SPARQL, they are a great tool to slice data.\n", "\n", "Another way to discover linked data is [SPEX](https://spex.zazuko.com/#/?url=https%3A%2F%2Fld.stadt-zuerich.ch%2Fquery&graph=https%3A%2F%2Flindas.admin.ch%2Fstadtzuerich%2Fstat%2Fviews&prefixes=wdt%3Ahttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F&forceIntrospection=false), a web based tool that visualizes the structure of a linked data source.\n", "\n", "

SPEX

\n", "" ] } ], "metadata": { "kernelspec": { "display_name": "SPARQL", "language": "sparql", "name": "sparql" }, "language_info": { "codemirror_mode": { "name": "sparql" }, "mimetype": "application/sparql-query", "name": "sparql", "pygments_lexer": "sparql-nb" }, "title": "Data Model" }, "nbformat": 4, "nbformat_minor": 4 }