{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualizing Spatial Information - California Housing\n", "\n", "This demo shows a simple workflow when working with geospatial data:\n", "\n", " * Obtaining a dataset which includes geospatial references.\n", " * Obtaining the desired geometries (boundaries, etc.).\n", " * Visualisation.\n", " \n", "In this example we will make a simple **proportional symbols map** using the `California Housing` dataset from the `sklearn` package." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:31.345427Z", "iopub.status.busy": "2024-04-26T11:57:31.345427Z", "iopub.status.idle": "2024-04-26T11:57:32.290739Z", "shell.execute_reply": "2024-04-26T11:57:32.290739Z" } }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import geopandas as gpd\n", "\n", "from lets_plot import *" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:32.290739Z", "iopub.status.busy": "2024-04-26T11:57:32.290739Z", "iopub.status.idle": "2024-04-26T11:57:32.306897Z", "shell.execute_reply": "2024-04-26T11:57:32.306897Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", " " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "LetsPlot.setup_html()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare the dataset" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:32.322652Z", "iopub.status.busy": "2024-04-26T11:57:32.322652Z", "iopub.status.idle": "2024-04-26T11:57:32.840257Z", "shell.execute_reply": "2024-04-26T11:57:32.840257Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   MedInc  HouseAge  AveRooms  AveBedrms  Population  AveOccup  Latitude  Longitude  Value($)
0  8.3252  41.0      6.984127  1.023810   322.0       2.555556  37.88     -122.23    452600.0
1  8.3014  21.0      6.238137  0.971880   2401.0      2.109842  37.86     -122.22    358500.0
2  7.2574  52.0      8.288136  1.073446   496.0       2.802260  37.85     -122.24    352100.0
3  5.6431  52.0      5.817352  1.073059   558.0       2.547945  37.85     -122.25    341300.0
4  3.8462  52.0      6.281853  1.081081   565.0       2.181467  37.85     -122.25    342200.0
\n", "
" ], "text/plain": [ " MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude \\\n", "0 8.3252 41.0 6.984127 1.023810 322.0 2.555556 37.88 \n", "1 8.3014 21.0 6.238137 0.971880 2401.0 2.109842 37.86 \n", "2 7.2574 52.0 8.288136 1.073446 496.0 2.802260 37.85 \n", "3 5.6431 52.0 5.817352 1.073059 558.0 2.547945 37.85 \n", "4 3.8462 52.0 6.281853 1.081081 565.0 2.181467 37.85 \n", "\n", " Longitude Value($) \n", "0 -122.23 452600.0 \n", "1 -122.22 358500.0 \n", "2 -122.24 352100.0 \n", "3 -122.25 341300.0 \n", "4 -122.25 342200.0 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.datasets import fetch_california_housing\n", "\n", "california_housing_bunch = fetch_california_housing()\n", "data = pd.DataFrame(california_housing_bunch.data, columns=california_housing_bunch.feature_names)\n", "\n", "# Add $-value field to the dataframe.\n", "# dataset.target: numpy array of shape (20640,)\n", "# Each value corresponds to the average house value in units of 100,000.\n", "data['Value($)'] = california_housing_bunch.target * 100000\n", "data.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:32.842697Z", "iopub.status.busy": "2024-04-26T11:57:32.842697Z", "iopub.status.idle": "2024-04-26T11:57:32.856156Z", "shell.execute_reply": "2024-04-26T11:57:32.856156Z" } }, "outputs": [], "source": [ "# Draw a random sample from the data set.\n", "data = data.sample(n=1000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Static map\n", "\n", "Let's create a static map using regular `ggplot2` geometries.\n", "\n", "Various shape files related to the state of California are available on the https://data.ca.gov website.\n", "\n", "For the purpose of this demo, the California State Boundary zip was downloaded from \n", "https://data.ca.gov/dataset/ca-geographic-boundaries and unpacked to the `ca-state-boundary` subdirectory." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Use `geopandas` to read a shape file to GeoDataFrame" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:32.856156Z", "iopub.status.busy": "2024-04-26T11:57:32.856156Z", "iopub.status.idle": "2024-04-26T11:57:33.455463Z", "shell.execute_reply": "2024-04-26T11:57:33.455463Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The geodata is provided by © OpenStreetMap contributors and is made available here under the Open Database License (ODbL).\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   state  found name  geometry
0  CA     California  MULTIPOLYGON (((-124.32694 40.61620, -124.3118...
\n", "
" ], "text/plain": [ " state found name geometry\n", "0 CA California MULTIPOLYGON (((-124.32694 40.61620, -124.3118..." ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#CA = gpd.read_file(\"./ca-state-boundary/CA_State_TIGER2016.shp\")\n", "\n", "from lets_plot.geo_data import *\n", "\n", "CA = geocode_states('CA').scope('US').inc_res(2).get_boundaries()\n", "CA.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Keeping in mind that our target is the housing value, fill the choropleth over the state contours using the `geom_map()` function." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make a plot out of a polygon and points\n", "\n", "The color of the points will reflect the house age and\n", "the size of the points will reflect the value of the house." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:33.455463Z", "iopub.status.busy": "2024-04-26T11:57:33.455463Z", "iopub.status.idle": "2024-04-26T11:57:33.533837Z", "shell.execute_reply": "2024-04-26T11:57:33.533837Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The plot base \n", "p = ggplot() + scale_color_gradient(name='House Age', low='red', high='green')\n", "\n", "# The points layer\n", "points = geom_point(aes(x='Longitude',\n", " y='Latitude',\n", " size='Value($)',\n", " color='HouseAge'), \n", " data=data,\n", " alpha=0.8)\n", "\n", "# The map\n", "p + geom_polygon(data=CA, fill='#F8F4F0', color='#B71234')\\\n", " + points\\\n", " + theme_void()\\\n", " + ggsize(600, 500)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Interactive map\n", "\n", "The `geom_livemap()` function creates an interactive base-map super-layer to which other geometry layers are added." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configuring map tiles\n", "\n", "By default, *Lets-Plot* offers high-quality vector map tiles, but it can also fetch raster tiles from third-party Z-X-Y [tile servers](https://wiki.openstreetmap.org/wiki/Tile_servers).\n", "\n", "For the sake of the demo, let's use *CARTO Antique* tiles by [CARTO](https://carto.com/attribution/) as our basemap." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:33.533837Z", "iopub.status.busy": "2024-04-26T11:57:33.533837Z", "iopub.status.idle": "2024-04-26T11:57:33.550432Z", "shell.execute_reply": "2024-04-26T11:57:33.549699Z" } }, "outputs": [], "source": [ "LetsPlot.set(\n", " maptiles_zxy(\n", " url='https://cartocdn_c.global.ssl.fastly.net/base-antique/{z}/{x}/{y}@2x.png',\n", " attribution='© OpenStreetMap contributors © CARTO, © CARTO'\n", " )\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make a plot similar to the one above but interactive" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:33.550432Z", "iopub.status.busy": "2024-04-26T11:57:33.550432Z", "iopub.status.idle": "2024-04-26T11:57:33.612798Z", "shell.execute_reply": "2024-04-26T11:57:33.612798Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "p + geom_livemap()\\\n", " + geom_polygon(data=CA, fill='white', color='#B71234', alpha=0.5)\\\n", " + points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adjust the initial viewport\n", "\n", "Use parameters `location` and `zoom` to define the initial viewport." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:57:33.612798Z", "iopub.status.busy": "2024-04-26T11:57:33.612798Z", "iopub.status.idle": "2024-04-26T11:57:33.675523Z", "shell.execute_reply": "2024-04-26T11:57:33.675523Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Pass `[lon,lat]` value to the `location` (near Los Angeles)\n", "p + geom_livemap(location=[-118.15, 33.96], zoom=7)\\\n", " + geom_polygon(data=CA, fill='white', color='#B71234', alpha=0.5, size=1)\\\n", " + points" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 4 }