{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualization of the Titanic's Voyage" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:16.583380Z", "iopub.status.busy": "2026-01-27T17:14:16.583301Z", "iopub.status.idle": "2026-01-27T17:14:16.586080Z", "shell.execute_reply": "2026-01-27T17:14:16.585780Z" } }, "outputs": [], "source": [ "import pandas as pd\n", "\n", "from lets_plot import *" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:16.586881Z", "iopub.status.busy": "2026-01-27T17:14:16.586801Z", "iopub.status.idle": "2026-01-27T17:14:16.588401Z", "shell.execute_reply": "2026-01-27T17:14:16.588184Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", " " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "LetsPlot.setup_html()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Data Preparation\n", "\n", "The Titanic dataset is available at [kaggle](https://www.kaggle.com) : [\"Titanic: cleaned data\" dataset](https://www.kaggle.com/jamesleslie/titanic-cleaned-data?select=train_clean.csv) (train_clean.csv)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:16.601845Z", "iopub.status.busy": "2026-01-27T17:14:16.601722Z", "iopub.status.idle": "2026-01-27T17:14:17.013010Z", "shell.execute_reply": "2026-01-27T17:14:17.012678Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AgeCabinEmbarkedFareNameParchPassengerIdPclassSexSibSpSurvivedTicketTitleFamily_Size
022.0NaNS7.2500Braund, Mr. Owen Harris013male10.0A/5 21171Mr1
138.0C85C71.2833Cumings, Mrs. John Bradley (Florence Briggs Th...021female11.0PC 17599Mrs1
226.0NaNS7.9250Heikkinen, Miss. Laina033female01.0STON/O2. 3101282Miss0
\n", "
" ], "text/plain": [ " Age Cabin Embarked Fare \\\n", "0 22.0 NaN S 7.2500 \n", "1 38.0 C85 C 71.2833 \n", "2 26.0 NaN S 7.9250 \n", "\n", " Name Parch PassengerId \\\n", "0 Braund, Mr. Owen Harris 0 1 \n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... 0 2 \n", "2 Heikkinen, Miss. Laina 0 3 \n", "\n", " Pclass Sex SibSp Survived Ticket Title Family_Size \n", "0 3 male 1 0.0 A/5 21171 Mr 1 \n", "1 1 female 1 1.0 PC 17599 Mrs 1 \n", "2 3 female 0 1.0 STON/O2. 3101282 Miss 0 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/titanic.csv\")\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this Titanic dataset the column `Embarked`contains a single-letter codes of the Titanic's ports of embarkation:\n", "- S: Southampton (UK)\n", "- C: Cherbourg (France)\n", "- Q: Cobh (Ireland)\n", "\n", "Let's add new colum \"Port\" to the data:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.013931Z", "iopub.status.busy": "2026-01-27T17:14:17.013846Z", "iopub.status.idle": "2026-01-27T17:14:17.021179Z", "shell.execute_reply": "2026-01-27T17:14:17.020954Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AgeCabinEmbarkedFareNameParchPassengerIdPclassSexSibSpSurvivedTicketTitleFamily_SizePort
022.0NaNS7.2500Braund, Mr. Owen Harris013male10.0A/5 21171Mr1Southampton
138.0C85C71.2833Cumings, Mrs. John Bradley (Florence Briggs Th...021female11.0PC 17599Mrs1Cherbourg
226.0NaNS7.9250Heikkinen, Miss. Laina033female01.0STON/O2. 3101282Miss0Southampton
\n", "
" ], "text/plain": [ " Age Cabin Embarked Fare \\\n", "0 22.0 NaN S 7.2500 \n", "1 38.0 C85 C 71.2833 \n", "2 26.0 NaN S 7.9250 \n", "\n", " Name Parch PassengerId \\\n", "0 Braund, Mr. Owen Harris 0 1 \n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... 0 2 \n", "2 Heikkinen, Miss. Laina 0 3 \n", "\n", " Pclass Sex SibSp Survived Ticket Title Family_Size \\\n", "0 3 male 1 0.0 A/5 21171 Mr 1 \n", "1 1 female 1 1.0 PC 17599 Mrs 1 \n", "2 3 female 0 1.0 STON/O2. 3101282 Miss 0 \n", "\n", " Port \n", "0 Southampton \n", "1 Cherbourg \n", "2 Southampton " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def to_port_name (row):\n", " if row['Embarked'] == 'S' :\n", " return 'Southampton'\n", " if row['Embarked'] == 'C' :\n", " return 'Cherbourg'\n", " if row['Embarked'] == 'Q' :\n", " return 'Cobh'\n", " return 'Other'\n", "\n", "df['Port']=df.apply (lambda row: to_port_name(row), axis=1)\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.1 Travellers Survival Rates by the Port of Embarkation" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.022087Z", "iopub.status.busy": "2026-01-27T17:14:17.022012Z", "iopub.status.idle": "2026-01-27T17:14:17.060406Z", "shell.execute_reply": "2026-01-27T17:14:17.060120Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c_surv=\"#E1A439\"\n", "c_lost=\"#6B9993\"\n", "\n", "bars = (ggplot(df) +\n", " geom_bar(\n", " aes('Port', fill=as_discrete('Survived')), \n", " tooltips=layer_tooltips()\n", " .line('@{..count..} (@{..prop..})')\n", " .format('@{..prop..}', '.0%'),\n", " position='dodge') +\n", " scale_fill_manual(values=[c_lost, c_surv], labels=['no', 'yes']) +\n", " scale_x_discrete(limits=['Cobh', 'Cherbourg', 'Southampton']) +\n", " labs(x=\"\", y=\"Travellers count\") + \n", " ggtitle(\"Survival by the Port of Embarkation\")\n", ") \n", "\n", "bars + ggsize(800, 300)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Ports of Embarkation on Map\n", "\n", "The Titanic's ports of embarkation were:\n", "- Southampton (UK)\n", "- Cherbourg (France)\n", "- Cobh (Ireland)\n", "\n", "Let's find the geographical coordinates of these cities using the `Lets-Plot` geocoding module."
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.061485Z", "iopub.status.busy": "2026-01-27T17:14:17.061381Z", "iopub.status.idle": "2026-01-27T17:14:17.090340Z", "shell.execute_reply": "2026-01-27T17:14:17.089999Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The geodata is provided by © OpenStreetMap contributors and is made available here under the Open Database License (ODbL).\n" ] } ], "source": [ "from lets_plot.geo_data import *\n", "\n", "ports = ['Southampton', 'Cherbourg', 'Cobh']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.1 Spatial DataFrame (Pandas GeoDataFrame)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.091471Z", "iopub.status.busy": "2026-01-27T17:14:17.091391Z", "iopub.status.idle": "2026-01-27T17:14:17.546604Z", "shell.execute_reply": "2026-01-27T17:14:17.546127Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
cityfound namegeometry
0SouthamptonSouthamptonPOINT (-1.40254 50.91837)
1CherbourgCherbourgPOINT (-1.60901 49.62728)
2CobhCobhPOINT (-8.29428 51.85315)
\n", "
" ], "text/plain": [ " city found name geometry\n", "0 Southampton Southampton POINT (-1.40254 50.91837)\n", "1 Cherbourg Cherbourg POINT (-1.60901 49.62728)\n", "2 Cobh Cobh POINT (-8.29428 51.85315)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ports_gcoder = (geocode_cities(ports)\n", " .where(ports[0], scope='England')\n", " .where(ports[1], scope='France'))\n", "ports_gdf = ports_gcoder.get_centroids()\n", "ports_gdf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.2 Markers on Map\n", "\n", "`Lets-Plot` API makes it easy to create an interactive basemap layer using either its own vector tiles service or \n", "by configuring a 3-rd party Z-X-Y raster tile providers.\n", "\n", "In this notebook we will use beautifull *CARTO Antique* raster tiles by [CARTO](https://carto.com/attribution/) as our basemap." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.547649Z", "iopub.status.busy": "2026-01-27T17:14:17.547522Z", "iopub.status.idle": "2026-01-27T17:14:17.549659Z", "shell.execute_reply": "2026-01-27T17:14:17.549331Z" } }, "outputs": [], "source": [ "from lets_plot import tilesets\n", "\n", "LetsPlot.set(tilesets.CARTO_ANTIQUE_HIRES)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.550437Z", "iopub.status.busy": "2026-01-27T17:14:17.550321Z", "iopub.status.idle": "2026-01-27T17:14:17.556281Z", "shell.execute_reply": "2026-01-27T17:14:17.555977Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "basemap = ggplot() + geom_livemap() + ggsize(800, 350)\n", "\n", "\n", "port_markers = geom_point(\n", " map=ports_gdf, \n", " size=7, \n", " shape=21, \n", " color=\"black\", \n", " fill=\"yellow\")\n", "\n", "basemap + port_markers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. The \"Titanic's site\" Marker" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.557054Z", "iopub.status.busy": "2026-01-27T17:14:17.556937Z", "iopub.status.idle": "2026-01-27T17:14:17.562439Z", "shell.execute_reply": "2026-01-27T17:14:17.562092Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from shapely.geometry import Point, LineString\n", "titanic_site = Point(-38.056641, 46.920255)\n", "\n", "titanic_site_marker = geom_point(x=titanic_site.x, y = titanic_site.y, size=10, shape=9, color='red')\n", "\n", "basemap + port_markers + titanic_site_marker" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. The New York City Marker\n", "\n", "New York City was the Titanic's destination." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.563564Z", "iopub.status.busy": "2026-01-27T17:14:17.563445Z", "iopub.status.idle": "2026-01-27T17:14:17.981148Z", "shell.execute_reply": "2026-01-27T17:14:17.980794Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "NYC = geocode_cities(['New York']).get_centroids().geometry[0]\n", "NYC_marker = geom_point(x=NYC.x, y=NYC.y, size=7, shape=21, color='black', fill='white')\n", "\n", "(basemap + \n", " port_markers +\n", " titanic_site_marker +\n", " NYC_marker\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Connecting Markers on Map\n", "\n", "To connect the markers on the map, we will create a `LineString` object (from the `Shapely` package)." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.982897Z", "iopub.status.busy": "2026-01-27T17:14:17.982773Z", "iopub.status.idle": "2026-01-27T17:14:17.985965Z", "shell.execute_reply": "2026-01-27T17:14:17.985656Z" } }, "outputs": [], "source": [ "from geopandas import GeoSeries\n", "from geopandas import GeoDataFrame\n", "\n", "# Points of embarkation (GeoSeries).\n", "port_points = ports_gdf.geometry\n", "path_points = pd.concat([port_points, GeoSeries([titanic_site, NYC], crs=ports_gdf.crs)], ignore_index=True)\n", "\n", "# Create a new GeoDataFrame containing a `LineString` geometry.\n", "path_gdf = GeoDataFrame(\n", " dict(geometry=[ LineString(path_points) ])\n", ")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.987244Z", "iopub.status.busy": "2026-01-27T17:14:17.987127Z", "iopub.status.idle": "2026-01-27T17:14:17.993693Z", "shell.execute_reply": "2026-01-27T17:14:17.993382Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Add \"path\" to the map.\n", "titanic_path = geom_path(\n", " map=path_gdf, \n", " color='dark-blue', \n", " linetype='dotted', size=1.2)\n", "\n", "(basemap +\n", " titanic_path +\n", " port_markers +\n", " titanic_site_marker +\n", " NYC_marker\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Pie-chart Markers on Map" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.1 Travellers Survival Rates, Now with Pie" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:17.994750Z", "iopub.status.busy": "2026-01-27T17:14:17.994630Z", "iopub.status.idle": "2026-01-27T17:14:18.008935Z", "shell.execute_reply": "2026-01-27T17:14:18.008638Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pies = (ggplot(df) +\n", " geom_pie(\n", " aes(x='Port', y=\"..sum..\", fill=as_discrete('Survived'), size=\"..sum..\"), \n", " labels=layer_labels()\n", " .line('@{..count..}')\n", " .line('(@{..prop..})').format('@{..prop..}', '.0%'),\n", " tooltips=layer_tooltips().title(\"@Port (@{..sum..})\"),\n", " stroke=1.5, \n", " hole=0.5) +\n", " scale_fill_manual(values=[c_lost, c_surv], labels=['no', 'yes']) +\n", " scale_x_discrete(limits=['Cobh', 'Cherbourg', 'Southampton'], expand=[0, 0.3]) +\n", " scale_size(range=[3, 10], guide=\"none\") +\n", " ylim(0, 800) +\n", " labs(x=\"\", y=\"Travellers count\") + \n", " ggtitle(\"Survival by the Port of Embarkation\")\n", ")\n", "\n", "pies + ggsize(800, 300)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.2 Spatial Pies" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:18.009913Z", "iopub.status.busy": "2026-01-27T17:14:18.009821Z", "iopub.status.idle": "2026-01-27T17:14:18.034198Z", "shell.execute_reply": "2026-01-27T17:14:18.033952Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "spatial_pies = (\n", " geom_pie(\n", " aes(x='Port', fill=as_discrete('Survived'), size=\"..sum..\"), \n", " data=df,\n", " map=ports_gdf, \n", " map_join=['Port','city'],\n", " tooltips=layer_tooltips()\n", " .title(\"@Port (@{..sum..})\")\n", " .line('@{..count..} (@{..prop..})')\n", " .format('@{..prop..}', '.0%'),\n", " stroke=1.5, \n", " hole=0.5,\n", " color='white') +\n", " scale_fill_manual(values=[c_lost, c_surv], labels=['lost', 'survived']) +\n", " scale_size(range=[3, 10], guide=\"none\") +\n", " theme(legend_position=[0.5, 1], \n", " legend_justification=[0.5, 1], \n", " legend_direction='horizontal',\n", " legend_title=element_blank())\n", ") \n", "\n", "(basemap + \n", " titanic_path +\n", " spatial_pies +\n", " titanic_site_marker +\n", " NYC_marker\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.3 Adjusting the Map Zoom Level and Position" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2026-01-27T17:14:18.035218Z", "iopub.status.busy": "2026-01-27T17:14:18.035133Z", "iopub.status.idle": "2026-01-27T17:14:18.056992Z", "shell.execute_reply": "2026-01-27T17:14:18.056743Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(ggplot() + ggsize(900, 600) + ggtitle(\"Titanic Survival Rate by Port of Embarkation\") +\n", " geom_livemap(zoom=6, location=[-8.29, 51.85, -1.61, 49.63]) + \n", " titanic_path +\n", " port_markers +\n", " spatial_pies +\n", " titanic_site_marker +\n", " NYC_marker +\n", " theme(text=element_text(family=\"Garamond\"),\n", " plot_title=element_text(size=30))\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 4 }