{ "cells": [ { "cell_type": "markdown", "id": "edea84d1-7afd-4b54-b8c7-a68fe85ca6a2", "metadata": {}, "source": [ "# Bathymetry data from the Caribbean\n", "\n", "This dataset is a compilation of several single-beam bathymetry surveys downloaded from [NOAA NCEI](https://ngdc.noaa.gov/mgg/geodas/trackline.html). The data are originally in MGD77 format and include a header with metadata on each survey ([`MGD77_921744.h77t`](MGD77_921744.h77t)). The original data file was compressed with LZMA to save space and make it possible to upload it to this GitHub repository.\n", "\n", "License: Public domain\n", "\n", "Original source: https://ngdc.noaa.gov/mgg/geodas/trackline.html (region selected from the web interface; no direct download link)" ] }, { "cell_type": "code", "execution_count": 1, "id": "6f3df718-0b5e-458b-b63c-2f6af1f5366b", "metadata": {}, "outputs": [], "source": [ "import os\n", "import lzma\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import pyproj\n", "import verde as vd\n", "import pooch" ] }, { "cell_type": "markdown", "id": "a6d25c2d-a8f1-4c85-862d-c00af6363562", "metadata": {}, "source": [ "## Load and clean the data\n", "\n", "The data file is LZMA-compressed and is essentially a tab-delimited table, so pandas can read it directly." ] }, { "cell_type": "code", "execution_count": 2, "id": "d05412f1-20d7-42b1-bd42-e65a43893bb7", "metadata": {}, "outputs": [], "source": [ "data_full = pd.read_csv(\n", " \"MGD77_921744.m77t.xz\", \n", " sep=\"\\t\", \n", " usecols=[0, 4, 5, 9],\n", " dtype=dict(SURVEY_ID=\"str\", LON=\"float64\", LAT=\"float64\", CORR_DEPTH=\"float64\"),\n", ").dropna().reset_index(drop=True)" ] }, { "cell_type": "markdown", "id": "9c5aab55-2592-49ce-beed-cd110904c1d8", "metadata": {}, "source": [ "Rename the columns so that they are easier to type, use full names, and include units."
] }, { "cell_type": "code", "execution_count": 3, "id": "e85b4449-db66-4dfc-92e6-95c6fb6f9dc9", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | survey_id | latitude | longitude | depth_m |
|---|---|---|---|---|
| 0 | FM0501 | 24.77290 | -89.58530 | 3559.0 |
| 1 | FM0501 | 24.76070 | -89.57550 | 3561.0 |
| 2 | FM0501 | 24.74840 | -89.56560 | 3555.0 |
| 3 | FM0501 | 24.73600 | -89.55570 | 3553.0 |
| 4 | FM0501 | 24.72380 | -89.54580 | 3553.0 |
| ... | ... | ... | ... | ... |
| 2354729 | EW0003 | 9.92284 | -84.72557 | 20.0 |
| 2354730 | EW0003 | 9.92282 | -84.72565 | 20.0 |
| 2354731 | EW0003 | 9.92283 | -84.72570 | 20.0 |
| 2354732 | EW0003 | 9.92284 | -84.72574 | 21.0 |
| 2354733 | EW0003 | 9.92286 | -84.72576 | 20.0 |

2354734 rows × 4 columns

|  | survey_id | latitude | longitude | depth_m |
|---|---|---|---|---|
| 0 | FM0501 | 23.13070 | -87.99680 | 75.0 |
| 1 | FM0501 | 23.11940 | -87.98640 | 75.0 |
| 2 | FM0501 | 23.10810 | -87.97610 | 73.0 |
| 3 | FM0501 | 23.09670 | -87.96580 | 73.0 |
| 4 | FM0501 | 23.08540 | -87.95540 | 73.0 |
| ... | ... | ... | ... | ... |
| 1938090 | EW0003 | 9.92284 | -84.72557 | 20.0 |
| 1938091 | EW0003 | 9.92282 | -84.72565 | 20.0 |
| 1938092 | EW0003 | 9.92283 | -84.72570 | 20.0 |
| 1938093 | EW0003 | 9.92284 | -84.72574 | 21.0 |
| 1938094 | EW0003 | 9.92286 | -84.72576 | 20.0 |

1938095 rows × 4 columns

|  | survey_id | latitude | longitude | depth_m |
|---|---|---|---|---|
| 0 | FM0501 | 23.13070 | -87.99680 | 75 |
| 1 | FM0501 | 23.11940 | -87.98640 | 75 |
| 2 | FM0501 | 23.10810 | -87.97610 | 73 |
| 3 | FM0501 | 23.09670 | -87.96580 | 73 |
| 4 | FM0501 | 23.08540 | -87.95540 | 73 |

|  | survey_id | latitude | longitude | depth_m |
|---|---|---|---|---|
| 0 | FM0501 | 23.13070 | -87.99680 | 75 |
| 1 | FM0501 | 23.11940 | -87.98640 | 75 |
| 2 | FM0501 | 23.10810 | -87.97610 | 73 |
| 3 | FM0501 | 23.09670 | -87.96580 | 73 |
| 4 | FM0501 | 23.08540 | -87.95540 | 73 |
| ... | ... | ... | ... | ... |
| 1938090 | EW0003 | 9.92284 | -84.72557 | 20 |
| 1938091 | EW0003 | 9.92282 | -84.72565 | 20 |
| 1938092 | EW0003 | 9.92283 | -84.72570 | 20 |
| 1938093 | EW0003 | 9.92284 | -84.72574 | 21 |
| 1938094 | EW0003 | 9.92286 | -84.72576 | 20 |

1938095 rows × 4 columns
\n", "