{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8541cc6a",
   "metadata": {},
   "source": [
    "# MTU Notebook Workflow\n",
    "\n",
    "Upload geospatial data to Mapbox as tilesets. Default mode is **dry run** (no actual upload).\n",
    "\n",
    "## Quick Start\n",
    "\n",
    "1. Run **Cell 1** to install `mtu` (skip if already installed).\n",
    "2. Run **Cell 2** to verify the package is available.\n",
    "3. Edit **Cell 3** — set your file path, credentials, and tileset info — then run it.\n",
    "4. Run **Cell 4**, **Cell 5**, then **Cell 6** to execute.\n",
    "5. To do a real upload, set `DRY_RUN = False` in Cell 3, re-run Cell 3 so `SETTINGS` is rebuilt, then re-run Cell 6.\n",
    "\n",
    "## Supported Formats\n",
    "\n",
    "GeoJSON, TopoJSON, Shapefile (ZIP/SHP), GeoPackage, KML/KMZ, FlatGeobuf, GeoParquet, GPX.\n",
    "\n",
    "## Key Limits\n",
    "\n",
    "- Required token scopes: `tilesets:read`, `tilesets:write`, `tilesets:list`.\n",
    "- Default upload cap: 1 GB (opt into 20 GB with `USE_MAPBOX_FULL_UPLOAD_CAP = True`).\n",
    "- Default zoom range: `4–8` (Mapbox supports `0–22`).\n",
    "\n",
    "## Reading Output (Cell 6)\n",
    "\n",
    "| Field | Meaning |\n",
    "|---|---|\n",
    "| `success` | Whether the workflow completed without error |\n",
    "| `dry_run` | `True` for a simulation, `False` for a real upload |\n",
    "| `tileset_id` | Target tileset identifier |\n",
    "| `steps` | Pipeline steps executed |\n",
    "| `warnings` | Non-fatal issues to review |\n",
    "| `error` | Error message (printed only when set) |\n",
    "| `job_id`, `job_status` | Upload job identifier and status (printed only when set) |\n",
    "\n",
    "## Colab Users\n",
    "\n",
    "- Add `MAPBOX_ACCESS_TOKEN` and `MAPBOX_USERNAME` via the **Secrets** panel (key icon). Cell 3 auto-detects them.\n",
    "- Upload files with `from google.colab import files; uploaded = files.upload()`, then point `SOURCE_FILE` in Cell 3 at the uploaded file under `/content/`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5928cb6f",
   "metadata": {},
   "source": [
    "## Cell 1 — Install MTU (Optional)\n",
    "\n",
    "Install the `mtu` package into the current kernel. Installation runs while `INSTALL_MTU = True`, which is the value set in the cell. If `mtu` is already installed, skip this cell or set `INSTALL_MTU = False`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f680d84c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import subprocess\n",
    "import sys\n",
    "\n",
    "INSTALL_MTU = True\n",
    "INSTALL_TARGET = \"mtu\"  # or pin a version, e.g. \"mtu==0.1.0\"\n",
    "\n",
    "if INSTALL_MTU:\n",
    "    print(f\"Installing {INSTALL_TARGET} into: {sys.executable}\")\n",
    "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"--upgrade\", INSTALL_TARGET])\n",
    "    print(\"Done. Run Cell 2 to verify.\")\n",
    "else:\n",
    "    print(\"Skipped. Set INSTALL_MTU = True to install.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c459ee4",
   "metadata": {},
   "source": [
    "## Cell 2 — Check Dependencies\n",
    "\n",
    "Verify that `mtu` is importable in the active kernel. If it's missing, go back and run Cell 1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9bed06fa",
   "metadata": {},
   "outputs": [],
   "source": [
    "import importlib.util\n",
    "\n",
    "if importlib.util.find_spec('mtu'):\n",
    "    print('✓ mtu is available.')\n",
    "else:\n",
    "    print('✗ mtu is NOT installed. Run Cell 1 with INSTALL_MTU = True, then re-run this cell.')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "de31b37d",
   "metadata": {},
   "source": [
    "## Cell 3 — Settings\n",
    "\n",
    "**This is the only cell you need to edit.** Set your file path, credentials, tileset info, and run options below.\n",
    "\n",
    "Key inputs:\n",
    "- `SOURCE_FILE` — path to your geospatial file.\n",
    "- `MAPBOX_ACCESS_TOKEN` / `MAPBOX_USERNAME` — your Mapbox credentials (auto-detected from env or Colab Secrets).\n",
    "- `DRY_RUN` — set to `False` for a real upload."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5529969d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from pathlib import Path\n",
    "from typing import Literal, TypedDict, cast\n",
    "\n",
    "# ── USER INPUTS (edit these) ─────────────────────────────────\n",
    "\n",
    "# Source file or URL.\n",
    "SOURCE_FILE = Path(r'...\\data.geojson')  # local file path\n",
    "SOURCE_URL = 'https://example.com/data.geojson'  # remote URL\n",
    "SOURCE_MODE_RAW = 'file'  # 'file' or 'url'\n",
    "FORMAT_HINT: str | None = None  # None = auto-detect\n",
    "\n",
    "# Mapbox credentials (read from the environment; Colab Secrets, when present, take precedence).\n",
    "MAPBOX_ACCESS_TOKEN = os.environ.get('MAPBOX_ACCESS_TOKEN', '')\n",
    "MAPBOX_USERNAME = os.environ.get('MAPBOX_USERNAME', '')\n",
    "\n",
    "# Tileset info.\n",
    "TILESET_ID = 'demo-tileset-id'\n",
    "TILESET_NAME = 'Demo Tileset'\n",
    "LAYER_NAME = 'data'\n",
    "MIN_ZOOM = 4\n",
    "MAX_ZOOM = 8\n",
    "\n",
    "# Run mode.\n",
    "DRY_RUN = True  # False = real upload\n",
    "\n",
    "# Advanced (usually no changes needed).\n",
    "USE_MAPBOX_FULL_UPLOAD_CAP = False  # True = 20 GB cap instead of 1 GB\n",
    "CAPACITY_LIMIT_MB = 0.0\n",
    "CAPACITY_USED_MB = 0.0\n",
    "\n",
    "# ── DO NOT EDIT BELOW THIS LINE ──────────────────────────────\n",
    "\n",
    "APP_DEFAULT_UPLOAD_CAP_GB = 1.0\n",
    "MAPBOX_FULL_UPLOAD_CAP_GB = 20.0\n",
    "MAPBOX_ZOOM_MIN = 0\n",
    "MAPBOX_ZOOM_MAX = 22\n",
    "\n",
    "\n",
    "class Settings(TypedDict):\n",
    "    mapbox_access_token: str\n",
    "    mapbox_username: str\n",
    "    dry_run: bool\n",
    "    source_mode: Literal['file', 'url']\n",
    "    source_file: Path\n",
    "    source_url: str\n",
    "    format_hint: str | None\n",
    "    tileset_id: str\n",
    "    tileset_name: str\n",
    "    layer_name: str\n",
    "    min_zoom: int\n",
    "    max_zoom: int\n",
    "    mapbox_zoom_min: int\n",
    "    mapbox_zoom_max: int\n",
    "    app_default_upload_cap_gb: float\n",
    "    mapbox_full_upload_cap_gb: float\n",
    "    use_mapbox_full_upload_cap: bool\n",
    "    capacity_limit_mb: float\n",
    "    capacity_used_mb: float\n",
    "\n",
    "\n",
    "# Auto-detect Colab secrets.\n",
    "_token_colab = _user_colab = ''\n",
    "try:\n",
    "    from google.colab import userdata  # type: ignore\n",
    "    _token_colab = str(userdata.get('MAPBOX_ACCESS_TOKEN') or '')\n",
    "    _user_colab = str(userdata.get('MAPBOX_USERNAME') or '')\n",
    "except Exception:\n",
    "    pass\n",
    "\n",
    "if SOURCE_MODE_RAW not in {'file', 'url'}:\n",
    "    raise ValueError(\"SOURCE_MODE_RAW must be 'file' or 'url'\")\n",
    "SOURCE_MODE = cast(Literal['file', 'url'], SOURCE_MODE_RAW)\n",
    "\n",
    "SETTINGS: Settings = {\n",
    "    'mapbox_access_token': _token_colab or MAPBOX_ACCESS_TOKEN,\n",
    "    'mapbox_username': _user_colab or MAPBOX_USERNAME,\n",
    "    'dry_run': DRY_RUN,\n",
    "    'source_mode': SOURCE_MODE,\n",
    "    'source_file': SOURCE_FILE,\n",
    "    'source_url': SOURCE_URL,\n",
    "    'format_hint': FORMAT_HINT,\n",
    "    'tileset_id': TILESET_ID,\n",
    "    'tileset_name': TILESET_NAME,\n",
    "    'layer_name': LAYER_NAME,\n",
    "    'min_zoom': MIN_ZOOM,\n",
    "    'max_zoom': MAX_ZOOM,\n",
    "    'mapbox_zoom_min': MAPBOX_ZOOM_MIN,\n",
    "    'mapbox_zoom_max': MAPBOX_ZOOM_MAX,\n",
    "    'app_default_upload_cap_gb': APP_DEFAULT_UPLOAD_CAP_GB,\n",
    "    'mapbox_full_upload_cap_gb': MAPBOX_FULL_UPLOAD_CAP_GB,\n",
    "    'use_mapbox_full_upload_cap': USE_MAPBOX_FULL_UPLOAD_CAP,\n",
    "    'capacity_limit_mb': CAPACITY_LIMIT_MB,\n",
    "    'capacity_used_mb': CAPACITY_USED_MB,\n",
    "}\n",
    "\n",
    "print(f'✓ Config loaded dry_run={DRY_RUN} mode={SOURCE_MODE} zoom={MIN_ZOOM}-{MAX_ZOOM}')\n",
    "print(f' source: {SOURCE_FILE if SOURCE_MODE == \"file\" else SOURCE_URL}')\n",
    "print(f' token set: {bool(SETTINGS[\"mapbox_access_token\"])} username set: {bool(SETTINGS[\"mapbox_username\"])}')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ccfcf4af",
   "metadata": {},
   "source": [
    "## Cell 4 — Import Libraries\n",
    "\n",
    "Load required modules. No edits needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5a36eff6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "from pathlib import Path\n",
    "from typing import Any\n",
    "from urllib.parse import urlparse\n",
    "\n",
    "from mtu.uploader import TilesetConfig, TilesetUploader"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d8063472",
   "metadata": {},
   "source": [
    "## Cell 5 — Helper Functions\n",
    "\n",
    "Reusable helpers for building config objects, the uploader, and previewing source data. No edits needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a52a29a",
   "metadata": {},
   "outputs": [],
   "source": [
    "def build_tileset_config(settings: Settings) -> TilesetConfig:\n",
    "    return TilesetConfig(\n",
    "        tileset_id=settings['tileset_id'],\n",
    "        tileset_name=settings['tileset_name'],\n",
    "        layer_name=settings['layer_name'],\n",
    "        min_zoom=settings['min_zoom'],\n",
    "        max_zoom=settings['max_zoom'],\n",
    "    )\n",
    "\n",
    "\n",
    "def build_uploader(settings: Settings) -> TilesetUploader:\n",
    "    return TilesetUploader(\n",
    "        access_token=settings['mapbox_access_token'],\n",
    "        username=settings['mapbox_username'],\n",
    "        validate_geometry=True,\n",
    "        use_mapbox_full_upload_cap=settings['use_mapbox_full_upload_cap'],\n",
    "    )\n",
    "\n",
    "\n",
    "def get_active_upload_cap_gb(settings: Settings) -> float:\n",
    "    return (\n",
    "        settings['mapbox_full_upload_cap_gb']\n",
    "        if settings['use_mapbox_full_upload_cap']\n",
    "        else settings['app_default_upload_cap_gb']\n",
    "    )\n",
    "\n",
    "\n",
    "def describe_source(settings: Settings) -> None:\n",
    "    cap_gb = get_active_upload_cap_gb(settings)\n",
    "    cap_mb = cap_gb * 1024\n",
    "    print(f'Active upload cap: {cap_gb} GB')\n",
    "\n",
    "    if settings['source_mode'] == 'url':\n",
    "        source_url = settings['source_url']\n",
    "        print(f'URL source selected: {source_url}')\n",
    "        parsed = urlparse(source_url)\n",
    "        if not parsed.scheme or not parsed.netloc:\n",
    "            raise ValueError('SOURCE_URL must be a valid absolute URL when source_mode is url')\n",
    "        if settings['capacity_limit_mb'] > 0:\n",
    "            print('Note: local size/capacity projection is skipped for URL sources.')\n",
    "        return\n",
    "\n",
    "    source_file = settings['source_file']\n",
    "    if not source_file.exists():\n",
    "        print(f'WARNING: file not found: {source_file}')\n",
    "        return\n",
    "\n",
    "    size_mb = source_file.stat().st_size / (1024 * 1024)\n",
    "    print(f'Input size: {size_mb:.2f} MB')\n",
    "    if size_mb > cap_mb:\n",
    "        print('WARNING: input file exceeds active upload cap.')\n",
    "\n",
    "    capacity_limit_mb = settings['capacity_limit_mb']\n",
    "    if capacity_limit_mb > 0:\n",
    "        projected = settings['capacity_used_mb'] + size_mb\n",
    "        print(f'Projected usage: {projected:.2f} / {capacity_limit_mb:.2f} MB')\n",
    "        if projected > capacity_limit_mb:\n",
    "            print('WARNING: projected usage exceeds configured capacity limit.')\n",
    "\n",
    "    if source_file.suffix.lower() not in {'.geojson', '.json'}:\n",
    "        return\n",
    "\n",
    "    try:\n",
    "        import folium\n",
    "        from IPython.display import display\n",
    "\n",
    "        with source_file.open('r', encoding='utf-8') as f:\n",
    "            geojson_data: Any = json.load(f)\n",
    "\n",
    "        m = folium.Map(location=[0, 0], zoom_start=settings['min_zoom'], control_scale=True)\n",
    "        layer = folium.GeoJson(geojson_data, name='source data')\n",
    "        layer.add_to(m)\n",
    "\n",
    "        try:\n",
    "            bounds = layer.get_bounds()\n",
    "            if bounds and len(bounds) == 2:\n",
    "                sw, ne = bounds\n",
    "                if all(v is not None for v in [sw[0], sw[1], ne[0], ne[1]]):\n",
    "                    m.fit_bounds([[float(sw[0]), float(sw[1])], [float(ne[0]), float(ne[1])]])\n",
    "        except Exception:\n",
    "            pass\n",
    "\n",
    "        display(m)\n",
    "    except Exception as ex:\n",
    "        print('Map preview skipped (folium not installed or invalid GeoJSON):', ex)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f72ba50",
   "metadata": {},
   "source": [
    "## Cell 6 — Run Upload\n",
    "\n",
    "Execute the upload (or dry run). This cell reviews the source, builds the config, and runs the pipeline. After changing `DRY_RUN` or any other setting in Cell 3, re-run Cell 3 first so `SETTINGS` is rebuilt, then re-run this cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72c1f343",
   "metadata": {},
   "outputs": [],
   "source": [
    "describe_source(SETTINGS)\n",
    "\n",
    "if not SETTINGS['mapbox_access_token'] or not SETTINGS['mapbox_username']:\n",
    "    raise ValueError(\n",
    "        'Missing Mapbox credentials. Set MAPBOX_ACCESS_TOKEN and MAPBOX_USERNAME '\n",
    "        'in Cell 3, then re-run from Cell 3.'\n",
    "    )\n",
    "\n",
    "config = build_tileset_config(SETTINGS)\n",
    "uploader = build_uploader(SETTINGS)\n",
    "\n",
    "if SETTINGS['source_mode'] == 'url':\n",
    "    result = uploader.upload_from_url(\n",
    "        url=SETTINGS['source_url'],\n",
    "        config=config,\n",
    "        format_hint=SETTINGS['format_hint'],\n",
    "        dry_run=SETTINGS['dry_run'],\n",
    "    )\n",
    "else:\n",
    "    result = uploader.upload_from_file(\n",
    "        file_path=SETTINGS['source_file'],\n",
    "        config=config,\n",
    "        format_hint=SETTINGS['format_hint'],\n",
    "        dry_run=SETTINGS['dry_run'],\n",
    "    )\n",
    "\n",
    "print(f'success: {result.success}')\n",
    "print(f'dry_run: {result.dry_run}')\n",
    "print(f'tileset_id: {result.tileset_id}')\n",
    "print(f'steps: {result.steps}')\n",
    "print(f'warnings: {result.warnings}')\n",
    "if result.error:\n",
    "    print(f'ERROR: {result.error}')\n",
    "if result.job_id:\n",
    "    print(f'job_id: {result.job_id}')\n",
    "if result.job_status:\n",
    "    print(f'job_status: {result.job_status}')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv (3.11.11)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}