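{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading Global Station Daily Observation Data\n", "\n", "#### nmc_met_io library usage guide\n", "\n", "Weather Forecasting Technology R&D Division, National Meteorological Center \n", "June, 2020 \n", "Kan Dai \n", "\n", "[GHCN (Global Historical Climatology Network)-Daily](https://www.ncdc.noaa.gov/ghcn-daily-description) is a comprehensive set of daily observations \n", "from land surface weather stations worldwide, provided by NOAA's National Centers for Environmental Information. GHCN-Daily merges station observations from many data sources and applies a series of quality-control checks. \n", "\n", "### GHCN-Daily data characteristics\n", "* Contains observations from more than 100,000 stations in 180 countries and territories;\n", "* Provides several daily variables, such as maximum/minimum temperature, daily precipitation, snowfall, and snow depth (note that about half of the stations do not report snowfall);\n", "* Record lengths range from less than 1 year to more than 175 years;\n", "* The data have passed a series of quality-control checks; in addition, 25 additional data sources are merged in every weekend;\n", "* Each update of GHCN-Daily receives a new version number;\n", "* Real-time updates are usually replaced by higher-quality archived data 45-60 days after the end of the month;\n", "* Changes to the data or to the processing system are documented at https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/status.txt;\n", "* The data sources fall into four groups:\n", " * the U.S. Collection\n", " * the International Collection\n", " * Government Exchange Data\n", " * the Global Summary of the Day\n", "\n", "### Main functions of the retrieve_ghcn module:\n", "* get_ghcnd_stn_metadata downloads the station metadata from the [NOAA server](https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt) and, by default, caches the file after download.\n", "* get_ghcnd_data downloads the observations for a given station ID; querying multiple stations is currently supported.\n", "* nearest_stn is a helper function that returns the n stations nearest to a given longitude/latitude point.\n", "\n", "### References\n", "* https://www.ncdc.noaa.gov/ghcn-daily-description\n", "* https://nbviewer.jupyter.org/github/scott-hosking/get-station-data/blob/master/Examples/ghcn_daily_data.ipynb\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The autoreload extension is already loaded. 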
To reload it, use:\n", " %reload_ext autoreload\n" ] } ], "source": [ "# set up things\n", "%matplotlib inline\n", "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING:root:Setting cartopy.config[\"pre_existing_data_dir\"] to /home/kan-dai/anaconda3/share/cartopy. Don't worry, this is probably intended behaviour to avoid failing downloads of geological data behind a firewall.\n" ] } ], "source": [ "# load libraries\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import proplot as plot\n", "from ipyleaflet import Map, Marker, MarkerCluster\n", "from nmc_met_io.retrieve_ghcn import get_ghcnd_stn_metadata, get_ghcnd_data, nearest_stn\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Read the station metadata\n", "\n", "If the station metadata file is not cached yet, it is downloaded first, which can take a while." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "stn_md = get_ghcnd_stn_metadata()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | station | lat | lon | elev | name |
|---|---|---|---|---|---|
| 0 | ACW00011604 | 17.1167 | -61.7833 | 10.1 | ST JOHNS COOLIDGE FLD |
| 1 | ACW00011647 | 17.1333 | -61.7833 | 19.2 | ST JOHNS |
| 2 | AE000041196 | 25.3330 | 55.5170 | 34.0 | SHARJAH INTER. AIRP |
| 3 | AEM00041194 | 25.2550 | 55.3640 | 10.4 | DUBAI INTL |
| 4 | AEM00041217 | 24.4330 | 54.6510 | 26.8 | ABU DHABI INTL |
| ... | ... | ... | ... | ... | ... |
| 115077 | ZI000067969 | -21.0500 | 29.3670 | 861.0 | WEST NICHOLSON |
| 115078 | ZI000067975 | -20.0670 | 30.8670 | 1095.0 | MASVINGO |
| 115079 | ZI000067977 | -21.0170 | 31.5830 | 430.0 | BUFFALO RANGE |
| 115080 | ZI000067983 | -20.2000 | 32.6160 | 1132.0 | CHIPINGE |
| 115081 | ZI000067991 | -22.2170 | 30.0000 | 457.0 | BEITBRIDGE |
115082 rows × 5 columns
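
The `lat` and `lon` columns in the station table are what a nearest-station lookup works from. The following is a minimal sketch of that idea, assuming great-circle (haversine) distance on a spherical Earth; it is not the implementation of `nearest_stn` in nmc_met_io, whose signature and distance metric are not shown here. The names `haversine_km` and `nearest_stations` are hypothetical, and the station tuples are copied from the rows shown above.

```python
import math

# A few rows from the station table above: (station, lat, lon).
stations = [
    ("AE000041196", 25.3330, 55.5170),   # SHARJAH INTER. AIRP
    ("AEM00041194", 25.2550, 55.3640),   # DUBAI INTL
    ("AEM00041217", 24.4330, 54.6510),   # ABU DHABI INTL
    ("ZI000067975", -20.0670, 30.8670),  # MASVINGO
]


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_stations(lat, lon, stations, n=1):
    """Hypothetical helper: the n stations closest to (lat, lon), nearest first."""
    ranked = sorted(stations, key=lambda s: haversine_km(lat, lon, s[1], s[2]))
    return ranked[:n]


# Query point chosen for illustration only.
closest = nearest_stations(25.2, 55.3, stations, n=2)
for station, lat, lon in closest:
    print(station, round(haversine_km(25.2, 55.3, lat, lon), 1), "km")
```

Sorting the full list is fine for a handful of stations; for all 115082 rows a vectorised distance computation or a spatial index would be the more practical choice.
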
| element | PRCP | SNWD | TAVG | TMAX | TMIN |
|---|---|---|---|---|---|
| date |  |  |  |  |  |
| 1945-10-01 | NaN | NaN | NaN | NaN | NaN |
| 1945-10-02 | NaN | NaN | NaN | NaN | NaN |
| 1945-10-03 | NaN | NaN | NaN | NaN | NaN |
| 1945-10-04 | NaN | NaN | NaN | NaN | NaN |
| 1945-10-05 | NaN | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... |
| 2020-03-27 | NaN | NaN | NaN | NaN | NaN |
| 2020-03-28 | NaN | NaN | NaN | NaN | NaN |
| 2020-03-29 | NaN | NaN | NaN | NaN | NaN |
| 2020-03-30 | NaN | NaN | NaN | NaN | NaN |
| 2020-03-31 | NaN | NaN | NaN | NaN | NaN |
25840 rows × 5 columns
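
The table above has one row per date and one column per element, with NaN wherever nothing was reported, and the visible rows are entirely NaN. As a rough illustration of how such a wide table relates to per-element records, and of how all-missing days can be dropped, here is a plain-Python sketch. It assumes a hypothetical long format of `(date, element, value)` tuples; the values are made up for illustration and are not taken from GHCN-Daily, and `pivot_daily` and `drop_empty_days` are hypothetical helpers, not part of nmc_met_io.

```python
import math

# Hypothetical long-format records: (date, element, value). Values are made up.
records = [
    ("2020-03-28", "TMAX", 18.5),
    ("2020-03-28", "TMIN", 7.3),
    ("2020-03-29", "PRCP", 1.2),
    ("2020-03-29", "TMIN", 8.0),
]
elements = ["PRCP", "SNWD", "TAVG", "TMAX", "TMIN"]
all_dates = ["2020-03-27", "2020-03-28", "2020-03-29", "2020-03-30"]


def pivot_daily(records, dates, elements):
    """Return {date: {element: value}}, with NaN where nothing was reported."""
    table = {d: {e: math.nan for e in elements} for d in dates}
    for date, element, value in records:
        table[date][element] = value
    return table


def drop_empty_days(table):
    """Keep only the dates on which at least one element has a value."""
    return {
        date: row
        for date, row in table.items()
        if not all(math.isnan(v) for v in row.values())
    }


wide = pivot_daily(records, all_dates, elements)
trimmed = drop_empty_days(wide)
print(sorted(trimmed))
```

On a pandas DataFrame shaped like the one shown above, `DataFrame.dropna(how="all")` performs the same row filtering in one call.
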