{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "provenance": [] }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "source": [ "# **HKIA Historical Flight Information API Tutorial**" ], "metadata": { "id": "NbPkvwosmTaL" } }, { "cell_type": "markdown", "source": [ "# Introduction\n", "\n", "This tutorial guides you through using the API to access historical flight information from Hong Kong International Airport (HKIA).\n", "\n", "* DATA.GOV.HK Website: https://data.gov.hk/en-data/dataset/aahk-team1-flight-info/resource/8f41b55c-a2ef-4963-bb25-96d8b21f3db4\n", "* Data Specification: https://www.hongkongairport.com/iwov-resources/misc/opendata/Flight_Information_DataSpec_en.pdf\n", "\n", "***Data is available only for dates from TODAY-91 to TODAY+14.***" ], "metadata": { "id": "Wkzto9SymivK" } }, { "cell_type": "code", "execution_count": 200, "metadata": { "id": "yt2SzaeSf1nk" }, "outputs": [], "source": [ "# Import the necessary libraries\n", "import requests\n", "import pandas as pd\n", "import json" ] }, { "cell_type": "markdown", "source": [ "# **Fetching Flight Information**\n", "\n", "## **The `get_json_flight_info` Function**\n", "This function fetches HKIA flight information for a specific date and flight type through the API.\n", "\n", "### **Input**\n", "* **date**: The target date for the flight information in 'YYYY-MM-DD' format.\n", "* **arrival**: Set to True for arrival flights, False for departures.\n", "* **cargo**: Set to True for cargo flights, False for passenger flights.\n", "* **lang**: Language of the response ('en', 'zh_HK', or 'zh_CN'). Default is 'en'.\n", "\n", "### **Returns**\n", "* The parsed JSON response containing detailed flight information for the given date and parameters. If the request fails, the function prints the HTTP status code and returns `None`." 
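, "\n", "\n",
"As a hedged illustration of the date limit noted in the introduction (TODAY-91 to TODAY+14), the sketch below checks a date locally before any request is sent. The helper `is_date_in_window` and its parameters are hypothetical and not part of the HKIA API, and treating both ends of the window as inclusive is an assumption.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"\n",
"def is_date_in_window(date, today=None, days_back=91, days_ahead=14):\n",
"    # Hypothetical helper: inclusive bounds are an assumption, not confirmed by the API docs.\n",
"    if today is None:\n",
"        today = pd.Timestamp.today()\n",
"    today = pd.Timestamp(today).normalize()  # drop the time of day\n",
"    target = pd.Timestamp(date).normalize()\n",
"    earliest = today - pd.Timedelta(days=days_back)\n",
"    latest = today + pd.Timedelta(days=days_ahead)\n",
"    return earliest <= target <= latest\n",
"\n",
"\n",
"# A fixed `today` keeps the example reproducible.\n",
"print(is_date_in_window('2024-03-21', today='2024-03-22'))  # True\n",
"print(is_date_in_window('2023-03-24', today='2024-03-22'))  # False\n",
"```\n",
"\n",
"Checking the window first avoids a round trip that would otherwise end in an HTTP 400 response."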
], "metadata": { "id": "xxPkHLbAol3Y" } }, { "cell_type": "code", "source": [ "def get_json_flight_info(date, arrival, cargo, lang='en'):\n", "    # Convert the Python booleans to the lowercase strings used in the query string.\n", "    arrival_str = 'true' if arrival else 'false'\n", "    cargo_str = 'true' if cargo else 'false'\n", "\n", "    url = f\"https://www.hongkongairport.com/flightinfo-rest/rest/flights/past?date={date}&arrival={arrival_str}&cargo={cargo_str}&lang={lang}\"\n", "\n", "    response = requests.get(url)\n", "\n", "    if response.status_code == 200:\n", "        return response.json()\n", "\n", "    # Any other status (e.g. 400 for an out-of-range date): report it and return None.\n", "    print(f\"Error: {response.status_code}\")\n", "    return None" ], "metadata": { "id": "pq8rksJRU6FH" }, "execution_count": 201, "outputs": [] }, { "cell_type": "markdown", "source": [ "### **Example Usage**" ], "metadata": { "id": "MItwzT-ttR_a" } }, { "cell_type": "code", "source": [ "# Valid: yesterday is within the available date range\n", "yesterday = (pd.to_datetime('today')-pd.Timedelta('1 days')).strftime('%Y-%m-%d')\n", "flight_info = get_json_flight_info(date=yesterday, arrival=False, cargo=False, lang='en')\n", "# print(json.dumps(flight_info, indent=2))" ], "metadata": { "id": "J4zx46HJhLGs" }, "execution_count": 202, "outputs": [] }, { "cell_type": "code", "source": [ "# Invalid: a date one year ago is outside the available date range\n", "one_year_before = (pd.to_datetime('today')-pd.Timedelta('365 days')).strftime('%Y-%m-%d')\n", "flight_info = get_json_flight_info(date=one_year_before, arrival=False, cargo=False, lang='en')\n", "\n", "print(f\"https://www.hongkongairport.com/flightinfo-rest/rest/flights/past?date={one_year_before}&arrival=false&cargo=false&lang=en\")\n", "# Opening the printed URL shows the error message, for example:\n", "# \"message\": \"The combination of parameter date [2023-12-01] and span [1] is out of valid range (D-91 to D+14).\"" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Ln02EqeErt2B", "outputId": "5f04c3f4-236e-44e6-8986-75ae93465f35" }, "execution_count": 203, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Error: 400\n", 
"https://www.hongkongairport.com/flightinfo-rest/rest/flights/past?date=2023-03-24&arrival=false&cargo=false&lang=en\n" ] } ] }, { "cell_type": "markdown", "source": [ "# **Preprocessing**\n", "\n", "## **The `preprocessing` Function**\n", "This function preprocesses JSON flight information.\n", "\n", "### **Input**\n", "* `json_flight_info` returned from `get_json_flight_info` function\n", "* **col**: Optional; a list of column names as strings to specify which columns to include. Defaults to \"None,\" indicating all columns are returned.\n", "\n", "### **Returns**\n", "* A pandas DataFrame containing detailed flight information based on `json_flight_info`" ], "metadata": { "id": "M0mbjjbLt3-x" } }, { "cell_type": "code", "source": [ "def preprocessing(json_flight_info, col=None):\n", " df = pd.json_normalize(json_flight_info, 'list')\n", " df.insert(loc=0, column='info_date', value=json_flight_info[0]['date'])\n", " if col is not None:\n", " df = df[col]\n", " if 'flight' in df.columns:\n", " df['flight'] = df['flight'].apply(lambda x: [f\"{flight['no']} ({flight['airline']})\" for flight in x])\n", " return df" ], "metadata": { "id": "i5ZWTFwiiyjr" }, "execution_count": 204, "outputs": [] }, { "cell_type": "markdown", "source": [ "## **The `get_df_flight_info` Function**\n", "This function uses the above two function to get pandas dataframe format flight info within day ranges\n", "\n", "### **Input**\n", "* **start_date, end_date**: Specifies the query range ('YYYY-MM-DD').\n", "> *(For a single-day query, set both start_date and end_date to the same value.)*\n", "\n", "\n", "\n", "* **arrival, cargo, lang**: Same as those in `get_json_flight_info`\n", "\n", "### **Returns**\n", "* A pandas DataFrame containing detailed flight information for the given parameters and date range." 
], "metadata": { "id": "t5PoPBidweWA" } }, { "cell_type": "code", "source": [ "def get_df_flight_info(start_date, end_date, arrival, cargo, lang='en', col=None):\n", "    # Build one 'YYYY-MM-DD' string per day in the inclusive range.\n", "    date_list = pd.date_range(start=start_date, end=end_date).strftime('%Y-%m-%d').tolist()\n", "\n", "    df_list = []\n", "    for date in date_list:\n", "        json_flight_info = get_json_flight_info(date, arrival, cargo, lang)\n", "        if json_flight_info is None:\n", "            # The request failed (e.g. the date is out of range); skip this day.\n", "            continue\n", "        df_list.append(preprocessing(json_flight_info, col))\n", "\n", "    # reset_index() keeps each day's row number in an 'index' column.\n", "    return pd.concat(df_list).reset_index()" ], "metadata": { "id": "d1Kw8H1_BW21" }, "execution_count": 205, "outputs": [] }, { "cell_type": "markdown", "source": [ "### **Example Usage**" ], "metadata": { "id": "4Q4p3biGzC1p" } }, { "cell_type": "code", "source": [ "# Single-day query\n", "yesterday = (pd.to_datetime('today')-pd.Timedelta('1 days')).strftime('%Y-%m-%d')\n", "flight_info = get_df_flight_info(start_date=yesterday, end_date=yesterday, arrival=False, cargo=False)\n", "flight_info.head()" ], "metadata": { "id": "p_9PqPB_S2BH", "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "outputId": "d58ccf03-ed0a-49c1-cd04-af6bf5f0dae9" }, "execution_count": 206, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " index info_date time \\\n", "0 0 2024-03-21 23:35 \n", "1 1 2024-03-21 23:45 \n", "2 2 2024-03-21 00:05 \n", "3 3 2024-03-21 00:15 \n", "4 4 2024-03-21 00:15 \n", "\n", " flight status \\\n", "0 [CX 255 (CPA)] Dep 00:04 (22/03/2024) \n", "1 [ET 645 (ETH)] Dep 00:07 (22/03/2024) \n", "2 [CX 261 (CPA)] Dep 00:15 \n", "3 [CX 289 (CPA), LH 7015 (DLH)] Dep 00:22 \n", "4 [CX 880 (CPA), MH 9190 (MAS), OM 5880 (MGL), A... Dep 00:24 \n", "\n", " statusCode destination terminal aisle gate \n", "0 None [LHR] T1 BC 2 \n", "1 None [ADD] T1 D 31 \n", "2 None [CDG] T1 A 49 \n", "3 None [FRA] T1 A 66 \n", "4 None [LAX] T1 BC 9 " ], "text/html": [ "\n", "
\n", "| | index | info_date | time | flight | status | statusCode | destination | terminal | aisle | gate |\n",
"|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 0 | 2024-03-21 | 23:35 | [CX 255 (CPA)] | Dep 00:04 (22/03/2024) | None | [LHR] | T1 | BC | 2 |\n",
"| 1 | 1 | 2024-03-21 | 23:45 | [ET 645 (ETH)] | Dep 00:07 (22/03/2024) | None | [ADD] | T1 | D | 31 |\n",
"| 2 | 2 | 2024-03-21 | 00:05 | [CX 261 (CPA)] | Dep 00:15 | None | [CDG] | T1 | A | 49 |\n",
"| 3 | 3 | 2024-03-21 | 00:15 | [CX 289 (CPA), LH 7015 (DLH)] | Dep 00:22 | None | [FRA] | T1 | A | 66 |\n",
"| 4 | 4 | 2024-03-21 | 00:15 | [CX 880 (CPA), MH 9190 (MAS), OM 5880 (MGL), A... | Dep 00:24 | None | [LAX] | T1 | BC | 9 |\n", "