{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"text-align:center\">\n",
    "    <a href=\"https://nbviewer.jupyter.org/github/twMr7/Python-Machine-Learning/blob/master/09-Other_Utilities.ipynb\">\n",
    "        Open In Jupyter nbviewer\n",
    "        <img style=\"float: center;\" src=\"https://nbviewer.jupyter.org/static/img/nav_logo.svg\" width=\"120\" />\n",
    "    </a>\n",
    "</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/twMr7/Python-Machine-Learning/blob/master/09-Other_Utilities.ipynb)\n",
    "\n",
    "# 9. 其他實用工具 Other Utilities\n",
    "\n",
    "Python 的通用性來自於豐富的標準函式庫,本章介紹以下幾種常用的工具模組。\n",
    "+ [**9.1 日期與時間(Date and Time)**](#module-datetime)\n",
    "+ [**9.2 物件序列化(Python Object Serialization)**](#module-pickle)\n",
    "+ [**9.3 Jason**](#module-json)\n",
    "+ [**9.4 亂數(Random Numbers)**](#module-random)\n",
    "+ [**9.5 數學函數(Math Functions)**](#module-math)\n",
    "+ [**9.6 檔案系統路徑(File System Paths)**](#module-pathlib)\n",
    "+ [**9.7 資料型別提示(Type Hints)**](#type-hints)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"module-datetime\"></a>\n",
    "\n",
    "## 9.1 日期與時間 Date and Time\n",
    "\n",
    "Python 標準函式庫中的 [`datetime`](https://docs.python.org/3/library/datetime.html#datetime-objects) 模組可以用來處理日期時間相關的資料,包含了 `date`, `time`, `datetime`, `timedelta`, `timezone` 等型別。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "### § timedelta\n",
    "\n",
    "**`timedelta`** 物件是用來表示時間差、時間概念上的距離,不是特定某天幾點幾分的時間,可以用來進行加減乘除的四則運算。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.timedelta(days=64, seconds=29156, microseconds=10)"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 載入 timedelta 模組\n",
    "from datetime import timedelta\n",
    "\n",
    "delta = timedelta(\n",
    "    days=50,\n",
    "    seconds=27,\n",
    "    microseconds=10,\n",
    "    milliseconds=29000,\n",
    "    minutes=5,\n",
    "    hours=8,\n",
    "    weeks=2\n",
    ")\n",
    "# Only days, seconds and microseconds are stored internally\n",
    "delta"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.timedelta(days=3650)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ten_years = timedelta(days=365) * 10\n",
    "ten_years"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ten_years.days // 365"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "### § date\n",
    "\n",
    "`date` 有 **year**、**month**、**day** 屬性,沒有時分秒的概念,算數運算與比較會忽略 `timedelta.seconds` 和 `timedelta.microseconds`。\n",
    "\n",
    "| 操作範例                        | \n",
    "|---------------------------------|\n",
    "| **`date2 = date1 + timedelta`** |\n",
    "| **`date2 = date1 - timedelta`** |\n",
    "| **`timedelta = date1 - date2`** |\n",
    "| **`date1 < date2`**             |\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.date(2021, 3, 17)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 載入 date 模組\n",
    "from datetime import date\n",
    "\n",
    "today = date.today()\n",
    "today"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.date(2022, 3, 17)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "today.replace(year=2022)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2011-03-20'"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(today - ten_years).isoformat()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.timedelta(days=366)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 記得 timedelta 只有天數的概念,沒有年月的概念\n",
    "date.today() - date(2020, 3, 16)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "### § time\n",
    "\n",
    "`time` 是時區性的時分秒概念的物件,有**hour**、**minute**、**second**、**microsecond**、**tzinfo**的屬性。\n",
    "\n",
    "\n",
    "### § datetime\n",
    "\n",
    "`datetime` 綜合了 `date` 與 `time` 物件資訊的物件。\n",
    "\n",
    "| 操作範例                                | \n",
    "|-----------------------------------------|\n",
    "| **`datetime2 = datetime1 + timedelta`** |\n",
    "| **`datetime2 = datetime1 - timedelta`** |\n",
    "| **`timedelta = datetime1 - datetime2`** |\n",
    "| **`datetime1 < datetime2`**             |\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2021, 3, 17, 11, 39, 4, 937520)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 載入 datetime 模組\n",
    "from datetime import datetime\n",
    "\n",
    "# 現在的日期時間,返回 datetime 型別\n",
    "t1 = datetime.now()\n",
    "t1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2021-03-17 11:39:04.937520'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 轉成字串 str 型別\n",
    "str(t1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'03/17/2021 11:39:04'"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 現在的日期時間,轉成指定格式的字串\n",
    "t1.strftime('%m/%d/%Y %H:%M:%S')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "t1 比 t2 晚 139 days, 20:29:04.937520\n"
     ]
    }
   ],
   "source": [
    "# 從日期時間的字串轉成 datetime 型別\n",
    "t2 = datetime.strptime('2020-10-28  15:10:00', '%Y-%m-%d %H:%M:%S')\n",
    "\n",
    "# 比較兩個 datetime\n",
    "if (t1 > t2):\n",
    "    print('t1 比 t2 晚', t1 - t2)\n",
    "else:\n",
    "    print('t2 比 t1 晚', t2 - t1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "標準函式庫還有另外一個 `time` 模組,提供了專門用來處理時間相關的函式,大多是從系統的C函式庫來的比較低階的處理。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'20210317 113905'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# time 模組也有 strftime 可以用來格式化時間字串\n",
    "import time\n",
    "time.strftime('%Y%m%d %H%M%S')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"module-pickle\"></a>\n",
    "\n",
    "## 9.2 物件序列化 Python Object Serialization\n",
    "\n",
    "Python 標準函式庫中的 [`pickle`](https://docs.python.org/3/library/pickle.html#module-pickle) 模組,提供了將 Python 物件序列化(serializing)及解序列化(de-serializing)的方法。 序列化指的是將物件階層轉換成位元組串流(byte stream),以方便物件的儲存、網路傳送、以及不同平臺的互通交換,反向的解序列化操作則是將位元組串流轉換成物件階層。\n",
    "\n",
    "+ `pickle` 模組可以將物件儲存至檔案,或從檔案載入物件,檔案的存取需使用 binary 模式。\n",
    "+ `pickle` 模組提供的序列化功能只適用於 Python 物件專用,標準函式庫中另外有跨平臺及程式語言的通用型的序列化模組 [`json`](https://docs.python.org/3/library/json.html#module-json),但 `json` 只支援較少的 Python 內建物件型別。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 載入 pickle 模組\n",
    "import pickle"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 建立一個數據記錄的結構\n",
    "tformat = '%Y-%m-%d %H:%M:%S'\n",
    "record = [\n",
    "    {'時間':datetime.strptime('2019-04-03 10:35:58', tformat), '體溫':37.0, '速度':35.0, '心率':92},\n",
    "    {'時間':datetime.strptime('2019-04-03 10:37:00', tformat), '體溫':37.1, '速度':33.8, '心率':97},\n",
    "    {'時間':datetime.strptime('2019-04-03 10:37:59', tformat), '體溫':37.4, '速度':35.5, '心率':99}\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 開啟新的 binary 檔案,用 pickle 將 record 物件 serialize\n",
    "# 注意: pickle 的檔案是 binary 的格式\n",
    "pfile = open('record.pkl', 'wb')\n",
    "pickle.dump(record, pfile)\n",
    "pfile.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 學過 context manager 了,應該這樣寫比較ok\n",
    "with open('record.pkl', 'wb') as pfile:\n",
    "    pickle.dump(record, pfile)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 讀入檔案,將 record 物件 de-serialize\n",
    "pfile = open('record.pkl', 'rb')\n",
    "record2 = pickle.load(pfile)\n",
    "pfile.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'時間': datetime.datetime(2019, 4, 3, 10, 35, 58),\n",
       "  '體溫': 37.0,\n",
       "  '速度': 35.0,\n",
       "  '心率': 92},\n",
       " {'時間': datetime.datetime(2019, 4, 3, 10, 37),\n",
       "  '體溫': 37.1,\n",
       "  '速度': 33.8,\n",
       "  '心率': 97},\n",
       " {'時間': datetime.datetime(2019, 4, 3, 10, 37, 59),\n",
       "  '體溫': 37.4,\n",
       "  '速度': 35.5,\n",
       "  '心率': 99}]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 載入 pickle 物件也改成 context manager 的寫法\n",
    "with open('record.pkl', 'rb') as pfile:\n",
    "    record2 = pickle.load(pfile)\n",
    "\n",
    "record2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"module-json\"></a>\n",
    "\n",
    "## 9.3 Json\n",
    "\n",
    "JSON(JavaScript Object Notation)是常用的公開規格的資料交換格式,副檔名慣例為 **`.json`**。 Python 標準函式庫中的 [`json`](https://docs.python.org/3/library/json.html) 模組提供了類似 `pickle` 的方法,可以用來將內建的物件型別輸出成 JSON 檔,或反過來載入用 JSON 格式編碼的物件。 支援的物件類型與 JSON 編碼的對應表列如下:\n",
    "\n",
    "| JSON 物件 | Python 物件  |\n",
    "|-----------|--------------|\n",
    "| *object*  | **dict**     |\n",
    "| *array*   | **list**     |\n",
    "| *string*  | **str**      |\n",
    "| *int*     | **int**      |\n",
    "| *real*    | **float**    |\n",
    "| *true*    | **True**     |\n",
    "| *false*   | **False**    |\n",
    "| *null*    | **None**     |\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'image': {'Width': 600, 'Height': 800, 'Title': 'Portrait'},\n",
       " 'person': {'firstName': 'John',\n",
       "  'lastName': 'Doe',\n",
       "  'isAlive': True,\n",
       "  'age': 27,\n",
       "  'phoneNumbers': [{'type': 'home', 'number': '212 555-1234'},\n",
       "   {'type': 'office', 'number': '646 555-4567'}],\n",
       "  'spouse': None}}"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 準備要儲存成 JSON 檔的物件\n",
    "card = {\n",
    "    \"image\": {\n",
    "        \"Width\":  600,\n",
    "        \"Height\": 800,\n",
    "        \"Title\":  \"Portrait\",\n",
    "    },\n",
    "    \n",
    "    \"person\": {\n",
    "        \"firstName\": \"John\",\n",
    "        \"lastName\": \"Doe\",\n",
    "        \"isAlive\": True,\n",
    "        \"age\": 27,\n",
    "        \"phoneNumbers\": [\n",
    "            {\n",
    "                \"type\": \"home\",\n",
    "                \"number\": \"212 555-1234\"\n",
    "            },\n",
    "            {\n",
    "                \"type\": \"office\",\n",
    "                \"number\": \"646 555-4567\"\n",
    "            }\n",
    "        ],\n",
    "        \"spouse\": None\n",
    "    }\n",
    "}\n",
    "\n",
    "card"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 載入 json 模組\n",
    "import json"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 開啟新的文字檔案,將 python 物件編碼成 JSON 輸出到檔案\n",
    "# 注意: .json 的檔案是文字格式\n",
    "with open('card.json', 'w') as jfile:\n",
    "    json.dump(card, jfile)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'image': {'Width': 600, 'Height': 800, 'Title': 'Portrait'},\n",
       " 'person': {'firstName': 'John',\n",
       "  'lastName': 'Doe',\n",
       "  'isAlive': True,\n",
       "  'age': 27,\n",
       "  'phoneNumbers': [{'type': 'home', 'number': '212 555-1234'},\n",
       "   {'type': 'office', 'number': '646 555-4567'}],\n",
       "  'spouse': None}}"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 開啟.json檔案,將 JSON 編碼的物件載入轉成 Python 物件\n",
    "with open('card.json', 'r') as jfile:\n",
    "    card2 = json.load(jfile)\n",
    "\n",
    "card2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"module-random\"></a>\n",
    "\n",
    "## 9.4 亂數 Random Numbers\n",
    "\n",
    "Python 標準函式庫中的 [`random`](https://docs.python.org/3/library/random.html) 模組,提供了擬隨機(pseudo-random)亂數產生的方法。\n",
    "\n",
    "+ `random()` - 返回下一個 [0.0, 1.0) 區間內的隨機實數。\n",
    "+ `randrange(start, stop[, step])` - 返回下一個 [start, stop) 區間內的隨機整數。\n",
    "+ `randint(a, b)` - 返回下一個 [a, b] 區間內的隨機整數,同 `randrange(a, b+1)`。\n",
    "+ `choice(seq)` - 從 seq 序列中隨機選取其中一個成員。\n",
    "+ `shuffle(seq)` - 將 seq 序列中的元素順序重新隨機排列,序列必須是可就地變更的容器類別。\n",
    "+ `sample(seq, k)` - 從 seq 序列或集合中,返回隨機選取 k 個樣本的 List 清單。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 載入 random 模組\n",
    "import random"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0.49356032291668317, 0.2411096099519041, 0.24274807400132448, 0.6806583376348618, 0.4007445154008329, 0.5020029135428243, 0.8753881901813494, 0.2064575923259282, 0.5471457366330188, 0.6100255703399993, 0.6754079019577277, 0.028047998071924818, 0.9113825389832221, 0.12481777147365836, 0.9197373983810999, 0.4972709758446857, 0.3109592944098044, 0.6135724808834165, 0.9030384725303868, 0.4016312744745454, 0.4224038103832404, 0.4288471001262948, 0.466593180358435, 0.47192722041625734, 0.23088689632757775, 0.5080219321975132, 0.23119624893044544, 0.766038585063923, 0.9432255999781156, 0.4438950384837139, 0.008875158371981162, 0.7747935214178607, 0.8328097496865488, 0.03820973930946581, 0.2425404000214182, 0.20378756529358255, 0.9011823798074147, 0.9429615434171732, 0.008365499494094153, 0.30710204474646563, 0.7714685577914125, 0.44228662030116717, 0.47293711004645833, 0.9272615793168927, 0.6274046238104688, 0.06124138720195915, 0.20303402087805467, 0.00551746765636052, 0.29608613992703825, 0.8392178754932461, 0.4092118352756213, 0.5371365032419317, 0.45466517861888456, 0.07374675487497973, 0.38185686745620595, 0.0329699613561929, 0.15696658530359375, 0.8553618168471122, 0.3398270023737717, 0.7099281747926457, 0.23799987535976463, 0.8074101269935077, 0.7577429606338424, 0.06977378136798007, 0.38924769663238856, 0.13725757006274264, 0.9314593644916109, 0.5800782709115695, 0.5442005381571936, 0.09078629592206988, 0.33982237325614884, 0.5233793070412616, 0.24834424239295672, 0.10656032584693331, 0.6222531212266836, 0.11026076444867894, 0.5767222541627232, 0.8213605421082407, 0.4042220342914117, 0.5086261360293176, 0.10830352652922781, 0.31460119979844836, 0.29420699858731425, 0.27102235776003614, 0.7215244001345396, 0.003757854865288124, 0.13378381296717023, 0.4004733598153881, 0.508240336008106, 0.9004693485487324, 0.5161319869840322, 0.44929251677640736, 0.0816625073655195, 0.40020492559186704, 0.305598752686734, 0.20873123907155022, 0.3384601272052107, 0.22090273575138908, 0.7775569232073181, 0.8853173389504151]\n"
     ]
    }
   ],
   "source": [
    "# 產生 100 個隨機實數數列\n",
    "Lr = [random.random() for x in range(100)]\n",
    "print(Lr)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[11, 57, 31, 91, 59, 91, 10, 56, 74, 13, 38, 5, 94, 73, 37, 86, 90, 5, 14, 56, 63, 74, 69, 51, 43, 22, 10, 57, 98, 47, 60, 82, 34, 77, 41, 33, 15, 95, 70, 95, 48, 48, 17, 28, 40, 44, 100, 17, 58, 93, 100, 16, 91, 82, 85, 96, 40, 3, 86, 92, 33, 36, 60, 63, 53, 30, 76, 93, 23, 84, 100, 2, 73, 47, 90, 67, 67, 46, 16, 15, 6, 74, 32, 46, 28, 24, 47, 27, 7, 65, 19, 9, 27, 70, 49, 79, 36, 6, 87, 4]\n"
     ]
    }
   ],
   "source": [
    "# 產生 100 個隨機整數數列\n",
    "Li = [random.randint(1, 100) for x in range(100)]\n",
    "print(Li)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1.5547964720490222,\n",
       " 30.879379115434432,\n",
       " 26.426500783069663,\n",
       " 1.2249376104827925,\n",
       " 2.030760763231264,\n",
       " 77.75569232073181,\n",
       " 23.323987785256936,\n",
       " 42.76530392264297,\n",
       " 68.92799297237922,\n",
       " 13.824802801220837]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 從數列中隨機選取 10 個樣本,產生新的隨機數列\n",
    "[x * y for x, y in zip(random.sample(Lr, 10), random.sample(Li, 10))]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"module-math\"></a>\n",
    "\n",
    "## 9.5 數學函數 Math Functions\n",
    "\n",
    "Python 標準函式庫中的 [`math`](https://docs.python.org/3/library/math.html) 模組,提供了用於實數運算的常用函數。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 載入 math 模組\n",
    "import math"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.9999999999999999\n"
     ]
    }
   ],
   "source": [
    "# 內建函式的 sum() 在浮點數運算的精度不足\n",
    "print(sum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.0\n"
     ]
    }
   ],
   "source": [
    "# math 模組的 fsum() 可避免精度的誤差\n",
    "print(math.fsum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "cosine(pi) = -1.0\n"
     ]
    }
   ],
   "source": [
    "# cosine 180 度\n",
    "print('cosine(pi) =', math.cos(math.pi))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sine(pi/2) = 1.0\n"
     ]
    }
   ],
   "source": [
    "# sine 90 度\n",
    "print('sine(pi/2) =', math.sin(math.radians(90)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Euclidean distance between (1, 3, 5, 7, 9), (2, 4, 6, 8, 10) = 2.23606797749979\n"
     ]
    }
   ],
   "source": [
    "# 載入 math 模組裡要用到的函式\n",
    "from math import sqrt\n",
    "\n",
    "# 計算 N 維的歐幾里得距離\n",
    "def EuclideanDist(p1, p2):\n",
    "    return sqrt(sum((x1 - x2) ** 2 for x1, x2 in zip(p1, p2)))\n",
    "\n",
    "m, n = (1, 3, 5, 7, 9), (2, 4, 6, 8, 10)\n",
    "print('Euclidean distance between {}, {} = {}'.format(m, n, EuclideanDist(m, n)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"module-pathlib\"></a>\n",
    "\n",
    "## 9.6 檔案系統路徑 File System Paths\n",
    "\n",
    "Python 標準函式庫中的 [`pathlib`](https://docs.python.org/3/library/pathlib.html) 模組,提供了通用於不同平台的檔案系統路徑操作,`Path` 物件可以比較、解析路徑的組成部份、也可以串接重組,主要有以下屬性:\n",
    "\n",
    "+ `Path.drive` - 目標路徑的磁碟代號\n",
    "+ `Path.root` - 目標路徑的根目錄\n",
    "+ `Path.parent` - 目標路徑的上層目錄\n",
    "+ `Path.name` - 目標路徑最後部份的名字\n",
    "+ `Path.suffix` - 目標路徑最後部份的副檔名\n",
    "+ `Path.stem` - 目標路徑最後部份去除副檔名的名字\n",
    "\n",
    "常用的 `Path` 類別方法如下:\n",
    "\n",
    "+ `Path.cwd()` - 目前工作目錄。\n",
    "+ `Path.home()` - 登入使用者的家目錄。\n",
    "+ `Path(str)` - 從字串 str 建立路徑物件。\n",
    "+ `Path.exists()` - 路徑的檔案或目錄是否存在。\n",
    "+ `Path.glob(pattern)` - 返回生成函式,用來列出路徑下符合指定 pattern 的所有檔案或目錄。\n",
    "+ `Path.is_dir()` - 檢查路徑的目標是否爲目錄。\n",
    "+ `Path.is_file()` - 檢查路徑的目標是否爲檔案。\n",
    "+ `Path.iterdir()` - 當目標路徑爲目錄時,用來迭代尋訪目錄下的所有檔案。\n",
    "+ `Path.mkdir()` - 當目標路徑爲目錄時,爲該目標建立目錄。\n",
    "+ `Path.rename(new_name)` - 重新命名檔案。\n",
    "+ `Path.open(mode)` - 功能同內建函式 `open()`,使用指定模式開啓檔案,返回檔案物件。\n",
    "+ `Path.rmdir()` - 刪除目錄,必須是空目錄才能刪除。\n",
    "+ `Path.unlink()` - 刪除檔案或連結(symbolic link)。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 載入 Path 類別\n",
    "from pathlib import Path"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Current working directory:  D:\\Users\\James\\Documents\\Code\\Lecture\\Python-Machine-Learning\\Lecture-Notes\n",
      ".directory\n",
      ".git\n",
      ".gitignore\n",
      ".ipynb_checkpoints\n",
      "01-Getting_Started.ipynb\n",
      "02-Syntax_Overview_1.ipynb\n",
      "03-Syntax_Overview_2.ipynb\n",
      "04-String_Operations.ipynb\n",
      "05-List_Operations.ipynb\n",
      "06-Tuple_Operations.ipynb\n",
      "07-Dict_Operations.ipynb\n",
      "08-File_Operations.ipynb\n",
      "09-Other_Utilities.ipynb\n",
      "10-Coding_Project.ipynb\n",
      "11-Numpy_Vectorized_Computation.ipynb\n",
      "12-Matplotlib_Data_Visualization.ipynb\n",
      "13-Pandas_Data_Processing.ipynb\n",
      "14-Sklearn_Building_A_Machine_Learning_Model.ipynb\n",
      "15-Sklearn_Data_Preprocessing.ipynb\n",
      "16-Sklearn_Best_Practice_Techniques.ipynb\n",
      "17-Artificial_Neural_Network_with_tf_Keras.ipynb\n",
      "18-ANN_Case_Studies.ipynb\n",
      "19-Practical_Autoencoders.ipynb\n",
      "20-CNN_Fundamental.ipynb\n",
      "card.json\n",
      "dataset\n",
      "QuickStart\n",
      "README.md\n",
      "record.pkl\n"
     ]
    }
   ],
   "source": [
    "# 取得目前工作目錄\n",
    "pwd = Path.cwd()\n",
    "print('Current working directory: ', pwd)\n",
    "\n",
    "# 列出目前工作目錄下所有的檔案及目錄\n",
    "for f in pwd.iterdir():\n",
    "    print(f.name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "File \"card.json\"\" is removed.\n",
      "File \"record.pkl\"\" is removed.\n"
     ]
    }
   ],
   "source": [
    "# 建構 Path 物件可以用不同的表示法\n",
    "file2remove = [Path(pwd, 'card.json'), Path(pwd / 'record.pkl')]\n",
    "\n",
    "# 刪除之前建立的測試用檔案\n",
    "for path in file2remove:\n",
    "    if path.exists():\n",
    "        path.unlink()\n",
    "        print('File \"{}\"\" {} removed.'.format(path.name, 'is not' if path.exists() else 'is'))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'.directory': '',\n",
       " '.gitignore': '',\n",
       " '01-Getting_Started': '.ipynb',\n",
       " '02-Syntax_Overview_1': '.ipynb',\n",
       " '03-Syntax_Overview_2': '.ipynb',\n",
       " '04-String_Operations': '.ipynb',\n",
       " '05-List_Operations': '.ipynb',\n",
       " '06-Tuple_Operations': '.ipynb',\n",
       " '07-Dict_Operations': '.ipynb',\n",
       " '08-File_Operations': '.ipynb',\n",
       " '09-Other_Utilities': '.ipynb',\n",
       " '10-Coding_Project': '.ipynb',\n",
       " '11-Numpy_Vectorized_Computation': '.ipynb',\n",
       " '12-Matplotlib_Data_Visualization': '.ipynb',\n",
       " '13-Pandas_Data_Processing': '.ipynb',\n",
       " '14-Sklearn_Building_A_Machine_Learning_Model': '.ipynb',\n",
       " '15-Sklearn_Data_Preprocessing': '.ipynb',\n",
       " '16-Sklearn_Best_Practice_Techniques': '.ipynb',\n",
       " '17-Artificial_Neural_Network_with_tf_Keras': '.ipynb',\n",
       " '18-ANN_Case_Studies': '.ipynb',\n",
       " '19-Practical_Autoencoders': '.ipynb',\n",
       " '20-CNN_Fundamental': '.ipynb',\n",
       " 'README': '.md'}"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 建立一個記錄檔案名字與副檔名對照的字典\n",
    "{f.stem:f.suffix for f in pwd.iterdir() if f.is_file()}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['01-Getting_Started.ipynb',\n",
       " '02-Syntax_Overview_1.ipynb',\n",
       " '03-Syntax_Overview_2.ipynb',\n",
       " '04-String_Operations.ipynb',\n",
       " '05-List_Operations.ipynb',\n",
       " '06-Tuple_Operations.ipynb',\n",
       " '07-Dict_Operations.ipynb',\n",
       " '08-File_Operations.ipynb',\n",
       " '09-Other_Utilities.ipynb',\n",
       " '10-Coding_Project.ipynb',\n",
       " '11-Numpy_Vectorized_Computation.ipynb',\n",
       " '12-Matplotlib_Data_Visualization.ipynb',\n",
       " '13-Pandas_Data_Processing.ipynb',\n",
       " '14-Sklearn_Building_A_Machine_Learning_Model.ipynb',\n",
       " '15-Sklearn_Data_Preprocessing.ipynb',\n",
       " '16-Sklearn_Best_Practice_Techniques.ipynb',\n",
       " '17-Artificial_Neural_Network_with_tf_Keras.ipynb',\n",
       " '18-ANN_Case_Studies.ipynb',\n",
       " '19-Practical_Autoencoders.ipynb',\n",
       " '20-CNN_Fundamental.ipynb']"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 列出目前工作目錄下所有副檔名是 .ipynb 的檔案\n",
    "[f.name for f in pwd.glob('*.ipynb')]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"type-hints\"></a>\n",
    "\n",
    "## 9.7 資料型別提示 Type Hints\n",
    "\n",
    "版本 3.5 之後的 Python 在執行時期都有支援資料型別提示的語法,不需要載入特別的模組。\n",
    "\n",
    "```\n",
    "def function(arg: arg_type) -> return_type:\n",
    "    statements\n",
    "    return value\n",
    "```\n",
    "\n",
    "Python 是動態型別的程式語言,沒有強制變數或函式參數要事先宣告型別,但在大型的專案中,有型別的提示可以讓程式的結構設計具備較高的可讀性。 進階的型別支援功能可以透過載入 [`typing`](https://docs.python.org/3/library/typing.html) 模組來取得。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Mary >.^ '"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# def 函式名稱(參數: 型別) -> 回傳型別\n",
    "def jiume(who: str) -> str:\n",
    "    return who + ' >.^ '\n",
    "\n",
    "jiume('Mary')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def addmyself(myself: int) -> int:\n",
    "    return myself + myself\n",
    "\n",
    "addmyself(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Mary >.^ Mary >.^ Mary >.^ \n",
      "30\n"
     ]
    }
   ],
   "source": [
    "from typing import Any\n",
    "\n",
    "def triple(what: Any) -> Any:\n",
    "    return what * 3\n",
    "\n",
    "print(triple(jiume('Mary')))\n",
    "print(triple(addmyself(5)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1, 2, 3) dot (4, 5, 6) = 32.0\n"
     ]
    }
   ],
   "source": [
    "from math import fsum\n",
    "\n",
    "# dot_product() 函式接受兩個 list 當參數\n",
    "def dot_product(vec1: list, vec2: list) -> float:\n",
    "    return fsum(c1 * c2 for c1, c2 in zip(vec1, vec2))\n",
    "\n",
    "# 型別提醒就只是提醒,傳兩個 tuple 還是可以正常運作\n",
    "vector1, vector2 = (1, 2, 3), (4, 5, 6)\n",
    "print('{} dot {} = {}'.format(vector1, vector2, dot_product(vector1, vector2)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### § 型別別名 Type Aliases\n",
    "\n",
    "當某個型別定義在很深層的套件的模組裡時,使用別名可以讓程式看起來簡潔。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Tuple\n",
    "from math import hypot\n",
    "\n",
    "point3d = Tuple[float, float, float]\n",
    "\n",
    "def distance3d(p1: point3d, p2: point3d) -> float:\n",
    "    return hypot(*[(x1 - x2) for x1, x2 in zip(p1, p2)])\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "distance between points (1, 2, 3) and (3.0, 2.0, 1.0) = 2.8284271247461903\n"
     ]
    }
   ],
   "source": [
    "a = (1, 2, 3)\n",
    "b = (3., 2., 1.)\n",
    "print('distance between points', a, 'and', b, '=', distance3d(a, b))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}