{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "<p style=\"text-align:center\">\n", " <a href=\"https://nbviewer.jupyter.org/github/twMr7/Python-Machine-Learning/blob/master/09-Other_Utilities.ipynb\">\n", " Open In Jupyter nbviewer\n", " <img style=\"float: center;\" src=\"https://nbviewer.jupyter.org/static/img/nav_logo.svg\" width=\"120\" />\n", " </a>\n", "</p>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Open In Colab](https://colab.research.google.com/github/twMr7/Python-Machine-Learning/blob/master/09-Other_Utilities.ipynb)\n", "\n", "# 9. Other Utilities\n", "\n", "Much of Python's versatility comes from its rich standard library. This chapter introduces the following commonly used utility modules.\n", "+ [**9.1 Date and Time**](#module-datetime)\n", "+ [**9.2 Python Object Serialization**](#module-pickle)\n", "+ [**9.3 JSON**](#module-json)\n", "+ [**9.4 Random Numbers**](#module-random)\n", "+ [**9.5 Math Functions**](#module-math)\n", "+ [**9.6 File System Paths**](#module-pathlib)\n", "+ [**9.7 Type Hints**](#type-hints)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"module-datetime\"></a>\n", "\n", "## 9.1 Date and Time\n", "\n", "The [`datetime`](https://docs.python.org/3/library/datetime.html#datetime-objects) module in the Python standard library handles date and time data. It provides the `date`, `time`, `datetime`, `timedelta`, and `timezone` types.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### § timedelta\n", "\n", "A **`timedelta`** object represents a duration, that is, the distance between two points in time rather than a specific date and time of day. It supports arithmetic: addition, subtraction, multiplication, and division.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "datetime.timedelta(days=64, seconds=29156, microseconds=10)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Import the timedelta class from the datetime module\n", "from datetime import timedelta\n", "\n", "delta = timedelta(\n", "    days=50,\n", "    seconds=27,\n", "    microseconds=10,\n", "    milliseconds=29000,\n", "    minutes=5,\n", "    hours=8,\n", "    weeks=2\n", ")\n", "# 
Only days, seconds and microseconds are stored internally\n", "delta" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "datetime.timedelta(days=3650)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ten_years = timedelta(days=365) * 10\n", "ten_years" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ten_years.days // 365" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### § date\n", "\n", "A `date` has **year**, **month**, and **day** attributes and no notion of hours, minutes, or seconds. Arithmetic and comparisons ignore `timedelta.seconds` and `timedelta.microseconds`.\n", "\n", "| Example operation | \n", "|---------------------------------|\n", "| **`date2 = date1 + timedelta`** |\n", "| **`date2 = date1 - timedelta`** |\n", "| **`timedelta = date1 - date2`** |\n", "| **`date1 < date2`** |\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "datetime.date(2021, 3, 17)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Import the date class from the datetime module\n", "from datetime import date\n", "\n", "today = date.today()\n", "today" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "datetime.date(2022, 3, 17)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "today.replace(year=2022)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'2011-03-20'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(today - ten_years).isoformat()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "datetime.timedelta(days=366)" ] }, "execution_count": 7, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "# Remember that timedelta only counts days; it has no notion of months or years\n", "date.today() - date(2020, 3, 16)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### § time\n", "\n", "A `time` object represents a time of day, independent of any particular date, and can optionally carry time zone information. Its attributes are **hour**, **minute**, **second**, **microsecond**, and **tzinfo**.\n", "\n", "\n", "### § datetime\n", "\n", "A `datetime` object combines the information of a `date` and a `time` object.\n", "\n", "| Example operation | \n", "|-----------------------------------------|\n", "| **`datetime2 = datetime1 + timedelta`** |\n", "| **`datetime2 = datetime1 - timedelta`** |\n", "| **`timedelta = datetime1 - datetime2`** |\n", "| **`datetime1 < datetime2`** |\n", "\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "datetime.datetime(2021, 3, 17, 11, 39, 4, 937520)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Import the datetime class from the datetime module\n", "from datetime import datetime\n", "\n", "# The current date and time, returned as a datetime object\n", "t1 = datetime.now()\n", "t1" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'2021-03-17 11:39:04.937520'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Convert to a str\n", "str(t1)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'03/17/2021 11:39:04'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Format the date and time as a string in a given format\n", "t1.strftime('%m/%d/%Y %H:%M:%S')" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "t1 is later than t2 by 139 days, 20:29:04.937520\n" ] } ], "source": [ "# Parse a date-time string into a datetime object\n", "t2 = datetime.strptime('2020-10-28 15:10:00', '%Y-%m-%d %H:%M:%S')\n", "\n", "# Compare two datetime objects\n", "if (t1 > t2):\n", "    print('t1 is later than t2 by', t1 - t2)\n", "else:\n", "    print('t2 is later than t1 by', t2 - t1)" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "The standard library also has a separate `time` module with time-related functions. Most of them are lower-level wrappers around the platform's C library." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'20210317 113905'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The time module also has strftime() for formatting time strings\n", "import time\n", "time.strftime('%Y%m%d %H%M%S')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"module-pickle\"></a>\n", "\n", "## 9.2 Python Object Serialization\n", "\n", "The [`pickle`](https://docs.python.org/3/library/pickle.html#module-pickle) module in the Python standard library serializes and de-serializes Python objects. Serializing converts an object hierarchy into a byte stream, which makes the object easy to store, send over a network, or exchange between platforms. De-serializing is the reverse operation: it converts a byte stream back into an object hierarchy.\n", "\n", "+ The `pickle` module can save objects to a file and load them back. The file must be opened in binary mode.\n", "+ The `pickle` format is specific to Python objects. The standard library also offers [`json`](https://docs.python.org/3/library/json.html#module-json), a general-purpose serialization module that works across platforms and programming languages, but `json` supports only a smaller set of Python built-in types." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Import the pickle module\n", "import pickle" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# Build a list of measurement records\n", "# (keys: 時間 = time, 體溫 = body temperature, 速度 = speed, 心率 = heart rate)\n", "tformat = '%Y-%m-%d %H:%M:%S'\n", "record = [\n", "    {'時間':datetime.strptime('2019-04-03 10:35:58', tformat), '體溫':37.0, '速度':35.0, '心率':92},\n", "    {'時間':datetime.strptime('2019-04-03 10:37:00', tformat), '體溫':37.1, '速度':33.8, '心率':97},\n", "    {'時間':datetime.strptime('2019-04-03 10:37:59', tformat), '體溫':37.4, '速度':35.5, '心率':99}\n", "]" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# Open a new binary file and serialize the record object with pickle\n", "# Note: a pickle file is in binary format\n", "pfile = open('record.pkl', 'wb')\n", "pickle.dump(record, pfile)\n", "pfile.close()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "# Having learned about context managers, this is the better way to write it\n", "with open('record.pkl', 'wb') as pfile:\n", "    pickle.dump(record, pfile)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "# Read the file and de-serialize the record object\n", "pfile = open('record.pkl', 'rb')\n", "record2 = pickle.load(pfile)\n", "pfile.close()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'時間': datetime.datetime(2019, 4, 3, 10, 35, 58),\n", " '體溫': 37.0,\n", " '速度': 35.0,\n", " '心率': 92},\n", " {'時間': datetime.datetime(2019, 4, 3, 10, 37),\n", " '體溫': 37.1,\n", " '速度': 33.8,\n", " '心率': 97},\n", " {'時間': datetime.datetime(2019, 4, 3, 10, 37, 59),\n", " '體溫': 37.4,\n", " '速度': 35.5,\n", " '心率': 99}]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Loading the pickled object, also rewritten with a context manager\n", "with open('record.pkl', 'rb') as pfile:\n", "    record2 = pickle.load(pfile)\n", "\n", "record2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"module-json\"></a>\n", "\n", "## 9.3 JSON\n", "\n", "JSON (JavaScript Object Notation) is a widely used, openly specified data interchange format. By convention its files use the **`.json`** extension. The [`json`](https://docs.python.org/3/library/json.html) module in the Python standard library offers an interface similar to `pickle`: it can write built-in object types out to a JSON file, and it can load JSON-encoded objects back in. The supported types map between JSON and Python as follows:\n", "\n", "| JSON type | Python type |\n", "|-----------|--------------|\n", "| *object* | **dict** |\n", "| *array* | **list** |\n", "| *string* | **str** |\n", "| *number (int)* | **int** |\n", "| *number (real)* | **float** |\n", "| *true* | **True** |\n", "| *false* | **False** |\n", "| *null* | **None** |\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'image': {'Width': 600, 'Height': 800, 'Title': 'Portrait'},\n", " 'person': {'firstName': 'John',\n", " 'lastName': 'Doe',\n", " 'isAlive': True,\n", " 'age': 27,\n", " 'phoneNumbers': [{'type': 'home', 'number': '212 555-1234'},\n", " {'type': 'office', 'number': '646 
555-4567'}],\n", " 'spouse': None}}" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Prepare the object to be saved as a JSON file\n", "card = {\n", "    \"image\": {\n", "        \"Width\": 600,\n", "        \"Height\": 800,\n", "        \"Title\": \"Portrait\",\n", "    },\n", "\n", "    \"person\": {\n", "        \"firstName\": \"John\",\n", "        \"lastName\": \"Doe\",\n", "        \"isAlive\": True,\n", "        \"age\": 27,\n", "        \"phoneNumbers\": [\n", "            {\n", "                \"type\": \"home\",\n", "                \"number\": \"212 555-1234\"\n", "            },\n", "            {\n", "                \"type\": \"office\",\n", "                \"number\": \"646 555-4567\"\n", "            }\n", "        ],\n", "        \"spouse\": None\n", "    }\n", "}\n", "\n", "card" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "# Import the json module\n", "import json" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "# Open a new text file and write the Python object to it, encoded as JSON\n", "# Note: a .json file is in text format\n", "with open('card.json', 'w') as jfile:\n", "    json.dump(card, jfile)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'image': {'Width': 600, 'Height': 800, 'Title': 'Portrait'},\n", " 'person': {'firstName': 'John',\n", " 'lastName': 'Doe',\n", " 'isAlive': True,\n", " 'age': 27,\n", " 'phoneNumbers': [{'type': 'home', 'number': '212 555-1234'},\n", " {'type': 'office', 'number': '646 555-4567'}],\n", " 'spouse': None}}" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Open the .json file and load the JSON-encoded object back into a Python object\n", "with open('card.json', 'r') as jfile:\n", "    card2 = json.load(jfile)\n", "\n", "card2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"module-random\"></a>\n", "\n", "## 9.4 Random Numbers\n", "\n", "The [`random`](https://docs.python.org/3/library/random.html) module in the Python standard library generates pseudo-random numbers.\n", "\n", "+ `random()` - returns the next random float in the interval [0.0, 1.0).\n", "+ `randrange(start, stop[, step])` - returns a random integer in the interval [start, stop).\n", "+ `randint(a, b)` - returns a random integer in the interval [a, b], same as `randrange(a, b+1)`.\n", "+ `choice(seq)` - returns one randomly chosen member of the sequence seq.\n", "+ `shuffle(seq)` - shuffles the elements of seq in place, so seq must be a mutable sequence.\n", "+ `sample(seq, k)` - returns a list of k members randomly chosen from the sequence seq (a set is also accepted, but only before Python 3.11)." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "# Import the random module\n", "import random" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.49356032291668317, 0.2411096099519041, 0.24274807400132448, 0.6806583376348618, 0.4007445154008329, 0.5020029135428243, 0.8753881901813494, 0.2064575923259282, 0.5471457366330188, 0.6100255703399993, 0.6754079019577277, 0.028047998071924818, 0.9113825389832221, 0.12481777147365836, 0.9197373983810999, 0.4972709758446857, 0.3109592944098044, 0.6135724808834165, 0.9030384725303868, 0.4016312744745454, 0.4224038103832404, 0.4288471001262948, 0.466593180358435, 0.47192722041625734, 0.23088689632757775, 0.5080219321975132, 0.23119624893044544, 0.766038585063923, 0.9432255999781156, 0.4438950384837139, 0.008875158371981162, 0.7747935214178607, 0.8328097496865488, 0.03820973930946581, 0.2425404000214182, 0.20378756529358255, 0.9011823798074147, 0.9429615434171732, 0.008365499494094153, 0.30710204474646563, 0.7714685577914125, 0.44228662030116717, 0.47293711004645833, 0.9272615793168927, 0.6274046238104688, 0.06124138720195915, 0.20303402087805467, 0.00551746765636052, 0.29608613992703825, 0.8392178754932461, 0.4092118352756213, 0.5371365032419317, 0.45466517861888456, 0.07374675487497973, 0.38185686745620595, 0.0329699613561929, 0.15696658530359375, 0.8553618168471122, 0.3398270023737717, 0.7099281747926457, 0.23799987535976463, 0.8074101269935077, 0.7577429606338424, 0.06977378136798007, 0.38924769663238856, 0.13725757006274264, 0.9314593644916109, 0.5800782709115695, 0.5442005381571936, 0.09078629592206988, 0.33982237325614884, 0.5233793070412616, 
0.24834424239295672, 0.10656032584693331, 0.6222531212266836, 0.11026076444867894, 0.5767222541627232, 0.8213605421082407, 0.4042220342914117, 0.5086261360293176, 0.10830352652922781, 0.31460119979844836, 0.29420699858731425, 0.27102235776003614, 0.7215244001345396, 0.003757854865288124, 0.13378381296717023, 0.4004733598153881, 0.508240336008106, 0.9004693485487324, 0.5161319869840322, 0.44929251677640736, 0.0816625073655195, 0.40020492559186704, 0.305598752686734, 0.20873123907155022, 0.3384601272052107, 0.22090273575138908, 0.7775569232073181, 0.8853173389504151]\n" ] } ], "source": [ "# Generate a list of 100 random floats\n", "Lr = [random.random() for x in range(100)]\n", "print(Lr)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[11, 57, 31, 91, 59, 91, 10, 56, 74, 13, 38, 5, 94, 73, 37, 86, 90, 5, 14, 56, 63, 74, 69, 51, 43, 22, 10, 57, 98, 47, 60, 82, 34, 77, 41, 33, 15, 95, 70, 95, 48, 48, 17, 28, 40, 44, 100, 17, 58, 93, 100, 16, 91, 82, 85, 96, 40, 3, 86, 92, 33, 36, 60, 63, 53, 30, 76, 93, 23, 84, 100, 2, 73, 47, 90, 67, 67, 46, 16, 15, 6, 74, 32, 46, 28, 24, 47, 27, 7, 65, 19, 9, 27, 70, 49, 79, 36, 6, 87, 4]\n" ] } ], "source": [ "# Generate a list of 100 random integers\n", "Li = [random.randint(1, 100) for x in range(100)]\n", "print(Li)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1.5547964720490222,\n", " 30.879379115434432,\n", " 26.426500783069663,\n", " 1.2249376104827925,\n", " 2.030760763231264,\n", " 77.75569232073181,\n", " 23.323987785256936,\n", " 42.76530392264297,\n", " 68.92799297237922,\n", " 13.824802801220837]" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Randomly sample 10 items from each list and multiply them pairwise to build a new list\n", "[x * y for x, y in zip(random.sample(Lr, 10), random.sample(Li, 10))]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"module-math\"></a>\n", "\n", "## 9.5 Math 
Functions\n", "\n", "The [`math`](https://docs.python.org/3/library/math.html) module in the Python standard library provides common functions for real-number arithmetic." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "# Import the math module\n", "import math" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.9999999999999999\n" ] } ], "source": [ "# The built-in sum() accumulates floating-point rounding error\n", "print(sum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1]))" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.0\n" ] } ], "source": [ "# math.fsum() avoids this loss of precision\n", "print(math.fsum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1]))" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cosine(pi) = -1.0\n" ] } ], "source": [ "# Cosine of 180 degrees (pi radians)\n", "print('cosine(pi) =', math.cos(math.pi))" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sine(pi/2) = 1.0\n" ] } ], "source": [ "# Sine of 90 degrees, converted to radians first\n", "print('sine(pi/2) =', math.sin(math.radians(90)))" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Euclidean distance between (1, 3, 5, 7, 9), (2, 4, 6, 8, 10) = 2.23606797749979\n" ] } ], "source": [ "# Import only the function we need from the math module\n", "from math import sqrt\n", "\n", "# Compute the Euclidean distance in N dimensions\n", "def EuclideanDist(p1, p2):\n", "    return sqrt(sum((x1 - x2) ** 2 for x1, x2 in zip(p1, p2)))\n", "\n", "m, n = (1, 3, 5, 7, 9), (2, 4, 6, 8, 10)\n", "print('Euclidean distance between {}, {} = {}'.format(m, n, EuclideanDist(m, n)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"module-pathlib\"></a>\n", "\n", "## 9.6 File System Paths\n", "\n", "The [`pathlib`](https://docs.python.org/3/library/pathlib.html) module in the Python standard library provides file system path operations that work across platforms. `Path` objects can be compared, split into their components, and joined to build new paths. The main attributes are:\n", "\n", "+ `Path.drive` - the drive letter of the path, if any\n", "+ `Path.root` - the root of the path\n", "+ `Path.parent` - the parent directory of the path\n", "+ `Path.name` - the final component of the path\n", "+ `Path.suffix` - the file extension of the final component\n", "+ `Path.stem` - the final component without its extension\n", "\n", "Commonly used `Path` methods:\n", "\n", "+ `Path.cwd()` - the current working directory.\n", "+ `Path.home()` - the current user's home directory.\n", "+ `Path(str)` - creates a path object from the string str.\n", "+ `Path.exists()` - whether the path points to an existing file or directory.\n", "+ `Path.glob(pattern)` - returns a generator that yields all files or directories under the path that match pattern.\n", "+ `Path.is_dir()` - whether the path points to a directory.\n", "+ `Path.is_file()` - whether the path points to a regular file.\n", "+ `Path.iterdir()` - when the path is a directory, iterates over all entries inside it.\n", "+ `Path.mkdir()` - creates a directory at this path.\n", "+ `Path.rename(new_name)` - renames the file or directory.\n", "+ `Path.open(mode)` - works like the built-in `open()`: opens the file in the given mode and returns a file object.\n", "+ `Path.rmdir()` - removes the directory, which must be empty.\n", "+ `Path.unlink()` - removes the file or symbolic link.\n" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "# Import the Path class\n", "from pathlib import Path" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Current working directory: D:\\Users\\James\\Documents\\Code\\Lecture\\Python-Machine-Learning\\Lecture-Notes\n", ".directory\n", ".git\n", ".gitignore\n", ".ipynb_checkpoints\n", "01-Getting_Started.ipynb\n", "02-Syntax_Overview_1.ipynb\n", "03-Syntax_Overview_2.ipynb\n", "04-String_Operations.ipynb\n", "05-List_Operations.ipynb\n", "06-Tuple_Operations.ipynb\n", "07-Dict_Operations.ipynb\n", "08-File_Operations.ipynb\n", "09-Other_Utilities.ipynb\n", "10-Coding_Project.ipynb\n", "11-Numpy_Vectorized_Computation.ipynb\n", "12-Matplotlib_Data_Visualization.ipynb\n", "13-Pandas_Data_Processing.ipynb\n", "14-Sklearn_Building_A_Machine_Learning_Model.ipynb\n", "15-Sklearn_Data_Preprocessing.ipynb\n", "16-Sklearn_Best_Practice_Techniques.ipynb\n", 
"17-Artificial_Neural_Network_with_tf_Keras.ipynb\n", "18-ANN_Case_Studies.ipynb\n", "19-Practical_Autoencoders.ipynb\n", "20-CNN_Fundamental.ipynb\n", "card.json\n", "dataset\n", "QuickStart\n", "README.md\n", "record.pkl\n" ] } ], "source": [ "# Get the current working directory\n", "pwd = Path.cwd()\n", "print('Current working directory:', pwd)\n", "\n", "# List every file and directory in the current working directory\n", "for f in pwd.iterdir():\n", "    print(f.name)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "File \"card.json\" is removed.\n", "File \"record.pkl\" is removed.\n" ] } ], "source": [ "# A Path can be built in different ways: from several segments, or with the / operator\n", "file2remove = [Path(pwd, 'card.json'), Path(pwd / 'record.pkl')]\n", "\n", "# Remove the test files created earlier\n", "for path in file2remove:\n", "    if path.exists():\n", "        path.unlink()\n", "    print('File \"{}\" {} removed.'.format(path.name, 'is not' if path.exists() else 'is'))\n" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'.directory': '',\n", " '.gitignore': '',\n", " '01-Getting_Started': '.ipynb',\n", " '02-Syntax_Overview_1': '.ipynb',\n", " '03-Syntax_Overview_2': '.ipynb',\n", " '04-String_Operations': '.ipynb',\n", " '05-List_Operations': '.ipynb',\n", " '06-Tuple_Operations': '.ipynb',\n", " '07-Dict_Operations': '.ipynb',\n", " '08-File_Operations': '.ipynb',\n", " '09-Other_Utilities': '.ipynb',\n", " '10-Coding_Project': '.ipynb',\n", " '11-Numpy_Vectorized_Computation': '.ipynb',\n", " '12-Matplotlib_Data_Visualization': '.ipynb',\n", " '13-Pandas_Data_Processing': '.ipynb',\n", " '14-Sklearn_Building_A_Machine_Learning_Model': '.ipynb',\n", " '15-Sklearn_Data_Preprocessing': '.ipynb',\n", " '16-Sklearn_Best_Practice_Techniques': '.ipynb',\n", " '17-Artificial_Neural_Network_with_tf_Keras': '.ipynb',\n", " '18-ANN_Case_Studies': '.ipynb',\n", " '19-Practical_Autoencoders': '.ipynb',\n", " '20-CNN_Fundamental': '.ipynb',\n", " 'README': '.md'}" ] }, 
"execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Build a dict that maps each file's stem to its extension\n", "{f.stem:f.suffix for f in pwd.iterdir() if f.is_file()}" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['01-Getting_Started.ipynb',\n", " '02-Syntax_Overview_1.ipynb',\n", " '03-Syntax_Overview_2.ipynb',\n", " '04-String_Operations.ipynb',\n", " '05-List_Operations.ipynb',\n", " '06-Tuple_Operations.ipynb',\n", " '07-Dict_Operations.ipynb',\n", " '08-File_Operations.ipynb',\n", " '09-Other_Utilities.ipynb',\n", " '10-Coding_Project.ipynb',\n", " '11-Numpy_Vectorized_Computation.ipynb',\n", " '12-Matplotlib_Data_Visualization.ipynb',\n", " '13-Pandas_Data_Processing.ipynb',\n", " '14-Sklearn_Building_A_Machine_Learning_Model.ipynb',\n", " '15-Sklearn_Data_Preprocessing.ipynb',\n", " '16-Sklearn_Best_Practice_Techniques.ipynb',\n", " '17-Artificial_Neural_Network_with_tf_Keras.ipynb',\n", " '18-ANN_Case_Studies.ipynb',\n", " '19-Practical_Autoencoders.ipynb',\n", " '20-CNN_Fundamental.ipynb']" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# List every file in the current working directory whose extension is .ipynb\n", "[f.name for f in pwd.glob('*.ipynb')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"type-hints\"></a>\n", "\n", "## 9.7 Type Hints\n", "\n", "Python 3.5 and later support the type hint syntax natively, with no special module to import. The interpreter accepts the annotations but does not enforce them at run time.\n", "\n", "```\n", "def function(arg: arg_type) -> return_type:\n", "    statements\n", "    return value\n", "```\n", "\n", "Python is a dynamically typed language and does not require variables or function parameters to be declared with a type in advance. In large projects, though, type hints make the structure of the program easier to read. More advanced typing features are available from the [`typing`](https://docs.python.org/3/library/typing.html) module.\n" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Mary >.^ '" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# def function_name(parameter: type) -> return_type\n", "def jiume(who: str) -> str:\n", "    return 
who + ' >.^ '\n", "\n", "jiume('Mary')" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def addmyself(myself: int) -> int:\n", "    return myself + myself\n", "\n", "addmyself(5)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mary >.^ Mary >.^ Mary >.^ \n", "30\n" ] } ], "source": [ "from typing import Any\n", "\n", "def triple(what: Any) -> Any:\n", "    return what * 3\n", "\n", "print(triple(jiume('Mary')))\n", "print(triple(addmyself(5)))" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 2, 3) dot (4, 5, 6) = 32.0\n" ] } ], "source": [ "from math import fsum\n", "\n", "# dot_product() is annotated as taking two lists\n", "def dot_product(vec1: list, vec2: list) -> float:\n", "    return fsum(c1 * c2 for c1, c2 in zip(vec1, vec2))\n", "\n", "# Type hints are only hints: passing two tuples still works\n", "vector1, vector2 = (1, 2, 3), (4, 5, 6)\n", "print('{} dot {} = {}'.format(vector1, vector2, dot_product(vector1, vector2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### § Type Aliases\n", "\n", "When a type is defined in a module deep inside a package hierarchy, giving it an alias keeps the code concise." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "from typing import Tuple\n", "from math import hypot\n", "\n", "point3d = Tuple[float, float, float]\n", "\n", "# Note: hypot() accepts more than two arguments only in Python 3.8 and later\n", "def distance3d(p1: point3d, p2: point3d) -> float:\n", "    return hypot(*[(x1 - x2) for x1, x2 in zip(p1, p2)])\n" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "distance between points (1, 2, 3) and (3.0, 2.0, 1.0) = 2.8284271247461903\n" ] } ], "source": [ "a = (1, 2, 3)\n", "b = (3., 2., 1.)\n", "print('distance between points', a, 
'and', b, '=', distance3d(a, b))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 2 }