{ "cells": [ { "cell_type": "markdown", "id": "southern-variable", "metadata": { "deletable": false, "editable": false }, "source": [ "# How to Use Functions in Python\n", "\n", "> Understanding the basic concepts of functions\n", "\n", "[數據交點](https://www.datainpoint.com) | 郭耀仁" ] }, { "cell_type": "markdown", "id": "published-option", "metadata": { "deletable": false, "editable": false }, "source": [ "## Dissecting Hello World\n", "\n", "In [Getting Started with Python](https://medium.com/datainpoint/getting-started-with-python-b78bd029df4d) we ran `print(\"Hello, World!\")` to confirm that the Python development and runtime environments were installed correctly. That single line of code already demonstrates a core idea of programming and data analysis: apply functions to data.\n", "\n", "`print` is the name of the function, the parentheses indicate that the function is being called, and `\"Hello, World!\"` is a value of the data type known as a string. In other words, Hello World can be described as applying the `print` function to the string `\"Hello, World!\"`." ] }, { "cell_type": "code", "execution_count": 1, "id": "deluxe-christopher", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello, World!\n" ] } ], "source": [ "print(\"Hello, World!\")" ] }, { "cell_type": "markdown", "id": "accredited-center", "metadata": { "deletable": false, "editable": false }, "source": [ "Besides applying `print` directly to the string `\"Hello, World!\"`, a more common practice is to bind the data to an object name with the `=` symbol, which creates a reference. This step is called assignment (it is also loosely referred to as declaration). Stated in full, the object `hello_message` is an instance of the string data type, so Hello World can be described as applying the `print` function to `hello_message`, an object instantiated from a string." ] }, { "cell_type": "code", "execution_count": 2, "id": "thick-transparency", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello, World!\n" ] } ], "source": [ "hello_message = \"Hello, World!\"\n", "print(hello_message)" ] }, { "cell_type": "markdown", "id": "lyric-broadcast", "metadata": { "deletable": false, "editable": false }, "source": [ "When naming objects, follow PEP 8, the style guide among the Python Enhancement Proposals (PEPs), together with a few common conventions:\n", "\n", "- Use all-lowercase English words in snake case, separating words with an underscore `_`.\n", "- Do not use keywords as object names; doing so raises an error. See the list of [Python keywords](https://docs.python.org/3/reference/lexical_analysis.html#keywords).\n", "- Use singular nouns for objects that are instances of data types, plural nouns for objects that are instances of data structures, and verbs for functions or methods. Keep names concise and meaningful.\n", "- Do not use the names of built-in functions as object names, so that the built-in functions are not shadowed. See the list of [Python built-in functions](https://docs.python.org/3/library/functions.html).\n", "\n", "## What Is a Function\n", "\n", "Functions are a basic building block of every programming language. A function is a named piece of code that performs a specific task, such as a numeric computation or text processing. Before applying (using, calling) a function, make sure it has already been defined or loaded in the scope where it will be used. In the Hello World code, we can apply `print` to the string `\"Hello, World!\"` because `print` is a built-in function, available as soon as Python starts. There are roughly 70 built-in functions (the exact number depends on the Python version); the `print` and `sorted` functions used in this article are both among them. See the list of [Python built-in functions](https://docs.python.org/3/library/functions.html).\n", "\n", "## Where Functions Come From\n", "\n", "Besides built-in functions, Python users can obtain the functions they need for numeric computation or text processing from three other sources:\n", "\n", "1. Standard library modules.\n", "2. Third-party modules.\n", "3. User-defined functions.\n", "\n", "Only built-in functions can be used directly. Functions from these three sources must be loaded or defined in the relevant scope before they are called. Before using a function from a standard or third-party module, confirm that the module is installed and imported; before using a user-defined function, confirm that it has been defined.\n", "The keyword for loading a module is `import` (as in the Zen of Python example in [Getting Started with Python](https://medium.com/datainpoint/getting-started-with-python-b78bd029df4d)). To use a function from a module, prefix its name with the module name and a dot (`.`).\n", "\n", "```python\n", "import module_name\n", "\n", "module_name.function_name(INPUTS)\n", "```\n", "\n", "For example, import the standard module `random` and use its `randint` function to pick a random integer between 0 and 1, both ends inclusive." ] }, { "cell_type": "code", "execution_count": 3, "id": "innovative-might", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n" ] } ], "source": [ "import random\n", "\n", "random_integer = random.randint(0, 1)\n", "print(random_integer)" ] }, { "cell_type": "markdown", "id": "legitimate-laundry", "metadata": { "deletable": false, "editable": false }, "source": [ "## Syntax for Applying Functions\n", "The most common syntax is the function name followed by parentheses that contain the object to operate on and any arguments (if the function defines parameters).\n", "\n", "```python\n", "function_name(OBJECT, ARGUMENTS)\n", "```\n", "\n", "For example, the built-in function `sorted` sorts the numbers in a list, a data structure we will meet later. It defines a parameter `reverse` that selects ascending or descending order. If `reverse` is not specified it takes its default value `False`, which means ascending order; passing `reverse=True` sorts in descending order." ] }, { "cell_type": "code", "execution_count": 4, "id": "accessory-recipient", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2, 3, 5, 7, 11]\n", "[11, 7, 5, 3, 2]\n" ] } ], "source": [ "list_to_be_sorted = [11, 7, 5, 3, 2]\n", "print(sorted(list_to_be_sorted))\n", "print(sorted(list_to_be_sorted, reverse=True))" ] }, { "cell_type": "markdown", "id": "alien-decision", "metadata": { "deletable": false, "editable": false }, "source": [ "## The Difference Between Parameters and Arguments\n", "\n", "The text above used both the terms arguments and parameters. Mixing them up is usually harmless, but strictly speaking, parameters are the names listed when a function is defined or documented, while arguments are the values passed in when the function is called to set those parameters." ] }, { "cell_type": "code", "execution_count": 5, "id": "driving-chain", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[11, 7, 5, 3, 2]\n" ] } ], "source": [ "list_to_be_sorted = [11, 7, 5, 3, 2]\n", "argument_to_be_passed = True\n", "print(sorted(list_to_be_sorted, reverse=argument_to_be_passed))" ] }, { "cell_type": "markdown", "id": "serious-adelaide", "metadata": { "deletable": false, "editable": false }, "source": [ "A function's parameters can accept arguments in two ways: as keyword arguments or as positional arguments. The `reverse` parameter of the built-in function `sorted`, for example, is keyword-only: the parameter name must be given in the call, otherwise an error is raised." ] }, { "cell_type": "code", "execution_count": 6, "id": "cathedral-charity", "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "sorted expected 1 argument, got 2", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mlist_to_be_sorted\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m11\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m7\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m5\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m 
\u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msorted\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlist_to_be_sorted\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: sorted expected 1 argument, got 2" ] } ], "source": [ "list_to_be_sorted = [11, 7, 5, 3, 2]\n", "print(sorted(list_to_be_sorted, True))" ] }, { "cell_type": "markdown", "id": "protecting-treaty", "metadata": { "deletable": false, "editable": false }, "source": [ "As another example, the built-in function `round` rounds a floating-point number using the round-half-to-even rule: when the value lies exactly halfway between two candidates, the result is the one whose last digit is even. It defines a parameter `ndigits` for the number of decimal places to keep, and `ndigits` accepts its argument either positionally or by keyword." ] }, { "cell_type": "code", "execution_count": 7, "id": "conservative-disabled", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.14\n", "3.14\n" ] } ], "source": [ "float_to_be_rounded = 3.14159\n", "print(round(float_to_be_rounded, 2))\n", "print(round(float_to_be_rounded, ndigits=2))" ] }, { "cell_type": "markdown", "id": "careful-metropolitan", "metadata": { "deletable": false, "editable": false }, "source": [ "## Functions and Methods\n", "\n", "Python is an object-oriented programming language: data types, data structures, functions, classes, and even modules are all objects. As a result, some functions exist attached to an object. They are applied by writing the object name, a dot (`.`), the function name, and parentheses with any arguments (if the function defines parameters). The syntax is not the only difference: such functions are no longer called functions but methods of the object.\n", "\n", "```python\n", "OBJECT.method_name(ARGUMENTS)\n", "```\n", "\n", "For example, the built-in function `sorted` mentioned above sorts the numbers in a list, and lists also have a `sort` method. Both sort a list, but the syntax and the meaning differ. Using `sorted` is the familiar, intuitive pattern of applying a function to an object, written `sorted(OBJECT)`; using `sort` is the less familiar pattern of calling a method that belongs to the object, written `OBJECT.sort()`." ] }, { "cell_type": "code", "execution_count": 8, "id": "experienced-mexican", "metadata": {}, "outputs": [], "source": [ "list_to_be_sorted = [11, 7, 5, 3, 2]\n", "# Apply sorted function to list_to_be_sorted\n", "sorted(list_to_be_sorted)\n", "# Call sort method of list_to_be_sorted\n", "list_to_be_sorted.sort()" ] }, { "cell_type": "markdown", "id": "harmful-hardware", "metadata": { "deletable": false }, "source": [ "## How Results Are Delivered\n", "\n", "Beyond the difference between calling a function and calling a method, note that a function or method can deliver its result in one of two ways. One is to output the changed result as a return value; the other is to modify the object in place and return nothing. The built-in function `sorted` and the list method `sort` illustrate the difference. `sorted` returns the sorted list, so unless the return value is assigned back to the original name, the sorted order is not kept." ] }, { "cell_type": "code", "execution_count": 9, "id": "earned-creation", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[11, 7, 5, 3, 2]\n", "[2, 3, 5, 7, 11]\n" ] } ], "source": [ "list_to_be_sorted = [11, 7, 5, 3, 2]\n", "# Apply sorted function to list_to_be_sorted\n", "sorted(list_to_be_sorted)\n", "print(list_to_be_sorted) # list_to_be_sorted is not sorted\n", "# Update list_to_be_sorted with function output\n", "list_to_be_sorted = sorted(list_to_be_sorted)\n", "print(list_to_be_sorted) # list_to_be_sorted is sorted" ] }, { "cell_type": "markdown", "id": "documented-thesaurus", "metadata": { "deletable": false, "editable": false }, "source": [ "`sort`, by contrast, sorts the list in place: there is no need to reassign the original name, and no sorted list is returned (the method returns `None`)." ] }, { "cell_type": "code", "execution_count": 10, "id": "sharp-simple", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2, 3, 5, 7, 11]\n" ] } ], "source": [ "list_to_be_sorted = [11, 7, 5, 3, 2]\n", "# Call sort method of list_to_be_sorted\n", "list_to_be_sorted.sort()\n", "print(list_to_be_sorted) # list_to_be_sorted is sorted" ] }, { "cell_type": "markdown", "id": "black-router", "metadata": { "deletable": false, "editable": false }, "source": [ "Most functions and methods either return a result or modify the object in place, but a few do both. The `pop` method of the list data structure, which we will meet later, is one example: it returns the last element of the list and also removes that element from the list." ] }, { "cell_type": "code", "execution_count": 11, "id": "forced-serbia", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[11, 7, 5, 3]\n", "2\n" ] } ], "source": [ "list_to_be_sorted = [11, 7, 5, 3, 2]\n", "the_last_element = list_to_be_sorted.pop()\n", "print(list_to_be_sorted) # The last element was deleted\n", "print(the_last_element) # The last element was returned" ] }, { "cell_type": "markdown", "id": "waiting-declaration", "metadata": { "deletable": false, "editable": false }, "source": [ "This means that being a function or a method does not map neatly onto returning a value or modifying the object in place. The vast majority of functions do return their result, but a method may return a value, modify the object in place, or even let a parameter decide between the two. For example, the `drop` method of the DataFrame class provided by pandas, a data science module we will meet later, uses the `inplace` parameter to choose between returning a result and modifying the object in place." ] }, { "cell_type": "code", "execution_count": 12, "id": "unlike-partnership", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
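<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>order</th>\n", "      <th>prime</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>0</th>\n", "      <td>1</td>\n", "      <td>2</td>\n", "    </tr>\n", "    <tr>\n", "      <th>1</th>\n", "      <td>2</td>\n", "      <td>3</td>\n", "    </tr>\n", "    <tr>\n", "      <th>2</th>\n", "      <td>3</td>\n", "      <td>5</td>\n", "    </tr>\n", "    <tr>\n", "      <th>3</th>\n", "      <td>4</td>\n", "      <td>7</td>\n", "    </tr>\n", "    <tr>\n", "      <th>4</th>\n", "      <td>5</td>\n", "      <td>11</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "</div>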
" ], "text/plain": [ " order prime\n", "0 1 2\n", "1 2 3\n", "2 3 5\n", "3 4 7\n", "4 5 11" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "df = pd.DataFrame()\n", "df[\"order\"] = list(range(1, 6))\n", "df[\"prime\"] = [2, 3, 5, 7, 11]\n", "df.drop(axis=1, labels=\"prime\")\n", "df # Column is not dropped" ] }, { "cell_type": "code", "execution_count": 13, "id": "minimal-clearance", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>order</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>0</th>\n", "      <td>1</td>\n", "    </tr>\n", "    <tr>\n", "      <th>1</th>\n", "      <td>2</td>\n", "    </tr>\n", "    <tr>\n", "      <th>2</th>\n", "      <td>3</td>\n", "    </tr>\n", "    <tr>\n", "      <th>3</th>\n", "      <td>4</td>\n", "    </tr>\n", "    <tr>\n", "      <th>4</th>\n", "      <td>5</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "   order\n", "0      1\n", "1      2\n", "2      3\n", "3      4\n", "4      5" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.drop(axis=1, labels=\"prime\", inplace=True)\n", "df # Column was dropped" ] }, { "cell_type": "markdown", "id": "limiting-notification", "metadata": { "deletable": false, "editable": false }, "source": [ "With this first look at functions, we have reached the end of how to use functions in Python. We will revisit the ideas in this article when we cover defining our own functions.\n", "\n", "## Further Reading\n", "\n", "- [Python built-in functions](https://docs.python.org/3/library/functions.html)\n", "- [Python keywords](https://docs.python.org/3/reference/lexical_analysis.html#keywords)" ] } ], "metadata": { "kernelspec": { "display_name": "Python Data Science", "language": "python", "name": "pyds" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 5 }