{ "cells": [ { "cell_type": "markdown", "id": "ed455b47", "metadata": {}, "source": [ "--- \n", " \n", "\n", "

Department of Data Science

\n", "

Course: Tools and Techniques for Data Science

\n", "\n", "---\n", "

Instructor: Muhammad Arif Butt, Ph.D.

" ] }, { "cell_type": "markdown", "id": "453283b9", "metadata": {}, "source": [ "

Lecture 2.15 (Part - I)

" ] }, { "cell_type": "markdown", "id": "515376c6", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "c041eac7", "metadata": {}, "source": [ "## _Creating_Modules.ipynb_" ] }, { "cell_type": "markdown", "id": "4a466ee0", "metadata": {}, "source": [ " " ] }, { "cell_type": "code", "execution_count": null, "id": "43edc47d", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "597c5507", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "0ef18912", "metadata": {}, "source": [ " " ] }, { "cell_type": "code", "execution_count": null, "id": "17652508", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "8ea00da3", "metadata": {}, "source": [ " " ] }, { "cell_type": "code", "execution_count": null, "id": "a7a3634c", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "447ea2ac", "metadata": {}, "source": [ " " ] }, { "cell_type": "code", "execution_count": null, "id": "bfdef8dc", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "c84d6cd1", "metadata": {}, "source": [ "# Learning agenda of this notebook\n", "Modular programming is a design technique that breaks your code into separate parts, called modules.\n", "\n", "**PART - I**\n", "1. Create a Module of your own\n", "2. Use the newly created module in this notebook file\n", "3. How does Python locate a module?\n", "4. Reloading a module\n", "\n", "**PART - II**\n", "1. What are Python packages?\n", "2. How to create a Python Package?\n", "3. How to import modules from the package?" ] }, { "cell_type": "code", "execution_count": null, "id": "8e9320fc", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "4bf35ccf", "metadata": {}, "source": [ "## 1. 
Create a Module named _`mymodule`_\n", "- Create the `mymodule.py` file as shown below in the current working directory, i.e., the directory in which this IPython Notebook file resides. " ] }, { "cell_type": "markdown", "id": "8ff90697", "metadata": {}, "source": [ "#### mymodule.py\n", "```\n", "AUTHOR = 'Arif Butt'\n", "BATCH = 2021\n", "\n", "def myfactorial(num):\n", "    # code\n", "\n", "\n", "def myindex(numbers, no):\n", "    # code\n", "\n", "def mysort(numbers, inplace=False):\n", "    # code\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "255458e3", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "eddccc76", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "0607638d", "metadata": {}, "source": [ "## 2. Use the Functions of `mymodule`" ] }, { "cell_type": "code", "execution_count": 1, "id": "e4ffc4cd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['AUTHOR', 'BATCH', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'myfactorial', 'myindex', 'mysort']\n" ] } ], "source": [ "import mymodule as mm\n", "print(dir(mm))" ] }, { "cell_type": "code", "execution_count": null, "id": "9ba22c8a", "metadata": {}, "outputs": [], "source": [ "mm.AUTHOR" ] }, { "cell_type": "code", "execution_count": null, "id": "193dfd67", "metadata": {}, "outputs": [], "source": [ "mm.mysort" ] }, { "cell_type": "code", "execution_count": null, "id": "0b43ffe2", "metadata": {}, "outputs": [], "source": [ "mm.__name__" ] }, { "cell_type": "code", "execution_count": null, "id": "7f3ae021", "metadata": {}, "outputs": [], "source": [ "mm.__cached__" ] }, { "cell_type": "markdown", "id": "d12833ed", "metadata": {}, "source": [ ">- `.pyc` files are created by the Python interpreter when a `.py` file is imported; they are stored in the `__pycache__` directory next to the source file.\n", ">- The `.pyc` file contains the **compiled bytecode** of the imported module, so that the \"translation\" from source code to bytecode can be skipped on subsequent imports to speed up startup.\n", ">- If the `.pyc` is older than the corresponding `.py` file, Python automatically recompiles the source and rewrites the `.pyc` on the next import; no manual step is needed.\n", ">- The bytecode in the `.pyc` file is still interpreted. Keep the `.py` file, though: a `.pyc` inside `__pycache__` is ignored when its source file is missing." ] }, { "cell_type": "code", "execution_count": null, "id": "90a96624", "metadata": { "scrolled": true }, "outputs": [], "source": [ "mm.__builtins__" ] }, { "cell_type": "code", "execution_count": null, "id": "56494486", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "b3d3ece1", "metadata": {}, "source": [ "**Let us use the `myfactorial()` function**" ] }, { "cell_type": "code", "execution_count": null, "id": "412fb266", "metadata": {}, "outputs": [], "source": [ "mm.myfactorial(5)" ] }, { "cell_type": "code", "execution_count": null, "id": "d7625c43", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "815aa081", "metadata": {}, "source": [ "**Let us use the `myindex()` function**" ] }, { "cell_type": "code", "execution_count": 2, "id": "49acf41c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list1=[44, 99, -65, 101, 27, 88]\n", "mm.myindex(list1, 101)" ] }, { "cell_type": "code", "execution_count": null, "id": "a6a90f5c", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "d065467a", "metadata": {}, "source": [ "**Let us use the `mysort()` function. 
Remember that the `inplace` argument defaults to False, so the function does not modify the list that is passed; it returns a new sorted list instead**" ] }, { "cell_type": "code", "execution_count": 3, "id": "620727f0", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[-65, 27, 44, 88, 99, 101]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list1=[44, 99, -65, 101, 27, 88]\n", "\n", "rv = mm.mysort(list1)\n", "rv" ] }, { "cell_type": "code", "execution_count": 4, "id": "34b503f3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[44, 99, -65, 101, 27, 88]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list1" ] }, { "cell_type": "code", "execution_count": null, "id": "c7ad2d47", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "26ef1cc7", "metadata": {}, "source": [ "**Let us use the `mysort()` function and pass the `inplace` argument as True, so that it sorts the passed list in place and returns None**" ] }, { "cell_type": "code", "execution_count": 5, "id": "bcc55b97", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "NoneType" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list1=[44, 99, -65, 101, 27, 88]\n", "rv = mm.mysort(list1, inplace=True)\n", "type(rv)" ] }, { "cell_type": "code", "execution_count": 6, "id": "93c736d6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[-65, 27, 44, 88, 99, 101]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list1" ] }, { "cell_type": "code", "execution_count": null, "id": "fbd64da2", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "d880daa8", "metadata": {}, "source": [ "## 3. 
How Does Python Locate a Module?\n", "Let us try to import a module named `arifmodule` that is located in the following directory\n", "```\n", "/Users/arif/Documents/0-DS-522/Demo-Files-Repo/Section-2-Basics-of-Python-Programming/Lec-2.15-Creating-Python-Modules-and-Packages/module-files/pathissue/\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "a8fade70", "metadata": {}, "outputs": [], "source": [ "import arifmodule" ] }, { "cell_type": "code", "execution_count": null, "id": "0e8a95ed", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "c464a2ce", "metadata": {}, "source": [ ">- **First** the Python interpreter looks for the module in the directory containing the input script (for this notebook, the current working directory)\n", ">- Let us see the contents of the current working directory" ] }, { "cell_type": "code", "execution_count": null, "id": "121449d2", "metadata": {}, "outputs": [], "source": [ "import os\n", "os.listdir()" ] }, { "cell_type": "code", "execution_count": null, "id": "95e72d7e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "e340fb2e", "metadata": {}, "source": [ "**So in the current working directory, we do not have the module named `arifmodule`**" ] }, { "cell_type": "code", "execution_count": null, "id": "f95804bc", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "4db1fbe3", "metadata": {}, "source": [ ">- **Second** the Python interpreter checks the modules that are built into the interpreter itself; their names are listed in `sys.builtin_module_names`. Strictly speaking, these are checked before any directory is searched\n", ">- Let us see the list of all built-in modules" ] }, { "cell_type": "code", "execution_count": null, "id": "0e696795", "metadata": {}, "outputs": [], "source": [ "import sys\n", "print(sys.builtin_module_names)" ] }, { "cell_type": "code", "execution_count": null, "id": "6766659a", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "aa0497f6", "metadata": {}, "source": [ "**So `arifmodule` is not in the list of built-in modules**" ] }, { "cell_type": "code", "execution_count": null, "id": "9df8de64", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "d105875b", "metadata": {}, "source": [ ">- **Third** If there is no built-in module of that name, Python looks into the list of directories stored in `sys.path`. This is a Python list (not an environment variable) that is initialised from the script directory, the `PYTHONPATH` environment variable, and installation-dependent defaults\n", ">- Let us see the directories listed in `sys.path`" ] }, { "cell_type": "code", "execution_count": null, "id": "fe7fc091", "metadata": {}, "outputs": [], "source": [ "import sys\n", "sys.path" ] }, { "cell_type": "code", "execution_count": null, "id": "82f275ad", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "07de67d4", "metadata": {}, "source": [ "**The module `arifmodule` is not in any of the above locations**" ] }, { "cell_type": "markdown", "id": "ba34f475", "metadata": {}, "source": [ ">- Let us now add the path of the directory containing `arifmodule` to the `sys.path` list" ] }, { "cell_type": "code", "execution_count": null, "id": "cdaa6259", "metadata": { "scrolled": true }, "outputs": [], "source": [ "import sys\n", "sys.path.append('/Users/arif/Documents/0-DS-522/Demo-Files-Repo/Section-2-Basics-of-Python-Programming/Lec-2.15-Creating-Python-Modules-and-Packages/module-files/pathissue/')" ] }, { "cell_type": "markdown", "id": "91de54cc", "metadata": {}, "source": [ "**Now let us try to import `arifmodule` again**" ] }, { "cell_type": "code", "execution_count": null, "id": "dce49f56", "metadata": {}, "outputs": [], "source": [ "import arifmodule" ] }, { "cell_type": "markdown", "id": "1c3334d4", "metadata": {}, "source": [ "**Success**" ] }, { "cell_type": "code", "execution_count": null, "id": "30e7e497", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "c100e05f", "metadata": {}, "source": [ "## 4. 
Reload a module\n", "- The Python interpreter imports a module only once during a session and caches it in `sys.modules`; a repeated `import` statement simply reuses the cached module. This makes things more efficient.\n", "- Consider `demomodule.py` located in the same directory in which this .ipynb file is located." ] }, { "cell_type": "markdown", "id": "2685f55a", "metadata": {}, "source": [ "#### demomodule.py\n", "```\n", "print(\"This is the demomodule having only one print statement. It is located in the same directory in which this .ipynb file is...\")\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "8fc71033", "metadata": {}, "outputs": [], "source": [ "# import the module for the first time\n", "import demomodule" ] }, { "cell_type": "markdown", "id": "51273d45", "metadata": {}, "source": [ "**Let us import this module again**" ] }, { "cell_type": "code", "execution_count": null, "id": "08d97c4c", "metadata": {}, "outputs": [], "source": [ "import demomodule\n", "\n", "# note that nothing is printed this time: Python imports the module only once" ] }, { "cell_type": "code", "execution_count": null, "id": "f90b47c4", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "da52a313", "metadata": {}, "source": [ "**What if we have made some changes in this module and want to reload it? There are two ways to do that:**\n", "1. Restart the Interpreter\n", "2. Call the `importlib.reload()` function (the older `imp` module is deprecated and was removed in Python 3.12)" ] }, { "cell_type": "code", "execution_count": null, "id": "55a6631b", "metadata": {}, "outputs": [], "source": [ "import importlib\n", "importlib.reload(demomodule)" ] }, { "cell_type": "code", "execution_count": null, "id": "f5e8548e", "metadata": {}, "outputs": [], "source": [ "import demomodule" ] }, { "cell_type": "code", "execution_count": null, "id": "4fcc3000", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "36ac0776", "metadata": {}, "source": [ "# Bonus: `main()` Function in Python\n", "- In many programming languages, the `main()` function serves as a starting point for the execution of a program. 
It is executed automatically every time the program is run.\n", "- In Python, it is not necessary to define `main()`, because the Python interpreter executes the statements of a script file from top to bottom and does not call any function automatically.\n", "- However, having a defined starting point for the execution of your Python program makes it easier to understand how the program works." ] }, { "cell_type": "markdown", "id": "0981e83c", "metadata": {}, "source": [ "### The Python `__name__` Variable and Python Execution Modes\n", "- In Python, a `.py` file may be executed as a script or may be imported as a module in another script.\n", "- Python has a special built-in variable, called `__name__`, that lets the code detect whether the file is being run as a script or imported as a module\n", "    - When you run your script, the `__name__` variable equals `__main__`\n", "    - When you import the file as a module, the `__name__` variable contains the module name, i.e., the file name without the `.py` extension\n", "- Consider the following file named `myscript.py`: \n", "```\n", "def myFunction():\n", "    print('The value of __name__ is ' + __name__)\n", "def main():\n", "    myFunction()\n", "if __name__ == '__main__':\n", "    main()\n", "```" ] }, { "cell_type": "markdown", "id": "cbc6d6f0", "metadata": {}, "source": [ "### Scenario 1: Execute `myscript.py` as a Python Script" ] }, { "cell_type": "code", "execution_count": null, "id": "622f945e", "metadata": {}, "outputs": [], "source": [ "!cat myscript.py" ] }, { "cell_type": "code", "execution_count": null, "id": "25c76f17", "metadata": {}, "outputs": [], "source": [ "%run myscript.py" ] }, { "cell_type": "markdown", "id": "e0dd0256", "metadata": {}, "source": [ ">**This shows that when we run the `myscript.py` file as a script, the `__name__` variable contains `__main__`, so the condition evaluates to True and `main()` is called, which in turn calls `myFunction()`, which executes the print statement.**" ] }, { "cell_type": "markdown", "id": "d6609474", "metadata": {}, "source": [ "### Scenario 2: Import `myscript.py` in another script as a Python Module" ] }, { "cell_type": "code", "execution_count": null, "id": "6e33f51e", "metadata": {}, "outputs": [], "source": [ "!cat myscript.py" ] }, { "cell_type": "code", "execution_count": null, "id": "7be76e0d", "metadata": {}, "outputs": [], "source": [ "import myscript as ms" ] }, { "cell_type": "markdown", "id": "6830d923", "metadata": {}, "source": [ ">**This shows that when we import the `myscript.py` file as a module, the `__name__` variable does NOT contain `__main__`, so the condition evaluates to False, `main()` is NOT called, and nothing is printed on screen.**\n", "\n", "**Now that `myscript.py` has been imported, we can call its functions. Let us call `myFunction()`**" ] }, { "cell_type": "code", "execution_count": null, "id": "2d4081dd", "metadata": {}, "outputs": [], "source": [ "ms.myFunction()" ] }, { "cell_type": "markdown", "id": "59adf27f", "metadata": {}, "source": [ "**This shows that when we import the `myscript.py` file as a module, the `__name__` variable contains the module name, `myscript`**" ] }, { "cell_type": "code", "execution_count": null, "id": "ffaa6530", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 5 }