# Python modules

---------

Notebooks are great (and we'll continue to use them in class as a teaching tool!), but they make it hard to build large, sophisticated software: cells can be run out of order, and the notebook format adds overhead. This is where `.py` files come in - separate files containing Python code.

Most of this section will be done in a text editor (either on the command line or in a graphical program like VS Code or Sublime).

--------

## Python files

In this directory, create a new file `celebrate.py` and add some Python to it:

```python
print("Let's have a party!")
```

This new Python file is called a *module*. A module is a file containing Python definitions and statements. The file name is the module name with the suffix `.py` appended - so this is the "celebrate" module.

In [None]:
# Try it!

From the terminal, you can now run the module:

```bash
$ python celebrate.py
```

In [None]:
# Try it!

When you run the `python` command with a `.py` file as its argument, the Python interpreter executes the statements in that file from top to bottom.

--------

## Python + Jupyter

Now if you come back to Jupyter and look at all the files in this directory (you may need to refresh the page), you'll see that your new `celebrate.py` file has appeared. If you click on it, you can edit it directly in Jupyter, although a graphical code editor like VS Code or Sublime will be more effective.

In [None]:
# Try it!

You can also import your new module into a Jupyter notebook, using a regular import statement with the name of the module (without the `.py` suffix).

In [None]:
# Import the module here


In your text editor, try editing the file - update the message to say `"Let's have a party and eat cake!"`. Now try rerunning the previous import command.

In [None]:
# Try it!

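Notice that nothing happened when you reran the import. Python caches a module the first time it is imported, so for the rest of the kernel session a repeated `import` statement reuses the cached copy instead of rerunning the file. There are two ways to reload a module after you've edited it: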

  1. Restart the kernel (Kernel > Restart)
  2. Or use the IPython `autoreload` extension to reload edited modules automatically. It has several modes that you can look up if you want to go further; the setting below reloads all modules before each cell runs:

    ```
    %load_ext autoreload
    %autoreload 2
    ```

In [None]:
# Try it!
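
Besides the two options above, the standard library's `importlib.reload` can re-execute a single module on demand. The sketch below is only an illustration: it writes a throwaway module to a temporary directory so that it runs anywhere, and the module name `party_demo` and its `MESSAGE` variable are made up for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module (hypothetical name: party_demo) in a temp directory.
workdir = Path(tempfile.mkdtemp())
module_file = workdir / "party_demo.py"
module_file.write_text('MESSAGE = "Let\'s have a party!"\n')

# Make the temp directory importable, then import the module once.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
import party_demo

first = party_demo.MESSAGE

# Edit the file on disk, as you would in your text editor.
module_file.write_text('MESSAGE = "Let\'s have a party and eat cake!"\n')

# A second import statement does nothing new: the cached module is reused.
import party_demo

after_reimport = party_demo.MESSAGE

# importlib.reload re-executes the module's source, picking up the edit.
importlib.reload(party_demo)
after_reload = party_demo.MESSAGE

print(first)
print(after_reimport)
print(after_reload)
```

In a notebook you would skip the temporary-file setup and call `importlib.reload(celebrate)` after editing `celebrate.py`.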

--------

## Functions in modules

Usually you'll want to do more than just execute statements in a module - this is where functions come in.

Edit your `celebrate.py` file to add a `cake` function that prints `Let's have some cake!`.

In [None]:
# Try it!

Now try rerunning the module from the command line:

```bash
$ python celebrate.py
```

In [None]:
# Try it!

It didn't print anything about cake! Why not?

Try it in the notebook too - try re-importing the module.

In [None]:
import celebrate

Still no cake. Defining a function does not run it: the body only executes when the function is called.

But now, we can use our `celebrate` module and `cake` function in our code!
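
As a rough sketch of what that looks like, the block below builds a stand-in `celebrate.py` in a temporary directory (its contents are an assumption about what your file looks like at this point), imports it, and then calls the function through the module name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for the celebrate.py you edited by hand (assumed contents).
source = '''\
print("Let's have a party!")

def cake():
    print("Let's have some cake!")
'''

# Write the stand-in file somewhere importable.
workdir = Path(tempfile.mkdtemp())
(workdir / "celebrate.py").write_text(source)
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import celebrate  # runs the top-level print, but only *defines* cake

celebrate.cake()  # the function body runs only when it is called
```

In your own notebook, the last two lines are all you need, since `celebrate.py` already sits next to the notebook.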

This is just like how we used math, numpy, and pandas last week. You can even give a shortened name if you like, like we did for pandas:

In [None]:
import celebrate as cb

However, this can be confusing unless it is a universally accepted naming convention (like `pd` for pandas). Stick with the full module name most of the time.
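
To see that a short name is only a second label for the same module, here is a small sketch using the standard library's `math` module (the alias `m` is chosen purely for illustration and is not a common convention):

```python
import math
import math as m  # `m` is just another name bound to the same module object

same_module = m is math

# Both names reach the same attributes.
area = m.pi * 2 ** 2  # identical to math.pi * 2 ** 2

print(same_module)
print(area == math.pi * 2 ** 2)
```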

Note that there's an alternative import structure that involves the asterisk character:

  ```python
  from celebrate import *
  ```

We'll talk more about that in a later lecture, but for now, **avoid this construction!** Stick with importing a module plainly (`import celebrate`) or with a short name (`import celebrate as cb`).

--------

## Python top-level environment

Sometimes, we want a module to provide both *reusable functionality* (i.e., functions) and *an actual result* (i.e., statements that do something). You could make two modules, one with the functions and one that is a regular script that does something, but people are lazy and it's common to do it all in one file, at least for simple to moderately complex code.

What we want:

* When someone imports the module with `import celebrate`, they have access to the functions we've defined, but any computational code (i.e., the `Let's have a party!` print) does NOT run.
* When someone runs the module directly (i.e., on the command line with `python celebrate.py`), it SHOULD execute the print statement.

The way we do this is with a special `if` statement that wraps our computational code:

```python
if __name__ == "__main__":
    # do stuff
```

In [None]:
# Try it in celebrate.py!
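
One way `celebrate.py` could look after this change, assuming it still contains the `cake` function from the previous section (your version may differ):

```python
# celebrate.py: a possible layout, not the only one

def cake():
    # Reusable functionality: available to anyone who imports the module.
    print("Let's have some cake!")


if __name__ == "__main__":
    # Computational code: runs only when the file is executed directly,
    # for example with `python celebrate.py`.
    print("Let's have a party!")
```

The function definition sits outside the guard so that `import celebrate` still makes `celebrate.cake` available.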

Now if you try to import the file into the Jupyter notebook, you won't see the print statement:

In [None]:
import celebrate

But if you run the module from the command line, it WILL print:

In [None]:
# Try running `python celebrate.py` from the command line

Lots of programming languages have a "main" construction like this.

What is actually going on?

When a Python module is imported, the variable `__name__` is automatically set to the module's name. Usually, this is the name of the Python file itself without the `.py` extension:

In [None]:
celebrate.__name__

However, if the module is executed in the top-level code environment, its `__name__` is set to the string `__main__`. “Top-level code” is the first user-specified Python module that starts running. It’s “top-level” because it imports all other modules that the program needs. Sometimes “top-level code” is called an entry point to the application.

Your Jupyter notebook also has a `__name__` variable. Because the notebook is the top-level code here, check what value it holds:

In [None]:
__name__

When you run the module with `python celebrate.py`, however, the `__name__` for the `celebrate` module will be `__main__` instead, because that's the Python code that you started with!
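
The difference can be observed directly. The sketch below uses a made-up module, `name_demo`, that records and prints its own `__name__`; it is first imported and then run as a script in a separate Python process. Using `subprocess` here is just a way to mimic typing `python name_demo.py` at the command line.

```python
import importlib
import subprocess
import sys
import tempfile
from pathlib import Path

# Throwaway module (hypothetical name: name_demo) that reports its own __name__.
workdir = Path(tempfile.mkdtemp())
module_file = workdir / "name_demo.py"
module_file.write_text("SEEN_NAME = __name__\nprint(SEEN_NAME)\n")

# Case 1: import it. __name__ is the module's name.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
import name_demo

imported_name = name_demo.SEEN_NAME

# Case 2: run it as a script in a new interpreter. __name__ is "__main__".
result = subprocess.run(
    [sys.executable, str(module_file)],
    capture_output=True,
    text=True,
    check=True,
)
script_name = result.stdout.strip()

print("imported:", imported_name)
print("run as script:", script_name)
```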

Lastly, if your main block is getting really big or complicated, a common way to structure the code is with a specific `main` function that you call in the main block:

In [None]:
def main():
    # do stuff
    pass

if __name__ == "__main__":
    main()

--------

## Exercise

1. Move the three functions that you wrote in the previous breakout section on functions into a `.py` module file.
1. Add a `main` function and an `if __name__ == "__main__":` block that calls each function twice with different arguments.
1. Execute your module from the command line.
1. Import the module into a Jupyter notebook.
1. Now try moving the main function and block to a different `.py` file. Use an import to access the functions.