"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Python Standard Library has lots of built-in **modules** that contain useful functions and data types for doing specific tasks. You can also use modules from outside the standard library. And you will undoubtedly write your own modules!\n",
"\n",
"A module is contained in a file that ends with `.py`. This file can have **classes**, functions, and other objects.\n",
"\n",
"A **package** contains several related modules that are all grouped together under one name. We will extensively use the [NumPy](http://www.numpy.org), [SciPy](http://www.scipy.org/), [Bokeh](http://bokeh.pydata.org), and biocircuits packages. Because we use it so heavily, the first package we will consider is NumPy. We will revisit NumPy in its own section."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: You want to compute the mean and median of a list of numbers\n",
"\n",
"Say you have a list of numbers and you want to compute the mean. This happens all the time; you repeat a measurement multiple times and you want to compute the mean. We could write a function to do this."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"def mean(values):\n",
"    \"\"\"Compute the mean of a sequence of numbers.\"\"\"\n",
"    return sum(values) / len(values)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And it works as expected."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3.0\n",
"3.275\n"
]
}
],
"source": [
"print(mean([1, 2, 3, 4, 5]))\n",
"print(mean((4.5, 1.2, -1.6, 9.0)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In addition to the mean, we might also want to compute the median, the standard deviation, etc. These seem like really common tasks. Remember my advice: if you want to do something that seems really common, a good programmer (or a team of them) has probably already written something to do that. Means, medians, standard deviations, and lots and lots and lots of other numerical things are included in the **NumPy package**. To get access to it, we have to **import** it."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import numpy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That's it! We now have the `numpy` module available for use. Remember, in Python everything is an object, so if we want to access the functions and attributes available in the `numpy` module, we use dot syntax. In a Jupyter notebook, you can type\n",
"\n",
"    numpy.\n",
"\n",
"(note the dot) and hit tab, and you will see what is available. For NumPy, there is a huge number of options!\n",
"\n",
"So, let's try to use NumPy's `numpy.mean()` function to compute a mean."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3.0\n",
"3.275\n"
]
}
],
"source": [
"print(numpy.mean([1, 2, 3, 4, 5]))\n",
"print(numpy.mean((4.5, 1.2, -1.6, 9.0)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Great! We get the same values! Now, we can use the `numpy.median()` function to compute the median."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3.0\n",
"2.85\n"
]
}
],
"source": [
"print(numpy.median([1, 2, 3, 4, 5]))\n",
"print(numpy.median((4.5, 1.2, -1.6, 9.0)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is nice. It gives the median, including when we have an even number of elements in the sequence of numbers, in which case it automatically interpolates. It is really important to know that it does this interpolation, since if you are not expecting it, it can give unexpected results. So, here is an important piece of advice:\n",
"\n",
"**Always check the doc strings of functions.**\n",
"\n",
"We can access the doc string of the `numpy.median()` function in JupyterLab by typing\n",
"\n",
"    numpy.median?\n",
"\n",
"and looking at the output. An important part of that output:\n",
"\n",
"    Notes\n",
"    -----\n",
"    Given a vector ``V`` of length ``N``, the median of ``V`` is the\n",
"    middle value of a sorted copy of ``V``, ``V_sorted`` - i\n",
"    e., ``V_sorted[(N-1)/2]``, when ``N`` is odd, and the average of the\n",
"    two middle values of ``V_sorted`` when ``N`` is even.\n",
"\n",
"This is where the documentation tells you that the median will be reported as the average of two middle values when the number of elements is even. Note that you could also read the documentation [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.median.html), which is a bit easier to read."
]
},
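{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick check on the doc string note above, here is a minimal sketch that computes a median by hand from a sorted copy of the data. The helper `median_by_hand()` is a hypothetical name introduced only for this illustration (it is not part of NumPy), and it assumes a non-empty sequence of numbers. Its results should agree with the `numpy.median()` output shown earlier."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def median_by_hand(values):\n",
"    \"\"\"Median via a sorted copy; assumes a non-empty sequence.\"\"\"\n",
"    v_sorted = sorted(values)\n",
"    n = len(v_sorted)\n",
"    mid = n // 2\n",
"\n",
"    # Odd N: the single middle value\n",
"    if n % 2 == 1:\n",
"        return v_sorted[mid]\n",
"\n",
"    # Even N: the average of the two middle values\n",
"    return (v_sorted[mid - 1] + v_sorted[mid]) / 2\n",
"\n",
"\n",
"print(median_by_hand([1, 2, 3, 4, 5]))\n",
"print(median_by_hand((4.5, 1.2, -1.6, 9.0)))"
]
},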
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The `as` keyword\n",
"\n",
"We use NumPy all the time. Typing `numpy` over and over again can get annoying. So, it is common practice to use the `as` keyword to import a module with an **alias**. NumPy's alias is traditionally `np`, _and this is the only alias you should ever use for NumPy_."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2.85"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"\n",
"np.median((4.5, 1.2, -1.6, 9.0))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will import this way, though some purists disagree. We will use traditional aliases for major packages like NumPy (`np`) and pandas (`pd`), a package we will encounter later."
]
},
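{
"cell_type": "markdown",
"metadata": {},
"source": [
"An alias does not create a second copy of the module; it is just another name for the same module object. Here is a minimal check, assuming NumPy is installed as it is for this lesson."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy\n",
"import numpy as np\n",
"\n",
"# Both names refer to the very same module object\n",
"print(np is numpy)\n",
"\n",
"# So a function reached through either name is the same function\n",
"print(np.median is numpy.median)"
]
},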
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Third-party packages\n",
"\n",
"Standard Python installations come with the standard library. NumPy and other useful packages are not in the standard library. Outside of the standard library, there are several packages available. Several. Ha! There are currently (March 26, 2023) about 442,000 packages available through the [Python Package Index](https://pypi.python.org/pypi), PyPI. Usually, you can Google what you are trying to do, and there is often a third-party package to help you do it. The most useful (for scientific computing) and thoroughly tested packages and modules are available using `conda`. Others can be installed using `pip`."
]
},
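{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you are not sure whether a package is installed in your current environment, you can ask Python. The sketch below uses `importlib.metadata` from the standard library (available in Python 3.8 and later). The package names in the tuple are just examples taken from this lesson, so swap in whatever you want to check."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from importlib.metadata import PackageNotFoundError, version\n",
"\n",
"# Example package names; replace with the ones you care about\n",
"for package in (\"numpy\", \"bokeh\"):\n",
"    try:\n",
"        print(package, version(package))\n",
"    except PackageNotFoundError:\n",
"        print(package, \"is not installed\")"
]
},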
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Computing environment"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Python implementation: CPython\n",
"Python version : 3.10.9\n",
"IPython version : 8.10.0\n",
"\n",
"numpy : 1.23.5\n",
"jupyterlab: 3.5.3\n",
"\n"
]
}
],
"source": [
"%load_ext watermark\n",
"%watermark -v -p numpy,jupyterlab"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}