# Configuring your computer to use Python for scientific computing {#sec-setting_up_python_computing_environment}

<hr>

## Why Python?

There are plenty of programming languages that are widely used in data science and in scientific computing more generally. Some of these, in addition to Python, are [Matlab](https://www.mathworks.com/products/matlab.html)/[Octave](https://www.gnu.org/software/octave/), [Mathematica](https://www.wolfram.com/mathematica/), [R](https://www.r-project.org), [Julia](http://julialang.org/), [Java](https://www.oracle.com/java/), [JavaScript](https://developer.mozilla.org/en-US/docs/Web/JavaScript), [Rust](https://www.rust-lang.org), and [C++](https://en.wikipedia.org/wiki/C%2B%2B).

I have chosen to use Python. Although I believe language wars are counterproductive and welcome anyone to port the code we use to the language of their choice, I nonetheless feel I should explain this choice.

Python is a flexible programming language that is widely used in many applications. This is in contrast to more domain-specific languages like R and Julia. It is easily extensible, which is in many ways responsible for its breadth of use. We find that there is a decent Python-based tool for many applications we can dream up, certainly in data science. The Python-based tool is often not the _very best_ for the particular task at hand, but it is almost always _pretty good_. Thus, knowing Python is like having a Swiss Army knife; you can wield it to effectively accomplish myriad tasks. Finally, we also find that most students pick it up quickly.

Perhaps most important, specifically for neuroscience applications, is that Python is widely used in machine learning and AI. The development of packages like [TensorFlow](https://www.tensorflow.org), [PyTorch](https://pytorch.org), [JAX](https://docs.jax.dev/en/latest/), [Keras](https://keras.io), and [scikit-learn](https://scikit-learn.org/stable/) has led to very widespread adoption of Python.

## Jupyter notebooks

The materials of this course are constructed from **Jupyter notebooks**. To quote [Jupyter's documentation](https://docs.jupyter.org/en/latest/projects/architecture/content-architecture.html),

>Jupyter Notebook and its flexible interface extends the notebook beyond code to visualization, multimedia, collaboration, and more. In addition to running your code, it stores code and output, together with markdown notes, in an editable document called a notebook.

This allows for executable documents that have code, but also richly formatted text and graphics, enabling the reader to *interact* with the material as they read it.

Specifically, notebooks are composed of **cells**, where each cell contains either executable Python code or text.
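To make the cell structure concrete, here is a minimal sketch that builds a tiny notebook as a plain Python dictionary and serializes it to JSON, which is how `.ipynb` files are stored on disk. The helper name `make_notebook` and the cell contents are made up for illustration, and the fields shown are only the bare minimum of the version 4 notebook format as I understand it. In practice, you let JupyterLab create notebooks for you rather than constructing them by hand.

```python
import json


def make_notebook():
    """Build a minimal notebook with one text cell and one code cell.

    Illustrative sketch of the version 4 notebook layout; the cell
    contents are made up.
    """
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": "# A text cell\nText cells hold formatted notes.",
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": None,  # the cell has not been run yet
        "metadata": {},
        "outputs": [],  # filled in when the cell is run
        "source": "print('Hello from a code cell')",
    }
    return {
        "cells": [markdown_cell, code_cell],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = make_notebook()

# Each cell declares whether it holds text or code
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)

# The same structure, serialized the way an .ipynb file stores it
notebook_json = json.dumps(notebook, indent=1)
```

Because a notebook is just structured text, the code, the notes, and the stored outputs all travel together in a single file.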

While you read the materials, you can read the HTML-rendered versions of the notebooks. To *execute* (and even edit!) code in the notebooks, you will need to run them. There are many options available to run Jupyter notebooks. Here are a few we have found useful.

- [JupyterLab](https://jupyterlab.readthedocs.io/): This is a browser-based interface to Jupyter notebooks and more (including a terminal application, text editor, file manager, etc.). As of March 2025, Chrome, Firefox, Safari, and Edge are supported. I encourage you to run your code on your own machine. I give instructions below on how to do the necessary installations and launch JupyterLab.
- [VSCode](https://code.visualstudio.com): This is an excellent source code editor that supports Jupyter notebooks. Be sure to read [the documentation](https://code.visualstudio.com/docs/datascience/jupyter-notebooks) on how to use Jupyter notebooks in VSCode.
- [Google Colab](https://colab.research.google.com/): Google offers this service to run notebooks in the cloud on their machines. There are a few caveats, though. First, not all packages and updates are available in Colab. Furthermore, not all interactivity that will work natively in Jupyter notebooks works with Colab. If a notebook sits idle for too long, you will be disconnected from Colab. Finally, there is a limit to resources that are available for free, and as of March 2025, that limit is unpublished and can vary. All of the notebooks in the HTML rendering of this book have an "Open in Colab" button at the upper right that allows you to launch the notebook in Colab. This is a quick-and-easy way to execute the book's contents.

## Marimo

[Marimo](https://marimo.io) offers a very nice notebook interface that is a departure from Jupyter notebooks in its structure. The biggest departure is that Marimo notebooks are specifically for Python, as opposed to being language agnostic like Jupyter. As a result, Marimo notebooks can offer many features not seen in Jupyter notebooks (without add-ons). The two most compelling, at least to me, are

- Marimo notebooks are plain `.py` files, which allows for easier version control and simple execution as scripts.
- Marimo notebooks are **reactive**, meaning that the ordering of the cells is irrelevant and the notebook runs all cells that need to be rerun as a result of a change of value of a variable in any given cell.
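To illustrate what reactivity means, here is a toy sketch in plain Python. It is *not* how Marimo is implemented and does not use Marimo at all; the function `rerun_dependents` and the `(inputs, output, function)` representation of a cell are hypothetical constructs for illustration. The idea is that when a variable changes, every cell downstream of it is rerun, in dependency order, regardless of the order in which the cells are listed.

```python
def rerun_dependents(cells, values, changed):
    """Rerun every cell that depends, directly or indirectly, on `changed`.

    Each cell is a tuple (inputs, output, function). Toy model of
    reactivity only; not Marimo's actual machinery.
    """
    # Phase 1: find every variable downstream of the changed one
    stale = {changed}
    grew = True
    while grew:
        grew = False
        for inputs, output, _ in cells:
            if output not in stale and stale.intersection(inputs):
                stale.add(output)
                grew = True

    # Phase 2: recompute stale variables once all their inputs are current
    pending = stale - {changed}
    order = []
    while pending:
        ready = [
            cell for cell in cells
            if cell[1] in pending and not pending.intersection(cell[0])
        ]
        if not ready:
            raise ValueError("Cells have a circular dependency")
        for inputs, output, func in ready:
            values[output] = func(*(values[name] for name in inputs))
            pending.discard(output)
            order.append(output)
    return order


# Cells are deliberately listed "out of order": z depends on y, y on x
cells = [
    (("y",), "z", lambda y: y + 1),
    (("x",), "y", lambda x: 2 * x),
    (("v",), "w", lambda v: 10 * v),  # unrelated to x
]
values = {"x": 1, "y": 2, "z": 3, "v": 5, "w": 50}

# Change x, then let the "notebook" react
values["x"] = 10
order = rerun_dependents(cells, values, "x")
print(order, values)
```

Note that the cell computing `w` is never rerun, since it does not depend on `x`. In a Jupyter notebook, by contrast, you would have to rerun the dependent cells yourself, in the right order.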

In the course, we will use Jupyter notebooks, but you are welcome to play with Marimo notebooks. Once you complete the installation instructions in this notebook, Marimo will be installed.

## Installing Python tools

Prior to embarking on your journey into data analysis, you need to have a functioning Python distribution installed on your computer. We present two methods for installation and package management; [Pixi](https://pixi.sh/) is our preferred tool.

### Option 1: Pixi (preferred)

[Pixi](https://pixi.sh/) is a package management tool that allows installation of packages. Importantly, it does so in a *project-based* way. That is, for each project, you use Pixi to create and manage the packages needed for that project. Our "project" here is our data analysis/statistical inference course.

**Step 1**: Install Pixi. To install Pixi, you need access to the command line. For macOS users, hit `Command-space`, type in "terminal" and open the `Terminal` app. In Windows, open PowerShell by opening the Start Menu, typing "PowerShell" in the search bar, and selecting "Windows PowerShell." I assume you know how to get access to the command line if you are using Linux.

On the command line, do the following.

**macOS or Linux**

    curl -fsSL https://pixi.sh/install.sh | sh

**Windows**

    powershell -ExecutionPolicy ByPass -c "irm -useb https://pixi.sh/install.ps1 | iex"

**Step 2**: Create a directory for your work in the course. You might want to name the directory `huji_stats/`, which is what I have named it. You can do this either with the command line or with your graphical file management program (e.g., `Finder` for macOS).

**Step 3**: Using the command line, navigate to the directory you created. For example, if the directory is `huji_stats/` in your home directory and you are in your home directory, you can do

    cd huji_stats

on the command line.

**Step 4**: Download the requisite Pixi files: [pixi.toml](https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.toml), [pixi.lock](https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.lock). These files need to be stored in the directory you created in step 2. You may download them by right-clicking those links, or by doing the following on the command line.

**macOS or Linux**

    curl -fsSLO https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.toml
    curl -fsSLO https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.lock

**Windows**

    irm -useb https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.toml -OutFile pixi.toml
    irm -useb https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.lock -OutFile pixi.lock

**Step 5**: To be able to use all of the packages, you need to invoke a Pixi shell. To do so, execute the following on the command line.

    pixi shell

You are now good to go! After you are done working, to exit the Pixi shell, hit `Control-D`.

**To do work for this class, you will need to `cd` into the directory you created in step 2 and execute `pixi shell` every time you open a new terminal (or PowerShell) window.**
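If `pixi shell` complains, it can help to confirm that the earlier steps actually took effect. The following is a small sketch, using only the Python standard library, that checks whether `pixi` is on your `PATH` and whether `pixi.toml` and `pixi.lock` are present. The function names are my own invention for illustration, and the script assumes you run it from inside your course directory.

```python
import shutil
from pathlib import Path

# The two files downloaded in step 4
REQUIRED_FILES = ("pixi.toml", "pixi.lock")


def pixi_is_installed():
    """Return True if an executable named `pixi` is found on the PATH."""
    return shutil.which("pixi") is not None


def missing_pixi_files(directory):
    """Return the names of required Pixi files absent from `directory`."""
    directory = Path(directory)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


# Assumes the current working directory is your course directory
project_dir = Path.cwd()
print("pixi on PATH: ", pixi_is_installed())
print("missing files:", missing_pixi_files(project_dir))
```

If `pixi` is not found, revisit step 1 (you may need to open a new terminal window after installing). If files are missing, revisit step 4.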

### Option 2: Conda (Pixi is preferred)

[Conda](https://anaconda.org/anaconda/conda) is a system-level package manager. It differs from Pixi in that it operates system-wide (i.e., not just for a specific project) and uses **environments** to define a set of packages for a given mode of computing, as opposed to specific projects like Pixi. We will use [Miniconda](https://www.anaconda.com/docs/getting-started/miniconda/main) to get access to the Conda package manager.

To download and install Miniconda, do the following.

#### Windows

1. Go to the [Miniconda page](https://docs.anaconda.com/miniconda/#quick-command-line-install) and go to the "Quick command line install" section.
2. Click on the "Windows PowerShell" tab.
3. Copy all of the contents in the gray box (starting with `curl`).
4. Go to the Start menu and search for "PowerShell." Click to open a PowerShell window. Alternatively, you can hit `Windows + R` and type `PowerShell` in the text box.
5. Paste the copied text into the PowerShell window and hit enter.

#### macOS

1. Go to the [Miniconda page](https://docs.anaconda.com/miniconda/#quick-command-line-install) and go to the "Quick command line install" section.
2. Click on the "macOS" tab.
3. Copy all of the contents in the gray box (starting with `mkdir`).
4. Open a Terminal window. You can do this by hitting `Command-space bar`, typing `Terminal`, and hitting enter. Alternatively, the Terminal application is located in the `/System/Applications/Utilities/` folder, which you can navigate to using Finder.
5. Paste the copied text into the Terminal window and hit enter.

#### Linux

1. Go to the [Miniconda page](https://docs.anaconda.com/miniconda/#quick-command-line-install) and go to the "Quick command line install" section.
2. Click on the "Linux" tab.
3. Copy all of the contents in the gray box (starting with `mkdir`).
4. Open a terminal window. I assume you know how to do this if you are using Linux.
5. Paste the copied text into the terminal window and hit enter.

### Setting up a conda environment

I have created a conda environment for use in this course. You can install this environment by executing the following on the command line.

    conda env create -f https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/huji_stats.yml

This will build the environment for you (it may take several minutes). To then activate the environment, enter

    conda activate huji_stats

on the command line. **You will need to activate the environment every time you open a new terminal (or PowerShell) window.**
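If you are ever unsure whether the environment is active, you can ask Python itself. The sketch below reports the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated, along with the path of the running Python interpreter. The function name `active_environment` is hypothetical. I have not specified above whether a Pixi shell sets this variable, so if you are using Pixi, treat a value of `None` as inconclusive and look at the interpreter path instead.

```python
import os
import sys


def active_environment(environ=None):
    """Summarize which Python environment is currently in use."""
    if environ is None:
        environ = os.environ
    return {
        # Set by `conda activate`; None if no conda environment is active
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        # Full path to the Python interpreter running this code
        "python_executable": sys.executable,
        # Root directory of the running Python installation
        "prefix": sys.prefix,
    }


info = active_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

After running `conda activate huji_stats`, the `conda_env` entry should read `huji_stats`.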

## Launching JupyterLab

Once you have invoked a Pixi shell or activated your conda environment, you can launch JupyterLab via your operating system's terminal program (Terminal on macOS and PowerShell on Windows). To do so, enter the following on the command line.

    jupyter lab

You will have an instance of JupyterLab running in your default browser. If you want to specify the browser, you can, for example, type

    jupyter lab --browser=firefox

on the command line.

Alternatively, if you are using VSCode, you can use its menu system to open `.ipynb` files.

## Checking your distribution

Let's now run a quick test to make sure things are working properly. We will make a quick plot that requires some of the scientific libraries we will use.

Launch a Jupyter notebook in JupyterLab. In the first cell (the box next to the `[ ]:` prompt), paste the code below. To run the code, press `Shift+Enter` while the cursor is active inside the cell. You should see a plot that looks like the one below. If you do, you have a functioning Python environment for scientific computing!

    import numpy as np
    import bokeh.io
    import bokeh.models
    import bokeh.plotting

    # Display plots inline in the notebook
    bokeh.io.output_notebook()

    # Generate plotting values
    t = np.linspace(0, 2*np.pi, 200)
    x = 16 * np.sin(t)**3
    y = 13 * np.cos(t) - 5 * np.cos(2*t) - 2 * np.cos(3*t) - np.cos(4*t)

    # Draw the curve and add a centered text label
    p = bokeh.plotting.figure(height=250, width=275)
    p.line(x, y, color='red', line_width=3)
    text = bokeh.models.Label(x=0, y=0, text='HUJI-Stats', text_align='center')
    p.add_layout(text)

    bokeh.io.show(p)
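If the plot does not appear, a reasonable first check is whether the required packages are installed in the environment you are running. Here is a minimal sketch using the standard library's `importlib.metadata`; the function name `installed_versions` is my own, and the package list is simply the three packages reported at the end of this notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["numpy", "bokeh", "jupyterlab"])
for name, found in report.items():
    print(f"{name}: {found if found is not None else 'NOT INSTALLED'}")
```

Any package reported as not installed suggests that you are not in the Pixi shell or conda environment set up above.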

## Computing environment {.unnumbered}

    %load_ext watermark
    %watermark -v -p numpy,bokeh,jupyterlab

    Python implementation: CPython
    Python version       : 3.13.5
    IPython version      : 9.4.0

    numpy     : 2.2.6
    bokeh     : 3.7.3
    jupyterlab: 4.4.5

