# Jupyter/TensorFlow Installation and Setup


## Anaconda Setup

First, [download and install Anaconda](https://www.continuum.io/downloads).

This step is slow: the full Anaconda distribution is over 2 GB, so the download takes a while, and the installation script isn't fast either.

Once Anaconda is installed, though, the rest only takes about 15 minutes.

## Python Package Installation

Next, create a new Anaconda environment:

 conda create -n tf3.5 python=3.5

Activate the environment "tf3.5":

 source activate tf3.5
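
To confirm the activation worked, one option is to ask Python itself which interpreter and environment it is running in. This is a minimal sketch: `describe_environment` is a hypothetical helper name, and it relies on conda setting the `CONDA_DEFAULT_ENV` variable when an environment is activated.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the interpreter running this code."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        # None here means no conda environment appears to be active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in sorted(describe_environment().items()):
    print("{:>15}: {}".format(key, value))
```

If everything is set up as described above, the executable path should point inside the "tf3.5" environment directory and the Python version should be 3.5.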

Install things that will be useful for data analysis in general:

 # The below will also install scipy, matplotlib, numpy and a variety of other things
 conda install scikit-learn jupyter pandas seaborn plotly

[Install Tensorflow](https://www.tensorflow.org/install/):

 # Note that if you want to use the GPU-enabled version of TF, you should deal with the
 # things mentioned below in the "Using GPUs" section FIRST (i.e. before running this command)
 pip install tensorflow-gpu
 
 # OR if you don't want to use the GPU enabled version (which can be a pain) just run this:
 # pip install tensorflow

And that's it. To check whether a package is installed, you can run something like this:

 pip freeze | grep tensorflow
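
The same check can be done from inside Python, which is handy in a notebook. The sketch below uses `importlib.util.find_spec` from the standard library; `is_installed` is a hypothetical helper name. Note that it takes the import name, which can differ from the pip package name (for example, scikit-learn is imported as `sklearn`).

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in ("tensorflow", "numpy", "pandas", "sklearn"):
    status = "installed" if is_installed(name) else "missing"
    print("{:<12} {}".format(name, status))
```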

Also, if you ever want to delete an environment:
 
 conda env remove -n tf3.5 # Or replace "tf3.5" with whatever you called your environment
 # Deleting the directory by hand also works: rm -rf ~/anaconda/envs/tf3.5

## Using GPUs

If you want to use your GPU for running TensorFlow graphs, at least on macOS, there are a few things that you have to do first:

- Figure out if your machine has a CUDA-enabled NVIDIA graphics card (i.e. GPU)
 - This is probably true if you have a 2011 or later MacBook Pro with a display larger than 13"
 - Older Macs have AMD GPUs that are not CUDA compatible
 - Some newer Macs only have an integrated Intel graphics card (that is also not CUDA compatible)
- Install CUDA Toolkit
- Install cuDNN

See here for some more details:

https://www.tensorflow.org/install/install_mac#requirements_to_run_tensorflow_with_gpu_support
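
After installing the CUDA Toolkit, a quick way to see whether it is discoverable is to look for the `nvcc` compiler on your PATH and for the toolkit directory. This is only a rough sketch: `find_cuda_toolkit` is a hypothetical helper, and `/usr/local/cuda` is assumed as the default install location (the same path used for `CUDA_HOME` later in this document). It does not check for cuDNN.

```python
import os
import shutil


def find_cuda_toolkit(default_home="/usr/local/cuda"):
    """Report where (or whether) the CUDA Toolkit appears to be installed."""
    # Prefer CUDA_HOME if it is set; otherwise fall back to the assumed default.
    cuda_home = os.environ.get("CUDA_HOME", default_home)
    return {
        "nvcc": shutil.which("nvcc"),  # None if nvcc is not on PATH
        "cuda_home": cuda_home,
        "cuda_home_exists": os.path.isdir(cuda_home),
    }


for key, value in sorted(find_cuda_toolkit().items()):
    print("{:>17}: {}".format(key, value))
```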


## Creating a new Jupyter Notebook


Create a new folder to hold the notebook (it's a good idea to do this within a git repo), then start Jupyter from that folder:

 > source activate tf3.5
 (tf3.5)> mkdir /tmp/test_notebooks
 (tf3.5)> cd /tmp/test_notebooks/
 (tf3.5)> jupyter notebook
 [I 08:00:15.291 NotebookApp] Serving notebooks from local directory: /private/tmp/test_notebooks
 [I 08:00:15.291 NotebookApp] 0 active kernels
 [I 08:00:15.291 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/?token=846ffbfdefd9ee084d60977e8d49ae1e7b1809154de2936d
 [I 08:00:15.291 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
 
Now your browser should automatically open up and show the Jupyter file browser for that folder, where you can create a new notebook to start working in.

## Some Helpful Tips

- On GPUs:

When you launch a notebook and the tensorflow GPU setup is all working correctly, you should see some output like this in the console where you ran "jupyter notebook":
 
 I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
 name: GeForce GT 750M
 major: 3 minor: 0 memoryClockRate (GHz) 0.9255
 pciBusID 0000:01:00.0
 Total memory: 2.00GiB
 Free memory: 76.35MiB

In the new notebook, you can test the TensorFlow installation by running "import tensorflow as tf".
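
A slightly more informative version of that test is sketched below. It catches the import failure instead of raising, and reports the installed version when the import succeeds. `check_tensorflow` is a hypothetical helper name, not part of TensorFlow.

```python
def check_tensorflow():
    """Try to import TensorFlow and return a one-line status message."""
    try:
        import tensorflow as tf
    except ImportError as exc:
        # Missing package or missing shared libraries both end up here.
        return "tensorflow import failed: {}".format(exc)
    return "tensorflow {} imported OK".format(tf.__version__)


print(check_tensorflow())
```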

In the past, this import gave me errors that were ultimately fixed by setting these environment variables so that the necessary CUDA libraries were reachable:

 export CUDA_HOME=/usr/local/cuda
 export DYLD_LIBRARY_PATH="$DYLD_LIBRARY_PATH:$CUDA_HOME/lib"
 export PATH="$CUDA_HOME/bin:$PATH"
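
To check that those variables are actually visible to the Python process running your notebook, something like the following can be run in a cell. It is a sketch under the assumption that the three exports above are the ones you need; `cuda_env_report` is a hypothetical helper name.

```python
import os


def cuda_env_report(environ=None):
    """Check that CUDA_HOME is set and its lib/bin dirs are on the search paths."""
    env = os.environ if environ is None else environ
    cuda_home = env.get("CUDA_HOME")

    def entries(var):
        # Split a PATH-style variable into its non-empty entries.
        return [p for p in env.get(var, "").split(os.pathsep) if p]

    report = {"CUDA_HOME set": bool(cuda_home)}
    if cuda_home:
        lib_dir = os.path.join(cuda_home, "lib")
        bin_dir = os.path.join(cuda_home, "bin")
        report["lib on DYLD_LIBRARY_PATH"] = lib_dir in entries("DYLD_LIBRARY_PATH")
        report["bin on PATH"] = bin_dir in entries("PATH")
    return report


for check, ok in sorted(cuda_env_report().items()):
    print("{:<26} {}".format(check, "yes" if ok else "NO"))
```

Passing a plain dictionary as `environ` lets you try the function without touching your real environment.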

- On common notebook imports:

I always thought this was surprisingly difficult to google, but if you ever get tired of running the same import commands in every single notebook, you can create your own "initialization scripts" by doing something like this:

First, create a file in the Anaconda environment like this:

 vi ~/anaconda/envs/tf3.5/lib/python3.5/site-packages/local.pth

Add a single line to this file indicating a directory that will contain python scripts (or multiple lines for multiple directories). For example, I use something like:

 /Users/eczech/repos/python/notebooks/startup_scripts

Within this directory, I have small files with boilerplate code like:


 cat /Users/eczech/repos/python/notebooks/startup_scripts/ipy_startup.py
 
 import os
 import sys
 import numpy as np
 import pandas as pd
 import matplotlib.pyplot as plt

Now in a notebook, you can run "%run -m ipy_startup" and the lines above will execute from a single line of code, rather than having to run all 5 of those lines every single time.
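
The reason this works is that Python reads every `.pth` file in site-packages at startup and appends each directory listed there to `sys.path`, which makes the scripts importable as modules. The sketch below reproduces that mechanism in temporary directories so it can be tried without touching a real environment. The directory and module names (`fake_site_packages`, `demo_startup`) are made up for the demonstration, and `site.addsitedir` is called by hand to stand in for what happens automatically at interpreter startup.

```python
import importlib
import os
import site
import sys
import tempfile

# Hypothetical stand-ins for site-packages and the startup_scripts directory.
fake_site_packages = tempfile.mkdtemp()
scripts_dir = tempfile.mkdtemp()

# A tiny startup module, analogous to ipy_startup.py.
with open(os.path.join(scripts_dir, "demo_startup.py"), "w") as f:
    f.write("import os\nimport sys\nLOADED = True\n")

# The .pth file holds one directory per line.
with open(os.path.join(fake_site_packages, "local.pth"), "w") as f:
    f.write(scripts_dir + "\n")

# Process the .pth file; Python does this itself for the real site-packages.
site.addsitedir(fake_site_packages)

# The scripts directory is now on sys.path, so the module can be imported.
demo_startup = importlib.import_module("demo_startup")
print("module loaded from:", demo_startup.__file__)
```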