
<center>
<img class="logo" src="images/anaconda_logo_web.png" height="20%" width="20%" align="center"/>
<h1>Building and Distributing Python Software with Conda</h1>
<br/><br/>
<h2>Jonathan J. Helmus
</h2>
<br/><br/>
<h3>DePy 2016</h3>
<h3>
Chicago, IL</h3>
<h3>May 7, 2016</h3>
</center>


### Abstract
    
Conda is a cross-platform package management system widely used in the scientific and data science Python communities. Conda can be used to package and distribute software written in any language but has first-class support for Python packages. This talk will briefly cover how to use conda to install and manage data science packages, as well as how conda can be used to create isolated computing environments. The main focus of the talk will be an in-depth look at how to easily and reproducibly create conda packages for your own Python software, and the options for sharing these packages with others. Finally, the talk will explore combining a collection of conda packages into custom, cross-platform, installable conda-based Python distributions.

## About Me

<img src="images/argonne_logo.png" height="25%" width="25%" align="right"/>

* Scientist and software developer at Argonne National Laboratory
* User of the SciPy stack and PyData tools
    * NumPy, SciPy, matplotlib, Jupyter, pandas, ...
* Contributor to a number of open source projects
* conda enthusiast
<img src="images/pydatalogo.png" height="30%" width="30%" align="right"/>


## The Problem

A number of **powerful** Python data science libraries exist.  Installing and managing these libraries can be **difficult and time-consuming**.


* Traditional **package managers** (apt-get, brew, etc.)
    * out of date or missing packages
    * operating system specific

<img src="images/build_from_source_meme.jpg" height="40%" width="40%" align="right"/>  

* **pip**
    * often builds from source
    * limited to Python packages
<br/><br/> 
* Install everything from **scratch**
    * works, but really?!



## [conda](http://conda.pydata.org/)

<img src="images/keep-calm-and-conda-install.png" height="40%" width="40%" align="right"/>  
* Cross-platform **package** and **environment** management system 
* Created by Continuum Analytics
* Open source, BSD license
* Created for Python programs but can package and manage any software
* Does not require administrator privileges
* Available by installing **Anaconda** or **Miniconda** from Continuum.

## conda : the package manager

Conda is a cross-platform package manager that installs binary conda packages.

In [None]:
# Before talk
conda create -n depy python=3.5
source activate depy

# In terminal during talk

python
>>> import pandas   # No module named pandas...what!

conda list   # not much installed
conda install pandas
conda install matplotlib ipython
ipython 
>>> import pandas
>>> pandas.__version__   # 0.18.1

conda list
conda list pandas
conda remove pandas
conda install pandas=0.16
python -c "import pandas; print(pandas.__version__)" # 0.16.2

conda update pandas
python -c "import pandas; print(pandas.__version__)" # 0.18.1

# lots of other commands
conda --help


* **conda install** : Install a package
* **conda remove** : Remove a package
* **conda update** : Update a package
* **conda list** : List packages installed

## conda : the environment manager

Conda can create isolated environments that have their own 
set of installed and managed packages.  

In [None]:
source deactivate

# say we want to switch easily between pandas 0.16 and 0.18: make separate environments
conda create

conda create -n depy_pandas16 pandas=0.16 python=2.7  # also want to use Python 2.7
source activate depy_pandas16
python  # notice that we have python 2.7
>>> print "Hi DePy"
>>> import pandas
>>> pandas.__version__
0.16.2

source deactivate

conda create -n depy_pandas18 pandas=0.18 python # defaults to latest Python, 3.5
source activate depy_pandas18
python
>>> import pandas
>>> pandas.__version__

source deactivate

* **conda create** : Create a new conda environment
* **source activate** : Activate a conda environment
* **source deactivate** : Deactivate the current conda environment

Packages are hard linked into the environment to save disk space.
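
To make the disk-space point concrete, here is a minimal sketch of what hard linking means at the file-system level. It does not call conda at all; the `pkgs/` and `envs/` directories and the `examplepkg` name are hypothetical stand-ins for a package cache and an environment.

```shell
# Scratch directory standing in for a package cache and one environment
# (hypothetical layout, for illustration only).
workdir="$(mktemp -d)"
mkdir -p "$workdir/pkgs/examplepkg" "$workdir/envs/depy"
echo "print('hello')" > "$workdir/pkgs/examplepkg/module.py"

# Hard link the cached file into the environment: no file data is copied.
ln "$workdir/pkgs/examplepkg/module.py" "$workdir/envs/depy/module.py"

# Both paths report the same inode number, so they are one file on disk.
cache_inode="$(ls -i "$workdir/pkgs/examplepkg/module.py" | awk '{print $1}')"
env_inode="$(ls -i "$workdir/envs/depy/module.py" | awk '{print $1}')"
echo "cache inode: $cache_inode"
echo "env inode:   $env_inode"
```

Because both names point at the same data, a second environment that links the same file adds almost nothing to disk usage.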

## Finding conda packages

Conda can search for packages from the repository provided by Continuum as well as packages created by users and shared on the [Anaconda cloud](https://anaconda.org).

In [None]:
conda search scikit-learn
conda search tensorflow

anaconda search tensorflow
anaconda show jjhelmus/tensorflow

# Search on Anaconda.org


* **conda search** : Search for packages in the default repo
* **anaconda search** : Search Anaconda.org for packages

## Creating and sharing conda packages

Conda packages can be built from a recipe and shared on Anaconda.org.

In [None]:
cd recipe/nmrglue

cat meta.yaml
cat build.sh
cat bld.bat

cd ..
conda build nmrglue
<lots of text>

anaconda upload ...

cd from_skeleton
conda skeleton pypi nmrglue


* **conda build** : Build a conda package from a recipe
* **anaconda upload** : Upload a package to Anaconda.org
* **conda skeleton** : Generate a boilerplate recipe from PyPI
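
For readers who have not seen a recipe before, the sketch below writes a minimal one into a scratch directory. It is not the nmrglue recipe from the demo: the package name `examplepkg`, its version, the `example.com` URLs, and the dependency list are all placeholders chosen for illustration.

```shell
# Write a minimal, hypothetical recipe into a scratch directory.
recipe_dir="$(mktemp -d)/examplepkg"
mkdir -p "$recipe_dir"

# meta.yaml: package metadata, source location, and dependencies.
cat > "$recipe_dir/meta.yaml" <<'EOF'
package:
  name: examplepkg
  version: "0.1.0"

source:
  url: https://example.com/examplepkg-0.1.0.tar.gz  # placeholder URL

build:
  number: 0

requirements:
  build:
    - python
    - setuptools
  run:
    - python
    - numpy

test:
  imports:
    - examplepkg

about:
  home: https://example.com/examplepkg
  license: BSD
  summary: Hypothetical package used to illustrate a recipe
EOF

# build.sh: the build script run on Linux and OS X.
cat > "$recipe_dir/build.sh" <<'EOF'
#!/bin/bash
$PYTHON setup.py install
EOF

ls "$recipe_dir"
# With conda-build installed, the recipe would be built with:
#   conda build "$recipe_dir"
```

The quoted `'EOF'` delimiter keeps `$PYTHON` literal in `build.sh`, so it is expanded by conda-build at build time rather than by the shell that writes the file.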

## [conda-forge](https://conda-forge.github.io/)

A community-led collection of recipes, build infrastructure and distributions for the conda package manager.

* reproducible method for building packages on all platforms
* recipes are submitted to the [**staged-recipes** repo](https://github.com/conda-forge/staged-recipes)
* once working, a **feedstock** repository is created
* bug fixes and new versions done in the **feedstock** repo
* packages are uploaded to the [**conda-forge channel**](https://anaconda.org/conda-forge)
<center>
<img src="images/condaforge_logo.png" height="40%" width="40%" align="center"/>
</center> 
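
One way to consume packages from the conda-forge channel is to list it in a conda configuration file. The sketch below writes such a file into a scratch directory rather than touching a real `~/.condarc`; the scratch path is an assumption made so the example is safe to run.

```shell
# Write an example conda configuration file to a scratch location
# (a real configuration normally lives at ~/.condarc).
config_dir="$(mktemp -d)"
cat > "$config_dir/condarc" <<'EOF'
channels:
  - conda-forge
  - defaults
EOF

cat "$config_dir/condarc"
# A single install can also name the channel explicitly:
#   conda install -c conda-forge <package>
```

Channels are searched in the order listed, so placing `conda-forge` first gives its packages priority over the default repository.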

In [None]:
# Show conda-forge webpage
# Show staged-recipes repo

cd ..
cd staged-recipes
cd recipes
ls
cat meta.yaml

git checkout -b nmrglue
git add .
git commit -m "Add nmrglue recipe"
git pull-request

# Show staged-recipe repo
# Show Py-ART repo


## (conda) constructor

[constructor](https://github.com/conda/constructor) is a tool for creating an installer from a collection of conda packages.

* open source software by Continuum Analytics
* install using `conda install constructor`
* Installer settings are specified in a `construct.yaml` file
* cross-platform, create installers for Linux, OS X, and Windows
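
As a rough illustration of what a `construct.yaml` file can look like, the sketch below writes one into a scratch directory. The installer name `ExampleDist`, the version, and the package list are hypothetical; the conda-forge channel URL is used here only because it ties in with the previous section.

```shell
# Write a minimal, hypothetical installer specification.
installer_dir="$(mktemp -d)"
cat > "$installer_dir/construct.yaml" <<'EOF'
name: ExampleDist
version: 0.1.0

channels:
  - https://conda.anaconda.org/conda-forge/

specs:
  - python
  - conda
  - numpy
EOF

cat "$installer_dir/construct.yaml"
# With constructor installed, the installer would be built with:
#   constructor "$installer_dir"
```

Including `conda` in `specs` means the installed distribution can manage its own packages after installation.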

## conda from scratch?

conda-forge packages + constructor == custom Python distribution?

[**acpd**](https://github.com/acpd/acpd) : Another conda-based Python distribution
* Proof of concept, builds an installer which is self-hosting
* repository includes all conda recipes and scripts
* Currently limited to linux-64 and Python 3.5
* more to come...

<center>
<img src="images/acpd_logo.png" height="30%" width="30%" align="center"/>
</center> 

<center>

## Thanks
<br/> <br/>
## Questions?
<br/> <br/>

<center>
<img src="images/jhelmus_square_headshot_medium.jpg" height="25%" width="25%" align="center"/>

<center>
Jonathan J. Helmus
<br/><br/>
jjhelmus@gmail.com 
<br/><br/>
GitHub: jjhelmus 
<br/><br/>
http://nmrglue.com/jhelmus