{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# [NTDS'19] tutorial 1: introduction\n", "[ntds'19]: https://github.com/mdeff/ntds_2019\n", "\n", "[Michaël Defferrard](https://deff.ch), [EPFL LTS2](https://lts2.epfl.ch)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Content\n", "\n", "1. [Conda and Anaconda](#conda)\n", "1. [Python](#python)\n", "1. [Jupyter notebooks](#jupyter)\n", "1. [Version control with git](#git)\n", "1. [Scientific Python](#scipy)\n", "1. [Resources to improve your Python skills](#improve)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 1 Conda and Anaconda\n", "\n", "![conda](figures/conda.jpg)\n", "\n", "[Conda](https://conda.io) is a package and environment manager. It allows you to create environments, ideally one per project, and install packages into them. It is available for Windows, macOS and Linux.\n", "\n", "[Anaconda](https://anaconda.org/download) is a commercial distribution that comes with many of the packages used by data scientists. [Miniconda](https://conda.io/miniconda.html) is a lighter open distribution. Both install `conda`, which you can then use to install many packages.\n", "\n", "[conda-forge](https://conda-forge.org) is a community-driven collection of recipes to build conda packages. It contains many more packages than the official `defaults` channel." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get basic information from your conda installation:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", " active environment : ntds_2019\r\n", " active env location : /home/michael/.conda/envs/ntds_2019\r\n", " shell level : 1\r\n", " user config file : /home/michael/.condarc\r\n", " populated config files : /home/michael/.condarc\r\n", " conda version : 4.7.2\r\n", " conda-build version : not installed\r\n", " python version : 3.7.4.final.0\r\n", " virtual packages : \r\n", " base environment : /usr (read only)\r\n", " channel URLs : https://conda.anaconda.org/conda-forge/linux-64\r\n", " https://conda.anaconda.org/conda-forge/noarch\r\n", " https://repo.anaconda.com/pkgs/main/linux-64\r\n", " https://repo.anaconda.com/pkgs/main/noarch\r\n", " https://repo.anaconda.com/pkgs/r/linux-64\r\n", " https://repo.anaconda.com/pkgs/r/noarch\r\n", " package cache : /home/michael/.conda/pkgs\r\n", " envs directories : /home/michael/.conda/envs\r\n", " /usr/envs\r\n", " platform : linux-64\r\n", " user-agent : conda/4.7.2 requests/2.22.0 CPython/3.7.4 Linux/5.2.6-arch1-1-ARCH arch/ glibc/2.29\r\n", " UID:GID : 1000:1000\r\n", " netrc file : None\r\n", " offline mode : False\r\n", "\r\n" ] } ], "source": [ "!conda info" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "List your environments:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# conda environments:\r\n", "#\r\n", "complexes /home/michael/.conda/envs/complexes\r\n", "eeg_denoising /home/michael/.conda/envs/eeg_denoising\r\n", "ntds_2019 * /home/michael/.conda/envs/ntds_2019\r\n", "osmnx /home/michael/.conda/envs/osmnx\r\n", "python2 /home/michael/.conda/envs/python2\r\n", "scnn /home/michael/.conda/envs/scnn\r\n", "snn /home/michael/.conda/envs/snn\r\n", "base 
/usr\r\n", "\r\n" ] } ], "source": [ "!conda env list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "List the packages in an environment:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# packages in environment at /home/michael/.conda/envs/ntds_2019:\r\n", "#\r\n", "# Name Version Build Channel\r\n", "_libgcc_mutex 0.1 main \r\n", "attrs 19.1.0 py_0 conda-forge\r\n", "backcall 0.1.0 py_0 conda-forge\r\n", "bleach 3.1.0 py_0 conda-forge\r\n", "bzip2 1.0.8 h516909a_1 conda-forge\r\n", "ca-certificates 2019.9.11 hecc5488_0 conda-forge\r\n", "certifi 2019.9.11 py37_0 conda-forge\r\n", "cffi 1.12.3 py37h8022711_0 conda-forge\r\n", "cpuonly 1.0 0 pytorch\r\n", "curl 7.65.3 hf8cf82a_0 conda-forge\r\n", "cycler 0.10.0 py_1 conda-forge\r\n", "decorator 4.4.0 py_0 conda-forge\r\n", "defusedxml 0.5.0 py_1 conda-forge\r\n", "dgl 0.3.1 py37_0 dglteam\r\n", "entrypoints 0.3 py37_1000 conda-forge\r\n", "expat 2.2.5 he1b5a44_1003 conda-forge\r\n", "freetype 2.10.0 he983fc9_1 conda-forge\r\n", "gettext 0.19.8.1 hc5be6a0_1002 conda-forge\r\n", "git 2.23.0 pl526hce37bd2_2 conda-forge\r\n", "icu 64.2 he1b5a44_1 conda-forge\r\n", "intel-openmp 2019.4 243 \r\n", "ipykernel 5.1.2 py37h5ca1d4c_0 conda-forge\r\n", "ipython 7.8.0 py37h5ca1d4c_0 conda-forge\r\n", "ipython_genutils 0.2.0 py_1 conda-forge\r\n", "jedi 0.15.1 py37_0 conda-forge\r\n", "jinja2 2.10.1 py_0 conda-forge\r\n", "joblib 0.13.2 py_0 conda-forge\r\n", "json5 0.8.5 py_0 conda-forge\r\n", "jsonschema 3.0.2 py37_0 conda-forge\r\n", "jupyter_client 5.3.1 py_0 conda-forge\r\n", "jupyter_core 4.4.0 py_0 conda-forge\r\n", "jupyterlab 1.1.3 py_0 conda-forge\r\n", "jupyterlab_server 1.0.6 py_0 conda-forge\r\n", "kiwisolver 1.1.0 py37hc9558a2_0 conda-forge\r\n", "krb5 1.16.3 h05b26f9_1001 conda-forge\r\n", "libblas 3.8.0 12_openblas conda-forge\r\n", "libcblas 3.8.0 12_openblas conda-forge\r\n", "libcurl 7.65.3 hda55be3_0 
conda-forge\r\n", "libedit 3.1.20170329 hf8c457e_1001 conda-forge\r\n", "libffi 3.2.1 he1b5a44_1006 conda-forge\r\n", "libgcc-ng 9.1.0 hdf63c60_0 \r\n", "libgfortran-ng 7.3.0 hdf63c60_0 \r\n", "libiconv 1.15 h516909a_1005 conda-forge\r\n", "liblapack 3.8.0 12_openblas conda-forge\r\n", "libopenblas 0.3.7 h6e990d7_1 conda-forge\r\n", "libpng 1.6.37 hed695b0_0 conda-forge\r\n", "libsodium 1.0.17 h516909a_0 conda-forge\r\n", "libssh2 1.8.2 h22169c7_2 conda-forge\r\n", "libstdcxx-ng 9.1.0 hdf63c60_0 \r\n", "markupsafe 1.1.1 py37h14c3975_0 conda-forge\r\n", "matplotlib-base 3.1.1 py37he7580a8_1 conda-forge\r\n", "mistune 0.8.4 py37h14c3975_1000 conda-forge\r\n", "mkl 2019.4 243 \r\n", "nbconvert 5.6.0 py37_1 conda-forge\r\n", "nbformat 4.4.0 py_1 conda-forge\r\n", "ncurses 6.1 hf484d3e_1002 conda-forge\r\n", "networkx 2.3 py_0 conda-forge\r\n", "ninja 1.9.0 h6bb024c_0 conda-forge\r\n", "notebook 6.0.1 py37_0 conda-forge\r\n", "numpy 1.15.4 py37h8b7e671_1002 conda-forge\r\n", "openssl 1.1.1c h516909a_0 conda-forge\r\n", "pandas 0.25.1 py37hb3f55d8_0 conda-forge\r\n", "pandoc 2.7.3 0 conda-forge\r\n", "pandocfilters 1.4.2 py_1 conda-forge\r\n", "parso 0.5.1 py_0 conda-forge\r\n", "pcre 8.41 hf484d3e_1003 conda-forge\r\n", "perl 5.26.2 h516909a_1006 conda-forge\r\n", "pexpect 4.7.0 py37_0 conda-forge\r\n", "pickleshare 0.7.5 py37_1000 conda-forge\r\n", "pip 19.2.3 py37_0 conda-forge\r\n", "prometheus_client 0.7.1 py_0 conda-forge\r\n", "prompt_toolkit 2.0.9 py_0 conda-forge\r\n", "ptyprocess 0.6.0 py_1001 conda-forge\r\n", "pycparser 2.19 py37_1 conda-forge\r\n", "pygments 2.4.2 py_0 conda-forge\r\n", "pygsp 0.5.1 py_0 conda-forge\r\n", "pyparsing 2.4.2 py_0 conda-forge\r\n", "pyrsistent 0.15.4 py37h516909a_0 conda-forge\r\n", "python 3.7.3 h33d41f4_1 conda-forge\r\n", "python-dateutil 2.8.0 py_0 conda-forge\r\n", "pytorch 1.2.0 py3.7_cpu_0 [cpuonly] pytorch\r\n", "pytz 2019.2 py_0 conda-forge\r\n", "pyzmq 18.1.0 py37h1768529_0 conda-forge\r\n", "readline 8.0 hf8c457e_0 
conda-forge\r\n", "scikit-learn 0.21.3 py37hcdab131_0 conda-forge\r\n", "scipy 1.3.1 py37h921218d_2 conda-forge\r\n", "send2trash 1.5.0 py_0 conda-forge\r\n", "setuptools 41.2.0 py37_0 conda-forge\r\n", "six 1.12.0 py37_1000 conda-forge\r\n", "sqlite 3.29.0 hcee41ef_1 conda-forge\r\n", "terminado 0.8.2 py37_0 conda-forge\r\n", "testpath 0.4.2 py_1001 conda-forge\r\n", "tk 8.6.9 hed695b0_1003 conda-forge\r\n", "tornado 6.0.3 py37h516909a_0 conda-forge\r\n", "traitlets 4.3.2 py37_1000 conda-forge\r\n", "wcwidth 0.1.7 py_1 conda-forge\r\n", "webencodings 0.5.1 py_1 conda-forge\r\n", "wheel 0.33.6 py37_0 conda-forge\r\n", "xz 5.2.4 h14c3975_1001 conda-forge\r\n", "zeromq 4.3.2 he1b5a44_2 conda-forge\r\n", "zlib 1.2.11 h516909a_1006 conda-forge\r\n" ] } ], "source": [ "!conda list -n ntds_2019" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Install packages in an environment. The package will be installed in the activated environment if an environment name is not given." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Collecting package metadata (current_repodata.json): - \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\bdone\r\n", "Solving environment: \\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\b- \b\b\\ \b\b| \b\b/ \b\bdone\r\n", "\r\n", "# All requested packages already installed.\r\n", "\r\n" ] } ], "source": [ "!conda install -n ntds_2019 git" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Want to know more?** Look at the [conda user guide](https://conda.io/docs/user-guide/overview.html)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 2 Python\n", "\n", "[Python](https://python.org) is one of the main programming languages used by data scientists, along with [R](https://www.r-project.org) and [Julia](https://julialang.org). As an open and general-purpose language, it is replacing [MATLAB](https://mathworks.com/products/matlab.html) in many scientific and engineering fields. Python is the most popular language used for machine learning.\n", "\n", "Below are very basic examples of Python code. **Want to learn more?** Look at the [Python Tutorial](https://docs.python.org/3/tutorial/index.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Control flow" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "hello\n" ] } ], "source": [ "if 1 == 1:\n", "    print('hello')" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n" ] } ], "source": [ "for i in range(5):\n", "    print(i)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4\n", "3\n" ] } ], "source": [ "a = 4\n", "while a > 2:\n", "    print(a)\n", "    a -= 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data structures" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lists are mutable, i.e., we can change the objects they store." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 'hello', 3.2]\n", "[1, 2, 'world', 3.2]\n" ] } ], "source": [ "a = [1, 2, 'hello', 3.2]\n", "print(a)\n", "a[2] = 'world'\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tuples are immutable, i.e., they cannot be changed once created." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1, 2, 'hello')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(1, 2, 'hello')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sets contain unique values." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{1, 2, 3, 4}\n", "{2, 4}\n" ] } ], "source": [ "a = {1, 2, 3, 3, 4}\n", "print(a)\n", "print(a.intersection({2, 4, 6}))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dictionaries map keys to values." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = {'one': 1, 'two': 2, 'three': 3}\n", "a['two']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Functions" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def add(a, b):\n", "    return a + b\n", "\n", "add(1, 4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Classes" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "30" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "class A:\n", "    d = 10\n", "\n", "    def add(self, c):\n", "        return self.d + c\n", "\n", "a = A()\n", "a.add(20)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "30\n", "-10\n" ] } ], "source": [ "class B(A):\n", "    def sub(self, c):\n", "        return self.d - c\n", "\n", "b = B()\n", "print(b.add(20))\n", "print(b.sub(20))" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "### Dynamic typing" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'abc'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = 1\n", "x = 'abc'\n", "x" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'hello'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add('hel', 'lo')" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5]" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add([1, 2], [3, 4, 5])" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "130\n", "120 items\n" ] } ], "source": [ "print(int('120') + 10)\n", "print(str(120) + ' items')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 3 Jupyter notebooks\n", "\n", "[Jupyter](https://jupyter.org) notebooks let you mix text, math, code, and results (numbers or figures) in a **single document**. They are intended for interactive computing and are very useful to explore data, teach concepts, and create reports. Code can be written in many programming languages, including Python, Julia, R, MATLAB, and C++." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Markdown text (and LaTeX math)\n", "\n", "A list:\n", "\n", "* item\n", "* item\n", "\n", "Text in a paragraph. Text can be *italic*, **bold**, `verbatim`. We can define [hyperlinks](https://github.com/mdeff/ntds_2019).\n", "\n", "A numbered list:\n", "\n", "1. item\n", "1. 
item\n", "\n", "Some inline math: $x = \\frac12$\n", "\n", "Some display math:\n", "$$ f(x) = \\frac{e^{-x}}{4} $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Code and results" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.0" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "20 / 100 * 30" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Inline figures" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "y = np.random.uniform(size=100)\n", "plt.plot(y);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Want to learn more?** Look at the [documentation](https://jupyter.org/documentation)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 4 Version control with git\n", "\n", "![git](figures/git.jpg)\n", "\n", "[git](https://git-scm.com) is an open-source distributed version control system. It allows users to collaborate on projects (not only software!), and to synchronize and track their changes. It is most often used with a hosting service such as [GitHub](https://github.com) or [GitLab](https://about.gitlab.com). Those services add many tools to facilitate issue tracking, code review, continuous integration, etc.\n", "\n", "* Decentralized: draw on the blackboard. Make it clear that all repos are the same.\n", "* Commits are local. We push / pull to sync with other repos.\n", "* Git is often used in a centralized fashion, with GitHub / GitLab being the syncing point for everybody. It does not have to be, but GitHub is easier to access than my laptop.\n", "* **Want to learn more?** Try this [interactive guide](https://try.github.io) or look at the more involved [user manual](https://git-scm.com/docs).\n", "\n", "### Basic usage\n", "\n", "1. Install with `conda install git`.\n", "1. Everybody makes a clean clone (to be erased afterwards). Use HTTPS if you are not logged in to GitHub.\n", "1. I add a fake file.\n", "1. I commit. It is not on GitHub.\n", "1. I push. It is on GitHub.\n", "1. They pull. They see it on their machines.\n", "\n", "Two kinds of users:\n", "* Those who don't want to use git: just run `git pull` before every lab. 
**Do not modify the content of the folder.** Treat it like your inbox: only copy files from there and modify the copies elsewhere.\n", "* The power users make a branch for each of their solutions!\n", "\n", "### Power users\n", "\n", "* Make a branch: `git branch assignment1_solution`\n", "* Work on that branch: `git checkout assignment1_solution`\n", "* Make and commit your modifications. You get a history of your changes!\n", "* Come back to master with `git checkout master` and get new material from the TAs with `git pull`. Again, you should never modify master (you could do it locally, but only the TAs have write access to the GitHub repo).\n", "\n", "### Super-power users\n", "\n", "Those who want to back up or share their work on GitHub.\n", "\n", "1. Create a GitHub account.\n", "1. Create a repository (you could have forked mdeff/ntds_2019).\n", "1. Add a remote repo: `git remote add my_repo git@github.com:username/ntds_2019.git`\n", "1. Push your own branches to your repo: `git push -u my_repo assignment1_solution`.\n", "1. Go to your GitHub page and see your changes.\n", "\n", "### Contributors\n", "\n", "Same as before, except that you can now make a pull request for your changes to be integrated into master and be available to all of us.\n", "\n", "### Collaborate with git and GitHub\n", "\n", "All the code for your projects will have to be handled as a repository on GitHub.\n", "While you don't have to collaborate with git (i.e., you can create a single commit at the end with all of your code), we highly recommend that you use it.\n", "It is a very good way to manage your project, as it allows you to come back to previous states, synchronize your changes without losing track of versions, track who did what, discuss issues and code, etc.\n", "As such, we recommend that you use git from the start to learn the basics. Once you feel ready, create a repository for your project and start working on an assignment there." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 5 Scientific Python\n", "\n", "Below are the basic packages used for scientific computing and data science.\n", "* [NumPy](https://www.numpy.org): N-dimensional arrays\n", "* [SciPy](https://www.scipy.org/scipylib/index.html): scientific computing\n", "* [matplotlib](https://matplotlib.org): powerful visualization\n", "* [pandas](https://pandas.pydata.org): data analysis\n", "\n", "**Want to learn more?** Look at the [Scipy Lecture Notes](https://www.scipy-lectures.org/).\n", "\n", "Finally, the packages below will be useful to work with networks and graphs.\n", "* [NetworkX](https://networkx.github.io): network science\n", "* [graph-tool](https://graph-tool.skewed.de): network science\n", "* [scikit-learn](https://scikit-learn.org): graph embedding (dimensionality reduction)\n", "* [PyGSP](https://github.com/epfl-lts2/pygsp): graph signal processing\n", "* [Deep Graph Library](https://www.dgl.ai): deep learning on graphs with [PyTorch](https://pytorch.org)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 6 Resources to improve your Python skills (for experienced Python users)\n", "\n", "We provide a non-exhaustive list of tools and concepts that can help you improve your Python coding skills.\n", "They are by no means things that you need to master immediately.\n", "\n", "* NumPy and PyTorch indexing and broadcasting rules.\n", " They take some time to understand, but are really essential.\n", " They will help you avoid writing loops, which are considerably slower and sometimes memory-inefficient.\n", " * \n", " * \n", " * \n", " * \n", "\n", "\n", "* Some common Python built-in functions.\n", " * [`enumerate`](https://docs.python.org/3/library/functions.html#enumerate)\n", " * [`zip`](https://docs.python.org/3/library/functions.html#zip)\n", " * [`itertools.product`](https://docs.python.org/3/library/itertools.html#itertools.product)\n", "\n", "\n", "* SciPy 
functions.\n", " * `pdist` and `cdist` are considerably faster than loops to compute pairwise distances between objects (e.g., to build a nearest neighbors graph).\n", " * \n", " * \n", "\n", "\n", "* Object-oriented programming.\n", " * Classes.\n", " * \n", " * Read the source code of libraries you commonly use, and try to understand how they organize it. In particular, we advise you to write your methods as in the scikit-learn API.\n", " * \n", " * Inheritance.\n", " When you implement different models that have the same role (such as different machine learning classifiers), base methods allow you to avoid writing the same code several times.\n", " * \n", " * Abstract methods.\n", " They let you declare which methods subclasses (of the base class) must implement.\n", " * \n", " * \n", "\n", "\n", "* Google Python style guide.\n", " Not essential, but following these rules makes things easier when you work in a group.\n", " * \n", " \n", "\n", "* Unit tests.\n", " * \n", " * " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }