{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "$$\n", "\\newcommand{\\bb}[1]{\\boldsymbol{#1}}\n", "$$\n", "\n", "# CS236781: Deep Learning\n", "\n", "# Tutorial 0: Python and PyTorch basics" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In this tutorial, we will cover:\n", "* Course info \n", "* Environment setup with `conda`\n", "* Jupyter: Using notebooks\n", "* PyTorch:\n", " - Tensor basics: indexing, datatypes, math\n", " - Broadcasting\n", " - Intro to automatic differentiation\n", "\n", "Also in this tutorial, but for self-study:\n", "* Basic Python: Basic data types (Containers, Lists, Dictionaries, Sets, Tuples), Functions, Classes\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "## Administration and General Info" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "My info:\n", "- Aviv A. Rosenberg\n", "- avivr@cs.technion.ac.il\n", "- Office hour: Thursdays, 13:30.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Course:\n", "- Website is at https://vistalab-technion.github.io/cs236781/\n", "- Updates will be posted there (but emails will be sent via WebCourse)\n", "- Post questions on **Piazza** only! (Not email, not Facebook)\n", "- Any questions about the **homeworks** or **tutorials**: I'll answer on Piazza.\n", "- For personal administrative requests/delays: email Chaim.\n", "- For appeals/questions about grades: email Yaniv/Evgenii/Ben." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Video Lectures (by Prof. 
Alex Bronstein)\n", "- Provide a high-level presentation of most core topics of Deep Learning, including very recent topics.\n", "- Give mathematical background and justifications.\n", "\n", "In-class Lectures\n", "- Supplementary material with more in-depth examples or advanced topics." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Tutorials:\n", "- Usually structured as a short theory reminder followed by a step-by-step technical implementation of a real problem.\n", "- Technical, meant to help you understand the implementation details behind deep learning.\n", "- **Highly relevant** for success in the homework assignments.\n", "- After this tutorial you should clone the [tutorials repo](https://github.com/vistalab-technion/cs236781-tutorials), install the conda env and play with the code.\n", "- Videos are available on the course site (by Aviv Rosenberg); a Technion account is required." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Homework:\n", "- Four HW assignments, a fairly heavy load. Best to tackle them after you have sufficient programming experience.\n", "- Almost entirely \"wet\", i.e. implementation of real algorithms with real data.\n", "- Should be done in pairs.\n", "- Some will require use of GPUs. We will provide access to course servers - **please register**.\n", "- Read the [getting started page](https://vistalab-technion.github.io/cs236781/assignments/getting-started) and [collaboration policy](https://vistalab-technion.github.io/cs236781/info/#administration) carefully!" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Environment setup" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "To install and manage all the necessary packages and dependencies for the\n", "course tutorials and assignments, we use [conda](https://conda.io), a popular package manager for Python.\n", "\n", "- The tutorial notebooks and homework assignments come with an `environment.yml` file which defines which third-party libraries we depend on.\n", "- Conda will use this file to create a virtual environment for you.\n", "- This virtual environment includes Python and all other packages and tools we specified, separate from any preexisting\n", "Python installation you may have." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Installation\n", "\n", "1. Install the python3 version of [miniconda](https://conda.io/miniconda.html).\n", "Follow the [installation instructions](https://conda.io/docs/user-guide/install/index.html)\n", "for your platform.\n", "\n", "2. Install all dependencies (into a virtual env) with `conda`:\n", "\n", " ```shell\n", " conda env update -f environment.yml\n", " ```\n", " \n", " This will also create a new virtual env (`cs236781-tutorials`) if it doesn't already exist.\n", "\n", "3. 
To activate the virtual environment (set up `$PATH`):\n", "\n", " ```shell\n", " conda activate cs236781-tutorials\n", " ```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "To check which conda environments you have and which one is active, run\n", "\n", "```shell\n", "conda env list\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Short demo of environment setup\n", "\n", "We'll now do a quick demo of the environment installation and working with `conda`, since usually there are many questions about this." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Running Jupyter\n", "\n", "From a terminal, enter the folder containing the tutorial notebooks.\n", "1. Make sure that the active conda environment is `cs236781-tutorials`:\n", "\n", " ```shell\n", " conda activate cs236781-tutorials\n", " ```\n", "\n", "2. Run jupyter with\n", "\n", " ```shell\n", " jupyter lab\n", " ```\n", " \n", " This will start a [jupyter lab](https://jupyterlab.readthedocs.io/en/stable/)\n", " server and open your browser at the local server's URL. You can now start working with the notebooks.\n", "\n", "If you're new to jupyter notebooks, you can get started by reading the\n", "[UI guide](https://jupyter-notebook.readthedocs.io/en/stable/notebook.html#notebook-user-interface)\n", "and also about how to use notebooks in\n", "[JupyterLab](https://jupyterlab.readthedocs.io/en/latest/user/notebook.html)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Important Note**: The course homework and tutorials use **different** conda envs! Make sure to use the correct one each time." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Jupyter basics\n", "\n", "Jupyter notebooks consist mainly of code and markdown cells.\n", "The code cells contain code that is run by a `kernel`, an\n", "interpreter for some programming language, python in our case." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "bar\n" ] }, { "data": { "text/plain": [ "42" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# This is a code cell; it can contain arbitrary python code.\n", "\n", "foo = 'bar'\n", "print(foo)\n", "\n", "def the_answer():\n", " return 42\n", "\n", "# The output of the last expression in a cell is shown\n", "2*the_answer()\n", "the_answer()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Variables and functions defined in a code cell are available in subsequent cells." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "ans = the_answer()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "42" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ans" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This is a markdown cell. 
You can use markdown syntax to format your text, and also include equations\n", "written in $\\LaTeX$:\n", "\n", "$$\n", "e^{i\\pi} + 1 = 0\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Other useful things to know about:\n", "* Opening a console for a notebook\n", "* Restarting the kernel\n", "* Magics" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The autoreload extension is already loaded. To reload it, use:\n", " %reload_ext autoreload\n" ] } ], "source": [ "%matplotlib inline\n", "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "52.7 ns ± 0.148 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)\n" ] } ], "source": [ "%timeit the_answer()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [], "toc-hr-collapsed": true, "toc-nb-collapsed": true }, "source": [ "## Introduction" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Python is a great general-purpose programming language on its own and with the addition of a few\n", "popular libraries such as `numpy`, `scipy`, `pandas`, `scikit-learn`, `matplotlib` and others it becomes an\n", "effective scientific computing environment.\n", "\n", "Today it is also the most-used language for machine learning both in research and industry." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Recently many **Deep Learning frameworks** have emerged for python.\n", "Arguably the most notable ones in 2021 are **TensorFlow** (with the Keras frontend) and **PyTorch**.\n", "\n", "In this course we'll use PyTorch, which is currently [the leading DL framework](https://thegradient.pub/state-of-ml-frameworks-2019-pytorch-dominates-research-tensorflow-dominates-industry) for research.\n", "\n", "