{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Sveučilište u Zagrebu
\n", "Fakultet elektrotehnike i računarstva\n", "\n", "# Strojno učenje\n", "\n", "http://www.fer.unizg.hr/predmet/su\n", "\n", "Ak. god. 2015./2016.\n", "\n", "# Bilježnica 5: Regresija\n", "\n", "(c) 2015 Jan Šnajder\n", "\n", "Verzija: 0.3 (2015-11-09)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Populating the interactive namespace from numpy and matplotlib\n" ] } ], "source": [ "import scipy as sp\n", "import scipy.stats as stats\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "%pylab inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sadržaj:\n", "\n", "* Uvod\n", "\n", "* Osnovni pojmovi\n", "\n", "* Model, funkcija gubitka i optimizacijski postupak\n", "\n", "* Postupak najmanjih kvadrata\n", "\n", "* Probabilistička interpretacija regresije\n", "\n", "* Poopćeni linearan model regresije\n", "\n", "* Odabir modela\n", "\n", "* Regularizirana regresija\n", "\n", "* Sažetak" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Osnovni pojmovi\n", "\n", "* Označen skup podataka: $\\mathcal{D}=\\{(\\mathbf{x}^{(i)},y^{(i)})\\},\\quad \\mathbf{x}\\in\\mathbb{R}^n,\\quad y\\in\\mathbb{R}$\n", "\n", "\n", "* Hipoteza $h$ aproksimira nepoznatu funkciju $f:\\mathbb{R}^n\\to\\mathbb{R}$\n", "\n", "\n", "* Idealno, $y^{(i)}=f(\\mathbf{x}^{(i)})$, ali zbog šuma: $$y^{(i)}=f(\\mathbf{x}^{(i)})+\\varepsilon$$\n", "\n", "\n", "* $\\mathbf{x}$ - **ulazna varijabla** (nezavisna, prediktorska)\n", "\n", "\n", "* $y$ - **izlazna varijabla** (zavisna, kriterijska)\n", "\n", "\n", "### Vrste regresije\n", "\n", "* Broj **ulaznih** (nezavisnih) varijabli:\n", " * Univarijatna (jednostavna, jednostruka) regresija: $n=1$\n", " * Multivarijatna (višestruka, multipla) regresija: $n>1$\n", "\n", "\n", "* Broj **izlaznih** (zavisnih) varijabli:\n", " * Jednoizlazna regresija: $f(\\mathbf{x}) = y$\n", " * Višeizlazna regresija: $f(\\mathbf{x})=\\mathbf{y}$\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Model, funkcija gubitka i optimizacijski postupak\n", "\n", "\n", "### (1) Model\n", "\n", "* **Linearan model regresije**: $h$ je linearna funkcija parametara\n", "$\\mathbf{w} = (w_0,\\dots,w_n)$\n", "\n", "\n", "* Linearna regresija:\n", " $$h(\\mathbf{x}|\\mathbf{w}) = w_0 + w_1 x_1 + w_2 x_2 + \\dots + w_n x_n$$\n", "\n", "\n", "* Polinomijalna regresija:\n", " * Univarijatna polinomijalna: $$h(x|\\mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \\dots + w_d x^d\\quad (n=1)$$\n", " * Multivarijatna polinomijalna: $$h(\\mathbf{x}|\\mathbf{w}) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_1 x_2 + w_4 x_1^2 + w_5 x_2^2\\quad (n=2, d=2)$$\n", " * Modelira međuovisnost značajki (*cross-terms* $x_1 x_2, \\dots$) \n", "\n", "\n", "* Općenite **bazne funkcije**:\n", " $$h(\\mathbf{x}|\\mathbf{w}) = w_0 + w_1\\phi_1(\\mathbf{x}) + \\dots + w_m\\phi_m(\\mathbf{x})$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (2) Funkcija gubitka (funkcija pogreške)\n", "\n", "* Kvadratni gubitak (engl. *quadratic loss*)\n", "\n", "$$\n", "L(y^{(i)},h(\\mathbf{x}^{(i)})) = \\big(y^{(i)}-h(\\mathbf{x}^{(i)})\\big)^2\n", "$$\n", "\n", "* Funkcija pogreške (proporcionalna s empirijskim očekivanjem gubitka):\n", "$$\n", "E(h|\\mathcal{D})=\\frac{1}{2}\n", "\\sum_{i=1}^N\\big(y^{(i)}-h(\\mathbf{x}^{(i)})\\big)^2\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (3) Optimizacijski postupak\n", "\n", "* Postupak **najmanjih kvadrata** (engl. *least squares*)\n", "\n", "$$\n", "\\mathrm{argmin}_{\\mathbf{w}} E(\\mathbf{w}|\\mathcal{D})\n", "$$\n", "\n", "\n", "* Rješenje ovog optimizacijskog problema postoji u **zatvorenoj formi**\n", "\n", "\n", "# Postupak najmanjih kvadrata\n", "\n", "\n", "* Razmotrimo najprije linearnu regresiju:\n", "$$h(\\mathbf{x}|\\mathbf{w}) = w_0 + w_1 x_1 + w_2 x_2 + \\dots + w_n x_n = \\sum_{i=1}^n w_i x_i + w_0$$\n", "\n", "\n", "* Izračun je jednostavniji ako pređemo u matrični račun\n", " * Svaki vektor primjera $\\mathbf{x}^{(i)}$ proširujemo *dummy* značajkom $x^{(i)}_0 = 1$, pa je model onda:\n", "\n", "$$h(\\mathbf{x}|\\mathbf{w}) = \\mathbf{w}^\\intercal \\mathbf{x}$$\n", "\n", "\n", "* Skup primjera:\n", "\n", "$$\n", "\\mathbf{X} = \n", "\\begin{pmatrix}\n", "1 & x^{(1)}_1 & x^{(1)}_2 \\dots & x^{(1)}_n\\\\\n", "1 & x^{(2)}_1 & x^{(2)}_2 \\dots & x^{(2)}_n\\\\\n", "\\vdots\\\\\n", "1 & x^{(N)}_1 & x^{(N)}_2 \\dots & x^{(N)}_n\\\\\n", "\\end{pmatrix}_{N\\times (n+1)}\n", "=\n", "\\begin{pmatrix}\n", "1 & (\\mathbf{x}^{(1)})^\\intercal \\\\\n", "1 & (\\mathbf{x}^{(2)})^\\intercal \\\\\n", "\\vdots\\\\\n", "1 & (\\mathbf{x}^{(N)})^\\intercal \\\\\n", "1 & \\end{pmatrix}_{N\\times (n+1)}\n", "$$\n", "* Matricu primjera $\\mathbf{X}$ zovemo **dizajn-matrica**\n", "\n", "\n", "* Vektor izlaznih vrijednosti:\n", "$$\n", "\\mathbf{y} = \n", "\\begin{pmatrix}\n", "y^{(1)}\\\\\n", "y^{(2)}\\\\\n", "\\vdots\\\\\n", "y^{(N)}\\\\\n", "\\end{pmatrix}_{N\\times 1}\n", "$$\n", "\n", "### Egzaktno rješenje\n", "\n", "* Idealno, tražimo egzaktno rješenje, tj. rješenje za koje vrijedi\n", "$$\n", "(\\mathbf{x}^{(i)}, y^{(i)})\\in\\mathcal{D}.\\ h(\\mathbf{x}^{(i)}) = y^{(i)}\n", "$$\n", "odnosno\n", "$$\n", "(\\mathbf{x}^{(i)}, y^{(i)})\\in\\mathcal{D}.\\ \\mathbf{w}^\\intercal \\mathbf{x} = y^{(i)}\n", "$$\n", "\n", "\n", "* Možemo napisati kao matričnu jednadžbu ($N$ jednadžbi s $(n+1)$ nepoznanica):\n", "\n", "$$\n", "\\mathbf{X}\\mathbf{w} = \\mathbf{y}\n", "$$\n", "\n", "$$\n", "\\begin{pmatrix}\n", "1 & x^{(1)}_1 & x^{(1)}_2 \\dots & x^{(1)}_n\\\\\n", "1 & x^{(2)}_1 & x^{(2)}_2 \\dots & x^{(2)}_n\\\\\n", "\\vdots\\\\\n", "1 & x^{(N)}_1 & x^{(N)}_2 \\dots & x^{(N)}_n\\\\\n", "\\end{pmatrix}\n", "\\cdot\n", "\\begin{pmatrix}\n", "w_0\\\\\n", "w_1\\\\\n", "\\vdots\\\\\n", "w_n\\\\\n", "\\end{pmatrix}\n", "=\n", "\\begin{pmatrix}\n", "y^{(1)}\\\\\n", "y^{(2)}\\\\\n", "\\vdots\\\\\n", "y^{(N)}\\\\\n", "\\end{pmatrix}\n", "$$\n", "\n", "* Egzaktno rješenje ovog sustava jednadžbi je\n", "\n", "$$\n", "\\mathbf{w} = \\mathbf{X}^{-1}\\mathbf{y}\n", "$$\n", "\n", "Međutim, rješenja nema ili ono nije jedinstveno ako:\n", "\n", "* (1) $\\mathbf{X}$ nije kvadratna, pa nema inverz. U pravilu:\n", " * $N>(n+1)$
\n", " $\\Rightarrow$ sustav je **preodređen** (engl. *overdetermined*) i nema rješenja\n", " * $N<(n+1)$
\n", " $\\Rightarrow$ sustav je **pododređen** (engl. *underdetermined*) i ima višestruka rješenja\n", " \n", "* (2) $\\boldsymbol{X}$ jest kvadratna (tj. $N=(n+1)$), ali ipak nema inverz (ovisno o rangu matrice)
$\\Rightarrow$ sustav je **nekonzistentan**\n", "\n", "\n", "### Rješenje najmanjih kvadrata\n", "\n", "\n", "* Približno rješenje sustava $\\mathbf{X}\\mathbf{w}=\\mathbf{y}$\n", "\n", "\n", "* Funkcija pogreške: \n", "$$\n", "E(\\mathbf{w}|\\mathcal{D})=\\frac{1}{2}\n", "\\sum_{i=1}^N\\big(\\mathbf{w}^\\intercal\\mathbf{x}^{(i)} - y^{(i)}\\big)^2\n", "$$\n", "\n", "\n", "* Matrični oblik:\n", "\\begin{align*}\n", "E(\\mathbf{w}|\\mathcal{D}) \n", "=& \n", "\\frac{1}{2} (\\mathbf{X}\\mathbf{w} - \\mathbf{y})^\\intercal (\\mathbf{X}\\mathbf{w} - \\mathbf{y})\\\\\n", "=&\n", "\\frac{1}{2}\n", "(\\mathbf{w}^\\intercal\\mathbf{X}^\\intercal\\mathbf{X}\\mathbf{w} - \\mathbf{w}^\\intercal\\mathbf{X}^\\intercal\\mathbf{y} - \\mathbf{y}^\\intercal\\mathbf{X}\\mathbf{w} + \\mathbf{y}^\\intercal\\mathbf{y})\\\\\n", "=&\n", "\\frac{1}{2}\n", "(\\mathbf{w}^\\intercal\\mathbf{X}^\\intercal\\mathbf{X}\\mathbf{w} - 2\\mathbf{y}^\\intercal\\mathbf{X}\\mathbf{w} + \\mathbf{y}^\\intercal\\mathbf{y})\n", "\\end{align*}\n", "\n", "> Jednakosti linearne algebre:\n", "> * $(A^\\intercal)^\\intercal = A$\n", "> * $(AB)^\\intercal = B^\\intercal A^\\intercal$\n", "\n", "* Minimizacija pogreške:\n", "$$\n", "\\begin{align*}\n", "\\nabla_{\\mathbf{w}}E &= \n", "\\frac{1}{2}\\Big(\\mathbf{w}^\\intercal\\big(\\mathbf{X}^\\intercal\\mathbf{X}+(\\mathbf{X}^\\intercal\\mathbf{X})^\\intercal\\big) -\n", "2\\mathbf{y}^\\intercal\\mathbf{X}\\Big) = \n", "\\mathbf{X}^\\intercal\\mathbf{X}\\mathbf{w} - \\mathbf{X}^\\intercal\\mathbf{y} = \\mathbf{0}\n", "\\end{align*}\n", "$$\n", "\n", "\n", "> Jednakosti linearne algebre:\n", "> * $\\frac{\\mathrm{d}}{\\mathrm{d}x}x^\\intercal A x=x^\\intercal(A+A^\\intercal)$\n", "> * $\\frac{\\mathrm{d}}{\\mathrm{d}x}A x=A$\n", "\n", "\n", "* Dobivamo sustav tzv. **normalnih jednadžbi**:\n", "$$\n", "\\mathbf{X}^\\intercal\\mathbf{X}\\mathbf{w} = \\mathbf{X}^\\intercal\\mathbf{y}\n", "$$\n", "\n", "\n", "* Rješenje:\n", "$$\n", "\\mathbf{w} = (\\mathbf{X}^\\intercal\\mathbf{X})^{-1}\\mathbf{X}^\\intercal\\mathbf{y} = \\color{red}{\\mathbf{X}^{+}}\\mathbf{y}\n", "$$\n", "\n", "\n", "* Matrica $\\mathbf{X}^{+}=(\\mathbf{X}^\\intercal\\mathbf{X})^{-1}\\mathbf{X}^\\intercal$ je **pseudoinverz** (Moore-Penroseov inverz) matrice $\\mathbf{X}$\n", "\n", "\n", "* **Q:** Kojih je dimenzija matrica $(\\mathbf{X}^\\intercal\\mathbf{X})^{-1}$?\n", "* **Q:** Što utječe na složenost izračuna inverza matrice: broj primjera $N$ ili broj dimenzija $n$?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Probabilistička interpretacija regresije" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Ograničimo se BSO na univarijatnu ($n=1$) linearnu regresiju:\n", "\n", "$$\n", "h(x|w_0, w_1) = w_0 + w_1 x\n", "$$\n", "\n", "\n", "* Zbog šuma u $\\mathcal{D}$:\n", "$$\n", " y^{(i)} = f(x^{(i)}) + \\color{red}{\\varepsilon}\n", "$$\n", "\n", "* Prepostavka:\n", "$$\n", " \\color{red}{\\varepsilon}\\ \\sim\\ \\mathcal{N}(0, \\sigma^2)\n", "$$\n", "\n", "* Posjedično:\n", "$$\n", " \\color{red}{y|x}\\ \\sim\\ \\mathcal{N}\\big(f(x), \\sigma^2\\big)\n", "$$\n", "odnosno\n", "$$\n", " \\color{red}{p(y|x)} = \\mathcal{N}\\big(f(x), \\sigma^2\\big)\n", "$$\n", "\n", "* Vrijedi \n", "$$\\mathbb{E}[y|x] = \\mu = f(x)$$\n", "\n", "\n", "* Naš cilj je: $h(x|\\mathbf{w}) = f(x)$\n", "\n", "\n", "* [Skica]\n", "\n", "\n", "* $p(y^{(i)}|x^{(i)})$ je vjerojatnost da je $f(x^{(i)})$ generirala vrijednost $y^{(i)}$\n", " * (Formulacija nije baš točna, jer je $x$ kontinuirana varijabla a $p$ je gustoća vjerojatnosti.)\n", " \n", "### Log-izglednost\n", "\n", "$$\n", "\\begin{align*}\n", "\\ln\\mathcal{L}(\\mathbf{w}|\\mathcal{D}) \n", "&= \n", "\\ln p(\\mathcal{D}|\\mathbf{w}) = \n", "\\ln\\prod_{i=1}^N p(x^{(i)}, y^{(i)}) =\n", "\\ln\\prod_{i=1}^N p(y^{(i)}|x^{(i)}) p(x^{(i)}) \\\\ \n", "&= \n", "\\ln\\prod_{i=1}^N p(y^{(i)}|x^{(i)}) + \\underbrace{\\color{gray}{\\ln\\prod_{i=1}^N p(x^{(i)})}}_{\\text{Ne ovisi o $\\mathbf{w}$}} \\\\\n", "& \\Rightarrow \\ln\\prod_{i=1}^N p(y^{(i)}|x^{(i)}) =\n", "\\ln\\prod_{i=1}^N\\mathcal{N}\\big(h(x^{(i)}|\\mathbf{w}),\\sigma^2\\big)\\\\ &= \n", "\\ln\\prod_{i=1}^N\\frac{1}{\\sqrt{2\\pi}\\sigma}\\exp\\Big\\{-\\frac{\\big(y^{(i)}-h(x^{(i)}|\\mathbf{w})\\big)^2}{2\\sigma^2}\\Big\\}\\\\ \n", "&=\n", "\\underbrace{\\color{gray}{-N\\ln(\\sqrt{2\\pi}\\sigma)}}_{\\text{konst.}} -\n", "\\frac{1}{2\\color{gray}{\\sigma^2}}\\sum_{i=1}^N\\big(y^{(i)}-h(x^{(i)}|\\mathbf{w})\\big)^2\\\\\n", "& \\Rightarrow\n", "-\\frac{1}{2}\\sum_{i=1}^N\\big(y^{(i)}-h(x^{(i)}|\\mathbf{w})\\big)^2\n", "\\end{align*}\n", "$$\n", "\n", "\n", "* Uz pretpostavku Gaussovog šuma, **maksimizacija izglednosti** odgovara **minimizaciji funkcije pogreške** definirane kao **zbroj kvadratnih odstupanja**:\n", "\n", "$$\n", "\\begin{align*}\n", "\\mathrm{argmax}_{\\mathbf{w}} \\ln\\mathcal{L}(\\mathbf{w}|\\mathcal{D}) &= \\mathrm{argmin}_{\\mathbf{w}} E(\\mathbf{w}|\\mathcal{D})\\\\\n", "E(h|\\mathcal{D}) &=\\frac{1}{2} \\sum_{i=1}^N\\big(y^{(i)}-h(x^{(i)}|\\mathbf{w})\\big)^2\\\\\n", "L\\big(y,h(x|\\mathbf{w})\\big)\\ &\\propto\\ \\big(y - h(x|\\mathbf{w})\\big)^2\n", "\\end{align*}\n", "$$\n", "\n", "\n", "* $\\Rightarrow$ Probabilističko opravdanje za kvadratnu funkciju gubitka\n", "\n", "\n", "* Rješenje MLE jednako je rješenju koje daje postupak najmanjih kvadrata!\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Poopćeni linearan model regresije" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Zanima nas poopćenje na $n>1$ koje obuhvaća sve multivarijatne linearne modele regresije: univarijatna regresija, linearna regresija, polinomijalna regresija, ...\n", " * $h(\\mathbf{x}|\\mathbf{w}) = w_0 + w_1 x_1 + w_2 x_2 + \\dots + w_n x_n$\n", " * $h(x|\\mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \\dots + w_d x^d$\n", " * $h(\\mathbf{x}|\\mathbf{w}) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_1 x_2 + w_4 x_1^2 + w_5 x_2^2$\n", " * ...\n", "\n", "\n", "* Uvodimo fiksan skup **baznih funkcija** (nelinearne funkcije ulaznih varijabli):\n", "$$\n", " \\{\\phi_0, \\phi_1, \\phi_2, \\dots, \\phi_m\\}\n", "$$ \n", "gdje $\\phi_j:\\mathbb{R}^n\\to\\mathbb{R}$\n", "\n", "\n", "* Dogovorno: $\\phi_0(\\mathbf{x}) = 1$\n", "\n", "\n", "* Svaki vektor primjera u $n$-dimenzijskom originalnom ulaznom prostoru (engl. *input space*) $\\mathcal{X}$:\n", "$$\n", "\\mathbf{x} = (x_1, x_2, \\dots, x_n)\n", "$$\n", "preslikavamo u nov, $m$-dimenzijski prostor, tzv. **prostor značajki** (engl. *feature space*):\n", "$$\n", "\\boldsymbol{\\phi}(\\mathbf{x}) = \\big(\\phi_0(\\mathbf{x}), \\phi_1(\\mathbf{x}), \\dots, \\phi_m(\\mathbf{x})\\big)\n", "$$\n", "\n", "\n", "* **Funkija preslikavanja** (vektor baznih funkcija)\n", "$$\n", "\\begin{align*}\n", "\\boldsymbol{\\phi}&:\\mathbb{R}^n\\to\\mathbb{R}^m:\\\\\n", "\\boldsymbol{\\phi}(\\mathbf{x}) &= \\big(\\phi_0(\\mathbf{x}),\\dots,\\phi_m(\\mathbf{x})\\big)\\\\\n", "\\end{align*}\n", "$$\n", "\n", "\n", "* Poopćen linearan model:\n", "$$\n", " h(\\mathbf{x}|\\mathbf{w}) = \\sum_{j=0}^m w_j\\phi_j(\\mathbf{x}) = \\mathbf{w}^\\intercal\\boldsymbol{\\phi}(\\mathbf{x})\n", "$$\n", "\n", "\n", "### Uobičajene funkcije preslikavanja\n", "\n", "\n", "* Linearna regresija:\n", "$$\n", "\\boldsymbol{\\phi}(\\mathbf{x}) = (1,x_1,x_2,\\dots,x_n)\n", "$$\n", "\n", "\n", "* Univarijatna polinomijalna regresija: \n", "$$\n", "\\boldsymbol{\\phi}(x) = (1,x,x^2,\\dots,x^m)\n", "$$\n", "\n", "\n", "* Polinomijalna regresija drugog stupnja: \n", "$$\n", "\\boldsymbol{\\phi}(\\mathbf{x}) = (1,x_1,x_2,x_1 x_2, x_1^2, x_2^2)\n", "$$\n", "\n", "\n", "* Gaussove bazne funkcije (RBF):\n", "$$\n", "\\phi_j(x) = \\exp\\Big\\{-\\frac{(x-\\mu_j)^2}{2\\sigma^2}\\Big\\}\n", "$$\n", "\n", "\n", "* [Skica: RBF] \n", "\n", "### Prostor značajki\n", "\n", "\n", "* **Funkcija preslikavanja značajki** $\\mathbf{\\phi} : \\mathbb{R}^n \\to \\mathbb{R}^m $ preslikava primjere iz $n$-dimenzijskog ulaznog prostora u $m$-dimenzijski prostor značajki\n", "\n", "\n", "* Tipično je $m>n$\n", "\n", "\n", "* Tada je funkcija koja je linearna u prostoru značajki **nelinearna u ulaznom prostoru**\n", "\n", "\n", "* Dakle, možemo koristiti linearan model za nelinearne probleme\n", "\n", "\n", "* Imamo unificiran postupak, neovisno koju funkciju $\\boldsymbol{\\phi}$ odaberemo" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primjer: Preslikavanje iz ulaznog prostora u prostor značajki\n", "\n", "* $\\mathcal{X} = \\mathbb{R}$\n", "* $n=1$, $m=3$\n", "* $\\boldsymbol{\\phi} : \\mathbb{R} \\to \\mathbb{R}^3$\n", "* $\\boldsymbol{\\phi}(x) = (1,x,x^2)$\n", "* [Skica]" ] }, { "cell_type": "code", "execution_count": 261, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def f(x) : return 3*(x - 2)**2 + 1\n", "\n", "x1 = 1\n", "x2 = 2\n", "x3 = 3" ] }, { "cell_type": "code", "execution_count": 262, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW0AAAEACAYAAAB4ayemAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAG+hJREFUeJzt3Xt0ldWdxvHvT24R7yhiFSwUraXaAaoFKgoHEEK5eKGV\nilpHnVFb5dLaOhXtKK4ubV2d6VhiL85SUFuJFi1WEkQQiAEVBOUmNyXeQBAstmoREMieP/bBwZCQ\nN8k5Z7/vOc9nrSwPOS8nj5vkl332uy/mnENERJLhkNABREQkOhVtEZEEUdEWEUkQFW0RkQRR0RYR\nSRAVbRGRBGke5SIzewv4CNgL7HbO9chmKBERqV2kog04IOWc+yCbYURE5OAaMjxiWUshIiKRRC3a\nDnjWzJaY2TXZDCQiInWLOjzS2zm32czaArPNbK1zbn42g4mIyIEiFW3n3Ob0f983s2lAD2A+gJlp\n8xIRkUZwzjV42Lne4REza21mR6QfHwYMAlbW+MKx/7j99ttz/jWHD3c8+mj8cyalPZUz/EdDct54\no+PnP493xpAfjRVlTLsdMN/MlgGLgDLn3KxGf8UCMmgQzFJLSYGaNcv/DEhm1Ts84px7E+iWgyx5\nZ9AguPtucA5Mc2+kgGza5D/OPDN0kvxTMCsiU6lUzr/mqadCs2awdm30vxMiZ2MoZ2blW87Zs2HA\nAP/9n2tJacvGsqaMrYC/EdnU18hn114Lp58O48aFTiKSO5ddBv36wb//e+gk8WVmuGzciJSm0bi2\nFJrqat/THjgwdJL8pKKdZf37w/z5sGtX6CQiubF8ObRpA1/8Yugk+UlFO8vatIGvfhVeeCF0EpHc\n0KyR7FLRzgENkUghUdHOLhXtHFDRlkKxfTu89BLk+QSOoFS0c6BnT6iqgvffD51EJLsqK/3c7MMP\nD50kf6lo50CLFr7n8eyzoZOIZJeGRrJPRTtHNEQihUBFO/u0uCZH1q+Hvn1h40YtaZf8tHEjdOsG\nW7aEWQmZNFpcE3OdO0OrVvDqq6GTiGTHrFlw3nkq2Nmmop0jZjBkCMyYETqJSHaUl/vvcckuFe0c\nGjYMyspCpxDJvF27YM4c+Na3QifJfyraOZRKwYoV8IHOtJc8M38+dOkCbduGTpL/VLRzqKjI34x8\n5pnQSUQyq7wchg4NnaIwqGjnmIZIJB+Vlfnvbck+Fe0cGzIEZs6EPXtCJxHJjNdegx07oGvX0EkK\ng4p2jrVvDx06wMKFoZOIZEZZme+MaP1BbqhoBzBsmB8DFMkH5eUaGsklFe0Ahg5V0Zb88NFHsHix\nPw9SckNFO4AePWDzZnjnndBJRJpm1izo3RsOOyx0ksKhoh1As2Z+EYJ625J0muqXeyragWiIRJKu\nutpvy6CinVsq2oEUF/sN4z/5JHQSkcZZsgSOOw46dQqdpLCoaAdy9NHw9a/DvHmhk4g0joZGwlDR\nDkirIyXJtAoyDB2CENCaNX6Y5O23tTBBkmXzZjj9dNi6FZo3D50mmXQIQgJ95Sv+G14HI0jSzJjh\njxVTwc49Fe2AzDREIsmkoZFwVLQD09Q/SZpdu2DuXBg8OHSSwqSiHVjfvrByJWzbFjqJSDSVlXDG\nGX66n+SeinZgRUXQr5/frlUkCcrKNNUvJBXtGNAQiSSFcyraoalox8DQof4Ist27QycRObh16+DT\nT+Ff/iV0ksKloh0DJ54InTv7sUKROJs2DS64QOsKQopUtM2smZktNbPp2Q5UqEaM8D8QInE2bZr/\nXpVwok6NHwesBo7IYpaCdvIx5ZTdPpHbV+5ib1ErBo0dSx8NHEpMlJdXcvfds1i6tDm//OUeduwY\nxNChfULHKkj1Fm0zaw8MAe4Ebsx6ogJUWV7Oql+NY8GnVZAeIrm1qgpAhVuCKy+vZNy4Z6iquhOA\n2bPhjTduBVDhDiDK8Mj/ADcB1VnOUrBmTZzInekivc+dVVXMLikJlEjk/02cOOuzgr1PVdWdlJTM\nDpSosB20p21mw4CtzrmlZpaq67oJEyZ89jiVSpFK1Xmp1KL5rl21fr7Zzp05TiJyoF27ai8TO3c2\ny3GSZKuoqKCioqLJr1Pf8MjZwPlmNgQoAo40s4edc1fsf9H+RVsabk+rVrV+fm9RUY6TiByoVas9\ntX6+qGhvjpMkW80O7R133NGo1zno8Ihz7hbnXAfnXCfgEmBuzYItTTdo7Fhu7dz5c5+7pXNnBo4Z\nEyiRyP8bO3YQrVvf+rnPde58C2PGDAyUqLA1dGNFbZydBftuNv5nSQn/3LKT5VVFTPjNGN2ElFg4\n99w+OAcDBvwne/Y0o6hoL2PGDNZNyEB0CELM7N0LJ50Ezz/vF9yIhPboo/DHP2qrhUzTIQh5olkz\nv+JMC20kLv7yFy2oiRMV7RjS6kiJi507YdYsOP/80ElkHxXtGOrXz58fuXlz6CRS6GbPhm7doG3b\n0ElkHxXtGGrZEoYMgSefDJ1ECp2GRuJHRTumNEQioe3ZA9Onw0UXhU4i+1PRjqniYli0CD74IHQS\nKVSVldCpE3ToEDqJ7E9FO6YOOwz699dJ7RKOhkbiSUU7xkaM8D84IrlWXa29s+NKRTvGhg2DuXNh\n+/bQSaTQLF4MRx0Fp50WOonUpKIdY8ccA7166aR2yT0NjcSXinbMaYhEcs05Fe04U9GOuQsvhBkz\nYMeO0EmkUCxb5qf7de8eOonURkU75k44Ac48U5v1SO6UlsKoUTpxPa5UtBNg1Cj/gySSbdXVfle/\nUaNCJ5G6qGgnwIgR8Oyz8OGHoZNIvnvhBTjySPja10InkbqoaCfAMcdAKqW9SCT79g2NSHypaCeE\nhkgk2/bsgalT4ZJLQieRg1HRTojhw2HhQnj//dBJJF/NmeP3GtGJSfGmop0Qhx3mt2udOjV0EslX\nGhpJBhXtBNEQiWTLzp3w17/CyJGhk0h9VLQTpLgYVq+Gd94JnUTyzYwZfjHNiSeGTiL1UdFOkJYt\n/fS/xx4LnUTyjYZGksOcc017ATPX1NeQ6ObOhR//GJYuDZ1E8sVHH/mDDt58E9q0CZ2mcJgZzrkG\nrztVTzth+vaFLVtg7drQSSRf/PWv0KePCnZSqGgnTLNm/maRbkhKpmhoJFk0PJJAixbB974H69Zp\nUx9pmr/9zc/LfvddOPzw0GkKi4ZHCkiPHrB3L7zySugkknSPPw7f+pYKdpKoaCeQmeZsS2aUlsKl\nl4ZOIQ2h4ZGEWrUKBg+Gt9+GQ/SrVxph40bo2hU2bYJWrUKnKTwaHikwp5/ud/9bsCB0Ekmqxx6D\niy5SwU4aFe0EGzUKpkwJnUKSasoUzRpJIg2PJNg770C3bv7O/6GHhk4jSbJiBQwdCm+95aeRSu5p\neKQAnXwyfOMbMG1a6CSSNJMnw5VXqmAnkYp2wl19NUyaFDqFJMmnn8Ijj/iiLcmjop1wF1wAy5b5\nt7kiUUyf7m9k67CDZKq3aJtZkZktMrNlZrbazH6Ri2ASTVGRv5n04IOhk0hSTJrk36FJMkW6EWlm\nrZ1zn5hZc2AB8BPn3IL0c7oRGdjSpXDhhX6XNs3ZloN5911/0vrGjdC6deg0hS2rNyKdc5+kH7YE\nmgEfNPQLSfZ07+53aJs7N3QSibuHH4aLL1bBTrJIRdvMDjGzZcAWYJ5zbnV2Y0lD6Yak1Mc5DY3k\ng6g97WrnXDegPdDHzFJZTSUNdumlUF4Of/976CQSVwsW+NOPevQInUSaonlDLnbOfWhm5cBZQMW+\nz0+YMOGza1KpFKlUKjPpJLJjj/V7kZSWwvXXh04jcbSvl63tfMOoqKigoqKiya9T741IMzsO2OOc\n+4eZHQo8A9zhnJuTfl43ImPimWfg1lthyZLQSSRuPv7YHym2bh20axc6jUB2b0R+AZibHtNeBEzf\nV7AlXs47D7ZuheXLQyeRuPnzn6FfPxXsfKC9R/LMbbf5g1rvuSd0EomT3r1h/HgYNix0EtmnsT1t\nFe0888Yb0LOnn4/bsmXoNBIHa9dC//5+g7HmDbqLJdmkDaMEgC99yS+emD49dBKJi8mT4YorVLDz\nhYp2HtKcbdln926/oOaqq0InkUxR0c5DI0bAiy/6IRIpbDNn+o2hTjstdBLJFBXtPNS6NYwc6d8W\nS2G7/371svONbkTmqeXL/UyBN9/UWGaheustOOssf/jzYYeFTiM16UakfE7XrtCpEzz5ZOgkEsrv\nfucPOlDBzi/qaeexqVPh3nvhuedCJ5Fc++QT+OIXYdEiP6NI4kc9bTnAhRdCVZU/xFUKy5Qp0KuX\nCnY+UtHOYy1awPe/DyUloZNILjnn/83HjAmdRLJBwyN5butWP92rqsoflCD5r7ISrr0WVq/WSUZx\npuERqdXxx8Pw4fDAA6GTSK6UlMDo0SrY+Uo97QKweLGft71+PTRrFjqNZNOGDX7m0NtvwxFHhE4j\nB6OettTpG9/wW3KWl4dOItl2331w+eUq2PlMPe0C8cgj8OCDMHt26CSSLTt3+ml+8+fDl78cOo3U\nRz1tOaiLL4ZXX4U1a0InkWz585+he3cV7Hynol0gWraEa67xi20k/+yb5jd6dOgkkm0aHikgmzbB\nGWf4/UiOOip0GsmkhQvhssvgtdd0szkpNDwi9TrxRBg0yI9tS34pKYEbblDBLgTqaReY55/3mwit\nW6d5vPnivfegSxf/Duroo0OnkajU05ZIzj7bTwd7+unQSSRT/vAH+O53VbALhXraBai01G/bOX9+\n6CTSVB9/7DeFev55zRpJGvW0JbKRI/1b6srK0Emkqe67DwYMUMEuJOppF6j774fHH/dnCEoy7dzp\ne9lPP+2XrkuyqKctDXLFFbBqFSxZEjqJNNbkyXDmmSrYhUY97QL2m9/4IZInngidRBpq924/JDJl\nCnzzm6HTSGOopy0Nds01sGCB33dZkqW01J8BqoJdeFS0C1jr1jBuHPzyl6GTSENUV8MvfgG33BI6\niYTQPHQACeuGG6BzZ3jjDZ0nmBTTpsGRR/pZI1J41NMucEcdBdddB7/6VegkEoVzcNddvpdtDR4N\nlXygG5HC++/7cyRffdXvTyLxNXMm3HQTLF+ubQiSTjcipdHatvVTAH/969BJpD533QXjx6tgFzL1\ntAWAjRv9fN/XXoNjjw2dRmozfz5cdRWsXQvNdTcq8dTTliZp3x6+/W0/d1vi6c474eabVbALnXra\n8pn166FXL6iq0iEJcfPyy3Dhhf7fqFWr0GkkE9TTliY75RQYNkwzSeJo/Hj/oYIt9fa0zawD8DBw\nPOCA/3XOTdzvefW088iGDdCtG6xYASedFDqNAMya5c9+XLUKWrQInUYypbE97ShF+wTgBOfcMjM7\nHHgZuNA5tyb9vIp2nrn5Zvjb3/xOgBLW3r1+U6jbboMRI0KnkUzK2vCIc+4959yy9ON/AmsAzebN\nYzffDE895edtS1iPPOK3G7jootBJJC4adCPSzDoCzwGnpwu4etp56p574NlnoawsdJLCtWOHX/RU\nWgq9e4dOI5mW9RuR6aGRx4Fx+wq25K8f/MDv/jdvXugkhaukBM46SwVbPi/SjE8zawE8AfzJOfdk\nzecnTJjw2eNUKkUqlcpQPAmlVSu/+u4//gMWLdIKvFzbts3P4lmwIHQSyZSKigoqKiqa/DpRbkQa\n8BCwzTn3o1qe1/BInqquhp494cc/hksuCZ2msNx4oz9O7He/C51EsiWbs0fOASqBFfgpfwDjnXMz\n08+raOexefPg3/4N1qzRHOFcefNNPyyyejW0axc6jWRL1op2hC+sop3nhg2D886DH/4wdJLCcOml\n8JWv+Gl+kr9UtCVrXn0V+vf3m0kdfXToNPltyRI4/3zf1ocfHjqNZJOWsUvWnHGGLyQ6liy7nPN7\nZU+YoIItdVNPWyJ5912/deuLL8Kpp4ZOk5+mTYNbb/VbCGgnv/yn4RHJul//2i+2mTNHR11l2j/+\n4d/RlJbCueeGTiO5oOERybqxY+Gjj2Dy5NBJ8s/NN8PQoSrYUj/1tKVBli2DQYNg5UpNR8uU+fNh\n1Ch/w1c3eguHetqSE926wdVXw7hxoZPkh1274NprYeJEFWyJRkVbGuz22/3UNG0m1XR33eXnZGvb\nVYlKwyPSKHPnwpVX+o35jzgidJpkWrUKUik/5KQDJwqPZo9Izl19tZ9PPHFi/dfK51VXwznnwBVX\nwPe/HzqNhKCiLTn3wQdw+ul+fnGvXqHTJMtvfwuPPgrPPacdFAuVirYE8dhj8POfwyuvQMuWodMk\nw4YN8PWvQ2UldOkSOo2EotkjEsTIkdCxI9x9d+gkyeAc3HADjBmjgi2No562NNmGDX4r0SefhG9+\nM3SaePv97+G+++Cll/TOpNBpeESCKiuD66+Hl1+Gtm1Dp4mnxYv9qscXXoBTTgmdRkLT8IgENWwY\nXH45XHYZ7N0bOk38bNvmh5Luu08FW5pGPW3JmD17YOBA6NvXby8qXnU1DB/ux7D/679Cp5G40PCI\nxMJ778GZZ8KkSVBcHDpNPNx1F8yY4Y9ua9EidBqJCxVtiY3KSj8UsHgxdOgQOk1Yc+f6IaMlS7Tq\nUT5PY9oSG336+NPER46ETz8NnSacTZv8OP+f/qSCLZmjnrZkRXU1XHQRdOoE99wTOk3u7d7tz9Us\nLoaf/Sx0Gokj9bQlVg45BB58EJ56yq+aLDTjx/uNtG65JXQSyTc6iU6y5phj4IknfG+zTRs/s6QQ\n/Pd/+19WL76ofUUk8/QtJVnVvbsv3Jde6k9oyXe//z3ce68/R/PYY0OnkXykoi1Zd+65MGUKfPvb\nfvl2vnroIT+9b84czZqR7FHRlpwYOBAeeMAvMlmxInSazJs61Y9jz54NX/pS6DSSz1S0JWeGD4eS\nEhg8GNauDZ0mc8rKYPRoePppf3SYSDbpRqTk1MiRsGOH73k/91zye6XPPutP8Ckrg65dQ6eRQqCi\nLTn3r/8K27fDgAG+cJ98cuhEjbNggb/B+sQT0KNH6DRSKDQ8IkFcfz388Id+/+3KytBpGm7SJL94\n6JFH/I1WkVzRikgJauZM3/MePx7GjQNr8Pqw3Nq1C8aO9b9o/vIXnT4jjacVkZJIgwfDwoXw8MN+\nqGH79tCJ6vbOO75XvW2bn7qogi0hqGhLcJ06wfPPQ1GRP9X99ddDJzrQnDnQs6e/kTp1ql+iLhKC\nirbEwqGH+nHi0aOhd2+/DDwOnPOHFl9+uV8g9JOfxH8IR/KbxrQldhYtgosvhn79/Ak4nTqFyzF+\nvB+yefxxrXKUzNKYtuSNnj1h5Uro2NGf8j56NGzenLuvv3IlXHABfOc7cMklfmqfCrbERb1F28wm\nmdkWM1uZi0AiAEcdBXfc4VdOtmoFZ5wBP/2pvwmYLevX+1NmBg6EVMqPrV97rY4Ik3iJ0tOeDAzO\ndhCR2rRt67c6Xb4cPvwQTjvNF/PXX/fjzXUpn11O8VXFpK5MUXxVMeWzy2u9bu9eePlluO46fxO0\nSxf/2j/6kb8xKhI39a6IdM7NN7OO2Y8iUrf27eEPf4CbbvI3Bvv183tVDxjgP/r3hxNP9NeWzy5n\n3G/HUdW96rO/X/Vb/3jIeUNZt86f3ThnDlRUwPHH+x0IX3vN7/stEmeRbkSmi/Z059zXanlONyIl\n55zzRXbOHP8xbx60awfnnAOzVhXzTvGsA/7OF6YXc8h7Mz9X7Pv10/mNEkZjb0RmZO+RCRMmfPY4\nlUqRSqUy8bIidTLzQyWnneaXxO/dC8uW+YU6897cVevfOeLYnZSVwimnaNqe5F5FRQUVFRVNfh31\ntCXvFF9VzKyOB/a0i98uZuakmQESiRxIU/5E0sZeOpbOSzt/7nOdX+nMmFFjAiUSyZx6e9pmVgr0\nBY4FtgK3Oecm7/e8etoSO+WzyykpLWFn9U6KDilizKgxDB04NHQskc80tqetFZEiIgFoeEREpACo\naIuIJIiKtohIgqhoi4gkiIq2iEiCqGiLiCSIiraISIKoaIuIJIiKtohIgqhoi4gkiIq2iEiCqGiL\niCSIiraISIKoaIuIJIiKtohIgqhoi4gkiIq2iEiCqGiLiCSIiraISIKoaIuIJIiKtohIgqhoi4gk\niIq2iEiCqGiLiCSIiraISIKoaIuIJIiKtohIgqhoi4gkiIq2iEiCqGiLiCSIiraISIKoaIuIJIiK\ntohIgqhoi4gkiIq2iEiC1Fu0zWywma01s9fN7Ke5CCUiIrU7aNE2s2bAvcBg4KvAKDPrkotgmVZR\nURE6QiTKmVnKmVlJyJmEjE1RX0+7B7DeOfeWc2438ChwQfZjZV5S/iGVM7OUM7OSkDMJGZuivqJ9\nErBhvz9vTH9OREQCqK9ou5ykEBGRSMy5uuuymfUCJjjnBqf/PB6ods7dvd81KuwiIo3gnLOG/p36\ninZzYB0wANgEvASMcs6taWxIERFpvOYHe9I5t8fMRgPPAM2AB1SwRUTCOWhPW0RE4iXyisgoi2zM\nbGL6+eVm1j1zMaOrL6eZpczsQzNbmv74WYCMk8xsi5mtPMg1cWjLg+aMQ1umc3Qws3lmtsrMXjWz\nsXVcF7RNo+QM3aZmVmRmi8xsmZmtNrNf1HFd6LasN2fotqyRpVk6w/Q6no/ens65ej/wQyPrgY5A\nC2AZ0KXGNUOAGenHPYGFUV47kx8Rc6aAp3KdrUaGc4HuwMo6ng/elhFzBm/LdI4TgG7px4fj78PE\n8fszSs7gbQq0Tv+3ObAQOCdubRkxZ/C23C/LjcAjteVpaHtG7WlHWWRzPvAQgHNuEXC0mbWL+PqZ\nEnUxUIPv2GaSc24+8PeDXBKHtoySEwK3JYBz7j3n3LL0438Ca4ATa1wWvE0j5oTw35+fpB+2xHeE\nPqhxSfC2TH/t+nJCDL4/zaw9vjDfT+15GtSeUYt2lEU2tV3TPuLrZ0qUnA44O/02ZIaZfTVn6aKL\nQ1tGEbu2NLOO+HcHi2o8Fas2PUjO4G1qZoeY2TJgCzDPObe6xiWxaMsIOYO3Zdr/ADcB1XU836D2\njFq0o96trPlbJNd3OaN8vVeADs65rkAJ8GR2IzVa6LaMIlZtaWaHA48D49I92QMuqfHnIG1aT87g\nbeqcq3bOdcMXjj5mlqrlsuBtGSFn8LY0s2HAVufcUg7e64/cnlGL9rtAh/3+3AH/2+Bg17RPfy6X\n6s3pnPt439sq59zTQAsza5O7iJHEoS3rFae2NLMWwBPAn5xztf1wxqJN68sZpzZ1zn0IlANn1Xgq\nFm25T105Y9KWZwPnm9mbQCnQ38wernFNg9ozatFeApxqZh3NrCXwXeCpGtc8BVwBn62k/IdzbkvE\n18+UenOaWTszs/TjHvhpj7WNhYUUh7asV1zaMp3hAWC1c+6eOi4L3qZRcoZuUzM7zsyOTj8+FBgI\nLK1xWRzast6codsSwDl3i3Oug3OuE3AJMNc5d0WNyxrUngddXLPfF651kY2ZXZd+/j7n3AwzG2Jm\n64HtwFUN/R9sqig5ge8APzCzPcAn+IbMKTMrBfoCx5nZBuB2/GyX2LRllJzEoC3TegOXAyvMbN8P\n7i3AyRCrNq03J+Hb9AvAQ2Z2CL5T90fn3Jy4/axHyUn4tqyNA2hKe2pxjYhIgui4MRGRBFHRFhFJ\nEBVtEZEEUdEWEUkQFW0RkQRR0RYRSRAVbRGRBFHRFhFJkP8D3vOeLBEdiMkAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "xs = linspace(0, 4)\n", "y = f(xs)\n", "plt.ylim(0,5)\n", "plt.plot(xs, y)\n", "plt.plot(x1, f(x1), 'ro')\n", "plt.plot(x2, f(x2), 'go')\n", "plt.plot(x3, f(x3), 'bo')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 263, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def phi(x): return sp.array([1, x, x**2])" ] }, { "cell_type": "code", "execution_count": 264, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([1, 1, 1])" ] }, "execution_count": 264, "metadata": {}, "output_type": "execute_result" } ], "source": [ "phi(x1)" ] }, { "cell_type": "code", "execution_count": 265, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 4])" ] }, "execution_count": 265, "metadata": {}, "output_type": "execute_result" } ], "source": [ "phi(x2)" ] }, { "cell_type": "code", "execution_count": 266, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([1, 3, 9])" ] }, "execution_count": 266, "metadata": {}, "output_type": "execute_result" } ], "source": [ "phi(x3)" ] }, { "cell_type": "code", "execution_count": 267, "metadata": { "collapsed": false }, "outputs": [], "source": [ "xs1 = linspace(0, 5)\n", "xs2 = linspace(0, 10)\n", "X1, X2 = np.meshgrid(xs1, xs2)" ] }, { "cell_type": "code", "execution_count": 268, "metadata": { "collapsed": true }, "outputs": [], "source": [ "phi_X = 3*X2 - 12*X1 + 13" ] }, { "cell_type": "code", "execution_count": 286, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWwAAAD7CAYAAABOi672AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAFJBJREFUeJzt3X+Q1PV9x/HXG40VDWhakiCnU5xODSaSgWgQKzRro8Gm\nmqoBCYfEybS520wSjpN4KJkxN2KbmplLYpPOJBg1Os2dp4QzcElZTnAb03aSHGCkcgFbjsbaKsYc\n55gcxw8//YNdesDdsfv9sd9fz8fMjXvL97v33uF4z9fXvr+fjznnBACIvwlRFwAAqAwNGwASgoYN\nAAlBwwaAhKBhA0BC0LABICHODOuFzYx5QQDwwDlnoz0f6hW2c87T15e+9CXP5yb1i/ecjS/ecza+\n/Lzn8RCJAEBC0LABICFi2bBzuVzUJdQc7zkbeM/ZENZ7tvEyEzN7WNJfSNrvnJtZeu73JXVK+kNJ\n+yTd6pw7MMq57nR5DADgRGYm5/FDx0ckXX/Sc3dJ6nHOXSJpS+l7AEDIxm3YzrlnJQ2c9PTHJD1a\nevyopJtCqAsAcBIvGfa7nXOvlh6/KundAdYDABiDrw8dSyE1QTUQsv/68Y/12/37PZ374ouv6/nn\nXz39gYg9L3c6vmpmU51zr5jZBZLG/C1qbW09/jiXy2Xy02LAr4H+fj2xcKHqu7t17rveVdW5Q0OH\ntWjRk8rnr9D738//DMdRsVhUsVis6Nhxp0QkycymS9o4YkrkK5Jed87db2Z3STrfOXfKB49MiQD+\nHT10SA/Pm6eZ9fWau2JF1efn8906cOCgOjo+LrNRBw8QM+NNiYx7hW1mHZI+JGmKmb0k6R5Jfyfp\nCTP7K5XG+oItF0BZT0uLJk2bpiubmqo+t6Njp7Zs6de2bQ0065QYt2E755aM8UfXhlALgBH6urq0\n+wc/UMP27VU33D17Xtfy5Zu0efNtmjz590KqELUW2mp9ALwb6O9Xd2Oj6ru7NfEd76jq3KGhw7r1\n1ie1Zs01mj37gpAqRBROm2F7fmEybMCTI8PDemTePM1cutRTbt3YuFGDg8Pk1gnlOcMGUHs9LS2a\nVFfnKbdub9+prVv3kVunFA0biJG+ri7t2bDBc27d1ERunWY0bCAm/ObWixaRW6cdGTYQA37nrcmt\n04MMG4g5v/PW5NbZQMMGIuZn3nr37l8zb50hNGwgQv7nrdeRW2cIGTYQkSPDw3pk/nxfufUbbxxS\ne/stRCEpQoYNxJDf3PqZZ/apt5fcOkto2EAE+tavD2Deehm5dcbQsIEaG9i7V935vO91QmbNmhpS\nhYgrGjZQA4VCQW1ta2VvHdV1v9ql+atXq27OnKpfZ8WKTZoxY4oaGi4PoUrEHQ0bCFmhUNDNN9+u\noaH7db2+px0TfqXLZsyo+nXIreFrT0cAp9fWtlZDQ/drhibrPXpR69/6qr761Qereo3y+tadnQvJ\nrTOMhg3UwPl6TTeqUevUqYM6t6pzWd8aZTRsIGTNyz+lW+1uPatr9bL6NHHiKq1c2VD5+c0FzZgx\nRY2N5NZZR4YNhMyeflrvm/tB9Z97UNfZBq1c+agWLFhQ0bnsy4iRaNhAiEauE7K8yhE+9mXEyWjY\nQEjYlxFBYy0RIAR+17fO57s1ODjMOiEZxFoiQI35X9+6n3lrnIKGDQTMz/rW5dy6p4d1QnAqGjYQ\noKBya9YJwWjIsIGAlHPry5Ys0VXNzVWfn89368CBg+zLmHFk2EANlHNrLx8yMm+NStCwgQAEkVsz\nb43ToWEDPjFvjVohwwZ8ILdG0MiwgZCQW6OWaNiAR+TWqDXPy6ua2d1m9oKZ7TSzdjPjtw6ZUc6t\nF3Z2klujZjw1bDObLunTkj7gnJsp6QxJnwiuLCC+jh46pHWLF2ve3Xd72pexubmg97yH9a1RPa+R\nyBuSDks6x8yOSjpH0suBVQXEGLk1ouKpYTvnfmNmbZJ+JWlIUsE593SglQExRG6NKHlq2Gb2R5JW\nSJouaVDSk2a21Dn3vZHHtba2Hn+cy+WUy+W81glErpxbL9m4kdwagSkWiyoWixUd62kO28wWS7rO\nOffXpe+XSZrrnPvsiGOYw0ZqBDFvPTBwUI8/zrw1xjfeHLbXKZFfSpprZhPt2G/ftZJ2eS0QiLsg\ncusHH7yRZg1fvGbYvzCzxyT1SnpL0nZJa4MsDIgLcmvEBbemA+MY6O/Xd668UvXd3VWP8A0NHdZV\nVz2kfP4K5fNXhFQh0ma8SISGDYyBdUIQBdYSATzYfOedzFsjVmjYwCj61q/Xng0byK0RKzRs4CQD\ne/eqO59n3hqxQ4YNjMC8NaJGhg1UiHVCEGc0bKCEeWvEHQ0bUDD7Mt57b47cGqEiw0bmMW+NOCHD\nBsbhd95669Z+9faSWyN8NGxkWhDz1j09y8itURM0bGTWQH+/uvN5X7n1ffddo1mzpoZUIXAiMmxk\nUjm3nllf7ykKyee7NTg4rPb2W4hCECgybOAk5XnrK5uaqj6X3BpRoWEjc4KYtya3RhRo2MiUIOat\n16wht0Y0yLCRGeTWSAIybEDHcuvJdXXk1kgsGjYygdwaaUDDRuoFkVszb404IMNGqpFbI2nIsJFZ\nPatWkVsjNWjYSK1fPvXUsdx62zZya6QCDRupVM6t/e7LSG6NOCHDRuocz62XLtVcD1EIuTWiRIaN\nTDk+b718edXnklsjzmjYSBXmrZFmNGykBvPWSDsybKQC89ZICzJspB7rhCALaNhIPHJrZAUNG4lG\nbo0smeD1RDM738zWmVmfme0ys7lBFgacztFDh7Ru8WLN/+IXVTdnTtXnNzcXNGPGFDU0XB5CdUDw\n/FxhPyDpR865hWZ2pqRzA6oJqAjz1sgaTw3bzM6TNN85d7skOeeOSBoMsjBgPOTWyCKvkcjFkl4z\ns0fMbLuZPWhm5wRZGDCWcm69sLOTdUKQKV4jkTMlfUDS55xzPzezr0u6S9I9Iw9qbW09/jiXyymX\ny3n8ccAxx3Pr1at95daNjeTWiIdisahisVjRsZ5unDGzqZL+zTl3cen7eZLucs7dMOIYbpxB4Dat\nWKED+/ZpcVdX1VFIR8dO3XNPUdu2NRCFILYCv3HGOfeKmb1kZpc45/ZIulbSC36KBE4niNx68+bb\naNZILD9TIp+X9D0zO0vSf0r6VDAlAacKYt56zZprNHv2BSFVCISPtUQQe0GsE3LgwEF1dHycET7E\nHmuJINF6Wlo0ado0z+uEbNnSr23bmLdG8tGwEWvk1sD/o2EjtsitgRORYSOWyK2RVWTYSBxya+BU\nNGzEDrk1MDoaNmKF3BoYGxk2YoPcGiDDRkKQWwPjo2EjFsitgdOjYSNy5NZAZciwESlya+BEZNiI\nLXJroHI0bESG3BqoDg0bkSC3BqpHho2aI7cGxkaGjVghtwa8oWGjpsitAe9o2KgZcmvAHzJs1AS5\nNVAZMmxEjtwa8I+GjdCRWwPBoGEjVOTWQHDIsBGaIHLrwcFhtbffQhSCzCDDRiT85tZbt/art5fc\nGiijYSMUfnPrpqZN2rx5Gbk1MAING4ELKreeNWtqSBUCyUSGjUCRWwP+kGGjZsitgfDQsBGYIOat\ne3rIrYGx0LARiCBy6/vuI7cGxjPBz8lmdoaZ7TCzjUEVhOQ5euiQ1i1erPmrV6tuzpyqz29uLujS\nS9+phobLQ6gOSA+/V9hNknZJmhRALUionpYWTa6rqzi3LhQKavtGmyTpA5fdrq1bf0NuDVTAc8M2\nswslfVTS30i6I7CKkCjV5taFQkE3f+JmDeWGpDf/QD1fma1/+OaV5NZABfxEIl+TdKektwKqBQlT\nzq0XdnZWnFu3faPtWLN+35nSzkXS+7fqqR99K+RKgXTwdIVtZjdI2u+c22FmubGOa21tPf44l8sp\nlxvzUCSM39xam66X3vmaNL1XOnJd8AUCCVEsFlUsFis61tONM2b2t5KWSToi6WxJkyV93zn3yRHH\ncONMim1asUIH9u3T4q6uqrLnQqGgG29ZrcN2tZRbq4n/MkFdj3dpwYIFIVYLJMd4N874vtPRzD4k\n6QvOuRtPep6GnVJ9XV3afMcdati+veoRvj17XtecOd/SjJm9mnzeb7Xy8ytp1sAItbjTkc6cEeXc\nesnGjZ7mrRctelJf/vJH9JnPfDGkCoH0Yi0RVMzvOiGNjRs1ODjMvozAOFhLBIHwv07IPvZlBHyg\nYaMi7MsIRI+GjdMKIre+994c+zICPpFhY1zk1kBtkWHDMz+5dXs7uTUQJBo2xhTMvozk1kBQaNgY\n1cDeveTWQMyQYeMUR4aH9ci8eZq5dCm5NVBjZNioSk9LiyZVsb71SMxbA+GhYeMEfevXa8+GDZ5y\n6927f828NRAiGjaOG9i7V935vOfc+tZb15FbAyEiw4YkcmsgLsiwcVpPr1rlObdm3hqoDRo21Ld+\nved56927f828NVAjNOyMK+fW9d3d5NZAzJFhZxi5NRA/ZNgYFfPWQLLQsDOKeWsgeWjYGURuDSQT\nGXbGkFsD8UaGjeOeXrVKky+8kNwaSCAadoawLyOQbDTsjCjvy+g9t35Sa9ZcQ24NRIgMOwPK+zJe\ntmSJrmpurvp8cmugdsiwM668L6OXDxnJrYH4oGGnnJ/cmnlrIF5o2CnmP7dm3hqIEzLslCK3BpKJ\nDDuDelpaNLmujtwaSBEadgoxbw2kEw07ZZi3BtKLDDtF/ObW+Xy3Dhw4SG4NRCjwDNvMLpL0mKR3\nSXKS1jrn/t57iQiC33nrLVv6ya2BGPMaiRyW1Oyce87M3i5pm5n1OOf6AqwNVSC3BtJvgpeTnHOv\nOOeeKz1+U1KfpGlBFobKlXPrhZ2d5NZAivnOsM1suqR/lvS+UvMuP0+GXQPk1kC6hDaHXYpD1klq\nGtmsy1pbW48/zuVyyuVyfn4cRuF/3rpfvb3k1kBUisWiisViRcd6vsI2s7dJ6pb0T865r4/y51xh\nh6yvq0ub77hDDdu3Vx2F7Nnzuq6++mH19CzTrFlTQ6oQQLXCmBIxSQ9J2jVas0b4gpi3vu++a2jW\nQIJ4usI2s3mSfizpeR0b65Oku51zm0YcwxV2SMq59cz6ek9RCLk1EF+BX2E7534ijxMm8K88b+11\nX0bmrYFk4tb0hGF9ayC7aNgJEsT61sxbA8nFWiIJwfrWQDawHnYKsC8jABp2ArBOCACJhh175dx6\nycaNrBMCZBwZdoyxTgiQPWTYCcX61gBGomHHFLk1gJPRsGOIfRkBjIYMO2ZYJwTINjLsBPG7Tgjr\nWwPpRcOOWKFQ0Nq2NknSwrlztd9nbt3Ts4zcGkgpGnaECoWCbr/5Zt0/NKTDkn7R06MPPvCAp9x6\n0SLWtwbSjiVSI7S2rU33Dw3pNh1bVHyapPbu7qpfZ8WKTbr00ilqaLg86BIBxAhX2DHQI2mSpLd5\nOLejY6eeeWYfuTWQAVxhR6hh5Up966yztE3S7yTdNXGiGlaurPj8cm79xBOLyK2BDOAKO0JzLrlE\nN0ycqL5Zs7T9vPP06MqVWrBgQUXnklsD2cMcdkT8zluzvjWQTsxhx5Cfeev2dta3BrKIhh0Bv/sy\nNjWxTgiQRTTsGgtiX8Z7782xTgiQQWTYNXRkeFiPzJuny+rr2ZcRwKjIsGOip6VFk+rq2JcRgCc0\n7BrpW79eezZs8Jxbs741ABp2DQzs3avufN5Xbs361gDIsEPG+tYAqkGGHSHWtwYQFBp2iILYl5H1\nrQGU0bBDEsS+jKwTAmAkMuwQlHPry5Ys8TRvnc93a3BwWO3ttxCFABkzXobteXlVM7vezH5pZi+a\n2Srv5aVPObf2Om+9ZUu/vv3tG2jWAE7gKRIxszMkfVPStZJelvRzM9vgnOsLsrgkCiK3Zt4awGi8\nXmHPkfQfzrl9zrnDkh6X9JfBlZVM5dx6YWen59yaeWsAY/HasOskvTTi+/8uPZdZRw8d0rrFizV/\n9WrVzZlT9fnNzQXNmDFFjY3sywhgdF6nRLL5aeI49r/wgt753vd6mrceHDyol156g5tjAIzLa8N+\nWdJFI76/SMeusk/Q2tp6/HEul1Mul/P44+LvgtmzddN3v+vp3PPOO1s//GF9sAUBSIRisahisVjR\nsZ7G+szsTEm7JX1Y0v9I+pmkJSM/dMzyWB8AeBX4renOuSNm9jlJBUlnSHqICREACBc3zgBAjIRy\n4wwAoLZo2ACQELFs2JV+YpomvOds4D1nQ1jvmYYdE7znbOA9Z0OmGjYA4FQ0bABIiFDH+kJ5YQBI\nubHG+kJr2ACAYBGJAEBC0LABICFi17CztvWYmT1sZq+a2c6oa6kVM7vIzJ4xsxfM7N/NbHnUNYXN\nzM42s5+a2XNmtsvMvhx1TbVgZmeY2Q4z2xh1LbVgZvvM7PnSe/5Z4K8fpwy7tPXYbo3YekwnrQKY\nNmY2X9Kbkh5zzs2Mup5aMLOpkqY6554zs7dL2ibppjT/PUuSmZ3jnPtdabXLn0j6gnPuJ1HXFSYz\nu0PS5ZImOec+FnU9YTOzfkmXO+d+E8brx+0KO3NbjznnnpU0EHUdteSce8U591zp8ZuS+iRNi7aq\n8Dnnfld6eJaOrXIZyj/quDCzCyV9VNJ3JGVpZ47Q3mvcGjZbj2WMmU2XNFvST6OtJHxmNsHMnpP0\nqqRnnHO7oq4pZF+TdKekt6IupIacpKfNrNfMPh30i8etYccnn0HoSnHIOklNpSvtVHPOveWcmyXp\nQkl/ama5iEsKjZndIGm/c26HsnV1fbVzbrakP5f02VLkGZi4NeyKth5D8pnZ2yR9X9I/Oueeirqe\nWnLODUr6oaQroq4lRH8i6WOlTLdD0p+Z2WMR1xQ659z/lv77mqQuHYt5AxO3ht0r6Y/NbLqZnSVp\nsaQNEdeEgNmxnYYfkrTLOff1qOupBTObYmbnlx5PlHSdpB3RVhUe59xq59xFzrmLJX1C0lbn3Cej\nritMZnaOmU0qPT5X0kckBTr9FauG7Zw7Iqm89dguSZ0ZmBzokPSvki4xs5fM7FNR11QDV0u6TdI1\npfGnHWZ2fdRFhewCSVtLGfZPJW10zm2JuKZaykLc+W5Jz474O+52zm0O8gfEaqwPADC2WF1hAwDG\nRsMGgISgYQNAQtCwASAhaNgAkBA0bABICBo2ACQEDRsAEuL/AFYNWgmCywNbAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.contour(X, Y, phi_X, levels=[1,4])\n", "plt.scatter(phi(x1)[1], phi(x1)[2], c='r')\n", "plt.scatter(phi(x2)[1], phi(x2)[2], c='g')\n", "plt.scatter(phi(x3)[1], phi(x3)[2], c='b')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optimizacijski postupak\n", "\n", "* Ništa se ne mijenja u odnosu na ono što smo već izveli, samo umjesto $\\mathbf{X}$ imamo dizajn-matricu $\\boldsymbol{\\Phi}$\n", "\n", "\n", "* Dizajn-matrica:\n", "$$\n", "\\boldsymbol{\\Phi} = \n", "\\begin{pmatrix}\n", "1 & \\phi_1(\\mathbf{x}^{(1)}) & \\dots & \\phi_m(\\mathbf{x}^{(1)})\\\\\n", "1 & \\phi_1(\\mathbf{x}^{(2)}) & \\dots & \\phi_m(\\mathbf{x}^{(2)})\\\\\n", "\\vdots\\\\\n", "1 & \\phi_1(\\mathbf{x}^{(N)}) & \\dots & \\phi_m(\\mathbf{x}^{(N)})\\\\\n", "\\end{pmatrix}_{N\\times m}\n", "=\n", "\\begin{pmatrix}\n", "\\mathbf{\\phi}(\\mathbf{x}^{(1)})^\\intercal \\\\\n", "\\mathbf{\\phi}(\\mathbf{x}^{(2)})^\\intercal \\\\\n", "\\vdots\\\\\n", "\\mathbf{\\phi}(\\mathbf{x}^{(N)})^\\intercal \\\\\n", "\\end{pmatrix}_{N\\times m}\n", "$$\n", "\n", "* Prije smo imali:\n", "$$\n", "\\mathbf{w} = (\\mathbf{X}^\\intercal\\mathbf{X})^{-1}\\mathbf{X}^\\intercal\\mathbf{y} = \\color{red}{\\mathbf{X}^{+}}\\mathbf{y}\n", "$$\n", "a sada imamo:\n", "$$\n", "\\mathbf{w} = (\\boldsymbol{\\Phi}^\\intercal\\boldsymbol{\\Phi})^{-1}\\boldsymbol{\\Phi}^\\intercal\\mathbf{y} = \\color{red}{\\boldsymbol{\\Phi}^{+}}\\mathbf{y}\n", "$$\n", "gdje\n", "$$\n", "\\boldsymbol{\\Phi}^{+}=(\\boldsymbol{\\Phi}^\\intercal\\boldsymbol{\\Phi})^{-1}\\boldsymbol{\\Phi}^\\intercal\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Odabir modela" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Poopćeni linearan model regresije ima jedan **hiperparametar**: funkciju preslikavanje $\\boldsymbol{\\phi}$\n", "\n", "\n", "* Alternativno, možemo reći da se radi o dva hiperparametra:\n", " * izgled baznih funkcija $\\phi_j$\n", " * broj baznih funkcija $m$ (dimenzija prostora značajki)\n", "\n", "\n", "* Hiperparametre treba namjestiti tako da odgovaraju podatcima, odnosno treba\n", "dobro **odabrati model**\n", "\n", "\n", "* U suprotnom model može biti **podnaučen** ili **prenaučen**\n", "\n", "\n", "* Ako model ima mnogo parametra, lako ga je prenaučiti\n", "\n", "\n", "* Sprečavanje prenaučenosti:\n", " 1. Koristiti više primjera za učenje\n", " 2. Odabrati model unakrsnom provjerom\n", " 3. **Regularizacija**\n", " 4. Bayesovska regresija (bayesovski odabir modela) $\\Rightarrow$ nećemo raditi\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Regularizirana regresija\n", "\n", "\n", "### Ideja\n", "\n", "* Opažanje: kod linearnih modela, što je model složeniji, to ima veće vrijednosti parametara $\\mathbf{w}$\n", "\n", "\n", "* Prenaučeni linearni modeli imaju:\n", " * ukupno previše parametara (težina) i/ili\n", " * prevelike vrijednosti pojedinačnih parametara\n", "\n", "\n", "* Ideja: **ograničiti rast vrijednosti parametara** kažnjavanjem hipoteza s visokim vrijednostima parametara\n", "\n", "\n", "* Time ostvarujemo **kompromis** između točnosti i jednostavnosti modela i to već **pri samom učenju** modela\n", "\n", "\n", "* Efektivno se **graničava složenost** modela i sprečava se prenaučenost\n", "\n", "\n", "* Cilj: što više parametara (težina) pritegnuti na nulu $\\Rightarrow$ **rijetki modeli** (engl. *sparse models*)\n", "\n", "\n", "* Rijetki modeli su:\n", " * teži za prenaučiti\n", " * računalno jednostavniji\n", " * interpretabilniji\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Regularizacija\n", "\n", "* U funkciju pogreške (koju minimiziramo) ugrađujemo mjeru složenosti modela:\n", "\n", "$$\n", " E' = \\textrm{empirijska pogreška} + \\color{red}{\\lambda\\times\\textrm{složenost modela}}\n", "$$\n", "\n", "$$\n", " E'(\\mathbf{w}|\\mathcal{D}) = E(\\mathbf{w}|\\mathcal{D}) + \\underbrace{\\color{red}{\\lambda E_R(\\mathbf{w})}}_{\\text{reg. izraz}}\n", "$$\n", "\n", "* $\\lambda$ je **regularizacijski faktor**\n", " * $\\lambda=0\\ \\Rightarrow$ neregularizirana funkcija pogreške\n", " * Veća vrijednost regularizacijskog faktora $\\lambda$ uzrokuje smanjuje efektivne složenost modela\n", "\n", "\n", "* [Skica: Regularizirana regresija]\n", "\n", "\n", "* Općenit regularizacijski izraz: **p-norma vektora težina**\n", "$$\n", " E_R(\\mathbf{w}) = \\|\\mathbf{w}\\|_p = \\Big(\\sum_{j=\\color{red}{1}}^m |w_j|^p\\Big)^{\\frac{1}{p}}\n", "$$\n", "\n", "\n", "* L2-norma ($p=2$):\n", "$$\\|\\mathbf{w}\\|_2 = \\sqrt{\\sum_{j=\\color{red}{1}}^m w_j^2} = \\sqrt{\\mathbf{w}^\\intercal\\mathbf{w}}$$\n", "\n", "\n", "* L1-norma ($p=1$):\n", "$$\\|\\mathbf{w}\\|_1 = \\sum_{j=\\color{red}{1}}^m |w_j|$$\n", "\n", "\n", "* L0-norma ($p=0$):\n", "$$\\|\\mathbf{w}\\|_0 = \\sum_{j=\\color{red}{1}}^m \\mathbf{1}\\{w_j\\neq 0\\}$$\n", "\n", "\n", "* **NB:** Težina $w_0$ se ne regularizira\n", " * **Q:** Zašto?\n", " \n", " \n", "\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Regularizirani linearni model regresije\n", " \n", "* **L2-regularizacija** ili Tikhononova regularizacija $\\Rightarrow$ **Ridge regression**:\n", "$$\n", "E(\\mathbf{w}|\\mathcal{D})=\\frac{1}{2}\n", "\\sum_{i=1}^N\\big(\\mathbf{w}^\\intercal\\boldsymbol{\\phi}(\\mathbf{x}^{(i)}) - y^{(i)}\\big)^2\n", "+ \\color{red}{\\frac{\\lambda}{2}\\|\\mathbf{w}\\|^2_2}\n", "$$\n", " * ima rješenje u zatvorenoj formi\n", " \n", "\n", "* **L1-regularizacija** $\\Rightarrow$ **LASSO regularization** (engl. *least absolute shrinkage and selection operator*)\n", "$$\n", "E(\\mathbf{w}|\\mathcal{D})=\\frac{1}{2}\n", "\\sum_{i=1}^N\\big(\\mathbf{w}^\\intercal\\boldsymbol{\\phi}(\\mathbf{x}^{(i)}) - y^{(i)}\\big)^2\n", "+ \\color{red}{\\frac{\\lambda}{2}\\|\\mathbf{w}\\|_1}\n", "$$\n", " * nema rješenje u zatvorenoj formi!\n", "\n", "\n", "* **L0-regularizacija**\n", "$$\n", "E(\\mathbf{w}|\\mathcal{D})=\\frac{1}{2}\n", "\\sum_{i=1}^N\\big(\\mathbf{w}^\\intercal\\mathbf{\\phi}(\\mathbf{x}^{(i)}) - y^{(i)}\\big)^2\n", "+ \\color{red}{\\frac{\\lambda}{2}\\sum_{j=1}^m\\mathbf{1}\\{w_j\\neq0\\}}\n", "$$\n", " * NP-potpun problem!\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### L2-regularizacija\n", "\n", "* Linearna regresija sa L2-regularizacijom ima rješenje u zatvorenoj formi:\n", "\n", "$$\n", "\\begin{align*}\n", "E'(\\mathbf{w}|\\mathcal{D}) &= \\frac{1}{2}\n", "(\\boldsymbol{\\Phi}\\mathbf{w} - \\mathbf{y})^\\intercal\n", "(\\boldsymbol{\\Phi}\\mathbf{w} - \\mathbf{y}) + \\color{red}{\\frac{\\lambda}{2}\\mathbf{w}^\\intercal\\mathbf{w}}\\\\\n", "&=\n", "\\frac{1}{2}\n", "(\\mathbf{w}^\\intercal\\boldsymbol{\\Phi}^\\intercal\\boldsymbol{\\Phi}\\mathbf{w} - 2\\mathbf{y}^\\intercal\\boldsymbol{\\Phi}\\mathbf{w} + \\mathbf{y}^\\intercal\\mathbf{y}\n", "+ \\color{red}{\\lambda\\mathbf{w}^\\intercal\\mathbf{w}})\\\\\n", "\\nabla_{\\mathbf{w}}E' &= \n", "\\boldsymbol{\\Phi}^\\intercal\\boldsymbol{\\Phi}\\mathbf{w} - \\boldsymbol{\\Phi}^\\intercal\\mathbf{y} + \\color{red}{\\lambda\\mathbf{w}} \\\\\n", "&=\n", "(\\boldsymbol{\\Phi}^\\intercal\\boldsymbol{\\Phi} + \\color{red}{\\lambda\\mathbf{I}})\\mathbf{w} - \\boldsymbol{\\Phi}^\\intercal\\mathbf{y} = 0 \\\\\n", "\\mathbf{w} &= (\\boldsymbol{\\Phi}^\\intercal\\boldsymbol{\\Phi} + \\color{red}{\\lambda\\mathbf{I}})^{-1}\\boldsymbol{\\Phi}^\\intercal\\mathbf{y}\\\\\n", "\\end{align*}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Napomene\n", "\n", "* Iznos parametra $w_j$ odgovara važnosti značajke, a predznak upućuje na njezin utjecaj (pozitivan ili negativan) na izlaznu vrijednost\n", "\n", "\n", "* Regularizacija smanjuje složenost modela na način da prigušuje vrijednosti pojedinih značajki, odnosno efektivno ih izbacuje (kada $w_j\\to0$)\n", " * Ako je model nelinearan, to znači smanjivanje nelinearnosti\n", " \n", " \n", "* Težinu $w_0$ treba izuzeti iz regularizacijskog izraza (jer ona definira pomak) ili treba centrirati podatke tako da $\\overline{y}=0$, jer onda $w_0\\to0$\n", "\n", "\n", "* L2-regularizacija kažnjava težine proporcionalno njihovom iznosu (velike težine više, a manje težine manje) Teško će parametri biti pritegnuti baš na nulu. Zato **L2-regularizacija ne rezultira rijetkim modelima**\n", "\n", "\n", "* L1-regularizirana regresija rezultira rijetkim modelima, ali nema rješenja u zatvorenoj formi (međutim mogu se koristiti iterativni optimizacijski postupci\n", "\n", "\n", "* Regularizacija je korisna kod modela s puno parametara, jer je takve modele lako prenaučiti\n", "\n", "\n", "* Regularizacija smanjuje mogućnost prenaučenosti, ali ostaje problem odabira hiperparametra $\\lambda$\n", " * Taj se odabir najčešće radi **unakrsnom provjerom**\n", " \n", " \n", "* **Q:** Koju optimalnu vrijednost za $\\lambda$ bismo dobili kada bismo optimizaciju radili na skupu za učenje?\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Sažetak\n", "\n", "\n", "* **Linearan model regresije** linearan je u parametrima\n", "\n", "\n", "* Parametri linearnog modela uz kvadratnu funkciju gubitka imaju rješenje u zatvorenoj formi u obliku **pseudoinverza dizajn-matrice**\n", "\n", "\n", "* Nelinearnost regresijske funkcije ostvaruje se uporabom nelinearnih **baznih funkcija** (preslikavanjem ulaznog prostora u prostor značajki\n", "\n", "\n", "* Uz pretpostavku normalno distribuiranog šuma, **MLE je istovjetan postupku najmanjih kvadrata**, što daje probabilističko opravdanje za uporabu kvadratne funkcije gubitka\n", "\n", "\n", "* **Regularizacija smanjuje prenaučenost** ugradnjom dodatnog izraza u funkciju pogreške kojim se kažnjava složenost modela\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }