{ "cells": [ { "cell_type": "markdown", "id": "91c570c3", "metadata": {}, "source": [ "Intro\n", "\n", "- definition: classification problem\n", "    - $(x,c) \\in X \\times C$, $X = R^d$, $C = Z_r$\n", "    - joint probability $P(x,c)$\n", "- classifier\n", "- training set $T = \\\\{(x_1,c_1),...,(x_N,c_N)\\\\}$\n", "- test set\n", "- error rate\n", "- empirical error rate\n", "\n", "Bayes\n", "\n", "- Bayes formula\n", "- Bayes risk\n", "- zero-one loss function, posterior\n", "\n", "Classifiers\n", "\n", "- classifier: $c = C(x)$\n", "- discriminant functions: $c = \\arg\\max_c g_c(x)$\n", "- posterior probabilities: $c = \\arg\\max_c P(c|x)$\n", "\n", "Nearest Neighbor Classifier\n", "\n", "- $C_{NN}(x) = c_k$ where $k = \\arg\\min_i ||x_i-x||$ for $(x_i,c_i)$ in the training set\n", "- asymptotically (as $N \\to \\infty$), $C_{NN}$ has at most twice the Bayes error rate (why?)\n", "- k-NN classifier: use a majority vote among the $k$ nearest neighbors\n", "- remember $||x-y||^2 = ||x||^2+||y||^2-2x\\cdot y$" ] }, { "cell_type": "markdown", "id": "8616d2c8", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "id": "8f798ee6", "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": {}, "nbformat": 4, "nbformat_minor": 5 }