{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Classification - logistic regression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A classic example of classification is sorting irises into three species (setosa, versicolor, virginica) by the size of their petals and sepals. Below I use logistic regression for the classification.\n", "\n", "

### The idea of logistic regression
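
As a rough sketch of the idea, assuming the simplest two-class case: logistic regression computes a linear score from the features and passes it through the logistic (sigmoid) function, which turns the score into a probability between 0 and 1. The coefficients below are hypothetical, hand-picked values for illustration only; they are not fitted to the iris data, and `predict_proba` here is a local helper, not a library call.

```python
import math


def sigmoid(z):
    # Logistic function: maps any real-valued score into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    # Linear score, built the same way as in ordinary regression.
    score = bias + sum(w * x for w, x in zip(weights, features))
    # The score is converted into the probability of the positive class.
    return sigmoid(score)


# Hypothetical coefficients (assumption, not fitted): petal_length, petal_width.
weights = [-2.0, -1.5]
bias = 6.0

# Petal measurements of the first row of the iris data.
p = predict_proba([1.4, 0.2], weights, bias)
label = 'setosa' if p >= 0.5 else 'not setosa'
print(round(p, 3), label)
```

With three species, the same idea is typically extended so that each class gets its own score and the probabilities sum to one.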

\n", "\n", "Logistic regression is a classification method and should not be confused with ordinary regression.\n", "\n", "Logistic regression also gives the probability of belonging to each class.\n", "\n", "The basic idea of logistic regression is hard to explain briefly. Read my article on the topic: https://tilastoapu.wordpress.com/2014/04/25/logistinen-regressio/" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "%matplotlib inline\n", "\n", "# Affects the appearance of the charts:\n", "sns.set()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
\n", "