{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# \ud83d\udcdd Exercise M1.02\n",
    "\n",
    "The goal of this exercise is to fit a similar model as in the previous\n",
    "notebook to get familiar with manipulating scikit-learn objects and in\n",
    "particular the `.fit/.predict/.score` API."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's load the adult census dataset with only numerical variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "adult_census = pd.read_csv(\"../datasets/adult-census-numeric.csv\")\n",
    "data = adult_census.drop(columns=\"class\")\n",
    "target = adult_census[\"class\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the previous notebook we used `model = KNeighborsClassifier()`. All\n",
    "scikit-learn models can be created without arguments. This is convenient\n",
    "because it means that you don't need to understand the full details of a model\n",
    "before starting to use it.\n",
    "\n",
    "One of the `KNeighborsClassifier` parameters is `n_neighbors`. It controls the\n",
    "number of neighbors we are going to use to make a prediction for a new data\n",
    "point.\n",
    "\n",
    "What is the default value of the `n_neighbors` parameter?\n",
    "\n",
    "**Hint**: Look at the documentation on the [scikit-learn\n",
    "website](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html)\n",
    "or directly access the description inside your notebook by running the\n",
    "following cell. This opens a pager pointing to the documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "KNeighborsClassifier?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a `KNeighborsClassifier` model with `n_neighbors=50`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fit this model on the data and target loaded above"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use your model to make predictions on the first 10 data points inside the\n",
    "data. Do they match the actual target values?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compute the accuracy on the training data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now load the test data from `\"../datasets/adult-census-numeric-test.csv\"` and\n",
    "compute the accuracy on the test data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "main_language": "python"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}