{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Task 14 \n", "\n", "Using data on school students, determine their level of alcohol dependence. The data, taken from the UCI 'Students' dataset (the original sample has been withdrawn from UCI but remains available from other sources), contains 30 features for each student, including social and gender-related ones, as well as family financial situation and amount of free time. Choose the features you consider most significant and predict the level of alcohol consumption on weekends or weekdays on a scale from 0 to 5. \n", "\n", "Data: https://github.com/amanchoudhary/student-alcohol-consumption-prediction" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "from sklearn.preprocessing import OneHotEncoder, scale\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.linear_model import LinearRegression\n", "from sklearn.metrics import mean_squared_error, accuracy_score\n", "import seaborn as sns" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | school | sex | age | address | famsize | Pstatus | Medu | Fedu | Mjob | Fjob | ... | famrel | freetime | goout | Dalc | Walc | health | absences | G1 | G2 | G3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GP | F | 18 | U | GT3 | A | 4 | 4 | at_home | teacher | ... | 4 | 3 | 4 | 1 | 1 | 3 | 4 | 0 | 11 | 11 |
| 1 | GP | F | 17 | U | GT3 | T | 1 | 1 | at_home | other | ... | 5 | 3 | 3 | 1 | 1 | 3 | 2 | 9 | 11 | 11 |
| 2 | GP | F | 15 | U | LE3 | T | 1 | 1 | at_home | other | ... | 4 | 3 | 2 | 2 | 3 | 3 | 6 | 12 | 13 | 12 |

3 rows × 33 columns

|   | GP_school | is_female | age | address_is_rural | big_family | parents_live_apart | Medu | Fedu | Mjob | Fjob | ... | famrel | freetime | goout | Dalc | Walc | health | absences | G1 | G2 | G3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 18 | 0 | 1 | 1 | 4 | 4 | at_home | teacher | ... | 4 | 3 | 4 | 1 | 1 | 3 | 4 | 0 | 11 | 11 |
| 1 | 1 | 1 | 17 | 0 | 1 | 0 | 1 | 1 | at_home | other | ... | 5 | 3 | 3 | 1 | 1 | 3 | 2 | 9 | 11 | 11 |
| 2 | 1 | 1 | 15 | 0 | 0 | 0 | 1 | 1 | at_home | other | ... | 4 | 3 | 2 | 2 | 3 | 3 | 6 | 12 | 13 | 12 |

3 rows × 33 columns
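
The preview above shows the two-valued columns (`school`, `sex`, `address`, `famsize`, `Pstatus`) recoded as 0/1 indicators under new names. The code that produced it is not part of this extract, so the following is only a minimal sketch of one way to get the same result with pandas. The mapping is inferred from the three visible rows (`GP` → 1, `F` → 1, `U` → 0, `GT3` → 1, `A` → 1); treating `R` as the value encoded as 1 for `address_is_rural` is an assumption, and the names `BINARY_COLUMNS` and `encode_binary` are hypothetical.

```python
import pandas as pd

# Hypothetical mapping: new column name -> (original column, value encoded as 1).
# Inferred from the previews; 'R' for a rural address is an assumption.
BINARY_COLUMNS = {
    'GP_school': ('school', 'GP'),
    'is_female': ('sex', 'F'),
    'address_is_rural': ('address', 'R'),
    'big_family': ('famsize', 'GT3'),
    'parents_live_apart': ('Pstatus', 'A'),
}


def encode_binary(df):
    # Return a copy of df where each two-valued column is replaced
    # by a 0/1 indicator column with a descriptive name.
    out = df.copy()
    for new_name, (old_name, positive) in BINARY_COLUMNS.items():
        out[new_name] = (out[old_name] == positive).astype(int)
        out = out.drop(columns=old_name)
    return out


# The three rows from the first preview, restricted to the recoded columns.
sample = pd.DataFrame({
    'school': ['GP', 'GP', 'GP'],
    'sex': ['F', 'F', 'F'],
    'address': ['U', 'U', 'U'],
    'famsize': ['GT3', 'GT3', 'LE3'],
    'Pstatus': ['A', 'T', 'T'],
})
encoded = encode_binary(sample)
print(encoded)
```

Unlike the preview, this sketch appends the new columns at the end rather than keeping the original column positions; the values themselves agree with the three rows shown.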

|   | GP_school | is_female | age | address_is_rural | big_family | parents_live_apart | Medu | Fedu | traveltime | studytime | ... | x0_teacher | x1_at_home | x1_health | x1_services | x1_teacher | x2_course | x2_home | x2_reputation | x3_father | x3_mother |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 1.0 | 18.0 | 0.0 | 1.0 | 1.0 | 4.0 | 4.0 | 2.0 | 2.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1 | 1.0 | 1.0 | 17.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 2.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 2 | 1.0 | 1.0 | 15.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 2.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |

3 rows × 42 columns
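
In the preview above the multi-valued categorical columns have been expanded into indicator columns named `x0_…` to `x3_…`, and no `…_other` column appears, which suggests that the `other` level was dropped as the reference category. The notebook imports `OneHotEncoder`, but the encoding code itself is not in this extract, so the sketch below swaps in `pandas.get_dummies` to illustrate the same idea. It assumes the encoded columns are `Mjob`, `Fjob`, `reason` and `guardian` in that order; the `reason` and `guardian` values in the sample are inferred from the indicator columns in the preview, and the names `CATEGORICAL` and `one_hot` are hypothetical.

```python
import pandas as pd

# Assumed categorical columns, in the order matching prefixes x0..x3.
CATEGORICAL = ['Mjob', 'Fjob', 'reason', 'guardian']


def one_hot(df):
    # Expand each categorical column into float indicator columns,
    # then drop the 'other' level so it acts as the reference category.
    prefixes = ['x%d' % i for i in range(len(CATEGORICAL))]
    out = pd.get_dummies(df, columns=CATEGORICAL, prefix=prefixes, dtype=float)
    reference = [c for c in out.columns if c.endswith('_other')]
    return out.drop(columns=reference)


# Three rows consistent with the previews (reason/guardian are inferred).
sample = pd.DataFrame({
    'age': [18, 17, 15],
    'Mjob': ['at_home', 'at_home', 'at_home'],
    'Fjob': ['teacher', 'other', 'other'],
    'reason': ['course', 'course', 'other'],
    'guardian': ['mother', 'father', 'mother'],
})
encoded = one_hot(sample)
print(encoded)
```

`get_dummies` only creates columns for levels that occur in the data it is given, so on this three-row sample the result has fewer indicator columns than the full preview.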

|   | GP_school | is_female | age | address_is_rural | big_family | parents_live_apart | traveltime | studytime | failures | schoolsup | ... | x1_at_home | x1_health | x1_services | x1_teacher | x2_course | x2_home | x2_reputation | x3_father | x3_mother | parents_education |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.730944 | 0.833377 | 1.031695 | -0.660182 | 0.648175 | 2.666927 | 0.576718 | 0.083653 | -0.374305 | 2.923032 | ... | -0.263045 | -0.19168 | -0.621894 | 4.126473 | 1.130130 | -0.545894 | -0.53161 | -0.555399 | 0.652973 | 1.568580 |
| 1 | 0.730944 | 0.833377 | 0.210137 | -0.660182 | 0.648175 | -0.374963 | -0.760032 | 0.083653 | -0.374305 | -0.342110 | ... | -0.263045 | -0.19168 | -0.621894 | -0.242338 | 1.130130 | -0.545894 | -0.53161 | 1.800508 | -1.531457 | -1.392181 |
| 2 | 0.730944 | 0.833377 | -1.432980 | -0.660182 | -1.542792 | -0.374963 | -0.760032 | 0.083653 | -0.374305 | 2.923032 | ... | -0.263045 | -0.19168 | -0.621894 | -0.242338 | -0.884854 | -0.545894 | -0.53161 | -0.555399 | 0.652973 | -1.392181 |

3 rows × 37 columns
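
The values in the preview above look like column-wise z-scores, which would be consistent with the `scale` function imported at the top of the notebook. The scaling code is not part of this extract, so the following is only a sketch of standardizing a feature table with `sklearn.preprocessing.scale`. The sample uses two columns from the earlier previews; because the mean and standard deviation are computed from these three rows rather than from the full dataset, the numbers will not match the preview.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import scale

# Two feature columns taken from the earlier previews (three rows only).
features = pd.DataFrame({
    'age': [18.0, 17.0, 15.0],
    'parents_live_apart': [1.0, 0.0, 0.0],
})

# scale() centers each column to zero mean and divides by its
# population standard deviation (ddof=0); it returns a NumPy array,
# so the column names and index are restored by hand.
scaled = pd.DataFrame(
    scale(features),
    columns=features.columns,
    index=features.index,
)
print(scaled.round(3))
```

Computing the scaling statistics on the whole table before a train/test split lets information from the test rows leak into training; fitting a `StandardScaler` on the training part only is a common alternative.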
\n", "