{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data Analysis Minor, Group IAD-4\n",
"## 19/10/2017 Neural Networks Practice"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using TensorFlow backend.\n"
]
}
],
"source": [
"from keras.models import Sequential\n",
"from keras.layers import Dense\n",
"\n",
"import numpy\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"\n",
"from sklearn.metrics import roc_auc_score, roc_curve\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"RND_SEED = 7\n",
"plt.style.use('ggplot')\n",
"\n",
"numpy.random.seed(RND_SEED)\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Classification"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading the data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For practice we will use the well-known [Pima Indians](http://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes) dataset.\n",
"\n",
"The columns are eight features followed by the class label:\n",
"1. Number of times pregnant\n",
"2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test\n",
"3. Diastolic blood pressure (mm Hg)\n",
"4. Triceps skin fold thickness (mm)\n",
"5. 2-Hour serum insulin (mu U/ml)\n",
"6. Body mass index (weight in kg/(height in m)^2)\n",
"7. Diabetes pedigree function\n",
"8. Age (years)\n",
"9. Class variable (0 or 1)\n"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data',\n",
" sep=',', header=None)"
]
},
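{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the file is read with `header=None`, the columns are simply numbered 0 to 8. A minimal sketch of attaching readable names and separating the eight features from the class label could look like the following. The short names in `COLUMN_NAMES` are hypothetical (they are not part of the dataset), and the two rows are copied from the dataset so that the example runs without a download.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical short names for the nine columns described above;\n",
"# the dataset file itself has no header row.\n",
"COLUMN_NAMES = [\n",
"    'pregnancies', 'glucose', 'blood_pressure', 'skin_thickness',\n",
"    'insulin', 'bmi', 'pedigree', 'age', 'target',\n",
"]\n",
"\n",
"# Two rows of the dataset, inlined so the sketch runs offline.\n",
"rows = [\n",
"    [6, 148, 72, 35, 0, 33.6, 0.627, 50, 1],\n",
"    [1, 85, 66, 29, 0, 26.6, 0.351, 31, 0],\n",
"]\n",
"df_named = pd.DataFrame(rows, columns=COLUMN_NAMES)\n",
"\n",
"# Feature matrix (first eight columns) and class label (last column).\n",
"X = df_named.drop(columns='target').values\n",
"y = df_named['target'].values\n",
"```\n",
"\n",
"With the real file, the same names could instead be passed to `pd.read_csv` through its `names` argument together with `header=None`."
]
},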
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>6</td><td>148</td><td>72</td><td>35</td><td>0</td><td>33.6</td><td>0.627</td><td>50</td><td>1</td></tr>\n",
"<tr><th>1</th><td>1</td><td>85</td><td>66</td><td>29</td><td>0</td><td>26.6</td><td>0.351</td><td>31</td><td>0</td></tr>\n",
"<tr><th>2</th><td>8</td><td>183</td><td>64</td><td>0</td><td>0</td><td>23.3</td><td>0.672</td><td>32</td><td>1</td></tr>\n",
"<tr><th>3</th><td>1</td><td>89</td><td>66</td><td>23</td><td>94</td><td>28.1</td><td>0.167</td><td>21</td><td>0</td></tr>\n",
"<tr><th>4</th><td>0</td><td>137</td><td>40</td><td>35</td><td>168</td><td>43.1</td><td>2.288</td><td>33</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"