{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MLSP 2014 Schizophrenia Classification Challenge\n",
    "\n",
    "For details, see https://www.kaggle.com/c/mlsp-2014-mri/overview\n",
    "\n",
    "## submission #842079"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from os import path"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data\n",
    "\n",
    "For more details, see https://www.kaggle.com/c/mlsp-2014-mri/data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### About FNC Features\n",
    "\n",
    "Functional Network Connectivity (FNC) features are correlation values that summarize the overall connection between independent brain maps over time. They therefore give a picture of the connectivity pattern over time between independent networks (or brain maps). The provided FNC information was obtained from functional magnetic resonance imaging (fMRI) of a set of schizophrenic patients and healthy controls at rest, using group independent component analysis (GICA). The GICA decomposition of the fMRI data resulted in a set of brain maps and corresponding timecourses. These timecourses indicate the activity level of the corresponding brain map at each point in time. The FNC features are the correlations between these timecourses. In a way, FNC indicates a subject's overall level of 'synchronicity' between brain areas. Because this information is derived from functional MRI scans, FNCs are considered a functional modality feature (i.e., they describe patterns of brain function). More about FNCs can be found here: [FNC paper](http://cercor.oxfordjournals.org/content/early/2012/11/09/cercor.bhs352.abstract)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "FNC_train = pd.read_csv(path.join('Train', 'train_FNC.csv'), index_col=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### About SBM Loadings\n",
    "\n",
    "Source-Based Morphometry (SBM) loadings correspond to the weights of brain maps obtained from the application of independent component analysis (ICA) on the gray-matter concentration maps of all subjects. Gray matter corresponds to the outer sheet of the brain; it is the brain region in which much of the brain signal processing actually occurs. In a way, the concentration of gray matter is indicative of the \"computational power\" available in a certain region of the brain. Processing gray-matter concentration maps with ICA yields independent brain maps whose expression levels (i.e., loadings) vary across subjects. Simply put, a near-zero loading for a given ICA-derived brain map indicates that the brain regions outlined in that map are weakly expressed in the subject (i.e., the gray-matter concentration in those regions is very low in that subject). Because this information is derived from structural MRI scans, SBM loadings are considered a structural modality feature (i.e., they describe patterns of brain structure). More about SBM loadings can be found here: [SBM paper](http://www.ncbi.nlm.nih.gov/pubmed/22470337)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "SBM_train = pd.read_csv(path.join('Train', 'train_SBM.csv'), index_col=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Labels for the training set\n",
    "\n",
    "The labels are indicated in the \"Class\" column: 0 = 'Healthy Control', 1 = 'Schizophrenic Patient'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "labels = pd.read_csv(path.join('Train', 'train_labels.csv'), index_col=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Feature selection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.feature_selection import SelectKBest, f_classif"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "FNC_selector = SelectKBest(f_classif).fit(FNC_train, labels['Class'])\n",
    "SBM_selector = SelectKBest(f_classif).fit(SBM_train, labels['Class'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 1296x432 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "fig = plt.figure(figsize=(18,6))\n",
    "plot = fig.add_subplot(111, xlabel='feature number', ylabel='score', title='Univariate FNC feature scores')\n",
    "indices = np.arange(FNC_train.shape[-1])\n",
    "# Score each feature as -log10(p-value), scaled so the best feature scores 1\n",
    "FNC_scores = -np.log10(FNC_selector.pvalues_)\n",
    "FNC_scores /= FNC_scores.max()\n",
    "plot.plot(indices, FNC_scores, 'r^')\n",
    "plot.vlines(indices, [0], FNC_scores)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 1296x432 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "fig = plt.figure(figsize=(18,6))\n",
    "plot = fig.add_subplot(111, xlabel='feature number', ylabel='score', title='Univariate SBM feature scores')\n",
    "indices = np.arange(SBM_train.shape[-1])\n",
    "# Score each feature as -log10(p-value), scaled so the best feature scores 1\n",
    "SBM_scores = -np.log10(SBM_selector.pvalues_)\n",
    "SBM_scores /= SBM_scores.max()\n",
    "plot.plot(indices, SBM_scores, 'b^')\n",
    "plot.vlines(indices, [0], SBM_scores)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For this submission I selected 60 FNC features and 8 SBM features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# get_support() reads k at call time, so k can be set after the selectors are fitted\n",
    "FNC_selector.k = 60\n",
    "SBM_selector.k = 8"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "FNC_features = FNC_train.columns[FNC_selector.get_support()]\n",
    "SBM_features = SBM_train.columns[SBM_selector.get_support()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train = pd.concat([FNC_train[FNC_features], SBM_train[SBM_features]], axis=1)\n",
    "y_train = labels['Class']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>FNC13</th>\n",
       "      <th>FNC30</th>\n",
       "      <th>FNC33</th>\n",
       "      <th>FNC35</th>\n",
       "      <th>FNC37</th>\n",
       "      <th>FNC38</th>\n",
       "      <th>FNC40</th>\n",
       "      <th>FNC41</th>\n",
       "      <th>FNC42</th>\n",
       "      <th>FNC43</th>\n",
       "      <th>...</th>\n",
       "      <th>FNC353</th>\n",
       "      <th>FNC368</th>\n",
       "      <th>SBM_map7</th>\n",
       "      <th>SBM_map17</th>\n",
       "      <th>SBM_map36</th>\n",
       "      <th>SBM_map52</th>\n",
       "      <th>SBM_map61</th>\n",
       "      <th>SBM_map64</th>\n",
       "      <th>SBM_map67</th>\n",
       "      <th>SBM_map75</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>120873</th>\n",
       "      <td>0.270490</td>\n",
       "      <td>0.036615</td>\n",
       "      <td>0.21516</td>\n",
       "      <td>0.069346</td>\n",
       "      <td>-0.086613</td>\n",
       "      <td>0.054857</td>\n",
       "      <td>-0.365520</td>\n",
       "      <td>-0.273410</td>\n",
       "      <td>-0.275500</td>\n",
       "      <td>-0.035595</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.23049</td>\n",
       "      <td>-0.060204</td>\n",
       "      <td>-0.264192</td>\n",
       "      <td>0.137624</td>\n",
       "      <td>-1.062109</td>\n",
       "      <td>0.791762</td>\n",
       "      <td>-0.982331</td>\n",
       "      <td>1.070363</td>\n",
       "      <td>0.220316</td>\n",
       "      <td>-0.002006</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>135376</th>\n",
       "      <td>-0.088119</td>\n",
       "      <td>0.450290</td>\n",
       "      <td>0.70298</td>\n",
       "      <td>0.543640</td>\n",
       "      <td>0.244000</td>\n",
       "      <td>0.512400</td>\n",
       "      <td>0.439300</td>\n",
       "      <td>0.125780</td>\n",
       "      <td>0.191420</td>\n",
       "      <td>-0.058085</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.31699</td>\n",
       "      <td>0.369300</td>\n",
       "      <td>-0.466051</td>\n",
       "      <td>0.972934</td>\n",
       "      <td>0.044317</td>\n",
       "      <td>-0.073326</td>\n",
       "      <td>-0.057543</td>\n",
       "      <td>0.371701</td>\n",
       "      <td>-0.513081</td>\n",
       "      <td>-0.295125</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>139149</th>\n",
       "      <td>-0.361020</td>\n",
       "      <td>0.203270</td>\n",
       "      <td>0.51565</td>\n",
       "      <td>0.114280</td>\n",
       "      <td>0.262910</td>\n",
       "      <td>0.018740</td>\n",
       "      <td>0.088855</td>\n",
       "      <td>-0.211420</td>\n",
       "      <td>0.026982</td>\n",
       "      <td>0.093953</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.17144</td>\n",
       "      <td>0.521330</td>\n",
       "      <td>1.439242</td>\n",
       "      <td>-1.488153</td>\n",
       "      <td>0.414747</td>\n",
       "      <td>-0.910225</td>\n",
       "      <td>0.597229</td>\n",
       "      <td>1.220756</td>\n",
       "      <td>-0.059213</td>\n",
       "      <td>0.350434</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>146791</th>\n",
       "      <td>-0.060777</td>\n",
       "      <td>0.658060</td>\n",
       "      <td>0.64302</td>\n",
       "      <td>0.589520</td>\n",
       "      <td>0.485810</td>\n",
       "      <td>0.574150</td>\n",
       "      <td>0.344040</td>\n",
       "      <td>0.255670</td>\n",
       "      <td>0.091637</td>\n",
       "      <td>0.183470</td>\n",
       "      <td>...</td>\n",
       "      <td>0.27329</td>\n",
       "      <td>0.144460</td>\n",
       "      <td>-0.492673</td>\n",
       "      <td>0.187573</td>\n",
       "      <td>-0.026555</td>\n",
       "      <td>-3.013096</td>\n",
       "      <td>0.829697</td>\n",
       "      <td>-0.450726</td>\n",
       "      <td>-0.791032</td>\n",
       "      <td>0.448966</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>153870</th>\n",
       "      <td>0.048705</td>\n",
       "      <td>0.158000</td>\n",
       "      <td>0.25707</td>\n",
       "      <td>0.152580</td>\n",
       "      <td>-0.105510</td>\n",
       "      <td>-0.234190</td>\n",
       "      <td>-0.127320</td>\n",
       "      <td>0.143880</td>\n",
       "      <td>-0.286530</td>\n",
       "      <td>-0.333980</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.49929</td>\n",
       "      <td>-0.179310</td>\n",
       "      <td>-1.105922</td>\n",
       "      <td>1.961955</td>\n",
       "      <td>-1.027496</td>\n",
       "      <td>0.474353</td>\n",
       "      <td>-0.978412</td>\n",
       "      <td>0.158492</td>\n",
       "      <td>0.889753</td>\n",
       "      <td>-0.551440</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>934330</th>\n",
       "      <td>-0.373800</td>\n",
       "      <td>0.706610</td>\n",
       "      <td>0.53835</td>\n",
       "      <td>0.428510</td>\n",
       "      <td>0.533970</td>\n",
       "      <td>0.317520</td>\n",
       "      <td>-0.042524</td>\n",
       "      <td>0.249370</td>\n",
       "      <td>0.191810</td>\n",
       "      <td>0.184660</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.46915</td>\n",
       "      <td>0.108670</td>\n",
       "      <td>-0.515736</td>\n",
       "      <td>-0.112326</td>\n",
       "      <td>0.458916</td>\n",
       "      <td>-0.516803</td>\n",
       "      <td>0.826910</td>\n",
       "      <td>-0.225792</td>\n",
       "      <td>0.369724</td>\n",
       "      <td>0.567717</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>950671</th>\n",
       "      <td>-0.029000</td>\n",
       "      <td>0.516310</td>\n",
       "      <td>0.41210</td>\n",
       "      <td>0.354370</td>\n",
       "      <td>0.311470</td>\n",
       "      <td>-0.067289</td>\n",
       "      <td>0.011223</td>\n",
       "      <td>0.173140</td>\n",
       "      <td>-0.039187</td>\n",
       "      <td>0.137150</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.42424</td>\n",
       "      <td>0.304690</td>\n",
       "      <td>-0.933527</td>\n",
       "      <td>-0.347191</td>\n",
       "      <td>0.183240</td>\n",
       "      <td>0.914623</td>\n",
       "      <td>0.482055</td>\n",
       "      <td>0.073327</td>\n",
       "      <td>-0.455141</td>\n",
       "      <td>0.977018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>963924</th>\n",
       "      <td>-0.197060</td>\n",
       "      <td>0.011246</td>\n",
       "      <td>0.35735</td>\n",
       "      <td>0.623080</td>\n",
       "      <td>0.317980</td>\n",
       "      <td>-0.081810</td>\n",
       "      <td>-0.102280</td>\n",
       "      <td>0.029382</td>\n",
       "      <td>0.205780</td>\n",
       "      <td>0.374430</td>\n",
       "      <td>...</td>\n",
       "      <td>0.42983</td>\n",
       "      <td>0.357230</td>\n",
       "      <td>-0.523021</td>\n",
       "      <td>1.369376</td>\n",
       "      <td>-0.976704</td>\n",
       "      <td>-1.429466</td>\n",
       "      <td>-0.114078</td>\n",
       "      <td>-0.476524</td>\n",
       "      <td>-0.556896</td>\n",
       "      <td>-0.424864</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>993348</th>\n",
       "      <td>-0.087478</td>\n",
       "      <td>0.420390</td>\n",
       "      <td>0.29361</td>\n",
       "      <td>0.026402</td>\n",
       "      <td>-0.041024</td>\n",
       "      <td>0.355390</td>\n",
       "      <td>0.163750</td>\n",
       "      <td>-0.186250</td>\n",
       "      <td>-0.297480</td>\n",
       "      <td>0.179980</td>\n",
       "      <td>...</td>\n",
       "      <td>0.15890</td>\n",
       "      <td>0.462660</td>\n",
       "      <td>0.462689</td>\n",
       "      <td>-1.749746</td>\n",
       "      <td>-0.385862</td>\n",
       "      <td>0.839745</td>\n",
       "      <td>-0.163926</td>\n",
       "      <td>0.953385</td>\n",
       "      <td>0.402673</td>\n",
       "      <td>-0.421040</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>993946</th>\n",
       "      <td>-0.061554</td>\n",
       "      <td>-0.125080</td>\n",
       "      <td>0.15613</td>\n",
       "      <td>0.269830</td>\n",
       "      <td>-0.178200</td>\n",
       "      <td>-0.133630</td>\n",
       "      <td>0.019102</td>\n",
       "      <td>0.383180</td>\n",
       "      <td>-0.082001</td>\n",
       "      <td>-0.058647</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.17677</td>\n",
       "      <td>0.307260</td>\n",
       "      <td>1.522190</td>\n",
       "      <td>-0.281454</td>\n",
       "      <td>-0.907096</td>\n",
       "      <td>1.241959</td>\n",
       "      <td>-0.626201</td>\n",
       "      <td>0.229092</td>\n",
       "      <td>1.358773</td>\n",
       "      <td>-1.125141</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>86 rows × 68 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           FNC13     FNC30    FNC33     FNC35     FNC37     FNC38     FNC40  \\\n",
       "Id                                                                            \n",
       "120873  0.270490  0.036615  0.21516  0.069346 -0.086613  0.054857 -0.365520   \n",
       "135376 -0.088119  0.450290  0.70298  0.543640  0.244000  0.512400  0.439300   \n",
       "139149 -0.361020  0.203270  0.51565  0.114280  0.262910  0.018740  0.088855   \n",
       "146791 -0.060777  0.658060  0.64302  0.589520  0.485810  0.574150  0.344040   \n",
       "153870  0.048705  0.158000  0.25707  0.152580 -0.105510 -0.234190 -0.127320   \n",
       "...          ...       ...      ...       ...       ...       ...       ...   \n",
       "934330 -0.373800  0.706610  0.53835  0.428510  0.533970  0.317520 -0.042524   \n",
       "950671 -0.029000  0.516310  0.41210  0.354370  0.311470 -0.067289  0.011223   \n",
       "963924 -0.197060  0.011246  0.35735  0.623080  0.317980 -0.081810 -0.102280   \n",
       "993348 -0.087478  0.420390  0.29361  0.026402 -0.041024  0.355390  0.163750   \n",
       "993946 -0.061554 -0.125080  0.15613  0.269830 -0.178200 -0.133630  0.019102   \n",
       "\n",
       "           FNC41     FNC42     FNC43  ...   FNC353    FNC368  SBM_map7  \\\n",
       "Id                                    ...                                \n",
       "120873 -0.273410 -0.275500 -0.035595  ... -0.23049 -0.060204 -0.264192   \n",
       "135376  0.125780  0.191420 -0.058085  ... -0.31699  0.369300 -0.466051   \n",
       "139149 -0.211420  0.026982  0.093953  ... -0.17144  0.521330  1.439242   \n",
       "146791  0.255670  0.091637  0.183470  ...  0.27329  0.144460 -0.492673   \n",
       "153870  0.143880 -0.286530 -0.333980  ... -0.49929 -0.179310 -1.105922   \n",
       "...          ...       ...       ...  ...      ...       ...       ...   \n",
       "934330  0.249370  0.191810  0.184660  ... -0.46915  0.108670 -0.515736   \n",
       "950671  0.173140 -0.039187  0.137150  ... -0.42424  0.304690 -0.933527   \n",
       "963924  0.029382  0.205780  0.374430  ...  0.42983  0.357230 -0.523021   \n",
       "993348 -0.186250 -0.297480  0.179980  ...  0.15890  0.462660  0.462689   \n",
       "993946  0.383180 -0.082001 -0.058647  ... -0.17677  0.307260  1.522190   \n",
       "\n",
       "        SBM_map17  SBM_map36  SBM_map52  SBM_map61  SBM_map64  SBM_map67  \\\n",
       "Id                                                                         \n",
       "120873   0.137624  -1.062109   0.791762  -0.982331   1.070363   0.220316   \n",
       "135376   0.972934   0.044317  -0.073326  -0.057543   0.371701  -0.513081   \n",
       "139149  -1.488153   0.414747  -0.910225   0.597229   1.220756  -0.059213   \n",
       "146791   0.187573  -0.026555  -3.013096   0.829697  -0.450726  -0.791032   \n",
       "153870   1.961955  -1.027496   0.474353  -0.978412   0.158492   0.889753   \n",
       "...           ...        ...        ...        ...        ...        ...   \n",
       "934330  -0.112326   0.458916  -0.516803   0.826910  -0.225792   0.369724   \n",
       "950671  -0.347191   0.183240   0.914623   0.482055   0.073327  -0.455141   \n",
       "963924   1.369376  -0.976704  -1.429466  -0.114078  -0.476524  -0.556896   \n",
       "993348  -1.749746  -0.385862   0.839745  -0.163926   0.953385   0.402673   \n",
       "993946  -0.281454  -0.907096   1.241959  -0.626201   0.229092   1.358773   \n",
       "\n",
       "        SBM_map75  \n",
       "Id                 \n",
       "120873  -0.002006  \n",
       "135376  -0.295125  \n",
       "139149   0.350434  \n",
       "146791   0.448966  \n",
       "153870  -0.551440  \n",
       "...           ...  \n",
       "934330   0.567717  \n",
       "950671   0.977018  \n",
       "963924  -0.424864  \n",
       "993348  -0.421040  \n",
       "993946  -1.125141  \n",
       "\n",
       "[86 rows x 68 columns]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_train"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Class</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Id</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>120873</th>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>135376</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>139149</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>146791</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>153870</th>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>934330</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>950671</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>963924</th>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>993348</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>993946</th>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>86 rows × 1 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        Class\n",
       "Id           \n",
       "120873      1\n",
       "135376      0\n",
       "139149      0\n",
       "146791      0\n",
       "153870      1\n",
       "...       ...\n",
       "934330      0\n",
       "950671      0\n",
       "963924      1\n",
       "993348      0\n",
       "993946      1\n",
       "\n",
       "[86 rows x 1 columns]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "labels"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.neighbors import KNeighborsClassifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "clf = KNeighborsClassifier(n_neighbors=35).fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.872093023255814\n"
     ]
    }
   ],
   "source": [
    "# Accuracy on the training data itself, not a held-out estimate\n",
    "print(clf.score(X_train, y_train))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Submission"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "FNC_test = pd.read_csv(path.join('Test', 'test_FNC.csv'), index_col=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "SBM_test = pd.read_csv(path.join('Test', 'test_SBM.csv'), index_col=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>FNC13</th>\n",
       "      <th>FNC30</th>\n",
       "      <th>FNC33</th>\n",
       "      <th>FNC35</th>\n",
       "      <th>FNC37</th>\n",
       "      <th>FNC38</th>\n",
       "      <th>FNC40</th>\n",
       "      <th>FNC41</th>\n",
       "      <th>FNC42</th>\n",
       "      <th>FNC43</th>\n",
       "      <th>...</th>\n",
       "      <th>FNC353</th>\n",
       "      <th>FNC368</th>\n",
       "      <th>SBM_map7</th>\n",
       "      <th>SBM_map17</th>\n",
       "      <th>SBM_map36</th>\n",
       "      <th>SBM_map52</th>\n",
       "      <th>SBM_map61</th>\n",
       "      <th>SBM_map64</th>\n",
       "      <th>SBM_map67</th>\n",
       "      <th>SBM_map75</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>100004</th>\n",
       "      <td>0.113166</td>\n",
       "      <td>0.304551</td>\n",
       "      <td>0.678723</td>\n",
       "      <td>0.676364</td>\n",
       "      <td>0.102174</td>\n",
       "      <td>0.022148</td>\n",
       "      <td>0.317406</td>\n",
       "      <td>0.464976</td>\n",
       "      <td>0.236121</td>\n",
       "      <td>-0.047108</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.483997</td>\n",
       "      <td>0.150698</td>\n",
       "      <td>-2.404130</td>\n",
       "      <td>2.256762</td>\n",
       "      <td>-2.011786</td>\n",
       "      <td>0.139369</td>\n",
       "      <td>1.123770</td>\n",
       "      <td>2.083006</td>\n",
       "      <td>1.145440</td>\n",
       "      <td>0.192076</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100015</th>\n",
       "      <td>-0.054457</td>\n",
       "      <td>0.315034</td>\n",
       "      <td>0.770686</td>\n",
       "      <td>0.787717</td>\n",
       "      <td>0.366940</td>\n",
       "      <td>-0.299074</td>\n",
       "      <td>0.699394</td>\n",
       "      <td>0.452051</td>\n",
       "      <td>0.032888</td>\n",
       "      <td>0.262658</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.252089</td>\n",
       "      <td>0.318086</td>\n",
       "      <td>-0.612468</td>\n",
       "      <td>1.711094</td>\n",
       "      <td>0.185261</td>\n",
       "      <td>-2.084801</td>\n",
       "      <td>1.397832</td>\n",
       "      <td>1.046136</td>\n",
       "      <td>-0.191733</td>\n",
       "      <td>0.174160</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100026</th>\n",
       "      <td>0.002372</td>\n",
       "      <td>-0.108333</td>\n",
       "      <td>-0.058818</td>\n",
       "      <td>0.316538</td>\n",
       "      <td>0.081580</td>\n",
       "      <td>0.113453</td>\n",
       "      <td>-0.050571</td>\n",
       "      <td>0.068699</td>\n",
       "      <td>0.275885</td>\n",
       "      <td>0.562116</td>\n",
       "      <td>...</td>\n",
       "      <td>0.294130</td>\n",
       "      <td>0.093333</td>\n",
       "      <td>-0.752907</td>\n",
       "      <td>1.386814</td>\n",
       "      <td>-0.123830</td>\n",
       "      <td>0.046525</td>\n",
       "      <td>1.906989</td>\n",
       "      <td>-2.661633</td>\n",
       "      <td>-0.193911</td>\n",
       "      <td>-0.476647</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100030</th>\n",
       "      <td>0.040945</td>\n",
       "      <td>0.675230</td>\n",
       "      <td>0.537128</td>\n",
       "      <td>0.124338</td>\n",
       "      <td>0.073727</td>\n",
       "      <td>0.278834</td>\n",
       "      <td>0.161171</td>\n",
       "      <td>0.316369</td>\n",
       "      <td>0.067719</td>\n",
       "      <td>-0.046476</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.270263</td>\n",
       "      <td>0.442313</td>\n",
       "      <td>1.041755</td>\n",
       "      <td>-0.949757</td>\n",
       "      <td>0.167515</td>\n",
       "      <td>-1.693663</td>\n",
       "      <td>-1.997087</td>\n",
       "      <td>-2.083782</td>\n",
       "      <td>1.154107</td>\n",
       "      <td>2.790871</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100047</th>\n",
       "      <td>-0.245284</td>\n",
       "      <td>0.624100</td>\n",
       "      <td>0.294196</td>\n",
       "      <td>0.332166</td>\n",
       "      <td>0.447499</td>\n",
       "      <td>0.264495</td>\n",
       "      <td>0.446376</td>\n",
       "      <td>0.400933</td>\n",
       "      <td>0.041004</td>\n",
       "      <td>0.538643</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.204068</td>\n",
       "      <td>0.394492</td>\n",
       "      <td>-1.775324</td>\n",
       "      <td>0.415556</td>\n",
       "      <td>2.410666</td>\n",
       "      <td>0.021838</td>\n",
       "      <td>1.578984</td>\n",
       "      <td>1.402592</td>\n",
       "      <td>-1.230440</td>\n",
       "      <td>-1.544345</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999956</th>\n",
       "      <td>0.182524</td>\n",
       "      <td>0.241864</td>\n",
       "      <td>0.606688</td>\n",
       "      <td>-0.000529</td>\n",
       "      <td>0.566684</td>\n",
       "      <td>-0.232798</td>\n",
       "      <td>0.733727</td>\n",
       "      <td>-0.056436</td>\n",
       "      <td>-0.227445</td>\n",
       "      <td>0.171759</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.184820</td>\n",
       "      <td>0.407409</td>\n",
       "      <td>-1.123315</td>\n",
       "      <td>0.428509</td>\n",
       "      <td>2.027816</td>\n",
       "      <td>0.849052</td>\n",
       "      <td>1.079632</td>\n",
       "      <td>-1.525021</td>\n",
       "      <td>2.479249</td>\n",
       "      <td>0.548342</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999979</th>\n",
       "      <td>-0.172684</td>\n",
       "      <td>0.259993</td>\n",
       "      <td>0.571281</td>\n",
       "      <td>-0.075842</td>\n",
       "      <td>-0.262201</td>\n",
       "      <td>0.019818</td>\n",
       "      <td>0.504592</td>\n",
       "      <td>-0.185592</td>\n",
       "      <td>-0.809122</td>\n",
       "      <td>0.389232</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.310955</td>\n",
       "      <td>0.473550</td>\n",
       "      <td>-1.270540</td>\n",
       "      <td>0.013607</td>\n",
       "      <td>-0.047793</td>\n",
       "      <td>-0.755896</td>\n",
       "      <td>0.192909</td>\n",
       "      <td>-0.336587</td>\n",
       "      <td>0.062809</td>\n",
       "      <td>-0.690686</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999989</th>\n",
       "      <td>-0.408067</td>\n",
       "      <td>0.862003</td>\n",
       "      <td>0.646698</td>\n",
       "      <td>0.700784</td>\n",
       "      <td>0.572094</td>\n",
       "      <td>-0.678907</td>\n",
       "      <td>0.192104</td>\n",
       "      <td>-0.556889</td>\n",
       "      <td>-0.246365</td>\n",
       "      <td>-0.199630</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.167875</td>\n",
       "      <td>-0.437320</td>\n",
       "      <td>-0.501659</td>\n",
       "      <td>-0.680883</td>\n",
       "      <td>1.595677</td>\n",
       "      <td>0.466304</td>\n",
       "      <td>0.430025</td>\n",
       "      <td>1.452216</td>\n",
       "      <td>0.759654</td>\n",
       "      <td>1.607317</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999992</th>\n",
       "      <td>-0.160676</td>\n",
       "      <td>-0.055138</td>\n",
       "      <td>0.247362</td>\n",
       "      <td>0.478874</td>\n",
       "      <td>0.195440</td>\n",
       "      <td>0.358874</td>\n",
       "      <td>-0.124268</td>\n",
       "      <td>0.284184</td>\n",
       "      <td>0.330500</td>\n",
       "      <td>-0.258015</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.497435</td>\n",
       "      <td>0.783766</td>\n",
       "      <td>-1.248628</td>\n",
       "      <td>0.815399</td>\n",
       "      <td>1.427952</td>\n",
       "      <td>-0.305261</td>\n",
       "      <td>-2.599313</td>\n",
       "      <td>-0.161822</td>\n",
       "      <td>0.183669</td>\n",
       "      <td>-1.313079</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999994</th>\n",
       "      <td>0.005356</td>\n",
       "      <td>0.403151</td>\n",
       "      <td>0.538632</td>\n",
       "      <td>0.792229</td>\n",
       "      <td>-0.226379</td>\n",
       "      <td>-0.229716</td>\n",
       "      <td>-0.044973</td>\n",
       "      <td>-0.262501</td>\n",
       "      <td>-0.039308</td>\n",
       "      <td>-0.207124</td>\n",
       "      <td>...</td>\n",
       "      <td>0.300068</td>\n",
       "      <td>0.562554</td>\n",
       "      <td>-1.980608</td>\n",
       "      <td>-1.398193</td>\n",
       "      <td>-1.263830</td>\n",
       "      <td>-0.303426</td>\n",
       "      <td>0.785046</td>\n",
       "      <td>2.369759</td>\n",
       "      <td>-0.168919</td>\n",
       "      <td>-0.668412</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>119748 rows × 68 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           FNC13     FNC30     FNC33     FNC35     FNC37     FNC38     FNC40  \\\n",
       "Id                                                                             \n",
       "100004  0.113166  0.304551  0.678723  0.676364  0.102174  0.022148  0.317406   \n",
       "100015 -0.054457  0.315034  0.770686  0.787717  0.366940 -0.299074  0.699394   \n",
       "100026  0.002372 -0.108333 -0.058818  0.316538  0.081580  0.113453 -0.050571   \n",
       "100030  0.040945  0.675230  0.537128  0.124338  0.073727  0.278834  0.161171   \n",
       "100047 -0.245284  0.624100  0.294196  0.332166  0.447499  0.264495  0.446376   \n",
       "...          ...       ...       ...       ...       ...       ...       ...   \n",
       "999956  0.182524  0.241864  0.606688 -0.000529  0.566684 -0.232798  0.733727   \n",
       "999979 -0.172684  0.259993  0.571281 -0.075842 -0.262201  0.019818  0.504592   \n",
       "999989 -0.408067  0.862003  0.646698  0.700784  0.572094 -0.678907  0.192104   \n",
       "999992 -0.160676 -0.055138  0.247362  0.478874  0.195440  0.358874 -0.124268   \n",
       "999994  0.005356  0.403151  0.538632  0.792229 -0.226379 -0.229716 -0.044973   \n",
       "\n",
       "           FNC41     FNC42     FNC43  ...    FNC353    FNC368  SBM_map7  \\\n",
       "Id                                    ...                                 \n",
       "100004  0.464976  0.236121 -0.047108  ... -0.483997  0.150698 -2.404130   \n",
       "100015  0.452051  0.032888  0.262658  ... -0.252089  0.318086 -0.612468   \n",
       "100026  0.068699  0.275885  0.562116  ...  0.294130  0.093333 -0.752907   \n",
       "100030  0.316369  0.067719 -0.046476  ... -0.270263  0.442313  1.041755   \n",
       "100047  0.400933  0.041004  0.538643  ... -0.204068  0.394492 -1.775324   \n",
       "...          ...       ...       ...  ...       ...       ...       ...   \n",
       "999956 -0.056436 -0.227445  0.171759  ... -0.184820  0.407409 -1.123315   \n",
       "999979 -0.185592 -0.809122  0.389232  ... -0.310955  0.473550 -1.270540   \n",
       "999989 -0.556889 -0.246365 -0.199630  ... -0.167875 -0.437320 -0.501659   \n",
       "999992  0.284184  0.330500 -0.258015  ... -0.497435  0.783766 -1.248628   \n",
       "999994 -0.262501 -0.039308 -0.207124  ...  0.300068  0.562554 -1.980608   \n",
       "\n",
       "        SBM_map17  SBM_map36  SBM_map52  SBM_map61  SBM_map64  SBM_map67  \\\n",
       "Id                                                                         \n",
       "100004   2.256762  -2.011786   0.139369   1.123770   2.083006   1.145440   \n",
       "100015   1.711094   0.185261  -2.084801   1.397832   1.046136  -0.191733   \n",
       "100026   1.386814  -0.123830   0.046525   1.906989  -2.661633  -0.193911   \n",
       "100030  -0.949757   0.167515  -1.693663  -1.997087  -2.083782   1.154107   \n",
       "100047   0.415556   2.410666   0.021838   1.578984   1.402592  -1.230440   \n",
       "...           ...        ...        ...        ...        ...        ...   \n",
       "999956   0.428509   2.027816   0.849052   1.079632  -1.525021   2.479249   \n",
       "999979   0.013607  -0.047793  -0.755896   0.192909  -0.336587   0.062809   \n",
       "999989  -0.680883   1.595677   0.466304   0.430025   1.452216   0.759654   \n",
       "999992   0.815399   1.427952  -0.305261  -2.599313  -0.161822   0.183669   \n",
       "999994  -1.398193  -1.263830  -0.303426   0.785046   2.369759  -0.168919   \n",
       "\n",
       "        SBM_map75  \n",
       "Id                 \n",
       "100004   0.192076  \n",
       "100015   0.174160  \n",
       "100026  -0.476647  \n",
       "100030   2.790871  \n",
       "100047  -1.544345  \n",
       "...           ...  \n",
       "999956   0.548342  \n",
       "999979  -0.690686  \n",
       "999989   1.607317  \n",
       "999992  -1.313079  \n",
       "999994  -0.668412  \n",
       "\n",
       "[119748 rows x 68 columns]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
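    "# Build the test feature matrix: join the selected FNC and SBM columns side by side, aligned on the Id index\n",
    "X_test = pd.concat([FNC_test[FNC_features], SBM_test[SBM_features]], axis=1)\n",
    "X_test"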
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Healthy Control</th>\n",
       "      <th>Schizophrenic Patient</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>100004</th>\n",
       "      <td>0.342857</td>\n",
       "      <td>0.657143</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100015</th>\n",
       "      <td>0.542857</td>\n",
       "      <td>0.457143</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100026</th>\n",
       "      <td>0.600000</td>\n",
       "      <td>0.400000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100030</th>\n",
       "      <td>0.685714</td>\n",
       "      <td>0.314286</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100047</th>\n",
       "      <td>0.600000</td>\n",
       "      <td>0.400000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999956</th>\n",
       "      <td>0.485714</td>\n",
       "      <td>0.514286</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999979</th>\n",
       "      <td>0.428571</td>\n",
       "      <td>0.571429</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999989</th>\n",
       "      <td>0.600000</td>\n",
       "      <td>0.400000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999992</th>\n",
       "      <td>0.342857</td>\n",
       "      <td>0.657143</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999994</th>\n",
       "      <td>0.514286</td>\n",
       "      <td>0.485714</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>119748 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        Healthy Control  Schizophrenic Patient\n",
       "Id                                            \n",
       "100004         0.342857               0.657143\n",
       "100015         0.542857               0.457143\n",
       "100026         0.600000               0.400000\n",
       "100030         0.685714               0.314286\n",
       "100047         0.600000               0.400000\n",
       "...                 ...                    ...\n",
       "999956         0.485714               0.514286\n",
       "999979         0.428571               0.571429\n",
       "999989         0.600000               0.400000\n",
       "999992         0.342857               0.657143\n",
       "999994         0.514286               0.485714\n",
       "\n",
       "[119748 rows x 2 columns]"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
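    "# predict_proba returns one probability column per class, in the order given by clf.classes_;\n",
    "# the column names below assume that order is (healthy control, schizophrenic patient)\n",
    "y_test = clf.predict_proba(X_test)\n",
    "submission = pd.DataFrame(y_test, index=X_test.index, columns=['Healthy Control', 'Schizophrenic Patient'])\n",
    "submission"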
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
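    "# Write only the patient-class probability, renamed to 'Probability'; the Id index is written as the first column\n",
    "submission.to_csv('submission.csv', columns=['Schizophrenic Patient'], header=['Probability'])"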
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}