{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tutorial 4: Multi-level column indices (`MultiIndex`)\n",
    "\n",
    "A MultiIndex is an index with multiple, hierarchical levels. We use a multiindex for the column headers of a dataframe. This allows each column to have multiple keys associated with it--one key for each level.\n",
    "\n",
    "The multiindex is a native part of the `pandas` package. For more documentation, see their [MultiIndex / advanced indexing page](https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html).\n",
    "\n",
    "MultiIndexes make it easy to associate multiple identifiers with a single column. In this tutorial, we'll work with phosphoproteomics data from the endometrial cancer dataset.\n",
    "\n",
    "Phosphoproteomics is the large-scale study of proteins that have undergone a process called phosphorylation. This is particularly important in the context of disease, as the process is often disrupted in cancer cells.\n",
    "\n",
    "Let's start by importing our required package and loading the endometrial dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cptac\n",
    "en = cptac.Ucec()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see the data sources available in our dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Data type</th>\n",
       "      <th>Available sources</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>CNV</td>\n",
       "      <td>[bcm, washu]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>circular_RNA</td>\n",
       "      <td>[bcm]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>miRNA</td>\n",
       "      <td>[bcm, washu]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>proteomics</td>\n",
       "      <td>[bcm, umich]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>transcriptomics</td>\n",
       "      <td>[bcm, broad, washu]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>ancestry_prediction</td>\n",
       "      <td>[harmonized]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>somatic_mutation</td>\n",
       "      <td>[harmonized, washu]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>clinical</td>\n",
       "      <td>[mssm]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>follow-up</td>\n",
       "      <td>[mssm]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>medical_history</td>\n",
       "      <td>[mssm]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>acetylproteomics</td>\n",
       "      <td>[umich]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>phosphoproteomics</td>\n",
       "      <td>[umich]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>cibersort</td>\n",
       "      <td>[washu]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>hla_typing</td>\n",
       "      <td>[washu]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>tumor_purity</td>\n",
       "      <td>[washu]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>xcell</td>\n",
       "      <td>[washu]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              Data type    Available sources\n",
       "0                   CNV         [bcm, washu]\n",
       "1          circular_RNA                [bcm]\n",
       "2                 miRNA         [bcm, washu]\n",
       "3            proteomics         [bcm, umich]\n",
       "4       transcriptomics  [bcm, broad, washu]\n",
       "5   ancestry_prediction         [harmonized]\n",
       "6      somatic_mutation  [harmonized, washu]\n",
       "7              clinical               [mssm]\n",
       "8             follow-up               [mssm]\n",
       "9       medical_history               [mssm]\n",
       "10     acetylproteomics              [umich]\n",
       "11    phosphoproteomics              [umich]\n",
       "12            cibersort              [washu]\n",
       "13           hla_typing              [washu]\n",
       "14         tumor_purity              [washu]\n",
       "15                xcell              [washu]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "en.list_data_sources()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we see different types of omics data (e.g., CNV, proteomics, phosphoproteomics) available from various sources (e.g., washu, umich).\n",
    "\n",
    "We will retrieve phosphoproteomics, proteomics and CNV data from the respective sources."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>Name</th>\n",
       "      <th>A1BG</th>\n",
       "      <th>A1CF</th>\n",
       "      <th>A2M</th>\n",
       "      <th>A2ML1</th>\n",
       "      <th>A3GALT2</th>\n",
       "      <th>A4GALT</th>\n",
       "      <th>A4GNT</th>\n",
       "      <th>AAAS</th>\n",
       "      <th>AACS</th>\n",
       "      <th>AADAC</th>\n",
       "      <th>...</th>\n",
       "      <th>ZW10</th>\n",
       "      <th>ZWILCH</th>\n",
       "      <th>ZWINT</th>\n",
       "      <th>ZXDC</th>\n",
       "      <th>ZYG11A</th>\n",
       "      <th>ZYG11B</th>\n",
       "      <th>ZYX</th>\n",
       "      <th>ZZEF1</th>\n",
       "      <th>ZZZ3</th>\n",
       "      <th>pk</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Database_ID</th>\n",
       "      <th>ENSG00000121410.10</th>\n",
       "      <th>ENSG00000148584.13</th>\n",
       "      <th>ENSG00000175899.13</th>\n",
       "      <th>ENSG00000166535.18</th>\n",
       "      <th>ENSG00000184389.9</th>\n",
       "      <th>ENSG00000128274.14</th>\n",
       "      <th>ENSG00000118017.3</th>\n",
       "      <th>ENSG00000094914.11</th>\n",
       "      <th>ENSG00000081760.15</th>\n",
       "      <th>ENSG00000114771.12</th>\n",
       "      <th>...</th>\n",
       "      <th>ENSG00000086827.7</th>\n",
       "      <th>ENSG00000174442.10</th>\n",
       "      <th>ENSG00000122952.15</th>\n",
       "      <th>ENSG00000070476.13</th>\n",
       "      <th>ENSG00000203995.8</th>\n",
       "      <th>ENSG00000162378.11</th>\n",
       "      <th>ENSG00000159840.14</th>\n",
       "      <th>ENSG00000074755.13</th>\n",
       "      <th>ENSG00000036549.11</th>\n",
       "      <th>ENSG00000091436.15</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "      <td>-0.00659</td>\n",
       "      <td>-0.01982</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01418</td>\n",
       "      <td>-0.00839</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.01641</td>\n",
       "      <td>-0.00963</td>\n",
       "      <td>-0.01982</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>-0.01418</td>\n",
       "      <td>-0.01418</td>\n",
       "      <td>-0.01897</td>\n",
       "      <td>-0.00529</td>\n",
       "      <td>-0.01418</td>\n",
       "      <td>-0.01480</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "      <td>0.02578</td>\n",
       "      <td>0.00726</td>\n",
       "      <td>0.01350</td>\n",
       "      <td>0.01350</td>\n",
       "      <td>0.00732</td>\n",
       "      <td>0.01642</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>0.01225</td>\n",
       "      <td>0.01225</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>...</td>\n",
       "      <td>0.01583</td>\n",
       "      <td>0.01844</td>\n",
       "      <td>0.00726</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>0.00732</td>\n",
       "      <td>0.00732</td>\n",
       "      <td>0.01200</td>\n",
       "      <td>0.01969</td>\n",
       "      <td>0.00732</td>\n",
       "      <td>0.01121</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "      <td>0.01262</td>\n",
       "      <td>0.00425</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>0.00166</td>\n",
       "      <td>0.00549</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.00305</td>\n",
       "      <td>0.00214</td>\n",
       "      <td>0.00425</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>0.00166</td>\n",
       "      <td>0.00166</td>\n",
       "      <td>0.01408</td>\n",
       "      <td>0.00683</td>\n",
       "      <td>0.00166</td>\n",
       "      <td>0.00208</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "      <td>0.00100</td>\n",
       "      <td>0.41191</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02436</td>\n",
       "      <td>-0.01198</td>\n",
       "      <td>-0.03307</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.03307</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.01982</td>\n",
       "      <td>-0.02071</td>\n",
       "      <td>0.41191</td>\n",
       "      <td>-0.61621</td>\n",
       "      <td>-0.02436</td>\n",
       "      <td>-0.02436</td>\n",
       "      <td>-0.02182</td>\n",
       "      <td>-0.00336</td>\n",
       "      <td>-0.02436</td>\n",
       "      <td>-0.02548</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00098</th>\n",
       "      <td>1.01075</td>\n",
       "      <td>0.27221</td>\n",
       "      <td>-0.39802</td>\n",
       "      <td>-0.39802</td>\n",
       "      <td>0.00226</td>\n",
       "      <td>0.31684</td>\n",
       "      <td>0.31108</td>\n",
       "      <td>-0.38507</td>\n",
       "      <td>-0.41089</td>\n",
       "      <td>0.31108</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.39843</td>\n",
       "      <td>-0.38591</td>\n",
       "      <td>0.27221</td>\n",
       "      <td>0.31108</td>\n",
       "      <td>0.01711</td>\n",
       "      <td>0.01711</td>\n",
       "      <td>-0.01434</td>\n",
       "      <td>-0.34344</td>\n",
       "      <td>-0.01427</td>\n",
       "      <td>0.53267</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3N-01520</th>\n",
       "      <td>-0.05661</td>\n",
       "      <td>-0.06508</td>\n",
       "      <td>-0.06174</td>\n",
       "      <td>-0.06174</td>\n",
       "      <td>-0.05318</td>\n",
       "      <td>-0.05054</td>\n",
       "      <td>-0.06413</td>\n",
       "      <td>-0.06174</td>\n",
       "      <td>-0.06174</td>\n",
       "      <td>-0.06413</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.05110</td>\n",
       "      <td>-0.05228</td>\n",
       "      <td>-0.06508</td>\n",
       "      <td>-0.06413</td>\n",
       "      <td>-0.05318</td>\n",
       "      <td>-0.05318</td>\n",
       "      <td>0.37759</td>\n",
       "      <td>-0.04914</td>\n",
       "      <td>-0.05318</td>\n",
       "      <td>-0.05533</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3N-01521</th>\n",
       "      <td>-0.36477</td>\n",
       "      <td>-0.00244</td>\n",
       "      <td>-0.06953</td>\n",
       "      <td>-0.06953</td>\n",
       "      <td>0.38241</td>\n",
       "      <td>-0.34150</td>\n",
       "      <td>0.31502</td>\n",
       "      <td>-0.06953</td>\n",
       "      <td>-0.04038</td>\n",
       "      <td>0.31502</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.05879</td>\n",
       "      <td>-0.19572</td>\n",
       "      <td>-0.00244</td>\n",
       "      <td>0.79082</td>\n",
       "      <td>0.38241</td>\n",
       "      <td>0.38241</td>\n",
       "      <td>0.58054</td>\n",
       "      <td>-0.34785</td>\n",
       "      <td>0.38241</td>\n",
       "      <td>0.00842</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3N-01537</th>\n",
       "      <td>0.09203</td>\n",
       "      <td>0.00535</td>\n",
       "      <td>0.08807</td>\n",
       "      <td>0.08807</td>\n",
       "      <td>-0.19341</td>\n",
       "      <td>-0.29275</td>\n",
       "      <td>0.08943</td>\n",
       "      <td>0.08807</td>\n",
       "      <td>0.07883</td>\n",
       "      <td>0.08943</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.11330</td>\n",
       "      <td>0.07121</td>\n",
       "      <td>0.00535</td>\n",
       "      <td>0.08943</td>\n",
       "      <td>-0.10289</td>\n",
       "      <td>-0.10289</td>\n",
       "      <td>-0.01151</td>\n",
       "      <td>-0.30164</td>\n",
       "      <td>-0.10289</td>\n",
       "      <td>0.02389</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3N-01802</th>\n",
       "      <td>-0.06298</td>\n",
       "      <td>-0.04134</td>\n",
       "      <td>0.05070</td>\n",
       "      <td>0.05070</td>\n",
       "      <td>-0.00420</td>\n",
       "      <td>-0.18427</td>\n",
       "      <td>0.17981</td>\n",
       "      <td>0.08711</td>\n",
       "      <td>-0.12670</td>\n",
       "      <td>0.17981</td>\n",
       "      <td>...</td>\n",
       "      <td>0.14454</td>\n",
       "      <td>0.05509</td>\n",
       "      <td>-0.04134</td>\n",
       "      <td>0.17981</td>\n",
       "      <td>-0.12128</td>\n",
       "      <td>-0.12128</td>\n",
       "      <td>-0.06015</td>\n",
       "      <td>0.14747</td>\n",
       "      <td>-0.13738</td>\n",
       "      <td>-0.01938</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3N-01825</th>\n",
       "      <td>0.12974</td>\n",
       "      <td>0.03784</td>\n",
       "      <td>0.11400</td>\n",
       "      <td>0.11400</td>\n",
       "      <td>0.04662</td>\n",
       "      <td>0.01662</td>\n",
       "      <td>0.13939</td>\n",
       "      <td>0.11400</td>\n",
       "      <td>-0.00039</td>\n",
       "      <td>0.13939</td>\n",
       "      <td>...</td>\n",
       "      <td>0.02229</td>\n",
       "      <td>-0.00842</td>\n",
       "      <td>0.03784</td>\n",
       "      <td>0.13939</td>\n",
       "      <td>0.04662</td>\n",
       "      <td>0.04662</td>\n",
       "      <td>0.03222</td>\n",
       "      <td>-0.02923</td>\n",
       "      <td>0.04662</td>\n",
       "      <td>0.02880</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>95 rows × 18919 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name                      A1BG               A1CF                A2M  \\\n",
       "Database_ID ENSG00000121410.10 ENSG00000148584.13 ENSG00000175899.13   \n",
       "Patient_ID                                                             \n",
       "C3L-00006             -0.00659           -0.01982           -0.01402   \n",
       "C3L-00008              0.02578            0.00726            0.01350   \n",
       "C3L-00032              0.01262            0.00425           -0.00275   \n",
       "C3L-00090              0.00100            0.41191           -0.02299   \n",
       "C3L-00098              1.01075            0.27221           -0.39802   \n",
       "...                        ...                ...                ...   \n",
       "C3N-01520             -0.05661           -0.06508           -0.06174   \n",
       "C3N-01521             -0.36477           -0.00244           -0.06953   \n",
       "C3N-01537              0.09203            0.00535            0.08807   \n",
       "C3N-01802             -0.06298           -0.04134            0.05070   \n",
       "C3N-01825              0.12974            0.03784            0.11400   \n",
       "\n",
       "Name                     A2ML1           A3GALT2             A4GALT  \\\n",
       "Database_ID ENSG00000166535.18 ENSG00000184389.9 ENSG00000128274.14   \n",
       "Patient_ID                                                            \n",
       "C3L-00006             -0.01402          -0.01418           -0.00839   \n",
       "C3L-00008              0.01350           0.00732            0.01642   \n",
       "C3L-00032             -0.00275           0.00166            0.00549   \n",
       "C3L-00090             -0.02299          -0.02436           -0.01198   \n",
       "C3L-00098             -0.39802           0.00226            0.31684   \n",
       "...                        ...               ...                ...   \n",
       "C3N-01520             -0.06174          -0.05318           -0.05054   \n",
       "C3N-01521             -0.06953           0.38241           -0.34150   \n",
       "C3N-01537              0.08807          -0.19341           -0.29275   \n",
       "C3N-01802              0.05070          -0.00420           -0.18427   \n",
       "C3N-01825              0.11400           0.04662            0.01662   \n",
       "\n",
       "Name                    A4GNT               AAAS               AACS  \\\n",
       "Database_ID ENSG00000118017.3 ENSG00000094914.11 ENSG00000081760.15   \n",
       "Patient_ID                                                            \n",
       "C3L-00006            -0.01305           -0.01402           -0.01402   \n",
       "C3L-00008             0.01005            0.01225            0.01225   \n",
       "C3L-00032            -0.00038           -0.00275           -0.00275   \n",
       "C3L-00090            -0.03307           -0.02299           -0.02299   \n",
       "C3L-00098             0.31108           -0.38507           -0.41089   \n",
       "...                       ...                ...                ...   \n",
       "C3N-01520            -0.06413           -0.06174           -0.06174   \n",
       "C3N-01521             0.31502           -0.06953           -0.04038   \n",
       "C3N-01537             0.08943            0.08807            0.07883   \n",
       "C3N-01802             0.17981            0.08711           -0.12670   \n",
       "C3N-01825             0.13939            0.11400           -0.00039   \n",
       "\n",
       "Name                     AADAC  ...              ZW10             ZWILCH  \\\n",
       "Database_ID ENSG00000114771.12  ... ENSG00000086827.7 ENSG00000174442.10   \n",
       "Patient_ID                      ...                                        \n",
       "C3L-00006             -0.01305  ...          -0.01641           -0.00963   \n",
       "C3L-00008              0.01005  ...           0.01583            0.01844   \n",
       "C3L-00032             -0.00038  ...          -0.00305            0.00214   \n",
       "C3L-00090             -0.03307  ...          -0.01982           -0.02071   \n",
       "C3L-00098              0.31108  ...          -0.39843           -0.38591   \n",
       "...                        ...  ...               ...                ...   \n",
       "C3N-01520             -0.06413  ...          -0.05110           -0.05228   \n",
       "C3N-01521              0.31502  ...          -0.05879           -0.19572   \n",
       "C3N-01537              0.08943  ...          -0.11330            0.07121   \n",
       "C3N-01802              0.17981  ...           0.14454            0.05509   \n",
       "C3N-01825              0.13939  ...           0.02229           -0.00842   \n",
       "\n",
       "Name                     ZWINT               ZXDC            ZYG11A  \\\n",
       "Database_ID ENSG00000122952.15 ENSG00000070476.13 ENSG00000203995.8   \n",
       "Patient_ID                                                            \n",
       "C3L-00006             -0.01982           -0.01305          -0.01418   \n",
       "C3L-00008              0.00726            0.01005           0.00732   \n",
       "C3L-00032              0.00425           -0.00038           0.00166   \n",
       "C3L-00090              0.41191           -0.61621          -0.02436   \n",
       "C3L-00098              0.27221            0.31108           0.01711   \n",
       "...                        ...                ...               ...   \n",
       "C3N-01520             -0.06508           -0.06413          -0.05318   \n",
       "C3N-01521             -0.00244            0.79082           0.38241   \n",
       "C3N-01537              0.00535            0.08943          -0.10289   \n",
       "C3N-01802             -0.04134            0.17981          -0.12128   \n",
       "C3N-01825              0.03784            0.13939           0.04662   \n",
       "\n",
       "Name                    ZYG11B                ZYX              ZZEF1  \\\n",
       "Database_ID ENSG00000162378.11 ENSG00000159840.14 ENSG00000074755.13   \n",
       "Patient_ID                                                             \n",
       "C3L-00006             -0.01418           -0.01897           -0.00529   \n",
       "C3L-00008              0.00732            0.01200            0.01969   \n",
       "C3L-00032              0.00166            0.01408            0.00683   \n",
       "C3L-00090             -0.02436           -0.02182           -0.00336   \n",
       "C3L-00098              0.01711           -0.01434           -0.34344   \n",
       "...                        ...                ...                ...   \n",
       "C3N-01520             -0.05318            0.37759           -0.04914   \n",
       "C3N-01521              0.38241            0.58054           -0.34785   \n",
       "C3N-01537             -0.10289           -0.01151           -0.30164   \n",
       "C3N-01802             -0.12128           -0.06015            0.14747   \n",
       "C3N-01825              0.04662            0.03222           -0.02923   \n",
       "\n",
       "Name                      ZZZ3                 pk  \n",
       "Database_ID ENSG00000036549.11 ENSG00000091436.15  \n",
       "Patient_ID                                         \n",
       "C3L-00006             -0.01418           -0.01480  \n",
       "C3L-00008              0.00732            0.01121  \n",
       "C3L-00032              0.00166            0.00208  \n",
       "C3L-00090             -0.02436           -0.02548  \n",
       "C3L-00098             -0.01427            0.53267  \n",
       "...                        ...                ...  \n",
       "C3N-01520             -0.05318           -0.05533  \n",
       "C3N-01521              0.38241            0.00842  \n",
       "C3N-01537             -0.10289            0.02389  \n",
       "C3N-01802             -0.13738           -0.01938  \n",
       "C3N-01825              0.04662            0.02880  \n",
       "\n",
       "[95 rows x 18919 columns]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Some data may take several minutes to load\n",
    "en.get_proteomics('umich')\n",
    "en.get_phosphoproteomics('umich')\n",
    "en.get_CNV('washu')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that our data is loaded, let's take a look at the structure of the phosphoproteomics data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>Name</th>\n",
       "      <th>ARF5</th>\n",
       "      <th>M6PR</th>\n",
       "      <th colspan=\"8\" halign=\"left\">ESRRA</th>\n",
       "      <th>...</th>\n",
       "      <th colspan=\"2\" halign=\"left\">SCRIB</th>\n",
       "      <th colspan=\"6\" halign=\"left\">TSGA10</th>\n",
       "      <th colspan=\"2\" halign=\"left\">SVIL</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Site</th>\n",
       "      <th>S137</th>\n",
       "      <th>S267</th>\n",
       "      <th>S19</th>\n",
       "      <th>S22</th>\n",
       "      <th>S19S22</th>\n",
       "      <th>T31</th>\n",
       "      <th>S19S22</th>\n",
       "      <th>S19S22S27</th>\n",
       "      <th>S19S22T31</th>\n",
       "      <th>S27</th>\n",
       "      <th>...</th>\n",
       "      <th>S1575T1588S1594</th>\n",
       "      <th>S1594</th>\n",
       "      <th>S11</th>\n",
       "      <th>S173</th>\n",
       "      <th>S213</th>\n",
       "      <th>S391</th>\n",
       "      <th>S779</th>\n",
       "      <th>S101</th>\n",
       "      <th>S296</th>\n",
       "      <th>S459</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Peptide</th>\n",
       "      <th>QDMPNAMPVsELTDK</th>\n",
       "      <th>GVGDDQLGEEsEERDDHLLPM</th>\n",
       "      <th>AEPAsPDSPK</th>\n",
       "      <th>AEPASPDsPK</th>\n",
       "      <th>AEPAsPDsPK</th>\n",
       "      <th>AEPASPDSPKGSSETEtEPPVALAPGPAPTR</th>\n",
       "      <th>AEPAsPDsPKGSSETETEPPVALAPGPAPTR</th>\n",
       "      <th>AEPAsPDsPKGSsETETEPPVALAPGPAPTR</th>\n",
       "      <th>AEPAsPDsPKGSSETEtEPPVALAPGPAPTR</th>\n",
       "      <th>GSsETETEPPVALAPGPAPTR</th>\n",
       "      <th>...</th>\n",
       "      <th>LAEAPSPAPTPsPTPVEDLGPQTStSPGRLsPDFAEELR</th>\n",
       "      <th>LsPDFAEELR</th>\n",
       "      <th>sPGRDPELQVEAAEVTTK</th>\n",
       "      <th>sPSRLDSFVK</th>\n",
       "      <th>RPsPTAR</th>\n",
       "      <th>AMDTEsELGR</th>\n",
       "      <th>GLDRsLEENLCYR;GLDRsLEENLCYRDF</th>\n",
       "      <th>EVVSSQVDDLTsHNEHLCK</th>\n",
       "      <th>DSEGDTPsLINWPSSK</th>\n",
       "      <th>LPsPTVAR</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Database_ID</th>\n",
       "      <th>ENSP00000000233.5</th>\n",
       "      <th>ENSP00000000412.3</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>...</th>\n",
       "      <th>ENSP00000501177.1</th>\n",
       "      <th>ENSP00000501177.1</th>\n",
       "      <th>ENSP00000501312.1</th>\n",
       "      <th>ENSP00000501312.1</th>\n",
       "      <th>ENSP00000501312.1</th>\n",
       "      <th>ENSP00000501312.1</th>\n",
       "      <th>ENSP00000501312.1</th>\n",
       "      <th>ENSP00000501312.1</th>\n",
       "      <th>ENSP00000501521.1</th>\n",
       "      <th>ENSP00000501521.1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "      <td>NaN</td>\n",
       "      <td>0.573633</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.304721</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.667426</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.905606</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.069911</td>\n",
       "      <td>-0.584774</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.561657</td>\n",
       "      <td>-0.652457</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "      <td>0.003632</td>\n",
       "      <td>-0.393734</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.789193</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.488427</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.431599</td>\n",
       "      <td>-1.079638</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.211020</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.131605</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.104862</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.439041</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00084</th>\n",
       "      <td>NaN</td>\n",
       "      <td>0.220473</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.290506</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.399718</td>\n",
       "      <td>-0.875016</td>\n",
       "      <td>-0.579824</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.505807</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.521725</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "      <td>NaN</td>\n",
       "      <td>0.161496</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.708453</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.405402</td>\n",
       "      <td>1.253045</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.265813</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.069439</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.510268</td>\n",
       "      <td>-1.889144</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.592203</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.126482</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 85661 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name                     ARF5                  M6PR             ESRRA  \\\n",
       "Site                     S137                  S267               S19   \n",
       "Peptide       QDMPNAMPVsELTDK GVGDDQLGEEsEERDDHLLPM        AEPAsPDSPK   \n",
       "Database_ID ENSP00000000233.5     ENSP00000000412.3 ENSP00000000442.6   \n",
       "Patient_ID                                                              \n",
       "C3L-00006                 NaN              0.573633               NaN   \n",
       "C3L-00008            0.003632             -0.393734               NaN   \n",
       "C3L-00032                 NaN             -0.211020               NaN   \n",
       "C3L-00084                 NaN              0.220473               NaN   \n",
       "C3L-00090                 NaN              0.161496               NaN   \n",
       "\n",
       "Name                                             \\\n",
       "Site                      S22            S19S22   \n",
       "Peptide            AEPASPDsPK        AEPAsPDsPK   \n",
       "Database_ID ENSP00000000442.6 ENSP00000000442.6   \n",
       "Patient_ID                                        \n",
       "C3L-00006                 NaN          0.304721   \n",
       "C3L-00008                 NaN          0.789193   \n",
       "C3L-00032                 NaN               NaN   \n",
       "C3L-00084                 NaN         -0.290506   \n",
       "C3L-00090                 NaN          0.708453   \n",
       "\n",
       "Name                                                                         \\\n",
       "Site                                    T31                          S19S22   \n",
       "Peptide     AEPASPDSPKGSSETEtEPPVALAPGPAPTR AEPAsPDsPKGSSETETEPPVALAPGPAPTR   \n",
       "Database_ID               ENSP00000000442.6               ENSP00000000442.6   \n",
       "Patient_ID                                                                    \n",
       "C3L-00006                               NaN                             NaN   \n",
       "C3L-00008                               NaN                             NaN   \n",
       "C3L-00032                               NaN                             NaN   \n",
       "C3L-00084                               NaN                             NaN   \n",
       "C3L-00090                               NaN                        0.405402   \n",
       "\n",
       "Name                                                                         \\\n",
       "Site                              S19S22S27                       S19S22T31   \n",
       "Peptide     AEPAsPDsPKGSsETETEPPVALAPGPAPTR AEPAsPDsPKGSSETEtEPPVALAPGPAPTR   \n",
       "Database_ID               ENSP00000000442.6               ENSP00000000442.6   \n",
       "Patient_ID                                                                    \n",
       "C3L-00006                               NaN                             NaN   \n",
       "C3L-00008                               NaN                             NaN   \n",
       "C3L-00032                               NaN                             NaN   \n",
       "C3L-00084                               NaN                             NaN   \n",
       "C3L-00090                          1.253045                             NaN   \n",
       "\n",
       "Name                               ...  \\\n",
       "Site                          S27  ...   \n",
       "Peptide     GSsETETEPPVALAPGPAPTR  ...   \n",
       "Database_ID     ENSP00000000442.6  ...   \n",
       "Patient_ID                         ...   \n",
       "C3L-00006                     NaN  ...   \n",
       "C3L-00008                     NaN  ...   \n",
       "C3L-00032                0.131605  ...   \n",
       "C3L-00084                     NaN  ...   \n",
       "C3L-00090                0.265813  ...   \n",
       "\n",
       "Name                                          SCRIB                    \\\n",
       "Site                                S1575T1588S1594             S1594   \n",
       "Peptide     LAEAPSPAPTPsPTPVEDLGPQTStSPGRLsPDFAEELR        LsPDFAEELR   \n",
       "Database_ID                       ENSP00000501177.1 ENSP00000501177.1   \n",
       "Patient_ID                                                              \n",
       "C3L-00006                                       NaN          0.667426   \n",
       "C3L-00008                                       NaN               NaN   \n",
       "C3L-00032                                       NaN          0.104862   \n",
       "C3L-00084                                       NaN          0.399718   \n",
       "C3L-00090                                       NaN          1.069439   \n",
       "\n",
       "Name                    TSGA10                                      \\\n",
       "Site                       S11              S173              S213   \n",
       "Peptide     sPGRDPELQVEAAEVTTK        sPSRLDSFVK           RPsPTAR   \n",
       "Database_ID  ENSP00000501312.1 ENSP00000501312.1 ENSP00000501312.1   \n",
       "Patient_ID                                                           \n",
       "C3L-00006                  NaN          0.905606               NaN   \n",
       "C3L-00008                  NaN         -0.488427               NaN   \n",
       "C3L-00032                  NaN               NaN               NaN   \n",
       "C3L-00084            -0.875016         -0.579824               NaN   \n",
       "C3L-00090                  NaN          0.510268         -1.889144   \n",
       "\n",
       "Name                                                         \\\n",
       "Site                     S391                          S779   \n",
       "Peptide            AMDTEsELGR GLDRsLEENLCYR;GLDRsLEENLCYRDF   \n",
       "Database_ID ENSP00000501312.1             ENSP00000501312.1   \n",
       "Patient_ID                                                    \n",
       "C3L-00006           -0.069911                     -0.584774   \n",
       "C3L-00008                 NaN                           NaN   \n",
       "C3L-00032                 NaN                           NaN   \n",
       "C3L-00084                 NaN                     -0.505807   \n",
       "C3L-00090                 NaN                     -0.592203   \n",
       "\n",
       "Name                                         SVIL                    \n",
       "Site                       S101              S296              S459  \n",
       "Peptide     EVVSSQVDDLTsHNEHLCK  DSEGDTPsLINWPSSK          LPsPTVAR  \n",
       "Database_ID   ENSP00000501312.1 ENSP00000501521.1 ENSP00000501521.1  \n",
       "Patient_ID                                                           \n",
       "C3L-00006                   NaN         -0.561657         -0.652457  \n",
       "C3L-00008                   NaN         -0.431599         -1.079638  \n",
       "C3L-00032                   NaN               NaN         -1.439041  \n",
       "C3L-00084                   NaN               NaN         -1.521725  \n",
       "C3L-00090                   NaN               NaN         -1.126482  \n",
       "\n",
       "[5 rows x 85661 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Display first 5 rows of phosphoproteomics data\n",
    "en.get_phosphoproteomics('umich').head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Join functions with multiindices \n",
    "The join functions have been written to handle multiindices. More information on the join functions can be found in the joining_dataframes tutorial. \n",
    "An example of joining a multiindexed dataframe (in this case phosphoproteomics) with a non multiindexed dataframe (in this case CNV) is below. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>Name</th>\n",
       "      <th>A1BG_washu_CNV</th>\n",
       "      <th>A1CF_washu_CNV</th>\n",
       "      <th>A2M_washu_CNV</th>\n",
       "      <th>A2ML1_washu_CNV</th>\n",
       "      <th>A3GALT2_washu_CNV</th>\n",
       "      <th>A4GALT_washu_CNV</th>\n",
       "      <th>A4GNT_washu_CNV</th>\n",
       "      <th>AAAS_washu_CNV</th>\n",
       "      <th>AACS_washu_CNV</th>\n",
       "      <th>AADAC_washu_CNV</th>\n",
       "      <th>...</th>\n",
       "      <th colspan=\"2\" halign=\"left\">SCRIB_umich_phosphoproteomics</th>\n",
       "      <th colspan=\"6\" halign=\"left\">TSGA10_umich_phosphoproteomics</th>\n",
       "      <th colspan=\"2\" halign=\"left\">SVIL_umich_phosphoproteomics</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Site</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>...</th>\n",
       "      <th>S1575T1588S1594</th>\n",
       "      <th>S1594</th>\n",
       "      <th>S11</th>\n",
       "      <th>S173</th>\n",
       "      <th>S213</th>\n",
       "      <th>S391</th>\n",
       "      <th>S779</th>\n",
       "      <th>S101</th>\n",
       "      <th>S296</th>\n",
       "      <th>S459</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Peptide</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>...</th>\n",
       "      <th>LAEAPSPAPTPsPTPVEDLGPQTStSPGRLsPDFAEELR</th>\n",
       "      <th>LsPDFAEELR</th>\n",
       "      <th>sPGRDPELQVEAAEVTTK</th>\n",
       "      <th>sPSRLDSFVK</th>\n",
       "      <th>RPsPTAR</th>\n",
       "      <th>AMDTEsELGR</th>\n",
       "      <th>GLDRsLEENLCYR;GLDRsLEENLCYRDF</th>\n",
       "      <th>EVVSSQVDDLTsHNEHLCK</th>\n",
       "      <th>DSEGDTPsLINWPSSK</th>\n",
       "      <th>LPsPTVAR</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "      <td>-0.00659</td>\n",
       "      <td>-0.01982</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01418</td>\n",
       "      <td>-0.00839</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.667426</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.905606</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.069911</td>\n",
       "      <td>-0.584774</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.561657</td>\n",
       "      <td>-0.652457</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "      <td>0.02578</td>\n",
       "      <td>0.00726</td>\n",
       "      <td>0.01350</td>\n",
       "      <td>0.01350</td>\n",
       "      <td>0.00732</td>\n",
       "      <td>0.01642</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>0.01225</td>\n",
       "      <td>0.01225</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.488427</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.431599</td>\n",
       "      <td>-1.079638</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "      <td>0.01262</td>\n",
       "      <td>0.00425</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>0.00166</td>\n",
       "      <td>0.00549</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.104862</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.439041</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00084</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.399718</td>\n",
       "      <td>-0.875016</td>\n",
       "      <td>-0.579824</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.505807</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.521725</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "      <td>0.00100</td>\n",
       "      <td>0.41191</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02436</td>\n",
       "      <td>-0.01198</td>\n",
       "      <td>-0.03307</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.03307</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.069439</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.510268</td>\n",
       "      <td>-1.889144</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.592203</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.126482</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 104580 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name       A1BG_washu_CNV A1CF_washu_CNV A2M_washu_CNV A2ML1_washu_CNV  \\\n",
       "Site                                                                     \n",
       "Peptide                                                                  \n",
       "Patient_ID                                                               \n",
       "C3L-00006        -0.00659       -0.01982      -0.01402        -0.01402   \n",
       "C3L-00008         0.02578        0.00726       0.01350         0.01350   \n",
       "C3L-00032         0.01262        0.00425      -0.00275        -0.00275   \n",
       "C3L-00084             NaN            NaN           NaN             NaN   \n",
       "C3L-00090         0.00100        0.41191      -0.02299        -0.02299   \n",
       "\n",
       "Name       A3GALT2_washu_CNV A4GALT_washu_CNV A4GNT_washu_CNV AAAS_washu_CNV  \\\n",
       "Site                                                                           \n",
       "Peptide                                                                        \n",
       "Patient_ID                                                                     \n",
       "C3L-00006           -0.01418         -0.00839        -0.01305       -0.01402   \n",
       "C3L-00008            0.00732          0.01642         0.01005        0.01225   \n",
       "C3L-00032            0.00166          0.00549        -0.00038       -0.00275   \n",
       "C3L-00084                NaN              NaN             NaN            NaN   \n",
       "C3L-00090           -0.02436         -0.01198        -0.03307       -0.02299   \n",
       "\n",
       "Name       AACS_washu_CNV AADAC_washu_CNV  ...  \\\n",
       "Site                                       ...   \n",
       "Peptide                                    ...   \n",
       "Patient_ID                                 ...   \n",
       "C3L-00006        -0.01402        -0.01305  ...   \n",
       "C3L-00008         0.01225         0.01005  ...   \n",
       "C3L-00032        -0.00275        -0.00038  ...   \n",
       "C3L-00084             NaN             NaN  ...   \n",
       "C3L-00090        -0.02299        -0.03307  ...   \n",
       "\n",
       "Name                 SCRIB_umich_phosphoproteomics             \\\n",
       "Site                               S1575T1588S1594      S1594   \n",
       "Peptide    LAEAPSPAPTPsPTPVEDLGPQTStSPGRLsPDFAEELR LsPDFAEELR   \n",
       "Patient_ID                                                      \n",
       "C3L-00006                                      NaN   0.667426   \n",
       "C3L-00008                                      NaN        NaN   \n",
       "C3L-00032                                      NaN   0.104862   \n",
       "C3L-00084                                      NaN   0.399718   \n",
       "C3L-00090                                      NaN   1.069439   \n",
       "\n",
       "Name       TSGA10_umich_phosphoproteomics                                  \\\n",
       "Site                                  S11       S173      S213       S391   \n",
       "Peptide                sPGRDPELQVEAAEVTTK sPSRLDSFVK   RPsPTAR AMDTEsELGR   \n",
       "Patient_ID                                                                  \n",
       "C3L-00006                             NaN   0.905606       NaN  -0.069911   \n",
       "C3L-00008                             NaN  -0.488427       NaN        NaN   \n",
       "C3L-00032                             NaN        NaN       NaN        NaN   \n",
       "C3L-00084                       -0.875016  -0.579824       NaN        NaN   \n",
       "C3L-00090                             NaN   0.510268 -1.889144        NaN   \n",
       "\n",
       "Name                                                          \\\n",
       "Site                                S779                S101   \n",
       "Peptide    GLDRsLEENLCYR;GLDRsLEENLCYRDF EVVSSQVDDLTsHNEHLCK   \n",
       "Patient_ID                                                     \n",
       "C3L-00006                      -0.584774                 NaN   \n",
       "C3L-00008                            NaN                 NaN   \n",
       "C3L-00032                            NaN                 NaN   \n",
       "C3L-00084                      -0.505807                 NaN   \n",
       "C3L-00090                      -0.592203                 NaN   \n",
       "\n",
       "Name       SVIL_umich_phosphoproteomics            \n",
       "Site                               S296      S459  \n",
       "Peptide                DSEGDTPsLINWPSSK  LPsPTVAR  \n",
       "Patient_ID                                         \n",
       "C3L-00006                     -0.561657 -0.652457  \n",
       "C3L-00008                     -0.431599 -1.079638  \n",
       "C3L-00032                           NaN -1.439041  \n",
       "C3L-00084                           NaN -1.521725  \n",
       "C3L-00090                           NaN -1.126482  \n",
       "\n",
       "[5 rows x 104580 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "phospho_and_CNV = en.join_omics_to_omics(df1_name=\"CNV\", df2_name=\"phosphoproteomics\", df1_source='washu', df2_source = 'umich')\n",
    "phospho_and_CNV.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since the C3L-00084 row doesn't have CNV data, it is filled in with NANs, so that it can be joined to the CNV dataframe. "
   ]
  },
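  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The NaN filling above is what an outer join on the row index produces. The following is a minimal sketch with plain `pandas` on toy data. The table names `cnv` and `phospho`, the patient labels, and all values are invented for illustration, and this is not necessarily how `join_omics_to_omics` is implemented internally.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "nan = float('nan')\n",
    "\n",
    "# Toy single-level CNV table; patient P3 is left out on purpose\n",
    "cnv = pd.DataFrame({'A1BG_CNV': [0.1, -0.2]}, index=['P1', 'P2'])\n",
    "\n",
    "# Toy phosphoproteomics table with two-level (Name, Site) columns\n",
    "phospho = pd.DataFrame(\n",
    "    [[0.5, nan], [nan, 1.2], [0.3, 0.4]],\n",
    "    index=['P1', 'P2', 'P3'],\n",
    "    columns=pd.MultiIndex.from_tuples(\n",
    "        [('SVIL', 'S296'), ('SVIL', 'S459')], names=['Name', 'Site']\n",
    "    ),\n",
    ")\n",
    "\n",
    "# Pad the CNV columns with an empty second level so both tables have two levels\n",
    "cnv.columns = pd.MultiIndex.from_tuples(\n",
    "    [(name, '') for name in cnv.columns], names=['Name', 'Site']\n",
    ")\n",
    "\n",
    "# An outer join on the row index keeps every patient; missing cells become NaN\n",
    "joined = cnv.join(phospho, how='outer')\n",
    "print(joined)\n",
    "```\n",
    "\n",
    "In this sketch, patient `P3` appears only in the toy phosphoproteomics table, so its CNV cell is NaN after the join, mirroring the C3L-00084 row above."
   ]
  },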
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How to select from multiindex\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selecting based on all levels\n",
    "We can select single columns by passing the proper keys for all levels of the multiindex. For example, to get the proteomics for ARF5, we'd do the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Database_ID</th>\n",
       "      <th>ENSP00000000233.5</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "      <td>-0.056513</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "      <td>0.549959</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "      <td>0.088681</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00084</th>\n",
       "      <td>-0.846555</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "      <td>0.539019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00098</th>\n",
       "      <td>-0.017370</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00136</th>\n",
       "      <td>0.230347</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00137</th>\n",
       "      <td>0.191915</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00139</th>\n",
       "      <td>-0.410142</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00143</th>\n",
       "      <td>-0.170514</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Database_ID  ENSP00000000233.5\n",
       "Patient_ID                    \n",
       "C3L-00006            -0.056513\n",
       "C3L-00008             0.549959\n",
       "C3L-00032             0.088681\n",
       "C3L-00084            -0.846555\n",
       "C3L-00090             0.539019\n",
       "C3L-00098            -0.017370\n",
       "C3L-00136             0.230347\n",
       "C3L-00137             0.191915\n",
       "C3L-00139            -0.410142\n",
       "C3L-00143            -0.170514"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prot = en.get_proteomics('umich')\n",
    "all_levels_selection = prot[\"ARF5\"]\n",
    "\n",
    "#Display the first 10 rows of the desired data\n",
    "all_levels_selection.head(10)"
   ]
  },
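  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell above passes only the `Name` key, so the result is still a dataframe whose column header carries the `Database_ID` level. Passing one key per level as a tuple returns the column as a Series instead. Here is a minimal sketch on a toy dataframe, assuming a two-level layout like the proteomics table. The IDs `ID_1` to `ID_3`, the patient labels, and all values are made up for illustration.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy table with two column levels, mimicking the proteomics layout\n",
    "cols = pd.MultiIndex.from_tuples(\n",
    "    [('ANK2', 'ID_1'), ('ANK2', 'ID_2'), ('ARF5', 'ID_3')],\n",
    "    names=['Name', 'Database_ID'],\n",
    ")\n",
    "toy = pd.DataFrame(\n",
    "    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],\n",
    "    index=pd.Index(['P1', 'P2'], name='Patient_ID'),\n",
    "    columns=cols,\n",
    ")\n",
    "\n",
    "# One key per level, passed as a tuple: a single column, returned as a Series\n",
    "one_column = toy[('ANK2', 'ID_2')]\n",
    "\n",
    "# Only the first level: every column under that name, returned as a DataFrame\n",
    "by_name = toy['ANK2']\n",
    "\n",
    "print(type(one_column).__name__, one_column.tolist())\n",
    "print(type(by_name).__name__, list(by_name.columns))\n",
    "```\n",
    "\n",
    "The tuple form is the safer choice when a name maps to more than one column, as `ANK2` does in this toy table."
   ]
  },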
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selecting based on one level\n",
    "We can easily select multiple columns from our multiindex dataframe, based on just the \"Name\" level of the multiindex:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>Name</th>\n",
       "      <th>ARF5</th>\n",
       "      <th>AKAP11</th>\n",
       "      <th>ARHGEF5</th>\n",
       "      <th>APPBP2</th>\n",
       "      <th>AQR</th>\n",
       "      <th>ACAP1</th>\n",
       "      <th>ANO8</th>\n",
       "      <th>AP3M2</th>\n",
       "      <th>ASNS</th>\n",
       "      <th>ALDH3A2</th>\n",
       "      <th>...</th>\n",
       "      <th>ABCC2</th>\n",
       "      <th>ANKS1A</th>\n",
       "      <th>AL034430.2</th>\n",
       "      <th>AP5Z1</th>\n",
       "      <th>ATP8B1</th>\n",
       "      <th>AKR1B15</th>\n",
       "      <th>AC004706.3</th>\n",
       "      <th>ATAD3B</th>\n",
       "      <th colspan=\"2\" halign=\"left\">ANK2</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Database_ID</th>\n",
       "      <th>ENSP00000000233.5</th>\n",
       "      <th>ENSP00000025301.2</th>\n",
       "      <th>ENSP00000056217.5</th>\n",
       "      <th>ENSP00000083182.3</th>\n",
       "      <th>ENSP00000156471.5</th>\n",
       "      <th>ENSP00000158762.3</th>\n",
       "      <th>ENSP00000159087.4</th>\n",
       "      <th>ENSP00000174653.3</th>\n",
       "      <th>ENSP00000175506.4</th>\n",
       "      <th>ENSP00000176643.6</th>\n",
       "      <th>...</th>\n",
       "      <th>ENSP00000497274.1</th>\n",
       "      <th>ENSP00000497393.1</th>\n",
       "      <th>ENSP00000497510.1</th>\n",
       "      <th>ENSP00000497815.1</th>\n",
       "      <th>ENSP00000497896.1</th>\n",
       "      <th>ENSP00000498877.1</th>\n",
       "      <th>ENSP00000499350.1</th>\n",
       "      <th>ENSP00000500094.1</th>\n",
       "      <th>ENSP00000500102.1</th>\n",
       "      <th>ENSP00000500937.1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "      <td>-0.056513</td>\n",
       "      <td>-0.385278</td>\n",
       "      <td>0.188877</td>\n",
       "      <td>-0.059319</td>\n",
       "      <td>0.276154</td>\n",
       "      <td>-0.252270</td>\n",
       "      <td>1.280740</td>\n",
       "      <td>0.086567</td>\n",
       "      <td>0.334008</td>\n",
       "      <td>1.048464</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.454359</td>\n",
       "      <td>1.346643</td>\n",
       "      <td>-0.186762</td>\n",
       "      <td>-0.361594</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "      <td>0.549959</td>\n",
       "      <td>-0.491451</td>\n",
       "      <td>0.277281</td>\n",
       "      <td>0.225857</td>\n",
       "      <td>0.400321</td>\n",
       "      <td>-0.485365</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.544367</td>\n",
       "      <td>1.634042</td>\n",
       "      <td>-0.848812</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.438931</td>\n",
       "      <td>0.250021</td>\n",
       "      <td>0.005658</td>\n",
       "      <td>1.065706</td>\n",
       "      <td>-0.310341</td>\n",
       "      <td>0.060549</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.62873</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "      <td>0.088681</td>\n",
       "      <td>0.203899</td>\n",
       "      <td>0.261918</td>\n",
       "      <td>0.192734</td>\n",
       "      <td>-0.244333</td>\n",
       "      <td>0.169655</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.223638</td>\n",
       "      <td>0.358561</td>\n",
       "      <td>-0.314030</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.411541</td>\n",
       "      <td>0.043151</td>\n",
       "      <td>0.461451</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.300528</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00084</th>\n",
       "      <td>-0.846555</td>\n",
       "      <td>-0.286751</td>\n",
       "      <td>-0.468015</td>\n",
       "      <td>0.249142</td>\n",
       "      <td>0.013797</td>\n",
       "      <td>-0.606966</td>\n",
       "      <td>-0.303256</td>\n",
       "      <td>-0.398076</td>\n",
       "      <td>1.017079</td>\n",
       "      <td>-0.385280</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.423702</td>\n",
       "      <td>-0.524652</td>\n",
       "      <td>0.111429</td>\n",
       "      <td>0.172027</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.102475</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "      <td>0.539019</td>\n",
       "      <td>-0.098589</td>\n",
       "      <td>0.605331</td>\n",
       "      <td>0.571185</td>\n",
       "      <td>0.178541</td>\n",
       "      <td>-0.567123</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.053186</td>\n",
       "      <td>0.390269</td>\n",
       "      <td>1.059128</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.737756</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.580644</td>\n",
       "      <td>-0.108808</td>\n",
       "      <td>0.429643</td>\n",
       "      <td>-0.218494</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.314156</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 1001 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name                     ARF5            AKAP11           ARHGEF5  \\\n",
       "Database_ID ENSP00000000233.5 ENSP00000025301.2 ENSP00000056217.5   \n",
       "Patient_ID                                                          \n",
       "C3L-00006           -0.056513         -0.385278          0.188877   \n",
       "C3L-00008            0.549959         -0.491451          0.277281   \n",
       "C3L-00032            0.088681          0.203899          0.261918   \n",
       "C3L-00084           -0.846555         -0.286751         -0.468015   \n",
       "C3L-00090            0.539019         -0.098589          0.605331   \n",
       "\n",
       "Name                   APPBP2               AQR             ACAP1  \\\n",
       "Database_ID ENSP00000083182.3 ENSP00000156471.5 ENSP00000158762.3   \n",
       "Patient_ID                                                          \n",
       "C3L-00006           -0.059319          0.276154         -0.252270   \n",
       "C3L-00008            0.225857          0.400321         -0.485365   \n",
       "C3L-00032            0.192734         -0.244333          0.169655   \n",
       "C3L-00084            0.249142          0.013797         -0.606966   \n",
       "C3L-00090            0.571185          0.178541         -0.567123   \n",
       "\n",
       "Name                     ANO8             AP3M2              ASNS  \\\n",
       "Database_ID ENSP00000159087.4 ENSP00000174653.3 ENSP00000175506.4   \n",
       "Patient_ID                                                          \n",
       "C3L-00006            1.280740          0.086567          0.334008   \n",
       "C3L-00008                 NaN         -0.544367          1.634042   \n",
       "C3L-00032                 NaN          0.223638          0.358561   \n",
       "C3L-00084           -0.303256         -0.398076          1.017079   \n",
       "C3L-00090                 NaN          0.053186          0.390269   \n",
       "\n",
       "Name                  ALDH3A2  ...             ABCC2            ANKS1A  \\\n",
       "Database_ID ENSP00000176643.6  ... ENSP00000497274.1 ENSP00000497393.1   \n",
       "Patient_ID                     ...                                       \n",
       "C3L-00006            1.048464  ...               NaN          0.454359   \n",
       "C3L-00008           -0.848812  ...               NaN          0.438931   \n",
       "C3L-00032           -0.314030  ...               NaN               NaN   \n",
       "C3L-00084           -0.385280  ...               NaN          1.423702   \n",
       "C3L-00090            1.059128  ...         -0.737756               NaN   \n",
       "\n",
       "Name               AL034430.2             AP5Z1            ATP8B1  \\\n",
       "Database_ID ENSP00000497510.1 ENSP00000497815.1 ENSP00000497896.1   \n",
       "Patient_ID                                                          \n",
       "C3L-00006            1.346643         -0.186762         -0.361594   \n",
       "C3L-00008            0.250021          0.005658          1.065706   \n",
       "C3L-00032            0.411541          0.043151          0.461451   \n",
       "C3L-00084           -0.524652          0.111429          0.172027   \n",
       "C3L-00090            0.580644         -0.108808          0.429643   \n",
       "\n",
       "Name                  AKR1B15        AC004706.3            ATAD3B  \\\n",
       "Database_ID ENSP00000498877.1 ENSP00000499350.1 ENSP00000500094.1   \n",
       "Patient_ID                                                          \n",
       "C3L-00006                 NaN               NaN               NaN   \n",
       "C3L-00008           -0.310341          0.060549               NaN   \n",
       "C3L-00032                 NaN               NaN          0.300528   \n",
       "C3L-00084                 NaN               NaN          0.102475   \n",
       "C3L-00090           -0.218494               NaN          0.314156   \n",
       "\n",
       "Name                     ANK2                    \n",
       "Database_ID ENSP00000500102.1 ENSP00000500937.1  \n",
       "Patient_ID                                       \n",
       "C3L-00006                 NaN               NaN  \n",
       "C3L-00008            -0.62873               NaN  \n",
       "C3L-00032                 NaN               NaN  \n",
       "C3L-00084                 NaN               NaN  \n",
       "C3L-00090                 NaN               NaN  \n",
       "\n",
       "[5 rows x 1001 columns]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gene1_filter = prot.columns.get_level_values(\"Name\").str.startswith(\"A\") # Boolean filter: True for every column whose \"Name\" key (the gene name) starts with \"A\"\n",
    "gene1_data = prot.loc[:, gene1_filter]\n",
    "gene1_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selecting based on a different level of the multiindex\n",
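    "We can also select based on one of the inner levels of the multiindex. For example, we can filter columns on the `Database_ID` level. Note that `str.contains` is case-sensitive by default, so a pattern whose case does not match the IDs (which begin with `ENSP`) selects no columns:"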
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00084</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Empty DataFrame\n",
       "Columns: []\n",
       "Index: [C3L-00006, C3L-00008, C3L-00032, C3L-00084, C3L-00090]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y_site_filter = prot.columns.get_level_values(\"Database_ID\").str.contains(\"ENSp\") # Create a boolean filter selecting all columns where the Database_ID level contains \"ENSp\" (case-sensitive, so the \"ENSP...\" IDs do not match)\n",
    "\n",
    "y_sites = prot.loc[:, y_site_filter] # Select the columns\n",
    "y_sites.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How to use `cptac.utils.reduce_multiindex()`\n",
    "To make it easier to work with multi-level indices, we provide the `reduce_multiindex` function, available for import from the `cptac.utils` submodule. It can both drop levels from a multiindex and \"flatten\" a multi-level index into a single-level index by concatenating the keys from multiple levels into a single key for each column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cptac.utils as ut"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dropping Levels\n",
    "We can drop levels by index or by name, and we can drop one level or several at once.\n",
    "Note that `reduce_multiindex` warns you if dropping levels produces duplicate column headers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dropping by index or name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "cptac warning: Due to dropping the specified levels, dataframe now has 1299 duplicated column headers. (C:\\Users\\sabme\\AppData\\Local\\Temp\\ipykernel_21892\\2675409348.py, line 1)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Name</th>\n",
       "      <th>ARF5</th>\n",
       "      <th>M6PR</th>\n",
       "      <th>ESRRA</th>\n",
       "      <th>FKBP4</th>\n",
       "      <th>NDUFAF7</th>\n",
       "      <th>FUCA2</th>\n",
       "      <th>DBNDD1</th>\n",
       "      <th>SEMA3F</th>\n",
       "      <th>CFTR</th>\n",
       "      <th>CYP51A1</th>\n",
       "      <th>...</th>\n",
       "      <th>SCRIB</th>\n",
       "      <th>WIZ</th>\n",
       "      <th>BPIFB4</th>\n",
       "      <th>LDB1</th>\n",
       "      <th>WIZ</th>\n",
       "      <th>TSGA10</th>\n",
       "      <th>RFX7</th>\n",
       "      <th>SWSAP1</th>\n",
       "      <th>MSANTD2</th>\n",
       "      <th>SVIL</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "      <td>-0.056513</td>\n",
       "      <td>0.016557</td>\n",
       "      <td>0.002569</td>\n",
       "      <td>0.389819</td>\n",
       "      <td>0.603610</td>\n",
       "      <td>-0.332543</td>\n",
       "      <td>-0.790426</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.822732</td>\n",
       "      <td>0.039134</td>\n",
       "      <td>...</td>\n",
       "      <td>0.161720</td>\n",
       "      <td>-0.884807</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.268247</td>\n",
       "      <td>0.125392</td>\n",
       "      <td>-0.880833</td>\n",
       "      <td>0.108554</td>\n",
       "      <td>0.107413</td>\n",
       "      <td>-0.085833</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "      <td>0.549959</td>\n",
       "      <td>-0.206129</td>\n",
       "      <td>0.905784</td>\n",
       "      <td>-0.303631</td>\n",
       "      <td>0.018767</td>\n",
       "      <td>0.503513</td>\n",
       "      <td>0.950955</td>\n",
       "      <td>0.080142</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.063213</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.054284</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.106450</td>\n",
       "      <td>0.380557</td>\n",
       "      <td>-0.756099</td>\n",
       "      <td>0.264611</td>\n",
       "      <td>0.044423</td>\n",
       "      <td>-0.248319</td>\n",
       "      <td>-1.206596</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "      <td>0.088681</td>\n",
       "      <td>-0.154447</td>\n",
       "      <td>-0.190515</td>\n",
       "      <td>0.170753</td>\n",
       "      <td>0.196356</td>\n",
       "      <td>0.544194</td>\n",
       "      <td>-0.179078</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.377405</td>\n",
       "      <td>...</td>\n",
       "      <td>-1.086905</td>\n",
       "      <td>0.055991</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.021986</td>\n",
       "      <td>-0.229645</td>\n",
       "      <td>1.923986</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.176694</td>\n",
       "      <td>-0.332384</td>\n",
       "      <td>-1.330653</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00084</th>\n",
       "      <td>-0.846555</td>\n",
       "      <td>0.027740</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.178700</td>\n",
       "      <td>0.264054</td>\n",
       "      <td>-0.183548</td>\n",
       "      <td>0.077215</td>\n",
       "      <td>-0.247164</td>\n",
       "      <td>0.152277</td>\n",
       "      <td>-0.279549</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.125796</td>\n",
       "      <td>0.944212</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.917409</td>\n",
       "      <td>0.026862</td>\n",
       "      <td>-0.885976</td>\n",
       "      <td>-0.006510</td>\n",
       "      <td>-0.014162</td>\n",
       "      <td>0.365158</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "      <td>0.539019</td>\n",
       "      <td>0.956619</td>\n",
       "      <td>-0.039516</td>\n",
       "      <td>0.323656</td>\n",
       "      <td>0.064605</td>\n",
       "      <td>0.173433</td>\n",
       "      <td>-0.524325</td>\n",
       "      <td>-0.038590</td>\n",
       "      <td>-0.311486</td>\n",
       "      <td>0.309905</td>\n",
       "      <td>...</td>\n",
       "      <td>0.853362</td>\n",
       "      <td>-0.716947</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.286277</td>\n",
       "      <td>-0.046076</td>\n",
       "      <td>0.089645</td>\n",
       "      <td>-0.444506</td>\n",
       "      <td>-0.072531</td>\n",
       "      <td>-0.463495</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 12662 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name            ARF5      M6PR     ESRRA     FKBP4   NDUFAF7     FUCA2  \\\n",
       "Patient_ID                                                               \n",
       "C3L-00006  -0.056513  0.016557  0.002569  0.389819  0.603610 -0.332543   \n",
       "C3L-00008   0.549959 -0.206129  0.905784 -0.303631  0.018767  0.503513   \n",
       "C3L-00032   0.088681 -0.154447 -0.190515  0.170753  0.196356  0.544194   \n",
       "C3L-00084  -0.846555  0.027740       NaN  0.178700  0.264054 -0.183548   \n",
       "C3L-00090   0.539019  0.956619 -0.039516  0.323656  0.064605  0.173433   \n",
       "\n",
       "Name          DBNDD1    SEMA3F      CFTR   CYP51A1  ...     SCRIB       WIZ  \\\n",
       "Patient_ID                                          ...                       \n",
       "C3L-00006  -0.790426       NaN  0.822732  0.039134  ...  0.161720 -0.884807   \n",
       "C3L-00008   0.950955  0.080142       NaN -0.063213  ...       NaN  0.054284   \n",
       "C3L-00032  -0.179078       NaN       NaN  0.377405  ... -1.086905  0.055991   \n",
       "C3L-00084   0.077215 -0.247164  0.152277 -0.279549  ... -0.125796  0.944212   \n",
       "C3L-00090  -0.524325 -0.038590 -0.311486  0.309905  ...  0.853362 -0.716947   \n",
       "\n",
       "Name        BPIFB4      LDB1       WIZ    TSGA10      RFX7    SWSAP1  \\\n",
       "Patient_ID                                                             \n",
       "C3L-00006      NaN  0.268247  0.125392 -0.880833  0.108554  0.107413   \n",
       "C3L-00008      NaN -0.106450  0.380557 -0.756099  0.264611  0.044423   \n",
       "C3L-00032      NaN -0.021986 -0.229645  1.923986       NaN -0.176694   \n",
       "C3L-00084      NaN  0.917409  0.026862 -0.885976 -0.006510 -0.014162   \n",
       "C3L-00090      NaN -0.286277 -0.046076  0.089645 -0.444506 -0.072531   \n",
       "\n",
       "Name         MSANTD2      SVIL  \n",
       "Patient_ID                      \n",
       "C3L-00006  -0.085833       NaN  \n",
       "C3L-00008  -0.248319 -1.206596  \n",
       "C3L-00032  -0.332384 -1.330653  \n",
       "C3L-00084   0.365158       NaN  \n",
       "C3L-00090  -0.463495       NaN  \n",
       "\n",
       "[5 rows x 12662 columns]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ut.reduce_multiindex(df=prot, levels_to_drop=\"Database_ID\").head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dropping single or multiple levels at once\n",
    "By passing a list (or array-like) to `levels_to_drop`, we can drop multiple levels of the multiindex at the same time. Note that we must leave at least one existing level.\n",
    "\n",
    "We will show this with the colon data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>Name</th>\n",
       "      <th>ARF5</th>\n",
       "      <th>M6PR</th>\n",
       "      <th>ESRRA</th>\n",
       "      <th>FKBP4</th>\n",
       "      <th>NDUFAF7</th>\n",
       "      <th>FUCA2</th>\n",
       "      <th>CFTR</th>\n",
       "      <th>CYP51A1</th>\n",
       "      <th>USP28</th>\n",
       "      <th>TMEM176A</th>\n",
       "      <th>...</th>\n",
       "      <th>TMUB1</th>\n",
       "      <th>CSNK1A1</th>\n",
       "      <th>MICAL2</th>\n",
       "      <th>ANK2</th>\n",
       "      <th>SEPTIN7</th>\n",
       "      <th>ATAD3B</th>\n",
       "      <th>ETNK1</th>\n",
       "      <th>MYO6</th>\n",
       "      <th>WIZ</th>\n",
       "      <th>HSPA12A</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Database_ID</th>\n",
       "      <th>ENSP00000000233.5</th>\n",
       "      <th>ENSP00000000412.3</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000001008.4</th>\n",
       "      <th>ENSP00000002125.4</th>\n",
       "      <th>ENSP00000002165.5</th>\n",
       "      <th>ENSP00000003084.6</th>\n",
       "      <th>ENSP00000003100.8</th>\n",
       "      <th>ENSP00000003302.4</th>\n",
       "      <th>ENSP00000004103.3</th>\n",
       "      <th>...</th>\n",
       "      <th>ENSP00000499339.1</th>\n",
       "      <th>ENSP00000499757.1</th>\n",
       "      <th>ENSP00000499778.1</th>\n",
       "      <th>ENSP00000499869.1</th>\n",
       "      <th>ENSP00000499937.1</th>\n",
       "      <th>ENSP00000500094.1</th>\n",
       "      <th>ENSP00000500633.1</th>\n",
       "      <th>ENSP00000500710.1</th>\n",
       "      <th>ENSP00000501300.1</th>\n",
       "      <th>ENSP00000501491.1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>01CO005</th>\n",
       "      <td>-0.203037</td>\n",
       "      <td>-0.223341</td>\n",
       "      <td>-0.283633</td>\n",
       "      <td>-0.612614</td>\n",
       "      <td>0.514855</td>\n",
       "      <td>-0.824026</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.045383</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.248511</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.042548</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.925011</td>\n",
       "      <td>-0.173468</td>\n",
       "      <td>-0.180521</td>\n",
       "      <td>0.139707</td>\n",
       "      <td>-0.882283</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO006</th>\n",
       "      <td>0.188931</td>\n",
       "      <td>0.544620</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.571640</td>\n",
       "      <td>-0.209734</td>\n",
       "      <td>0.799090</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.338493</td>\n",
       "      <td>-0.042567</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.411664</td>\n",
       "      <td>-0.454109</td>\n",
       "      <td>-0.725892</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.707588</td>\n",
       "      <td>-0.846624</td>\n",
       "      <td>0.329813</td>\n",
       "      <td>-0.311147</td>\n",
       "      <td>-0.446358</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO008</th>\n",
       "      <td>0.404810</td>\n",
       "      <td>-0.246523</td>\n",
       "      <td>-0.053940</td>\n",
       "      <td>0.252995</td>\n",
       "      <td>0.190861</td>\n",
       "      <td>0.101419</td>\n",
       "      <td>-0.502876</td>\n",
       "      <td>0.627060</td>\n",
       "      <td>0.089815</td>\n",
       "      <td>-0.106411</td>\n",
       "      <td>...</td>\n",
       "      <td>0.192279</td>\n",
       "      <td>-0.558236</td>\n",
       "      <td>-0.093708</td>\n",
       "      <td>-1.874293</td>\n",
       "      <td>-0.248307</td>\n",
       "      <td>-0.899186</td>\n",
       "      <td>-0.526260</td>\n",
       "      <td>0.668713</td>\n",
       "      <td>0.109366</td>\n",
       "      <td>-1.125296</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO013</th>\n",
       "      <td>-0.276982</td>\n",
       "      <td>-0.017659</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.455055</td>\n",
       "      <td>0.500686</td>\n",
       "      <td>-0.350366</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.263168</td>\n",
       "      <td>0.683830</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>0.220231</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.241860</td>\n",
       "      <td>-3.939263</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.514931</td>\n",
       "      <td>-0.078267</td>\n",
       "      <td>0.122032</td>\n",
       "      <td>0.130764</td>\n",
       "      <td>-1.146911</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO014</th>\n",
       "      <td>-0.160155</td>\n",
       "      <td>0.100022</td>\n",
       "      <td>0.259696</td>\n",
       "      <td>0.341345</td>\n",
       "      <td>-0.310265</td>\n",
       "      <td>0.095461</td>\n",
       "      <td>-0.745855</td>\n",
       "      <td>1.006614</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.198671</td>\n",
       "      <td>0.226146</td>\n",
       "      <td>0.036229</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.189468</td>\n",
       "      <td>0.117736</td>\n",
       "      <td>0.586529</td>\n",
       "      <td>-0.006767</td>\n",
       "      <td>-1.106068</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 9457 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name                     ARF5              M6PR             ESRRA  \\\n",
       "Database_ID ENSP00000000233.5 ENSP00000000412.3 ENSP00000000442.6   \n",
       "Patient_ID                                                          \n",
       "01CO005             -0.203037         -0.223341         -0.283633   \n",
       "01CO006              0.188931          0.544620               NaN   \n",
       "01CO008              0.404810         -0.246523         -0.053940   \n",
       "01CO013             -0.276982         -0.017659               NaN   \n",
       "01CO014             -0.160155          0.100022          0.259696   \n",
       "\n",
       "Name                    FKBP4           NDUFAF7             FUCA2  \\\n",
       "Database_ID ENSP00000001008.4 ENSP00000002125.4 ENSP00000002165.5   \n",
       "Patient_ID                                                          \n",
       "01CO005             -0.612614          0.514855         -0.824026   \n",
       "01CO006             -0.571640         -0.209734          0.799090   \n",
       "01CO008              0.252995          0.190861          0.101419   \n",
       "01CO013             -0.455055          0.500686         -0.350366   \n",
       "01CO014              0.341345         -0.310265          0.095461   \n",
       "\n",
       "Name                     CFTR           CYP51A1             USP28  \\\n",
       "Database_ID ENSP00000003084.6 ENSP00000003100.8 ENSP00000003302.4   \n",
       "Patient_ID                                                          \n",
       "01CO005                   NaN          0.045383               NaN   \n",
       "01CO006                   NaN         -0.338493         -0.042567   \n",
       "01CO008             -0.502876          0.627060          0.089815   \n",
       "01CO013                   NaN          0.263168          0.683830   \n",
       "01CO014             -0.745855          1.006614               NaN   \n",
       "\n",
       "Name                 TMEM176A  ...             TMUB1           CSNK1A1  \\\n",
       "Database_ID ENSP00000004103.3  ... ENSP00000499339.1 ENSP00000499757.1   \n",
       "Patient_ID                     ...                                       \n",
       "01CO005             -0.248511  ...               NaN               NaN   \n",
       "01CO006                   NaN  ...         -0.411664         -0.454109   \n",
       "01CO008             -0.106411  ...          0.192279         -0.558236   \n",
       "01CO013                   NaN  ...          0.220231               NaN   \n",
       "01CO014                   NaN  ...         -0.198671          0.226146   \n",
       "\n",
       "Name                   MICAL2              ANK2           SEPTIN7  \\\n",
       "Database_ID ENSP00000499778.1 ENSP00000499869.1 ENSP00000499937.1   \n",
       "Patient_ID                                                          \n",
       "01CO005             -0.042548               NaN               NaN   \n",
       "01CO006             -0.725892               NaN               NaN   \n",
       "01CO008             -0.093708         -1.874293         -0.248307   \n",
       "01CO013              0.241860         -3.939263               NaN   \n",
       "01CO014              0.036229               NaN               NaN   \n",
       "\n",
       "Name                   ATAD3B             ETNK1              MYO6  \\\n",
       "Database_ID ENSP00000500094.1 ENSP00000500633.1 ENSP00000500710.1   \n",
       "Patient_ID                                                          \n",
       "01CO005              0.925011         -0.173468         -0.180521   \n",
       "01CO006             -0.707588         -0.846624          0.329813   \n",
       "01CO008             -0.899186         -0.526260          0.668713   \n",
       "01CO013              0.514931         -0.078267          0.122032   \n",
       "01CO014              1.189468          0.117736          0.586529   \n",
       "\n",
       "Name                      WIZ           HSPA12A  \n",
       "Database_ID ENSP00000501300.1 ENSP00000501491.1  \n",
       "Patient_ID                                       \n",
       "01CO005              0.139707         -0.882283  \n",
       "01CO006             -0.311147         -0.446358  \n",
       "01CO008              0.109366         -1.125296  \n",
       "01CO013              0.130764         -1.146911  \n",
       "01CO014             -0.006767         -1.106068  \n",
       "\n",
       "[5 rows x 9457 columns]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "colon = cptac.Coad()\n",
    "prot = colon.get_proteomics('umich')\n",
    "prot.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Database_ID</th>\n",
       "      <th>ENSP00000000233.5</th>\n",
       "      <th>ENSP00000000412.3</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000001008.4</th>\n",
       "      <th>ENSP00000002125.4</th>\n",
       "      <th>ENSP00000002165.5</th>\n",
       "      <th>ENSP00000003084.6</th>\n",
       "      <th>ENSP00000003100.8</th>\n",
       "      <th>ENSP00000003302.4</th>\n",
       "      <th>ENSP00000004103.3</th>\n",
       "      <th>...</th>\n",
       "      <th>ENSP00000499339.1</th>\n",
       "      <th>ENSP00000499757.1</th>\n",
       "      <th>ENSP00000499778.1</th>\n",
       "      <th>ENSP00000499869.1</th>\n",
       "      <th>ENSP00000499937.1</th>\n",
       "      <th>ENSP00000500094.1</th>\n",
       "      <th>ENSP00000500633.1</th>\n",
       "      <th>ENSP00000500710.1</th>\n",
       "      <th>ENSP00000501300.1</th>\n",
       "      <th>ENSP00000501491.1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>01CO005</th>\n",
       "      <td>-0.203037</td>\n",
       "      <td>-0.223341</td>\n",
       "      <td>-0.283633</td>\n",
       "      <td>-0.612614</td>\n",
       "      <td>0.514855</td>\n",
       "      <td>-0.824026</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.045383</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.248511</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.042548</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.925011</td>\n",
       "      <td>-0.173468</td>\n",
       "      <td>-0.180521</td>\n",
       "      <td>0.139707</td>\n",
       "      <td>-0.882283</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO006</th>\n",
       "      <td>0.188931</td>\n",
       "      <td>0.544620</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.571640</td>\n",
       "      <td>-0.209734</td>\n",
       "      <td>0.799090</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.338493</td>\n",
       "      <td>-0.042567</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.411664</td>\n",
       "      <td>-0.454109</td>\n",
       "      <td>-0.725892</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.707588</td>\n",
       "      <td>-0.846624</td>\n",
       "      <td>0.329813</td>\n",
       "      <td>-0.311147</td>\n",
       "      <td>-0.446358</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO008</th>\n",
       "      <td>0.404810</td>\n",
       "      <td>-0.246523</td>\n",
       "      <td>-0.053940</td>\n",
       "      <td>0.252995</td>\n",
       "      <td>0.190861</td>\n",
       "      <td>0.101419</td>\n",
       "      <td>-0.502876</td>\n",
       "      <td>0.627060</td>\n",
       "      <td>0.089815</td>\n",
       "      <td>-0.106411</td>\n",
       "      <td>...</td>\n",
       "      <td>0.192279</td>\n",
       "      <td>-0.558236</td>\n",
       "      <td>-0.093708</td>\n",
       "      <td>-1.874293</td>\n",
       "      <td>-0.248307</td>\n",
       "      <td>-0.899186</td>\n",
       "      <td>-0.526260</td>\n",
       "      <td>0.668713</td>\n",
       "      <td>0.109366</td>\n",
       "      <td>-1.125296</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO013</th>\n",
       "      <td>-0.276982</td>\n",
       "      <td>-0.017659</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.455055</td>\n",
       "      <td>0.500686</td>\n",
       "      <td>-0.350366</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.263168</td>\n",
       "      <td>0.683830</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>0.220231</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.241860</td>\n",
       "      <td>-3.939263</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.514931</td>\n",
       "      <td>-0.078267</td>\n",
       "      <td>0.122032</td>\n",
       "      <td>0.130764</td>\n",
       "      <td>-1.146911</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO014</th>\n",
       "      <td>-0.160155</td>\n",
       "      <td>0.100022</td>\n",
       "      <td>0.259696</td>\n",
       "      <td>0.341345</td>\n",
       "      <td>-0.310265</td>\n",
       "      <td>0.095461</td>\n",
       "      <td>-0.745855</td>\n",
       "      <td>1.006614</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.198671</td>\n",
       "      <td>0.226146</td>\n",
       "      <td>0.036229</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.189468</td>\n",
       "      <td>0.117736</td>\n",
       "      <td>0.586529</td>\n",
       "      <td>-0.006767</td>\n",
       "      <td>-1.106068</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 9457 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Database_ID  ENSP00000000233.5  ENSP00000000412.3  ENSP00000000442.6  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.203037          -0.223341          -0.283633   \n",
       "01CO006               0.188931           0.544620                NaN   \n",
       "01CO008               0.404810          -0.246523          -0.053940   \n",
       "01CO013              -0.276982          -0.017659                NaN   \n",
       "01CO014              -0.160155           0.100022           0.259696   \n",
       "\n",
       "Database_ID  ENSP00000001008.4  ENSP00000002125.4  ENSP00000002165.5  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.612614           0.514855          -0.824026   \n",
       "01CO006              -0.571640          -0.209734           0.799090   \n",
       "01CO008               0.252995           0.190861           0.101419   \n",
       "01CO013              -0.455055           0.500686          -0.350366   \n",
       "01CO014               0.341345          -0.310265           0.095461   \n",
       "\n",
       "Database_ID  ENSP00000003084.6  ENSP00000003100.8  ENSP00000003302.4  \\\n",
       "Patient_ID                                                             \n",
       "01CO005                    NaN           0.045383                NaN   \n",
       "01CO006                    NaN          -0.338493          -0.042567   \n",
       "01CO008              -0.502876           0.627060           0.089815   \n",
       "01CO013                    NaN           0.263168           0.683830   \n",
       "01CO014              -0.745855           1.006614                NaN   \n",
       "\n",
       "Database_ID  ENSP00000004103.3  ...  ENSP00000499339.1  ENSP00000499757.1  \\\n",
       "Patient_ID                      ...                                         \n",
       "01CO005              -0.248511  ...                NaN                NaN   \n",
       "01CO006                    NaN  ...          -0.411664          -0.454109   \n",
       "01CO008              -0.106411  ...           0.192279          -0.558236   \n",
       "01CO013                    NaN  ...           0.220231                NaN   \n",
       "01CO014                    NaN  ...          -0.198671           0.226146   \n",
       "\n",
       "Database_ID  ENSP00000499778.1  ENSP00000499869.1  ENSP00000499937.1  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.042548                NaN                NaN   \n",
       "01CO006              -0.725892                NaN                NaN   \n",
       "01CO008              -0.093708          -1.874293          -0.248307   \n",
       "01CO013               0.241860          -3.939263                NaN   \n",
       "01CO014               0.036229                NaN                NaN   \n",
       "\n",
       "Database_ID  ENSP00000500094.1  ENSP00000500633.1  ENSP00000500710.1  \\\n",
       "Patient_ID                                                             \n",
       "01CO005               0.925011          -0.173468          -0.180521   \n",
       "01CO006              -0.707588          -0.846624           0.329813   \n",
       "01CO008              -0.899186          -0.526260           0.668713   \n",
       "01CO013               0.514931          -0.078267           0.122032   \n",
       "01CO014               1.189468           0.117736           0.586529   \n",
       "\n",
       "Database_ID  ENSP00000501300.1  ENSP00000501491.1  \n",
       "Patient_ID                                         \n",
       "01CO005               0.139707          -0.882283  \n",
       "01CO006              -0.311147          -0.446358  \n",
       "01CO008               0.109366          -1.125296  \n",
       "01CO013               0.130764          -1.146911  \n",
       "01CO014              -0.006767          -1.106068  \n",
       "\n",
       "[5 rows x 9457 columns]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Drop level 'Name'\n",
    "ut.reduce_multiindex(df=prot, levels_to_drop='Name').head()\n",
    "#You can also pass a list in order to drop multiple levels"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Combining levels (Flattening)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can combine levels of a multiindexed dataframe. When combined the levels will be sepereated by an underscore, by default. We could specify a different seperator using the `sep` parameter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Name</th>\n",
       "      <th>ARF5_ENSP00000000233.5</th>\n",
       "      <th>M6PR_ENSP00000000412.3</th>\n",
       "      <th>ESRRA_ENSP00000000442.6</th>\n",
       "      <th>FKBP4_ENSP00000001008.4</th>\n",
       "      <th>NDUFAF7_ENSP00000002125.4</th>\n",
       "      <th>FUCA2_ENSP00000002165.5</th>\n",
       "      <th>CFTR_ENSP00000003084.6</th>\n",
       "      <th>CYP51A1_ENSP00000003100.8</th>\n",
       "      <th>USP28_ENSP00000003302.4</th>\n",
       "      <th>TMEM176A_ENSP00000004103.3</th>\n",
       "      <th>...</th>\n",
       "      <th>TMUB1_ENSP00000499339.1</th>\n",
       "      <th>CSNK1A1_ENSP00000499757.1</th>\n",
       "      <th>MICAL2_ENSP00000499778.1</th>\n",
       "      <th>ANK2_ENSP00000499869.1</th>\n",
       "      <th>SEPTIN7_ENSP00000499937.1</th>\n",
       "      <th>ATAD3B_ENSP00000500094.1</th>\n",
       "      <th>ETNK1_ENSP00000500633.1</th>\n",
       "      <th>MYO6_ENSP00000500710.1</th>\n",
       "      <th>WIZ_ENSP00000501300.1</th>\n",
       "      <th>HSPA12A_ENSP00000501491.1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>01CO005</th>\n",
       "      <td>-0.203037</td>\n",
       "      <td>-0.223341</td>\n",
       "      <td>-0.283633</td>\n",
       "      <td>-0.612614</td>\n",
       "      <td>0.514855</td>\n",
       "      <td>-0.824026</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.045383</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.248511</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.042548</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.925011</td>\n",
       "      <td>-0.173468</td>\n",
       "      <td>-0.180521</td>\n",
       "      <td>0.139707</td>\n",
       "      <td>-0.882283</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO006</th>\n",
       "      <td>0.188931</td>\n",
       "      <td>0.544620</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.571640</td>\n",
       "      <td>-0.209734</td>\n",
       "      <td>0.799090</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.338493</td>\n",
       "      <td>-0.042567</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.411664</td>\n",
       "      <td>-0.454109</td>\n",
       "      <td>-0.725892</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.707588</td>\n",
       "      <td>-0.846624</td>\n",
       "      <td>0.329813</td>\n",
       "      <td>-0.311147</td>\n",
       "      <td>-0.446358</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO008</th>\n",
       "      <td>0.404810</td>\n",
       "      <td>-0.246523</td>\n",
       "      <td>-0.053940</td>\n",
       "      <td>0.252995</td>\n",
       "      <td>0.190861</td>\n",
       "      <td>0.101419</td>\n",
       "      <td>-0.502876</td>\n",
       "      <td>0.627060</td>\n",
       "      <td>0.089815</td>\n",
       "      <td>-0.106411</td>\n",
       "      <td>...</td>\n",
       "      <td>0.192279</td>\n",
       "      <td>-0.558236</td>\n",
       "      <td>-0.093708</td>\n",
       "      <td>-1.874293</td>\n",
       "      <td>-0.248307</td>\n",
       "      <td>-0.899186</td>\n",
       "      <td>-0.526260</td>\n",
       "      <td>0.668713</td>\n",
       "      <td>0.109366</td>\n",
       "      <td>-1.125296</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO013</th>\n",
       "      <td>-0.276982</td>\n",
       "      <td>-0.017659</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.455055</td>\n",
       "      <td>0.500686</td>\n",
       "      <td>-0.350366</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.263168</td>\n",
       "      <td>0.683830</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>0.220231</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.241860</td>\n",
       "      <td>-3.939263</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.514931</td>\n",
       "      <td>-0.078267</td>\n",
       "      <td>0.122032</td>\n",
       "      <td>0.130764</td>\n",
       "      <td>-1.146911</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO014</th>\n",
       "      <td>-0.160155</td>\n",
       "      <td>0.100022</td>\n",
       "      <td>0.259696</td>\n",
       "      <td>0.341345</td>\n",
       "      <td>-0.310265</td>\n",
       "      <td>0.095461</td>\n",
       "      <td>-0.745855</td>\n",
       "      <td>1.006614</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.198671</td>\n",
       "      <td>0.226146</td>\n",
       "      <td>0.036229</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.189468</td>\n",
       "      <td>0.117736</td>\n",
       "      <td>0.586529</td>\n",
       "      <td>-0.006767</td>\n",
       "      <td>-1.106068</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 9457 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name        ARF5_ENSP00000000233.5  M6PR_ENSP00000000412.3  \\\n",
       "Patient_ID                                                   \n",
       "01CO005                  -0.203037               -0.223341   \n",
       "01CO006                   0.188931                0.544620   \n",
       "01CO008                   0.404810               -0.246523   \n",
       "01CO013                  -0.276982               -0.017659   \n",
       "01CO014                  -0.160155                0.100022   \n",
       "\n",
       "Name        ESRRA_ENSP00000000442.6  FKBP4_ENSP00000001008.4  \\\n",
       "Patient_ID                                                     \n",
       "01CO005                   -0.283633                -0.612614   \n",
       "01CO006                         NaN                -0.571640   \n",
       "01CO008                   -0.053940                 0.252995   \n",
       "01CO013                         NaN                -0.455055   \n",
       "01CO014                    0.259696                 0.341345   \n",
       "\n",
       "Name        NDUFAF7_ENSP00000002125.4  FUCA2_ENSP00000002165.5  \\\n",
       "Patient_ID                                                       \n",
       "01CO005                      0.514855                -0.824026   \n",
       "01CO006                     -0.209734                 0.799090   \n",
       "01CO008                      0.190861                 0.101419   \n",
       "01CO013                      0.500686                -0.350366   \n",
       "01CO014                     -0.310265                 0.095461   \n",
       "\n",
       "Name        CFTR_ENSP00000003084.6  CYP51A1_ENSP00000003100.8  \\\n",
       "Patient_ID                                                      \n",
       "01CO005                        NaN                   0.045383   \n",
       "01CO006                        NaN                  -0.338493   \n",
       "01CO008                  -0.502876                   0.627060   \n",
       "01CO013                        NaN                   0.263168   \n",
       "01CO014                  -0.745855                   1.006614   \n",
       "\n",
       "Name        USP28_ENSP00000003302.4  TMEM176A_ENSP00000004103.3  ...  \\\n",
       "Patient_ID                                                       ...   \n",
       "01CO005                         NaN                   -0.248511  ...   \n",
       "01CO006                   -0.042567                         NaN  ...   \n",
       "01CO008                    0.089815                   -0.106411  ...   \n",
       "01CO013                    0.683830                         NaN  ...   \n",
       "01CO014                         NaN                         NaN  ...   \n",
       "\n",
       "Name        TMUB1_ENSP00000499339.1  CSNK1A1_ENSP00000499757.1  \\\n",
       "Patient_ID                                                       \n",
       "01CO005                         NaN                        NaN   \n",
       "01CO006                   -0.411664                  -0.454109   \n",
       "01CO008                    0.192279                  -0.558236   \n",
       "01CO013                    0.220231                        NaN   \n",
       "01CO014                   -0.198671                   0.226146   \n",
       "\n",
       "Name        MICAL2_ENSP00000499778.1  ANK2_ENSP00000499869.1  \\\n",
       "Patient_ID                                                     \n",
       "01CO005                    -0.042548                     NaN   \n",
       "01CO006                    -0.725892                     NaN   \n",
       "01CO008                    -0.093708               -1.874293   \n",
       "01CO013                     0.241860               -3.939263   \n",
       "01CO014                     0.036229                     NaN   \n",
       "\n",
       "Name        SEPTIN7_ENSP00000499937.1  ATAD3B_ENSP00000500094.1  \\\n",
       "Patient_ID                                                        \n",
       "01CO005                           NaN                  0.925011   \n",
       "01CO006                           NaN                 -0.707588   \n",
       "01CO008                     -0.248307                 -0.899186   \n",
       "01CO013                           NaN                  0.514931   \n",
       "01CO014                           NaN                  1.189468   \n",
       "\n",
       "Name        ETNK1_ENSP00000500633.1  MYO6_ENSP00000500710.1  \\\n",
       "Patient_ID                                                    \n",
       "01CO005                   -0.173468               -0.180521   \n",
       "01CO006                   -0.846624                0.329813   \n",
       "01CO008                   -0.526260                0.668713   \n",
       "01CO013                   -0.078267                0.122032   \n",
       "01CO014                    0.117736                0.586529   \n",
       "\n",
       "Name        WIZ_ENSP00000501300.1  HSPA12A_ENSP00000501491.1  \n",
       "Patient_ID                                                    \n",
       "01CO005                  0.139707                  -0.882283  \n",
       "01CO006                 -0.311147                  -0.446358  \n",
       "01CO008                  0.109366                  -1.125296  \n",
       "01CO013                  0.130764                  -1.146911  \n",
       "01CO014                 -0.006767                  -1.106068  \n",
       "\n",
       "[5 rows x 9457 columns]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ut.reduce_multiindex(df=prot, flatten=True).head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When flatteing levels , NaNs and empty strings will automitically be dropped."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>Name</th>\n",
       "      <th>A1BG_washu_CNV</th>\n",
       "      <th>A1CF_washu_CNV</th>\n",
       "      <th>A2M_washu_CNV</th>\n",
       "      <th>A2ML1_washu_CNV</th>\n",
       "      <th>A3GALT2_washu_CNV</th>\n",
       "      <th>A4GALT_washu_CNV</th>\n",
       "      <th>A4GNT_washu_CNV</th>\n",
       "      <th>AAAS_washu_CNV</th>\n",
       "      <th>AACS_washu_CNV</th>\n",
       "      <th>AADAC_washu_CNV</th>\n",
       "      <th>...</th>\n",
       "      <th colspan=\"2\" halign=\"left\">SCRIB_umich_phosphoproteomics</th>\n",
       "      <th colspan=\"6\" halign=\"left\">TSGA10_umich_phosphoproteomics</th>\n",
       "      <th colspan=\"2\" halign=\"left\">SVIL_umich_phosphoproteomics</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Site</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>...</th>\n",
       "      <th>S1575T1588S1594</th>\n",
       "      <th>S1594</th>\n",
       "      <th>S11</th>\n",
       "      <th>S173</th>\n",
       "      <th>S213</th>\n",
       "      <th>S391</th>\n",
       "      <th>S779</th>\n",
       "      <th>S101</th>\n",
       "      <th>S296</th>\n",
       "      <th>S459</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Peptide</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>...</th>\n",
       "      <th>LAEAPSPAPTPsPTPVEDLGPQTStSPGRLsPDFAEELR</th>\n",
       "      <th>LsPDFAEELR</th>\n",
       "      <th>sPGRDPELQVEAAEVTTK</th>\n",
       "      <th>sPSRLDSFVK</th>\n",
       "      <th>RPsPTAR</th>\n",
       "      <th>AMDTEsELGR</th>\n",
       "      <th>GLDRsLEENLCYR;GLDRsLEENLCYRDF</th>\n",
       "      <th>EVVSSQVDDLTsHNEHLCK</th>\n",
       "      <th>DSEGDTPsLINWPSSK</th>\n",
       "      <th>LPsPTVAR</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "      <td>-0.00659</td>\n",
       "      <td>-0.01982</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01418</td>\n",
       "      <td>-0.00839</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.667426</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.905606</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.069911</td>\n",
       "      <td>-0.584774</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.561657</td>\n",
       "      <td>-0.652457</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "      <td>0.02578</td>\n",
       "      <td>0.00726</td>\n",
       "      <td>0.01350</td>\n",
       "      <td>0.01350</td>\n",
       "      <td>0.00732</td>\n",
       "      <td>0.01642</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>0.01225</td>\n",
       "      <td>0.01225</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.488427</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.431599</td>\n",
       "      <td>-1.079638</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "      <td>0.01262</td>\n",
       "      <td>0.00425</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>0.00166</td>\n",
       "      <td>0.00549</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.104862</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.439041</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00084</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.399718</td>\n",
       "      <td>-0.875016</td>\n",
       "      <td>-0.579824</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.505807</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.521725</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "      <td>0.00100</td>\n",
       "      <td>0.41191</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02436</td>\n",
       "      <td>-0.01198</td>\n",
       "      <td>-0.03307</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.03307</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.069439</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.510268</td>\n",
       "      <td>-1.889144</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.592203</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.126482</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 104580 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name       A1BG_washu_CNV A1CF_washu_CNV A2M_washu_CNV A2ML1_washu_CNV  \\\n",
       "Site                                                                     \n",
       "Peptide                                                                  \n",
       "Patient_ID                                                               \n",
       "C3L-00006        -0.00659       -0.01982      -0.01402        -0.01402   \n",
       "C3L-00008         0.02578        0.00726       0.01350         0.01350   \n",
       "C3L-00032         0.01262        0.00425      -0.00275        -0.00275   \n",
       "C3L-00084             NaN            NaN           NaN             NaN   \n",
       "C3L-00090         0.00100        0.41191      -0.02299        -0.02299   \n",
       "\n",
       "Name       A3GALT2_washu_CNV A4GALT_washu_CNV A4GNT_washu_CNV AAAS_washu_CNV  \\\n",
       "Site                                                                           \n",
       "Peptide                                                                        \n",
       "Patient_ID                                                                     \n",
       "C3L-00006           -0.01418         -0.00839        -0.01305       -0.01402   \n",
       "C3L-00008            0.00732          0.01642         0.01005        0.01225   \n",
       "C3L-00032            0.00166          0.00549        -0.00038       -0.00275   \n",
       "C3L-00084                NaN              NaN             NaN            NaN   \n",
       "C3L-00090           -0.02436         -0.01198        -0.03307       -0.02299   \n",
       "\n",
       "Name       AACS_washu_CNV AADAC_washu_CNV  ...  \\\n",
       "Site                                       ...   \n",
       "Peptide                                    ...   \n",
       "Patient_ID                                 ...   \n",
       "C3L-00006        -0.01402        -0.01305  ...   \n",
       "C3L-00008         0.01225         0.01005  ...   \n",
       "C3L-00032        -0.00275        -0.00038  ...   \n",
       "C3L-00084             NaN             NaN  ...   \n",
       "C3L-00090        -0.02299        -0.03307  ...   \n",
       "\n",
       "Name                 SCRIB_umich_phosphoproteomics             \\\n",
       "Site                               S1575T1588S1594      S1594   \n",
       "Peptide    LAEAPSPAPTPsPTPVEDLGPQTStSPGRLsPDFAEELR LsPDFAEELR   \n",
       "Patient_ID                                                      \n",
       "C3L-00006                                      NaN   0.667426   \n",
       "C3L-00008                                      NaN        NaN   \n",
       "C3L-00032                                      NaN   0.104862   \n",
       "C3L-00084                                      NaN   0.399718   \n",
       "C3L-00090                                      NaN   1.069439   \n",
       "\n",
       "Name       TSGA10_umich_phosphoproteomics                                  \\\n",
       "Site                                  S11       S173      S213       S391   \n",
       "Peptide                sPGRDPELQVEAAEVTTK sPSRLDSFVK   RPsPTAR AMDTEsELGR   \n",
       "Patient_ID                                                                  \n",
       "C3L-00006                             NaN   0.905606       NaN  -0.069911   \n",
       "C3L-00008                             NaN  -0.488427       NaN        NaN   \n",
       "C3L-00032                             NaN        NaN       NaN        NaN   \n",
       "C3L-00084                       -0.875016  -0.579824       NaN        NaN   \n",
       "C3L-00090                             NaN   0.510268 -1.889144        NaN   \n",
       "\n",
       "Name                                                          \\\n",
       "Site                                S779                S101   \n",
       "Peptide    GLDRsLEENLCYR;GLDRsLEENLCYRDF EVVSSQVDDLTsHNEHLCK   \n",
       "Patient_ID                                                     \n",
       "C3L-00006                      -0.584774                 NaN   \n",
       "C3L-00008                            NaN                 NaN   \n",
       "C3L-00032                            NaN                 NaN   \n",
       "C3L-00084                      -0.505807                 NaN   \n",
       "C3L-00090                      -0.592203                 NaN   \n",
       "\n",
       "Name       SVIL_umich_phosphoproteomics            \n",
       "Site                               S296      S459  \n",
       "Peptide                DSEGDTPsLINWPSSK  LPsPTVAR  \n",
       "Patient_ID                                         \n",
       "C3L-00006                     -0.561657 -0.652457  \n",
       "C3L-00008                     -0.431599 -1.079638  \n",
       "C3L-00032                           NaN -1.439041  \n",
       "C3L-00084                           NaN -1.521725  \n",
       "C3L-00090                           NaN -1.126482  \n",
       "\n",
       "[5 rows x 104580 columns]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "phospho_and_CNV = en.join_omics_to_omics(\n",
    "    df1_name=\"CNV\",\n",
    "    df2_name=\"phosphoproteomics\",\n",
    "    df1_source=\"washu\",\n",
    "    df2_source=\"umich\",\n",
    ")\n",
    "phospho_and_CNV.head()\n",
    "\n",
    "# Note that the CNV columns have empty strings in the \"Site\" and \"Peptide\" levels\n",
    "# of the column index, since CNV data has no values for those levels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Name</th>\n",
       "      <th>A1BG_washu_CNV</th>\n",
       "      <th>A1CF_washu_CNV</th>\n",
       "      <th>A2M_washu_CNV</th>\n",
       "      <th>A2ML1_washu_CNV</th>\n",
       "      <th>A3GALT2_washu_CNV</th>\n",
       "      <th>A4GALT_washu_CNV</th>\n",
       "      <th>A4GNT_washu_CNV</th>\n",
       "      <th>AAAS_washu_CNV</th>\n",
       "      <th>AACS_washu_CNV</th>\n",
       "      <th>AADAC_washu_CNV</th>\n",
       "      <th>...</th>\n",
       "      <th>SCRIB_umich_phosphoproteomics_S1575T1588S1594_LAEAPSPAPTPsPTPVEDLGPQTStSPGRLsPDFAEELR</th>\n",
       "      <th>SCRIB_umich_phosphoproteomics_S1594_LsPDFAEELR</th>\n",
       "      <th>TSGA10_umich_phosphoproteomics_S11_sPGRDPELQVEAAEVTTK</th>\n",
       "      <th>TSGA10_umich_phosphoproteomics_S173_sPSRLDSFVK</th>\n",
       "      <th>TSGA10_umich_phosphoproteomics_S213_RPsPTAR</th>\n",
       "      <th>TSGA10_umich_phosphoproteomics_S391_AMDTEsELGR</th>\n",
       "      <th>TSGA10_umich_phosphoproteomics_S779_GLDRsLEENLCYR;GLDRsLEENLCYRDF</th>\n",
       "      <th>TSGA10_umich_phosphoproteomics_S101_EVVSSQVDDLTsHNEHLCK</th>\n",
       "      <th>SVIL_umich_phosphoproteomics_S296_DSEGDTPsLINWPSSK</th>\n",
       "      <th>SVIL_umich_phosphoproteomics_S459_LPsPTVAR</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>C3L-00006</th>\n",
       "      <td>-0.00659</td>\n",
       "      <td>-0.01982</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01418</td>\n",
       "      <td>-0.00839</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01402</td>\n",
       "      <td>-0.01305</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.667426</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.905606</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.069911</td>\n",
       "      <td>-0.584774</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.561657</td>\n",
       "      <td>-0.652457</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00008</th>\n",
       "      <td>0.02578</td>\n",
       "      <td>0.00726</td>\n",
       "      <td>0.01350</td>\n",
       "      <td>0.01350</td>\n",
       "      <td>0.00732</td>\n",
       "      <td>0.01642</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>0.01225</td>\n",
       "      <td>0.01225</td>\n",
       "      <td>0.01005</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.488427</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.431599</td>\n",
       "      <td>-1.079638</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00032</th>\n",
       "      <td>0.01262</td>\n",
       "      <td>0.00425</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>0.00166</td>\n",
       "      <td>0.00549</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00275</td>\n",
       "      <td>-0.00038</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.104862</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.439041</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00084</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.399718</td>\n",
       "      <td>-0.875016</td>\n",
       "      <td>-0.579824</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.505807</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.521725</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>C3L-00090</th>\n",
       "      <td>0.00100</td>\n",
       "      <td>0.41191</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02436</td>\n",
       "      <td>-0.01198</td>\n",
       "      <td>-0.03307</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.02299</td>\n",
       "      <td>-0.03307</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.069439</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.510268</td>\n",
       "      <td>-1.889144</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.592203</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.126482</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 104580 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name        A1BG_washu_CNV  A1CF_washu_CNV  A2M_washu_CNV  A2ML1_washu_CNV  \\\n",
       "Patient_ID                                                                   \n",
       "C3L-00006         -0.00659        -0.01982       -0.01402         -0.01402   \n",
       "C3L-00008          0.02578         0.00726        0.01350          0.01350   \n",
       "C3L-00032          0.01262         0.00425       -0.00275         -0.00275   \n",
       "C3L-00084              NaN             NaN            NaN              NaN   \n",
       "C3L-00090          0.00100         0.41191       -0.02299         -0.02299   \n",
       "\n",
       "Name        A3GALT2_washu_CNV  A4GALT_washu_CNV  A4GNT_washu_CNV  \\\n",
       "Patient_ID                                                         \n",
       "C3L-00006            -0.01418          -0.00839         -0.01305   \n",
       "C3L-00008             0.00732           0.01642          0.01005   \n",
       "C3L-00032             0.00166           0.00549         -0.00038   \n",
       "C3L-00084                 NaN               NaN              NaN   \n",
       "C3L-00090            -0.02436          -0.01198         -0.03307   \n",
       "\n",
       "Name        AAAS_washu_CNV  AACS_washu_CNV  AADAC_washu_CNV  ...  \\\n",
       "Patient_ID                                                   ...   \n",
       "C3L-00006         -0.01402        -0.01402         -0.01305  ...   \n",
       "C3L-00008          0.01225         0.01225          0.01005  ...   \n",
       "C3L-00032         -0.00275        -0.00275         -0.00038  ...   \n",
       "C3L-00084              NaN             NaN              NaN  ...   \n",
       "C3L-00090         -0.02299        -0.02299         -0.03307  ...   \n",
       "\n",
       "Name        SCRIB_umich_phosphoproteomics_S1575T1588S1594_LAEAPSPAPTPsPTPVEDLGPQTStSPGRLsPDFAEELR  \\\n",
       "Patient_ID                                                                                          \n",
       "C3L-00006                                                 NaN                                       \n",
       "C3L-00008                                                 NaN                                       \n",
       "C3L-00032                                                 NaN                                       \n",
       "C3L-00084                                                 NaN                                       \n",
       "C3L-00090                                                 NaN                                       \n",
       "\n",
       "Name        SCRIB_umich_phosphoproteomics_S1594_LsPDFAEELR  \\\n",
       "Patient_ID                                                   \n",
       "C3L-00006                                         0.667426   \n",
       "C3L-00008                                              NaN   \n",
       "C3L-00032                                         0.104862   \n",
       "C3L-00084                                         0.399718   \n",
       "C3L-00090                                         1.069439   \n",
       "\n",
       "Name        TSGA10_umich_phosphoproteomics_S11_sPGRDPELQVEAAEVTTK  \\\n",
       "Patient_ID                                                          \n",
       "C3L-00006                                                 NaN       \n",
       "C3L-00008                                                 NaN       \n",
       "C3L-00032                                                 NaN       \n",
       "C3L-00084                                           -0.875016       \n",
       "C3L-00090                                                 NaN       \n",
       "\n",
       "Name        TSGA10_umich_phosphoproteomics_S173_sPSRLDSFVK  \\\n",
       "Patient_ID                                                   \n",
       "C3L-00006                                         0.905606   \n",
       "C3L-00008                                        -0.488427   \n",
       "C3L-00032                                              NaN   \n",
       "C3L-00084                                        -0.579824   \n",
       "C3L-00090                                         0.510268   \n",
       "\n",
       "Name        TSGA10_umich_phosphoproteomics_S213_RPsPTAR  \\\n",
       "Patient_ID                                                \n",
       "C3L-00006                                           NaN   \n",
       "C3L-00008                                           NaN   \n",
       "C3L-00032                                           NaN   \n",
       "C3L-00084                                           NaN   \n",
       "C3L-00090                                     -1.889144   \n",
       "\n",
       "Name        TSGA10_umich_phosphoproteomics_S391_AMDTEsELGR  \\\n",
       "Patient_ID                                                   \n",
       "C3L-00006                                        -0.069911   \n",
       "C3L-00008                                              NaN   \n",
       "C3L-00032                                              NaN   \n",
       "C3L-00084                                              NaN   \n",
       "C3L-00090                                              NaN   \n",
       "\n",
       "Name        TSGA10_umich_phosphoproteomics_S779_GLDRsLEENLCYR;GLDRsLEENLCYRDF  \\\n",
       "Patient_ID                                                                      \n",
       "C3L-00006                                           -0.584774                   \n",
       "C3L-00008                                                 NaN                   \n",
       "C3L-00032                                                 NaN                   \n",
       "C3L-00084                                           -0.505807                   \n",
       "C3L-00090                                           -0.592203                   \n",
       "\n",
       "Name        TSGA10_umich_phosphoproteomics_S101_EVVSSQVDDLTsHNEHLCK  \\\n",
       "Patient_ID                                                            \n",
       "C3L-00006                                                 NaN         \n",
       "C3L-00008                                                 NaN         \n",
       "C3L-00032                                                 NaN         \n",
       "C3L-00084                                                 NaN         \n",
       "C3L-00090                                                 NaN         \n",
       "\n",
       "Name        SVIL_umich_phosphoproteomics_S296_DSEGDTPsLINWPSSK  \\\n",
       "Patient_ID                                                       \n",
       "C3L-00006                                           -0.561657    \n",
       "C3L-00008                                           -0.431599    \n",
       "C3L-00032                                                 NaN    \n",
       "C3L-00084                                                 NaN    \n",
       "C3L-00090                                                 NaN    \n",
       "\n",
       "Name        SVIL_umich_phosphoproteomics_S459_LPsPTVAR  \n",
       "Patient_ID                                              \n",
       "C3L-00006                                    -0.652457  \n",
       "C3L-00008                                    -1.079638  \n",
       "C3L-00032                                    -1.439041  \n",
       "C3L-00084                                    -1.521725  \n",
       "C3L-00090                                    -1.126482  \n",
       "\n",
       "[5 rows x 104580 columns]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
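   ],
   "source": [
    "ut.reduce_multiindex(df=phospho_and_CNV, flatten=True).head()\n",
    "# Notice that the empty \"Site\" and \"Peptide\" strings have been dropped from the\n",
    "# flattened CNV column names, so those names carry no trailing underscores."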
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting a single-level index of tuples\n",
    "\n",
    "You can also use `reduce_multiindex` to turn the multi-level column index into a single-level index of tuples. Each element of a column's tuple is that column's key for the corresponding level of the original index:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>(ARF5, ENSP00000000233.5)</th>\n",
       "      <th>(M6PR, ENSP00000000412.3)</th>\n",
       "      <th>(ESRRA, ENSP00000000442.6)</th>\n",
       "      <th>(FKBP4, ENSP00000001008.4)</th>\n",
       "      <th>(NDUFAF7, ENSP00000002125.4)</th>\n",
       "      <th>(FUCA2, ENSP00000002165.5)</th>\n",
       "      <th>(CFTR, ENSP00000003084.6)</th>\n",
       "      <th>(CYP51A1, ENSP00000003100.8)</th>\n",
       "      <th>(USP28, ENSP00000003302.4)</th>\n",
       "      <th>(TMEM176A, ENSP00000004103.3)</th>\n",
       "      <th>...</th>\n",
       "      <th>(TMUB1, ENSP00000499339.1)</th>\n",
       "      <th>(CSNK1A1, ENSP00000499757.1)</th>\n",
       "      <th>(MICAL2, ENSP00000499778.1)</th>\n",
       "      <th>(ANK2, ENSP00000499869.1)</th>\n",
       "      <th>(SEPTIN7, ENSP00000499937.1)</th>\n",
       "      <th>(ATAD3B, ENSP00000500094.1)</th>\n",
       "      <th>(ETNK1, ENSP00000500633.1)</th>\n",
       "      <th>(MYO6, ENSP00000500710.1)</th>\n",
       "      <th>(WIZ, ENSP00000501300.1)</th>\n",
       "      <th>(HSPA12A, ENSP00000501491.1)</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>01CO005</th>\n",
       "      <td>-0.203037</td>\n",
       "      <td>-0.223341</td>\n",
       "      <td>-0.283633</td>\n",
       "      <td>-0.612614</td>\n",
       "      <td>0.514855</td>\n",
       "      <td>-0.824026</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.045383</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.248511</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.042548</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.925011</td>\n",
       "      <td>-0.173468</td>\n",
       "      <td>-0.180521</td>\n",
       "      <td>0.139707</td>\n",
       "      <td>-0.882283</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO006</th>\n",
       "      <td>0.188931</td>\n",
       "      <td>0.544620</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.571640</td>\n",
       "      <td>-0.209734</td>\n",
       "      <td>0.799090</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.338493</td>\n",
       "      <td>-0.042567</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.411664</td>\n",
       "      <td>-0.454109</td>\n",
       "      <td>-0.725892</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.707588</td>\n",
       "      <td>-0.846624</td>\n",
       "      <td>0.329813</td>\n",
       "      <td>-0.311147</td>\n",
       "      <td>-0.446358</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO008</th>\n",
       "      <td>0.404810</td>\n",
       "      <td>-0.246523</td>\n",
       "      <td>-0.053940</td>\n",
       "      <td>0.252995</td>\n",
       "      <td>0.190861</td>\n",
       "      <td>0.101419</td>\n",
       "      <td>-0.502876</td>\n",
       "      <td>0.627060</td>\n",
       "      <td>0.089815</td>\n",
       "      <td>-0.106411</td>\n",
       "      <td>...</td>\n",
       "      <td>0.192279</td>\n",
       "      <td>-0.558236</td>\n",
       "      <td>-0.093708</td>\n",
       "      <td>-1.874293</td>\n",
       "      <td>-0.248307</td>\n",
       "      <td>-0.899186</td>\n",
       "      <td>-0.526260</td>\n",
       "      <td>0.668713</td>\n",
       "      <td>0.109366</td>\n",
       "      <td>-1.125296</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO013</th>\n",
       "      <td>-0.276982</td>\n",
       "      <td>-0.017659</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.455055</td>\n",
       "      <td>0.500686</td>\n",
       "      <td>-0.350366</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.263168</td>\n",
       "      <td>0.683830</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>0.220231</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.241860</td>\n",
       "      <td>-3.939263</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.514931</td>\n",
       "      <td>-0.078267</td>\n",
       "      <td>0.122032</td>\n",
       "      <td>0.130764</td>\n",
       "      <td>-1.146911</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO014</th>\n",
       "      <td>-0.160155</td>\n",
       "      <td>0.100022</td>\n",
       "      <td>0.259696</td>\n",
       "      <td>0.341345</td>\n",
       "      <td>-0.310265</td>\n",
       "      <td>0.095461</td>\n",
       "      <td>-0.745855</td>\n",
       "      <td>1.006614</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.198671</td>\n",
       "      <td>0.226146</td>\n",
       "      <td>0.036229</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.189468</td>\n",
       "      <td>0.117736</td>\n",
       "      <td>0.586529</td>\n",
       "      <td>-0.006767</td>\n",
       "      <td>-1.106068</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 9457 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            (ARF5, ENSP00000000233.5)  (M6PR, ENSP00000000412.3)  \\\n",
       "Patient_ID                                                         \n",
       "01CO005                     -0.203037                  -0.223341   \n",
       "01CO006                      0.188931                   0.544620   \n",
       "01CO008                      0.404810                  -0.246523   \n",
       "01CO013                     -0.276982                  -0.017659   \n",
       "01CO014                     -0.160155                   0.100022   \n",
       "\n",
       "            (ESRRA, ENSP00000000442.6)  (FKBP4, ENSP00000001008.4)  \\\n",
       "Patient_ID                                                           \n",
       "01CO005                      -0.283633                   -0.612614   \n",
       "01CO006                            NaN                   -0.571640   \n",
       "01CO008                      -0.053940                    0.252995   \n",
       "01CO013                            NaN                   -0.455055   \n",
       "01CO014                       0.259696                    0.341345   \n",
       "\n",
       "            (NDUFAF7, ENSP00000002125.4)  (FUCA2, ENSP00000002165.5)  \\\n",
       "Patient_ID                                                             \n",
       "01CO005                         0.514855                   -0.824026   \n",
       "01CO006                        -0.209734                    0.799090   \n",
       "01CO008                         0.190861                    0.101419   \n",
       "01CO013                         0.500686                   -0.350366   \n",
       "01CO014                        -0.310265                    0.095461   \n",
       "\n",
       "            (CFTR, ENSP00000003084.6)  (CYP51A1, ENSP00000003100.8)  \\\n",
       "Patient_ID                                                            \n",
       "01CO005                           NaN                      0.045383   \n",
       "01CO006                           NaN                     -0.338493   \n",
       "01CO008                     -0.502876                      0.627060   \n",
       "01CO013                           NaN                      0.263168   \n",
       "01CO014                     -0.745855                      1.006614   \n",
       "\n",
       "            (USP28, ENSP00000003302.4)  (TMEM176A, ENSP00000004103.3)  ...  \\\n",
       "Patient_ID                                                             ...   \n",
       "01CO005                            NaN                      -0.248511  ...   \n",
       "01CO006                      -0.042567                            NaN  ...   \n",
       "01CO008                       0.089815                      -0.106411  ...   \n",
       "01CO013                       0.683830                            NaN  ...   \n",
       "01CO014                            NaN                            NaN  ...   \n",
       "\n",
       "            (TMUB1, ENSP00000499339.1)  (CSNK1A1, ENSP00000499757.1)  \\\n",
       "Patient_ID                                                             \n",
       "01CO005                            NaN                           NaN   \n",
       "01CO006                      -0.411664                     -0.454109   \n",
       "01CO008                       0.192279                     -0.558236   \n",
       "01CO013                       0.220231                           NaN   \n",
       "01CO014                      -0.198671                      0.226146   \n",
       "\n",
       "            (MICAL2, ENSP00000499778.1)  (ANK2, ENSP00000499869.1)  \\\n",
       "Patient_ID                                                           \n",
       "01CO005                       -0.042548                        NaN   \n",
       "01CO006                       -0.725892                        NaN   \n",
       "01CO008                       -0.093708                  -1.874293   \n",
       "01CO013                        0.241860                  -3.939263   \n",
       "01CO014                        0.036229                        NaN   \n",
       "\n",
       "            (SEPTIN7, ENSP00000499937.1)  (ATAD3B, ENSP00000500094.1)  \\\n",
       "Patient_ID                                                              \n",
       "01CO005                              NaN                     0.925011   \n",
       "01CO006                              NaN                    -0.707588   \n",
       "01CO008                        -0.248307                    -0.899186   \n",
       "01CO013                              NaN                     0.514931   \n",
       "01CO014                              NaN                     1.189468   \n",
       "\n",
       "            (ETNK1, ENSP00000500633.1)  (MYO6, ENSP00000500710.1)  \\\n",
       "Patient_ID                                                          \n",
       "01CO005                      -0.173468                  -0.180521   \n",
       "01CO006                      -0.846624                   0.329813   \n",
       "01CO008                      -0.526260                   0.668713   \n",
       "01CO013                      -0.078267                   0.122032   \n",
       "01CO014                       0.117736                   0.586529   \n",
       "\n",
       "            (WIZ, ENSP00000501300.1)  (HSPA12A, ENSP00000501491.1)  \n",
       "Patient_ID                                                          \n",
       "01CO005                     0.139707                     -0.882283  \n",
       "01CO006                    -0.311147                     -0.446358  \n",
       "01CO008                     0.109366                     -1.125296  \n",
       "01CO013                     0.130764                     -1.146911  \n",
       "01CO014                    -0.006767                     -1.106068  \n",
       "\n",
       "[5 rows x 9457 columns]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ut.reduce_multiindex(df=prot, tuples=True).head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Turning off warnings\n",
    "\n",
    "If a call to `reduce_multiindex` creates duplicate column headers or has no effect, the function issues a warning. To silence these warnings, pass `quiet=True`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Database_ID</th>\n",
       "      <th>ENSP00000000233.5</th>\n",
       "      <th>ENSP00000000412.3</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000001008.4</th>\n",
       "      <th>ENSP00000002125.4</th>\n",
       "      <th>ENSP00000002165.5</th>\n",
       "      <th>ENSP00000003084.6</th>\n",
       "      <th>ENSP00000003100.8</th>\n",
       "      <th>ENSP00000003302.4</th>\n",
       "      <th>ENSP00000004103.3</th>\n",
       "      <th>...</th>\n",
       "      <th>ENSP00000499339.1</th>\n",
       "      <th>ENSP00000499757.1</th>\n",
       "      <th>ENSP00000499778.1</th>\n",
       "      <th>ENSP00000499869.1</th>\n",
       "      <th>ENSP00000499937.1</th>\n",
       "      <th>ENSP00000500094.1</th>\n",
       "      <th>ENSP00000500633.1</th>\n",
       "      <th>ENSP00000500710.1</th>\n",
       "      <th>ENSP00000501300.1</th>\n",
       "      <th>ENSP00000501491.1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>01CO005</th>\n",
       "      <td>-0.203037</td>\n",
       "      <td>-0.223341</td>\n",
       "      <td>-0.283633</td>\n",
       "      <td>-0.612614</td>\n",
       "      <td>0.514855</td>\n",
       "      <td>-0.824026</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.045383</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.248511</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.042548</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.925011</td>\n",
       "      <td>-0.173468</td>\n",
       "      <td>-0.180521</td>\n",
       "      <td>0.139707</td>\n",
       "      <td>-0.882283</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO006</th>\n",
       "      <td>0.188931</td>\n",
       "      <td>0.544620</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.571640</td>\n",
       "      <td>-0.209734</td>\n",
       "      <td>0.799090</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.338493</td>\n",
       "      <td>-0.042567</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.411664</td>\n",
       "      <td>-0.454109</td>\n",
       "      <td>-0.725892</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.707588</td>\n",
       "      <td>-0.846624</td>\n",
       "      <td>0.329813</td>\n",
       "      <td>-0.311147</td>\n",
       "      <td>-0.446358</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO008</th>\n",
       "      <td>0.404810</td>\n",
       "      <td>-0.246523</td>\n",
       "      <td>-0.053940</td>\n",
       "      <td>0.252995</td>\n",
       "      <td>0.190861</td>\n",
       "      <td>0.101419</td>\n",
       "      <td>-0.502876</td>\n",
       "      <td>0.627060</td>\n",
       "      <td>0.089815</td>\n",
       "      <td>-0.106411</td>\n",
       "      <td>...</td>\n",
       "      <td>0.192279</td>\n",
       "      <td>-0.558236</td>\n",
       "      <td>-0.093708</td>\n",
       "      <td>-1.874293</td>\n",
       "      <td>-0.248307</td>\n",
       "      <td>-0.899186</td>\n",
       "      <td>-0.526260</td>\n",
       "      <td>0.668713</td>\n",
       "      <td>0.109366</td>\n",
       "      <td>-1.125296</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO013</th>\n",
       "      <td>-0.276982</td>\n",
       "      <td>-0.017659</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.455055</td>\n",
       "      <td>0.500686</td>\n",
       "      <td>-0.350366</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.263168</td>\n",
       "      <td>0.683830</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>0.220231</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.241860</td>\n",
       "      <td>-3.939263</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.514931</td>\n",
       "      <td>-0.078267</td>\n",
       "      <td>0.122032</td>\n",
       "      <td>0.130764</td>\n",
       "      <td>-1.146911</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO014</th>\n",
       "      <td>-0.160155</td>\n",
       "      <td>0.100022</td>\n",
       "      <td>0.259696</td>\n",
       "      <td>0.341345</td>\n",
       "      <td>-0.310265</td>\n",
       "      <td>0.095461</td>\n",
       "      <td>-0.745855</td>\n",
       "      <td>1.006614</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.198671</td>\n",
       "      <td>0.226146</td>\n",
       "      <td>0.036229</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.189468</td>\n",
       "      <td>0.117736</td>\n",
       "      <td>0.586529</td>\n",
       "      <td>-0.006767</td>\n",
       "      <td>-1.106068</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 9457 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Database_ID  ENSP00000000233.5  ENSP00000000412.3  ENSP00000000442.6  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.203037          -0.223341          -0.283633   \n",
       "01CO006               0.188931           0.544620                NaN   \n",
       "01CO008               0.404810          -0.246523          -0.053940   \n",
       "01CO013              -0.276982          -0.017659                NaN   \n",
       "01CO014              -0.160155           0.100022           0.259696   \n",
       "\n",
       "Database_ID  ENSP00000001008.4  ENSP00000002125.4  ENSP00000002165.5  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.612614           0.514855          -0.824026   \n",
       "01CO006              -0.571640          -0.209734           0.799090   \n",
       "01CO008               0.252995           0.190861           0.101419   \n",
       "01CO013              -0.455055           0.500686          -0.350366   \n",
       "01CO014               0.341345          -0.310265           0.095461   \n",
       "\n",
       "Database_ID  ENSP00000003084.6  ENSP00000003100.8  ENSP00000003302.4  \\\n",
       "Patient_ID                                                             \n",
       "01CO005                    NaN           0.045383                NaN   \n",
       "01CO006                    NaN          -0.338493          -0.042567   \n",
       "01CO008              -0.502876           0.627060           0.089815   \n",
       "01CO013                    NaN           0.263168           0.683830   \n",
       "01CO014              -0.745855           1.006614                NaN   \n",
       "\n",
       "Database_ID  ENSP00000004103.3  ...  ENSP00000499339.1  ENSP00000499757.1  \\\n",
       "Patient_ID                      ...                                         \n",
       "01CO005              -0.248511  ...                NaN                NaN   \n",
       "01CO006                    NaN  ...          -0.411664          -0.454109   \n",
       "01CO008              -0.106411  ...           0.192279          -0.558236   \n",
       "01CO013                    NaN  ...           0.220231                NaN   \n",
       "01CO014                    NaN  ...          -0.198671           0.226146   \n",
       "\n",
       "Database_ID  ENSP00000499778.1  ENSP00000499869.1  ENSP00000499937.1  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.042548                NaN                NaN   \n",
       "01CO006              -0.725892                NaN                NaN   \n",
       "01CO008              -0.093708          -1.874293          -0.248307   \n",
       "01CO013               0.241860          -3.939263                NaN   \n",
       "01CO014               0.036229                NaN                NaN   \n",
       "\n",
       "Database_ID  ENSP00000500094.1  ENSP00000500633.1  ENSP00000500710.1  \\\n",
       "Patient_ID                                                             \n",
       "01CO005               0.925011          -0.173468          -0.180521   \n",
       "01CO006              -0.707588          -0.846624           0.329813   \n",
       "01CO008              -0.899186          -0.526260           0.668713   \n",
       "01CO013               0.514931          -0.078267           0.122032   \n",
       "01CO014               1.189468           0.117736           0.586529   \n",
       "\n",
       "Database_ID  ENSP00000501300.1  ENSP00000501491.1  \n",
       "Patient_ID                                         \n",
       "01CO005               0.139707          -0.882283  \n",
       "01CO006              -0.311147          -0.446358  \n",
       "01CO008               0.109366          -1.125296  \n",
       "01CO013               0.130764          -1.146911  \n",
       "01CO014              -0.006767          -1.106068  \n",
       "\n",
       "[5 rows x 9457 columns]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ut.reduce_multiindex(df=prot, levels_to_drop=\"Name\").head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Database_ID</th>\n",
       "      <th>ENSP00000000233.5</th>\n",
       "      <th>ENSP00000000412.3</th>\n",
       "      <th>ENSP00000000442.6</th>\n",
       "      <th>ENSP00000001008.4</th>\n",
       "      <th>ENSP00000002125.4</th>\n",
       "      <th>ENSP00000002165.5</th>\n",
       "      <th>ENSP00000003084.6</th>\n",
       "      <th>ENSP00000003100.8</th>\n",
       "      <th>ENSP00000003302.4</th>\n",
       "      <th>ENSP00000004103.3</th>\n",
       "      <th>...</th>\n",
       "      <th>ENSP00000499339.1</th>\n",
       "      <th>ENSP00000499757.1</th>\n",
       "      <th>ENSP00000499778.1</th>\n",
       "      <th>ENSP00000499869.1</th>\n",
       "      <th>ENSP00000499937.1</th>\n",
       "      <th>ENSP00000500094.1</th>\n",
       "      <th>ENSP00000500633.1</th>\n",
       "      <th>ENSP00000500710.1</th>\n",
       "      <th>ENSP00000501300.1</th>\n",
       "      <th>ENSP00000501491.1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Patient_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>01CO005</th>\n",
       "      <td>-0.203037</td>\n",
       "      <td>-0.223341</td>\n",
       "      <td>-0.283633</td>\n",
       "      <td>-0.612614</td>\n",
       "      <td>0.514855</td>\n",
       "      <td>-0.824026</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.045383</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.248511</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.042548</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.925011</td>\n",
       "      <td>-0.173468</td>\n",
       "      <td>-0.180521</td>\n",
       "      <td>0.139707</td>\n",
       "      <td>-0.882283</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO006</th>\n",
       "      <td>0.188931</td>\n",
       "      <td>0.544620</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.571640</td>\n",
       "      <td>-0.209734</td>\n",
       "      <td>0.799090</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.338493</td>\n",
       "      <td>-0.042567</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.411664</td>\n",
       "      <td>-0.454109</td>\n",
       "      <td>-0.725892</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.707588</td>\n",
       "      <td>-0.846624</td>\n",
       "      <td>0.329813</td>\n",
       "      <td>-0.311147</td>\n",
       "      <td>-0.446358</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO008</th>\n",
       "      <td>0.404810</td>\n",
       "      <td>-0.246523</td>\n",
       "      <td>-0.053940</td>\n",
       "      <td>0.252995</td>\n",
       "      <td>0.190861</td>\n",
       "      <td>0.101419</td>\n",
       "      <td>-0.502876</td>\n",
       "      <td>0.627060</td>\n",
       "      <td>0.089815</td>\n",
       "      <td>-0.106411</td>\n",
       "      <td>...</td>\n",
       "      <td>0.192279</td>\n",
       "      <td>-0.558236</td>\n",
       "      <td>-0.093708</td>\n",
       "      <td>-1.874293</td>\n",
       "      <td>-0.248307</td>\n",
       "      <td>-0.899186</td>\n",
       "      <td>-0.526260</td>\n",
       "      <td>0.668713</td>\n",
       "      <td>0.109366</td>\n",
       "      <td>-1.125296</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO013</th>\n",
       "      <td>-0.276982</td>\n",
       "      <td>-0.017659</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.455055</td>\n",
       "      <td>0.500686</td>\n",
       "      <td>-0.350366</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.263168</td>\n",
       "      <td>0.683830</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>0.220231</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.241860</td>\n",
       "      <td>-3.939263</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.514931</td>\n",
       "      <td>-0.078267</td>\n",
       "      <td>0.122032</td>\n",
       "      <td>0.130764</td>\n",
       "      <td>-1.146911</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>01CO014</th>\n",
       "      <td>-0.160155</td>\n",
       "      <td>0.100022</td>\n",
       "      <td>0.259696</td>\n",
       "      <td>0.341345</td>\n",
       "      <td>-0.310265</td>\n",
       "      <td>0.095461</td>\n",
       "      <td>-0.745855</td>\n",
       "      <td>1.006614</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.198671</td>\n",
       "      <td>0.226146</td>\n",
       "      <td>0.036229</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.189468</td>\n",
       "      <td>0.117736</td>\n",
       "      <td>0.586529</td>\n",
       "      <td>-0.006767</td>\n",
       "      <td>-1.106068</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 9457 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Database_ID  ENSP00000000233.5  ENSP00000000412.3  ENSP00000000442.6  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.203037          -0.223341          -0.283633   \n",
       "01CO006               0.188931           0.544620                NaN   \n",
       "01CO008               0.404810          -0.246523          -0.053940   \n",
       "01CO013              -0.276982          -0.017659                NaN   \n",
       "01CO014              -0.160155           0.100022           0.259696   \n",
       "\n",
       "Database_ID  ENSP00000001008.4  ENSP00000002125.4  ENSP00000002165.5  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.612614           0.514855          -0.824026   \n",
       "01CO006              -0.571640          -0.209734           0.799090   \n",
       "01CO008               0.252995           0.190861           0.101419   \n",
       "01CO013              -0.455055           0.500686          -0.350366   \n",
       "01CO014               0.341345          -0.310265           0.095461   \n",
       "\n",
       "Database_ID  ENSP00000003084.6  ENSP00000003100.8  ENSP00000003302.4  \\\n",
       "Patient_ID                                                             \n",
       "01CO005                    NaN           0.045383                NaN   \n",
       "01CO006                    NaN          -0.338493          -0.042567   \n",
       "01CO008              -0.502876           0.627060           0.089815   \n",
       "01CO013                    NaN           0.263168           0.683830   \n",
       "01CO014              -0.745855           1.006614                NaN   \n",
       "\n",
       "Database_ID  ENSP00000004103.3  ...  ENSP00000499339.1  ENSP00000499757.1  \\\n",
       "Patient_ID                      ...                                         \n",
       "01CO005              -0.248511  ...                NaN                NaN   \n",
       "01CO006                    NaN  ...          -0.411664          -0.454109   \n",
       "01CO008              -0.106411  ...           0.192279          -0.558236   \n",
       "01CO013                    NaN  ...           0.220231                NaN   \n",
       "01CO014                    NaN  ...          -0.198671           0.226146   \n",
       "\n",
       "Database_ID  ENSP00000499778.1  ENSP00000499869.1  ENSP00000499937.1  \\\n",
       "Patient_ID                                                             \n",
       "01CO005              -0.042548                NaN                NaN   \n",
       "01CO006              -0.725892                NaN                NaN   \n",
       "01CO008              -0.093708          -1.874293          -0.248307   \n",
       "01CO013               0.241860          -3.939263                NaN   \n",
       "01CO014               0.036229                NaN                NaN   \n",
       "\n",
       "Database_ID  ENSP00000500094.1  ENSP00000500633.1  ENSP00000500710.1  \\\n",
       "Patient_ID                                                             \n",
       "01CO005               0.925011          -0.173468          -0.180521   \n",
       "01CO006              -0.707588          -0.846624           0.329813   \n",
       "01CO008              -0.899186          -0.526260           0.668713   \n",
       "01CO013               0.514931          -0.078267           0.122032   \n",
       "01CO014               1.189468           0.117736           0.586529   \n",
       "\n",
       "Database_ID  ENSP00000501300.1  ENSP00000501491.1  \n",
       "Patient_ID                                         \n",
       "01CO005               0.139707          -0.882283  \n",
       "01CO006              -0.311147          -0.446358  \n",
       "01CO008               0.109366          -1.125296  \n",
       "01CO013               0.130764          -1.146911  \n",
       "01CO014              -0.006767          -1.106068  \n",
       "\n",
       "[5 rows x 9457 columns]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# With quiet=True, no warning is issued\n",
    "ut.reduce_multiindex(df=prot, levels_to_drop=\"Name\", quiet=True).head()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}