{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from statsmodels.stats.multicomp import (pairwise_tukeyhsd,MultiComparison)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"https://raw.githubusercontent.com/weilixiang/sta2453_project1/master/merged_data.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Variables Description\n",
    "\n",
    "- PlayerID: A numerical number range from 1 to 17, indicating a unique playerID for each player\n",
    "- SessionType: A categorical variable describing the training session types: Skills, Mobility/Recovery, Game, Conditioning, Strength, Combat, Speed\n",
    "- Duration: A float representing session length, ranging from 3.00 to 120.00\n",
    "- RPE: An integer representing rate of perceived exertion (0-10 scale)\n",
    "- SessionLoad: A float obtained by Duration * RPE\n",
    "- DailyLoad: A float that is the sum of SessionLoad for a given day\n",
    "- AcuteLoad: A float that is the average daily load over past 7 days\n",
    "- ChronicLoad: A float that is the average daily load over past 30 days\n",
    "- AcuteChronicRatio: A float that is calculated by AcuteLoad/ChronicLoad\n",
    "- Load_Status: A categorical variable, which is dependent on AcuteChronicRatio.If ratio>1.2, status is high. If ratio<0.8, status is recovering. In between it is normal\n",
    "- GameID: An integer representing a unique game ID, there are 38 games in total\n",
    "- game_date: A date that indicates the date of the game\n",
    "- train_date: A date that indicates the training date, which is one day ahead of game day\n",
    "- AccelImpulse: A float that is the max absolute value of change in speed divided by change in time\n",
    "- AccelLoad: A float that indicates the max load detected by the accelerometer\n",
    "- Speed: A float that is the max movement speed of the player, in meters per second\n",
    "- PerformanceScore: A measurement that is the weighted average of AccelImpulse, AccelLoad, and Speed. This measurement is considered as response variable in our model. It indicates how good the performance is for each player\n",
    "- Outcome: A binary categorical variable, W indicates win, and L indicates loss.\n",
    "- PointsDiff: An integer, which is the difference of scores. A positive difference indicates winning by how many points.\n",
    "- Pain: A binary categorical variable(Yes or No) indicates whether a player is in pain\n",
    "- Illness: A categorical variable indicates whether a player is feeling ill, all possible values are  Yes, No, and Slightly Off\n",
    "- Menstruation: A binary categorical variable(Yes or No) indicates whether a player is currently menstruating. \n",
    "- Nutrition: A binary categorical variable(Excellent or Okay)\n",
    "- NutritionAdjustment: A binary categorical variable indicates whether the player has made a nutrition adjustment that day\n",
    "- EWMScore: A float number that measures the exponential moving average of wellness score"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                    Mixed Linear Model Regression Results\n",
      "=============================================================================\n",
      "Model:                 MixedLM      Dependent Variable:      PerformanceScore\n",
      "No. Observations:      562          Method:                  REML            \n",
      "No. Groups:            17           Scale:                   58.6021         \n",
      "Min. group size:       3            Likelihood:              -1956.4791      \n",
      "Max. group size:       80           Converged:               Yes             \n",
      "Mean group size:       33.1                                                  \n",
      "-----------------------------------------------------------------------------\n",
      "                                  Coef.  Std.Err.   z    P>|z|  [0.025 0.975]\n",
      "-----------------------------------------------------------------------------\n",
      "Intercept                         50.766    8.679  5.849 0.000  33.755 67.777\n",
      "SessionType[T.Game]               -0.501    2.385 -0.210 0.834  -5.175  4.173\n",
      "SessionType[T.Mobility/Recovery]   0.331    2.774  0.119 0.905  -5.106  5.769\n",
      "SessionType[T.Skills]              1.829    2.469  0.741 0.459  -3.010  6.668\n",
      "SessionType[T.Speed]               2.498    3.187  0.784 0.433  -3.748  8.744\n",
      "SessionType[T.Strength]            5.037    3.126  1.611 0.107  -1.090 11.163\n",
      "Load_Status[T.normal]             -6.476    2.475 -2.617 0.009 -11.327 -1.626\n",
      "Load_Status[T.recovering]         -4.472    3.045 -1.468 0.142 -10.440  1.497\n",
      "Outcome[T.W]                      -2.591    1.320 -1.963 0.050  -5.178 -0.004\n",
      "Pain[T.Yes]                        3.424    1.951  1.755 0.079  -0.399  7.248\n",
      "Illness[T.Slightly Off]            4.863    1.910  2.547 0.011   1.120  8.606\n",
      "Illness[T.Yes]                   -10.834    5.806 -1.866 0.062 -22.215  0.546\n",
      "Menstruation[T.Yes]               -0.170    1.288 -0.132 0.895  -2.695  2.355\n",
      "Nutrition[T.Okay]                  4.690    1.452  3.230 0.001   1.844  7.537\n",
      "NutritionAdjustment[T.No]          6.456    4.876  1.324 0.185  -3.100 16.013\n",
      "NutritionAdjustment[T.Yes]         8.217    3.508  2.342 0.019   1.341 15.092\n",
      "Duration                           0.026    0.042  0.626 0.531  -0.056  0.109\n",
      "RPE                                0.419    0.345  1.215 0.224  -0.257  1.095\n",
      "SessionLoad                       -0.014    0.007 -1.926 0.054  -0.028  0.000\n",
      "DailyLoad                          0.009    0.002  3.813 0.000   0.004  0.013\n",
      "AcuteLoad                         -0.034    0.016 -2.043 0.041  -0.066 -0.001\n",
      "ChronicLoad                        0.000    0.013  0.010 0.992  -0.025  0.026\n",
      "AcuteChronicRatio                  6.841    5.802  1.179 0.238  -4.532 18.213\n",
      "PointsDiff                         0.125    0.039  3.182 0.001   0.048  0.202\n",
      "EWMScore                          -0.412    0.244 -1.685 0.092  -0.891  0.067\n",
      "Group Var                         79.990    4.328                            \n",
      "=============================================================================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "import statsmodels.api as sm\n",
    "\n",
    "import statsmodels.formula.api as smf\n",
    "\n",
    "full_md = smf.mixedlm(\"PerformanceScore  ~ SessionType +  Duration + RPE + SessionLoad + \\\n",
    "    DailyLoad + AcuteLoad + ChronicLoad + AcuteChronicRatio + Load_Status + \\\n",
    "        Outcome + PointsDiff + Pain + Illness + Menstruation + \\\n",
    "       Nutrition + NutritionAdjustment + EWMScore\", df,groups = df[\"PlayerID\"])\n",
    "                 \n",
    "mdf = full_md.fit()\n",
    "print(mdf.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, ChronicLoad has p-value 0.992, which is significantly larger than 0.1, we will drop it for next model. Keep in mind, the REML is about 58.6."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                    Mixed Linear Model Regression Results\n",
      "=============================================================================\n",
      "Model:                 MixedLM      Dependent Variable:      PerformanceScore\n",
      "No. Observations:      562          Method:                  REML            \n",
      "No. Groups:            17           Scale:                   58.4976         \n",
      "Min. group size:       3            Likelihood:              -1953.0431      \n",
      "Max. group size:       80           Converged:               Yes             \n",
      "Mean group size:       33.1                                                  \n",
      "-----------------------------------------------------------------------------\n",
      "                                  Coef.  Std.Err.   z    P>|z|  [0.025 0.975]\n",
      "-----------------------------------------------------------------------------\n",
      "Intercept                         50.784    8.618  5.893 0.000  33.893 67.674\n",
      "SessionType[T.Game]               -0.500    2.380 -0.210 0.834  -5.165  4.165\n",
      "SessionType[T.Mobility/Recovery]   0.334    2.765  0.121 0.904  -5.086  5.754\n",
      "SessionType[T.Skills]              1.832    2.462  0.744 0.457  -2.995  6.658\n",
      "SessionType[T.Speed]               2.501    3.169  0.789 0.430  -3.710  8.711\n",
      "SessionType[T.Strength]            5.038    3.119  1.615 0.106  -1.076 11.152\n",
      "Load_Status[T.normal]             -6.476    2.471 -2.621 0.009 -11.320 -1.633\n",
      "Load_Status[T.recovering]         -4.471    3.039 -1.471 0.141 -10.428  1.485\n",
      "Outcome[T.W]                      -2.591    1.318 -1.966 0.049  -5.174 -0.008\n",
      "Pain[T.Yes]                        3.424    1.942  1.763 0.078  -0.382  7.229\n",
      "Illness[T.Slightly Off]            4.862    1.892  2.570 0.010   1.154  8.569\n",
      "Illness[T.Yes]                   -10.838    5.797 -1.870 0.062 -22.200  0.524\n",
      "Menstruation[T.Yes]               -0.169    1.286 -0.132 0.895  -2.690  2.351\n",
      "Nutrition[T.Okay]                  4.690    1.449  3.237 0.001   1.850  7.529\n",
      "NutritionAdjustment[T.No]          6.454    4.870  1.325 0.185  -3.090 15.999\n",
      "NutritionAdjustment[T.Yes]         8.216    3.504  2.345 0.019   1.348 15.084\n",
      "Duration                           0.026    0.042  0.629 0.529  -0.056  0.108\n",
      "RPE                                0.419    0.344  1.218 0.223  -0.256  1.094\n",
      "SessionLoad                       -0.014    0.007 -1.928 0.054  -0.028  0.000\n",
      "DailyLoad                          0.009    0.002  3.820 0.000   0.004  0.013\n",
      "AcuteLoad                         -0.034    0.008 -4.386 0.000  -0.049 -0.019\n",
      "AcuteChronicRatio                  6.795    3.263  2.082 0.037   0.399 13.190\n",
      "PointsDiff                         0.125    0.039  3.186 0.001   0.048  0.202\n",
      "EWMScore                          -0.411    0.223 -1.844 0.065  -0.848  0.026\n",
      "Group Var                         79.628    4.269                            \n",
      "=============================================================================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "md1 = smf.mixedlm(\"PerformanceScore  ~ SessionType +  Duration + RPE + SessionLoad + \\\n",
    "    DailyLoad + AcuteLoad + AcuteChronicRatio + Load_Status + \\\n",
    "        Outcome + PointsDiff + Pain + Illness + Menstruation + \\\n",
    "       Nutrition + NutritionAdjustment + EWMScore\", df,groups = df[\"PlayerID\"])\n",
    "                 \n",
    "mdf = md1.fit()\n",
    "print(mdf.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, SessionType has p-value 0.904, which is significantly larger than 0.1, also all levels are insignificant, we will drop it for next model. Keep in mind, the REML is about 58.5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                 Mixed Linear Model Regression Results\n",
      "=======================================================================\n",
      "Model:               MixedLM    Dependent Variable:    PerformanceScore\n",
      "No. Observations:    562        Method:                REML            \n",
      "No. Groups:          17         Scale:                 58.6317         \n",
      "Min. group size:     3          Likelihood:            -1963.8942      \n",
      "Max. group size:     80         Converged:             Yes             \n",
      "Mean group size:     33.1                                              \n",
      "-----------------------------------------------------------------------\n",
      "                            Coef.  Std.Err.   z    P>|z|  [0.025 0.975]\n",
      "-----------------------------------------------------------------------\n",
      "Intercept                   52.713    8.434  6.250 0.000  36.182 69.244\n",
      "Load_Status[T.normal]       -6.701    2.449 -2.737 0.006 -11.500 -1.902\n",
      "Load_Status[T.recovering]   -4.881    2.982 -1.637 0.102 -10.726  0.963\n",
      "Outcome[T.W]                -2.659    1.315 -2.022 0.043  -5.236 -0.082\n",
      "Pain[T.Yes]                  3.811    1.871  2.037 0.042   0.144  7.478\n",
      "Illness[T.Slightly Off]      5.203    1.886  2.759 0.006   1.507  8.900\n",
      "Illness[T.Yes]             -10.215    5.790 -1.764 0.078 -21.564  1.134\n",
      "Menstruation[T.Yes]         -0.157    1.284 -0.122 0.903  -2.673  2.359\n",
      "Nutrition[T.Okay]            5.078    1.417  3.584 0.000   2.301  7.855\n",
      "NutritionAdjustment[T.No]    6.560    4.851  1.352 0.176  -2.948 16.068\n",
      "NutritionAdjustment[T.Yes]   8.765    3.495  2.508 0.012   1.914 15.615\n",
      "Duration                     0.058    0.037  1.544 0.123  -0.016  0.131\n",
      "RPE                          0.328    0.255  1.288 0.198  -0.171  0.828\n",
      "SessionLoad                 -0.014    0.007 -1.908 0.056  -0.028  0.000\n",
      "DailyLoad                    0.008    0.002  3.532 0.000   0.003  0.012\n",
      "AcuteLoad                   -0.031    0.007 -4.137 0.000  -0.046 -0.016\n",
      "AcuteChronicRatio            5.729    3.176  1.803 0.071  -0.497 11.954\n",
      "PointsDiff                   0.126    0.039  3.219 0.001   0.049  0.203\n",
      "EWMScore                    -0.471    0.219 -2.151 0.031  -0.901 -0.042\n",
      "Group Var                   77.071    4.148                            \n",
      "=======================================================================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "md2 = smf.mixedlm(\"PerformanceScore  ~  Duration + RPE + SessionLoad + \\\n",
    "    DailyLoad + AcuteLoad + AcuteChronicRatio + Load_Status + \\\n",
    "        Outcome + PointsDiff + Pain + Illness + Menstruation + \\\n",
    "       Nutrition + NutritionAdjustment + EWMScore\", df,groups = df[\"PlayerID\"])\n",
    "                 \n",
    "mdf = md2.fit()\n",
    "print(mdf.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, Menstruation has p-value 0.903, which is significantly larger than 0.1, also all levels are insignificant, we will drop it for next model. The REML is about 58.6."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                 Mixed Linear Model Regression Results\n",
      "=======================================================================\n",
      "Model:               MixedLM    Dependent Variable:    PerformanceScore\n",
      "No. Observations:    562        Method:                REML            \n",
      "No. Groups:          17         Scale:                 58.5248         \n",
      "Min. group size:     3          Likelihood:            -1965.0699      \n",
      "Max. group size:     80         Converged:             Yes             \n",
      "Mean group size:     33.1                                              \n",
      "-----------------------------------------------------------------------\n",
      "                            Coef.  Std.Err.   z    P>|z|  [0.025 0.975]\n",
      "-----------------------------------------------------------------------\n",
      "Intercept                   52.513    8.271  6.349 0.000  36.301 68.724\n",
      "Load_Status[T.normal]       -6.693    2.445 -2.737 0.006 -11.486 -1.900\n",
      "Load_Status[T.recovering]   -4.864    2.976 -1.635 0.102 -10.696  0.968\n",
      "Outcome[T.W]                -2.674    1.308 -2.045 0.041  -5.237 -0.111\n",
      "Pain[T.Yes]                  3.800    1.867  2.035 0.042   0.140  7.459\n",
      "Illness[T.Slightly Off]      5.272    1.798  2.933 0.003   1.749  8.796\n",
      "Illness[T.Yes]             -10.191    5.782 -1.763 0.078 -21.524  1.141\n",
      "Nutrition[T.Okay]            5.061    1.409  3.593 0.000   2.300  7.822\n",
      "NutritionAdjustment[T.No]    6.614    4.827  1.370 0.171  -2.846 16.075\n",
      "NutritionAdjustment[T.Yes]   8.793    3.485  2.524 0.012   1.964 15.623\n",
      "Duration                     0.058    0.037  1.548 0.122  -0.015  0.131\n",
      "RPE                          0.330    0.254  1.297 0.195  -0.169  0.829\n",
      "SessionLoad                 -0.014    0.007 -1.913 0.056  -0.028  0.000\n",
      "DailyLoad                    0.008    0.002  3.539 0.000   0.003  0.012\n",
      "AcuteLoad                   -0.031    0.007 -4.234 0.000  -0.045 -0.016\n",
      "AcuteChronicRatio            5.686    3.154  1.803 0.071  -0.497 11.868\n",
      "PointsDiff                   0.127    0.039  3.258 0.001   0.051  0.203\n",
      "EWMScore                    -0.467    0.216 -2.161 0.031  -0.891 -0.043\n",
      "Group Var                   76.964    4.146                            \n",
      "=======================================================================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "md3 = smf.mixedlm(\"PerformanceScore  ~  Duration + RPE + SessionLoad + \\\n",
    "    DailyLoad + AcuteLoad + AcuteChronicRatio + Load_Status + \\\n",
    "        Outcome + PointsDiff + Pain + Illness +\\\n",
    "       Nutrition + NutritionAdjustment + EWMScore\", df,groups = df[\"PlayerID\"])\n",
    "                 \n",
    "mdf = md3.fit()\n",
    "print(mdf.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, RPE has p-value 0.195, which is larger than 0.1,  we will drop it for next model. The REML is about 58.5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                 Mixed Linear Model Regression Results\n",
      "=======================================================================\n",
      "Model:               MixedLM    Dependent Variable:    PerformanceScore\n",
      "No. Observations:    562        Method:                REML            \n",
      "No. Groups:          17         Scale:                 58.6464         \n",
      "Min. group size:     3          Likelihood:            -1965.4590      \n",
      "Max. group size:     80         Converged:             Yes             \n",
      "Mean group size:     33.1                                              \n",
      "-----------------------------------------------------------------------\n",
      "                            Coef.  Std.Err.   z    P>|z|  [0.025 0.975]\n",
      "-----------------------------------------------------------------------\n",
      "Intercept                   53.926    8.193  6.582 0.000  37.869 69.983\n",
      "Load_Status[T.normal]       -6.552    2.445 -2.680 0.007 -11.344 -1.760\n",
      "Load_Status[T.recovering]   -4.635    2.973 -1.559 0.119 -10.461  1.191\n",
      "Outcome[T.W]                -2.662    1.309 -2.034 0.042  -5.228 -0.097\n",
      "Pain[T.Yes]                  3.725    1.868  1.994 0.046   0.063  7.386\n",
      "Illness[T.Slightly Off]      5.373    1.798  2.989 0.003   1.850  8.896\n",
      "Illness[T.Yes]             -10.085    5.787 -1.743 0.081 -21.427  1.257\n",
      "Nutrition[T.Okay]            4.882    1.401  3.483 0.000   2.135  7.629\n",
      "NutritionAdjustment[T.No]    6.457    4.825  1.338 0.181  -2.999 15.914\n",
      "NutritionAdjustment[T.Yes]   8.640    3.486  2.479 0.013   1.808 15.471\n",
      "Duration                     0.023    0.026  0.880 0.379  -0.028  0.073\n",
      "SessionLoad                 -0.006    0.004 -1.464 0.143  -0.014  0.002\n",
      "DailyLoad                    0.007    0.002  3.342 0.001   0.003  0.011\n",
      "AcuteLoad                   -0.029    0.007 -4.046 0.000  -0.043 -0.015\n",
      "AcuteChronicRatio            5.601    3.157  1.774 0.076  -0.586 11.787\n",
      "PointsDiff                   0.127    0.039  3.262 0.001   0.051  0.204\n",
      "EWMScore                    -0.466    0.216 -2.154 0.031  -0.889 -0.042\n",
      "Group Var                   74.825    4.029                            \n",
      "=======================================================================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "md4 = smf.mixedlm(\"PerformanceScore  ~  Duration + SessionLoad +DailyLoad + AcuteLoad + \\\n",
    "AcuteChronicRatio + Load_Status +  Outcome + PointsDiff + Pain + Illness +\\\n",
    "       Nutrition + NutritionAdjustment + EWMScore\", df,groups = df[\"PlayerID\"])\n",
    "                 \n",
    "mdf = md4.fit()\n",
    "print(mdf.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, Duration has p-value 0.379, which is larger than 0.1,  we will drop it for next model. The REML is about 58.6."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                Mixed Linear Model Regression Results\n",
      "======================================================================\n",
      "Model:                MixedLM   Dependent Variable:   PerformanceScore\n",
      "No. Observations:     562       Method:               REML            \n",
      "No. Groups:           17        Scale:                58.6255         \n",
      "Min. group size:      3         Likelihood:           -1963.0989      \n",
      "Max. group size:      80        Converged:            Yes             \n",
      "Mean group size:      33.1                                            \n",
      "----------------------------------------------------------------------\n",
      "                           Coef.  Std.Err.   z    P>|z|  [0.025 0.975]\n",
      "----------------------------------------------------------------------\n",
      "Intercept                  55.145    8.072  6.832 0.000  39.324 70.966\n",
      "Load_Status[T.normal]      -6.703    2.438 -2.749 0.006 -11.482 -1.924\n",
      "Load_Status[T.recovering]  -4.845    2.963 -1.635 0.102 -10.652  0.961\n",
      "Outcome[T.W]               -2.631    1.308 -2.011 0.044  -5.195 -0.066\n",
      "Pain[T.Yes]                 3.778    1.867  2.023 0.043   0.119  7.436\n",
      "Illness[T.Slightly Off]     5.426    1.796  3.021 0.003   1.905  8.946\n",
      "Illness[T.Yes]             -9.987    5.785 -1.726 0.084 -21.325  1.351\n",
      "Nutrition[T.Okay]           4.922    1.401  3.514 0.000   2.177  7.668\n",
      "NutritionAdjustment[T.No]   6.252    4.817  1.298 0.194  -3.190 15.694\n",
      "NutritionAdjustment[T.Yes]  8.622    3.485  2.474 0.013   1.792 15.453\n",
      "SessionLoad                -0.003    0.002 -1.322 0.186  -0.008  0.001\n",
      "DailyLoad                   0.007    0.002  3.240 0.001   0.003  0.011\n",
      "AcuteLoad                  -0.028    0.007 -3.983 0.000  -0.042 -0.014\n",
      "AcuteChronicRatio           5.308    3.138  1.691 0.091  -0.843 11.459\n",
      "PointsDiff                  0.129    0.039  3.298 0.001   0.052  0.205\n",
      "EWMScore                   -0.483    0.215 -2.243 0.025  -0.904 -0.061\n",
      "Group Var                  74.643    4.026                            \n",
      "======================================================================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "md5 = smf.mixedlm(\"PerformanceScore  ~ SessionLoad +DailyLoad + AcuteLoad + \\\n",
    "AcuteChronicRatio + Load_Status +  Outcome + PointsDiff + Pain + Illness +\\\n",
    "       Nutrition + NutritionAdjustment + EWMScore\", df,groups = df[\"PlayerID\"])\n",
    "                 \n",
    "mdf = md5.fit()\n",
    "print(mdf.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, Sessionload has p-value 0.186, which is larger than 0.1,  we will drop it for next model. The REML is about 58.6."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                 Mixed Linear Model Regression Results\n",
      "=======================================================================\n",
      "Model:               MixedLM    Dependent Variable:    PerformanceScore\n",
      "No. Observations:    562        Method:                REML            \n",
      "No. Groups:          17         Scale:                 58.7169         \n",
      "Min. group size:     3          Likelihood:            -1958.8323      \n",
      "Max. group size:     80         Converged:             Yes             \n",
      "Mean group size:     33.1                                              \n",
      "-----------------------------------------------------------------------\n",
      "                            Coef.  Std.Err.   z    P>|z|  [0.025 0.975]\n",
      "-----------------------------------------------------------------------\n",
      "Intercept                   54.331    8.054  6.746 0.000  38.546 70.116\n",
      "Load_Status[T.normal]       -6.640    2.440 -2.722 0.006 -11.422 -1.859\n",
      "Load_Status[T.recovering]   -4.826    2.965 -1.628 0.104 -10.636  0.984\n",
      "Outcome[T.W]                -2.610    1.309 -1.994 0.046  -5.176 -0.044\n",
      "Pain[T.Yes]                  3.772    1.868  2.019 0.044   0.110  7.434\n",
      "Illness[T.Slightly Off]      5.420    1.797  3.016 0.003   1.898  8.943\n",
      "Illness[T.Yes]             -10.046    5.789 -1.735 0.083 -21.392  1.301\n",
      "Nutrition[T.Okay]            4.800    1.398  3.433 0.001   2.060  7.541\n",
      "NutritionAdjustment[T.No]    5.764    4.806  1.199 0.230  -3.655 15.183\n",
      "NutritionAdjustment[T.Yes]   8.274    3.478  2.379 0.017   1.458 15.090\n",
      "DailyLoad                    0.006    0.002  3.060 0.002   0.002  0.010\n",
      "AcuteLoad                   -0.028    0.007 -3.953 0.000  -0.042 -0.014\n",
      "AcuteChronicRatio            5.410    3.140  1.723 0.085  -0.743 11.564\n",
      "PointsDiff                   0.125    0.039  3.211 0.001   0.049  0.201\n",
      "EWMScore                    -0.459    0.215 -2.139 0.032  -0.880 -0.039\n",
      "Group Var                   74.236    4.005                            \n",
      "=======================================================================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "md6 = smf.mixedlm(\"PerformanceScore  ~ DailyLoad + AcuteLoad + \\\n",
    "AcuteChronicRatio + Load_Status +  Outcome + PointsDiff + Pain + Illness +\\\n",
    "       Nutrition + NutritionAdjustment + EWMScore\", df,groups = df[\"PlayerID\"])\n",
    "                 \n",
    "mdf = md6.fit()\n",
    "print(mdf.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Up until here, all p-values are less than 0.1, or at least some levels in categorical variable are significant. We will stop here, and use it as our final model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}