{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Linear Regression: Model Selection and Cross-Validation\n", "\n", "Functions\n", "\n", "`RandomState.permutation`, `sm.OLS`, `set`, `scipy.stats.norm.ppf`, `np.linspace`, `np.mean`\n", "\n", "### Exercise 39\n", "For the four portfolios we have been looking at, and considering all 8 sets of\n", "regressors, which range from no factors to all 3 factors, which model is preferred\n", "by AIC, BIC, GtS (general-to-specific) and StG (specific-to-general)?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2023-09-28T12:33:47.160722Z", "iopub.status.busy": "2023-09-28T12:33:47.160722Z", "iopub.status.idle": "2023-09-28T12:33:49.590078Z", "shell.execute_reply": "2023-09-28T12:33:49.589071Z" } }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 40\n", "Cross-validation is a method of analyzing the out-of-sample forecasting ability of a\n", "cross-sectional model by using $100-\\alpha\\%$ of the data to estimate the model and\n", "then measuring the fit using the remaining $\\alpha\\%$. The most common forms\n", "are 5- and 10-fold cross-validation, which use $\\alpha=20\\%$ and $10\\%$, respectively.\n", "k-fold cross-validation is implemented by randomly grouping the data into\n", "k equally sized groups (bins), using k-1 of the groups to estimate parameters, and\n", "then evaluating the fit using the bin that was held out. This is then repeated so that\n", "each bin is held out exactly once.\n", "\n", "1. Implement cross-validation using the 5- and 10-fold methods for all 8 models.\n", "2. For each model, compute the full-sample sum of squared errors as well as the\n", "   sum of squared errors using the held-out samples. Note that every data point\n", "   appears exactly once in each of these sums of squared errors. What happens\n", "   to the cross-validated $R^{2}$ when computed on the held-out samples when compared\n", "   to the full-sample $R^{2}$? 
(Hint: compute the cross-validated $R^{2}$ as 1 minus the k-fold cross-validated SSE divided by the TSS.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2023-09-28T12:33:49.748547Z", "iopub.status.busy": "2023-09-28T12:33:49.748547Z", "iopub.status.idle": "2023-09-28T12:33:50.268702Z", "shell.execute_reply": "2023-09-28T12:33:50.268702Z" }, "tags": [] }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" }, "pycharm": { "stem_cell": { "cell_type": "raw", "metadata": { "collapsed": false }, "source": [] } } }, "nbformat": 4, "nbformat_minor": 4 }