{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Course 2 week 1 lecture notebook 01\n", "# Create a Linear Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linear model using scikit-learn\n", "\n", "We'll practice using a scikit-learn model for linear regression. You will do something similar in this week's assignment (but with a logistic regression model).\n", "\n", "[sklearn.linear_model.LinearRegression()](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, import `LinearRegression`, which is a Python class." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import the LinearRegression class from sklearn.linear_model\n", "from sklearn.linear_model import LinearRegression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, use the class to create an object of type `LinearRegression`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create an object of type LinearRegression\n", "model = LinearRegression()\n", "model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate some data using the `load_data` function from the `utils` module, which is implemented for you. The features in `X` are:\n", "\n", "- Age: (years)\n", "- Systolic_BP: Systolic blood pressure (mmHg)\n", "- Diastolic_BP: Diastolic blood pressure (mmHg)\n", "- Cholesterol: (mg/dL)\n", "\n", "The labels in `y` indicate whether the patient has a disease (diabetic retinopathy).\n", "- y = 1 : patient has retinopathy.\n", "- y = 0 : patient does not have retinopathy." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import the load_data function from the utils module\n", "from utils import load_data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Generate features and labels using the imported function\n", "X, y = load_data(100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Explore the data by viewing the features and the labels." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# View the features\n", "X.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot a histogram of the Age feature\n", "X['Age'].hist();" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot a histogram of the systolic blood pressure feature\n", "X['Systolic_BP'].hist();" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot a histogram of the diastolic blood pressure feature\n", "X['Diastolic_BP'].hist();" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot a histogram of the cholesterol feature\n", "X['Cholesterol'].hist();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also take a look at the labels." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# View a few values of the labels\n", "y.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot a histogram of the labels\n", "y.hist();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fit the `LinearRegression` model using the features in `X` and the labels in `y`. \"Fitting\" the model is another way of saying that we are training it on the data." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Fit the linear regression model\n", "model.fit(X, y)\n", "model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- View the coefficients of the trained model.\n", "- The coefficients are the 'weights' or $\\beta$s associated with each feature.\n", "- You'll use the coefficients for making predictions.\n", "- By default, scikit-learn also fits an intercept $\\beta_0$, stored separately in `model.intercept_`; `model.coef_` holds only $\\beta_1, \\dots, \\beta_N$.\n", "\n", "$$\\hat{y} = \\beta_0 + \\beta_1 x_1 + \\beta_2 x_2 + \\dots + \\beta_N x_N$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# View the coefficients of the model\n", "model.coef_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the assignment, you will do something similar, but using logistic regression, so that the predicted output is bounded between 0 and 1." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### This is the end of this practice section.\n", "\n", "Please continue on with the lecture videos!\n", "\n", "---" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }