---
title: "What happens when you set the intercept to 0 in regression models"
description: |
  What happens when you force the intercept to be 0 in a regression model and why you should (generally) never do it
preview: https://raw.githubusercontent.com/hauselin/rtutorialsite/master/_posts/2019-07-24-what-happens-when-you-remove-or-set-the-intercept-to-0-in-regression-models/what-happens-when-you-remove-or-set-the-intercept-to-0-in-regression-models_files/figure-html5/unnamed-chunk-6-1.png
author:
  - name: Hause Lin
    url: {}
date: 07-24-2019
categories: 
  - regression
  - general linear model
output:
  radix::radix_article:
    toc: true
    self_contained: false
draft: false
editor_options: 
  chunk_output_type: console
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, cache = FALSE, comment = NA, message = FALSE, warning = FALSE)
```

Get source code for this RMarkdown script [here](https://raw.githubusercontent.com/hauselin/rtutorialsite/master/_posts/2019-07-24-what-happens-when-you-remove-or-set-the-intercept-to-0-in-regression-models/what-happens-when-you-remove-or-set-the-intercept-to-0-in-regression-models.Rmd).

## Consider being a patron and supporting my work?

[Donate and become a patron](https://donorbox.org/support-my-teaching): If you find value in what I do and have learned something from my site, please consider becoming a patron. It takes me many hours to research, learn, and put together tutorials. Your support really matters.

This article answers the questions below. I will use the built-in dataset `mtcars`. 

* What is the intercept in a regression model?
* What happens when you remove or set the intercept to 0 in a regression model?
* Why you should never remove or set the intercept to 0?
* What are the effects of mean-centering a regressor/predictor?

If you need a refresher on how to interpret regression coefficients, see my other [article](https://hausetutorials.netlify.com/posts/2019-07-06-interpreting-interaction-regression-coefficients/).

```{r}
library(ggplot2) # plot regression lines
```

Have a look at the built-in `mtcars` dataset.

```{r}
head(mtcars)
```

## Fit simple regression model (with intercept)

```{r}
model1 <- lm(mpg ~ disp, mtcars) # the intercept is included by default: lm(mpg ~ 1 + disp, mtcars)
coef1 <- coef(model1) # get coefficients
coef1
```

Regression equation: 

$$mpg_{i} = 29.59 - 0.04*disp_{i}$$

Interpretation

* For every 1 unit increase in the predictor `disp`, the outcome `mpg` changes by -0.04. That is, as `disp` increases, `mpg` decreases. 
* When `disp = 0`, `mpg = 29.59`. 

$$mpg = 29.59 - 0.04*0$$

$$mpg = 29.59$$

```{r fig1}
ggplot(mtcars, aes(disp, mpg)) +
  geom_point() +
  geom_vline(xintercept = 0, col = 'grey') +
  geom_hline(yintercept = 0, col = 'grey') +
  scale_x_continuous(limits = c(-10, max(mtcars$disp))) +
  scale_y_continuous(limits = c(-10, max(mtcars$mpg))) +
  geom_abline(intercept = coef(model1)[1], slope = coef(model1)[2]) # manually plot regression line
```

## Fit simple regression model without intercept

```{r}
model0 <- lm(mpg ~ 0 + disp, mtcars) # equivalent syntax: lm(mpg ~ -1 + disp, mtcars)
coef0 <- coef(model0) # get coefficients
coef0
```

Note that after setting the intercept to 0, the relationship between `mpg` and `disp` is now **POSITIVE**, rather than negative (see above model with intercept). 

Regression equation: 

$$mpg_{i} = 0 + 0.059*disp_{i}$$

Interpretation

* For every 1 unit increase in the predictor `disp`, the outcome `mpg` changes by 0.059. That is, as `disp` increases, `mpg` **increases**. 
* When `disp = 0`, `mpg = 0`. **By removing the intercept (i.e., setting it to 0), we are forcing the regression line to go through the origin (the point where disp = 0 and mpg = 0).**

$$mpg = 0 + 0.059*0$$

$$mpg = 0$$

The regression line is forced to pass through the origin (0, 0). Therefore, unless your regressors are standardized or mean-centered, it's not a good idea to set the intercept to 0 when fitting the model. Even when your regressors are standardized or mean-centered, you should still include the intercept. 

```{r}
ggplot(mtcars, aes(disp, mpg)) +
  geom_point() +
  geom_vline(xintercept = 0, col = 'grey') +
  geom_hline(yintercept = 0, col = 'grey') +
  scale_x_continuous(limits = c(-10, max(mtcars$disp))) +
  scale_y_continuous(limits = c(-10, max(mtcars$mpg))) +
  geom_abline(intercept = 0, slope = coef(model0)[1]) # manually plot regression line
```

## Fit simple regression models with mean-centered regressor

Mean-center regressor

```{r}
mtcars$dispC <- mtcars$disp - mean(mtcars$disp) # create mean-centered variable
mean(mtcars$dispC) # mean of dispC is 0 (with some rounding error)
```

Fit model with intercept and mean-centered regressor

```{r}
model1c <- lm(mpg ~ dispC, mtcars)
coef(model1c)
```

<aside>
The intercept is now the the mean of outcome variable `mpg`. Check it by running `mean(mtcars$mpg)`.
</aside>

Fit model **without** intercept and with mean-centered regressor

```{r}
model0c <- lm(mpg ~ 0 + dispC, mtcars)
coef(model0c)
```

After mean-centering the regressor/predictor, fitting the model with or without the intercept gives the same `dispC` coefficient: -0.04

```{r}
ggplot(mtcars, aes(dispC, mpg)) +
  geom_point() +
  geom_vline(xintercept = 0, col = 'grey') +
  geom_hline(yintercept = 0, col = 'grey') +
  scale_x_continuous(limits = c(min(mtcars$dispC), max(mtcars$dispC))) +
  scale_y_continuous(limits = c(-10, max(mtcars$mpg))) +
  geom_abline(intercept = coef(model1c)[1], slope = coef(model1c)[2]) 
```

Note that the regression slope is identical to the first figure. The only difference is that the points have been shifted to the left. 

## Additional resources

* [StackExchange: When is it ok to remove the intercept in a linear regression model?](https://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-a-linear-regression-model)

* [Interpreting regression coefficients](https://hausetutorials.netlify.com/posts/2019-07-06-interpreting-interaction-regression-coefficients/).

## Support my work

[Support my work and become a patron here](https://donorbox.org/support-my-teaching)!