
Select a set of predictors with minimal multicollinearity
Source: R/non_collinear_vars.R
Select a set of predictors with minimal multicollinearity, using the variance
inflation factor (VIF) as the criterion for removing collinear variables. The
algorithm will: (i) compute the VIF values from the correlation matrix
of the variables selected in ...; (ii) sort the
VIF values and remove the variable with the highest VIF; and (iii)
repeat step (ii) until the highest remaining VIF value is less than or equal to
max_vif.
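The elimination loop described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the helper name vif_select() and its use argument are hypothetical, and it assumes that the VIFs are taken as the diagonal of the inverse correlation matrix and recomputed after every removal. It will fail with an error from solve() if the correlation matrix is singular.

```r
# Hypothetical helper illustrating VIF-based backward elimination.
# It is NOT the implementation used by non_collinear_vars().
vif_select <- function(x, max_vif, use = "pairwise.complete.obs") {
  x <- as.data.frame(x)
  # Keep only the numeric columns
  x <- x[vapply(x, is.numeric, logical(1))]
  removed <- character(0)
  repeat {
    # VIFs are the diagonal elements of the inverse correlation matrix
    vif <- diag(solve(stats::cor(x, use = use)))
    names(vif) <- names(x)
    # Stop when every VIF is acceptable or only one variable remains
    if (max(vif) <= max_vif || ncol(x) < 2) break
    # Drop the variable with the highest VIF, then recompute
    worst <- names(vif)[which.max(vif)]
    removed <- c(removed, worst)
    x <- x[, names(x) != worst, drop = FALSE]
  }
  list(selected = names(x), removed = removed, vif = vif)
}

# Usage on a built-in data set
res <- vif_select(mtcars, max_vif = 5)
res$selected
res$removed
```

Recomputing the VIFs after each removal matters because dropping one collinear variable usually lowers the VIFs of the variables it was correlated with.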
Arguments
- .data
- The data set containing the variables. 
- ...
- Variables to be submitted to selection. If ... is null, then all the numeric variables from .data are used. It must be a single variable name or a comma-separated list of unquoted variable names.
- max_vif
- The maximum value for the Variance Inflation Factor (threshold) that will be accepted in the set of selected predictors. 
- missingval
- How to deal with missing values. For more information, please see stats::cor().
Value
A data frame showing the number of selected predictors, maximum VIF value, condition number, determinant value, selected predictors and removed predictors from the original set of variables.
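As a rough illustration of the numeric diagnostics in this data frame, the sketch below computes them from a correlation matrix in base R. The helper name collinearity_summary() is hypothetical, and the definitions are assumptions rather than confirmed details of the function: the condition number is taken here as the ratio of the largest to the smallest eigenvalue of the correlation matrix, and the determinant as that of the same matrix.

```r
# Hypothetical helper; not part of metan. The definitions used here are
# common conventions and may differ from what non_collinear_vars() reports.
collinearity_summary <- function(x, use = "pairwise.complete.obs") {
  r <- stats::cor(x, use = use)
  # Eigenvalues of the (symmetric) correlation matrix
  ev <- eigen(r, symmetric = TRUE, only.values = TRUE)$values
  data.frame(
    Parameter = c("Predictors", "VIF", "Condition Number", "Determinant"),
    values = c(
      ncol(r),               # number of predictors
      max(diag(solve(r))),   # largest VIF
      max(ev) / min(ev),     # condition number (assumed definition)
      det(r)                 # determinant of the correlation matrix
    )
  )
}

# Usage on three columns of a built-in data set
collinearity_summary(mtcars[, c("mpg", "hp", "wt")])
```

A determinant close to 0 or a large condition number both indicate that the remaining predictors are still strongly collinear.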
Examples
# \donttest{
library(metan)
# All numeric variables
non_collinear_vars(data_ge2)
#>          Parameter                                       values
#> 1       Predictors                                           10
#> 2              VIF                                         7.16
#> 3 Condition Number                                       56.797
#> 4      Determinant                                 0.0008810515
#> 5         Selected PERK, EP, CDED, NKR, PH, NR, TKW, EL, CD, ED
#> 6          Removed                          EH, CL, CW, KW, NKE
# Select specific variables and set the VIF threshold to 5
non_collinear_vars(data_ge2, EH, CL, CW, KW, NKE, max_vif = 5)
#>          Parameter          values
#> 1       Predictors               4
#> 2              VIF           2.934
#> 3 Condition Number          11.248
#> 4      Determinant    0.2400583901
#> 5         Selected NKE, EH, CL, CW
#> 6          Removed              KW
# }