Rescale a variable to have specified minimum and maximum values

Helper function that rescales a continuous variable to have specified minimum and maximum values.

The function rescales a continuous variable as follows: $$Rv_i = \frac{Nmax - Nmin}{Omax - Omin} \times (O_i - Omax) + Nmax$$ where $Rv_i$ is the rescaled value at the ith position of the variable/vector; $Nmax$ and $Nmin$ are the new maximum and minimum values; $Omax$ and $Omin$ are the maximum and minimum values of the original data; and $O_i$ is the ith value of the original data.
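
The formula can be checked by hand with a few lines of base R. The helper below, `rescale_sketch()`, is a hypothetical name used only for illustration; it is not part of metan and is not claimed to match how resca is implemented internally. It simply applies the equation term by term.

```r
# Hypothetical helper (not part of metan): a base-R sketch of the formula.
rescale_sketch <- function(x, new_min = 0, new_max = 100, na.rm = TRUE) {
  old_min <- min(x, na.rm = na.rm)  # Omin
  old_max <- max(x, na.rm = na.rm)  # Omax
  # Rv_i = (Nmax - Nmin) / (Omax - Omin) * (O_i - Omax) + Nmax
  (new_max - new_min) / (old_max - old_min) * (x - old_max) + new_max
}

# The original maximum maps to new_max and the original minimum to new_min
rescale_sketch(1:5)
#> [1]   0  25  50  75 100
rescale_sketch(c(2, 4, 6), new_min = 0, new_max = 1)
#> [1] 0.0 0.5 1.0
```

In this sketch a constant vector has $Omax = Omin$, so the division yields NaN; the sketch does not guard against that case.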

There are two ways to use resca to rescale a variable. The first is to pass a data frame to the .data argument and select one or more variables to be rescaled using .... The function returns the original variables in .data plus the rescaled variable(s), named with the suffix _res. With group_by from the dplyr package, the variable(s) can be rescaled within each level of the grouping factor. The second is to pass a numeric vector to the values argument, in which case the output is a numeric vector of rescaled values.

Usage

resca(
  .data = NULL,
  ...,
  values = NULL,
  new_min = 0,
  new_max = 100,
  na.rm = TRUE,
  keep = TRUE
)

Arguments

.data: The dataset. Grouped data is allowed.
...: Comma-separated list of unquoted variable names that will be rescaled.
values: Optional vector of values to rescale.
new_min: The minimum value of the new scale. Default is 0.
new_max: The maximum value of the new scale. Default is 100.
na.rm: Remove NA values? Defaults to TRUE.
keep: Should all variables be kept after rescaling? If FALSE, only the rescaled variables are kept.

Value

A numeric vector if values is used as input data or a tibble if a data frame is used as input in .data.

Author

Tiago Olivoto tiagoolivoto@gmail.com

Examples

# \donttest{
library(metan)
library(dplyr)
# Rescale a numeric vector
resca(values = c(1:5))
#> [1]   0  25  50  75 100

# Using a data frame
head(
 resca(data_ge, GY, HM, new_min = 0, new_max = 1)
)
#> # A tibble: 6 × 7
#>   ENV   GEN   REP      GY    HM GY_res HM_res
#>   <fct> <fct> <fct> <dbl> <dbl>  <dbl>  <dbl>
#> 1 E1    G1    1      2.17  44.9  0.338  0.346
#> 2 E1    G1    2      2.50  46.9  0.414  0.445
#> 3 E1    G1    3      2.43  47.8  0.397  0.487
#> 4 E1    G2    1      3.21  45.2  0.574  0.36 
#> 5 E1    G2    2      2.93  45.3  0.512  0.365
#> 6 E1    G2    3      2.56  45.5  0.428  0.375

# Rescale within factors;
# Select variables that start with 'N' and end with 'L';
# Compute the mean of these variables by ENV and GEN;
# Rescale the variables that end with 'L' within ENV;
data_ge2 %>%
  select(ENV, GEN, starts_with("N"), ends_with("L")) %>%
  mean_by(ENV, GEN) %>%
  group_by(ENV) %>%
  resca(ends_with("L")) %>%
  head(n = 13)
#> Adding missing grouping variables: `ENV`
#> # A tibble: 13 × 9
#> # Groups:   ENV [1]
#>    ENV   GEN      NR   NKR   NKE    EL    CL EL_res   CL_res
#>    <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl>
#>  1 A1    H1     16.3  33.3  527.  15.4  28.1   34.4 2.50e+ 1
#>  2 A1    H10    16.7  31.2  515.  16.1  31.4   69.9 8.17e+ 1
#>  3 A1    H11    15.2  34.6  530.  16.6  29.0   98.2 4.09e+ 1
#>  4 A1    H12    17.3  32.7  553.  15.2  29.8   20.9 5.52e+ 1
#>  5 A1    H13    18.7  32.9  611.  14.8  31.2    0   7.85e+ 1
#>  6 A1    H2     19.2  33.5  622.  15.0  26.6   11.0 1.42e-14
#>  7 A1    H3     18.5  34.1  610.  15.5  28.0   35.5 2.41e+ 1
#>  8 A1    H4     16.4  38.3  603.  16.0  27.4   63.1 1.42e+ 1
#>  9 A1    H5     14.5  37.4  539.  15.8  28.3   51.8 2.94e+ 1
#> 10 A1    H6     16.8  35.5  569.  16.7  31.7  100   8.74e+ 1
#> 11 A1    H7     17.9  31.5  544   15.4  30.6   34.8 6.94e+ 1
#> 12 A1    H8     17.2  32.8  542.  15.1  32.4   18.8 1   e+ 2
#> 13 A1    H9     14.9  32.3  489.  15.5  32.0   38.7 9.17e+ 1
# }