Confidence interval for correlation coefficient

Computes the half-width of the confidence interval for Pearson's correlation coefficient using the nonparametric method proposed by Olivoto et al. (2018).

The half-width of the confidence interval is computed with the following equation:

\[CI_w = 0.45304^r \times 2.25152 \times n^{-0.50089}\]

where \(n\) is the sample size and \(r\) is the correlation coefficient.
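
As a minimal sketch, the equation above can be evaluated directly in base R. The helper name ci_half_width is hypothetical and is not part of metan. Taking the absolute value of r is an assumption: the half-widths printed in the Examples for negative correlations are consistent with it, but this page does not state it.

```r
# Hypothetical helper (not exported by metan): half-width from the equation above.
# Assumption: the magnitude of r is used, so a negative correlation gets the
# same half-width as a positive one of equal strength.
ci_half_width <- function(r, n) {
  0.45304^abs(r) * 2.25152 * n^(-0.50089)
}

# Correlation and sample size taken from the first row of the example output
ci_half_width(r = 0.932, n = 156)
```

With these inputs the result is approximately 0.0858, the value shown in the CI column for that row.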

Usage

corr_ci(
  .data = NA,
  ...,
  r = NULL,
  n = NULL,
  by = NULL,
  sel.var = NULL,
  verbose = TRUE
)

Arguments

.data: The data to be analyzed. It can be a data frame (possibly with grouped data passed from dplyr::group_by()) or a symmetric correlation matrix.
...: Variables for which to compute the confidence interval. If not specified, all the numeric variables from .data are used.
r: If data are not available, the value of the correlation coefficient.
n: The sample size, required if .data is a correlation matrix or if r is supplied.
by: One variable (factor) to compute the function by. It is a shortcut to dplyr::group_by(). To compute the statistics by more than one grouping variable, use dplyr::group_by() instead.
sel.var: A variable to show the correlations with. This omits all the pairwise correlations that do not contain sel.var.
verbose: If verbose = TRUE then some results are shown in the console.

Value

A tibble containing the correlation (Corr), the sample size (n), the half-width of the confidence interval (CI), and the lower (LL) and upper (UL) limits for every pairwise combination of variables.
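
The relation between these columns can be checked by hand. The base R sketch below rebuilds the limits from the Corr and CI values printed in the first rows of the Examples. It assumes the limits are Corr - CI and Corr + CI, which matches the printed output but is not stated on this page.

```r
# Values copied from the first three rows of the printed example output
res <- data.frame(
  V1   = c("PH", "PH", "PH"),
  V2   = c("EH", "EP", "EL"),
  Corr = c(0.932, 0.638, 0.380),
  CI   = c(0.0858, 0.108, 0.133)
)

# Assumed relation: limits are the correlation minus/plus the half-width
res$LL <- res$Corr - res$CI
res$UL <- res$Corr + res$CI
res
```

The rebuilt limits agree with the printed LL and UL columns up to rounding of the inputs. Under this relation the limits are not truncated to the interval from -1 to 1, which is why the first row reports an upper limit of 1.02.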

References

Olivoto, T., A.D.C. Lucio, V.Q. Souza, M. Nardino, M.I. Diel, B.G. Sari, D.K. Krysczun, D. Meira, and C. Meier. 2018. Confidence interval width for Pearson's correlation coefficient: a Gaussian-independent estimator based on sample size and strength of association. Agron. J. 110:1-8. doi:10.2134/agronj2016.04.0196

Author

Tiago Olivoto tiagoolivoto@gmail.com

Examples

# \donttest{
library(metan)

CI1 <- corr_ci(data_ge2)
#> # A tibble: 105 × 7
#>    V1    V2     Corr     n     CI    LL    UL
#>    <chr> <chr> <dbl> <int>  <dbl> <dbl> <dbl>
#>  1 PH    EH    0.932   156 0.0858 0.846 1.02 
#>  2 PH    EP    0.638   156 0.108  0.530 0.747
#>  3 PH    EL    0.380   156 0.133  0.247 0.513
#>  4 PH    ED    0.661   156 0.106  0.555 0.768
#>  5 PH    CL    0.325   156 0.139  0.186 0.464
#>  6 PH    CD    0.315   156 0.140  0.176 0.455
#>  7 PH    CW    0.505   156 0.120  0.384 0.625
#>  8 PH    KW    0.753   156 0.0988 0.655 0.852
#>  9 PH    NR    0.329   156 0.138  0.190 0.467
#> 10 PH    NKR   0.353   156 0.136  0.217 0.489
#> # … with 95 more rows

# By each level of the factor 'ENV'
CI2 <- corr_ci(data_ge2, CD, TKW, NKE,
               by = ENV,
               verbose = FALSE)
CI2
#> # A tibble: 12 × 8
#>    ENV   V1    V2       Corr     n    CI      LL      UL
#>    <fct> <chr> <chr>   <dbl> <int> <dbl>   <dbl>   <dbl>
#>  1 A1    CD    TKW    0.385     39 0.265  0.120   0.650 
#>  2 A1    CD    NKE   -0.0205    39 0.354 -0.374   0.333 
#>  3 A1    TKW   NKE   -0.589     39 0.225 -0.814  -0.363 
#>  4 A2    CD    TKW    0.518     39 0.238  0.280   0.756 
#>  5 A2    CD    NKE    0.710     39 0.205  0.505   0.915 
#>  6 A2    TKW   NKE    0.0755    39 0.338 -0.263   0.414 
#>  7 A3    CD    TKW    0.270     39 0.290 -0.0200  0.560 
#>  8 A3    CD    NKE    0.271     39 0.290 -0.0194  0.561 
#>  9 A3    TKW   NKE   -0.389     39 0.264 -0.653  -0.125 
#> 10 A4    CD    TKW    0.417     39 0.258  0.158   0.675 
#> 11 A4    CD    NKE    0.477     39 0.246  0.230   0.723 
#> 12 A4    TKW   NKE   -0.259     39 0.293 -0.552   0.0334
# }