---
title: "Generalized fiducial inference on quantiles"
author: "Stéphane Laurent"
date: '2020-11-27'
tags: R, statistics
rbloggers: yes
output:
  md_document:
    variant: markdown
    preserve_yaml: true
  html_document:
    keep_md: no
highlighter: pandoc-solarized
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  message = FALSE,
  warning = FALSE,
  collapse = TRUE,
  fig.width = 9, fig.height = 5, 
  fig.path="./figures/gfiExtremes-"
)
```

My new package 'gfiExtremes' is on 
[CRAN](https://cran.r-project.org/web/packages/gfiExtremes/index.html) now. 
So it is time to present it.

This package allows to get confidence intervals about the quantiles of any 
reasonable distribution (although the inference is based on a parametric model).
The statistical inference is fiducial.

To give an illustration, I'm taking a sample of length 100 randomly generated 
from a Weibull distribution:

```{r weibullSample}
set.seed(1111111111L)
X <- rweibull(100L, shape = 1.5)
plot(X, pch = 19, main = "Data")
```

The model used for the fiducial inference assumes a generalized Pareto 
distribution above a certain threshold. For an unknown value of this threshold, 
the function to use is `gfigpd2`. It runs a MCMC sampler, and one has to 
specify the length of the burnin phase, the desired length of the MCMC chains 
after the burnin, and the thin value (e.g. a thin of 2 means that one sampled 
value over two is dropped). One also has to specify the desired probability 
levels of the quantiles we are interested in.

```{r gfigpd2}
library(gfiExtremes)
chains <- gfigpd2(
  X,                                         # data
  beta = c(99, 99.5, 99.9)/100,              # probability levels
  threshold.init = 0.7,                      # initial threshold value
  burnin = 20000L, iter = 20000L, thin = 10L # MCMC chains
)
```

By default, `gfigpd2` runs four MCMC chains and they are generated in parallel.

The output of `gfigpd2` is a R object ready for analysis with the 'coda' 
package, which is loaded by 'gfiExtremes'. In particular, it has a `summary` 
method:

```{r summary}
summary(chains)
```

The 'coda' package provides the `HPDinterval` function which gives the shortest 
confidence intervals:

```{r HPDintervals}
HPDinterval(joinMCMCchains(chains))
```

Below are the true values of the Weibull quantiles; they are caught by the 
confidence intervals:

```{r quantiles}
qweibull(c(99, 99.5, 99.9)/100, shape = 1.5)
```

## Convergence diagnostics

Now one has to check that the MCMC chains have entered in their stationary 
phase. It is better to take the logarithm of the simulations of the fiducial 
distributions of the quantiles:

```{r logChains}
logChains <- as.mcmc.list(lapply(chains, log))
```

The 'ggmcmc' package is helpful here. Firstly, let's have a look at the traces:

```{r traceplot}
library(ggmcmc)
gglogChains <- ggs(logChains)
library(ggthemes)
ggs_traceplot(gglogChains) + theme_fivethirtyeight()
```

Visually, nothing indicates a departure from the convergence. Let's look at the 
estimated densities now:

```{r densities}
ggs_density(gglogChains) + theme_fivethirtyeight()
```

The running means quickly stabilize:

```{r runningMeans}
ggs_running(gglogChains) + theme_fivethirtyeight()
```

Below are the densities of the whole chains compared with the densities of their 
last part:

```{r comparePartial}
ggs_compare_partial(gglogChains) + theme_fivethirtyeight()
```

The autocorrelations nicely decrease:

```{r autocorrelations}
ggs_autocorrelation(gglogChains) + theme_fivethirtyeight()
```

Let's also have a look at the Gelman-Rubin diagnostic:

```{r gelmanRubin}
gelman.diag(logChains)
```

The upper `Rhat` are close to 1, thereby indicating a successful diagnostic.

Finally, let's look at the Heidelberger & Welch diagnostic:

```{r heidelWelch}
heidel.diag(logChains)
```

All tests passed.