[Stable]

The Percentile distribution is a non-parametric distribution defined by a set of values (quantiles) at specified percentiles. It is useful for representing empirical distributions or elicited expert knowledge when only percentile information is available. The c.d.f. and quantile function are obtained by linear interpolation between the specified percentiles, so the distribution can approximate complex distributions that have no simple parametric form.

dist_percentile(x, percentile)

Arguments

x

A list of numeric vectors, one per distribution, giving the values at the specified percentiles.

percentile

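A list of numeric vectors, one per distribution, giving the percentiles (on the 0 to 100 scale) that correspond to the values in x.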

Details

We recommend reading this documentation on pkgdown, which renders math nicely: https://pkg.mitchelloharawild.com/distributional/reference/dist_percentile.html

In the following, let \(X\) be a Percentile random variable defined by values \(x_1, x_2, \ldots, x_n\) at percentiles \(p_1, p_2, \ldots, p_n\) where \(0 \le p_i \le 100\).

Support: \([\min(x_i), \max(x_i)]\) if \(\min(p_i) > 0\) or \(\max(p_i) < 100\), otherwise support is approximated from the specified percentiles.

Mean: Approximated numerically using spline interpolation and numerical integration:

$$ E(X) \approx \int_0^1 Q(u) du $$

where \(Q(u)\) is a spline function interpolating the percentile values.
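The mean approximation can be illustrated with a minimal base R sketch. It is not taken from the package source: the values `x`, the percentiles `p`, and the choice of `stats::splinefun()` with its default method are assumptions made for illustration. The example data follow \(Q(u) = u^2\), whose exact mean is \(1/3\).

```r
# Hypothetical percentile specification with Q(u) = u^2 (illustration only)
p <- c(0, 25, 50, 75, 100)   # percentiles on the 0-100 scale
x <- (p / 100)^2             # values at those percentiles

# Spline interpolation of the quantile function on the probability scale
Q <- stats::splinefun(p / 100, x)

# E(X) is approximated by integrating Q(u) over u in [0, 1]
mean_hat <- stats::integrate(Q, lower = 0, upper = 1)$value
mean_hat   # close to 1/3
```

The sketch uses percentiles that span 0 to 100, so the integral never evaluates the spline outside the specified range.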

Variance: Approximated numerically.

Probability density function (p.d.f.): Approximated numerically by kernel density estimation on samples generated from the distribution.
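A rough base R sketch of this idea follows. It assumes inverse transform sampling through a linearly interpolated quantile function, and the sample size of 1000, the default `stats::density()` bandwidth, and all object names are illustrative choices rather than the package's documented settings.

```r
set.seed(1)

# Hypothetical percentile specification (illustration only)
x <- c(-2, -1, 0, 1, 2)
p <- c(0, 20, 50, 80, 100)

# Linearly interpolated quantile function on the probability scale
qf <- function(u) stats::approx(x = p / 100, y = x, xout = u)$y

# Generate samples by passing uniform draws through the quantile function
samples <- qf(stats::runif(1000))

# Kernel density estimate, interpolated so it can be evaluated at any point;
# points outside the estimated grid are given density 0
kde <- stats::density(samples)
pdf_hat <- function(t) {
  stats::approx(kde$x, kde$y, xout = t, yleft = 0, yright = 0)$y
}

pdf_hat(c(-1, 0, 1))
```

Because the estimate depends on random samples, repeated evaluations can differ unless the seed is fixed.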

Cumulative distribution function (c.d.f.): Defined by linear interpolation:

$$ F(t) = \begin{cases} p_1/100 & \text{if } t < x_1 \\ p_i/100 + \frac{(t - x_i)(p_{i+1} - p_i)}{100(x_{i+1} - x_i)} & \text{if } x_i \le t < x_{i+1} \\ p_n/100 & \text{if } t \ge x_n \end{cases} $$
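The piecewise formula above can be checked with a small base R sketch. The values `x`, the percentiles `p`, and the helper name `cdf_interp` are hypothetical, and `stats::approx()` with `rule = 2` is used here only because it reproduces the three cases of the formula, not because the package is known to call it this way.

```r
# Hypothetical percentile specification (illustration only)
x <- c(-1, 0, 2)     # values x_1, ..., x_n
p <- c(10, 50, 90)   # percentiles p_1, ..., p_n on the 0-100 scale

# Linear interpolation of p/100 against x; rule = 2 holds the end values
# p_1/100 and p_n/100 constant outside [x_1, x_n]
cdf_interp <- function(t) {
  stats::approx(x = x, y = p / 100, xout = t, rule = 2)$y
}

cdf_interp(c(-2, -0.5, 1, 3))
#> [1] 0.1 0.3 0.7 0.9
```

For example, at \(t = 1\) the middle case gives \(50/100 + (1 - 0)(90 - 50) / (100(2 - 0)) = 0.7\).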

Quantile function: Defined by linear interpolation:

$$ Q(u) = x_i + \frac{(100u - p_i)(x_{i+1} - x_i)}{p_{i+1} - p_i} $$

for \(p_i/100 \le u \le p_{i+1}/100\).
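As a hedged illustration of the quantile formula, the base R sketch below interpolates the values against the percentiles on the probability scale. The data and the helper name `quantile_interp` are hypothetical. With the default `rule = 1`, `stats::approx()` returns `NA` for probabilities outside \([p_1/100, p_n/100]\), where the formula is not defined.

```r
# Hypothetical percentile specification (illustration only)
x <- c(-1, 0, 2)     # values x_1, ..., x_n
p <- c(10, 50, 90)   # percentiles p_1, ..., p_n on the 0-100 scale

# Linear interpolation of x against p/100
quantile_interp <- function(u) {
  stats::approx(x = p / 100, y = x, xout = u)$y
}

quantile_interp(c(0.1, 0.3, 0.7, 0.9))
#> [1] -1.0 -0.5  1.0  2.0
```

For example, \(u = 0.3\) lies between \(p_1/100 = 0.1\) and \(p_2/100 = 0.5\), so \(Q(0.3) = -1 + (30 - 10)(0 - (-1)) / (50 - 10) = -0.5\).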

Examples

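# Approximate a standard normal distribution by its 1st to 99th percentiles
dist <- dist_normal()
percentiles <- seq(0.01, 0.99, by = 0.01)
# Evaluate the normal quantile function at each probability
x <- vapply(percentiles, quantile, double(1L), x = dist)
# Percentiles are supplied on the 0 to 100 scale
dist_percentile(list(x), list(percentiles*100))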
#> <distribution[1]>
#> [1] percentile[99]