The Generalized Pareto Distribution (GPD) is commonly used to model the tails of other distributions, particularly in extreme value theory.
The Pickands–Balkema–de Haan theorem states that, for a large class of distributions, the conditional distribution of exceedances over a sufficiently high threshold is well approximated by a GPD.
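As a minimal sketch of the theorem, consider a case where the approximation happens to be exact: for a Pareto distribution with survival function \(x^{-\alpha}\) on \(x \ge 1\), the excess over a threshold \(u\) follows a GPD with location 0, scale \(u/\alpha\), and shape \(1/\alpha\). The base R code below checks this numerically. The helper names `pareto_surv()` and `gpd_surv()` and all parameter values are illustrative assumptions, not part of the package.

```r
# Survival function of a Pareto distribution on x >= 1 with tail index alpha
# (an illustrative choice, not taken from the package).
pareto_surv <- function(x, alpha) x^(-alpha)

# GPD survival function 1 - F(y) for shape s != 0, using the c.d.f. formula
# given in this documentation.
gpd_surv <- function(y, a, b, s) (1 + s * (y - a) / b)^(-1 / s)

alpha <- 3               # Pareto tail index
u <- 2                   # threshold
y <- c(0.5, 1, 5, 20)    # excesses above the threshold

# P(X - u > y | X > u) for the Pareto distribution
cond_excess <- pareto_surv(u + y, alpha) / pareto_surv(u, alpha)

# Matching GPD: location 0, scale u / alpha, shape 1 / alpha
gpd_excess <- gpd_surv(y, a = 0, b = u / alpha, s = 1 / alpha)

all.equal(cond_excess, gpd_excess)
```

For most other distributions the match is only approximate and improves as the threshold increases.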
dist_gpd(location, scale, shape)
We recommend reading this documentation on pkgdown, which renders math nicely: https://pkg.mitchelloharawild.com/distributional/reference/dist_gpd.html
In the following, let \(X\) be a Generalized Pareto random variable with
parameters location = \(a\), scale = \(b > 0\), and
shape = \(s\).
Support: \(x \ge a\) if \(s \ge 0\), \(a \le x \le a - b/s\) if \(s < 0\)
Mean: $$ E(X) = a + \frac{b}{1 - s} \quad \textrm{for } s < 1 $$ \(E(X) = \infty\) for \(s \ge 1\)
Variance: $$ \textrm{Var}(X) = \frac{b^2}{(1-s)^2(1-2s)} \quad \textrm{for } s < 0.5 $$ \(\textrm{Var}(X) = \infty\) for \(s \ge 0.5\)
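The mean and variance formulas can be checked against numerical integration of the density. The sketch below uses base R only and a negative shape, so that the support is the bounded interval \([a, a - b/s]\). The parameter values and the helper `gpd_pdf()` are illustrative assumptions; the density is transcribed from the p.d.f. formula for \(s \ne 0\) in this documentation.

```r
# Illustrative parameter values with negative shape, so the support is bounded.
a <- 1; b <- 2; s <- -0.5
upper <- a - b / s   # upper end of the support: 1 + 4 = 5

# p.d.f. for s != 0, transcribed from the formula in this documentation.
gpd_pdf <- function(x) (1 / b) * (1 + s * (x - a) / b)^(-1 / s - 1)

# Closed-form mean and variance.
mean_formula <- a + b / (1 - s)
var_formula  <- b^2 / ((1 - s)^2 * (1 - 2 * s))

# The same moments by numerical integration over [a, upper].
mean_num <- integrate(function(x) x * gpd_pdf(x), a, upper)$value
var_num  <- integrate(function(x) (x - mean_num)^2 * gpd_pdf(x), a, upper)$value

c(formula = mean_formula, numeric = mean_num)
c(formula = var_formula, numeric = var_num)
```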
Probability density function (p.d.f.):
For \(s = 0\): $$ f(x) = \frac{1}{b}\exp\left(-\frac{x-a}{b}\right) \quad \textrm{for } x \ge a $$
For \(s \ne 0\): $$ f(x) = \frac{1}{b}\left(1 + s\frac{x-a}{b}\right)^{-1/s - 1} $$ where \(1 + s(x-a)/b > 0\)
Cumulative distribution function (c.d.f.):
For \(s = 0\): $$ F(x) = 1 - \exp\left(-\frac{x-a}{b}\right) \quad \textrm{for } x \ge a $$
For \(s \ne 0\): $$ F(x) = 1 - \left(1 + s\frac{x-a}{b}\right)^{-1/s} $$ where \(1 + s(x-a)/b > 0\)
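The two c.d.f. cases and the support restriction can be combined into one function. The following is a minimal base R sketch under that reading of the formulas; `gpd_cdf()` is a hypothetical helper, not the package's implementation.

```r
# Sketch of the GPD c.d.f. covering both cases and values outside the support.
gpd_cdf <- function(x, a, b, s) {
  z <- (x - a) / b
  if (s == 0) {
    p <- 1 - exp(-z)
  } else {
    # pmax() clamps the base at 0 above the upper endpoint (s < 0),
    # which makes the c.d.f. equal to 1 there.
    p <- 1 - pmax(1 + s * z, 0)^(-1 / s)
  }
  p[x < a] <- 0   # no probability mass below the location parameter
  p
}

gpd_cdf(4, a = 0, b = 1, s = 0)                  # 1 - exp(-4), about 0.9816844
gpd_cdf(c(-1, 2, 10), a = 0, b = 1, s = -0.25)   # support is [0, 4]: 0, 0.9375, 1

# The s != 0 formula approaches the s = 0 formula as s tends to 0.
gpd_cdf(2, a = 0, b = 1, s = 1e-6) - gpd_cdf(2, a = 0, b = 1, s = 0)
```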
Quantile function:
For \(s = 0\): $$ Q(p) = a - b\log(1-p) $$
For \(s \ne 0\): $$ Q(p) = a + \frac{b}{s}\left[(1-p)^{-s} - 1\right] $$
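Because the quantile function is the inverse of the c.d.f., applying one after the other should return the input. The base R sketch below checks this round trip; `gpd_quantile()` and `gpd_cdf()` are hypothetical helpers written directly from the formulas in this documentation, and the parameter values are arbitrary.

```r
# Quantile function, covering both the s = 0 and s != 0 cases.
gpd_quantile <- function(p, a, b, s) {
  if (s == 0) {
    a - b * log(1 - p)
  } else {
    a + (b / s) * ((1 - p)^(-s) - 1)
  }
}

# c.d.f. for s != 0, valid inside the support.
gpd_cdf <- function(x, a, b, s) 1 - (1 + s * (x - a) / b)^(-1 / s)

p <- c(0.1, 0.5, 0.9, 0.99)
x <- gpd_quantile(p, a = 1, b = 2, s = 0.3)
all.equal(gpd_cdf(x, a = 1, b = 2, s = 0.3), p)   # round trip recovers p

gpd_quantile(0.7, a = 0, b = 1, s = 0)    # -log(0.3), about 1.203973
gpd_quantile(1, a = 1, b = 2, s = -0.5)   # upper endpoint a - b/s = 5
```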
Median:
For \(s = 0\): $$ \textrm{Median}(X) = a + b\log(2) $$
For \(s \ne 0\): $$ \textrm{Median}(X) = a + \frac{b}{s}\left(2^s - 1\right) $$
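Both median expressions follow from evaluating the quantile function at \(p = 1/2\). For \(s \ne 0\): $$ Q(1/2) = a + \frac{b}{s}\left[\left(\tfrac{1}{2}\right)^{-s} - 1\right] = a + \frac{b}{s}\left(2^s - 1\right) $$ and the \(s = 0\) case is its limit, since $$ \lim_{s \to 0} \frac{2^s - 1}{s} = \log(2) $$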
Skewness and Kurtosis: approximated numerically. Skewness is finite only for \(s < 1/3\) and kurtosis only for \(s < 1/4\).
dist <- dist_gpd(location = 0, scale = 1, shape = 0)
dist
#> <distribution[1]>
#> [1] GPD(0, 1, 0)
mean(dist)
#> [1] 1
variance(dist)
#> [1] 1
generate(dist, 10)
#> [[1]]
#> [1] 0.02262269 1.99372116 0.62823119 0.30756015 1.10615511 0.19318979
#> [7] 0.45540875 0.15962817 0.73876885 2.20885641
#>
density(dist, 2)
#> [1] 0.1353353
density(dist, 2, log = TRUE)
#> [1] -2
cdf(dist, 4)
#> [1] 0.9816844
quantile(dist, 0.7)
#> [1] 1.203973
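Random draws from a GPD with a non-zero shape can be sketched with inverse transform sampling, which applies the quantile function to uniform random numbers. This is a base R illustration under that assumption, not necessarily how `generate()` is implemented in the package; `gpd_quantile()`, the seed, and the parameter values are hypothetical choices.

```r
# Quantile function from the formulas in this documentation.
gpd_quantile <- function(p, a, b, s) {
  if (s == 0) a - b * log(1 - p) else a + (b / s) * ((1 - p)^(-s) - 1)
}

set.seed(1)          # arbitrary seed, for reproducibility only
u <- runif(1e5)      # uniform draws on (0, 1)
x <- gpd_quantile(u, a = 0, b = 1, s = 0.2)

# The sample median should be close to the closed-form median
# a + (b / s) * (2^s - 1).
sample_median <- median(x)
theory_median <- 0 + (1 / 0.2) * (2^0.2 - 1)
c(sample = sample_median, theory = theory_median)
```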