The multinomial distribution is a generalization of the binomial
distribution to multiple categories. It is perhaps easiest to think
that we first extend a dist_bernoulli() distribution to include more
than two categories, resulting in a dist_categorical() distribution.
We then repeat the Categorical experiment several (\(n\)) times.
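The construction above can be sketched in base R, without the package itself. The values of `n` and `p` and the seed are illustrative assumptions, and `sample()` plus `tabulate()` stand in for the Categorical experiment and the counting step.

```r
# Sketch: build one Multinomial draw from n repeated Categorical draws.
set.seed(1)                # arbitrary seed, only for reproducibility
n <- 4                     # number of repeated Categorical experiments
p <- c(0.3, 0.5, 0.2)      # category probabilities (must sum to one)

# Each Categorical experiment selects one of the k = 3 categories.
draws <- sample(seq_along(p), size = n, replace = TRUE, prob = p)

# Counting how often each category was selected gives one Multinomial(n, p) draw.
counts <- tabulate(draws, nbins = length(p))
counts
sum(counts) == n           # the category counts always add up to n
```

The count vector has one entry per category, which is why a Multinomial random variable is vector valued.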
dist_multinomial(size, prob)
We recommend reading this documentation on pkgdown, which renders math nicely: https://pkg.mitchelloharawild.com/distributional/reference/dist_multinomial.html
In the following, let \(X = (X_1, ..., X_k)\) be a Multinomial
random variable with category probabilities prob = \(p\). Note that
\(p\) is a vector with \(k\) elements that sum to one. Assume
that we repeat the Categorical experiment size = \(n\) times.
Support: Each \(X_i\) is in \(\{0, 1, 2, ..., n\}\), subject to \(\sum_{i=1}^k X_i = n\).
Mean: The mean of \(X_i\) is \(n p_i\).
Variance: The variance of \(X_i\) is \(n p_i (1 - p_i)\). For \(i \neq j\), the covariance of \(X_i\) and \(X_j\) is \(-n p_i p_j\).
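As a rough numerical check of the mean, variance and covariance formulas, the following sketch simulates draws with base R's `stats::rmultinom()`. The parameter values, seed and sample size are illustrative assumptions, and this is not necessarily how the package computes these quantities.

```r
# Sketch: compare simulated moments with the closed-form expressions.
set.seed(42)               # arbitrary seed, only for reproducibility
n <- 4
p <- c(0.3, 0.5, 0.2)

# rmultinom() returns one draw per column; transpose so that rows are draws.
draws <- t(stats::rmultinom(1e5, size = n, prob = p))

# Theoretical mean vector and covariance matrix:
# diagonal entries are n * p_i * (1 - p_i), off-diagonal entries are -n * p_i * p_j.
mu    <- n * p
sigma <- n * (diag(p) - outer(p, p))

colMeans(draws)            # should be close to mu
cov(draws)                 # should be close to sigma
```

The negative off-diagonal entries reflect the constraint that the counts sum to \(n\): a larger count in one category leaves fewer trials for the others.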
Probability mass function (p.m.f.):
$$ P(X_1 = x_1, ..., X_k = x_k) = \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot \ldots \cdot p_k^{x_k} $$
where \(\sum_{i=1}^k x_i = n\) and \(\sum_{i=1}^k p_i = 1\).
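To make the formula concrete, this sketch evaluates the p.m.f. by hand for one point and compares it with base R's `stats::dmultinom()`. The values of `n`, `p` and `x` are chosen for illustration.

```r
# Sketch: evaluate the multinomial p.m.f. directly from its formula.
n <- 4
p <- c(0.3, 0.5, 0.2)
x <- c(1, 2, 1)            # one point in the support, since sum(x) == n

# Multinomial coefficient n! / (x_1! ... x_k!) times prod(p_i^x_i).
pmf_manual <- factorial(n) / prod(factorial(x)) * prod(p^x)

# Reference value from base R.
pmf_base <- stats::dmultinom(x, size = n, prob = p)

c(manual = pmf_manual, base = pmf_base)
```

Here the coefficient is \(4!/(1!\,2!\,1!) = 12\) and the product of powers is \(0.3 \cdot 0.25 \cdot 0.2 = 0.015\), so both values equal 0.18.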
Cumulative distribution function (c.d.f.):
$$ P(X_1 \le q_1, ..., X_k \le q_k) = \sum_{\substack{x_1, \ldots, x_k \ge 0 \\ x_i \le q_i \text{ for all } i \\ \sum_{i=1}^k x_i = n}} \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot \ldots \cdot p_k^{x_k} $$
The c.d.f. is computed as a finite sum of the p.m.f. over all integer vectors in the support that satisfy the componentwise inequalities.
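The finite sum can be sketched by brute-force enumeration in base R. The helper `multinom_cdf()` is a hypothetical name introduced here for illustration, not a function of the package, and it assumes `q` holds non-negative integers.

```r
# Sketch: multinomial c.d.f. as a sum of the p.m.f. over the admissible support.
# multinom_cdf() is a hypothetical helper; q is assumed to be non-negative integers.
multinom_cdf <- function(q, size, prob) {
  # All integer vectors with 0 <= x_i <= min(q_i, size).
  grid <- as.matrix(expand.grid(lapply(pmin(q, size), function(u) 0:u)))
  # Keep only vectors in the support, i.e. those whose components sum to size.
  grid <- grid[rowSums(grid) == size, , drop = FALSE]
  if (nrow(grid) == 0) return(0)
  # Sum the p.m.f. over the remaining vectors.
  sum(apply(grid, 1, stats::dmultinom, size = size, prob = prob))
}

cdf_a <- multinom_cdf(c(1, 2, 1), size = 4, prob = c(0.3, 0.5, 0.2))
cdf_b <- multinom_cdf(c(1, 2, 1), size = 3, prob = c(0.1, 0.5, 0.4))
c(cdf_a, cdf_b)
```

For `size = 4` only the vector (1, 2, 1) satisfies both the bounds and the sum constraint, giving 0.18. For `size = 3` the vectors (1, 2, 0), (1, 1, 1) and (0, 2, 1) contribute 0.075 + 0.12 + 0.3 = 0.495. Enumeration grows quickly with \(k\) and \(n\), so this is only practical for small problems.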
Moment generating function (m.g.f.):
$$ E(e^{t'X}) = \left(\sum_{i=1}^k p_i e^{t_i}\right)^n $$
where \(t = (t_1, ..., t_k)\) is a vector of the same dimension as \(X\).
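The closed form follows from the multinomial theorem, and it can be checked numerically by summing \(e^{t'x}\) against the p.m.f. over the whole support. The sketch below uses base R only; the values of `n`, `p` and `tt` are arbitrary assumptions for illustration.

```r
# Sketch: verify the m.g.f. by direct summation over the support.
n  <- 3
p  <- c(0.1, 0.5, 0.4)
tt <- c(0.2, -0.1, 0.3)    # arbitrary evaluation point t = (t_1, t_2, t_3)

# Enumerate all count vectors whose components sum to n.
grid <- as.matrix(expand.grid(0:n, 0:n, 0:n))
grid <- grid[rowSums(grid) == n, , drop = FALSE]

# p.m.f. at each support point.
pmf <- apply(grid, 1, stats::dmultinom, size = n, prob = p)

# E[exp(t'X)] by summation, and the closed-form expression.
mgf_sum    <- sum(exp(drop(grid %*% tt)) * pmf)
mgf_closed <- sum(p * exp(tt))^n

all.equal(mgf_sum, mgf_closed)
```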
Skewness: The skewness of \(X_i\) is
$$ \frac{1 - 2p_i}{\sqrt{n p_i (1 - p_i)}} $$
Excess Kurtosis: The excess kurtosis of \(X_i\) is
$$ \frac{1 - 6p_i(1 - p_i)}{n p_i (1 - p_i)} $$
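Both expressions are the Binomial ones, because each component \(X_i\) is marginally Binomial(\(n\), \(p_i\)). The following sketch checks them against standardized central moments computed from `stats::dbinom()`. The values of `n` and `p_i` are illustrative.

```r
# Sketch: skewness and excess kurtosis of one component X_i,
# using its Binomial(n, p_i) marginal distribution.
n   <- 4
p_i <- 0.3

x <- 0:n
w <- stats::dbinom(x, size = n, prob = p_i)   # marginal p.m.f. of X_i

mu     <- sum(w * x)                          # mean, equals n * p_i
sigma2 <- sum(w * (x - mu)^2)                 # variance, equals n * p_i * (1 - p_i)

skew   <- sum(w * (x - mu)^3) / sigma2^1.5    # third standardized moment
exkurt <- sum(w * (x - mu)^4) / sigma2^2 - 3  # fourth standardized moment minus 3

all.equal(skew,   (1 - 2 * p_i) / sqrt(n * p_i * (1 - p_i)))
all.equal(exkurt, (1 - 6 * p_i * (1 - p_i)) / (n * p_i * (1 - p_i)))
```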
dist <- dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))
dist
#> <distribution[2]>
#> [1] Multinomial(4)[3] Multinomial(3)[3]
mean(dist)
#>      [,1] [,2] [,3]
#> [1,]  1.2  2.0  0.8
#> [2,]  0.3  1.5  1.2
variance(dist)
#>      [,1] [,2] [,3]
#> [1,] 0.84 1.00 0.64
#> [2,] 0.27 0.75 0.72
generate(dist, 10)
#> [[1]]
#>       [,1] [,2] [,3]
#>  [1,]    1    1    2
#>  [2,]    0    3    1
#>  [3,]    0    2    2
#>  [4,]    2    1    1
#>  [5,]    1    3    0
#>  [6,]    1    3    0
#>  [7,]    1    3    0
#>  [8,]    1    2    1
#>  [9,]    1    1    2
#> [10,]    1    3    0
#>
#> [[2]]
#>       [,1] [,2] [,3]
#>  [1,]    0    0    3
#>  [2,]    0    2    1
#>  [3,]    0    2    1
#>  [4,]    0    1    2
#>  [5,]    0    2    1
#>  [6,]    0    2    1
#>  [7,]    1    1    1
#>  [8,]    0    1    2
#>  [9,]    1    1    1
#> [10,]    0    0    3
#>
density(dist, list(d = rbind(cbind(1,2,1), cbind(0,2,1))))
#>      d
#> 1 0.18
#> 2 0.30
density(dist, list(d = rbind(cbind(1,2,1), cbind(0,2,1))), log = TRUE)
#>           d
#> 1 -1.714798
#> 2 -1.203973
cdf(dist, cbind(1,2,1))
#> [1] 0.180 0.495