[Stable]

The multinomial distribution generalizes the binomial distribution to more than two categories. One way to think about it: first extend a dist_bernoulli() distribution to more than two categories, which gives a dist_categorical() distribution, then repeat the Categorical experiment \(n\) times and count how often each category occurs.
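As a rough illustration of this construction, the following base R sketch draws from a Categorical distribution several times and tabulates the results. It does not use distributional, and the variable names, probabilities, and seed are arbitrary choices made for the example.

```r
# Base R only; all object names here are illustrative.
set.seed(1)                # arbitrary seed, only for reproducibility
p <- c(0.3, 0.5, 0.2)      # category probabilities, summing to one
n <- 4                     # number of Categorical draws (the `size`)

# Each Categorical draw picks a single category label from 1..k.
draws <- sample(seq_along(p), size = n, replace = TRUE, prob = p)

# Counting how often each category occurred gives one Multinomial draw.
counts <- tabulate(draws, nbins = length(p))
counts
sum(counts)  # equals n: every draw lands in exactly one category
```

The resulting count vector always has one entry per category and sums to the number of draws.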

dist_multinomial(size, prob)

Arguments

size

The number of draws from the Categorical distribution.

prob

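The probability of each category occurring in a single draw, given as a vector that sums to one. To define several distributions at once, supply a list of such vectors, as in the examples.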

Details

We recommend reading this documentation on pkgdown, which renders math nicely: https://pkg.mitchelloharawild.com/distributional/reference/dist_multinomial.html

In the following, let \(X = (X_1, ..., X_k)\) be a Multinomial random variable with category probabilities prob = \(p\). Note that \(p\) is a vector with \(k\) elements that sum to one. Assume that we repeat the Categorical experiment size = \(n\) times.

Support: Each \(X_i\) is in \(\{0, 1, 2, ..., n\}\), subject to \(\sum_{i=1}^k X_i = n\).

Mean: The mean of \(X_i\) is \(n p_i\).

Variance: The variance of \(X_i\) is \(n p_i (1 - p_i)\). For \(i \neq j\), the covariance of \(X_i\) and \(X_j\) is \(-n p_i p_j\).
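The mean, variance, and covariance formulas can be collected into a mean vector and a covariance matrix. The sketch below uses base R only; the names `mu` and `Sigma` are illustrative and not part of the package.

```r
p <- c(0.3, 0.5, 0.2)   # category probabilities
n <- 4                  # number of draws

# Mean vector: n * p_i for each category.
mu <- n * p

# Covariance matrix: n * (diag(p) - p p').
# Diagonal entries are n p_i (1 - p_i); off-diagonal entries are -n p_i p_j.
Sigma <- n * (diag(p) - outer(p, p))

mu
diag(Sigma)      # the variances of X_1, ..., X_k
Sigma[1, 2]      # the covariance of X_1 and X_2
rowSums(Sigma)   # zero up to rounding, because the counts always sum to n
```

With these values the diagonal reproduces the first row of the variance(dist) output in the examples (0.84, 1.00, 0.64).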

Probability mass function (p.m.f):

$$ P(X_1 = x_1, ..., X_k = x_k) = \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot \ldots \cdot p_k^{x_k} $$

where \(\sum_{i=1}^k x_i = n\) and \(\sum_{i=1}^k p_i = 1\).
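To make the formula concrete, here is a minimal base R sketch that evaluates the p.m.f. directly and compares it with stats::dmultinom(). The helper name `multinomial_pmf` is hypothetical and is not how the package necessarily computes the density.

```r
# Hypothetical helper: direct evaluation of the p.m.f. formula.
multinomial_pmf <- function(x, prob) {
  n <- sum(x)  # the counts must sum to the number of draws
  # Multinomial coefficient n! / (x_1! ... x_k!) times prod(p_i^x_i).
  factorial(n) / prod(factorial(x)) * prod(prob^x)
}

x <- c(1, 2, 1)
p <- c(0.3, 0.5, 0.2)

multinomial_pmf(x, p)     # 12 * 0.3 * 0.5^2 * 0.2 = 0.18
dmultinom(x, prob = p)    # base R reference value
```

For large counts the factorials overflow, so a practical implementation would more likely work on the log scale (for example with lgamma()); the direct form is shown here only because it mirrors the formula.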

Cumulative distribution function (c.d.f):

$$ P(X_1 \le q_1, ..., X_k \le q_k) = \sum_{\substack{x_1, \ldots, x_k \ge 0 \\ x_i \le q_i \text{ for all } i \\ \sum_{i=1}^k x_i = n}} \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot \ldots \cdot p_k^{x_k} $$

The c.d.f. is computed as a finite sum of the p.m.f. over all integer vectors in the support that satisfy the componentwise inequalities.
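The finite sum can be sketched in base R by enumerating the admissible count vectors. The function name `multinomial_cdf` is hypothetical, it assumes `q` holds non-negative integers, and it is not claimed to match the package's internal implementation.

```r
# Hypothetical helper: brute-force c.d.f. by enumerating the support.
# Assumes q contains non-negative integers.
multinomial_cdf <- function(q, size, prob) {
  # Candidate values for each component: 0, 1, ..., min(q_i, size).
  grid <- expand.grid(lapply(pmin(q, size), function(m) 0:m))
  # Keep only the vectors whose components sum to `size`.
  support <- as.matrix(grid[rowSums(grid) == size, , drop = FALSE])
  if (nrow(support) == 0) return(0)
  # Add up the p.m.f. over the remaining vectors.
  sum(apply(support, 1, dmultinom, size = size, prob = prob))
}

multinomial_cdf(c(1, 2, 1), size = 4, prob = c(0.3, 0.5, 0.2))
multinomial_cdf(c(1, 2, 1), size = 3, prob = c(0.1, 0.5, 0.4))
```

In the second call three vectors qualify, (0, 2, 1), (1, 1, 1), and (1, 2, 0), with probabilities 0.3, 0.12, and 0.075. Enumeration grows quickly with \(k\) and \(n\), so this is only practical for small problems.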

Moment generating function (m.g.f):

$$ E(e^{t'X}) = \left(\sum_{i=1}^k p_i e^{t_i}\right)^n $$

where \(t = (t_1, ..., t_k)\) is a vector of the same dimension as \(X\).
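As a quick consistency check, the mean can be recovered from the m.g.f. by differentiating with respect to \(t_i\):

$$ \frac{\partial}{\partial t_i} E(e^{t'X}) = n \left(\sum_{j=1}^k p_j e^{t_j}\right)^{n-1} p_i e^{t_i} $$

Evaluating at \(t = 0\) and using \(\sum_{j=1}^k p_j = 1\) gives \(E(X_i) = n p_i\), in agreement with the mean stated above.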

Skewness: The skewness of \(X_i\) is

$$ \frac{1 - 2p_i}{\sqrt{n p_i (1 - p_i)}} $$

Excess Kurtosis: The excess kurtosis of \(X_i\) is

$$ \frac{1 - 6p_i(1 - p_i)}{n p_i (1 - p_i)} $$
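Both formulas are the same as for a Binomial distribution with size \(n\) and probability \(p_i\), because each \(X_i\) is marginally binomial. A short base R sketch evaluating them for one set of illustrative values (the variable names are arbitrary):

```r
p <- c(0.3, 0.5, 0.2)   # category probabilities
n <- 4                  # number of draws

v <- n * p * (1 - p)                        # variance of each X_i

skew <- (1 - 2 * p) / sqrt(v)               # skewness of each X_i
ex_kurt <- (1 - 6 * p * (1 - p)) / v        # excess kurtosis of each X_i

skew      # zero for the category with p_i = 0.5
ex_kurt   # -0.5 for the category with p_i = 0.5
```

The skewness is positive for categories with \(p_i < 0.5\) and negative for those with \(p_i > 0.5\), and both quantities shrink toward zero as \(n\) grows.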

Examples

dist <- dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))

dist
#> <distribution[2]>
#> [1] Multinomial(4)[3] Multinomial(3)[3]
mean(dist)
#>      [,1] [,2] [,3]
#> [1,]  1.2  2.0  0.8
#> [2,]  0.3  1.5  1.2
variance(dist)
#>      [,1] [,2] [,3]
#> [1,] 0.84 1.00 0.64
#> [2,] 0.27 0.75 0.72

generate(dist, 10)
#> [[1]]
#>       [,1] [,2] [,3]
#>  [1,]    1    1    2
#>  [2,]    0    3    1
#>  [3,]    0    2    2
#>  [4,]    2    1    1
#>  [5,]    1    3    0
#>  [6,]    1    3    0
#>  [7,]    1    3    0
#>  [8,]    1    2    1
#>  [9,]    1    1    2
#> [10,]    1    3    0
#> 
#> [[2]]
#>       [,1] [,2] [,3]
#>  [1,]    0    0    3
#>  [2,]    0    2    1
#>  [3,]    0    2    1
#>  [4,]    0    1    2
#>  [5,]    0    2    1
#>  [6,]    0    2    1
#>  [7,]    1    1    1
#>  [8,]    0    1    2
#>  [9,]    1    1    1
#> [10,]    0    0    3
#> 

density(dist, list(d = rbind(cbind(1,2,1), cbind(0,2,1))))
#>      d
#> 1 0.18
#> 2 0.30
density(dist, list(d = rbind(cbind(1,2,1), cbind(0,2,1))), log = TRUE)
#>           d
#> 1 -1.714798
#> 2 -1.203973

cdf(dist, cbind(1,2,1))
#> [1] 0.180 0.495