The Multinomial distribution

The multinomial distribution is a generalization of the binomial distribution to multiple categories. One way to think of it is in two steps. First, extend a dist_bernoulli() distribution to more than two categories, which gives a dist_categorical() distribution. Then repeat the Categorical experiment several ($n$) times and count how often each category occurs.
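The construction described above can be sketched in base R, without the package: draw $n$ Categorical outcomes with sample() and count how many fall in each category. The values n = 4 and p = (0.3, 0.5, 0.2) are illustrative assumptions, and this is not necessarily how dist_multinomial() generates values internally.

```r
set.seed(1)                      # for reproducibility of this sketch
n <- 4                           # number of Categorical draws (size); assumed value
p <- c(0.3, 0.5, 0.2)            # category probabilities (prob); assumed values

# n independent Categorical draws: each is a category label in 1..k
draws <- sample(seq_along(p), size = n, replace = TRUE, prob = p)

# Counting the labels per category gives one Multinomial(n, p) observation
x <- tabulate(draws, nbins = length(p))

sum(x) == n                      # the counts always add up to n
```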

dist_multinomial(size, prob)

Arguments

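size: The number of draws from the Categorical distribution.
prob: The probability of each category on a single draw. For each distribution this is a vector with one element per category that sums to one; supply a list of such vectors to create several distributions at once.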

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_multinomial.html

In the following, let $X = (X_1, ..., X_k)$ be a Multinomial random variable with category probabilities prob = $p$. Note that $p$ is a vector with $k$ elements that sum to one. Assume that we repeat the Categorical experiment size = $n$ times.

Support: Each $X_i$ is in $\{0, 1, 2, ..., n\}$, and the components satisfy $\sum_{i=1}^k X_i = n$.

Mean: The mean of $X_i$ is $n p_i$.

Variance: The variance of $X_i$ is $n p_i (1 - p_i)$. For $i \neq j$, the covariance of $X_i$ and $X_j$ is $-n p_i p_j$.
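The mean, variance and covariance formulas above can be collected into a mean vector and a covariance matrix. The following is a minimal base R sketch with assumed values n = 4 and p = (0.3, 0.5, 0.2); the names mu and Sigma are illustrative and not part of the package.

```r
n <- 4                                       # assumed number of draws
p <- c(0.3, 0.5, 0.2)                        # assumed category probabilities

mu <- n * p                                  # means n * p_i
Sigma <- n * (diag(p) - outer(p, p))         # full covariance matrix

mu
diag(Sigma)      # variances n * p_i * (1 - p_i)
Sigma[1, 2]      # covariance of X_1 and X_2: -n * p_1 * p_2
rowSums(Sigma)   # each row sums to zero (up to rounding) because sum(X) = n
```

The negative off-diagonal entries reflect the sum constraint: with $n$ fixed, a larger count in one category leaves fewer draws for the others.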

Probability mass function (p.m.f.):

$$ P(X_1 = x_1, ..., X_k = x_k) = \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot \ldots \cdot p_k^{x_k} $$

where $\sum_{i=1}^k x_i = n$ and $\sum_{i=1}^k p_i = 1$.
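As a worked instance of the p.m.f., the sketch below evaluates the formula by hand for assumed values n = 4, p = (0.3, 0.5, 0.2) and x = (1, 2, 1), then compares it with stats::dmultinom() from base R. The variable names are illustrative.

```r
n <- 4
p <- c(0.3, 0.5, 0.2)
x <- c(1, 2, 1)                  # counts per category, sum(x) == n

# Multinomial coefficient n! / (x_1! * ... * x_k!)
coef <- factorial(n) / prod(factorial(x))

by_hand <- coef * prod(p^x)      # 12 * 0.3 * 0.25 * 0.2 = 0.18
builtin <- stats::dmultinom(x, size = n, prob = p)

all.equal(by_hand, builtin)
```

The value 0.18 agrees with the first density reported in the Examples section.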

Cumulative distribution function (c.d.f.):

$$ P(X_1 \le q_1, ..., X_k \le q_k) = \sum_{\substack{x_1, \ldots, x_k \ge 0 \\ x_i \le q_i \text{ for all } i \\ \sum_{i=1}^k x_i = n}} \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot \ldots \cdot p_k^{x_k} $$

The c.d.f. is computed as a finite sum of the p.m.f. over all integer vectors in the support that satisfy the componentwise inequalities.
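The finite sum can be reproduced by brute-force enumeration. This base R sketch assumes n = 3, p = (0.1, 0.5, 0.4) and bounds q = (1, 2, 1); it illustrates the definition only and is not necessarily how cdf() is implemented in the package.

```r
n <- 3
p <- c(0.1, 0.5, 0.4)
q <- c(1, 2, 1)                  # componentwise upper bounds

# All integer vectors with 0 <= x_i <= q_i ...
grid <- as.matrix(expand.grid(lapply(q, function(qi) 0:qi)))
# ... restricted to those in the support (counts sum to n)
grid <- grid[rowSums(grid) == n, , drop = FALSE]

# Sum the p.m.f. over the remaining vectors
cdf_value <- sum(apply(grid, 1, stats::dmultinom, size = n, prob = p))
cdf_value
```

Only three vectors survive the filter, (1, 2, 0), (0, 2, 1) and (1, 1, 1), with probabilities 0.075, 0.3 and 0.12, which add up to 0.495. Enumeration grows quickly with $k$ and $n$, so this approach is practical only for small problems.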

Moment generating function (m.g.f.):

$$ E(e^{t'X}) = \left(\sum_{i=1}^k p_i e^{t_i}\right)^n $$

where $t = (t_1, ..., t_k)$ is a vector of the same dimension as $X$.
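
As a consistency check, differentiating the m.g.f. with respect to $t_i$ and evaluating at $t = 0$ recovers the mean of $X_i$ given above, using $\sum_{j=1}^k p_j = 1$:

$$ \frac{\partial}{\partial t_i} \left(\sum_{j=1}^k p_j e^{t_j}\right)^n \Bigg|_{t = 0} = n \left(\sum_{j=1}^k p_j e^{t_j}\right)^{n-1} p_i e^{t_i} \Bigg|_{t = 0} = n p_i $$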

Skewness: The skewness of $X_i$ is

$$ \frac{1 - 2p_i}{\sqrt{n p_i (1 - p_i)}} $$

Excess Kurtosis: The excess kurtosis of $X_i$ is

$$ \frac{1 - 6p_i(1 - p_i)}{n p_i (1 - p_i)} $$
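
Each $X_i$ is marginally Binomial with parameters $n$ and $p_i$, so the skewness and excess kurtosis formulas are the binomial ones applied per category. A short base R sketch with assumed values n = 4 and p = (0.3, 0.5, 0.2), using illustrative variable names:

```r
n <- 4
p <- c(0.3, 0.5, 0.2)

v <- n * p * (1 - p)                         # marginal variances

skew <- (1 - 2 * p) / sqrt(v)                # skewness of each X_i
exkurt <- (1 - 6 * p * (1 - p)) / v          # excess kurtosis of each X_i

skew     # zero for the category with p_i = 0.5, positive when p_i < 0.5
exkurt   # -0.5 for the category with p_i = 0.5
```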

Examples

dist <- dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))

dist
#> <distribution[2]>
#> [1] Multinomial(4)[3] Multinomial(3)[3]
mean(dist)
#>      [,1] [,2] [,3]
#> [1,]  1.2  2.0  0.8
#> [2,]  0.3  1.5  1.2
variance(dist)
#>      [,1] [,2] [,3]
#> [1,] 0.84 1.00 0.64
#> [2,] 0.27 0.75 0.72

generate(dist, 10)
#> [[1]]
#>       [,1] [,2] [,3]
#>  [1,]    1    1    2
#>  [2,]    0    3    1
#>  [3,]    0    2    2
#>  [4,]    2    1    1
#>  [5,]    1    3    0
#>  [6,]    1    3    0
#>  [7,]    1    3    0
#>  [8,]    1    2    1
#>  [9,]    1    1    2
#> [10,]    1    3    0
#> 
#> [[2]]
#>       [,1] [,2] [,3]
#>  [1,]    0    0    3
#>  [2,]    0    2    1
#>  [3,]    0    2    1
#>  [4,]    0    1    2
#>  [5,]    0    2    1
#>  [6,]    0    2    1
#>  [7,]    1    1    1
#>  [8,]    0    1    2
#>  [9,]    1    1    1
#> [10,]    0    0    3
#> 

density(dist, list(d = rbind(cbind(1,2,1), cbind(0,2,1))))
#>      d
#> 1 0.18
#> 2 0.30
density(dist, list(d = rbind(cbind(1,2,1), cbind(0,2,1))), log = TRUE)
#>           d
#> 1 -1.714798
#> 2 -1.203973

cdf(dist, cbind(1,2,1))
#> [1] 0.180 0.495
