Maturing lifecycle

dist_multinomial(size, prob)

Arguments

size

integer, say \(N\), specifying the total number of objects that are put into \(K\) boxes in the typical multinomial experiment (written \(n\) in the Details section). For stats::dmultinom(), it defaults to sum(x).

prob

numeric non-negative vector of length \(K\), specifying the probability for the \(K\) classes; it is internally normalized to sum to 1. Infinite and missing values are not allowed.
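The internal normalization can be illustrated with base R's stats::dmultinom(), which the See also section points to. This is a minimal sketch that assumes dist_multinomial() rescales prob in the same way; the count vector x and the weights are made up for illustration.

```r
# Base R only: stats::dmultinom() rescales `prob` so that it sums to 1.
x <- c(1, 2, 1)  # illustrative counts for K = 3 classes

d_weights <- stats::dmultinom(x, prob = c(3, 5, 2))        # unnormalized weights
d_probs   <- stats::dmultinom(x, prob = c(0.3, 0.5, 0.2))  # proper probabilities

# Both calls describe the same distribution, so the densities agree.
isTRUE(all.equal(d_weights, d_probs))
```

Passing weights instead of probabilities is convenient when the classes come from raw frequencies, but normalizing explicitly keeps the intent clearer.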

Details

The multinomial distribution is a generalization of the binomial distribution to multiple categories. It is perhaps easiest to think that we first extend a dist_bernoulli() distribution to include more than two categories, resulting in a categorical distribution. We then repeat the categorical experiment several (\(n\)) times and count how often each category occurs.
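The construction described above can be sketched in base R: draw one category per repetition, then count the draws per category. The probabilities, the number of repetitions and the seed are illustrative assumptions, and the code does not use dist_multinomial() itself.

```r
set.seed(42)            # arbitrary seed, for reproducibility only
p <- c(0.3, 0.5, 0.2)   # category probabilities (illustrative)
n <- 4                  # number of repetitions of the categorical experiment

# One categorical draw per repetition: each value is a category index in 1..3.
draws <- sample(seq_along(p), size = n, replace = TRUE, prob = p)

# Counting the draws per category gives a single multinomial observation.
counts <- tabulate(draws, nbins = length(p))
counts
sum(counts) == n        # the counts always add up to n
```

stats::rmultinom() performs the same two steps in one call; the explicit version is only meant to show where the counts come from.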

We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.

In the following, let \(X = (X_1, ..., X_k)\) be a Multinomial random variable with success probability prob = \(p\). Note that \(p\) is a vector with \(k\) elements that sum to one. Assume that we repeat the Categorical experiment size = \(n\) times.

Support: Each \(X_i\) is in \(\{0, 1, 2, ..., n\}\), and the counts satisfy \(X_1 + ... + X_k = n\).

Mean: The mean of \(X_i\) is \(n p_i\).

Variance: The variance of \(X_i\) is \(n p_i (1 - p_i)\). For \(i \neq j\), the covariance of \(X_i\) and \(X_j\) is \(-n p_i p_j\).
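The mean and covariance formulas can be checked by hand in base R. The sketch below uses size 4 and probabilities (0.3, 0.5, 0.2), the first distribution in the Examples section, and builds the full covariance matrix as \(n (\mathrm{diag}(p) - p p^T)\), which combines the variance and covariance expressions above.

```r
n <- 4
p <- c(0.3, 0.5, 0.2)

# Mean of each count: n * p_i.
mu <- n * p

# Covariance matrix: n * p_i * (1 - p_i) on the diagonal,
# -n * p_i * p_j off the diagonal.
Sigma <- n * (diag(p) - outer(p, p))

mu
Sigma
rowSums(Sigma)  # numerically zero, because the counts always sum to n
```

The zero row sums reflect the constraint on the counts: if one category gains an observation, another must lose one, which is why all covariances are negative.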

Probability mass function (p.m.f):

$$ P(X_1 = x_1, ..., X_k = x_k) = \frac{n!}{x_1! x_2! ... x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot ... \cdot p_k^{x_k} $$
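As a worked instance of the p.m.f., the sketch below evaluates the formula directly and compares it with base R's stats::dmultinom(). The counts x = (1, 2, 1) are an illustrative choice; by hand, the multinomial coefficient is 4!/(1! 2! 1!) = 12 and the probability term is 0.3 * 0.5^2 * 0.2 = 0.015, giving 0.18.

```r
n <- 4
p <- c(0.3, 0.5, 0.2)
x <- c(1, 2, 1)  # illustrative counts; they must sum to n

# Direct evaluation: n! / (x_1! ... x_k!) * p_1^x_1 * ... * p_k^x_k
pmf_manual <- factorial(n) / prod(factorial(x)) * prod(p^x)

# Reference value from base R.
pmf_base <- stats::dmultinom(x, size = n, prob = p)

isTRUE(all.equal(pmf_manual, pmf_base))
```

For large counts the factorials overflow, so a log-scale computation with lgamma() is the safer route in practice.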

Cumulative distribution function (c.d.f):

Omitted for multivariate random variables for the time being.

Moment generating function (m.g.f):

$$ E(e^{t^T X}) = \left(\sum_{i=1}^k p_i e^{t_i}\right)^n $$

where \(t = (t_1, ..., t_k)\) is a vector with one element per category.
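The m.g.f. can be verified numerically for a small case by summing over the whole support. This is a sketch in base R with illustrative values for n, p and the argument vector (named tt here to avoid masking base R's t()); it relies on stats::dmultinom() rather than dist_multinomial().

```r
n  <- 3
p  <- c(0.3, 0.5, 0.2)
tt <- c(0.1, -0.2, 0.3)  # illustrative m.g.f. argument, one entry per category

# Enumerate every count vector (x1, x2, x3) with x1 + x2 + x3 = n.
grid <- expand.grid(x1 = 0:n, x2 = 0:n)
grid$x3 <- n - grid$x1 - grid$x2
grid <- grid[grid$x3 >= 0, ]

# Left-hand side: E[exp(sum(t_i * X_i))], summed exactly over the support.
lhs <- sum(apply(grid, 1, function(x) {
  stats::dmultinom(x, prob = p) * exp(sum(tt * x))
}))

# Right-hand side: closed-form m.g.f.
rhs <- sum(p * exp(tt))^n

isTRUE(all.equal(lhs, rhs))
```

Exact enumeration is only feasible for small n and k, since the number of support points grows quickly, but it is enough to confirm the formula.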

See also

stats::Multinomial

Examples

dist <- dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))
dist
#> <distribution[2]>
#> [1] Multinomial(4)[3] Multinomial(3)[3]
mean(dist)
#>   ...1 ...2 ...3
#> 1  1.2  2.0  0.8
#> 2  0.3  1.5  1.2
variance(dist)
#> [[1]]
#>       [,1] [,2]  [,3]
#> [1,]  0.84 -0.6 -0.24
#> [2,] -0.60  1.0 -0.40
#> [3,] -0.24 -0.4  0.64
#>
#> [[2]]
#>       [,1]  [,2]  [,3]
#> [1,]  0.27 -0.15 -0.12
#> [2,] -0.15  0.75 -0.60
#> [3,] -0.12 -0.60  0.72
#>
generate(dist, 10)
#> [[1]]
#>       [,1] [,2] [,3]
#>  [1,]    1    2    1
#>  [2,]    3    1    0
#>  [3,]    2    2    0
#>  [4,]    0    2    2
#>  [5,]    0    2    2
#>  [6,]    0    4    0
#>  [7,]    3    0    1
#>  [8,]    1    2    1
#>  [9,]    0    2    2
#> [10,]    0    3    1
#>
#> [[2]]
#>       [,1] [,2] [,3]
#>  [1,]    0    2    1
#>  [2,]    0    2    1
#>  [3,]    0    2    1
#>  [4,]    0    1    2
#>  [5,]    0    0    3
#>  [6,]    0    3    0
#>  [7,]    0    1    2
#>  [8,]    0    2    1
#>  [9,]    0    2    1
#> [10,]    1    1    1
#>
# TODO: Needs fixing to support multiple inputs
# density(dist, 2)
# density(dist, 2, log = TRUE)