[Stable]

Binomial distributions represent situations that can be thought of as the result of \(n\) Bernoulli experiments (here \(n\) is the size of the experiment). The classical example is \(n\) independent coin flips, where each flip has probability \(p\) of success. In this case, the outcome of a single flip (heads or tails) follows the Bernoulli(p) distribution, and the probability of observing \(x\) successes (\(x\) heads, for example) in \(n\) trials is given by the Binomial(n, p) distribution. The probability mass function of the Binomial distribution is derived directly from that of the Bernoulli distribution.
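To illustrate how the Binomial distribution arises from repeated Bernoulli trials, the following sketch builds the distribution of the number of successes one trial at a time and compares it with `stats::dbinom()`. It uses base R only; `conv_bernoulli` is a hypothetical helper written for this illustration and is not part of the package.

```r
# Hypothetical helper: distribution of the sum of n independent Bernoulli(p) trials.
conv_bernoulli <- function(n, p) {
  pmf <- 1  # empty sum: P(0 successes) = 1
  for (i in seq_len(n)) {
    # Adding one trial: the count stays the same (prob 1 - p)
    # or increases by one (prob p).
    pmf <- c(pmf * (1 - p), 0) + c(0, pmf * p)
  }
  pmf  # element k + 1 holds P(X = k), for k = 0, ..., n
}

n <- 5
p <- 0.3
by_convolution <- conv_bernoulli(n, p)
by_dbinom <- dbinom(0:n, size = n, prob = p)

# Both routes should give the same probabilities up to floating-point error.
all.equal(by_convolution, by_dbinom)
```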

dist_binomial(size, prob)

Arguments

size

The number of trials. Must be an integer greater than or equal to one. When size = 1L, the Binomial distribution reduces to the Bernoulli distribution. Often called n in textbooks.

prob

The probability of success on each trial. prob can be any value in [0, 1].

Details

We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.

The Binomial distribution comes up when you are interested in the proportion of people in a group who do a particular thing. It also comes up in the sign test, sometimes called the Binomial test (see stats::binom.test()), where you may need the Binomial C.D.F. to compute p-values.
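As a rough illustration of the link between the Binomial C.D.F. and the sign test, the sketch below computes a one-sided p-value with `stats::pbinom()` and compares it with the value reported by `stats::binom.test()`. The counts are made-up numbers chosen for illustration, not data from any study.

```r
# Hypothetical data: 7 of 20 paired differences are positive.
n_pairs <- 20
n_positive <- 7

# One-sided p-value under H0: p = 0.5, i.e. P(X <= 7) from the Binomial c.d.f.
p_from_cdf <- pbinom(n_positive, size = n_pairs, prob = 0.5)

# The same quantity as reported by stats::binom.test().
p_from_test <- binom.test(n_positive, n_pairs, p = 0.5,
                          alternative = "less")$p.value

all.equal(p_from_cdf, p_from_test)
```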

In the following, let \(X\) be a Binomial random variable with parameters size = \(n\) and prob = \(p\). Some textbooks define \(q = 1 - p\), or write \(\pi\) instead of \(p\).

Support: \(\{0, 1, 2, ..., n\}\)

Mean: \(np\)

Variance: \(np \cdot (1 - p) = np \cdot q\)

Probability mass function (p.m.f.):

$$ P(X = k) = {n \choose k} p^k (1 - p)^{n-k} $$
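The p.m.f. above can be evaluated by hand and checked against `stats::dbinom()`. The sketch below uses base R rather than the package's own methods, with \(n = 4\) and \(p = 0.9\) chosen to match the fourth distribution in the Examples; it also recovers the mean and variance formulas from the p.m.f.

```r
n <- 4
p <- 0.9
k <- 0:n

# Direct evaluation of choose(n, k) * p^k * (1 - p)^(n - k).
pmf_by_hand <- choose(n, k) * p^k * (1 - p)^(n - k)
all.equal(pmf_by_hand, dbinom(k, size = n, prob = p))

# Mean and variance computed from the p.m.f.
mean_from_pmf <- sum(k * pmf_by_hand)                      # compare with n * p
var_from_pmf  <- sum((k - mean_from_pmf)^2 * pmf_by_hand)  # compare with n * p * (1 - p)
```

The entry for \(k = 2\) is \(6 \cdot 0.9^2 \cdot 0.1^2 = 0.0486\), which agrees with the `density(dist, 2)` output in the Examples.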

Cumulative distribution function (c.d.f.):

$$ P(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} {n \choose i} p^i (1 - p)^{n-i} $$
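The c.d.f. is a running sum of the p.m.f., which can be checked numerically. This is a minimal sketch using base R's `stats::pbinom()` and `stats::dbinom()`, with \(n = 5\) and \(p = 0.1\) as in the last distribution of the Examples.

```r
n <- 5
p <- 0.1
k <- 0:n

# Cumulative sum of the p.m.f. over k = 0, ..., n.
cdf_by_sum <- cumsum(dbinom(k, size = n, prob = p))
all.equal(cdf_by_sum, pbinom(k, size = n, prob = p))

# A non-integer argument is floored, as the upper limit of the sum indicates.
floor_check <- pbinom(2.7, size = n, prob = p) == pbinom(2, size = n, prob = p)
floor_check
```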

Moment generating function (m.g.f.):

$$ E(e^{tX}) = (1 - p + p e^t)^n $$
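The closed form of the m.g.f. can be compared with the defining expectation \(\sum_k e^{tk} P(X = k)\). The sketch below does this in base R for one arbitrary choice of \(n\), \(p\) and \(t\); the variable names are illustrative only.

```r
n <- 3
p <- 0.3
t_val <- 0.5  # arbitrary point at which to evaluate the m.g.f.
k <- 0:n

# Expectation of exp(t * X) computed directly from the p.m.f.
mgf_by_sum <- sum(exp(t_val * k) * dbinom(k, size = n, prob = p))

# Closed form (1 - p + p * e^t)^n.
mgf_closed <- (1 - p + p * exp(t_val))^n

all.equal(mgf_by_sum, mgf_closed)
```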

Examples

dist <- dist_binomial(size = 1:5, prob = c(0.05, 0.5, 0.3, 0.9, 0.1))

dist
#> <distribution[5]>
#> [1] B(1, 0.05) B(2, 0.5)  B(3, 0.3)  B(4, 0.9)  B(5, 0.1) 
mean(dist)
#> [1] 0.05 1.00 0.90 3.60 0.50
variance(dist)
#> [1] 0.0475 0.5000 0.6300 0.3600 0.4500
skewness(dist)
#> [1]  4.1294832  0.0000000  0.5039526 -1.3333333  1.1925696
kurtosis(dist)
#> [1] 15.0526316 -1.0000000 -0.4126984  1.2777778  1.0222222

generate(dist, 10)
#> [[1]]
#>  [1] 1 0 0 0 0 1 0 0 0 0
#> 
#> [[2]]
#>  [1] 2 0 0 2 1 2 1 1 1 1
#> 
#> [[3]]
#>  [1] 1 0 2 0 2 0 1 0 1 0
#> 
#> [[4]]
#>  [1] 2 4 4 3 3 4 3 3 4 1
#> 
#> [[5]]
#>  [1] 0 2 0 0 0 1 0 0 0 2
#> 

density(dist, 2)
#> [1] 0.0000 0.2500 0.1890 0.0486 0.0729
density(dist, 2, log = TRUE)
#> [1]      -Inf -1.386294 -1.666008 -3.024132 -2.618667

cdf(dist, 4)
#> [1] 1.00000 1.00000 1.00000 1.00000 0.99999

quantile(dist, 0.7)
#> [1] 0 1 1 4 1