Binomial distributions are used to represent situations that can
be thought of as the result of \(n\) Bernoulli experiments (here
\(n\) is the size of the experiment). The classical
example is \(n\) independent coin flips, where each coin flip has
probability \(p\) of success. In this case, the individual probability of
flipping heads or tails is given by the Bernoulli(p) distribution,
and the probability of observing exactly \(x\) successes (\(x\) heads,
for example) in \(n\) trials is given by the Binomial(n, p) distribution.
The probability mass function of the Binomial distribution is derived
directly from that of the Bernoulli distribution.
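The link between Bernoulli trials and the Binomial distribution can be checked by brute force. The sketch below uses base R only; the values of `n` and `p` are illustrative assumptions, not taken from this page. It enumerates every sequence of \(n\) Bernoulli outcomes, multiplies the per-trial probabilities, and sums over sequences with the same number of successes.

```r
# Illustrative values (assumptions for this sketch).
n <- 3      # number of Bernoulli trials (the size)
p <- 0.3    # success probability of each trial

# All 2^n sequences of failures (0) and successes (1).
outcomes <- as.matrix(expand.grid(rep(list(0:1), n)))

# Probability of each sequence: product of the independent Bernoulli(p)
# probabilities, which depends only on the number of successes.
successes <- rowSums(outcomes)
seq_prob <- p^successes * (1 - p)^(n - successes)

# Summing over sequences with the same count gives the Binomial p.m.f.
pmf_enumerated <- as.numeric(tapply(seq_prob, successes, sum))

pmf_enumerated
stats::dbinom(0:n, size = n, prob = p)
```

Both vectors should agree, because there are \({n \choose k}\) sequences with exactly \(k\) successes and each has probability \(p^k (1 - p)^{n-k}\).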
dist_binomial(size, prob)
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
The Binomial distribution comes up when you are interested in the proportion
of people who do a thing. It also comes up in the sign test, sometimes
called the Binomial test (see stats::binom.test()), where you may need
the Binomial C.D.F. to compute p-values.
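As a rough illustration of how the Binomial C.D.F. yields a sign-test p-value, the sketch below compares stats::binom.test() with a manual calculation using stats::pbinom(). The counts are hypothetical, chosen only for the example. The doubling shortcut is valid here because the null distribution with probability 0.5 is symmetric and the observed count lies above \(n/2\).

```r
# Hypothetical data: 7 of 10 paired differences are positive.
x <- 7
n <- 10

# Exact two-sided test of H0: P(positive difference) = 0.5.
test <- stats::binom.test(x, n, p = 0.5)

# Same p-value from the Binomial c.d.f.: the upper tail P(X >= x) is
# 1 - P(X <= x - 1), doubled because the null distribution is symmetric.
p_manual <- 2 * stats::pbinom(x - 1, size = n, prob = 0.5, lower.tail = FALSE)

test$p.value
#> [1] 0.34375
p_manual
#> [1] 0.34375
```

For a null probability other than 0.5 the distribution is not symmetric, so the doubling shortcut no longer matches the p-value reported by stats::binom.test().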
In the following, let \(X\) be a Binomial random variable with parameters
size = \(n\) and prob = \(p\). Some textbooks define \(q = 1 - p\),
or write \(\pi\) instead of \(p\).
Support: \(\{0, 1, 2, ..., n\}\)
Mean: \(np\)
Variance: \(np \cdot (1 - p) = np \cdot q\)
Probability mass function (p.m.f):
$$ P(X = k) = {n \choose k} p^k (1 - p)^{n-k} $$
Cumulative distribution function (c.d.f):
$$ P(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} {n \choose i} p^i (1 - p)^{n-i} $$
Moment generating function (m.g.f):
$$ E(e^{tX}) = (1 - p + p e^t)^n $$
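The formulas above can be checked numerically. This is a minimal sketch in base R with illustrative values for `n`, `p`, and the m.g.f. argument `t0` (all assumptions for the example): it evaluates the p.m.f. from the closed form and derives the other quantities from it.

```r
# Illustrative values (assumptions for this sketch).
n <- 5
p <- 0.1
k <- 0:n                      # the support {0, 1, ..., n}

# p.m.f. from the closed form, and the c.d.f. as its running sum.
pmf <- choose(n, k) * p^k * (1 - p)^(n - k)
cdf <- cumsum(pmf)

# Mean and variance computed directly from the p.m.f.
mean_x <- sum(k * pmf)
var_x <- sum((k - mean_x)^2 * pmf)

# m.g.f. at one point: expectation of exp(t0 * X) versus the closed form.
t0 <- 0.5
mgf_sum <- sum(exp(t0 * k) * pmf)
mgf_formula <- (1 - p + p * exp(t0))^n

# Compare with base R and with the stated formulas.
all.equal(pmf, stats::dbinom(k, size = n, prob = p))
all.equal(cdf, stats::pbinom(k, size = n, prob = p))
all.equal(c(mean_x, var_x), c(n * p, n * p * (1 - p)))
all.equal(mgf_sum, mgf_formula)
```

Each comparison should return TRUE up to floating-point tolerance.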
dist <- dist_binomial(size = 1:5, prob = c(0.05, 0.5, 0.3, 0.9, 0.1))
dist
#> <distribution[5]>
#> [1] B(1, 0.05) B(2, 0.5) B(3, 0.3) B(4, 0.9) B(5, 0.1)
mean(dist)
#> [1] 0.05 1.00 0.90 3.60 0.50
variance(dist)
#> [1] 0.0475 0.5000 0.6300 0.3600 0.4500
skewness(dist)
#> [1] 4.1294832 0.0000000 0.5039526 -1.3333333 1.1925696
kurtosis(dist)
#> [1] 15.0526316 -1.0000000 -0.4126984 1.2777778 1.0222222
generate(dist, 10)
#> [[1]]
#> [1] 1 0 0 0 0 1 0 0 0 0
#>
#> [[2]]
#> [1] 2 0 0 2 1 2 1 1 1 1
#>
#> [[3]]
#> [1] 1 0 2 0 2 0 1 0 1 0
#>
#> [[4]]
#> [1] 2 4 4 3 3 4 3 3 4 1
#>
#> [[5]]
#> [1] 0 2 0 0 0 1 0 0 0 2
#>
density(dist, 2)
#> [1] 0.0000 0.2500 0.1890 0.0486 0.0729
density(dist, 2, log = TRUE)
#> [1] -Inf -1.386294 -1.666008 -3.024132 -2.618667
cdf(dist, 4)
#> [1] 1.00000 1.00000 1.00000 1.00000 0.99999
quantile(dist, 0.7)
#> [1] 0 1 1 4 1