Binomial distributions are used to represent situations that can be thought of as the result of $$n$$ Bernoulli experiments (here $$n$$ is the size of the experiment). The classical example is $$n$$ independent coin flips, where each flip has probability $$p$$ of success. In this case, the outcome of a single flip (heads or tails) follows the Bernoulli(p) distribution, and the probability of observing $$x$$ successes ($$x$$ heads, for example) in $$n$$ trials is given by the Binomial(n, p) distribution. The equation of the Binomial distribution is derived directly from the equation of the Bernoulli distribution.
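As a rough illustration of how the Binomial equation follows from the Bernoulli one, the sketch below adds independent Bernoulli(p) trials one at a time and checks the result against base R's `dbinom()`. The values of `n` and `p` and the name `sum_pmf` are chosen for illustration only; this is not how the package computes anything.

```r
# Illustrative values (assumptions, not taken from the package).
n <- 4
p <- 0.3

# p.m.f. of the running sum of trials. Before any trial the sum is 0
# with probability 1.
sum_pmf <- 1

for (trial in seq_len(n)) {
  new_pmf <- numeric(length(sum_pmf) + 1)
  for (k in seq_along(sum_pmf)) {
    # The new trial fails with probability 1 - p: the count stays the same.
    new_pmf[k] <- new_pmf[k] + sum_pmf[k] * (1 - p)
    # The new trial succeeds with probability p: the count goes up by one.
    new_pmf[k + 1] <- new_pmf[k + 1] + sum_pmf[k] * p
  }
  sum_pmf <- new_pmf
}

# sum_pmf[k + 1] now holds P(X = k) for k = 0, ..., n.
all.equal(sum_pmf, dbinom(0:n, size = n, prob = p))
```

Each pass through the outer loop is one convolution with the Bernoulli(p) distribution, which is where the binomial coefficients in the closed form come from.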

dist_binomial(size, prob)

Arguments

size

The number of trials. Must be an integer greater than or equal to one. When size = 1L, the Binomial distribution reduces to the Bernoulli distribution. Often called n in textbooks.

prob

The probability of success on each trial. prob can be any value in [0, 1].

Details

We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.

The Binomial distribution comes up when you are interested in the proportion of people who do a thing. The Binomial distribution also comes up in the sign test, sometimes called the Binomial test (see stats::binom.test()), where you may need the Binomial c.d.f. to compute p-values.
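To make the link between the sign test and the Binomial c.d.f. concrete, here is a minimal sketch using base R's `pbinom()` and `stats::binom.test()`. The counts are hypothetical, and the doubling of one tail relies on the null value being 0.5, where the distribution is symmetric.

```r
# Hypothetical data: 14 of 20 paired differences are positive.
successes <- 14
n <- 20

# Upper tail P(X >= 14) under the null hypothesis p = 0.5,
# computed from the Binomial c.d.f. as 1 - P(X <= 13).
upper_tail <- pbinom(successes - 1, size = n, prob = 0.5, lower.tail = FALSE)

# Binomial(n, 0.5) is symmetric, so the two-sided p-value is twice one tail.
p_manual <- min(1, 2 * upper_tail)

# The same p-value from the built-in exact test.
p_test <- binom.test(successes, n, p = 0.5)$p.value

all.equal(p_manual, p_test)
```

For a null value other than 0.5 the two tails are no longer equal, so the doubling shortcut does not apply and `binom.test()` is the safer route.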

In the following, let $$X$$ be a Binomial random variable with parameters size = $$n$$ and prob = $$p$$. Some textbooks define $$q = 1 - p$$, or write $$\pi$$ instead of $$p$$.

Support: $$\{0, 1, 2, ..., n\}$$

Mean: $$np$$

Variance: $$np \cdot (1 - p) = np \cdot q$$
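The mean and variance formulas can be checked numerically by summing over the support. The sketch below uses base R's `dbinom()` with illustrative values of `n` and `p`, which are assumptions and not package defaults.

```r
# Illustrative parameters (assumptions).
n <- 10
p <- 0.3

# Support 0, 1, ..., n and the probability of each value.
k <- 0:n
pmf <- dbinom(k, size = n, prob = p)

# Mean and variance computed directly from their definitions.
mean_from_pmf <- sum(k * pmf)
var_from_pmf <- sum((k - mean_from_pmf)^2 * pmf)

all.equal(mean_from_pmf, n * p)            # np
all.equal(var_from_pmf, n * p * (1 - p))   # np(1 - p)
```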

Probability mass function (p.m.f.):

$$P(X = k) = {n \choose k} p^k (1 - p)^{n-k}$$
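As a quick check of the formula, the p.m.f. can be written out by hand and compared with base R's `dbinom()`. The helper name `binomial_pmf` is hypothetical and only serves this illustration.

```r
# Hypothetical helper: the p.m.f. formula written out term by term.
binomial_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# P(X = 2) for n = 3, p = 0.3: choose(3, 2) * 0.3^2 * 0.7 = 0.189.
binomial_pmf(2, n = 3, p = 0.3)

# Agrees with the built-in p.m.f.
all.equal(binomial_pmf(2, n = 3, p = 0.3), dbinom(2, size = 3, prob = 0.3))
```

The value 0.189 is the same one that `density(dist, 2)` reports for B(3, 0.3) in the Examples section.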

Cumulative distribution function (c.d.f.):

$$P(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} {n \choose i} p^i (1 - p)^{n-i}$$
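The floor in the upper limit means the c.d.f. is a step function that only changes at integers. A minimal sketch of the sum, compared against base R's `pbinom()`, is given below; the helper name `binomial_cdf` is hypothetical.

```r
# Hypothetical helper: the c.d.f. as a finite sum of p.m.f. terms.
binomial_cdf <- function(k, n, p) {
  if (k < 0) {
    return(0)  # no mass below the support
  }
  # Sum from 0 up to floor(k), capped at n (the top of the support).
  i <- 0:min(floor(k), n)
  sum(choose(n, i) * p^i * (1 - p)^(n - i))
}

# P(X <= 4) for n = 5, p = 0.1 is 1 - 0.1^5 = 0.99999.
binomial_cdf(4, n = 5, p = 0.1)

# A non-integer argument gives the same value as its floor.
all.equal(binomial_cdf(1.7, n = 5, p = 0.1), binomial_cdf(1, n = 5, p = 0.1))

# Agrees with the built-in c.d.f.
all.equal(binomial_cdf(4, n = 5, p = 0.1), pbinom(4, size = 5, prob = 0.1))
```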

Moment generating function (m.g.f.):

$$E(e^{tX}) = (1 - p + p e^t)^n$$
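
One way to see where this closed form comes from is to write the expectation as a sum over the support and apply the binomial theorem:

$$E(e^{tX}) = \sum_{k=0}^{n} e^{tk} {n \choose k} p^k (1 - p)^{n-k} = \sum_{k=0}^{n} {n \choose k} (p e^t)^k (1 - p)^{n-k} = (1 - p + p e^t)^n$$

As a consistency check, differentiating with respect to $$t$$ and setting $$t = 0$$ recovers the mean:

$$\frac{d}{dt} (1 - p + p e^t)^n \Big|_{t=0} = n p e^t (1 - p + p e^t)^{n-1} \Big|_{t=0} = np$$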

Examples

dist <- dist_binomial(size = 1:5, prob = c(0.05, 0.5, 0.3, 0.9, 0.1))

dist
#> <distribution[5]>
#> [1] B(1, 0.05) B(2, 0.5)  B(3, 0.3)  B(4, 0.9)  B(5, 0.1)
mean(dist)
#> [1] 0.05 1.00 0.90 3.60 0.50
variance(dist)
#> [1] 0.0475 0.5000 0.6300 0.3600 0.4500
skewness(dist)
#> [1]  4.1294832  0.0000000  0.5039526 -1.3333333  1.1925696
kurtosis(dist)
#> [1] 15.0526316 -1.0000000 -0.4126984  1.2777778  1.0222222

generate(dist, 10)
#> [[1]]
#>  [1] 1 0 0 0 0 1 0 0 0 0
#>
#> [[2]]
#>  [1] 2 0 0 2 1 2 1 1 1 1
#>
#> [[3]]
#>  [1] 1 0 2 0 2 0 1 0 1 0
#>
#> [[4]]
#>  [1] 2 4 4 3 3 4 3 3 4 1
#>
#> [[5]]
#>  [1] 0 2 0 0 0 1 0 0 0 2
#>

density(dist, 2)
#> [1] 0.0000 0.2500 0.1890 0.0486 0.0729
density(dist, 2, log = TRUE)
#> [1]      -Inf -1.386294 -1.666008 -3.024132 -2.618667

cdf(dist, 4)
#> [1] 1.00000 1.00000 1.00000 1.00000 0.99999

quantile(dist, 0.7)
#> [1] 0 1 1 4 1