%matplotlib inline
from d2l import mxnet as d2l
from IPython import display
from math import erf, factorial
import numpy as np

A reference tour of the distributions used throughout the book: what they look like, when they apply, and how to sample and evaluate them in code.
Imports and plotting helpers are shared across the PMF, PDF, CDF, and sampling examples below.
P(X=1) = p, P(X=0) = 1-p. Mean p, variance p(1-p):
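A minimal sketch of the Bernoulli case, assuming plain NumPy and an illustrative p = 0.3 (neither the value nor the variable names come from the text). It draws samples by thresholding uniform noise and compares the empirical mean and variance with p and p(1-p):

```python
import numpy as np

p = 0.3  # illustrative success probability (assumed for this sketch)
rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# A Bernoulli(p) draw is 1 when a uniform sample on [0, 1) falls below p.
samples = (rng.random(100_000) < p).astype(int)

# Empirical moments should land close to the closed forms p and p(1 - p).
print("sample mean    :", samples.mean(), "(theory:", p, ")")
print("sample variance:", samples.var(), "(theory:", p * (1 - p), ")")
```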
Equally likely categories. Maximum entropy on a finite set with no prior knowledge:
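As a hedged illustration, the sketch below samples a fair six-sided die (n = 6 is an assumed example) and checks two things: each category shows up with frequency near 1/n, and the entropy of the uniform PMF equals log n, the largest value possible on n outcomes:

```python
import numpy as np

n = 6  # number of categories, e.g. a fair die (illustrative choice)
rng = np.random.default_rng(0)

# Generator.integers(low, high) samples uniformly from {low, ..., high - 1}.
rolls = rng.integers(1, n + 1, size=60_000)

# Relative frequency of each category 1..n; all should be close to 1/n.
freqs = np.bincount(rolls, minlength=n + 1)[1:] / rolls.size
print("frequencies:", freqs)

# Entropy (in nats) of the uniform PMF; it equals log(n).
pmf = np.full(n, 1 / n)
entropy = -np.sum(pmf * np.log(pmf))
print("entropy:", entropy, "log(n):", np.log(n))
```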
Density \frac{1}{b-a} on [a, b]. Source of pseudo-random samples for Monte Carlo and dropout:
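One possible sketch of the continuous uniform, with assumed endpoints a = 2 and b = 5. The helper name `uniform_pdf` is hypothetical, introduced only for this example. The last lines show the Monte Carlo use: averaging f(x) over uniform samples and scaling by (b - a) approximates the integral of f on [a, b]:

```python
import numpy as np

a, b = 2.0, 5.0  # illustrative interval endpoints (assumed)
rng = np.random.default_rng(0)

def uniform_pdf(t, a, b):
    """Density 1/(b - a) inside [a, b], zero outside."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= a) & (t <= b), 1.0 / (b - a), 0.0)

# Samples on [a, b); their mean should be near (a + b) / 2.
x = rng.uniform(a, b, size=100_000)
print("sample mean:", x.mean(), "(theory:", (a + b) / 2, ")")

# Monte Carlo estimate of the integral of t**2 over [a, b].
mc_integral = (b - a) * np.mean(x**2)
exact_integral = (b**3 - a**3) / 3
print("Monte Carlo:", mc_integral, "exact:", exact_integral)
```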
Sum of n iid Bernoullis. Bell-shaped for large n (Gaussian limit):
n, p = 10, 0.2
# Compute binomial coefficient
def binom(n, k):
    comb = 1
    for i in range(min(k, n - k)):
        comb = comb * (n - i) // (i + 1)
    return comb
pmf = np.array([p**i * (1-p)**(n - i) * binom(n, i) for i in range(n + 1)])
d2l.plt.stem([i for i in range(n + 1)], pmf)
d2l.plt.xlabel('x')
d2l.plt.ylabel('p.m.f.')
d2l.plt.show()

Rare events: P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}. Approximates the binomial when n is large, p is small, and np \to \lambda:
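A small sketch of the PMF and of the binomial approximation, using only the standard library. The helper names `poisson_pmf` and `binomial_pmf`, and the values \lambda = 5 and n = 1000, are assumptions chosen for illustration:

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) = lam**k * exp(-lam) / k! for integer k >= 0."""
    return lam**k * exp(-lam) / factorial(k)

def binomial_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) random variable."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

lam = 5.0      # illustrative rate (assumed)
n = 1000       # many trials ...
p = lam / n    # ... each with small success probability, so n * p = lam

# Largest pointwise gap between the two PMFs over k = 0..20.
max_gap = max(abs(poisson_pmf(k, lam) - binomial_pmf(k, n, p))
              for k in range(21))
print("largest PMF gap for k <= 20:", max_gap)
```

Increasing n while holding n * p fixed at \lambda should shrink the gap further.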
The cumulative distribution sums the probability of observing up to k events:
F(k) = P(X \le k) = \sum_{i=0}^{k} \frac{\lambda^i e^{-\lambda}}{i!}.
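The cumulative sum can be sketched directly from the PMF. The function name `poisson_cdf` is hypothetical, and extending it to non-integer arguments by flooring is a design choice of this sketch rather than something stated in the text:

```python
from math import exp, factorial, floor

def poisson_cdf(x, lam):
    """F(x) = P(X <= x): sum the PMF over i = 0, ..., floor(x)."""
    if x < 0:
        return 0.0  # no mass below zero
    return sum(lam**i * exp(-lam) / factorial(i)
               for i in range(floor(x) + 1))

lam = 5.0  # illustrative rate (assumed)
for k in [0, 2, 5, 10]:
    print("F(%d) = %.6f" % (k, poisson_cdf(k, lam)))
```

The result is a step function: flat between integers, with a jump of size P(X = k) at each k.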
Sampling turns the distribution into count data: nonnegative integers with mean and variance both near \lambda.
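A hedged sketch of the sampling step, assuming NumPy's `Generator.poisson` and an illustrative \lambda = 5. It checks that the draws are nonnegative integers and that the sample mean and sample variance both sit near \lambda:

```python
import numpy as np

lam = 5.0  # illustrative rate (assumed)
rng = np.random.default_rng(0)

# Each draw is a nonnegative integer count.
counts = rng.poisson(lam, size=100_000)

print("first ten draws :", counts[:10])
print("sample mean     :", counts.mean())
print("sample variance :", counts.var())
```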
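\mathcal{N}(\mu, \sigma^2): the bell curve. By the central limit theorem, a standardized sum of many independent, finite-variance contributions converges to it, which is why it appears so widely. The standardized binomial PMFs below approach it as n grows: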
p = 0.2
ns = [1, 10, 100, 1000]
d2l.plt.figure(figsize=(10, 3))
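for i in range(4):
    n = ns[i]
    # Binomial(n, p) PMF; k indexes outcomes so it does not shadow the panel index i
    pmf = np.array([p**k * (1-p)**(n-k) * binom(n, k) for k in range(n + 1)])
    d2l.plt.subplot(1, 4, i + 1)
    # Standardize the support: subtract the mean n*p, divide by the std sqrt(n*p*(1-p))
    d2l.plt.stem([(k - n*p)/np.sqrt(n*p*(1 - p)) for k in range(n + 1)], pmf)
    d2l.plt.xlim([-4, 4])
    d2l.plt.xlabel('x')
    d2l.plt.ylabel('p.m.f.')
    d2l.plt.title("n = {}".format(n))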
d2l.plt.show()

Changing \mu shifts the bell curve; changing \sigma spreads it. Samples concentrate near the mean and thin out in the tails.
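To make the shift and spread concrete, here is a minimal sketch of the Gaussian density, its CDF via `math.erf`, and sampling with NumPy. The helper names and the values \mu = 1, \sigma = 2 are assumptions for illustration only:

```python
from math import erf, exp, pi, sqrt
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), written with the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

mu, sigma = 1.0, 2.0  # illustrative parameters (assumed)
rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, size=100_000)

# Fraction of samples within one standard deviation of the mean,
# compared with the same probability computed from the CDF.
within_one_sigma = np.mean(np.abs(samples - mu) < sigma)
theory = gaussian_cdf(mu + sigma, mu, sigma) - gaussian_cdf(mu - sigma, mu, sigma)
print("within one sigma:", within_one_sigma, "theory:", theory)
print("sample mean:", samples.mean(), "sample std:", samples.std())
```

Because the density depends on x only through (x - mu) / sigma, the one-sigma fraction is the same for any choice of \mu and \sigma.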